Internet Engineering Task Force (IETF)                        J. Scudder
Internet Draft                                          Juniper Networks
Updates: 1997, 4271, 4360, 5701 (if approved)                    E. Chen
Intended Status: Standards Track                           Cisco Systems
Expires: December 25, 2013                                  P. Mohapatra
                                                        Cumulus Networks
                                                                K. Patel
                                                           Cisco Systems
                                                           June 24, 2013


             Revised Error Handling for BGP UPDATE Messages
                  draft-ietf-idr-error-handling-04.txt

Abstract

   According to the base BGP specification, a BGP speaker that receives
   an UPDATE message containing a malformed attribute is required to
   reset the session over which the offending attribute was received.
   This behavior is undesirable as a session reset would impact not only
   routes with the offending attribute, but also other valid routes
   exchanged over the session. This document partially revises the
   error handling for UPDATE messages, and provides guidelines for the
   authors of documents defining new attributes. Finally, it revises
   the error handling procedures for a number of existing attributes.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on December 25, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1.  Introduction

   According to the base BGP specification [RFC4271], a BGP speaker that
   receives an UPDATE message containing a malformed attribute is
   required to reset the session over which the offending attribute was
   received. This behavior is undesirable as a session reset would
   impact not only routes with the offending attribute, but also other
   valid routes exchanged over the session. In the case of optional
   transitive attributes, the behavior is especially troublesome and may
   present a potential security vulnerability. The reason is that such
   attributes may have been propagated without being checked by
   intermediate routers that do not recognize the attributes -- in
   effect the attributes may have been tunneled, and when they do reach
   a router that recognizes and checks them, the session that is reset
   may not be associated with the router that is at fault.

   The goal for revising the error handling for UPDATE messages is to
   minimize the impact on routing by a malformed UPDATE message, while
   maintaining protocol correctness to the extent possible. This can be
   achieved largely by maintaining the established session and keeping
   the valid routes exchanged, but removing the routes carried in the
   malformed UPDATE from the routing system.

   This document partially revises the error handling for UPDATE
   messages, and provides guidelines for the authors of documents
   defining new attributes. Finally, it revises the error handling
   procedures for a number of existing attributes. Specifically, the
   error handling procedures of [RFC4271], [RFC1997], [RFC4360] and
   [RFC5701] are revised.

1.1.  Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Revision to Base Specification

   The first paragraph of Section 6.3 of [RFC4271] is revised as
   follows:

   Old Text:

      All errors detected while processing the UPDATE message MUST be
      indicated by sending the NOTIFICATION message with the Error Code
      UPDATE Message Error. The error subcode elaborates on the specific
      nature of the error.

   New Text:

      An error detected while processing the UPDATE message for which a
      session reset is specified MUST be indicated by sending the
      NOTIFICATION message with the Error Code UPDATE Message Error.
      The error subcode elaborates on the specific nature of the error.

   The error handling of the following case described in Section 6.3 of
   [RFC4271] remains unchanged:

      If the Withdrawn Routes Length or Total Attribute Length
      is too large (i.e., if Withdrawn Routes Length + Total Attribute
      Length + 23 exceeds the message Length), then the Error Subcode
      MUST be set to Malformed Attribute List.

   The error handling of the following case described in Section 6.3 of
   [RFC4271] is revised

      If any recognized attribute has Attribute Flags that conflict with
      the Attribute Type Code, then the Error Subcode MUST be set to
      Attribute Flags Error. The Data field MUST contain the erroneous
      attribute (type, length, and value).

   as follows:

      If the "Optional bit" or the "Transitive bit" in the Attribute
      Flags for an attribute conflicts with the Attribute Type Code,
      then the error SHOULD be logged, and the conflicting bit in the
      Attribute Flags MUST be reset to the correct value. The UPDATE
      message MUST continue to be processed.

   The error handling of all other cases described in Section 6.3 of
   [RFC4271] that specify a session reset is revised as follows.

   When a path attribute (other than the MP_REACH_NLRI attribute
   [RFC4760] or the MP_UNREACH_NLRI attribute [RFC4760]) in an UPDATE
   message is determined to be malformed, the UPDATE message containing
   that attribute MUST be treated as though all contained routes had
   been withdrawn just as if they had been listed in the WITHDRAWN
   ROUTES field (or in the MP_UNREACH_NLRI attribute if appropriate) of
   the UPDATE message, thus causing them to be removed from the
   Adj-RIB-In according to the procedures of [RFC4271]. In the case of
   an attribute which has no effect on route selection or installation,
   the malformed attribute MAY instead be discarded and the UPDATE
   message continue to be processed. For the sake of brevity, the former
   approach is termed "treat-as-withdraw", and the latter as "attribute
   discard".

   If any of the well-known mandatory attributes are not present in an
   UPDATE message, then the approach of "treat-as-withdraw" MUST be used
   for the error handling.

   The approach of "treat-as-withdraw" MUST be used for the error
   handling of the cases described in Section 6.3 of [RFC4271] that
   specify a session reset and involve any of the following attributes:
   ORIGIN, AS_PATH, NEXT_HOP, MULTI_EXIT_DISC, and LOCAL_PREF.
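
   As a non-normative illustration, the revised Attribute Flags
   handling described earlier in this section (log the conflict, reset
   the conflicting bit, keep processing the UPDATE) might be sketched
   as below. This is only a sketch under stated assumptions: the
   function name, the lookup table, and the choice of Python are
   hypothetical and not part of this specification. The bit values and
   Attribute Type Codes are those defined in [RFC4271].

```python
import logging

# Attribute Flags bits as defined in RFC 4271, Section 4.3.
OPTIONAL = 0x80
TRANSITIVE = 0x40

# Expected Optional/Transitive bits per Attribute Type Code for the
# attributes defined in RFC 4271. The table itself is an assumption of
# this sketch; an implementation would cover every attribute it recognizes.
EXPECTED_FLAGS = {
    1: TRANSITIVE,             # ORIGIN (well-known)
    2: TRANSITIVE,             # AS_PATH (well-known)
    3: TRANSITIVE,             # NEXT_HOP (well-known)
    4: OPTIONAL,               # MULTI_EXIT_DISC (optional non-transitive)
    5: TRANSITIVE,             # LOCAL_PREF (well-known)
    6: TRANSITIVE,             # ATOMIC_AGGREGATE (well-known)
    7: OPTIONAL | TRANSITIVE,  # AGGREGATOR (optional transitive)
}


def correct_attribute_flags(type_code, flags):
    """Hypothetical helper: return the flags octet with the Optional and
    Transitive bits reset to the values expected for type_code.

    Unrecognized type codes are returned unchanged, and all other bits
    (Partial, Extended Length) are preserved.
    """
    expected = EXPECTED_FLAGS.get(type_code)
    if expected is None:
        return flags
    mask = OPTIONAL | TRANSITIVE
    if (flags & mask) != expected:
        # The error SHOULD be logged; the UPDATE continues to be processed.
        logging.warning(
            "attribute type %d: flags 0x%02x conflict with type code",
            type_code, flags)
        flags = (flags & ~mask & 0xFF) | expected
    return flags
```

   In this sketch, an ORIGIN attribute received with the Optional bit
   set (flags 0xC0) comes back as 0x40, and the caller carries on with
   normal processing of the UPDATE instead of resetting the session.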

   The approach of "attribute discard" MUST be used for the error
   handling of the cases described in Section 6.3 of [RFC4271] that
   specify a session reset and involve any of the following attributes:
   ATOMIC_AGGREGATE and AGGREGATOR.

   If the MP_REACH_NLRI attribute or the MP_UNREACH_NLRI attribute
   appears more than once in the UPDATE message, then a NOTIFICATION
   message MUST be sent with the Error Subcode "Malformed Attribute
   List". If any other attribute appears more than once in an UPDATE
   message, then all the occurrences of the attribute other than the
   first one SHALL be discarded and the UPDATE message continue to be
   processed.

   When multiple malformed attributes exist in an UPDATE message, if the
   same approach (either "session reset", or "treat-as-withdraw" or
   "attribute discard") is specified for the handling of these malformed
   attributes, then the specified approach MUST be used. Otherwise the
   approach with the strongest action MUST be used following the order
   of "session reset", "treat-as-withdraw" and "attribute discard" from
   the strongest to the weakest.

   A document which specifies a new attribute MUST provide specifics
   regarding what constitutes an error for that attribute and how that
   error is to be handled.

   Finally, we observe that in order to use the approach of
   "treat-as-withdraw", the entire NLRI field and/or the MP_REACH_NLRI
   and MP_UNREACH_NLRI attributes need to be successfully parsed. If
   this is not possible, the procedures of [RFC4271] continue to apply.
   Alternatively the error handling procedures specified in [RFC4760]
   for disabling a particular AFI/SAFI MAY be followed.

3.  Parsing of NLRI Fields

   To facilitate the determination of the NLRI field in an UPDATE with a
   malformed attribute, the MP_REACH_NLRI or MP_UNREACH_NLRI attribute
   (if present) SHALL be encoded as the very first path attribute in an
   UPDATE. An implementation, however, MUST still be prepared to
   receive these fields in any position.

   If the encoding of [RFC4271] is used, the NLRI field for the IPv4
   unicast address family is carried immediately following all the
   attributes in an UPDATE. When such an UPDATE is received, we observe
   that the NLRI field can be determined using the "Message Length",
   "Withdrawn Routes Length" and "Total Attribute Length" (when they are
   consistent) carried in the message instead of relying on the length
   of individual attributes in the message.

4.  Operational Considerations

   Although the "treat-as-withdraw" error-handling behavior defined in
   Section 2 makes every effort to preserve BGP's correctness, we note
   that if an UPDATE received on an IBGP session is subjected to this
   treatment, inconsistent routing within the affected Autonomous System
   may result. The consequences of inconsistent routing can include
   long-lived forwarding loops and black holes. While lamentable, this
   issue is expected to be rare in practice, and more importantly is
   seen as less problematic than the session-reset behavior it replaces.

   When a malformed attribute is indeed detected over an IBGP session,
   we RECOMMEND that routes with the malformed attribute be identified
   and traced back to the ingress router in the network where the routes
   were sourced or received externally, and then a filter be applied on
   the ingress router to prevent the routes from being sourced or
   received. This will help maintain routing consistency in the
   network.

   Even if inconsistent routing does not arise, the "treat-as-withdraw"
   behavior can cause either complete unreachability or sub-optimal
   routing for the destinations whose routes are carried in the affected
   UPDATE message.

   Note that "treat-as-withdraw" is different from discarding an UPDATE
   message.
   The latter violates the basic BGP principle of incremental update,
   and could cause invalid routes to be kept. (See also Appendix A.)

   For any malformed attribute which is handled by the "attribute
   discard" instead of the "treat-as-withdraw" approach, it is critical
   to consider the potential impact of doing so. In particular, if the
   attribute in question has or may have an effect on route selection or
   installation, the presumption is that discarding it is unsafe, unless
   careful analysis proves otherwise. The analysis should take into
   account the tradeoff between preserving connectivity and potential
   side effects.

   Because of these potential issues, a BGP speaker MUST provide
   debugging facilities to permit issues caused by a malformed attribute
   to be diagnosed. At a minimum, such facilities MUST include logging
   an error listing the NLRI involved, and containing the entire
   malformed UPDATE message when such an attribute is detected. The
   malformed UPDATE message SHOULD be analyzed, and the root cause
   SHOULD be investigated.

5.  Error Handling Procedures for Existing Attributes

5.1.  ORIGIN

   The attribute is considered malformed if its length is not 1, or it
   has an undefined value [RFC4271].

   An UPDATE message with a malformed ORIGIN attribute SHALL be handled
   using the approach of "treat-as-withdraw".

5.2.  AS_PATH

   The error conditions for the attribute have been defined in
   [RFC4271].

   An UPDATE message with a malformed AS_PATH attribute SHALL be handled
   using the approach of "treat-as-withdraw".

5.3.  NEXT_HOP

   The error conditions for the NEXT_HOP attribute have been defined in
   [RFC4271].

   An UPDATE message with a malformed NEXT_HOP attribute SHALL be
   handled using the approach of "treat-as-withdraw".

5.4.  MULTI_EXIT_DISC

   The attribute is considered malformed if its length is not 4
   [RFC4271].

   An UPDATE message with a malformed MULTI_EXIT_DISC attribute SHALL be
   handled using the approach of "treat-as-withdraw".

5.5.  LOCAL_PREF

   The attribute is considered malformed if its length is not 4
   [RFC4271].

   An UPDATE message with a malformed LOCAL_PREF attribute SHALL be
   handled as follows:

   o  using the approach of "attribute discard" if the UPDATE message
      is received from an external neighbor, or

   o  using the approach of "treat-as-withdraw" if the UPDATE message
      is received from an internal neighbor.

   In addition, if the attribute is present in an UPDATE message from an
   external neighbor, the approach of "attribute discard" SHALL be used
   to handle the unexpected attribute in the message.

5.6.  ATOMIC_AGGREGATE

   The attribute SHALL be considered malformed if its length is not 0
   [RFC4271].

   An UPDATE message with a malformed ATOMIC_AGGREGATE attribute SHALL
   be handled using the approach of "attribute discard".

5.7.  AGGREGATOR

   The error conditions specified in [RFC4271] for the attribute are
   revised as follows:

   The AGGREGATOR attribute SHALL be considered malformed if any of the
   following applies:

   o  Its length is not 6 (when the "4-octet AS number capability" is
      not advertised to, or not received from the peer [RFC4893]).

   o  Its length is not 8 (when the "4-octet AS number capability" is
      both advertised to, and received from the peer).

   An UPDATE message with a malformed AGGREGATOR attribute SHALL be
   handled using the approach of "attribute discard".

5.8.  Community

   The error handling of [RFC1997] is revised as follows:

   The Community attribute SHALL be considered malformed if its length
   is nonzero and is not a multiple of 4.

   An UPDATE message with a malformed Community attribute SHALL be
   handled using the approach of "treat-as-withdraw".

5.9.  Extended Community

   The error handling of [RFC4360] is revised as follows:

   The Extended Community attribute SHALL be considered malformed if its
   length is nonzero and is not a multiple of 8.

   An UPDATE message with a malformed Extended Community attribute SHALL
   be handled using the approach of "treat-as-withdraw".

   Note that a BGP speaker MUST NOT treat an unrecognized Extended
   Community Type or Sub-Type as an error.

5.10.  IPv6 Address Specific BGP Extended Community Attribute

   The error handling of [RFC5701] is revised as follows:

   The IPv6 Address Specific Extended Community attribute SHALL be
   considered malformed if its length is nonzero and is not a multiple
   of 20.

   An UPDATE message with a malformed IPv6 Address Specific Extended
   Community attribute SHALL be handled using the approach of
   "treat-as-withdraw".

   Note that a BGP speaker MUST NOT treat an unrecognized IPv6 Address
   Specific Extended Community Type or Sub-Type as an error.

6.  IANA Considerations

   This document makes no request of IANA.

7.  Security Considerations

   This specification addresses the vulnerability of a BGP speaker to a
   potential attack whereby a distant attacker can generate a malformed
   optional transitive attribute that is not recognized by intervening
   routers (which thus propagate the attribute unchecked) but that
   causes session resets when it reaches routers that do recognize the
   given attribute type.

   In other respects, this specification does not change BGP's security
   characteristics.

8.  Acknowledgments

   The authors wish to thank Juan Alcaide, Ron Bonica, Mach Chen, Andy
   Davidson, Bruno Decraene, Dong Jie, Rex Fernando, Joel Halpern, Akira
   Kato, Miya Kohno, Tony Li, Alton Lo, Shin Miyakawa, Tamas Mondal,
   Jonathan Oddy, Robert Raszuk, Yakov Rekhter, Rob Shakir, Naiming
   Shen, Shyam Sethuram, Ananth Suryanarayana, Kaliraj Vairavakkalai and
   Lili Wang for their observations and discussion of this topic, and
   review of this document.

9.  Normative References

   [RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP
              Communities Attribute", RFC 1997, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, February 2006.

   [RFC4893]  Vohra, Q. and E. Chen, "BGP Support for Four-octet AS
              Number Space", RFC 4893, May 2007.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760,
              January 2007.

   [RFC5701]  Rekhter, Y., "IPv6 Address Specific BGP Extended
              Community Attribute", RFC 5701, November 2009.

Appendix A.  Why not discard UPDATE messages?

   A commonly asked question is "why not simply discard the UPDATE
   message instead of treating it like a withdraw? Isn't that safer and
   easier?" The answer is that it might be easier, but it would
   compromise BGP's correctness and so is unsafe. Consider the following
   example of what might happen if UPDATE messages carrying bad
   attributes were simply discarded:

                    AS1 ---- AS2
                      \      /
                       \    /
                        \  /
                        AS3

   o  AS1 prefers to reach AS3 directly, and advertises its route to
      AS2.

   o  AS2 prefers to reach AS3 directly, and advertises its route to
      AS1.

   o  Connections AS3-AS1 and AS3-AS2 fail simultaneously.

   o  AS1 switches to prefer AS2's route, and sends an update message
      which includes a withdraw of its previous announcement. The
      withdraw is bundled with some advertisements. It includes a bad
      attribute. As a result, AS2 ignores the message.

   o  AS2 switches to prefer AS1's route, and sends an update message
      which includes a withdraw of its previous announcement. The
      withdraw is bundled with some advertisements. It includes a bad
      attribute. As a result, AS1 ignores the message.

   The end result is that AS1 forwards traffic for AS3 towards AS2, and
   AS2 forwards traffic for AS3 towards AS1. This is a permanent (until
   corrected) forwarding loop.

   Although the example above discusses route withdraws, we observe that
   in BGP the announcement of a route also withdraws the route
   previously advertised. The implicit withdraw can be converted into a
   real withdraw in a number of ways; for example, the
   previously-announced route might have been accepted by policy, but
   the new announcement might be rejected by policy. For this reason,
   the same concerns apply even if explicit withdraws are removed from
   consideration.

10.  Authors' Addresses

   John G. Scudder
   Juniper Networks

   EMail: jgs@juniper.net

   Enke Chen
   Cisco Systems, Inc.

   EMail: enkechen@cisco.com

   Pradosh Mohapatra
   Cumulus Networks, Inc.

   EMail: pmohapat@cumulusnetworks.com

   Keyur Patel
   Cisco Systems, Inc.

   EMail: keyupate@cisco.com