Internet Engineering Task Force (IETF)                        J. Scudder
Internet Draft                                          Juniper Networks
Updates: 1997, 4271, 4360, 5701 (if approved)                    E. Chen
Intended Status: Standards Track                            P. Mohapatra
Expires: May 22, 2013                                           K. Patel
                                                           Cisco Systems
                                                       November 21, 2012

             Revised Error Handling for BGP UPDATE Messages
                  draft-ietf-idr-error-handling-03.txt

Abstract

   According to the base BGP specification, a BGP speaker that receives
   an UPDATE message containing a malformed attribute is required to
   reset the session over which the offending attribute was received.
   This behavior is undesirable as a session reset would impact not only
   routes with the offending attribute, but also other valid routes
   exchanged over the session.  This document partially revises the
   error handling for UPDATE messages, and provides guidelines for the
   authors of documents defining new attributes.  Finally, it revises
   the error handling procedures for a number of existing attributes.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on May 22, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1.  Introduction

   According to the base BGP specification [RFC4271], a BGP speaker that
   receives an UPDATE message containing a malformed attribute is
   required to reset the session over which the offending attribute was
   received.  This behavior is undesirable as a session reset would
   impact not only routes with the offending attribute, but also other
   valid routes exchanged over the session.  In the case of optional
   transitive attributes, the behavior is especially troublesome and may
   present a potential security vulnerability.  The reason is that such
   attributes may have been propagated without being checked by
   intermediate routers that do not recognize them.  In effect, the
   attributes may have been tunneled, and when they do reach a router
   that recognizes and checks them, the session that is reset may not be
   associated with the router that is at fault.

   The goal for revising the error handling for UPDATE messages is to
   minimize the impact on routing by a malformed UPDATE message, while
   maintaining protocol correctness to the extent possible.  This can be
   achieved largely by maintaining the established session and keeping
   the valid routes exchanged, but removing the routes carried in the
   malformed UPDATE from the routing system.

   This document partially revises the error handling for UPDATE
   messages, and provides guidelines for the authors of documents
   defining new attributes.  Finally, it revises the error handling
   procedures for a number of existing attributes.  Specifically, the
   error handling procedures of [RFC4271], [RFC1997], [RFC4360] and
   [RFC5701] are revised.

1.1.  Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Revision to Base Specification

   The first paragraph of Section 6.3 of [RFC4271] is revised as
   follows:

   Old Text:

      All errors detected while processing the UPDATE message MUST be
      indicated by sending the NOTIFICATION message with the Error Code
      UPDATE Message Error.  The error subcode elaborates on the specific
      nature of the error.

   New Text:

      An error detected while processing the UPDATE message for which a
      session reset is specified MUST be indicated by sending the
      NOTIFICATION message with the Error Code UPDATE Message Error.
      The error subcode elaborates on the specific nature of the error.

   The error handling of the following case described in Section 6.3 of
   [RFC4271] remains unchanged:

      If the Withdrawn Routes Length or Total Attribute Length
      is too large (i.e., if Withdrawn Routes Length + Total Attribute
      Length + 23 exceeds the message Length), then the Error Subcode
      MUST be set to Malformed Attribute List.

   The error handling of the following case described in Section 6.3 of
   [RFC4271] is revised

      If any recognized attribute has Attribute Flags that conflict with
      the Attribute Type Code, then the Error Subcode MUST be set to
      Attribute Flags Error.  The Data field MUST contain the erroneous
      attribute (type, length, and value).

   as follows:

      If the "Optional bit" or the "Transitive bit" in the Attribute
      Flags for an attribute conflicts with the Attribute Type Code,
      then the error SHOULD be logged, and the conflicting bit in the
      Attribute Flags MUST be reset to the correct value.  The UPDATE
      message MUST continue to be processed.

   The error handling of all other cases described in Section 6.3 of
   [RFC4271] that specify a session reset is revised as follows.

   When a path attribute (other than the MP_REACH_NLRI attribute
   [RFC4760] or the MP_UNREACH_NLRI attribute [RFC4760]) in an UPDATE
   message is determined to be malformed, the UPDATE message containing
   that attribute MUST be treated as though all contained routes had
   been withdrawn, just as if they had been listed in the WITHDRAWN
   ROUTES field (or in the MP_UNREACH_NLRI attribute if appropriate) of
   the UPDATE message, thus causing them to be removed from the Adj-RIB-
   In according to the procedures of [RFC4271].  In the case of an
   attribute that has no effect on route selection or installation, the
   malformed attribute MAY instead be discarded and the UPDATE message
   continues to be processed.  For the sake of brevity, the former
   approach is termed "treat-as-withdraw", and the latter "attribute
   discard".

   If any of the well-known mandatory attributes are not present in an
   UPDATE message, then the approach of "treat-as-withdraw" MUST be used
   for the error handling.

   The approach of "treat-as-withdraw" MUST be used for the error
   handling of the cases described in Section 6.3 of [RFC4271] that
   specify a session reset and involve any of the following attributes:
   ORIGIN, AS_PATH, NEXT_HOP, MULTI_EXIT_DISC, and LOCAL_PREF.

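   As an informal illustration of the revised Attribute Flags handling
   above, the following Python sketch resets a conflicting "Optional
   bit" or "Transitive bit" and reports whether a change was made, so
   that the caller can log the error and continue processing.  The
   function and table names are hypothetical and are not defined by
   this document.  The bit positions and the per-attribute categories
   are taken from [RFC4271].  Only the attribute types defined in
   [RFC4271] are covered.

```python
# Illustrative sketch only; all names here are hypothetical and are not
# defined by this document.

# Attribute Flags bits as laid out in [RFC4271].
OPTIONAL = 0x80
TRANSITIVE = 0x40

# Expected Optional/Transitive bits for the [RFC4271] attribute types.
EXPECTED_BITS = {
    1: TRANSITIVE,             # ORIGIN (well-known)
    2: TRANSITIVE,             # AS_PATH (well-known)
    3: TRANSITIVE,             # NEXT_HOP (well-known)
    4: OPTIONAL,               # MULTI_EXIT_DISC (optional non-transitive)
    5: TRANSITIVE,             # LOCAL_PREF (well-known)
    6: TRANSITIVE,             # ATOMIC_AGGREGATE (well-known)
    7: OPTIONAL | TRANSITIVE,  # AGGREGATOR (optional transitive)
}


def repair_attribute_flags(type_code, flags):
    """Return (flags, changed) with conflicting bits reset.

    Only the Optional and Transitive bits are touched; the Partial and
    Extended Length bits pass through unchanged.  Unrecognized type
    codes are returned exactly as received.
    """
    expected = EXPECTED_BITS.get(type_code)
    if expected is None:
        return flags, False
    mask = OPTIONAL | TRANSITIVE
    repaired = (flags & ~mask & 0xFF) | expected
    return repaired, repaired != flags
```

   A receiver built along these lines would log the error whenever the
   second element of the result is true, substitute the repaired flags,
   and carry on parsing the same UPDATE message.
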
   The approach of "attribute discard" MUST be used for the error
   handling of the cases described in Section 6.3 of [RFC4271] that
   specify a session reset and involve any of the following attributes:
   ATOMIC_AGGREGATE and AGGREGATOR.

   If the MP_REACH_NLRI attribute or the MP_UNREACH_NLRI attribute
   appears more than once in the UPDATE message, then a NOTIFICATION
   message MUST be sent with the Error Subcode "Malformed Attribute
   List".  If any other attribute appears more than once in an UPDATE
   message, then all the occurrences of the attribute other than the
   first one SHALL be discarded and the UPDATE message will continue to
   be processed.

   When multiple malformed attributes exist in an UPDATE message, if the
   same approach (either "session reset", "treat-as-withdraw", or
   "attribute discard") is specified for the handling of these malformed
   attributes, then the specified approach MUST be used.  Otherwise, the
   approach with the strongest action MUST be used, following the order
   of "session reset", "treat-as-withdraw", and "attribute discard" from
   the strongest to the weakest.

   A document that specifies a new attribute MUST provide specifics
   regarding what constitutes an error for that attribute and how that
   error is to be handled.

   Finally, we observe that in order to use the approach of "treat-as-
   withdraw", the entire NLRI field and/or the MP_REACH_NLRI and
   MP_UNREACH_NLRI attributes need to be successfully parsed.  If this
   is not possible, the procedures of [RFC4271] continue to apply.
   Alternatively, the error handling procedures specified in [RFC4760]
   for disabling a particular AFI/SAFI MAY be followed.

3.  Parsing of NLRI Fields

   To facilitate the determination of the NLRI field in an UPDATE with a
   malformed attribute, the MP_REACH_NLRI or MP_UNREACH_NLRI attribute
   (if present) SHALL be encoded as the very first path attribute in an
   UPDATE.  An implementation, however, MUST still be prepared to
   receive these fields in any position.

   If the encoding of [RFC4271] is used, the NLRI field for the IPv4
   unicast address family is carried immediately following all the
   attributes in an UPDATE.  When such an UPDATE is received, we observe
   that the NLRI field can be determined using the "Message Length",
   "Withdrawn Routes Length" and "Total Attribute Length" (when they are
   consistent) carried in the message instead of relying on the length
   of individual attributes in the message.

4.  Operational Considerations

   Although the "treat-as-withdraw" error-handling behavior defined in
   Section 2 makes every effort to preserve BGP's correctness, we note
   that if an UPDATE received on an IBGP session is subjected to this
   treatment, inconsistent routing within the affected Autonomous System
   may result.  The consequences of inconsistent routing can include
   long-lived forwarding loops and black holes.  While lamentable, this
   issue is expected to be rare in practice, and more importantly is
   seen as less problematic than the session-reset behavior it replaces.

   When a malformed attribute is indeed detected over an IBGP session,
   we RECOMMEND that routes with the malformed attribute be identified
   and traced back to the ingress router in the network where the routes
   were sourced or received externally, and then a filter be applied on
   the ingress router to prevent the routes from being sourced or
   received.  This will help maintain routing consistency in the
   network.

   Even if inconsistent routing does not arise, the "treat-as-withdraw"
   behavior can cause either complete unreachability or sub-optimal
   routing for the destinations whose routes are carried in the affected
   UPDATE message.

   Note that "treat-as-withdraw" is different from discarding an UPDATE
   message.  The latter violates the basic BGP principle of incremental
   update, and could cause invalid routes to be kept.  (See also
   Appendix A.)

   For any malformed attribute that is handled by "attribute discard"
   instead of "treat-as-withdraw", it is critical to consider the
   potential impact of doing so.  In particular, if the attribute in
   question has or may have an effect on route selection or
   installation, the presumption is that discarding it is unsafe, unless
   careful analysis proves otherwise.  The analysis should take into
   account the tradeoff between preserving connectivity and potential
   side effects.

   Because of these potential issues, a BGP speaker MUST provide
   debugging facilities to permit issues caused by a malformed attribute
   to be diagnosed.  At a minimum, such facilities MUST include logging
   an error listing the NLRI involved, and containing the entire
   malformed UPDATE message when such an attribute is detected.  The
   malformed UPDATE message SHOULD be analyzed, and the root cause
   SHOULD be investigated.

5.  Error Handling Procedures for Existing Attributes

5.1.  ORIGIN

   The attribute is considered malformed if its length is not 1, or it
   has an undefined value [RFC4271].

   An UPDATE message with a malformed ORIGIN attribute SHALL be handled
   using the approach of "treat-as-withdraw".

5.2.  AS_PATH

   The error conditions for the attribute have been defined in
   [RFC4271].

   An UPDATE message with a malformed AS_PATH attribute SHALL be handled
   using the approach of "treat-as-withdraw".

5.3.  NEXT_HOP

   The error conditions for the NEXT_HOP attribute have been defined in
   [RFC4271].

   An UPDATE message with a malformed NEXT_HOP attribute SHALL be
   handled using the approach of "treat-as-withdraw".

5.4.  MULTI_EXIT_DISC

   The attribute is considered malformed if its length is not 4
   [RFC4271].

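   As an informal illustration of the malformation tests stated so far
   for ORIGIN (Section 5.1) and MULTI_EXIT_DISC, the following Python
   sketch applies them to the value field of a single attribute.  The
   function and constant names are hypothetical and are not defined by
   this document.  The type codes and the set of defined ORIGIN values
   (IGP, EGP, INCOMPLETE) are taken from [RFC4271].  Attributes whose
   error conditions are defined elsewhere are deliberately left out.

```python
# Illustrative sketch only; all names here are hypothetical and are not
# defined by this document.

ORIGIN = 1
MULTI_EXIT_DISC = 4

# ORIGIN values defined in [RFC4271]: IGP (0), EGP (1), INCOMPLETE (2).
DEFINED_ORIGIN_VALUES = {0, 1, 2}


def is_malformed(type_code, value):
    """Apply the ORIGIN and MULTI_EXIT_DISC tests to a value (bytes)."""
    if type_code == ORIGIN:
        # Length must be exactly 1 and the value must be defined.
        return len(value) != 1 or value[0] not in DEFINED_ORIGIN_VALUES
    if type_code == MULTI_EXIT_DISC:
        # Length must be exactly 4.
        return len(value) != 4
    raise ValueError("attribute type not covered by this sketch")
```

   In this sketch a true result for either attribute would lead the
   caller to apply "treat-as-withdraw" to the UPDATE message.
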
   An UPDATE message with a malformed MULTI_EXIT_DISC attribute SHALL be
   handled using the approach of "treat-as-withdraw".

5.5.  LOCAL_PREF

   The attribute is considered malformed if its length is not 4
   [RFC4271].

   An UPDATE message with a malformed LOCAL_PREF attribute SHALL be
   handled as follows:

   o  using the approach of "attribute discard" if the UPDATE message
      is received from an external neighbor, or

   o  using the approach of "treat-as-withdraw" if the UPDATE message
      is received from an internal neighbor.

   In addition, if the attribute is present in an UPDATE message from an
   external neighbor, the approach of "attribute discard" SHALL be used
   to handle the unexpected attribute in the message.

5.6.  ATOMIC_AGGREGATE

   The attribute SHALL be considered malformed if its length is not 0
   [RFC4271].

   An UPDATE message with a malformed ATOMIC_AGGREGATE attribute SHALL
   be handled using the approach of "attribute discard".

5.7.  AGGREGATOR

   The error conditions specified in [RFC4271] for the attribute are
   revised as follows:

   The AGGREGATOR attribute SHALL be considered malformed if any of the
   following applies:

   o  Its length is not 6 (when the "4-octet AS number capability" is
      not advertised to, or not received from the peer [RFC4893]).

   o  Its length is not 8 (when the "4-octet AS number capability" is
      both advertised to, and received from the peer).

   An UPDATE message with a malformed AGGREGATOR attribute SHALL be
   handled using the approach of "attribute discard".

5.8.  Community

   The error handling of [RFC1997] is revised as follows:

   The Community attribute SHALL be considered malformed if its length
   is nonzero and is not a multiple of 4.

   An UPDATE message with a malformed Community attribute SHALL be
   handled using the approach of "treat-as-withdraw".

5.9.  Extended Community

   The error handling of [RFC4360] is revised as follows:

   The Extended Community attribute SHALL be considered malformed if its
   length is nonzero and is not a multiple of 8.

   An UPDATE message with a malformed Extended Community attribute SHALL
   be handled using the approach of "treat-as-withdraw".

   Note that a BGP speaker MUST NOT treat an unrecognized Extended
   Community Type or Sub-Type as an error.

5.10.  IPv6 Address Specific BGP Extended Community Attribute

   The error handling of [RFC5701] is revised as follows:

   The IPv6 Address Specific Extended Community attribute SHALL be
   considered malformed if its length is nonzero and is not a multiple
   of 20.

   An UPDATE message with a malformed IPv6 Address Specific Extended
   Community attribute SHALL be handled using the approach of "treat-as-
   withdraw".

   Note that a BGP speaker MUST NOT treat an unrecognized IPv6 Address
   Specific Extended Community Type or Sub-Type as an error.

6.  IANA Considerations

   This document makes no request of IANA.

7.  Security Considerations

   This specification addresses the vulnerability of a BGP speaker to a
   potential attack whereby a distant attacker can generate a malformed
   optional transitive attribute that is not recognized by intervening
   routers (which thus propagate the attribute unchecked) but that
   causes session resets when it reaches routers that do recognize the
   given attribute type.

   In other respects, this specification does not change BGP's security
   characteristics.

8.  Acknowledgments

   The authors wish to thank Juan Alcaide, Ron Bonica, Mach Chen, Andy
   Davidson, Bruno Decraene, Dong Jie, Rex Fernando, Joel Halpern, Akira
   Kato, Miya Kohno, Tony Li, Alton Lo, Shin Miyakawa, Tamas Mondal,
   Jonathan Oddy, Robert Raszuk, Yakov Rekhter, Rob Shakir, Naiming
   Shen, Shyam Sethuram, Ananth Suryanarayana, Kaliraj Vairavakkalai and
   Lili Wang for their observations and discussion of this topic, and
   review of this document.

9.  Normative References

   [RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP
              Communities Attribute", RFC 1997, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, February 2006.

   [RFC4893]  Vohra, Q. and E. Chen, "BGP Support for Four-octet AS
              Number Space", RFC 4893, May 2007.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760,
              January 2007.

   [RFC5701]  Rekhter, Y., "IPv6 Address Specific BGP Extended
              Community Attribute", RFC 5701, November 2009.

Appendix A.  Why not discard UPDATE messages?

   A commonly asked question is "why not simply discard the UPDATE
   message instead of treating it like a withdraw?  Isn't that safer and
   easier?"  The answer is that it might be easier, but it would
   compromise BGP's correctness and is therefore unsafe.  Consider the
   following example of what might happen if UPDATE messages carrying
   bad attributes were simply discarded:

      AS1 ---- AS2
         \     /
          \   /
           \ /
           AS3

   o  AS1 prefers to reach AS3 directly, and advertises its route to
      AS2.

   o  AS2 prefers to reach AS3 directly, and advertises its route to
      AS1.

   o  Connections AS3-AS1 and AS3-AS2 fail simultaneously.

   o  AS1 switches to prefer AS2's route, and sends an update message
      which includes a withdraw of its previous announcement.  The
      withdraw is bundled with some advertisements.  It includes a bad
      attribute.  As a result, AS2 ignores the message.

   o  AS2 switches to prefer AS1's route, and sends an update message
      which includes a withdraw of its previous announcement.  The
      withdraw is bundled with some advertisements.  It includes a bad
      attribute.  As a result, AS1 ignores the message.

   The end result is that AS1 forwards traffic for AS3 towards AS2, and
   AS2 forwards traffic for AS3 towards AS1.  This is a permanent (until
   corrected) forwarding loop.

   Although the example above discusses route withdraws, we observe that
   in BGP the announcement of a route also withdraws the route
   previously advertised.  The implicit withdraw can be converted into a
   real withdraw in a number of ways; for example, the previously-
   announced route might have been accepted by policy, but the new
   announcement might be rejected by policy.  For this reason, the same
   concerns apply even if explicit withdraws are removed from
   consideration.

10.  Authors' Addresses

   John G. Scudder
   Juniper Networks

   Email: jgs@juniper.net

   Enke Chen
   Cisco Systems, Inc.

   Email: enkechen@cisco.com

   Pradosh Mohapatra
   Cisco Systems, Inc.

   Email: pmohapat@cisco.com

   Keyur Patel
   Cisco Systems, Inc.

   Email: keyupate@cisco.com