Internet Engineering Task Force                               J. Scudder
Internet-Draft                                          Juniper Networks
Intended status: Standards Track                              C. Appanna
Expires: September 24, 2010                                Cisco Systems
                                                          March 23, 2010


                            Multisession BGP
                   draft-ietf-idr-bgp-multisession-05

Abstract

   This specification augments "Multiprotocol Extensions for BGP-4"
   (MP-BGP) by proposing a mechanism to facilitate the use of multiple
   sessions between a given pair of BGP speakers. Each session is used
   to transport routes related by some session-based attribute such as
   AFI/SAFI. This provides an alternative to the MP-BGP approach of
   multiplexing all routes onto a single connection.

   Use of this approach is expected to provide finer-grained fault
   management and isolation as the BGP protocol is used to support more
   and more diverse services.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 24, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1. Introduction
      1.1. Requirements Language
   2. Definitions
   3. Use of BGP Capability Advertisement
   4. New NOTIFICATION Subcodes
   5. Overview of Operation
      5.1. Using Multisession
         5.1.1. Initiating Connections
            5.1.1.1. Continuing a Redirected Connection
         5.1.2. Accepting Connections
         5.1.3. Collision Detection, Graceful Restart
   6. Backward Compatibility
   7. State Machine
      7.1. Modifications to Connect State and Active State
      7.2. Addition of WaitForOpen State, Deletion of OpenSent State
   8. Discussion
   9. Security Considerations
   10. Acknowledgements
   11. IANA Considerations
   12. References
      12.1. Normative References
      12.2. Informative References
   Authors' Addresses

1. Introduction

   Most BGP [RFC4271] implementations only permit a single ESTABLISHED
   connection to exist with each peer. More precisely, they only permit
   a single ESTABLISHED connection for any given pair of IP endpoints.

   BGP Capabilities [RFC5492] extend BGP to allow diverse information
   (encoded as "capabilities") to be associated with a session.
   In some cases, a capability may relate to the operation of the
   protocol machinery; an example is Route Refresh [RFC2918]. However,
   in other cases a capability may relate specifically to some common
   distinguishing characteristic of the routes carried over the session;
   an example is Multiprotocol BGP [RFC4760].

   Multiprotocol BGP [RFC4760] extends BGP to allow information for
   multiple NLRI families and sub-families to be transported in BGP.
   Routes for different families are distinguished by AFI and SAFI.
   Routes for different families are commonly multiplexed onto a single
   BGP session.

   A common criticism of BGP is the fact that most malformed messages
   cause the session to be terminated. While this behavior is necessary
   for protocol correctness, one may observe that the protocol machinery
   of a given implementation may only be defective with respect to a
   given AFI/SAFI. Thus, it would be desirable to allow the session
   related to that family to be terminated while leaving other AFI/SAFI
   unaffected. As BGP is commonly deployed, this is not possible.

   A second criticism of BGP is that it is difficult or in some cases
   impossible to manage control plane resource contention when BGP is
   used to support diverse services over a single session. In contrast,
   if a single BGP session carries only information for a single service
   (or related set of services) it may be easier to manage such
   contention.

   In this specification, we propose a mechanism by which multiple
   transport sessions may be established between a pair of peers. Each
   transport session is identified by a distinct set of BGP
   capabilities, notably the MP-BGP capability.

   Each session is distinct from a BGP protocol point of view; an error
   or other event on one session has no implications for any other
   session.
   All protocol modifications proposed by this specification take place
   during the OPEN exchange phase of the session; there are no
   modifications to the operation of the protocol once a session
   reaches ESTABLISHED state.

   Although AFI/SAFI is perhaps the most obvious way to group sets of
   routes being exchanged between BGP peers, sessions can also be
   distinguished by other BGP capabilities. In general, any capability
   used in this fashion would be expected to have semantics of
   identifying some common distinguishing characteristic of a set of
   routes, just as AFI/SAFI does; however, specifics are beyond the
   scope of this document. For the sake of clarity, we generally use
   the MP-BGP capability (or interchangeably, AFI/SAFI) in this
   document. Such use is illustrative and is not intended to be
   limiting.

   Routers implementing this specification MUST also implement the base
   criteria that are used to define sessions. For example, if sessions
   based on AFI/SAFI are desired, then routers implementing this
   specification MUST also implement MP-BGP [RFC4760].

1.1. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2. Definitions

   "MP-BGP capability" refers to the capability [RFC5492] with code 1,
   specified in MP-BGP [RFC4760] section 8.

   A BGP speaker is said to "support" some feature or functionality (for
   example, to support this specification, or to support a particular
   AFI/SAFI) when the BGP implementation supports the feature AND the
   feature has not been disabled by configuration.

   The Session Identifier is a capability or group of capabilities that
   will be used to differentiate individual BGP sessions between two IP
   endpoints.
   When the AFI/SAFI is used to distinguish sessions, the MP-BGP
   capability is the session identifier.

   A pair of session identifiers is said to "conflict" when, considering
   them as two sets, there is an intersection between them (either in
   the capabilities or in the values contained within the capabilities)
   but neither is a subset of the other. For example, a pair of MP-BGP
   capabilities is said to conflict when, considering them as two sets
   of AFI/SAFI values, there is an intersection between the sets but
   neither set is a subset of the other.

   A BGP speaker is said to be the "active" speaker for a given
   connection if it was the party that initiated the transport open.
   The active speaker's transport endpoint will typically use an
   ephemeral port number.

   A BGP speaker is said to be the "passive" speaker for a given
   connection if it was the party that received the transport open. The
   passive speaker's transport endpoint typically uses the well-known
   BGP port number, 179, but this document introduces an exception
   detailed in Section 5.1.1.1.

3. Use of BGP Capability Advertisement

   This specification defines the Multisession capability [RFC5492]:

   Capability code (1 octet): 68

   Capability length (1 octet): variable

   Capability value (variable): a one-octet Flags field, followed by a
   two-octet port number if the R bit is set, followed by the list of
   capability codes that define a session.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
   +-+-+-+-+-+-+-+-+
   |G|R| Reserved  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Port number (if R is set)   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           One or more Capability codes (1 octet each)           |
   ~                                                                 ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The most significant bit is defined as the Grouping Support (G) bit.
   It can be used to indicate support for the ability to group multiple
   capability values into one session.
   When set (value 1), this bit indicates that the BGP speaker supports
   grouping. An example of grouping is a BGP speaker that wishes to use
   one session for AFI/SAFI values 1/1, 1/2 and 1/4, and another for
   AFI/SAFI values 2/1, 2/2 and 2/4.

   The next bit is defined as the Redirect (R) bit. When set, it
   indicates that the sender wishes to continue the current BGP session
   using a different transport endpoint. This entails the active
   speaker dropping the current session and starting a fresh one using
   the proposed endpoint; this is detailed in Section 5.1.1.1 below.
   When set, the transport endpoint information is encoded in the port
   number field of the capability as detailed below.

   The remaining bits are reserved, and should be set to zero by the
   sender and ignored by the receiver.

   If the R bit is set, following the reserved bits is the two-octet TCP
   port number to which the passive speaker wishes to redirect the
   session.

   Following the reserved bits and the transport endpoint information,
   if present, is a list of one or more Capability codes defined in BGP.
   The size of the list is inferred from the length of the overall
   capability; it is the capability length minus one if the R bit is not
   set, or minus three if the R bit is set. The capabilities listed
   specify which capabilities in the OPEN message comprise the session
   identifier. The Multisession capability code itself MUST NOT be
   listed; if listed it MUST be ignored upon receipt.

   For example, peers wishing to establish sessions based on AFI/SAFI
   would list only the Multiprotocol Extensions capability code (1).
   In this case the Multisession capability would have a length of two
   octets, or four octets if redirect is being requested.

4. New NOTIFICATION Subcodes

   BGP [RFC4271] Section 4.5 provides a number of subcodes to the
   NOTIFICATION message, and Section 6.2 elaborates on the use of those
   subcodes.

   This specification introduces five new subcodes:

   OPEN Message Error subcodes:

      7 - Capability Value Mismatch

      8 - Grouping Conflict

      9 - Grouping Required

      10 - Redirecting Now

      11 - Redirect Required

   The Capability Value Mismatch code MAY be used when an OPEN message
   received contains one or more capabilities whose values are
   inconsistent with the corresponding capabilities of the local BGP
   speaker. The Data field MUST list the offending capability code(s).

   The Grouping Conflict code MAY be used when an OPEN message contains
   one or more capabilities whose values conflict with the values of one
   or more capability groups configured on the local BGP speaker. The
   Data field MUST indicate one of the conflicting locally-configured
   capability groups, encoded as the appropriate capabilities.

   The Grouping Required code MAY be used when a BGP speaker that is
   configured to require grouping attempts to establish a connection
   with a BGP speaker that does not support grouping. (While it is true
   that it might be possible to communicate much the same information
   using the Unsupported Capability NOTIFICATION message, this more
   explicit method is felt to be more transparent.)

   If the MP-BGP capability is used as the session identifier, the
   notifications could be used as follows:

   Capability Value Mismatch MAY be used when an OPEN message contains
   one or more MP-BGP capabilities, none of which lists an AFI/SAFI
   supported by the local BGP speaker. It is observed that this subcode
   may be useful for MP-BGP speakers in general, even if they do not
   (otherwise) implement this specification.

   The Grouping Conflict code MAY be used when an OPEN message contains
   several MP-BGP capabilities whose AFI/SAFI conflict with one or more
   AFI/SAFI groups configured on the local BGP speaker. The Data field
   MUST indicate one of the conflicting locally-configured AFI/SAFI
   groups, encoded as MP-BGP capabilities. (One might think of this as
   indicating "I'm not willing to combine AFI/SAFI foo and bar as you've
   tried to do.")

   Use of the Redirecting Now and Redirect Required codes is detailed in
   Section 5.1.1.1.

   The use of these subcodes is further elaborated below.

5. Overview of Operation

   The description of operation is divided into two main parts.

   The "Using Multisession" sections below discuss the BGP speaker's
   behavior when the peer does support this specification or is assumed
   to. The "Backward Compatibility" section discusses the BGP speaker's
   behavior when the peer does not support this specification, or is
   assumed not to. Both sections also discuss how to switch to the
   other mode.

   A BGP speaker that supports this specification MUST always advertise
   the Multisession capability, regardless of its peer's known or
   presumed capability set.

   In all cases, until a BGP speaker has initiated or accepted one
   connection from a given peer, it is unknown whether the peer supports
   this specification or not. Two strategies can be considered for
   making this initial determination: either the BGP speaker can
   initially assume that the peer does not support this specification,
   and switch modes if it is discovered that it does, or vice versa.
   Either approach is acceptable.

   As discussed previously, this section describes the operation from
   the point of view of the MP-BGP capability and the associated
   AFI/SAFI values as the session identifier.
   It can be replaced with any other capability or groups of
   capabilities without any changes to the behavior described below.

   Note that if a BGP speaker only wishes to support a single AFI/SAFI
   in its communications with a given peer, only one session is needed
   in any case, and so the "multisession" feature is moot. In such a
   case the behavior required would be indistinguishable from that given
   in the "backward compatibility" section below. In the illustrative
   examples in the following sections, it is generally assumed that a
   BGP speaker does wish to support multiple AFI/SAFI in its
   communications with a given peer.

5.1. Using Multisession

   The following subsections discuss a BGP speaker's behavior towards a
   peer that is known or assumed to support this specification.

5.1.1. Initiating Connections

   When a BGP speaker (the "active" speaker) attempts BGP communication
   with its peer (the "passive" speaker), it initiates one connection
   per group of AFI/SAFI it wishes to support. (This implies that a new
   local TCP port will be allocated for each new connection.) The OPEN
   sent on each connection MUST include the Multisession capability and
   one or more MP-BGP capabilities indicating the AFI/SAFI to be
   supported on that session. If a non-trivial group of AFI/SAFI (i.e.,
   a group of two or more) is proposed, the BGP speaker MUST also set
   the G bit of the Multisession capability. Even if a trivial group of
   AFI/SAFI is proposed, the G bit SHOULD be set if grouping is
   supported. The active speaker MUST NOT set the R bit nor include an
   associated TCP port number.

   Note that any "group of AFI/SAFI" may be a singleton group, i.e., the
   speaker may wish to use a separate BGP connection for each AFI/SAFI.
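
   The encoding of the Multisession capability carried in these OPEN
   messages can be sketched as follows. This is a minimal illustration
   in Python, not part of the specification: the class name
   Multisession, its field names, and the example port 1179 are
   assumptions introduced here. The layout (flags octet with G as the
   most significant bit and R as the next bit, optional two-octet port,
   then one octet per capability code) follows Section 3.

```python
import struct
from dataclasses import dataclass
from typing import Optional, Tuple

MULTISESSION_CODE = 68  # capability code defined in Section 3
MP_BGP_CODE = 1         # Multiprotocol Extensions capability code
G_BIT = 0x80            # Grouping Support: most significant flag bit
R_BIT = 0x40            # Redirect: next flag bit


@dataclass(frozen=True)
class Multisession:
    """Hypothetical in-memory form of the Multisession capability."""
    grouping: bool
    session_id_codes: Tuple[int, ...]
    redirect_port: Optional[int] = None  # only a passive speaker sets this

    def encode(self) -> bytes:
        """Return the capability as code, length, value octets."""
        if not self.session_id_codes:
            raise ValueError("at least one capability code is required")
        if MULTISESSION_CODE in self.session_id_codes:
            raise ValueError("the Multisession code MUST NOT be listed")
        flags = G_BIT if self.grouping else 0
        value = b""
        if self.redirect_port is not None:
            flags |= R_BIT
            value = struct.pack("!H", self.redirect_port)
        value = bytes([flags]) + value + bytes(self.session_id_codes)
        return bytes([MULTISESSION_CODE, len(value)]) + value

    @classmethod
    def decode(cls, tlv: bytes) -> "Multisession":
        """Parse one capability; reserved flag bits are ignored."""
        code, length = tlv[0], tlv[1]
        value = tlv[2:2 + length]
        if code != MULTISESSION_CODE or len(value) != length or length < 2:
            raise ValueError("not a well-formed Multisession capability")
        flags, offset, port = value[0], 1, None
        if flags & R_BIT:
            if length < 4:
                raise ValueError("R bit set but no room for port and codes")
            (port,) = struct.unpack_from("!H", value, 1)
            offset = 3
        # A listed Multisession code MUST be ignored upon receipt.
        codes = tuple(c for c in value[offset:] if c != MULTISESSION_CODE)
        return cls(bool(flags & G_BIT), codes, port)


# Active speaker: G set, R clear, session identifier is MP-BGP only.
active_open_cap = Multisession(grouping=True, session_id_codes=(MP_BGP_CODE,))
# Passive speaker asking to continue on an illustrative port.
redirect_cap = Multisession(True, (MP_BGP_CODE,), redirect_port=1179)
```

   With MP-BGP as the only listed code, the encoded length octet is two
   without redirect and four with it, matching the example in Section 3.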

   If the peer also supports this specification and also wishes to
   support the AFI/SAFI in question, it will respond with an OPEN that
   includes the Multisession capability and the AFI/SAFI included in the
   active speaker's OPEN. If the active speaker's OPEN included a
   non-trivial group of AFI/SAFI that the peer supports, then the peer's
   Multisession capability will have the G bit set.

   If the peer also supports this specification and wishes to support
   some but not all of the AFI/SAFI in question, it will respond with an
   OPEN that includes the Multisession capability and a subset of
   AFI/SAFI included in the active speaker's OPEN. The reason for
   listing only a subset may be because some of the AFI/SAFI are simply
   not supported, or because the peer does not wish to support the
   AFI/SAFI as a group (i.e., it may be configured to use a smaller
   group). In this case, the BGP speaker MAY consider the set of
   AFI/SAFI that were not included in the peer's OPEN to form a new
   group, and MAY try to initiate a new session using that group.

   If the peer also supports this specification but does not support
   grouping, and a non-trivial group of AFI/SAFI has been proposed, then
   it will respond as given in the previous paragraph but with the
   additional proviso that the G bit will be clear. In this case, the
   BGP speaker MAY accept the connection as given in the previous
   paragraph, or it MAY reply with a NOTIFICATION message with Error
   Code OPEN Message Error and Error Subcode Grouping Required, and the
   connection will be closed.

   If the peer wishes to continue the BGP connection on a different
   transport endpoint, in addition to responding as detailed above, it
   will set the R bit and will include the TCP port number that should
   be used to continue the connection. See Section 5.1.1.1 for details
   regarding how this is handled.
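
   The active speaker's choices after receiving the peer's OPEN, as
   described in the preceding paragraphs, might be sketched like this.
   It is a rough illustration only: the function name plan_after_reply,
   the require_grouping flag, and the returned labels are assumptions
   made here, and redirect handling (Section 5.1.1.1) is left out.

```python
from typing import FrozenSet, Iterable, Tuple

Family = Tuple[int, int]  # (AFI, SAFI)


def plan_after_reply(
    proposed: Iterable[Family],
    accepted: Iterable[Family],
    peer_g_bit: bool,
    require_grouping: bool = False,
) -> Tuple[str, FrozenSet[Family]]:
    """Decide how the active speaker proceeds on this connection.

    Returns ("grouping_required", empty set) when local policy (an
    assumed configuration knob) insists on grouping and the peer cleared
    the G bit for a non-trivial group. Otherwise returns ("continue",
    leftover), where leftover holds the AFI/SAFI the peer omitted; the
    speaker MAY try a new session for that leftover group.
    """
    proposed_set = frozenset(proposed)
    accepted_set = frozenset(accepted)
    if len(proposed_set) > 1 and not peer_g_bit and require_grouping:
        return ("grouping_required", frozenset())
    return ("continue", proposed_set - accepted_set)


# The peer accepted 1/1 and 1/2 but left out 2/1.
decision = plan_after_reply({(1, 1), (1, 2), (2, 1)},
                            {(1, 1), (1, 2)}, peer_g_bit=True)
```

   Whether to retry the leftover AFI/SAFI as one group or as several
   sessions is a local decision; the specification only says the speaker
   MAY do so.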

   If the peer does not wish to support the AFI/SAFI in question, it
   will reply with a NOTIFICATION message with Error Code OPEN Message
   Error and Error Subcode Capability Value Mismatch, and the
   connection will be closed.

   A BGP speaker MUST NOT attempt to initiate connections for any
   AFI/SAFI for which a connection already exists.

   If the peer does not support this specification, it will respond with
   an OPEN that does not include the Multisession capability. In this
   case the connection SHOULD be terminated, and future connections to
   the peer should be attempted in the "backward compatibility" mode
   discussed in Section 6.

5.1.1.1. Continuing a Redirected Connection

   When the active speaker receives an OPEN from the passive speaker
   that includes transport redirect information, it MUST reply with an
   OPEN Message Error NOTIFICATION with its subcode set to Redirecting
   Now and close the session. Subsequently, it MUST attempt to initiate
   a new session using the transport endpoint that the passive speaker
   has proposed in lieu of the original one (which typically would have
   been the well-known BGP port, 179). The new session should proceed
   exactly as the original one did; that is, the active speaker SHOULD
   send an OPEN with the same content, and can expect to receive from
   the passive speaker an OPEN with the same content as previously, with
   the exception that the R bit should be clear and no associated port
   number should be present. If the R bit is not clear, it (and the
   accompanying port number) SHOULD be disregarded.

   Note that although the OPEN messages exchanged on the reinitiated
   session can be expected to be the same as or similar to those from
   the previous session as discussed above, an implementation MUST NOT
   rely on or enforce this assumption when handling the received OPEN.
   The new session MUST be handled as any other new session would be in
   this respect.

   As discussed above, when the passive speaker requests a redirect, the
   active speaker is expected to drop the current session and initiate a
   new one. If it does not do so, the passive speaker MAY elect to
   continue the session, or it MAY elect to terminate the session by
   sending a Redirect Required NOTIFICATION.

5.1.2. Accepting Connections

   When processing a connection attempt, the BGP speaker MUST wait until
   the peer's OPEN message has been received before proceeding. This is
   at variance with the behavior specified in the finite state machine
   (FSM) of [RFC4271], but is interoperable with that FSM. The FSM
   changes are specified in Section 7.

   Once the peer's OPEN message has been received, if it includes the
   Multisession capability and one or more MP-BGP capabilities
   indicating a group of AFI/SAFI that the BGP speaker wishes to
   support, then the BGP speaker responds with an OPEN message that
   includes the Multisession capability and one or more MP-BGP
   capabilities indicating the same AFI/SAFI.

   If the OPEN includes the Multisession capability and one or more
   MP-BGP capabilities indicating a group of AFI/SAFI that conflicts
   with an AFI/SAFI grouping that has been configured on the BGP
   speaker, then the BGP speaker MAY reply with an OPEN listing a set of
   AFI/SAFI that intersect with those proposed by the peer (in effect
   overriding the locally configured set) or it MAY close the connection
   with a NOTIFICATION message with Error Code OPEN Message Error and
   Error Subcode Grouping Conflict. The former behavior is suggested as
   the default if grouping is supported.

   If the BGP speaker does not support AFI/SAFI grouping it MAY reply
   with an OPEN listing one of the AFI/SAFI out of those proposed by the
   peer. It MUST also set the G bit in the Multisession capability to
   zero.
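
   The conflict test from Section 2 and the passive speaker's two
   permitted reactions to a grouping conflict can be sketched as
   follows. This is an illustrative Python sketch under stated
   assumptions: the names conflicts and resolve_grouping_conflict, the
   override flag, and the choice to answer with exactly the
   intersection of the proposed and configured groups are introduced
   here and are not mandated by this specification, which only requires
   that the answered set intersect the peer's proposal.

```python
from typing import FrozenSet, Iterable, Optional, Tuple, Union

Family = Tuple[int, int]  # (AFI, SAFI)
GROUPING_CONFLICT = 8     # OPEN Message Error subcode from Section 4


def conflicts(a: Iterable[Family], b: Iterable[Family]) -> bool:
    """True when the sets intersect but neither is a subset of the other."""
    set_a, set_b = frozenset(a), frozenset(b)
    return bool(set_a & set_b) and not set_a <= set_b and not set_b <= set_a


def resolve_grouping_conflict(
    proposed: Iterable[Family],
    local_groups: Iterable[Iterable[Family]],
    override: bool = True,
) -> Optional[Tuple[str, Union[FrozenSet[Family], int]]]:
    """Pick the passive speaker's reaction to the first conflicting group.

    Returns None when nothing conflicts. With override=True (the
    suggested default when grouping is supported) the reply is an OPEN
    for the shared AFI/SAFI; otherwise it is a NOTIFICATION carrying the
    Grouping Conflict subcode.
    """
    proposed_set = frozenset(proposed)
    for group in local_groups:
        group_set = frozenset(group)
        if conflicts(proposed_set, group_set):
            if override:
                return ("open", proposed_set & group_set)
            return ("notification", GROUPING_CONFLICT)
    return None


# Peer proposes {1/1, 2/1}; the local configuration groups {1/1, 1/2}.
reply = resolve_grouping_conflict({(1, 1), (2, 1)}, [{(1, 1), (1, 2)}])
```

   A proposal that is a strict subset of a configured group does not
   conflict under the Section 2 definition, so the sketch returns None
   for it and leaves that case to the ordinary acceptance path.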

   If the passive speaker wishes to continue the session for this
   particular grouping on a different port number, it sets the R bit in
   its OPEN and includes the TCP port number on which it will continue
   the session. The passive speaker MUST be prepared to accept a
   connection on the given port immediately following transmission of
   its OPEN.

   If the received OPEN message does not include any MP-BGP capability
   indicating an AFI/SAFI the BGP speaker wishes to support, it SHOULD
   close the connection with a NOTIFICATION message with Error Code OPEN
   Message Error and Error Subcode Capability Value Mismatch.

   If the received OPEN message does not include the Multisession
   capability, then the peer does not support this specification. The
   connection MAY be continued in the "backward compatibility" mode
   discussed in Section 6, or it MAY be terminated and future
   connections to the peer attempted in the "backward compatibility"
   mode.

5.1.3. Collision Detection, Graceful Restart

   [RFC4271] Section 6.8 (BGP connection collision detection) considers
   a pair of connections to have collided if the source and destination
   IP addresses of both connections match. With respect to peers that
   support this specification, the AFI/SAFI groups associated with the
   connections must also intersect for them to be considered to have
   collided.

   This consideration also applies to Section 4.2 of BGP Graceful
   Restart [RFC4724], when determining whether a new connection should
   be considered equivalent to a reset of a previous TCP session.

6. Backward Compatibility

   This section discusses a BGP speaker's behavior towards a peer
   that is known or assumed not to support this specification. In
   short, the BGP speaker's behavior towards such a peer should be as
   otherwise defined for the BGP protocol, according to [RFC4271] and
   any other extension supported by the BGP speaker.

   As previously mentioned, the BGP speaker SHOULD always advertise the
   Multisession capability in its OPEN message, even towards "backward
   compatibility" peers.

   If, in opening a BGP connection with such a peer, an OPEN that
   includes the Multisession capability is received from the peer, then
   the peer SHOULD be changed to "multisession" mode. How this is done
   depends on whether the BGP speaker has already sent an OPEN or not:

   If the BGP speaker has not yet sent an OPEN to the peer, then the
   connection MAY be continued in the "multisession" mode discussed
   above, or it MAY be terminated and future connections to the peer
   attempted in "multisession" mode.

   If the BGP speaker has sent an OPEN to the peer, then the current
   session SHOULD be terminated and future connections to the peer
   attempted in "multisession" mode.

   Use of techniques such as dynamic capabilities
   [I-D.ietf-idr-dynamic-cap] for on-the-fly switching of session modes
   is beyond the scope of this document.

7. State Machine

   As mentioned under "accepting connections" above, this specification
   modifies the BGP finite state machine, albeit in a
   backward-compatible fashion.

   In addition, note that one state machine is considered to exist for
   each of the connections that may exist to a given peer. This implies
   that, for example, any session flap dampening that may exist is
   performed per session identifier.

   The specific state machine modifications to [RFC4271] Section 8.2.2
   are as follows.

7.1. Modifications to Connect State and Active State

   In the actions in response to the events Open Delay timer expires
   [Event 12] and TCP connection succeeds [Event 16 or Event 17], an
   OPEN is not sent and the state changes to WaitForOpen and not to
   OpenSent.

7.2. Addition of WaitForOpen State, Deletion of OpenSent State

   The WaitForOpen state is identical in all respects to OpenSent,
   except for the action in response to reception of a valid OPEN
   message [Event 19]. In that event, the local system sends an OPEN
   message prior to sending a KEEPALIVE message.

   The OpenSent state is deleted. All references to OpenSent are
   replaced by references to WaitForOpen.

8. Discussion

   Note that many BGP implementations already permit multiple sessions
   to be used between a given pair of routers, typically by configuring
   multiple IP addresses on each router and configuring each session to
   be bound to a different IP address. The principal contribution of
   this specification is to allow multiple sessions to be created
   automatically, without additional configuration overhead or address
   consumption.

   The specification supports the simple case of one capability being
   used as the session identifier and one connection per session
   identifier value. It also permits connections to be established
   based on multiple capabilities as a session identifier, with multiple
   values per capability grouped together per connection.

   In the context of MP-BGP based connections, which we believe may be
   the most prevalent use of this specification, it permits supporting
   one AFI/SAFI per connection, and also permits arbitrary grouping of
   AFI/SAFI onto BGP connections. For such grouping to function as
   intended, both peers participating in a connection need to agree on
   what AFI/SAFI groupings will be used. If conflicting groupings are
   configured, the connections may not establish, or more connections
   may be established than were expected (in the degenerate case, one
   connection per AFI/SAFI could be established despite configured
   groupings).
   We observe that the potential for misbehavior in the presence of
   conflicting configuration is not unusual in BGP, and that support
   for, and configuration of, grouping is purely optional.

9. Security Considerations

   The ability to redirect to a port other than the well-known BGP port
   implies that a legitimate BGP session may exist for which neither
   port is equal to 179. This may have implications for firewall
   filters used to protect the control processor.

   In other respects, this document does not change the BGP security
   model.

10. Acknowledgements

   The authors would like to thank Pedro Marques, Keyur Patel, Robert
   Raszuk, Yakov Rekhter and David Ward for their valuable comments.

11. IANA Considerations

   IANA has allocated BGP Capability Code 68 as the Multisession BGP
   Capability.

   This document requests IANA to allocate five new OPEN Message Error
   subcodes:

      7 - Capability Value Mismatch

      8 - Grouping Conflict

      9 - Grouping Required

      10 - Redirecting Now

      11 - Redirect Required

12. References

12.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4724]  Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y.
              Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724,
              January 2007.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760,
              January 2007.

   [RFC5492]  Scudder, J. and R. Chandra, "Capabilities Advertisement
              with BGP-4", RFC 5492, February 2009.

12.2. Informative References

   [I-D.ietf-idr-dynamic-cap]
              Chen, E. and S. Ramachandra, "Dynamic Capability for
              BGP-4", draft-ietf-idr-dynamic-cap-10 (work in progress),
              January 2010.

   [RFC2918]  Chen, E., "Route Refresh Capability for BGP-4", RFC 2918,
              September 2000.

Authors' Addresses

   John G. Scudder
   Juniper Networks

   Email: jgs@juniper.net


   Chandra Appanna
   Cisco Systems

   Email: achandra@cisco.com