IDR Working Group                                            D. Freedman
Internet-Draft                                                  Claranet
Intended status: Standards Track                               R. Raszuk
Expires: September 1, 2012                                  NTT MCL Inc.
                                                               R. Shakir
                                                                      BT
                                                       February 29, 2012


                        BGP OPERATIONAL Message
                 draft-ietf-idr-operational-message-00

Abstract

   The BGP Version 4 routing protocol (RFC4271) is now used in many
   ways, crossing boundaries of administrative and technical
   responsibility.

   The protocol lacks an operational messaging plane which could be
   utilised to diagnose, troubleshoot and inform upon various conditions
   across these boundaries, securely, during protocol operation, without
   disruption.

   This document proposes a new BGP message type, the OPERATIONAL
   message, which can be used to effect such a messaging plane for use
   both between and within Autonomous Systems.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 1, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Applications
   3.  BGP OPERATIONAL message
     3.1.  BGP OPERATIONAL message capability
     3.2.  BGP OPERATIONAL message encoding
     3.3.  PRI Format
     3.4.  BGP OPERATIONAL message TLVs
       3.4.1.  ADVISE TLVs
       3.4.2.  STATE TLVs
       3.4.3.  DUMP TLVs
       3.4.4.  CONTROL TLVs
   4.  Use of the ADVISE TLVs
   5.  Use of the STATE TLVs
     5.1.  Utilising STATE TLVs for Cross-Domain Debugging
           Functionality
     5.2.  Utilising STATE TLVs in the context of Error Handling
   6.  Use of the DUMP TLVs
   7.  Error Handling
   8.  Security considerations
   9.  IANA Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   In this document, a new BGP message type, the OPERATIONAL message, is
   defined, creating a communication channel over which messages can be
   passed, using a series of contained TLV elements.

   The messages can be human readable, for the attention of device
   operators, or machine readable, in order to provide simple self test
   routines, which can be exchanged between BGP speakers.

   A number of TLV elements will be assigned to provide for these
   message types, along with TLV elements to assist with description of
   the message data, such as precisely describing BGP prefixes and
   encapsulating BGP UPDATE messages to be sent back for inspection in
   order to troubleshoot session malfunctions.

   The use of OPERATIONAL messages will be negotiated by BGP Capability
   [RFC5492].  Since the messages are in-band with the BGP session, they
   can be assumed to be authenticated as originating directly from the
   BGP neighbor.

   The goal of this document is to provide a simple, extensible
   framework within which new messaging and diagnostic requirements can
   live.

2.  Applications

   The authors would like to propose three main applications which BGP
   OPERATIONAL TLVs are designed to address.  New TLVs can be easily
   added to further enhance current applications or to propose new
   applications.

   The set of TLVs is organised in the following four functional groups
   comprising the three applications and some control messaging:

   o  ADVISE TLVs, designed to convey human readable information to be
      passed across boundaries to operators, to inform them of past or
      upcoming error conditions, or provide other relevant, in-band
      operational information.  The "Advisory Demand Message" ADM
      (Section 3.4.1.1) is an example of this.

   o  STATE TLVs, designed to carry information about BGP state across
      BGP neighbors, including both per-neighbor and global counters.

   o  DUMP TLVs, designed to describe or encapsulate data to assist in
      realtime or post-mortem diagnostics, such as structured
      representations of affected prefixes / NLRI and encapsulated raw
      UPDATE messages for inspection.

   o  CONTROL TLVs, designed to facilitate control messaging such as
      replies to requests which cannot be satisfied.

   The means of reporting information carried by these TLVs, in either
   reply or request processing, are implementation specific but could
   include methods such as SYSLOG.

3.  BGP OPERATIONAL message

3.1.  BGP OPERATIONAL message capability

   A BGP speaker that is willing to exchange BGP OPERATIONAL Messages
   with a neighbor should advertise the new OPERATIONAL Message
   Capability to the neighbor using BGP Capabilities advertisement
   [RFC5492].  A BGP speaker may send an OPERATIONAL message to its
   neighbor only if it has received the OPERATIONAL message capability
   from that neighbor.

   The Capability Code for this capability is specified in the IANA
   Considerations section of this document.

   The Capability Length field of this capability is 2 octets.

                   +------------------------------+
                   | Capability Code (1 octet)    |
                   +------------------------------+
                   | Capability Length (1 octet)  |
                   +------------------------------+

                 OPERATIONAL message BGP Capability Format

3.2.  BGP OPERATIONAL message encoding

   The BGP message as defined in [RFC4271] begins with a fixed-size
   header which includes a two octet length field and a one octet type
   field.  The RFC limits the maximum message size to 4096 octets.  As
   one of the applications of the BGP OPERATIONAL message (through the
   MUD (Section 3.4.3.3) message) is to be able to carry an entire,
   potentially malformed BGP UPDATE, this specification mandates that
   when the neighbor has negotiated the BGP OPERATIONAL message
   capability, any further BGP message which may be subject to enclosure
   within a BGP OPERATIONAL message must be sent with the maximum size
   reduced to accommodate the potential need for additional wrapping
   header space.  This is applicable both to the current BGP maximum
   message size limit and to any future modifications.

   For the purpose of the OPERATIONAL message information encoding we
   will use one or more Type-Length-Value containers where each TLV will
   have the following format:

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Type (2 octets)        |       Length (2 octets)       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Value (Variable)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      OPERATIONAL message TLV Format

   TYPE: 2 octet value indicating the TLV type

   LENGTH: 2 octet value indicating the TLV length in octets

   VALUE: Variable length value field depending on the type of the TLVs
   carried.

   To work around continued BGP churn issues, some types of TLVs will
   need to contain a sequence number to correlate a request with
   associated replies.  The sequence number will consist of 8 octets and
   will be of the form: (4 octet bgp_router_id) + (local 4 octet
   number).  When the local 4 octet number reaches 0xFFFF it should
   restart from 0x0000.  The sequence number is only included if the TLV
   requires sequencing.
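
   The TLV container and the sequence number can be sketched as
   follows.  This is a minimal illustration rather than a reference
   implementation.  The names encode_tlv, decode_tlvs and
   SequenceNumbers are hypothetical, the Length field is assumed to
   count only the octets of the Value field, and the wrap point of the
   local counter is taken literally from the text above (0xFFFF) and
   left configurable.

```python
import struct
from ipaddress import IPv4Address

TLV_HEADER = struct.Struct("!HH")  # Type (2 octets), Length (2 octets)


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Pack one TLV; Length is assumed to count only the Value octets."""
    return TLV_HEADER.pack(tlv_type, len(value)) + value


def decode_tlvs(payload: bytes) -> list[tuple[int, bytes]]:
    """Split an OPERATIONAL message body into (type, value) pairs."""
    tlvs = []
    offset = 0
    while offset < len(payload):
        if len(payload) - offset < TLV_HEADER.size:
            raise ValueError("truncated TLV header")
        tlv_type, length = TLV_HEADER.unpack_from(payload, offset)
        offset += TLV_HEADER.size
        if len(payload) - offset < length:
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, payload[offset:offset + length]))
        offset += length
    return tlvs


class SequenceNumbers:
    """Generates 8-octet sequence numbers: router ID + local counter."""

    def __init__(self, router_id: str, wrap_at: int = 0xFFFF) -> None:
        self._router_id = IPv4Address(router_id).packed  # 4 octets
        self._wrap_at = wrap_at  # assumption: wrap point as stated in the text
        self._local = 0

    def next(self) -> bytes:
        seq = self._router_id + struct.pack("!I", self._local)
        # Restart from zero once the wrap point itself has been handed out.
        self._local = 0 if self._local >= self._wrap_at else self._local + 1
        return seq
```

   A requester would call next() once per request TLV and keep the
   returned 8 octets so that a later reply can be matched against them.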

   The typical application scenario for use of the sequence number is
   for it to be included in a request TLV and copied into the associated
   reply messages, in order to correlate requests with their replies.

3.3.  PRI Format

   Prefix Reachability Indicators (PRI) are used to represent prefix
   NLRI and BGP attributes in a request and only prefix NLRI in a
   response, in this draft.

   Each PRI is encoded as a 3-tuple of the form <Flags, Payload Type,
   Payload> whose fields are described below:

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Flags     | Payload Type  |      Payload (Variable)       |
   +---------------------------------------------------------------+
   |                      Payload (Variable)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The use and the meaning of these fields are as follows:

   a) Flags:
      Four bits indicating NLRI Reachability:

      0 1 2 3 4 5 6 7 8
      +-+-+-+-+-+-+-+-+
      |R|I|O|L| Resvd |
      +-+-+-+-+-+-+-+-+

      aa) R Bit:
         The R (Reachable) bit, if set, represents that the prefixes
         were deemed reachable in the NLRI, else represents that the
         prefixes were deemed unreachable.  This bit is meaningless in
         the context of all currently defined requests and can thus only
         be found in a response.  If found in a request an
         implementation MUST ignore its state.

      ab) I Bit:
         The I (Adj-RIB-In) bit, if set in a query, indicates that the
         requestor wishes for the response to be found in the Adj-RIB-In
         of the neighbor representing this session; if cleared, it
         indicates that the Adj-RIB-In of the neighbor representing this
         session is not searched.  If set in a response, it indicates
         that the Adj-RIB-In of the neighbor representing this session
         contained this information; if cleared, it did not.

      ac) O Bit:
         The O (Adj-RIB-Out) bit, if set in a query, indicates that the
         requestor wishes for the response to be found in the Adj-RIB-
         Out of the neighbor representing this session; if cleared, it
         indicates that the Adj-RIB-Out of the neighbor representing
         this session is not searched.  If set in a response, it
         indicates that the Adj-RIB-Out of the neighbor representing
         this session contained this information; if cleared, it did
         not.

      ad) L Bit:
         The L (Loc-RIB) bit, if set in a query, indicates that the
         requestor wishes for the response to be found in the BGP Loc-
         RIB of the neighbor; if cleared, it indicates that the Loc-RIB
         of the neighbor is not searched.  If set in a response, it
         indicates that the Loc-RIB of the neighbor contained this
         information; if cleared, it did not.

      The rest of the field is reserved for future use.

   b) Payload Type:
      This one octet type specifies the type and geometry of the
      payload.

      ba) Type 0 - NLRI:
         The payload contains (perhaps multiple) NLRI.  The format of
         each NLRI is as defined in the base specification of such NLRI
         appropriate for the AFI/SAFI.

      bb) Type 1 - Next Hop:
         The payload contains a Next Hop address, appropriate for the
         AFI/SAFI.  When used in an SSQ (Section 3.4.2.7) message the
         response is expected to contain prefixes from the selected RIBs
         which contain this next-hop in their next-hop attribute.

      bc) Type 2 - AS Number:
         The payload contains a 16 or 32 bit AS number (as defined in
         [RFC4893]).  When used in an SSQ message the response is
         expected to contain prefixes from the selected RIBs which
         contain this AS number in their AS_PATH or AS4_PATH (as
         appropriate) attributes.

      bd) Type 3 - Standard Community:
         The payload contains a standard community (as defined in
         [RFC1997]).  When used in an SSQ message the response is
         expected to contain prefixes from the selected RIBs which
         contain this standard community in their communities attribute.

      be) Type 4 - Extended Community:
         The payload contains an extended community (as defined in
         [RFC4360]).  When used in an SSQ message the response is
         expected to contain prefixes from the selected RIBs which
         contain this extended community in their extended communities
         attribute.

      bf) Types 5-65535 - Reserved:
         Types 5-65535 are reserved for future use.

   c) Payload:
      Contains the actual payload, as defined by the payload type.  The
      payload is of variable length, to be calculated from the remaining
      TLV length.

   PRI are used for both request and response modes.  A response MUST
   only contain an NLRI (type 0) payload, but a request MAY contain
   payloads specifying a type to search for.  An implementation MUST
   validate all PRI it receives in a request against the type of request
   which was made.

   An implementation MUST NOT send a PRI in response with no NLRI (type
   0) payload; this is considered to be invalid.  If the implementation
   wishes to signal that a request did not yield any valid results, it
   MAY respond with an NS TLV (Section 3.4.4.2), using the "Not Found"
   subcode, for example.

3.4.  BGP OPERATIONAL message TLVs

3.4.1.  ADVISE TLVs

   ADVISE TLVs convey human readable information to be passed across
   boundaries to operators, to inform them of past or upcoming error
   conditions, or provide other relevant, in-band operational
   information.

3.4.1.1.  Advisory Demand Message (ADM)

   TYPE: 1 - ADM

   LENGTH: 3 Octets (AFI+SAFI) + Variable value (up to 2K octets)

   USE: To carry a message, on demand, comprised of a string of UTF-8
   characters (up to 2K octets in size), with no null termination.  Upon
   reception, the string SHOULD be reported to the host's administrator.

   Implementations SHOULD provide their users the ability to transmit a
   free form text message generated by user input.

3.4.1.2.  Advisory Static Message (ASM)

   TYPE: 2 - ASM

   LENGTH: 3 Octets (AFI+SAFI) + Variable value (up to 2K octets)

   USE: To carry a message, on demand, comprised of a string of UTF-8
   characters, with no null termination.  Upon reception, the string
   SHOULD be stored in the BGP neighbor statistics field within the
   router.  The string SHOULD be accessible to the operator by executing
   CLI commands or any other method (local or remote) to obtain BGP
   neighbor statistics (e.g. NETCONF, SNMP).

   The expectation is that the last ASM received from a BGP neighbor
   will be the message visible to the operator (the most current ASM).

   Implementations SHOULD provide their users the ability to transmit a
   free form text message generated by user input.

3.4.2.  STATE TLVs

   STATE TLVs reflect, on demand, the internal state of a BGP neighbor
   as seen from the other neighbor's perspective.

3.4.2.1.  Reachable Prefix Count Request (RPCQ)

   TYPE: 3 - RPCQ

   LENGTH: 3 Octets (AFI+SAFI) + Sequence Number

   USE: Sent to the neighbor to request that an RPCP (Section 3.4.2.2)
   message is generated in response.

3.4.2.2.  Reachable Prefix Count Reply (RPCP)

   TYPE: 4 - RPCP

   LENGTH: 3 Octets (AFI+SAFI) + Sequence Number + 4 Octet RX Prefix
   Counter (RXC) + 4 Octet TX Prefix Counter (TXC)

   USE: Sent in reply to an RPCQ (Section 3.4.2.1) message from a
   neighbor.  RXC is populated with the number of reachable prefixes
   accepted from the peer and TXC with the number of prefixes to be
   transmitted to the peer for the AFI/SAFI.

3.4.2.3.  Adj-Rib-Out Prefix Count Request (APCQ)

   TYPE: 5 - APCQ

   LENGTH: 3 Octets (AFI+SAFI) + Sequence Number

   USE: Sent to the neighbor to request that an APCP (Section 3.4.2.4)
   message is generated in response.

   APCQ can be used as a simple mechanism when an implementation does
   not permit or support the use of RPCQ.

3.4.2.4.  Adj-Rib-Out Prefix Count Reply (APCP)

   TYPE: 6 - APCP

   LENGTH: 3 Octets (AFI+SAFI) + Sequence Number + 4 Octet TX Prefix
   Counter (TXC)

   USE: Sent in reply to an APCQ (Section 3.4.2.3) message from a
   neighbor.  TXC is populated with the number of prefixes held in the
   Adj-Rib-Out for the neighbor for the AFI/SAFI.

3.4.2.5.  BGP Loc-Rib Prefix Count Request (LPCQ)

   TYPE: 7 - LPCQ

   LENGTH: 3 Octets (AFI+SAFI) + Sequence Number

   USE: Sent to the peer to request that an LPCP (Section 3.4.2.6)
   message is generated in response.

3.4.2.6.  BGP Loc-Rib Prefix Count Reply (LPCP)

   TYPE: 8 - LPCP

   LENGTH: 3 Octets (AFI+SAFI) + Sequence Number + 4 Octet Loc-Rib
   Counter (LC)

   USE: Sent in reply to an LPCQ (Section 3.4.2.5) message from a
   neighbor.  LC is populated with the number of prefixes held in the
   entire Loc-Rib for the AFI/SAFI.

3.4.2.7.  Simple State Request (SSQ)

   TYPE: 9 - SSQ

   LENGTH: 3 Octets (AFI+SAFI) + Sequence Number + Single request PRI
   (Variable)

   USE: Using a PRI as a request form (See Section 3.3), an
   implementation can be asked to return information about prefixes
   found in various RIBs.
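
   As an illustration, an SSQ carrying a Next Hop PRI could be built as
   sketched below.  The helper names encode_pri and encode_ssq are
   hypothetical.  The sketch assumes that bit 0 of the Flags octet is
   the most significant bit, that the Payload Type occupies one octet
   as drawn in Section 3.3, that the 3 octets of AFI+SAFI are a 2 octet
   AFI followed by a 1 octet SAFI, and that the TLV Length counts only
   the Value octets.

```python
import struct
from ipaddress import IPv4Address

# PRI flag bits, assuming bit 0 in the diagram is the most significant bit.
FLAG_R, FLAG_I, FLAG_O, FLAG_L = 0x80, 0x40, 0x20, 0x10

# PRI payload types from Section 3.3.
PRI_NLRI, PRI_NEXT_HOP, PRI_AS_NUMBER = 0, 1, 2

TLV_SSQ = 9
AFI_IPV4, SAFI_UNICAST = 1, 1  # IPv4 unicast


def encode_pri(flags: int, payload_type: int, payload: bytes) -> bytes:
    """Flags (1 octet) + Payload Type (1 octet) + variable payload."""
    return struct.pack("!BB", flags, payload_type) + payload


def encode_ssq(afi: int, safi: int, sequence: bytes, pri: bytes) -> bytes:
    """Build an SSQ TLV: AFI + SAFI + sequence number + one request PRI."""
    if len(sequence) != 8:
        raise ValueError("sequence number must be 8 octets")
    value = struct.pack("!HB", afi, safi) + sequence + pri
    return struct.pack("!HH", TLV_SSQ, len(value)) + value


# Ask which prefixes in the Adj-RIB-Out and Loc-RIB use next hop 192.0.2.1.
# The R bit is left clear because it has no meaning in a request.
sequence = IPv4Address("198.51.100.7").packed + struct.pack("!I", 1)
pri = encode_pri(FLAG_O | FLAG_L, PRI_NEXT_HOP, IPv4Address("192.0.2.1").packed)
ssq = encode_ssq(AFI_IPV4, SAFI_UNICAST, sequence, pri)
```

   The addresses used are documentation addresses and carry no meaning
   beyond the example.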

   A single, simple PRI is used in the request, containing a single NLRI
   or attribute as the PRI payload.  RIB response filtering may take
   place through the setting of the I, O and L bits in the PRI Flags
   field.

   An implementation MAY respond to an SSQ TLV with an SSP (See
   Section 3.4.3.4) TLV (containing the appropriate data).  An
   implementation MAY also respond to an SSQ with an NS TLV (with the
   appropriate subcode set) indicating why there will not be an SSP TLV
   in response.  An implementation MAY also not respond at all (See
   Section 8).

3.4.3.  DUMP TLVs

   DUMP TLVs provide data in both structured and unstructured formats in
   response to events, for use in debugging scenarios.

3.4.3.1.  Dropped Update Prefixes (DUP)

   TYPE: 10 - DUP

   LENGTH: 3 Octets (AFI+SAFI) + Variable number of dropped UPDATE
   Prefix Reachability Indicators (PRI) (See Section 3.3)

   USE: To report to a neighbor a structured set of prefix reachability
   indicators retrievable from the last dropped UPDATE message, sent in
   response to an UPDATE message which was well formed but not accepted
   by the neighbor by policy.

   For example, if an UPDATE was dropped and the rescued NLRI concerned
   both reachable and unreachable prefixes, the DUP would encapsulate
   two PRI: one with the R-Bit set (reachable), housing the rescued
   reachable NLRI, and the other with the R-Bit cleared (unreachable),
   housing the rescued unreachable NLRI as payload.

3.4.3.2.  Malformed Update Prefixes (MUP)

   TYPE: 11 - MUP

   LENGTH: 3 Octets (AFI+SAFI) + Variable number of dropped update
   Prefix Reachability Indicators (PRI) (See Section 3.3) due to UPDATE
   Malformation.

   USE: To report to a neighbor a structured set of prefix reachability
   indicators retrievable from the last UPDATE message dropped through
   malformation, sent in response to an UPDATE message which was not
   well formed and not accepted by the neighbor, where a NOTIFICATION
   message was not sent.  A MUP TLV may accompany a MUD
   (Section 3.4.3.3) TLV.

   See the example from Section 3.4.3.1.

3.4.3.3.  Malformed Update Dump (MUD)

   TYPE: 12 - MUD

   LENGTH: 3 Octets (AFI+SAFI) + Variable length representing
   retrievable malformed update octet stream.

   USE: To report to a peer a copy of the last UPDATE message dropped
   through malformation, sent in response to an UPDATE message which was
   not well formed and not accepted by the neighbor, where a
   NOTIFICATION message was not sent.  A MUD TLV may accompany a MUP
   (Section 3.4.3.2) TLV.

3.4.3.4.  Simple State Response (SSP)

   TYPE: 13 - SSP

   LENGTH: 3 Octets (AFI+SAFI) + Sequence Number + Single Response PRI
   (Variable)

   USE: Using a PRI as a response form (See Section 3.3), an
   implementation uses the SSP TLV to return a response to an SSQ (See
   Section 3.4.2.7) TLV which should contain information about prefixes
   found in various RIBs.  These RIBs should be walked to extract the
   information according to local policy.

   A single, simple PRI is used in the response, containing multiple
   NLRI.  The I, O and L bits in the PRI Flags field should be set
   indicating which RIBs the prefixes were found in.

   An implementation MAY respond to an SSQ TLV with an SSP TLV
   (containing the appropriate data).  An implementation MAY also
   respond to an SSQ with an NS TLV (with the appropriate subcode set)
   indicating why there will not be an SSP TLV in response.  An
   implementation MAY also not respond at all (See Section 8).
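
   The choice between an SSP and an NS reply can be sketched as below.
   The function answer_ssq and its parameters are hypothetical, and the
   NS subcode values are those listed in Section 3.4.4.2.  The sketch
   assumes the RIB walk has already happened and that the matching
   prefixes are supplied as encoded NLRI octets.  It always sends a
   reply, although remaining silent is also permitted.

```python
import struct

TLV_SSP, TLV_NS = 13, 65535
NS_ADMIN_PROHIBITED, NS_NOT_FOUND = 0x04, 0x06
PRI_NLRI = 0  # the only payload type allowed in a response PRI


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Type (2 octets) + Length (2 octets, Value only) + Value."""
    return struct.pack("!HH", tlv_type, len(value)) + value


def answer_ssq(afi_safi: bytes, sequence: bytes, flags: int,
               nlri: bytes, permitted: bool = True) -> bytes:
    """Return the reply TLV for one SSQ.

    afi_safi:  3 octets copied from the request
    sequence:  8 octets copied from the request
    flags:     PRI flags marking the RIBs in which matches were found
    nlri:      already-encoded NLRI octets of the matching prefixes
    permitted: outcome of the local policy check (hypothetical input)
    """
    if not permitted:
        subcode = struct.pack("!H", NS_ADMIN_PROHIBITED)
        return encode_tlv(TLV_NS, afi_safi + sequence + subcode)
    if not nlri:
        # A response PRI without NLRI is invalid, so signal "Not Found".
        subcode = struct.pack("!H", NS_NOT_FOUND)
        return encode_tlv(TLV_NS, afi_safi + sequence + subcode)
    pri = struct.pack("!BB", flags, PRI_NLRI) + nlri
    return encode_tlv(TLV_SSP, afi_safi + sequence + pri)
```

   Copying the sequence number from the request into either reply is
   what lets the requester correlate the answer with its query.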

   If no data is found to satisfy a query which is permitted to be
   answered, an implementation MAY respond with an NS TLV with the
   subcode "Not Found" to indicate that no data was found in response to
   the query.  An implementation MUST NOT send a PRI in response with no
   NLRI payload; this is considered to be invalid.

3.4.4.  CONTROL TLVs

   CONTROL TLVs satisfy control mechanism messaging between neighbors.
   They are used for such functions as refusing messages and dynamically
   signalling OPERATIONAL capabilities to neighbors during operation.

3.4.4.1.  Max Permitted (MP)

   TYPE: 65534 - MP

   LENGTH: 3 Octets (AFI+SAFI) + 2 Octet Value

   USE: The Max Permitted TLV is used to signal to the neighbor the
   maximum number of OPERATIONAL messages that will be accepted per
   second (see Section 8, Security Considerations).  On receipt of an MP
   TLV, an implementation MUST ensure that it does not exceed the rate
   specified in the MP TLV for sending OPERATIONAL messages to the
   neighbor, for the duration of the session.

   An implementation MAY send subsequent MP TLVs during the session's
   lifetime, updating the maximum acceptable rate.

   MP TLVs MAY be rate limited by the receiver as part of OPERATIONAL
   rate limiting (see Section 8, Security Considerations).

3.4.4.2.  Not Satisfied (NS)

   TYPE: 65535 - NS

   LENGTH: 3 Octets (AFI+SAFI) + Sequence Number + 2 Octet Error Subcode

   USE: To respond to a query to indicate that the implementation cannot
   or will not answer this query.  The following subcodes are defined:

      0x01 - Request TLV Malformed: Used to signal to the neighbor that
      the request was malformed and will not be processed.  A neighbor
      on receiving this message MAY re-transmit the request but MUST
      increment the sequence number.  Implementations SHOULD ensure that
      the same request is not retransmitted excessively when repeatedly
      receiving this Error Subcode in response.

      0x02 - TLV Unsupported for this neighbor: Used to signal to the
      neighbor that the request was unsupported and will not be
      processed.  A neighbor on receiving this message MUST NOT
      retransmit the request for the duration of the session.

      0x03 - Max query frequency exceeded: Used to signal to the
      neighbor that the request has exceeded the rate which the neighbor
      finds acceptable for the implementation to transmit requests at;
      see Section 3.4.4.1 (MP TLV) and Section 8 (Security
      Considerations) for more information.

      0x04 - Administratively prohibited: Used to signal to the neighbor
      that the request was administratively prohibited and will not be
      processed.  A neighbor on receiving this message MUST NOT
      retransmit the request for the duration of the session.

      0x05 - Busy: Used to signal to the neighbor that the request will
      not be replied to, due to lack of resources estimated to satisfy
      the request.  It is suggested that, on receipt of this error
      subcode, a message is logged to inform the operator of this
      failure as opposed to automatically attempting to re-try the
      previous query.

      0x06 - Not Found: Used to signal to the neighbor that the request
      would have been replied to but does not contain any data (i.e. the
      data was not found).  An implementation MUST NOT send a PRI
      response with no NLRI payload; this is considered to be invalid.

   NS TLVs MAY be rate limited by the receiver as part of OPERATIONAL
   rate limiting (see Section 8, Security Considerations).

4.  Use of the ADVISE TLVs

   The BGP routing protocol is used with external as well as internal
   neighbors to propagate route advertisements.  In the case of external
   BGP sessions, there is typically a demarcation of administrative
   responsibility between the two entities.  While initial configuration
   and troubleshooting of these sessions is handled via offline means
   such as email or telephone calls, there is a gap when it comes to
   advising a BGP neighbor of a behaviour that is occurring or will
   occur momentarily.  There is a need for operators to transmit a
   message to a BGP neighbor to notify them of a variety of events.
   These messages typically would include those related to a planned or
   unplanned maintenance action.  These ADVISE messages could then be
   interpreted by the remote party and either parsed via logging
   mechanisms or viewed by a human on the remote end via the CLI.  This
   capability will improve operator NOC-to-NOC communication by
   providing a communications medium on an established and trusted BGP
   session between two autonomous systems.

   The reason that this method is preferred for NOC-to-NOC
   communications is that other offline methods do fail for a variety of
   reasons.  The recipients of emails sent to NOC aliases ahead of a
   planned maintenance may have ignored the mail or may not have
   recorded it properly within an internal tracking system.  Even if the
   message was recorded properly, the staff that are on-duty at the time
   of the maintenance event typically are not the same staff who
   received the maintenance notice several days prior.  In addition, the
   staff on duty at the time of the event may not even be able to find
   the recorded event in their internal tracking systems.  The end
   result is that during a planned event, some subset of eBGP peers will
   respond to a session/peer down event with additional communications
   to the operator who is initiating the maintenance action.  This can
   be via telephone or via email, but either way, it may result in a
   sizeable number of replies inquiring as to why the session is down.
658 The result of this is that the NOC responsible for initiating the 659 maintenance can be inundated with calls/emails from a variety of 660 parties inquiring as to the status of the BGP session. The NOC 661 initiating the maintenance may have to further inquire with 662 engineering staff (if they are not already aware) to find out the 663 extent of the maintenance and communicate this back to all of the 664 NOCs calling for additional information. The above scenario outlines 665 what is typical in a planned maintenance event. In an unplanned 666 maintenance event (the need for an immediate router upgrade/reload), 667 the number of calls and emails will dramatically increase as more 668 parties are unaware of the event. 670 With the ADVISE TLV set, an operator can transmit an OPERATIONAL 671 message just prior to initiating the maintenance specifying what 672 event will happen, what ticket number this event is associated with 673 and the expected duration of the event. This message would be 674 received by BGP peers and stored in their logs as well as any 675 monitoring system if they have this capability. Now, all of the BGP 676 peers have immediate access to the information about this session, 677 why it went down, what ticket number this is being tracked under and 678 how long they should wait before assuming there is an actual problem. 679 Even smaller networks without the network management capabilities to 680 correlate BGP events and OPERATIONAL messages would typically have an 681 operator login to a router and examine the logs via the CLI. 683 This draft specifies two types of ADVISE TLV, a DEMAND message (ADM) 684 and a STATIC message (ASM), it is anticipated that the DEMAND message 685 will be used to send a message, on demand to the BGP neighbor, to 686 inform them of realtime events. 
The STATIC message can be used to provide continual, "sticky" information to the neighbor, such as a contact telephone number or e-mail address, should there be a requirement to have continual access to this information.

5.  Use of the STATE TLVs

At the current time, the BGP-4 protocol provides no mechanism by which the state of a remote system can be examined. Increasingly, as BGP-4 is utilised for additional applications, there is utility in providing in-band mechanisms for simple integrity checks and for diagnostic information to be exchanged between systems. As such, there are two sets of applications envisaged to be implemented utilising the STATE TLVs of the OPERATIONAL message.

5.1.  Utilising STATE TLVs for Cross-Domain Debugging Functionality

In numerous cases, autonomous system boundaries represent a demarcation point between operational teams. In these cases, debugging the information received over a BGP session between the two systems is likely to result in human-to-human contact. In simple cases, this is a particularly inefficient means by which specific queries regarding the routing information received via a BGP-4 session can be made. Whilst complex debugging is likely to continue to involve operational personnel, in a number of cases it is advantageous for an operator to allow the remote administrative team to validate specific characteristics of the router's RIB. Such a means of debugging greatly enhances the speed of localising particular failures, and hence provides a potential reduction in the time to recovery of services dependent on the routing information transmitted via the BGP session. The STATE TLVs described in this document are intended to provide a mechanism by which requests for, and responses containing, such debugging information can be implemented.

An example of the use of such a mechanism is on BGP-4 sessions making up a network-network interconnection (NNI) carrying Layer 3 MPLS VPN [RFC4364] services. In these cases, such NNIs may be between particular administrative teams of the same network provider. The OPERATIONAL SSQ is intended to provide a simple query language that can be utilised to receive the subset of routing information that matches a particular query within the remote system's RIB. It is envisaged that such behaviour provides a simple means by which an operator can validate whether particular routing information is present, and as expected, on the remote system. Quick identification of inconsistencies allows the device responsible for missing or incorrect information to be identified without direct interaction between humans.

5.2.  Utilising STATE TLVs in the context of Error Handling

The enhancements to the BGP-4 protocol intended to provide more targeted error handling, described in [I-D.ietf-idr-error-handling], introduce a number of cases whereby NLRI that are contained in particular UPDATEs may not be accepted by the remote BGP speaker. In such cases, there is currently no mechanism by which an operator can identify whether the routing information received by the local speaker matches that which the remote speaker purports to have advertised. The Adj-RIB-Out Prefix Count Request (APCQ) and Reachable Prefix Count Request (RPCQ) are intended to provide a means by which simple validation can be performed between two BGP speakers. It is envisaged that a BGP implementation can validate whether the remote system's RIB is consistent utilising such a mechanism, and hence trigger follow-up actions based on the result.
This document does not define the extent of such follow-up actions; however, it is envisaged that there is utility in flagging such a state to an operational team so that any inconsistency can be investigated. Since many BGP-4 UPDATE message errors may be transient, validating the prefix counts in the local RIB against those received in response to the STATE TLV prefix count query messages described herein allows an operator to determine whether any inconsistency persists at the time of the query, and hence whether any action is required.

In addition to allowing a manually-triggered validation of the RIB prefix counts, such a mechanism provides a simple means by which automated consistency checking can be enhanced on a BGP session. A device initiating a periodic check based on the RPCQ or APCQ TLVs can validate basic information regarding the number of entries in a particular RIB of a remote neighbor. Such consistency checks may trigger further (more detailed) sets of consistency validation mechanisms, or be flagged to a local operator. In this case, the potential forwarding black-holes that can be caused by inconsistency between the RIBs of two systems can be quickly identified and then examined by an operator, or recovered from via an automated means such as a ROUTE-REFRESH message. As such, the use of the OPERATIONAL TLV in this case allows resource usage on the BGP speakers involved to be minimised, because the speakers perform a lightweight check prior to triggering any further action.

6.  Use of the DUMP TLVs

Where a notable condition is experienced by a BGP-4 speaker, only a limited set of responses is currently available to the speaker to make human network administrators aware of the condition.
Within a local administrative boundary, logging functionality such as SNMP and SYSLOG can be used to record the occurrence of the event; this provides effective visibility to the local administrator of the device. Whilst this provides a mechanism to make the router operator aware of erroneous states or messages, where the condition is a direct result of an input from a remote system, or the information is of note to the remote BGP speaker, there is no means to communicate the detection of an erroneous condition to the remote device. As described in [I-D.ietf-grow-ops-reqs-for-bgp-error-handling], such conditions are likely to occur within the context of the handling of erroneous UPDATE messages.

The OPERATIONAL message is intended to provide a BGP speaker with a number of message types that can be used to communicate information to a remote system. Whilst free-text mechanisms such as the ADM provide a means by which arbitrary information can be transmitted, the use of a structured message type carrying particular message data back to the remote speaker provides a means by which this information can be processed and reported directly. As such, the knowledge that particular OPERATIONAL messages relate to particular erroneous conditions that may be affecting network operation allows a system to determine any specific response actions, or to prioritise any reporting to network management systems.

Where an UPDATE message's NLRI attribute can be wholly parsed, the pertinent information as to the prefixes that have been identified in the message is available to the receiving BGP speaker. This information is of relevance to the administrators of the remote device, and is likely to provide some information regarding the contents of the message which is considered erroneous.
The Malformed UPDATE Prefixes (MUP) TLV defined herein is intended to allow the receiving speaker to transmit the minimum required information regarding an UPDATE identified as malformed to the remote speaker without the overhead of additional path attributes (which may not be available to the receiving speaker). It is envisaged that the Dropped Update Prefixes (DUP) TLV provides analogous behaviour in the case where the UPDATE message is dropped due to local administrative policy or implementation characteristics.

In some cases, in order to determine the exact condition resulting in an error, a network operator (or equipment implementor) requires an exact copy of the protocol message transmitted to a remote system. The operational requirements presented in [I-D.ietf-grow-ops-reqs-for-bgp-error-handling] describe the operational advantage of logging a copy of such a message locally. However, where the message is erroneous due to a bug in the formation or transmission of the message by the sender, and the error is identified on the receiving speaker, this information is not available to the operator responsible for the erroneous network element. The Malformed UPDATE Dump (MUD) TLV is intended to be utilised to transmit an encapsulated copy of such a message back to the remote BGP speaker, and hence allow the operator to determine the exact formation of the invalid message.

7.  Error Handling

An implementation MUST NOT send an OPERATIONAL message to a neighbor in response to an erroneous or malformed OPERATIONAL message. Any erroneous or malformed OPERATIONAL message received SHOULD be logged for the attention of the operator and then MAY be discarded.

8.  Security considerations

No new security issues are introduced to the BGP protocol by this specification.
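
As a non-normative illustration of the rate-limiting requirement stated in this section, the following is a minimal sketch of a token-bucket limiter that an implementation might consult before transmitting or processing an OPERATIONAL message. The class name, the parameters "rate" and "burst", and the choice of a token bucket are assumptions made for illustration only; this document does not mandate any particular algorithm, and the units in which a neighbor's MP TLV expresses its maximum rate are not assumed here.

```python
import time


class TokenBucket:
    """Hypothetical limiter: on average `rate` messages per second,
    with bursts of up to `burst` messages."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)        # tokens added per second
        self.capacity = float(burst)   # maximum stored tokens
        self.tokens = float(burst)     # start with a full bucket
        self.clock = clock             # injectable clock, eases testing
        self.last = clock()

    def allow(self):
        """Return True if one more message may be handled now."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        # Over the limit: the caller may silently drop the request,
        # which this section permits when rate-limiting.
        return False
```

One such limiter could be kept per neighbor and per direction, so that a burst of requests on one session cannot delay regular BGP message handling on another.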

Where a request type is not supported or allowed by an implementation for some reason, the implementation MAY send an NS (Section 3.4.4.2) TLV in response. The Error subcode of this TLV SHOULD be set according to the reason that the request will not be responded to.

Implementations MUST rate-limit the transmission and receipt of OPERATIONAL messages. Specifically, an implementation MUST NOT allow the handling of OPERATIONAL messages to negatively impact any other functions on a router, such as regular BGP message handling or other routing protocols.

Although an NS error subcode is provided to indicate that a request was rate-limited, an implementation need not reply to a request at all; this is the suggested course of action when rate-limiting the sending of responses to a neighbor.

An implementation MAY send an MP (Section 3.4.4.1) TLV to indicate the maximum rate at which it will accept OPERATIONAL messages from a neighbor. Upon receipt of this TLV, the neighbor MUST ensure that it does not transmit OPERATIONAL messages above this rate for the duration of the session.

An implementation that considers a request to be too computationally expensive MAY reply with the "Busy" NS error subcode to indicate this, though the implementation need not reply to the request.

Implementations MUST provide the operator with a mechanism for preventing access to information requested by SSR (Section 3.4.2.7) messages. Implementations SHOULD ensure that responses concerning the Loc-RIB (PRI with L-Bit set or responses which would set the L-Bit) are filtered in the default configuration.

9.  IANA Considerations

IANA is requested to allocate a type code for the OPERATIONAL message from the BGP Message Types registry, as well as a type code for the negotiation of the new OPERATIONAL Message Capability from the BGP Capability Codes registry.

This document requests IANA to define and maintain a new registry named: "OPERATIONAL Message Type Values". The allocation policy is on a first come first served basis.

This document makes the following assignments for the OPERATIONAL Message Type Values:

ADVISE:

*  Type 1 - Advisory Demand Message (ADM)

*  Type 2 - Advisory Static Message (ASM)

STATE:

*  Type 3 - Reachable Prefix Count Request (RPCQ)

*  Type 4 - Reachable Prefix Count Response (RPCP)

*  Type 5 - Adj-RIB-Out Prefix Count Request (APCQ)

*  Type 6 - Adj-RIB-Out Prefix Count Response (APCP)

*  Type 7 - Loc-Rib Prefix Count Request (LPCQ)

*  Type 8 - Loc-Rib Prefix Count Response (LPCP)

*  Type 9 - Simple State Request (SSQ)

DUMP:

*  Type 10 - Dropped Update Prefixes (DUP)

*  Type 11 - Malformed Update Prefixes (MUP)

*  Type 12 - Malformed Update Dump (MUD)

*  Type 13 - Simple State Response (SSP)

CONTROL:

*  Type 65534 - Max Permitted (MP)

*  Type 65535 - Not Satisfied (NS)

10.  Acknowledgements

This memo is based on existing works [I-D.ietf-idr-advisory] and [I-D.raszuk-bgp-diagnostic-message] which describe a number of operational message types documented here. The authors would like to thank Enke Chen, Bruno Decraene, Alton Lo, Tom Scholl, John Scudder and Richard Steenbergen for their valuable input.

11.  References

11.1.  Normative References

[RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP Communities Attribute", RFC 1997, August 1996.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, January 2006.

[RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended Communities Attribute", RFC 4360, February 2006.

[RFC4760]  Bates, T., Chandra, R., Katz, D., and Y.
Rekhter, "Multiprotocol Extensions for BGP-4", RFC 4760, January 2007.

[RFC4893]  Vohra, Q. and E. Chen, "BGP Support for Four-octet AS Number Space", RFC 4893, May 2007.

[RFC5492]  Scudder, J. and R. Chandra, "Capabilities Advertisement with BGP-4", RFC 5492, February 2009.

11.2.  Informative References

[I-D.ietf-grow-ops-reqs-for-bgp-error-handling]  Shakir, R., "Operational Requirements for Enhanced Error Handling Behaviour in BGP-4", draft-ietf-grow-ops-reqs-for-bgp-error-handling-02 (work in progress), October 2011.

[I-D.ietf-idr-advisory]  Scholl, T., Scudder, J., Steenbergen, R., and D. Freedman, "BGP Advisory Message", draft-ietf-idr-advisory-00 (work in progress), October 2009.

[I-D.ietf-idr-error-handling]  Scudder, J., Chen, E., Mohapatra, P., and K. Patel, "Revised Error Handling for BGP UPDATE Messages", draft-ietf-idr-error-handling-01 (work in progress), December 2011.

[I-D.jasinska-ix-bgp-route-server]  Jasinska, E., Hilliard, N., Raszuk, R., and N. Bakker, "Internet Exchange Route Server", draft-jasinska-ix-bgp-route-server-03 (work in progress), October 2011.

[I-D.nalawade-bgp-inform]  Nalawade, G., Scudder, J., and D. Ward, "BGPv4 INFORM message", draft-nalawade-bgp-inform-02 (work in progress), August 2002.

[I-D.nalawade-bgp-soft-notify]  Nalawade, G., "BGPv4 Soft-Notification Message", draft-nalawade-bgp-soft-notify-01 (work in progress), July 2005.

[I-D.raszuk-bgp-diagnostic-message]  Raszuk, R., Chen, E., and B. Decraene, "BGP Diagnostic Message", draft-raszuk-bgp-diagnostic-message-02 (work in progress), March 2011.

[I-D.retana-bgp-security-state-diagnostic]  Retana, A. and R. Raszuk, "BGP Security State Diagnostic Message", draft-retana-bgp-security-state-diagnostic-00 (work in progress), March 2011.

[I-D.shakir-idr-ops-reqs-for-bgp-error-handling]  Shakir, R., "Operational Requirements for Enhanced Error Handling Behaviour in BGP-4", draft-shakir-idr-ops-reqs-for-bgp-error-handling-01 (work in progress), February 2011.

[RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

Authors' Addresses

David Freedman
Claranet
21 Southampton Row, Holborn
London WC1B 5HA
UK

Email: david.freedman@uk.clara.net

Robert Raszuk
NTT MCL Inc.
101 S Ellsworth Avenue Suite 350
San Mateo, CA 94401
US

Email: robert@raszuk.net

Rob Shakir
BT
pp C3L
BT Centre
81, Newgate Street
London EC1A 7AJ
UK

Email: rob.shakir@bt.com
URI: http://www.bt.com/