idnits 2.17.1

draft-li-rtgwg-protocol-assisted-protocol-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([RFC7854]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 02, 2020) is 1261 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'I-D.brockners-inband-oam-requirements' is defined on
     line 909, but no explicit reference was found in the text

  == Unused Reference: 'I-D.ietf-netconf-yang-push' is defined on line 926,
     but no explicit reference was found in the text

  == Unused Reference: 'I-D.song-ntf' is defined on line 931, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC1191' is defined on line 942, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC1195' is defined on line 946, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC1213' is defined on line 950, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC3209' is defined on line 960, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC3988' is defined on line 965, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC7752' is defined on line 975, but no explicit
     reference was found in the text

  ** Downref: Normative reference to an Informational draft:
     draft-brockners-inband-oam-requirements (ref.
     'I-D.brockners-inband-oam-requirements')

  ** Downref: Normative reference to an Informational draft: draft-song-ntf
     (ref. 'I-D.song-ntf')

  ** Downref: Normative reference to an Historic RFC: RFC 1157

  ** Downref: Normative reference to an Experimental RFC: RFC 3988

  ** Obsolete normative reference: RFC 7752 (Obsoleted by RFC 9552)


     Summary: 8 errors (**), 0 flaws (~~), 10 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                              Z. Li
Internet-Draft                                                   S. Chen
Intended status: Standards Track                                   Y. Gu
Expires: May 6, 2021                                              Huawei
                                                       November 02, 2020

                    Protocol Assisted Protocol (PAP)
              draft-li-rtgwg-protocol-assisted-protocol-03

Abstract

   For routing protocol troubleshooting, different approaches exhibit
   merits in different situations.  They can generally be divided into
   two categories: the distributive way and the centralized way.  A
   very commonly used distributive approach is to log in to possibly
   all related devices, one by one, and check massive amounts of data
   via the CLI.  This approach provides very detailed device
   information, but it requires operators with extensive NOC (Network
   Operation Center) experience and suffers from low troubleshooting
   efficiency and high cost.  The centralized approach collects data
   from devices via mechanisms such as streaming telemetry or BMP (BGP
   Monitoring Protocol) [RFC7854], so that a centralized server can
   analyze all the gathered data.  This approach gives a comprehensive
   view of the whole network and facilitates automated troubleshooting,
   but it is limited by the data collection boundaries set by different
   management domains, and it incurs high network bandwidth and CPU
   computation costs.

   This document proposes a semi-distributive and semi-centralized
   approach for fast routing protocol troubleshooting that localizes
   the target device, and possibly the root cause, more precisely.  It
   defines a new protocol, called PAP (Protocol assisted Protocol), for
   devices to exchange protocol-related information with each other in
   both active and on-demand manners.  It allows devices to request
   specific information from other devices and to receive replies
   carrying the requested data.  It also allows information to be
   actively transmitted without a request, so that other devices are
   informed and can react better to network issues.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 6, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Motivation
     1.2.  PAP Usage Use cases
       1.2.1.  Use Case 1: BGP Route Oscillation
       1.2.2.  Use Case 2: RSVP-TE Set Up Failure
   2.  Terminology
   3.  PAP Overview
     3.1.  PAP Encapsulation
     3.2.  PAP Speaker and PAP Agent
     3.3.  PAP Event
     3.4.  Summary of Operation
       3.4.1.  PAP Capability Negotiation Process
       3.4.2.  PAP Request and Reply Process
       3.4.3.  PAP Notification Process
   4.  PAP Message Format
     4.1.  Common Header
       4.1.1.  Capability Negotiation Message
     4.2.  Request Message
     4.3.  Reply Message
     4.4.  Notification Message
     4.5.  ACK Message
   5.  PAP Operations
     5.1.  Capability Negotiation Process
       5.1.1.  PAP Peering Relation Establish Process
       5.1.2.  PAP Capability Enabling Notification Process
       5.1.3.  PAP Capability Disabling Notification Process
     5.2.  PAP Request and Reply Process
     5.3.  PAP Notification Process
   6.  PAP Error Handling
   7.  Discussion
   8.  Security Considerations
   9.  IANA
   10. Contributors
   11. Acknowledgments
   12. References
   Authors' Addresses

1.  Introduction

   A healthy control plane, providing network connectivity, is the
   foundation of a well-functioning network.  A rich set of routing and
   signaling protocols has been designed and used for IP networks, such
   as the IGPs (IS-IS, OSPF), BGP, LDP, RSVP-TE and so on.  Ongoing
   efforts have been devoted to diagnosing and remediating the health
   issues of these protocols, such as neighbor/peer disconnection or
   set up failure, LSP set up failure, route flapping and so on.

1.1.  Motivation

   The distributive protocol troubleshooting approach is typically
   realized through manual per-device checks.  It is both time- and
   labor-consuming, and requires operators with NOC experience.  Above
   all, localizing the target device is usually the most difficult and
   time-consuming part.  For example, in the case of a routing loop,
   operators first log in to a random device that reports TTL alarms,
   and then check the looped route in the Forwarding Information Base
   (FIB) and/or the Routing Information Base (RIB).  Pinpointing the
   exact responsible device requires device-by-device checks as well as
   manual data correlation, since information retrieval and analysis in
   this distributive way are fragmented.  In addition, the low
   efficiency and the manual troubleshooting activities may further
   impact new network services and/or enlarge the affected areas.

   The centralized network OAM, by collecting network-wide data from
   devices, enables automatic routing protocol troubleshooting.  Data
   collection protocols, such as SNMP (Simple Network Management
   Protocol) [RFC1157], NETCONF (Network Configuration Protocol)
   [RFC6241], and BMP [RFC7854], can retrieve various information, such
   as network states, routing data, configurations and so on.
   Such a centralized approach relies on the existence of a centralized
   server/controller, which is not supported by some legacy networks.
   What's more, even when a centralized server/controller exists, it
   can only collect data within its own management domain, while
   cross-domain data are not available due to the independent
   management of different ISPs.  The lack of such information may lead
   to troubleshooting failure.  In addition, centralized approaches may
   suffer from high network bandwidth and CPU computation consumption.

   Another way of protocol troubleshooting is utilizing the protocol
   itself to convey diagnosing information.  For example, some reason
   codes are carried in the Path-Err/ResvErr messages of RSVP-TE, so
   that other nodes may know why the tunnel fails to be set up.  Such
   an approach is semi-distributive and semi-centralized.  It does not
   rely on the deployment of a centralized server, but still gets a
   partial global view of the network.  However, it still requires
   non-trivial augmentation of existing routing protocols in order to
   support troubleshooting.  This raises the question of whether such
   non-routing data is suitable to be carried in these routing
   protocols.  The extra encapsulation, parsing and analysis work for
   the non-routing data would further slow down network convergence.
   Thus, it is better to separate routing and non-routing data
   transmission, as well as data parsing.  In addition, coexistence
   with legacy devices may cause interoperability issues.  Thus,
   relying on augmenting existing routing protocols without a
   network-wide upgrade may not only fail to provide the
   troubleshooting benefit, but also affect the operation of the
   existing routing system.  What's more, the failure of a routing
   protocol instance would lead to the failure of the diagnosing itself.
   All in all, it is reasonable to separate the generation,
   encapsulation, transmission and parsing of protocol diagnosing data
   from the protocol itself.

   This document proposes a new protocol, called PAP (Protocol assisted
   Protocol), for devices to exchange protocol-related information with
   each other.  It allows both active and on-demand data exchange.
   Considering the massiveness of protocol/routing related data, the
   intent of PAP is not to exchange comprehensive protocol/routing
   status between devices, but to provide the very specific information
   required for fast troubleshooting.  The benefits of such a
   semi-distributive and semi-centralized approach are summarized as
   follows:

   1.  It facilitates automatic troubleshooting without requiring manual
       device-by-device checks.

   2.  It allows an individual device to have a more global view by
       requesting data from other devices.

   3.  It does not rely on the deployment of a centralized server/
       controller.

   4.  It crosses the data collection boundary set by different
       management domains through cross-domain data exchange between
       devices.

   5.  It relieves the bandwidth pressure of network-wide data
       collection, and the processing pressure on the centralized
       server.

   6.  It does not affect the running of existing routing protocols.

1.2.  PAP Usage Use cases

   PAP allows both data request/reply and data notification between
   devices.  PAP speakers use the exchanged PAP data to help quickly
   localize network issues.

1.2.1.  Use Case 1: BGP Route Oscillation

   A BGP route oscillation can have various causes, and usually has a
   network-wide impact.  In order to find the root cause and take
   remediation actions, the first step is to localize the oscillation
   source.
   In this case, a BGP speaker can send a PAP Request Message to the
   next hop device of the oscillating route asking "Are you the
   oscillation source?".  If the receiving device is the oscillation
   source, which it possibly knows by running a device diagnosing
   system, it replies with a PAP Reply Message saying "I'm the
   oscillation source!" to the device that sent the PAP Request
   Message.  If the receiving device is not the oscillation source, it
   asks the same question with a PAP Request Message to its own next
   hop device of the oscillating route.  This request process continues
   until the request has reached the oscillation source.  The source
   device then sends a PAP Reply Message to tell its upstream device
   along the PAP request path that "I am the oscillation source!", and
   the information "xx is the oscillation source!" is then sent back
   hop by hop to the device that originated the request.

1.2.2.  Use Case 2: RSVP-TE Set Up Failure

   MPLS label switched path set up, using either RSVP-TE or LDP, may
   fail for various reasons.  The typical troubleshooting procedure is
   to log in to the device and check whether the failure lies in the
   configuration, a path computation error, or a link failure.
   Sometimes, multiple devices along the tunnel need to be checked.
   Certain reason codes can be carried in the Path-Err/ResvErr messages
   of RSVP-TE, while other data, such as an authentication failure,
   currently cannot be transmitted to the path ingress/egress node.
   Using PAP, the device that is responsible for the tunnel set up
   failure can send a PAP Notification Message to the ingress device,
   possibly with some reason codes, so that the ingress device can
   localize not only the target device but also the root cause.

2.  Terminology

   IGP: Interior Gateway Protocol

   IS-IS: Intermediate System to Intermediate System

   OSPF: Open Shortest Path First

   BGP: Border Gateway Protocol

   BGP-LS: Border Gateway Protocol - Link State

   MPLS: Multi-Protocol Label Switching

   RSVP-TE: Resource Reservation Protocol - Traffic Engineering

   LDP: Label Distribution Protocol

   BMP: BGP Monitoring Protocol

   LSP: Label Switched Path

   IPFIX: Internet Protocol Flow Information Export

   PAP: Protocol assisted Protocol

   UDP: User Datagram Protocol

3.  PAP Overview

3.1.  PAP Encapsulation

   PAP uses UDP, which is connectionless, as its transport protocol.
   UDP is selected over TCP because PAP is intended for on-demand
   communications.  The PAP packet is defined as follows.  This
   document requires the assignment of a User Port for use as the UDP
   Destination Port.

   +-------------+-------------+-------------+-------------+-------------+
   | ETH. Header |  IP Header  | UDP Header  | PAP Header  | PAP Payload |
   +-------------+-------------+-------------+-------------+-------------+

                     Figure 1.  Encapsulation in UDP

3.2.  PAP Speaker and PAP Agent

   This document uses "PAP speakers" to refer to routing devices that
   communicate with each other using PAP.  PAP speakers SHOULD be
   implemented with a supporting module (or multiple modules) to
   receive, parse, analyze, generate, and send PAP messages.  Such a
   supporting module is called a PAP Agent in this document.  For
   example, a BGP diagnosing module used for BGP related PAP message
   handling functions as a PAP Agent.  A PAP Agent is the union of
   multiple such modules for different protocols, or one module for all
   protocols.  A PAP Agent, standalone, SHOULD be able to provide
   protocol troubleshooting capability with local information.  With
   PAP exchange capability enabled, a PAP Agent gains information from
   remote PAP speakers to improve diagnosing accuracy.
   The primary function of PAP is to provide a unified tunnel for
   protocol diagnosing information exchange without augmenting each
   specific protocol.

3.3.  PAP Event

   A PAP Event refers to a troubleshooting instance running within a
   PAP Agent.  A PAP Agent may instantiate one or multiple PAP Events
   for each protocol at the same time, depending on the configured
   troubleshooting triggering conditions.  For example, a PAP Event is
   initiated automatically when the device CPU usage is too high, or
   manually through command line input from a device operator.  Once a
   PAP Event is generated, the corresponding PAP processes are called
   on demand.  Note that the initiation of PAP Capability Negotiation
   does not require the existence of a PAP Event.

3.4.  Summary of Operation

   The communication between two PAP speakers follows three major
   processes: the Capability Negotiation Process, the Request and Reply
   Process, and the Notification Process.  This document defines 5 PAP
   Message types, which are used in these processes: the Negotiation
   Message, the Request Message, the Reply Message, the Notification
   Message, and the ACK Message.

3.4.1.  PAP Capability Negotiation Process

   The purpose of the Capability Negotiation Process is to inform two
   PAP speakers of each other's PAP capabilities.  A PAP capability
   indicates for which specific protocol(s) PAP supports the exchange
   of diagnosing information.  The process can be further divided into
   three procedures: 1) the PAP Peering Relation Establish Process,
   2) the PAP Capability Enabling Notification Process, and 3) the PAP
   Capability Disabling Notification Process.  The Capability
   Negotiation Process is realized by the exchange of PAP Capability
   Negotiation Messages, which are defined in Section 4.
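
   As an illustration of what a PAP capability can look like inside an
   implementation, the following Python sketch models the Protocol
   Capability bitmap of Section 4.1.1 (bit 0: ISIS, bit 1: OSPF, bit 2:
   BGP, bit 3: LDP, counted from the rightmost bit).  The class and
   function names are hypothetical, and network byte order for the
   4-byte field is an assumption, since this document does not state
   the byte order.

```python
from enum import IntFlag


class PapCapability(IntFlag):
    """Protocol Capability bits, counted from the rightmost bit (Section 4.1.1)."""
    ISIS = 1 << 0
    OSPF = 1 << 1
    BGP = 1 << 2
    LDP = 1 << 3


def encode_capabilities(caps: PapCapability) -> bytes:
    """Encode the bitmap as a 4-byte field (network byte order is assumed)."""
    return int(caps).to_bytes(4, "big")


def decode_capabilities(field: bytes) -> PapCapability:
    """Decode a 4-byte Protocol Capability field back into a bitmap."""
    if len(field) != 4:
        raise ValueError("Protocol Capability field must be 4 bytes")
    return PapCapability(int.from_bytes(field, "big"))


def common_capabilities(local: PapCapability, remote: PapCapability) -> PapCapability:
    """Protocols for which both peers support diagnosing information exchange."""
    return local & remote
```

   A speaker that stores the decoded bitmap of each peer could use
   common_capabilities() to check that a protocol is supported on both
   sides before sending a Request Message for it.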

   Although PAP is connectionless, a PAP Peering Relation Establish
   Process is required to be successfully performed before any other
   PAP process.  This process can be initiated by either the local or
   the remote PAP speaker by sending out a PAP Capability Negotiation
   Message.  The Negotiation Message may or may not require an ACK
   Message, as indicated in the Negotiation Message.  A Peering is
   successfully established if both PAP speakers have correctly
   received the other speaker's Capability Negotiation Message.  After
   a successful negotiation, the two PAP speakers can exchange any PAP
   Message on demand.  The PAP Capability Enabling Notification Process
   is used to inform a PAP peer of a newly supported capability; it can
   be initiated by the PAP speaker at any moment after a PAP Peering is
   established with the respective PAP peer.  The PAP Capability
   Disabling Notification Process is used to inform a PAP peer of a
   capability that is no longer supported; it can likewise be initiated
   by the PAP speaker at any moment after a PAP Peering is established
   with the respective PAP peer.

3.4.2.  PAP Request and Reply Process

   The purpose of the PAP Request and Reply Process is to acquire
   information needed by a PAP speaker from other PAP speakers for a
   specific PAP Event.  The Request and Reply Messages can be
   customized for different events.  The process is triggered by the
   instantiation of a PAP Event, and starts with sending a Request
   Message to a target PAP peer.  The target PAP peer is selected by
   the PAP Agent according to the current PAP Event; the selection is
   out of the scope of this document.

   The remote PAP speaker, after receiving the Request Message, sends
   out a Reply Message to the request sender.  Whether an ACK is
   required is indicated in the Message Flag.

   One Request Message received at the local PAP speaker from a PAP
   peer may further result in the generation of a new Request Message
   toward a third PAP speaker, if the local PAP speaker does not have
   the right Reply for this PAP peer.  The local PAP speaker does not
   send a Reply Message to the requesting PAP peer until it receives a
   Reply Message from this third PAP speaker.  To avoid Request/Reply
   loops in the whole process, a Residual Hop value is used to limit
   the number of Request/Reply rounds.

3.4.3.  PAP Notification Process

   The Notification Process is used by a PAP speaker to voluntarily
   notify other PAP speakers of certain information regarding a PAP
   Event.  The process is triggered by the instantiation of a PAP
   Event, and starts with sending a Notification Message to one or
   multiple target PAP peer(s).  The target PAP peer(s) is/are selected
   by the PAP Agent according to the current PAP Event; the selection
   is out of the scope of this document.  The Notification Message may
   or may not require an ACK Message, as indicated in the Notification
   Message.

4.  PAP Message Format

4.1.  Common Header

   The common header is encapsulated in all PAP messages.  It is defined
   as follows.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+----------------+------------------------------+
   |V|    Flag     |   Msg. Type    |            Length            |
   +---------------+----------------+------------------------------+
   +                    Peer Address (16 bytes)                    +
   ~                                                               ~
   +--------------------------------+------------------------------+
   |         Msg. Sequence          |
   +--------------------------------+

                      Figure 2.  PAP Common Header

   o  Flag (1 byte): When set to 1, the V flag indicates that the source
      IP address is an IPv6 address.  For an IPv4 address, it is set
      to 0.

   o  Message Type (1 byte): This indicates the PAP message type.  The
      following types are defined:

      *  Type = TBD1: Capability Negotiation Message.  It is used for
         two devices to inform each other of the capabilities they
         support and no longer support.

      *  Type = TBD2: Request Message.

      *  Type = TBD3: Reply Message.

      *  Type = TBD4: Notification Message.

      *  Type = TBD5: ACK Message.  It is used to confirm to the local
         device that the remote device has received a previously sent
         PAP message, which can be a Negotiation Message, a Request
         Message, a Reply Message or a Notification Message.

   o  Length (2 bytes): Length of the message in bytes, including the
      Common Header and the following Message.

   o  Source IP Address (16 bytes, shown as Peer Address in Figure 2):
      It indicates the IP address of the device that initiates the PAP
      message.  An IPv4 address occupies the 4 least significant bytes
      of this field, with the 12 most significant bytes zero-filled; an
      IPv6 address occupies all 16 bytes.

   o  Message Sequence (2 bytes): It indicates the sequence number of
      each PAP message.

4.1.1.  Capability Negotiation Message

   The Negotiation Message is used in the PAP Capability Negotiation
   Process.  It is defined as follows.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +--------------------------------+------------------------------+
   |            Version             |A|E|           Flag           |
   +--------------------------------+------------------------------+
   |                      Protocol Capability                      |
   +---------------------------------------------------------------+

                    Figure 3.  PAP Negotiation Message

   o  Version (1 byte): It indicates the PAP version.  The current
      version is 0.

   o  Flags (1 byte): Two flag bits are currently defined.

      *  The A bit is used to indicate whether an ACK Message from the
         remote PAP speaker is required for each Negotiation Message
         sent.  If an ACK is required, the A bit SHOULD be set to "1",
         and to "0" otherwise.

      *  The E bit is used to indicate the enabling/disabling of the
         capabilities that are carried in the Protocol Capability
         field.  If the local device wants to inform the remote device
         of enabling one or more capabilities, the E bit SHOULD be set
         to "1".  If the local device wants to inform the remote device
         of disabling one or more capabilities, the E bit SHOULD be set
         to "0".

   o  Protocol Capability (4 bytes): It is a 4-byte bitmap that
      indicates the capability of information exchange regarding
      various protocols.  Each bit represents one protocol.  The
      following protocol capabilities are defined (counting from the
      rightmost bit).

      *  Bit 0: ISIS

      *  Bit 1: OSPF

      *  Bit 2: BGP

      *  Bit 3: LDP

4.2.  Request Message

   The Request Message is used for the local device to request specific
   data regarding one specific protocol or application from the remote
   device.  It MUST be sent after a successful Capability Negotiation
   Process (described in Section 5.1), and the requested protocol/
   application MUST be supported by both the local and remote devices,
   as indicated in the Negotiation Messages exchanged between the local
   and remote devices.  It is defined as follows.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+----------------+------------------------------+
   |A|    Flag     |  Prot. Capb.   |           Event ID           |
   +--------------------------------+------------------------------+
   |   Res. Hop    |
   +---------------+-----------------------------------------------+
   +                         Request Data                          +
   ~                                                               ~
   +---------------------------------------------------------------+

                      Figure 4.  PAP Request Message

   o  Flags (1 byte): Only the A bit is currently defined; the remaining
      bits are reserved.  The A bit is used to indicate whether an ACK
      Message from the remote PAP speaker is required for each Request
      Message sent.  If an ACK is required, the A bit SHOULD be set to
      "1", and to "0" otherwise.

   o  Capability (1 byte): It carries the bit index of the protocol for
      which the Request Message is requesting data.

   o  Event ID (2 bytes): It indicates the event number that this
      Request Message relates to.

   o  Residual Hop (1 byte): It indicates the residual Request hops of
      the current PAP Event.  It is reduced by 1 at each PAP speaker
      when generating a further PAP Request to a third PAP speaker.

   o  Request Data (Variable): It specifies the data that the local
      device is requesting.  The specific format remains to be
      determined for each protocol, as well as for each use case.

4.3.  Reply Message

   The Reply Message is used to carry the information that the local
   device requests from the remote device through the Request Message.
   It is defined as follows.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+----------------+------------------------------+
   |A|    Flag     |  Prot. Capb.   |           Event ID           |
   +---------------+----------------+------------------------------+
   +                          Reply Data                           +
   ~                                                               ~
   +---------------------------------------------------------------+

                       Figure 5.  PAP Reply Message

   o  Flags (1 byte): Only the A bit is currently defined; the remaining
      bits are reserved.  The A bit is used to indicate whether an ACK
      Message from the remote PAP speaker is required for each Reply
      Message sent.  If an ACK is required, the A bit SHOULD be set to
      "1", and to "0" otherwise.

   o  Capability (1 byte): It carries the bit index of the protocol for
      which the Reply Message is replying with data.

   o  Event ID (2 bytes): It indicates the event number that this Reply
      Message relates to.

   o  Reply Data (Variable): It carries the data that the local device
      is replying with.  The specific format remains to be determined
      for each protocol, as well as for each use case.

4.4.  Notification Message

   The Notification Message is used to carry the information that the
   local device sends to the remote device.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+----------------+------------------------------+
   |A|    Flag     |  Prot. Capb.   |           Event ID           |
   +---------------+----------------+------------------------------+
   +                       Notification Data                       +
   ~                                                               ~
   +---------------------------------------------------------------+

                   Figure 6.  PAP Notification Message

   o  Flags (1 byte): Only the A bit is currently defined; the remaining
      bits are reserved.  The A bit is used to indicate whether an ACK
      Message from the remote PAP speaker is required for each
      Notification Message sent.  If an ACK is required, the A bit
      SHOULD be set to "1", and to "0" otherwise.

   o  Capability (1 byte): It carries the bit index of the protocol
      that the Notification Message relates to.

   o  Event ID (2 bytes): It indicates the event number that this
      Notification Message relates to.

   o  Notification Data (Variable): It carries the data that the local
      device is notifying.  The specific format remains to be
      determined for each protocol, as well as for each use case.

4.5.  ACK Message

   The ACK Message is used to confirm that the remote device has
   received a PAP Message with the A bit set to "1".  The ACK Message
   includes only the PAP Common Header.  The Msg. Sequence MUST be set
   to the sequence number carried in the received PAP message that
   requires this ACK.

5.  PAP Operations

   The PAP operations include the following 3 major processes: the
   Capability Negotiation Process, the Data Request and Reply Process,
   and the Data Notification Process.

5.1.  Capability Negotiation Process

5.1.1.  PAP Peering Relation Establish Process

   A successful PAP Peering relation MUST be established between two
   PAP speakers before any other PAP process.
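
   The Common Header of Section 4.1 and the ACK rule of Section 4.5
   (an ACK carries only the Common Header and echoes the sequence
   number of the message being acknowledged) can be sketched in Python
   as follows.  This is only an illustration under stated assumptions:
   the V flag is taken to be the most significant bit of the Flag byte
   as drawn in Figure 2, MSG_TYPE_ACK is a placeholder because the real
   value is still TBD5, the ACK is assumed to carry the acknowledging
   device's own address, and all names are hypothetical.

```python
import ipaddress
import struct

# Flag (1), Msg. Type (1), Length (2), address (16), Msg. Sequence (2).
HEADER_FORMAT = "!BBH16sH"
HEADER_LEN = struct.calcsize(HEADER_FORMAT)  # 22 bytes
V_FLAG = 0x80        # assumption: V is the leftmost bit of the Flag byte
MSG_TYPE_ACK = 5     # placeholder: the draft leaves this value as TBD5


def pack_header(msg_type: int, source: str, sequence: int, payload_len: int = 0) -> bytes:
    """Build a Common Header; Length covers the header plus the message body."""
    addr = ipaddress.ip_address(source)
    flags = V_FLAG if addr.version == 6 else 0
    # An IPv4 address sits in the 4 low-order bytes; the other 12 are zero.
    addr_field = addr.packed.rjust(16, b"\x00")
    return struct.pack(HEADER_FORMAT, flags, msg_type,
                       HEADER_LEN + payload_len, addr_field, sequence)


def unpack_header(data: bytes) -> dict:
    """Parse the Common Header found at the start of a PAP message."""
    flags, msg_type, length, addr_field, sequence = struct.unpack(
        HEADER_FORMAT, data[:HEADER_LEN])
    if flags & V_FLAG:
        source = ipaddress.IPv6Address(addr_field)
    else:
        source = ipaddress.IPv4Address(addr_field[12:])
    return {"flags": flags, "msg_type": msg_type, "length": length,
            "source": source, "sequence": sequence}


def build_ack(received: bytes, local_address: str) -> bytes:
    """Build a header-only ACK that echoes the received sequence number."""
    sequence = unpack_header(received)["sequence"]
    return pack_header(MSG_TYPE_ACK, local_address, sequence)
```

   Whether an ACK has to be sent at all depends on the A bit of the
   received message body, which is outside this header-level sketch.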

   As the first step, a Capability Negotiation Message can be initiated
   at any time by a PAP speaker, as long as the target PAP peer is IP
   reachable.  It usually accompanies the establishment of a
   neighboring/peering relation between two routing devices.  The "A"
   bit in the Negotiation Message MUST be set to 1 during the PAP
   Peering Establish Process, meaning that an ACK is required.  The "E"
   bit in the Negotiation Message MUST be set to 1 during this process,
   meaning that the capabilities indicated in the Protocol Capability
   field are enabled by default.  The Protocol Capability field SHOULD
   indicate all the protocol capabilities that are supported by the
   local PAP Agent and currently enabled.  After the first Negotiation
   Message is sent, the local device SHOULD wait for the ACK Message
   from the remote device for a certain time period before taking
   further action.  If no ACK Message is received within this period,
   the local device SHOULD resend the Negotiation Message to the remote
   device.  The waiting period can be configured locally.  This
   send-and-wait process MAY be repeated at most 3 times before an ACK
   Message is received from the remote device.  If no ACK has been
   received after the Negotiation Message has been resent 3 times, the
   peering establishment is treated as unsuccessful.

   The next step for the local PAP speaker is to wait for the
   Negotiation Message from the remote PAP speaker.  If no Negotiation
   Message is received from the remote PAP speaker within a time frame
   after its own Negotiation Message is sent, the local PAP speaker MAY
   resend the Negotiation Message.  This time frame is also configured
   locally.  This send-and-wait process MAY be repeated at most 3 times
   before a Negotiation Message is received from the remote PAP
   speaker.
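
   The send-and-wait rule used here, and reused by the other PAP
   processes, can be sketched in Python as follows.  The sketch reads
   "at most 3 times" as one initial transmission followed by up to
   three retransmissions.  The function name send_and_wait, the two
   callbacks, and the default timeout are hypothetical: the draft only
   says that the waiting period is configured locally.

```python
from typing import Callable

MAX_RESENDS = 3  # "repeated at most 3 times" in the text


def send_and_wait(send: Callable[[], None],
                  wait_for_response: Callable[[float], bool],
                  timeout_s: float = 1.0,
                  max_resends: int = MAX_RESENDS) -> bool:
    """Send a message, then resend it until a response arrives.

    send()                 transmits the message once.
    wait_for_response(t)   returns True if the expected ACK (or peer
                           message) arrived within t seconds.
    Returns True on success and False when the process is to be
    treated as unsuccessful.
    """
    # One initial transmission plus up to max_resends retransmissions.
    for _ in range(1 + max_resends):
        send()
        if wait_for_response(timeout_s):
            return True
    return False
```

   The same helper applies whether the awaited response is an ACK
   Message or the Negotiation Message of the peer; only the
   wait_for_response callback differs.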

   If no Negotiation Message has been received after the local
   Negotiation Message has been resent 3 times, the negotiation is
   treated as unsuccessful.  If a Negotiation Message is received and
   parsed correctly, an ACK MUST be sent to the remote PAP speaker.

   Once an ACK Message and a Negotiation Message have been received
   from the remote PAP speaker and correctly parsed, the PAP Peering
   relation is considered successfully established.  The local PAP
   speaker stores the protocol capabilities of the remote PAP speaker
   locally and uses them during other PAP processes.

5.1.2.  PAP Capability Enabling Notification Process

   Once the PAP Peering relation is set up between two PAP speakers,
   they become PAP peers.  Thereafter, whenever a PAP speaker supports
   a new protocol capability, it SHOULD initiate the Capability
   Enabling Notification Process to inform all its PAP peers.

   When the local PAP speaker initiates a PAP Capability Enabling
   Notification Process: The "A" bit in the Negotiation Message MUST be
   set to 1 during the PAP Capability Enabling Notification Process,
   meaning that an ACK is required.  The "E" bit in the Negotiation
   Message MUST be set to 1 during this process, meaning that the
   capabilities indicated in the Protocol Capability field are enabled.
   The Protocol Capability field SHOULD indicate all the protocol
   capabilities that are supported by the local PAP Agent and currently
   enabled.  After the Negotiation Message is sent, the local PAP
   speaker SHOULD wait for the ACK Message from the PAP peer for a
   certain time period before taking further action.  If no ACK Message
   is received within this period, the local device SHOULD resend the
   Negotiation Message to the remote device.  The waiting period can be
   configured locally.  This send-and-wait process MAY be repeated at
   most 3 times before an ACK Message is received from the remote
   device.
   If no ACK has been received after the Negotiation Message has been
   resent 3 times, this Capability Enabling Notification Process is
   treated as unsuccessful.  The process MAY be initiated again at a
   later time.  If an ACK is received, the Capability Enabling
   Notification Process is considered successful.

   When a PAP peer initiates a PAP Capability Enabling Notification
   Process: The local PAP speaker, after receiving the PAP Negotiation
   Message and correctly parsing it, sends out an ACK.  This Capability
   Enabling Notification Process is then considered successful.  The
   local PAP speaker updates the capability status that it maintains
   accordingly.

5.1.3.  PAP Capability Disabling Notification Process

   Whenever a PAP speaker disables a PAP capability, it SHOULD initiate
   a PAP Capability Disabling Notification Process to inform all its
   PAP peers.

   When the local PAP speaker initiates a PAP Capability Disabling
   Notification Process: The "A" bit in the Negotiation Message MUST be
   set to 1 during the PAP Capability Disabling Notification Process,
   meaning that an ACK is required.  The "E" bit in the Negotiation
   Message MUST be set to 0 during this process, meaning that the
   capabilities indicated in the Protocol Capability field are
   disabled.  The Protocol Capability field SHOULD indicate all the
   protocol capabilities that are disabled.  After the Negotiation
   Message is sent, the local PAP speaker SHOULD wait for the ACK
   Message from the PAP peer for a certain time period before taking
   further action.  If no ACK Message is received within this period,
   the local device SHOULD resend the Negotiation Message to the remote
   device.  The waiting period can be configured locally.  This
   send-and-wait process MAY be repeated at most 3 times before an ACK
   Message is received from the remote device.
   If no ACK has been received after the Negotiation Message has been
   resent 3 times, this Capability Disabling Notification Process is
   treated as unsuccessful.  The process MAY be initiated again at a
   later time.

   When a PAP peer initiates a PAP Capability Disabling Notification
   Process: The local PAP speaker, after receiving the PAP Negotiation
   Message and correctly parsing it, sends out an ACK.  This Capability
   Disabling Notification Process is then considered successful.  The
   local PAP speaker updates the capability status that it maintains
   accordingly.

5.2.  PAP Request and Reply Process

   When a local PAP Event triggers a PAP Request and Reply Process, the
   local PAP speaker initiates a Request Message and sends it to the
   target PAP peer indicated by the PAP Agent for this PAP Event.  This
   local PAP speaker is called the Request and Reply Process Starter.
   It sets the Residua Hop to the maximum number of Request/Reply
   rounds (e.g., 10) that it will wait for in order to receive the
   final Reply.  The Event ID and the Request Data are set by the local
   PAP Agent.  The A bit of the Request Message MUST be set to "1"
   (i.e., an ACK is required).  The local device waits for the ACK
   Message from the remote device for a certain time period before
   taking further action.  If no ACK Message is received within this
   period, the local device SHOULD resend the Request Message to the
   remote device.  The waiting period can be configured locally.  This
   send-and-wait process MAY be repeated at most 3 times before an ACK
   Message is received from the remote device.  If no ACK has been
   received after the Request Message has been resent 3 times, this
   Request and Reply Process is treated as unsuccessful.  If an ACK is
   received, the local device waits for the Reply Message.  If no Reply
   Message is received from the remote device within a time frame, the
   local device can resend the Request Message.
   This send-and-wait process MAY be repeated at most 3 times before a
   Reply Message is received from the remote device.  If no Reply
   Message has been received after the Request Message has been resent
   3 times, this Request and Reply Process is treated as unsuccessful.
   The waiting period can be configured locally and SHOULD take the
   Residua Hop value into consideration.  If the Request and Reply
   Process Starter receives the Reply Message within the time frame and
   the Event ID matches the local PAP Event, the PAP Request and Reply
   Process is considered successful.

   When a local PAP speaker receives a Request Message from its PAP
   peer (i.e., it is not the Request and Reply Process Starter), it
   sends back an ACK Message.  With the received Request Message, a new
   PAP event is instantiated at the local PAP Agent.  The PAP event
   triggers the troubleshooting analysis of the received Request
   Message, which then generates a Reply Message if the Reply condition
   is met, or a new Request Message if the Reply condition is not met.
   The Reply condition and the troubleshooting analysis of the PAP
   Agent are out of the scope of this document.

   If the Reply condition is met, the local PAP speaker is called the
   Request and Reply Process Terminator.  It generates the Reply
   Message and sends it back to the requesting PAP peer.  The Event ID
   is set to the Event ID of the received Request Message.  The Reply
   Data is set by the local PAP Agent for this generated event.  The A
   bit of the Reply Message MUST be set to "1" (i.e., an ACK is
   required).  The local device waits for the ACK Message from the
   remote device for a certain time period before taking further
   action.  If no ACK Message is received within this period, the local
   device SHOULD resend the Reply Message to the remote device.  The
   waiting period can be configured locally.
   This send-and-wait process MAY be repeated at most 3 times before an
   ACK Message is received from the remote device.  If no ACK has been
   received after the Reply Message has been resent 3 times, this
   Request and Reply Process is treated as unsuccessful.

   If the Reply condition is not met, the local PAP speaker is called
   the Request and Reply Process mid-handler.  It generates a new
   Request Message and sends it to a third PAP speaker, as indicated by
   the local PAP Agent for this generated event.  In the newly
   generated Request Message, the Residua Hop value MUST be reduced by
   1.  The A bit of the Request Message MUST be set to "1" (i.e., an
   ACK is required).  The local device waits for the ACK Message from
   the remote device for a certain time period before taking further
   action.  If no ACK Message is received within this period, the local
   device SHOULD resend the Request Message to the remote device.  The
   waiting period can be configured locally.  This send-and-wait
   process MAY be repeated at most 3 times before an ACK Message is
   received from the remote device.  If no ACK has been received after
   the Request Message has been resent 3 times, this Request and Reply
   Process is treated as unsuccessful.  If an ACK is received, the
   local device waits for the Reply Message.  If no Reply Message is
   received from the remote device within a time frame, the local
   device can resend the Request Message.  This send-and-wait process
   MAY be repeated at most 3 times before a Reply Message is received
   from the remote device.  If no Reply Message has been received after
   the Request Message has been resent 3 times, this Request and Reply
   Process is treated as unsuccessful.  The waiting period can be
   configured locally and SHOULD take the Residua Hop value into
   consideration.
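
   The Starter, mid-handler, and Terminator roles of this section can
   be illustrated with a small in-memory Python sketch.  ACK handling
   and retransmission are left out.  Two behaviours are assumptions,
   because the text does not specify them: a mid-handler gives up when
   the received Residua Hop is 0, and it uses a locally allocated Event
   ID in the Request Message that it forwards.  The class and attribute
   names are hypothetical.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Optional


@dataclass
class Request:
    event_id: int
    residua_hop: int
    data: str


@dataclass
class Reply:
    event_id: int
    data: str


class Speaker:
    def __init__(self, name: str, answer: Optional[str] = None,
                 next_hop: Optional["Speaker"] = None) -> None:
        self.name = name
        self.answer = answer        # set when the Reply condition is met here
        self.next_hop = next_hop    # third PAP speaker chosen by the PAP Agent
        self.seen_hops: List[int] = []        # Residua Hop values received
        self._local_event_ids = count(100)    # assumption: local Event IDs

    def handle(self, req: Request) -> Optional[Reply]:
        self.seen_hops.append(req.residua_hop)
        if self.answer is not None:
            # Terminator: the Reply reuses the Event ID of the Request.
            return Reply(req.event_id, self.answer)
        if self.next_hop is None or req.residua_hop == 0:
            # Assumption: an exhausted Residua Hop ends the process.
            return None
        # Mid-handler: new Request with the Residua Hop reduced by 1.
        forwarded = Request(next(self._local_event_ids),
                            req.residua_hop - 1, req.data)
        downstream = self.next_hop.handle(forwarded)
        if downstream is None:
            return None
        # The new Reply carries the Event ID of the Request received here.
        return Reply(req.event_id, downstream.data)
```

   With a chain B -> C -> D in which only D meets the Reply condition,
   a Request with Residua Hop 10 sent to B reaches D with Residua Hop
   8, and the Reply returned by B carries the Event ID chosen by the
   Starter.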

   If the local device receives the Reply Message within the time
   frame, it generates a new Reply Message and sends it back to its
   requesting PAP peer.  The Event ID of the new Reply Message is set
   to the Event ID of the received Request Message.

5.3.  PAP Notification Process

   When a local PAP Event triggers a PAP Notification Process, the
   local PAP speaker initiates a Notification Message.  The target PAP
   peer(s) is/are selected by the PAP Agent for the current PAP Event;
   the selection is out of the scope of this document.  The
   Notification Message may or may not require an ACK Message, as
   indicated by the A bit in the Notification Message.  If the A bit is
   set to 1 (meaning that an ACK is required), the local device waits
   for the ACK Message from the remote device for a certain time period
   before taking further action.  If no ACK Message is received within
   this period, the local device SHOULD resend the Notification Message
   to the remote device.  The waiting period can be configured locally.
   This send-and-wait process MAY be repeated at most 3 times before an
   ACK Message is received from the remote device.  If no ACK has been
   received after the Notification Message has been resent 3 times,
   this Notification Process is treated as unsuccessful.  If an ACK is
   received within the time frame, the Notification Process is
   considered successful.  If the A bit is set to 0 (meaning that no
   ACK is required), the Notification Process is considered successful
   once the Notification Message has been sent.

6.  PAP Error Handling

   When any PAP process is unsuccessful, the local PAP Agent may or may
   not record information about the failure.  No further action is
   taken.

7.  Discussion

   In addition to the preceding message definitions and process
   descriptions, the security and reliability requirements of PAP need
   to be considered.
   There are two possible options to implement PAP.

   -  Option 1: PAP is developed independently as a new protocol.

   -  Option 2: PAP reuses the existing Generic Autonomic Signaling
      Protocol (GRASP) [I-D.ietf-anima-grasp].

   Option 1:

   1.  Definition of the Message Format and Interaction Process: These
   can be defined independently in PAP.

   2.  Reliability: The transmission mode of PAP is based on UDP, mainly
   because the collected information is auxiliary information that
   helps locate protocol faults, so its loss has no impact on the
   service.  In addition, if TCP were adopted, the resource consumption
   of the device could be large, especially when there are a large
   number of neighbors.  If PAP must ensure reliability, this can be
   done at the application layer, for example by adding a sequence
   number to the message.

   3.  Security: MD5 authentication can be introduced for PAP security.

   Option 2:

   ANIMA GRASP is a signaling protocol used for dynamic peer discovery,
   status synchronization, and parameter negotiation between AS nodes
   or AS service agents.  GRASP specifies that unicast packets must be
   transmitted over TCP, and multicast packets (Discovery and Flood)
   must be transmitted over UDP.

   1.  Message format and interaction process: PAP can reuse the
   messages and procedures defined by GRASP.  The messages defined in
   PAP include the Capability Negotiation Message, Request Message,
   Reply Message, and Notification Message.  These message types are
   also defined in GRASP.

   2.  Reliability: The TCP mode of GRASP can be used to ensure
   reliability for PAP, but this may be a challenge for device
   resources.

   3.  Security: The Autonomic Control Plane (ACP)
   [I-D.ietf-anima-autonomic-control-plane] can be reused.

8.  Security Considerations

   TBD

9.  IANA

   TBD

10.  Contributors

   We thank Jiaqing Zhang (Huawei), Tao Du (Huawei) and Lei Li (Huawei)
   for their contributions.

11.  Acknowledgments

12.  References

   [I-D.brockners-inband-oam-requirements]
              Brockners, F., Bhandari, S., Dara, S., Pignataro, C.,
              Gredler, H., Leddy, J., Youell, S., Mozes, D., Mizrahi,
              T., Lapukhov, P., and r. remy@barefootnetworks.com,
              "Requirements for In-situ OAM",
              draft-brockners-inband-oam-requirements-03 (work in
              progress), March 2017.

   [I-D.ietf-anima-autonomic-control-plane]
              Eckert, T., Behringer, M., and S. Bjarnason, "An Autonomic
              Control Plane (ACP)",
              draft-ietf-anima-autonomic-control-plane-30 (work in
              progress), October 2020.

   [I-D.ietf-anima-grasp]
              Bormann, C., Carpenter, B., and B. Liu, "A Generic
              Autonomic Signaling Protocol (GRASP)",
              draft-ietf-anima-grasp-15 (work in progress), July 2017.

   [I-D.ietf-netconf-yang-push]
              Clemm, A. and E. Voit, "Subscription to YANG Datastores",
              draft-ietf-netconf-yang-push-25 (work in progress), May
              2019.

   [I-D.song-ntf]
              Song, H., Zhou, T., Li, Z., Fioccola, G., Li, Z.,
              Martinez-Julia, P., Ciavaglia, L., and A. Wang, "Toward a
              Network Telemetry Framework", draft-song-ntf-02 (work in
              progress), July 2018.

   [RFC1157]  Case, J., Fedor, M., Schoffstall, M., and J. Davin,
              "Simple Network Management Protocol (SNMP)", RFC 1157,
              DOI 10.17487/RFC1157, May 1990.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              DOI 10.17487/RFC1191, November 1990.

   [RFC1195]  Callon, R., "Use of OSI IS-IS for routing in TCP/IP and
              dual environments", RFC 1195, DOI 10.17487/RFC1195,
              December 1990.

   [RFC1213]  McCloghrie, K. and M. Rose, "Management Information Base
              for Network Management of TCP/IP-based internets: MIB-II",
              STD 17, RFC 1213, DOI 10.17487/RFC1213, March 1991.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001.

   [RFC3988]  Black, B. and K. Kompella, "Maximum Transmission Unit
              Signalling Extensions for the Label Distribution
              Protocol", RFC 3988, DOI 10.17487/RFC3988, January 2005.

   [RFC6241]  Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed.,
              and A. Bierman, Ed., "Network Configuration Protocol
              (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011.

   [RFC7752]  Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and
              S. Ray, "North-Bound Distribution of Link-State and
              Traffic Engineering (TE) Information Using BGP", RFC 7752,
              DOI 10.17487/RFC7752, March 2016.

   [RFC7854]  Scudder, J., Ed., Fernando, R., and S. Stuart, "BGP
              Monitoring Protocol (BMP)", RFC 7854,
              DOI 10.17487/RFC7854, June 2016.

Authors' Addresses

   Zhenbin Li
   Huawei
   156 Beiqing Rd
   Beijing
   China

   Email: lizhenbin@huawei.com


   Shuanglong Chen
   Huawei
   156 Beiqing Road
   Beijing, 100095
   P.R. China

   Email: chenshuanglong@huawei.com


   Yunan Gu
   Huawei
   156 Beiqing Rd
   Beijing
   China

   Email: guyunan@huawei.com