TEAS Working Group                                        V. Beeram, Ed.
Internet-Draft                                          Juniper Networks
Intended status: Standards Track                                I. Minei
Expires: August 18, 2018                                       R. Shakir
                                                             Google, Inc
                                                              D. Pacella
                                                                 Verizon
                                                                 T. Saad
                                                           Cisco Systems
                                                       February 14, 2018


   Techniques to Improve the Scalability of RSVP Traffic Engineering
                              Deployments
                 draft-ietf-teas-rsvp-te-scaling-rec-09

Abstract

   At the time of writing, networks which utilize RSVP Traffic
   Engineering (RSVP-TE) Label Switched Paths (LSPs) are encountering
   limitations in the ability of implementations to support the growth
   in the number of LSPs deployed.

   This document defines two techniques, "Refresh-Interval Independent
   RSVP (RI-RSVP)" and "Per-Peer Flow-Control", that reduce the number
   of processing cycles required to maintain RSVP-TE LSP state in Label
   Switching Routers (LSRs) and hence allow implementations to support
   larger scale deployments.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 18, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirement for RFC2961 Support
     2.1.  Required Functionality from RFC2961 to be Implemented
     2.2.  Making Acknowledgements Mandatory
   3.  Refresh-Interval Independent RSVP (RI-RSVP)
     3.1.  Capability Advertisement
     3.2.  Compatibility
   4.  Per-Peer RSVP Flow-Control
     4.1.  Capability Advertisement
     4.2.  Compatibility
   5.  Acknowledgements
   6.  Contributors
   7.  IANA Considerations
     7.1.  Capability Object Values
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Recommended Defaults
   Authors' Addresses

1.  Introduction

   At the time of writing, networks which utilize RSVP Traffic
   Engineering (RSVP-TE) [RFC3209] Label Switched Paths (LSPs) are
   encountering limitations in the ability of implementations to support
   the growth in the number of LSPs deployed.

   The set of RSVP Refresh Overhead Reduction procedures [RFC2961]
   serves as a powerful toolkit for RSVP-TE implementations to help
   cover a majority of the concerns about soft-state scaling.  However,
   even with these tools in the toolkit, analysis of existing
   implementations [RFC5439] indicates that the processing required
   beyond a certain scale may still cause significant disruption to a
   Label Switching Router (LSR).

   This document builds on the scaling work and analysis that has been
   done so far and defines protocol extensions to help RSVP-TE
   deployments push the envelope further on scaling by increasing the
   threshold above which an LSR struggles to achieve sufficient
   processing to maintain LSP state.

   This document defines two techniques, "Refresh-Interval Independent
   RSVP (RI-RSVP)" and "Per-Peer Flow-Control", that cut down the number
   of processing cycles required to maintain LSP state.  "RI-RSVP" helps
   completely eliminate RSVP's reliance on refreshes and
   refresh-timeouts, while "Per-Peer Flow-Control" enables a busy RSVP
   speaker to apply back pressure to its peer(s).  This document defines
   a unique RSVP Capability [RFC5063] for each technique (support for
   the CAPABILITY object is a prerequisite for implementing these
   techniques).  Note that the "Per-Peer Flow-Control" technique
   requires the "RI-RSVP" technique as a prerequisite.  In order to reap
   maximum scaling benefits, it is strongly recommended that
   implementations support both techniques and have them enabled by
   default.  Both techniques are fully backward compatible and can be
   deployed incrementally.

2.  Requirement for RFC2961 Support

   The techniques defined in Section 3 and Section 4 are based on
   proposals made in [RFC2961].  Implementations of these techniques
   will need to support the RSVP messages and procedures defined in
   [RFC2961] with some minor modifications and alterations to
   recommended time intervals and iteration counts (see Appendix A for
   the set of recommended defaults).

2.1.  Required Functionality from RFC2961 to be Implemented

   An implementation that supports the techniques discussed in Section 3
   and Section 4 must support the functionality described in [RFC2961]
   as follows:

   o  It MUST indicate support for RSVP Refresh Overhead Reduction
      extensions (as specified in Section 2 of [RFC2961]).

   o  It MUST support receipt of any RSVP Refresh Overhead Reduction
      message as defined in [RFC2961].

   o  It MUST initiate all RSVP Refresh Overhead Reduction mechanisms as
      defined in [RFC2961] (including the SRefresh message) with the
      default behavior being to initiate the mechanisms but offering a
      configuration override.

   o  It MUST support reliable delivery of Path/Resv and the
      corresponding Tear/Err messages (as specified in Section 4 of
      [RFC2961]).

   o  It MUST support retransmission of all unacknowledged RSVP-TE
      messages using exponential-backoff (as specified in Section 6 of
      [RFC2961]).

2.2.  Making Acknowledgements Mandatory

   The reliable message delivery mechanism specified in [RFC2961] states
   that "Nodes receiving a non-out of order message containing a
   MESSAGE_ID object with the ACK_Desired flag set, SHOULD respond with
   a MESSAGE_ID_ACK object."

   In an implementation that supports the techniques discussed in
   Section 3 and Section 4, nodes receiving a non-out of order message
   containing a MESSAGE_ID object with the ACK_Desired flag set MUST
   respond with a MESSAGE_ID_ACK object.
   This MESSAGE_ID_ACK object can
   be packed along with other MESSAGE_ID_ACK or MESSAGE_ID_NACK objects
   and sent in an Ack message (or piggy-backed in any other RSVP
   message).  This improvement to the predictability of the system in
   terms of reliable message delivery is key for being able to take any
   action based on a non-receipt of an ACK.

3.  Refresh-Interval Independent RSVP (RI-RSVP)

   The RSVP protocol relies on periodic refreshes for state
   synchronization between RSVP neighbors and for recovery from lost
   RSVP messages.  It relies on refresh timeout for stale state cleanup.
   The primary motivation behind introducing the notion of "Refresh
   Interval Independent RSVP" (RI-RSVP) is to completely eliminate
   RSVP's reliance on refreshes and refresh timeouts.  This is done by
   simply increasing the refresh interval to a fairly large value.
   [RFC2961] and [RFC5439] do talk about increasing the value of the
   refresh interval to provide linear improvement of transmission
   overhead, but also point out the degree of functionality that is lost
   by doing so.  This section revisits this notion, but also sets out
   additional requirements to make sure that there is no loss of
   functionality incurred by increasing the value of the refresh
   interval.

   An implementation that supports RI-RSVP:

   o  MUST support all the requirements specified in Section 2.

   o  MUST make the default value of the configurable refresh interval
      (R) be a large value (10s of minutes).  A default value of 20
      minutes is RECOMMENDED by this document.

   o  MUST use a separate shorter refresh interval for refreshing state
      associated with unacknowledged Path/Resv messages (uR).  A default
      value of 30 seconds is RECOMMENDED by this document.

   o  MUST implement coupling the state of individual LSPs with the
      state of the corresponding RSVP-TE signaling adjacency.
      When an
      RSVP-TE speaker detects RSVP-TE signaling adjacency failure, the
      speaker MUST act as if all the Path and Resv states learnt via the
      failed signaling adjacency have timed out.

   o  MUST make use of Node-ID based Hello Session ([RFC3209],
      [RFC4558]) for detection of RSVP-TE signaling adjacency failures.
      A default value of 9 seconds is RECOMMENDED by this document for
      the configurable node hello interval (as opposed to the 5 ms
      default value proposed in Section 5.3 of [RFC3209]).

   o  MUST indicate support for RI-RSVP via the CAPABILITY object
      [RFC5063] in Hello messages.

3.1.  Capability Advertisement

   An implementation supporting the RI-RSVP technique MUST set a new
   flag "RI-RSVP Capable" in the CAPABILITY object signaled in Hello
   messages.

   Bit Number TBA1 (TBA2) - RI-RSVP Capable (I-bit):

      Indicates that the sender supports RI-RSVP.

   Any node that sets the new I-bit in its CAPABILITY object MUST also
   set the Refresh-Reduction-Capable bit in the common header of all
   RSVP-TE messages.  If a peer sets the I-bit in the CAPABILITY object
   but does not set the Refresh-Reduction-Capable bit, then the RI-RSVP
   functionality MUST NOT be activated for that peer.

3.2.  Compatibility

   The RI-RSVP functionality MUST NOT be activated with a peer that does
   not indicate support for this functionality.  Inactivation of the
   RI-RSVP functionality MUST result in the use of the traditional
   smaller refresh interval [RFC2205].

4.  Per-Peer RSVP Flow-Control

   The functionality discussed in this section provides an RSVP speaker
   with the ability to apply back pressure to its peer(s) to reduce or
   eliminate a significant portion of the RSVP-TE control message load.

   An implementation that supports "Per-Peer RSVP Flow-Control":

   o  MUST support all the requirements specified in Section 2.

   o  MUST support "RI-RSVP" (Section 3).

   o  MUST treat lack of ACKs from a peer as an indication of the peer's
      RSVP-TE control plane congestion.  If congestion is detected, the
      local system MUST throttle RSVP-TE messages to the affected peer.
      This MUST be done on a per-peer basis.  (Per-peer throttling MAY
      be implemented by a traffic shaping mechanism that proportionally
      reduces the RSVP signaling packet rate as the number of
      outstanding Acks increases.  When the number of outstanding Acks
      decreases, the send rate would be adjusted up again.)

   o  SHOULD use a Retry Limit (Rl) value of 7 (Section 6.2 of
      [RFC2961] suggests using 3).

   o  SHOULD prioritize Hello messages and messages carrying
      Acknowledgements over other RSVP messages.

   o  SHOULD prioritize Tear/Error over trigger Path/Resv (messages that
      bring up new LSP state) sent to a peer when the local system
      detects RSVP-TE control plane congestion in the peer.

   o  MUST indicate support for this technique via the CAPABILITY object
      [RFC5063] in Hello messages.

4.1.  Capability Advertisement

   An implementation supporting the "Per-Peer Flow-Control" technique
   MUST set a new flag "Per-Peer Flow-Control Capable" in the CAPABILITY
   object signaled in Hello messages.

   Bit Number TBA3 (TBA4) - Per-Peer Flow-Control Capable (F-bit):

      Indicates that the sender supports Per-Peer RSVP Flow-Control.

   Any node that sets the new F-bit in its CAPABILITY object MUST also
   set the Refresh-Reduction-Capable bit in the common header of all
   RSVP-TE messages.  If a peer sets the F-bit in the CAPABILITY object
   but does not set the Refresh-Reduction-Capable bit, then the Per-Peer
   Flow-Control functionality MUST NOT be activated for that peer.

4.2.  Compatibility

   The Per-Peer Flow-Control functionality MUST NOT be activated with a
   peer that does not indicate support for this functionality.
   If a
   peer hasn't indicated that it is capable of participating in
   "Per-Peer Flow-Control", then it SHOULD NOT be assumed that the peer
   would always acknowledge a non-out of order message containing a
   MESSAGE_ID object with the ACK_Desired flag set.

5.  Acknowledgements

   The authors would like to thank Yakov Rekhter for initiating this
   work and providing valuable inputs.  They would like to thank
   Raveendra Torvi and Chandra Ramachandran for participating in the
   many discussions that led to the techniques discussed in this
   document.  They would also like to thank Adrian Farrel, Lou Berger
   and Elwyn Davies for providing detailed review comments and text
   suggestions.

6.  Contributors

   Markus Jork
   Juniper Networks
   Email: mjork@juniper.net

   Ebben Aries
   Juniper Networks
   Email: exa@juniper.net

7.  IANA Considerations

7.1.  Capability Object Values

   IANA maintains all the registries associated with "Resource
   Reservation Protocol (RSVP) Parameters" (see
   http://www.iana.org/assignments/rsvp-parameters/rsvp-parameters.xhtml).
   The "Capability Object Values" Registry (introduced by [RFC5063]) is
   one of them.

   IANA is requested to assign two new Capability Object Value bit flags
   as follows:

      Bit      Hex     Name                                Reference
      Number   Value
      ------------------------------------------------------------------
      TBA1     TBA2    RI-RSVP Capable (I)                 Section 3
      TBA3     TBA4    Per-Peer Flow-Control Capable (F)   Section 4

8.  Security Considerations

   This document does not introduce new security issues.  The security
   considerations pertaining to the original RSVP protocol [RFC2205] and
   RSVP-TE [RFC3209] and those that are described in [RFC5920] remain
   relevant.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2205]  Braden, R., Ed., Zhang, L., Berson, S., Herzog, S., and S.
              Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1
              Functional Specification", RFC 2205, DOI 10.17487/RFC2205,
              September 1997.

   [RFC2961]  Berger, L., Gan, D., Swallow, G., Pan, P., Tommasi, F.,
              and S. Molendini, "RSVP Refresh Overhead Reduction
              Extensions", RFC 2961, DOI 10.17487/RFC2961, April 2001.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001.

   [RFC4558]  Ali, Z., Rahman, R., Prairie, D., and D. Papadimitriou,
              "Node-ID Based Resource Reservation Protocol (RSVP) Hello:
              A Clarification Statement", RFC 4558,
              DOI 10.17487/RFC4558, June 2006.

   [RFC5063]  Satyanarayana, A., Ed. and R. Rahman, Ed., "Extensions to
              GMPLS Resource Reservation Protocol (RSVP) Graceful
              Restart", RFC 5063, DOI 10.17487/RFC5063, October 2007.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

9.2.  Informative References

   [RFC5439]  Yasukawa, S., Farrel, A., and O. Komolafe, "An Analysis of
              Scaling Issues in MPLS-TE Core Networks", RFC 5439,
              DOI 10.17487/RFC5439, February 2009.

   [RFC5920]  Fang, L., Ed., "Security Framework for MPLS and GMPLS
              Networks", RFC 5920, DOI 10.17487/RFC5920, July 2010.

Appendix A.  Recommended Defaults

   (a) Refresh-Interval (R) - 20 minutes (Section 3):
   Given that an implementation supporting RI-RSVP doesn't rely on
   refreshes for state sync between peers, the function of RSVP
   refresh interval is analogous to that of IGP refresh interval (the
   default of which is typically in the order of 10s of minutes).
   Choosing a default of 20 minutes allows the refresh timer to be
   randomly set to a value in the range [10 minutes (0.5R), 30
   minutes (1.5R)].

   (b) Node Hello-Interval - 9 seconds (Section 3):
   [RFC3209] defines the hello timeout as 3.5 times the hello
   interval.  Choosing 9 seconds for the node hello-interval gives a
   hello timeout of 3.5*9 = 31.5 seconds.  This puts the hello
   timeout value in the vicinity of the IGP hello timeout value.

   (c) Retry-Limit (Rl) - 7 (Section 4):
   Choosing 7 as the retry-limit results in an overall rapid
   retransmit phase of 31.5 seconds.  This matches up with the
   31.5-second hello timeout.

   (d) Refresh-Interval for refreshing state associated with
   unacknowledged Path/Resv messages (uR) - 30 seconds (Section 3):
   The recommended refresh interval (R) value of 20 minutes (for an
   implementation supporting RI-RSVP) cannot be used for refreshing
   state associated with unacknowledged Path/Resv messages.  This
   document recommends the use of the traditional default refresh
   interval value of 30 seconds for uR.

Authors' Addresses

   Vishnu Pavan Beeram (editor)
   Juniper Networks

   Email: vbeeram@juniper.net


   Ina Minei
   Google, Inc

   Email: inaminei@google.com


   Rob Shakir
   Google, Inc

   Email: rjs@rob.sh


   Dante Pacella
   Verizon

   Email: dante.j.pacella@verizon.com


   Tarek Saad
   Cisco Systems

   Email: tsaad@cisco.com
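
   The timer arithmetic behind the Appendix A defaults can be checked
   with a short Python sketch.  It is illustrative only and not part of
   the specification.  The function names are hypothetical.  The sketch
   assumes the [RFC2961] defaults for the rapid retransmission interval
   (Rf = 500 ms) and increment (Delta = 1), and it takes the "rapid
   retransmit phase" to mean the time between the first and the last of
   Rl transmissions.

```python
"""Illustrative check of the Appendix A timer arithmetic (not normative)."""


def hello_timeout(hello_interval_s, multiplier=3.5):
    """Hello timeout: 3.5 times the hello interval, as in Appendix A (b)."""
    return hello_interval_s * multiplier


def rapid_retransmit_span(retry_limit, rf_s=0.5, delta=1.0):
    """Seconds between the first and the last of `retry_limit` transmissions.

    Assumption: rf_s and delta are the RFC 2961 defaults (500 ms and 1),
    and each wait is (1 + delta) times the previous one.
    """
    total = 0.0
    interval = rf_s
    # retry_limit transmissions are separated by retry_limit - 1 waits.
    for _ in range(retry_limit - 1):
        total += interval
        interval *= 1 + delta
    return total


def refresh_timer_range(refresh_interval_s):
    """Randomization range [0.5R, 1.5R] for the refresh timer, in seconds."""
    return (0.5 * refresh_interval_s, 1.5 * refresh_interval_s)


if __name__ == "__main__":
    print(hello_timeout(9))              # 31.5
    print(rapid_retransmit_span(7))      # 31.5
    print(refresh_timer_range(20 * 60))  # (600.0, 1800.0)
```

   Under the same assumptions, a retry limit of 3 gives a span of only
   1.5 seconds, which suggests why the larger value of 7 is needed for
   the retransmit phase to line up with the 31.5-second hello timeout.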