Network Working Group                                        B. Campbell
Internet-Draft                                                   Tekelec
Intended status: Standards Track                           H. Tschofenig
Expires: August 22, 2013                          Nokia Siemens Networks
                                                             J. Korhonen
                                                          Renesas Mobile
                                                                A. Roach
                                                                 Mozilla
                                                       February 18, 2013


                     Diameter Overload Data Analysis
              draft-campbell-dime-overload-data-analysis-00

Abstract

   When a Diameter server or agent becomes overloaded, it needs to be
   able to gracefully reduce its load, typically by informing clients to
   reduce sending traffic for some period of time.  Multiple mechanisms
   have been proposed for transporting overload and load information.
   While these proposals differ in many ways, they share similar data
   requirements.  This document analyzes the data requirements of each
   proposal with a view towards proposing a common set of Diameter
   Attribute-Value Pairs (AVPs).

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 22, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Documentation Conventions
   3.  Overload Control Data Usage
   4.  Mechanism Differences that Affect Data Structures
     4.1.  Non-Adjacent Nodes
     4.2.  Stateless Negotiation
     4.3.  Overload Scopes
     4.4.  Hard or Soft Overload State
   5.  Naming Conventions
   6.  Data Element Comparison
     6.1.  Data Elements for Connection Establishment and Negotiation
       6.1.1.  Supported Scope Selection
       6.1.2.  Algorithm Selection
       6.1.3.  Application Selection
       6.1.4.  Frequency of Reports
       6.1.5.  Grouping
     6.2.  Data Elements for Overload and Load reporting
       6.2.1.  Scope of Report
       6.2.2.  Overload Severity
       6.2.3.  Report Algorithm
       6.2.4.  Report Expiration
       6.2.5.  Current Load
       6.2.6.  Applications covered by a Report
       6.2.7.  Report Action
       6.2.8.  Priority
       6.2.9.  Session Groups
     6.3.  Result Codes
   7.  IANA Considerations
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Contributors
   Authors' Addresses

1.  Introduction

   When a Diameter [RFC6733] server or agent becomes overloaded, it
   needs to be able to gracefully reduce its load, typically by
   informing clients to reduce sending traffic for some period of time.
   The Diameter Overload Control Requirements
   [I-D.ietf-dime-overload-reqs] describe requirements for overload
   control mechanisms.

   At the time of this writing, there have been two proposals for
   Diameter overload control mechanisms.  "A Mechanism for Diameter
   Overload Control" (MDOC) [I-D.roach-dime-overload-ctrl] defines a
   mechanism that piggybacks overload and load state information over
   existing Diameter messages.  "The Diameter Overload Control
   Application" (DOCA) [I-D.korhonen-dime-ovl] defines a mechanism that
   uses a new and distinct Diameter application to communicate similar
   information.  While there are significant differences between the two
   proposals, they carry similar information.  Each proposal includes
   its own set of Diameter AVPs.

   This document is intended as a framework for discussing the data
   requirements of the two proposals.  It includes an analysis of the
   differences and similarities of their respective data elements, with
   a view towards rationalizing the AVPs from the two proposals.

   The authors expect that a follow-on effort will eventually specify
   a common data model for reporting Diameter overload information.

   This document assumes that Diameter nodes exchange overload control
   information via Diameter, rather than via some out-of-band channel.
   This document does not address the specific differences between the
   two mechanism proposals, except where they affect the AVP
   definitions.

2.  Documentation Conventions

   This document uses terms defined in [RFC6733] and
   [I-D.ietf-dime-overload-reqs].

3.  Overload Control Data Usage

   A Diameter overload control mechanism based on the overload control
   requirements [I-D.ietf-dime-overload-reqs] involves the exchange of
   information between two or more Diameter nodes.  The exchanged
   information serves three distinct purposes:

   Negotiation:  Diameter nodes need to negotiate support for the
      overload control mechanism in general.  Nodes that support
      overload control need to advertise the overload control scopes
      they can support.  Finally, they need to select an overload
      control algorithm.

   Communication of Overload State:  Nodes need to report that an
      overload condition is in effect, to what degree they are
      overloaded, and the scope of the overload condition.  We refer to
      such a communication as an "Overload Report".

   Communication of Load:  Nodes need to communicate their current load
      status, even when not in an overloaded state.

   Overload Control information may be communicated between adjacent
   Diameter nodes, or it may cross one or more intervening nodes.
   Overload Control information can be communicated in either direction;
   that is, a downstream node can indicate overload to an upstream node,
   or vice versa.

      Open Issue: There is an ongoing discussion about whether the
      overload control mechanism should be strictly hop-by-hop, or
      whether it should support communication between non-adjacent
      nodes.
      The results of this discussion may have implications for
      overload control data elements.

4.  Mechanism Differences that Affect Data Structures

   While a thorough comparison of the two proposed mechanisms is out of
   scope for this document, there are a few differences that directly
   impact the choice of data elements.

4.1.  Non-Adjacent Nodes

   MDOC only supports hop-by-hop communication of overload information.
   DOCA allows for the possibility of communication between non-adjacent
   nodes.  For hop-by-hop communication, the originator of an overload
   report is always the directly connected node.  If non-adjacent
   communication is to be allowed, the data model needs a way to express
   the identity of the originating node.

4.2.  Stateless Negotiation

   Both MDOC and DOCA allow overload control parameters to be negotiated
   at the beginning of a connection, and persist for the duration of the
   connection.  DOCA also allows a "stateless" mode, where the
   parameters do not persist between overload reports.  This requires
   the sender of an overload report to restate any relevant parameters
   for each report.  Thus, the DOCA overload report format includes the
   ability to express all such parameters at any time, not just during
   negotiation.

   Note that stateless negotiation does not mean that no state may ever
   be saved.  Nodes may use implementation-specific methods of
   remembering certain parameters, or out-of-band configuration methods
   to do the same.

4.3.  Overload Scopes

   As described in [I-D.ietf-dime-overload-reqs], it's possible for a
   Diameter node to experience overload that impacts some subset of
   potential traffic.  For example, a Diameter agent might route traffic
   to different servers based on realm.
   If the server for one realm experienced an outage or overload
   condition, the agent can report that it is overloaded for that
   realm, but can process traffic for other realms normally.  We use
   the term "overload scope", or simply "scope", to refer to the set of
   potential messages affected by an overload report.

   MDOC includes a richer (and therefore more complex) concept of
   overload scopes.  A node may include multiple scopes in an overload
   report.  Each scope entry indicates both the type of scope, and the
   value of the scope, where the value is interpreted according to the
   type.

   DOCA also allows a node to include multiple scopes in a report.  But
   DOCA's current set of scope types only affects the interpretation of
   the originating node identity.  Therefore the DOCA scope entries do
   not include a value.

4.4.  Hard or Soft Overload State

   MDOC assumes that overload information is soft state.  That is, it
   expires if not refreshed within a stated interval.  DOCA also treats
   most overload information as soft state, but there are situations
   where it may be treated as hard state.  For example, if the OC-Level
   is set to "Hold", the expiration time is not honored.

5.  Naming Conventions

   MDOC and DOCA use somewhat different naming conventions for their
   respective AVPs.  DOCA prefixes each AVP name with "OC" (for
   example, "OC-Scope").  MDOC prefixes AVPs that can appear in the root
   of messages with "Overload", and leaves those that occur inside an
   overload-related grouped AVP to be identified by context (for
   example, "Overload Info" and "Supported Scopes").  The working group
   should consider picking one approach or the other.

6.  Data Element Comparison

6.1.  Data Elements for Connection Establishment and Negotiation

   The following sections describe data elements used for initial
   negotiation.

6.1.1.  Supported Scope Selection

   o  DOCA: OC-Scope: Bitmap of scopes supported by the sender.
      Currently defined values are "Host scope", "Realm Scope", "Only
      origin realm", "Application Information", "Node Utilization
      Information", and "Application Priorities".

   o  MDOC: Supported-Scopes: Bitmap of scopes supported by the sender.
      Currently defined values are "Destination-Realm",
      "Application-ID", "Destination-Host", "Host", "Connection",
      "Session-Group", and "Session".

   DOCA uses OC-Scope both to declare supported scopes, and to list the
   scopes associated with a particular overload report.  MDOC uses
   separate dedicated AVPs for the two purposes.  DOCA also overloads
   OC-Scope with indicators that load information and priority
   information may be included.

6.1.2.  Algorithm Selection

   o  DOCA: OC-Algorithm: Bitmap of supported algorithms.  Currently
      defined values are "Drop", "Throttle", and "Prioritize".  Multiple
      values are allowed.

   o  MDOC: Overload-Algorithm: Enumeration of supported algorithms.
      Multiple instances are allowed in negotiation.  Currently, there
      is one algorithm described, namely "loss".

   Both mechanisms support algorithm extensibility.  MDOC only allows
   Overload-Algorithm to occur in a CER or CEA message, and negotiates a
   single algorithm for the duration of the connection.  DOCA allows the
   algorithm to be selected at report time.  (Open Issue: What does it
   mean to indicate multiple algorithms in a congestion report?)

6.1.3.  Application Selection

   o  DOCA: OC-Applications: Indications of the applications that are of
      interest.

   o  MDOC: MDOC assumes that overload reports can apply to any and all
      applications, and does not negotiate the list up front.  The
      "application" scope is used to select one or more applications on
      a per-report basis.

   Open Issue: Are there use cases for the up-front negotiation of
   applications of interest?

6.1.4.  Frequency of Reports

   o  DOCA: OC-Tocl: Indicates how frequently reports shall be sent.

   o  MDOC: N/A

   Since MDOC piggybacks overload reports in existing messages, the rate
   of overload reports is the same as the overall message rate.  This
   may have the advantage of giving more rapid and precise feedback as
   load increases.

   Open Issue: We need further discussion about the appropriate rate(s)
   for overload reporting, regardless of which mechanism may be
   selected.

6.1.5.  Grouping

   o  DOCA: N/A.  Negotiation AVPs are included at the message root.

   o  MDOC: Load-Info: Grouped AVP acting as a container for the other
      AVPs used for negotiation.

6.2.  Data Elements for Overload and Load reporting

6.2.1.  Scope of Report

   o  DOCA: OC-Scope (See Section 6.1.1)

   o  MDOC: Load-Info-Scope: Octet-String giving the scope of the
      overload report.  The string contains a type indicator and a
      value.  One or more instances required.

   MDOC has a richer and more complex concept of scopes.  Multiple
   scopes can be combined for a given overload report.  Allowable scope
   combinations are described in [I-D.roach-dime-overload-ctrl].

6.2.2.  Overload Severity

   o  DOCA: OC-Level: OctetString(1): Values 1-6 define discrete
      overload levels of increasing severity, with 1 meaning no overload
      condition, and 6 meaning clients should switch to a different
      server.

   o  DOCA: OC-Sending-Rate: Float32: Used when the "throttle" algorithm
      is in effect to indicate the maximum desired Diameter message
      rate.

   o  MDOC: Overload-Metric (Unsigned32): A numeric representation of
      load.  The meaning is up to the interpretation of the selected
      algorithm, with the exception that a value of zero always means
      that no overload abatement is in effect.  For the "Loss"
      algorithm, Overload-Metric is a numeric value in the range of zero
      through 100, indicating the percentage of traffic reduction
      requested.

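   As an illustration only, a sending node could apply the "Loss"
   interpretation of Overload-Metric roughly as follows.  This sketch is
   not taken from either proposal: the function name should_send and
   the use of random per-request selection are assumptions made for the
   example, and the MDOC draft remains the reference for the actual
   algorithm behavior.

```python
import random


def should_send(overload_metric: int, rng: random.Random) -> bool:
    """Decide whether to send one request under a "loss"-style report.

    overload_metric is the requested traffic reduction in percent
    (0 through 100); 0 means no overload abatement is in effect.
    """
    if not 0 <= overload_metric <= 100:
        raise ValueError("loss metric must be in the range 0..100")
    # rng.random() is uniform in [0.0, 1.0), so a request is sent with
    # probability (100 - overload_metric) / 100.
    return rng.random() * 100 >= overload_metric
```

   A sender would call should_send() once per candidate request.  With
   a metric of 0 every request is sent, and with a metric of 100 none
   are.
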
   The Overload-Metric AVP used by MDOC is more general than OC-Level,
   in that its interpretation is left to the algorithm.  The meaning of
   the OC-Level values appears to be fixed regardless of algorithm
   choice.  The OC-Level meanings could be used in MDOC by defining a
   new algorithm that interpreted Overload-Metric values 1-6 in the same
   way as defined for OC-Level.

   Since MDOC does not define an algorithm similar to "throttle", it has
   no built-in analog to OC-Sending-Rate.  However, since MDOC allows
   algorithm extensibility, one could define a similar algorithm, and if
   necessary, add an extension AVP to state the sending rate.

6.2.3.  Report Algorithm

   o  DOCA: OC-Algorithm (See Section 6.1.2)

   o  MDOC: The overload control algorithm is set during negotiation,
      and doesn't change for the duration of the connection.

   Open Issue: DOCA's reuse of the OC-Algorithm AVP seems to allow more
   than one algorithm to be assigned to a single overload report.  It's
   not clear what that would mean.

6.2.4.  Report Expiration

   o  DOCA: OC-Best-Before (Time): Time of report expiration.

   o  MDOC: Period-Of-Validity (Unsigned32): Number of seconds until
      expiration.

   DOCA defines expiration to be a point in time.  MDOC uses a duration,
   i.e., the number of seconds until expiration.  The DOCA approach
   seems to require clock synchronization.

   DOCA contains an open issue about whether to allow reports to expire
   vs. requiring explicit signaling.

6.2.5.  Current Load

   o  DOCA: OC-Utilization: Indicates the overall load situation as a
      value between 0 and 100.

   o  MDOC: Load: Indicates the load situation as a value between 0 and
      65535.

   Current load indicates the existing load on an otherwise
   non-overloaded node.  MDOC's range of 0-65535 was selected to
   harmonize with the DNS service location (SRV) [RFC2782] record's
   "Weight" field.

6.2.6.  Applications covered by a Report

   o  DOCA: OC-Applications: Indicates which applications are of
      interest for load reporting.

   o  MDOC: MDOC does not use a separate AVP for this purpose.  Rather,
      one or more applications can be indicated using the application
      scope type.

6.2.7.  Report Action

   o  DOCA: OC-Action: Indicates the start, interim, and end of an
      overload period.

   o  MDOC: MDOC does not have a separate AVP to indicate the start and
      stop of an overload condition.  Rather, a report with a non-zero
      Overload-Metric value starts the condition, and a report with a
      zero value, or the expiration of the Period-Of-Validity value,
      indicates an end.  Subsequent reports with non-zero
      Overload-Metric values serve the same purpose as a DOCA report
      with an OC-Action value of "interim".

   Open Issue: Is OC-Action redundant?  DOCA also has the ability to
   express a non-overload condition in OC-Level, so an approach similar
   to that of MDOC should be workable.

6.2.8.  Priority

   o  DOCA: OC-Priority: Unsigned32: When used in an OC-Information AVP,
      sets the relative priority of applications listed in
      OC-Applications.  As specified, may also be used to set the
      priority of a given Diameter message.  [Open Issue: Is OC-Priority
      only in effect when the "Prioritize" algorithm is in effect?]

   o  MDOC: N/A

   MDOC does not have an explicit priority data element.  Relative
   priority between applications can be managed using the "Application"
   scope.  This is not exactly the same as stating inter-application
   priority explicitly, but it may be possible to accomplish similar
   behavior.

6.2.9.  Session Groups

   o  DOCA: N/A

   o  MDOC: Session-Group: UTF8String: Session-Group allows a node to
      assign a session to a named group.  Overload Reports can refer to
      all sessions in a group using the Session-Group AVP.

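   The grouping idea can be sketched as follows.  This is a minimal
   illustration and not part of either proposal: the class name
   SessionGroupTable and its methods are hypothetical, and the sketch
   assumes that a report names a single group and carries an
   Overload-Metric value, with zero meaning that no overload abatement
   is in effect.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SessionGroupTable:
    """Tracks session-to-group assignments and per-group overload state."""

    group_of: Dict[str, str] = field(default_factory=dict)   # session -> group
    metric_of: Dict[str, int] = field(default_factory=dict)  # group -> metric

    def assign(self, session_id: str, group: str) -> None:
        """Record that a session belongs to a named group."""
        self.group_of[session_id] = group

    def apply_report(self, group: str, overload_metric: int) -> None:
        """Apply one overload report to every session in the group."""
        if overload_metric == 0:
            # Zero means no abatement, so forget any earlier report.
            self.metric_of.pop(group, None)
        else:
            self.metric_of[group] = overload_metric

    def metric_for(self, session_id: str) -> int:
        """Return the metric in effect for a session (0 if none)."""
        group = self.group_of.get(session_id)
        if group is None:
            return 0
        return self.metric_of.get(group, 0)
```

   A single call to apply_report() changes the result of metric_for()
   for all sessions in the named group, while sessions in other groups
   are unaffected.
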
   A common application for Session-Group is when a Diameter agent load
   balances Diameter sessions across a set of servers.  If the agent
   assigns all of the sessions assigned to a particular server to a
   group, and that server later becomes overloaded, the agent can send
   one overload report that applies to all sessions in the group, but
   does not apply to sessions assigned to other, non-overloaded,
   servers.

   DOCA may be able to do something similar by using the OC-Origin AVP
   to identify the overloaded server.  However, the Session-Group
   approach can work even if the Diameter agent performs topology
   hiding.

6.3.  Result Codes

   DOCA defines the following Diameter result codes:

   o  DIAMETER_NO_COMMON_SCOPE (Permanent Failure): The Diameter peers
      are unable to negotiate one or more scopes in common.

   o  DIAMETER_NO_COMMON_ALGORITHM (Permanent Failure): The Diameter
      peers are unable to negotiate one or more algorithms in common.

   o  DIAMETER_TOCL_TOO_SMALL (Permanent Failure): The peer included an
      OC-Tocl AVP with an unacceptably low value.

   o  DIAMETER_TOCL_TOO_BIG (Permanent Failure): The peer included an
      OC-Tocl AVP with an unacceptably high value.

   o  DIAMETER_RATE_TOO_BIG (Permanent Failure): The peer included an
      OC-Sending-Rate AVP with an unacceptably high value.

   A failure to negotiate Overload Control support does not cause a
   connection failure in MDOC.  Instead, overload control is simply not
   invoked on the connection.

   MDOC defines the following result codes:

   o  DIAMETER_PEER_IN_OVERLOAD (Transient Failure): When a Diameter
      node drops a request due to overload, it responds with this result
      code.  This is primarily used when the peer does not support
      overload control, and therefore fails to reduce load as it would
      be expected to if it supported overload control.

   DIAMETER_PEER_IN_OVERLOAD may be of value to both mechanisms.
   The Overload Control Requirements [I-D.ietf-dime-overload-reqs]
   argue that the result codes in the Diameter base protocol are
   insufficient for reporting failures due to congestion.

7.  IANA Considerations

   This draft makes no requests of IANA.  The authors expect that a
   follow-on effort will specify a common set of Overload Control AVPs.
   This may introduce additional IANA considerations.

8.  Security Considerations

   This document compares the data elements used by DOCA
   [I-D.korhonen-dime-ovl] and MDOC [I-D.roach-dime-overload-ctrl].  It
   introduces no security considerations beyond those in the respective
   documents.

   The authors expect that a follow-on effort will specify a common set
   of Overload Control AVPs.  This may introduce additional security
   considerations.

   The authors made no attempt to analyze the security considerations in
   the DOCA and MDOC specifications for completeness.

9.  References

9.1.  Normative References

   [RFC6733]  Fajardo, V., Arkko, J., Loughney, J., and G. Zorn,
              "Diameter Base Protocol", RFC 6733, October 2012.

   [I-D.ietf-dime-overload-reqs]
              McMurry, E. and B. Campbell, "Diameter Overload Control
              Requirements", draft-ietf-dime-overload-reqs-03 (work in
              progress), January 2013.

   [I-D.roach-dime-overload-ctrl]
              Roach, A., "A Mechanism for Diameter Overload Control",
              draft-roach-dime-overload-ctrl-01 (work in progress),
              October 2012.

   [I-D.korhonen-dime-ovl]
              Korhonen, J., "Diameter Overload Control Application",
              draft-korhonen-dime-ovl-00 (work in progress),
              October 2012.

9.2.  Informative References

   [RFC2782]  Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
              specifying the location of services (DNS SRV)", RFC 2782,
              February 2000.

Appendix A.  Contributors

   Eric McMurry made significant contributions to the analysis in this
   draft.

Authors' Addresses

   Ben Campbell
   Tekelec
   17210 Campbell Rd.
   Suite 250
   Dallas, TX  75252
   US

   Email: ben@nostrum.com


   Hannes Tschofenig
   Nokia Siemens Networks
   Linnoitustie 6
   Espoo  02600
   Finland

   Email: Hannes.Tschofenig@nsn.com


   Jouni Korhonen
   Renesas Mobile
   Porkkalankatu 24
   Helsinki  FIN-00180
   Finland

   Email: jouni.nospam@gmail.com


   Adam Roach
   Mozilla
   Dallas, TX

   Email: adam@nostrum.com