idnits 2.17.1 draft-ietf-dime-load-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([RFC7683], [RFC7068]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (September 27, 2016) is 2767 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '1' on line 913 -- Looks like a reference, but probably isn't: '2' on line 913 == Missing Reference: 'C' is mentioned on line 825, but not defined == Missing Reference: 'A1' is mentioned on line 825, but not defined == Missing Reference: 'A2' is mentioned on line 825, but not defined == Missing Reference: 'S4' is mentioned on line 825, but not defined == Outdated reference: A later version (-11) exists of draft-ietf-dime-agent-overload-02 ** Downref: Normative reference to an Informational RFC: RFC 7068 Summary: 2 errors (**), 0 flaws (~~), 6 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Internet Engineering Task Force B. Campbell 3 Internet-Draft S. Donovan, Ed. 4 Intended status: Standards Track Oracle 5 Expires: March 31, 2017 JJ. Trottin 6 Nokia 7 September 27, 2016 9 Diameter Load Information Conveyance 10 draft-ietf-dime-load-03 12 Abstract 14 This document defines a mechanism for conveying Diameter load 15 information. [RFC7068] describes requirements for Overload Control 16 in Diameter. This includes a requirement to allow Diameter nodes to 17 send "load" information, even when the node is not overloaded. The 18 Diameter Overload Indication Conveyance (DOIC) [RFC7683] solution 19 describes a mechanism meeting most of the requirements, but does not 20 currently include the ability to send load information. 22 Status of This Memo 24 This Internet-Draft is submitted in full conformance with the 25 provisions of BCP 78 and BCP 79. 27 Internet-Drafts are working documents of the Internet Engineering 28 Task Force (IETF). Note that other groups may also distribute 29 working documents as Internet-Drafts. The list of current Internet- 30 Drafts is at http://datatracker.ietf.org/drafts/current/. 32 Internet-Drafts are draft documents valid for a maximum of six months 33 and may be updated, replaced, or obsoleted by other documents at any 34 time. It is inappropriate to use Internet-Drafts as reference 35 material or to cite them other than as "work in progress." 37 This Internet-Draft will expire on March 31, 2017. 39 Copyright Notice 41 Copyright (c) 2016 IETF Trust and the persons identified as the 42 document authors. All rights reserved. 44 This document is subject to BCP 78 and the IETF Trust's Legal 45 Provisions Relating to IETF Documents 46 (http://trustee.ietf.org/license-info) in effect on the date of 47 publication of this document.
Please review these documents 48 carefully, as they describe your rights and restrictions with respect 49 to this document. Code Components extracted from this document must 50 include Simplified BSD License text as described in Section 4.e of 51 the Trust Legal Provisions and are provided without warranty as 52 described in the Simplified BSD License. 54 Table of Contents 56 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 57 2. Terminology and Abbreviations . . . . . . . . . . . . . . . . 3 58 3. Conventions Used in This Document . . . . . . . . . . . . . . 4 59 4. Background . . . . . . . . . . . . . . . . . . . . . . . . . 4 60 4.1. Differences between Load and Overload information . . . . 4 61 4.2. How is Load Information Used? . . . . . . . . . . . . . . 5 62 5. Solution Overview . . . . . . . . . . . . . . . . . . . . . . 6 63 5.1. Theory of Operation . . . . . . . . . . . . . . . . . . . 8 64 6. Load Mechanism Procedures . . . . . . . . . . . . . . . . . . 10 65 6.1. Reporting Node Behavior . . . . . . . . . . . . . . . . . 10 66 6.1.1. Endpoint Reporting Node Behavior . . . . . . . . . . 10 67 6.1.2. Agent Reporting Node Behavior . . . . . . . . . . . . 11 68 6.2. Receiving Node Behavior . . . . . . . . . . . . . . . . . 12 69 6.3. Extensibility . . . . . . . . . . . . . . . . . . . . . . 13 70 6.4. Addition and removal of Nodes . . . . . . . . . . . . . . 13 71 7. Attribute Value Pairs . . . . . . . . . . . . . . . . . . . . 13 72 7.1. Load AVP . . . . . . . . . . . . . . . . . . . . . . . . 13 73 7.2. Load-Type AVP . . . . . . . . . . . . . . . . . . . . . . 14 74 7.3. Load-Value AVP . . . . . . . . . . . . . . . . . . . . . 14 75 7.4. SourceID AVP . . . . . . . . . . . . . . . . . . . . . . 14 76 7.5. Attribute Value Pair flag rules . . . . . . . . . . . . . 14 77 8. Security Considerations . . . . . . . . . . . . . . . . . . . 15 78 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15 79 9.1. AVP Codes . . . . . . . . . 
. . . . . . . . . . . . . . . 15 80 9.2. New Registries . . . . . . . . . . . . . . . . . . . . . 16 81 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 16 82 10.1. Normative References . . . . . . . . . . . . . . . . . . 16 83 10.2. Informative References . . . . . . . . . . . . . . . . . 16 84 Appendix A. Topology Scenarios . . . . . . . . . . . . . . . . . 16 85 A.1. No Agent . . . . . . . . . . . . . . . . . . . . . . . . 16 86 A.2. Single Agent . . . . . . . . . . . . . . . . . . . . . . 17 87 A.3. Multiple Agents . . . . . . . . . . . . . . . . . . . . . 17 88 A.4. Linked Agents . . . . . . . . . . . . . . . . . . . . . . 18 89 A.5. Shared Server Pools . . . . . . . . . . . . . . . . . . . 19 90 A.6. Agent Chains . . . . . . . . . . . . . . . . . . . . . . 20 91 A.7. Fully Meshed Layers . . . . . . . . . . . . . . . . . . . 20 92 A.8. Partitions . . . . . . . . . . . . . . . . . . . . . . . 21 93 A.9. Active-Standby Nodes . . . . . . . . . . . . . . . . . . 21 94 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 21 96 1. Introduction 98 [RFC7068] describes requirements for Overload Control in Diameter 99 [RFC6733]. The DIME working group has finished the Diameter Overload 100 Indication Conveyance (DOIC) mechanism [RFC7683]. As currently 101 specified, DOIC fulfills some, but not all, of the requirements. 103 In particular, DOIC does not fulfill Req 23 and Req 24: 105 REQ 23: The solution MUST provide sufficient information to enable 106 a load-balancing node to divert messages that are rejected or 107 otherwise throttled by an overloaded upstream node to other 108 upstream nodes that are the most likely to have sufficient 109 capacity to process them. 111 REQ 24: The solution MUST provide a mechanism for indicating load 112 levels, even when not in an overload condition, to assist nodes in 113 making decisions to prevent overload conditions from occurring.
115 There are several other requirements in [RFC7068] that mention both 116 overload and load information that are only partially fulfilled by 117 DOIC. 119 The DIME working group explicitly chose not to fulfill these 120 requirements in DOIC for several reasons. A principal reason was 121 that the working group did not agree on a general approach for 122 conveying load information. It chose to progress the rest of DOIC, 123 and deferred load information conveyance to a DOIC extension or a 124 separate mechanism. 126 This document defines a mechanism that addresses the load-related 127 requirements from RFC 7068. 129 2. Terminology and Abbreviations 131 DOIC 133 Diameter Overload Indication Conveyance ([RFC7683]) 135 Load 137 The relative usage of the Diameter message processing capacity 138 of a Diameter node. A low load level indicates that the Diameter 139 node is underutilized. A high load level indicates that the node 140 is closer to being fully utilized. 142 Offered Load 143 The actual traffic sent to the reporting node after overload 144 abatement and routing decisions are made. 146 Reporting, Reacting Node 148 Reporting node and reacting node terminology is defined in 149 [RFC7683]. 151 Routing Information 153 Routing Information referred to in this document can include the 154 Routing and Peer tables defined in RFC 6733. It can also include 155 other implementation specific tables used to store load 156 information. This document does not define the structure of such 157 tables. 159 3. Conventions Used in This Document 161 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 162 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 163 document are to be interpreted as described in RFC 2119 [RFC2119]. 165 RFC 2119 [RFC2119] interpretation does not apply for the above listed 166 words when they are not used in all-caps format. 168 4. Background 170 4.1.
Differences between Load and Overload information 172 Previous discussions of how to solve the load-related requirements in 173 [RFC7068] have shown that people did not have an agreed-upon concept 174 of how "load" information differs from "overload" information. While 175 the two concepts are highly interrelated, in the opinion of the 176 authors, there are two primary differences. First, a Diameter node 177 always has a load. At any given time that load may be effectively 178 zero, effectively fully loaded, or somewhere in between. In 179 contrast, overload is an exceptional condition. A node only has 180 overload information when it is in an overloaded state. Furthermore, 181 the relationship between a node's load level and overload state at 182 any given time may be vague. For example, a node may normally 183 operate at a "fully loaded" level, but still not be considered 184 overloaded. Another node may declare itself to be "overloaded" even 185 though it might not be fully "loaded". 187 Second, Overload information, in the form of a DOIC Overload Report 188 (OLR) [RFC7683] indicates an explicit request for action on the part 189 of the reacting node. That is, the OLR requests that the reacting 190 node reduce the offered load -- the actual traffic sent to the 191 reporting node after overload abatement and routing decisions are 192 made -- by an indicated amount (by default), or as prescribed by the 193 selected abatement algorithm. Effectively, DOIC provides a contract 194 between the reporting node and the reacting node. 196 In contrast, load is informational. That is, load information can be 197 considered a hint to the recipient node. That node may use the load 198 information for load balancing purposes, as an input to certain 199 overload abatement techniques, to make inferences about the 200 likelihood that the sending node becomes overloaded in the immediate 201 future, or for other purposes.
203 None of this prevents a Diameter node from deciding to reduce the 204 offered load based on load information. The fundamental difference 205 is that an overload report requires that reduction. It is also 206 reasonable for a Diameter node to decide to increase the offered load 207 based on load information. 209 4.2. How is Load Information Used? 211 [RFC7068] contemplates two primary uses for load information. Req 23 212 discusses how load information might be used when performing 213 diversion as an overload abatement technique, as described in 214 [RFC7683]. When a reacting node diverts traffic away from an 215 overloaded node, it needs load information for the other candidates 216 for that traffic in order to effectively load balance the diverted 217 load between potential candidates. Otherwise, diversion has a 218 greater potential to drive other nodes into overload. 220 Req 24 discusses how Diameter load information might be used when no 221 overload condition currently exists. Diameter nodes can use the load 222 information to make decisions to try to avoid overload conditions in 223 the first place. Normal load-balancing falls into this category, but 224 the Diameter node can take other proactive steps as well. 226 If the loaded nodes are Diameter servers (or clients in the case of 227 server-to-client transactions), both of these uses of load 228 information should be accomplished by a Diameter node that performs 229 server selection. Typically, server selection is performed by a node 230 (a client or an agent) that is an immediate peer of the server. 231 However, there are scenarios (see Appendix A) where a client or proxy 232 that is not the immediate peer to the selected servers performs 233 server selection. In this case, the client or proxy enforces the 234 server selection by inserting a Destination-Host AVP. 236 For example, a Diameter node (e.g. client) can use a redirect 237 agent to get candidate destination host addresses.
The redirect 238 agent might return several destination host addresses, from which 239 the Diameter node selects one. The Diameter node can use load 240 information received from these hosts to make the selection. 242 Just as load information can be used as part of server selection, it 243 can also be used as input to the selection of the next-hop peer to 244 which a request is to be routed. 246 It should be noted that a Diameter node will need to process both 247 Load reports and Overload reports from the same Diameter node. The 248 reacting node for the Overload report always has the responsibility 249 to reduce the amount of Diameter traffic sent to the overloaded node. 250 If, or how, the reacting node uses Load information to achieve this 251 is left as an implementation decision. 253 5. Solution Overview 255 The mechanism defined here for the conveyance of load information is 256 similar in some ways to the mechanism defined for DOIC and is 257 different in other ways. 259 As with DOIC, load information is conveyed by piggy-backing the load 260 AVPs on existing Diameter applications. 262 There are two primary differences. First, there is no capability 263 negotiation process for load. The sender of the load information is 264 sending it with the expectation that any supporting nodes will use it 265 when making routing decisions. If there are no nodes that support 266 the Load mechanism then the load information is ignored. 268 The second big difference between DOIC and Load is visibility of the 269 DOIC or Load information within a Diameter network. DOIC information 270 is sent end-to-end resulting in the ability of all nodes in the path 271 of the answer message that carries the OC-OLR AVP to act on the 272 information, although only one node actually consumes and reacts to 273 the report. The DOIC overload reports remain in the message all the 274 way from the reporting node to the node that is the target for the 275 answer message.
277 For the Load mechanism there are two types of load reports and only 278 the first one is transmitted end-to-end. 280 The first is the load of the endpoint sending the answer message. 281 This load report is carried end-to-end to enable any nodes that make 282 server selection decisions to use the load status of the sending 283 endpoint as part of the server selection decision. Unlike with DOIC, 284 more than one node may make use of the load information received. 286 The second type of load report is a peer report. This report is used 287 by Diameter nodes as part of the logic to select the next-hop 288 Diameter node and, as such, does not have significance beyond the 289 peer node. These load reports are removed by the first supporting 290 Diameter node to receive the report. 292 Because load reports can traverse Diameter nodes that do not support 293 the Load mechanism, it is necessary to include the identity of the 294 node to which the load report applies as part of the load report. 295 This allows for a Diameter node to verify that a load report applies 296 to its peer or if it should be ignored. 298 The load report includes a value indicating the relative load of the 299 sending node, specified in a manner 300 consistent with that defined for DNS SRV [RFC2782]. 302 The goal is to make it possible to use both the load values received 303 as a part of the Diameter Load mechanism and weight values received 304 as a result of a DNS SRV query. As a result, the Diameter load value 305 has a range of 0-65535. This value and DNS SRV weight values are 306 then used in a distribution algorithm similar to that specified in 307 [RFC2782]. 309 The DNS SRV distribution algorithm results in more messages being 310 sent to a node with a higher weight value. As a result, a higher 311 Diameter load value indicates a LOWER load on the sending node. A 312 node that is heavily loaded sends a lower Diameter load value.
313 Stated another way, a node that has zero load would have a load value 314 of 65535. A node that is 100% loaded would have a load value of 0. 316 The distribution algorithm used by Diameter nodes supporting the 317 Diameter Load mechanism is an implementation decision but it needs to 318 result in similar behavior to the algorithm described for the use of 319 weight values specified in [RFC2782]. 321 The method for calculating the load value included in the load report 322 is also left as an implementation decision. 324 The frequency for sending of load reports is also left as an 325 implementation decision. The sending node might choose to send load 326 reports in all messages or it might choose to only send load reports 327 when the load value has changed by some implementation specific 328 amount. The important consideration is that all nodes needing the 329 load information have a sufficiently accurate view of the node's 330 load. 332 5.1. Theory of Operation 334 This section outlines how the Diameter Load mechanism is expected to 335 work. 337 For this discussion, assume the following Diameter network 338 configuration:
340         ---A1---A3----S[1], S[2]...S[p]
341        /   | \ /
342      C     |  x
343        \   | / \
344         ---A2---A4----S[p+1], S[p+2] ...S[n]
346 Figure 1: Example Diameter Network 348 Note that in this diagram, S[1], S[2] through S[p] are peers to A3. 349 S[p+1], S[p+2] through S[n] are peers to A4. 351 Also assume that the request for a Diameter transaction takes the 352 following path:
354      C      A1     A4     S[n]
355      |      |      |      |
356      |----->|----->|----->|
357      xxR    xxR    xxR
359 Figure 2: Request Message Path 361 When sending the answer message, an endpoint node that supports the 362 Diameter Load mechanism includes its own load information in the 363 answer message. Because it is a Diameter endpoint it includes a HOST 364 load report.
366      C      A1     A4     S[n]
367      |      |      |      |
368      |      |      |<-----|
369      |      |      xxA(Load type:HOST, source:S[n])
370      |      |      |      |
372 Figure 3: Answer Message from S[n] 374 If Agent A4 supports the Load mechanism then A4's actions depend on 375 whether A4 is responsible for doing server selection. If A4 is not 376 doing server selection then A4 ignores the HOST load report. If A4 377 is responsible for doing server selection then it stores the load 378 information for S[n] in its routing information for the handling of 379 subsequent request messages. In both cases A4 leaves the HOST report 380 in the message. 382 Note: If A4 does not support the Load mechanism then it will relay 383 the answer message without doing any processing on the load 384 information. In this case the load information AVPs will be 385 relayed without change. 387 A4 then calculates its own load information and inserts load 388 information AVPs of type PEER in the message before sending the 389 message to A1.
391      C      A1     A4     S[n]
392      |      |      |      |
393      |      |<-----|      |
394      |      xxA(Load type:PEER, source:A4)
395      |      xxA(Load type:HOST, source:S[n])
396      |      |      |      |
398 Figure 4: Answer Message from A4 400 If A1 supports the Load mechanism then it processes each of the Load 401 reports it receives separately. 403 For the PEER load report, A1 first determines if the source of the 404 report indicated in the load report matches the DiameterIdentity of 405 the Diameter node from which the request was received. If the 406 identities do not match then the PEER load report is discarded. If 407 the identities match then A1 saves the load information in its 408 routing information for routing of subsequent request messages. In 409 both cases A1 strips the PEER load report from the message. 411 For the HOST load report, A1's actions depend on whether A1 is 412 responsible for doing server selection. If A1 is not doing server 413 selection then A1 ignores the HOST load report.
If A1 is responsible 414 for doing server selection then it stores the load information for 415 S[n] in its routing information for the handling of subsequent 416 request messages. In both cases A1 leaves the HOST report in the 417 message. 419 A1 then calculates its own load information and inserts load 420 information AVPs of type PEER in the message before sending the 421 message to C:
423      C      A1     A4     S[n]
424      |      |      |      |
425      |<-----|      |      |
426      xxA(Load type:PEER, source:A1)
427      xxA(Load type:HOST, source:S[n])
429 Figure 5: Answer Message from A1 431 As with A1, C processes each load report separately. 433 For the PEER load report, C follows the same procedure as A1 to 434 determine whether the Load report was inserted by the peer from which 435 the answer was received and, if so, stores the load 436 information for use when making future routing decisions. 438 For the HOST load report, C saves the load information only if it is 439 responsible for doing server selection. 441 The Load information received by all nodes is then used for routing 442 of subsequent request messages. 444 6. Load Mechanism Procedures 446 This section defines the normative behaviors for the Load mechanism. 448 6.1. Reporting Node Behavior 450 This section defines the procedures of Diameter reporting nodes that 451 generate load reports. 453 6.1.1. Endpoint Reporting Node Behavior 455 A Diameter endpoint that supports the Diameter Load mechanism MUST 456 include a load report of type HOST in sufficient answer messages to 457 ensure that all consumers of the load information receive timely 458 updates. 460 The Diameter endpoint MUST include its own DiameterIdentity in the 461 SourceID AVP included in the Load AVP. 463 The Diameter endpoint MUST include a Load-Type AVP of type HOST in 464 the Load AVP. 466 The Diameter endpoint MUST include its load value in the Load-Value AVP in 467 the Load AVP.
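The endpoint requirements above can be illustrated with a minimal sketch. It is not part of the mechanism: the LoadReport class, the function names, and the linear mapping from utilization to load value are all assumptions made for illustration, since this document leaves the calculation of the load value as an implementation decision. The sketch only reflects the stated range (0-65535) and direction (a higher value means a less loaded node).

```python
from dataclasses import dataclass

MAX_LOAD_VALUE = 65535  # upper bound of the Load-Value range in this document


@dataclass(frozen=True)
class LoadReport:
    """Hypothetical in-memory form of a Load AVP (not a wire encoding)."""
    load_type: str   # "HOST" or "PEER"
    load_value: int  # 0..65535, higher means less loaded
    source_id: str   # DiameterIdentity of the reporting node


def load_value_from_utilization(utilization: float) -> int:
    """Map utilization in [0.0, 1.0] to a load value.

    The linear mapping is an assumption; the draft only fixes the
    endpoints (zero load -> 65535, 100% loaded -> 0).
    """
    clamped = min(max(utilization, 0.0), 1.0)
    return round((1.0 - clamped) * MAX_LOAD_VALUE)


def make_host_report(own_identity: str, utilization: float) -> LoadReport:
    """Build the HOST report an endpoint would place in an answer message."""
    return LoadReport("HOST", load_value_from_utilization(utilization), own_identity)
```

An idle endpoint would produce make_host_report("s1.example.com", 0.0), which carries a load value of 65535; a fully loaded endpoint would carry 0.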
469 The LOAD value should be calculated in a way that reflects the 470 available load independently of the weight of each server, in order 471 to accurately compare LOAD values from different nodes. Any specific 472 LOAD value needs to identify the same amount of available capacity, 473 regardless of the Diameter node that calculates the value. 475 The mechanism used to calculate the LOAD value that fulfils this 476 requirement is an implementation decision. 478 The frequency of sending load reports is an implementation decision. 480 For instance, if the only consumer of the load reports is the 481 endpoint's peer then the endpoint can choose to only include a 482 load report when the load of the endpoint has changed by a 483 meaningful percentage. If there are consumers of the endpoint 484 load report other than the endpoint's peer (this will be the case 485 if other nodes are responsible for server selection) then the 486 endpoint might choose to include load reports in all answer 487 messages as a way of ensuring that all nodes doing server 488 selection get accurate load information. 490 6.1.2. Agent Reporting Node Behavior 492 A Diameter agent that supports the Diameter Load mechanism MUST 493 include a PEER load report in sufficient answer messages to ensure 494 that all users of the load information receive timely updates. 496 The Diameter agent MUST include its own DiameterIdentity in the 497 SourceID AVP included in the Load AVP. 499 The Diameter agent MUST include a Load-Type AVP of type PEER in the 500 Load AVP. 502 The Diameter agent MUST include its load value in the Load-Value AVP 503 in the load AVP. 505 The LOAD value should be calculated in a way that reflects the 506 available load independently of the weight of each agent, in order to 507 accurately compare LOAD values from different nodes. Any specific 508 LOAD value needs to identify the same amount of available capacity, 509 regardless of the Diameter node that calculates the value.
511 The mechanism used to calculate the LOAD value that fulfils this 512 requirement is an implementation decision. 514 The frequency of sending load reports is an implementation decision. 516 Note: In the case of peer load reports it is only necessary to 517 include load reports when the load value has changed by some 518 meaningful value, as long as the agent ensures that all peers 519 receive the report. It is also acceptable to include the load 520 report in every answer message handled by the Diameter agent. 522 6.2. Receiving Node Behavior 524 This section defines the behavior of Diameter nodes processing load 525 reports. 527 A Diameter node MUST be prepared to process load reports of type HOST 528 and of type PEER, as indicated in the Load-Type AVP included in the 529 Load AVP received in the same answer message or from multiple answer 530 messages. 532 Note that the node needs to be able to handle messages with no 533 load reports, messages with just a PEER load report, messages with 534 just a HOST load report and messages with both types of load 535 reports. 537 If the Diameter node is not responsible for doing server selection 538 then it SHOULD ignore load reports of type HOST. 540 If the Diameter node is responsible for doing server selection then 541 it SHOULD save the load value included in the Load-Value AVP included 542 in the Load AVP of type HOST in its routing information. 544 If the Diameter node receives a Load report of type PEER then the 545 Diameter node MUST determine if the Load report was inserted into the 546 answer message by the peer from which the message was received. This 547 is achieved by comparing the DiameterIdentity associated with the 548 connection from which the message was received with the 549 DiameterIdentity included in the SourceID AVP in the Load report.
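The identity comparison described above can be sketched as follows. This is an illustration only: the function name, the dictionary used as routing information, and the exact string comparison are assumptions. In particular, any normalization of DiameterIdentity values before comparison is not shown, and this document does not define the structure of routing information.

```python
from typing import Dict


def process_peer_report(connection_peer_id: str,
                        source_id: str,
                        load_value: int,
                        routing_info: Dict[str, int]) -> bool:
    """Handle one Load report of type PEER from an answer message.

    connection_peer_id: DiameterIdentity associated with the connection
                        on which the answer arrived.
    source_id:          value of the SourceID AVP in the Load report.
    routing_info:       hypothetical store of peer load values.

    Returns True if the report was accepted and stored, False if it was
    ignored because it was not inserted by the sending peer.
    """
    # Exact match is an assumption; identity normalization is omitted.
    if source_id != connection_peer_id:
        return False  # report came from some other node: ignore it
    routing_info[connection_peer_id] = load_value
    return True
```

A report whose SourceID names a node other than the connection peer leaves the stored routing information untouched, which is the case when a PEER report has been relayed by a node that does not support the Load mechanism.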
551 If the Diameter node determines that the Load report of type PEER was 552 not received from the peer that sent or relayed the answer message 553 then the node MUST ignore the Load report. 555 If the Diameter node determines that the Load report of type PEER was 556 received from the peer that sent or relayed the answer message then 557 the node SHOULD save the load information in its routing information. 559 How a Diameter node uses load information for making routing 560 decisions is an implementation decision. However, the distribution 561 algorithm MUST result in behavior similar to that of the algorithm described 562 for the use of weight values in [RFC2782]. 564 6.3. Extensibility 566 The Load mechanism can be extended to include additional information 567 in the load reports. 569 Any extension may define new AVPs for use in Load reports. These new 570 AVPs SHOULD be defined to be extensions to the Load AVPs defined in 571 this document. 573 The Grouped AVP extension mechanisms defined in [RFC6733] apply. This 574 allows, for example, defining a new feature that is mandatory to be 575 understood even when piggybacked on an existing application. 577 As with any Diameter specification, [RFC6733] requires all new AVPs 578 to be registered with IANA. See Section 9 for the required 579 procedures. 581 6.4. Addition and removal of Nodes 583 When a Diameter node is added, the new node will start by advertising 584 its load. Downstream nodes will need to factor the new load 585 information into load balancing decisions. The downstream nodes can 586 attempt to ensure a smooth increase of the traffic to the new node, 587 avoiding an immediate spike of traffic to the new node. The method 588 for handling of such a smooth increase is implementation specific but 589 it can rely on the evolution of load information received from the 590 new node and from the other nodes. 592 When removing a node in a controlled way (e.g.
for maintenance 593 purposes, that is, outside a failure case), it might be appropriate to 594 progressively reduce the traffic to this node by routing traffic to 595 other nodes. Simple load information (load percentage) would not be 596 sufficient. The method for handling of the node removal is 597 implementation specific but it can rely on the evolution of the load 598 information received from the node to be removed. 600 7. Attribute Value Pairs 602 This section defines the AVPs required for the Load mechanism. 604 7.1. Load AVP 606 The Load AVP (AVP code TBD1) is of type Grouped and is used to convey 607 load information between Diameter nodes.
609      Load ::= < AVP Header: TBD1 >
610               [ Load-Type ]
611               [ Load-Value ]
612               [ SourceID ]
613             * [ AVP ]
615 7.2. Load-Type AVP 617 The Load-Type AVP (AVP code TBD2) is of type Enumerated. It is used 618 to convey the type of Diameter node that sent the load information. 619 The following values are defined:
621      HOST 0 The load report is for a host.
623      PEER 1 The load report is for a peer.
625 7.3. Load-Value AVP 627 The Load-Value AVP (AVP code TBD3) is of type Unsigned64. It is used 628 to convey relative load information about the sender of the load 629 report. 631 The Load-Value AVP is specified in a manner similar to the weight 632 value in DNS SRV ([RFC2782]). 634 The Load-Value has a range of 0-65535. 636 A higher value indicates a lower load on the sending node. A lower 637 value indicates that the sending node is heavily loaded. 639 Stated another way, a node that has zero load would have a load 640 value of 65535. A node that is 100% loaded would have a load 641 value of 0. 643 7.4. SourceID AVP 645 The SourceID AVP is defined in [I-D.ietf-dime-agent-overload]. It is 646 used to identify the Diameter node that sent the Load report. 648 7.5.
Attribute Value Pair flag rules
649                                                          +---------+
650                                                          |AVP flag |
651                                                          |rules    |
652                                                          +----+----+
653                          AVP   Section                   |    |MUST|
654  Attribute Name          Code  Defined  Value Type       |MUST| NOT|
655 +--------------------------------------------------------+----+----+
656 |Load                    TBD1  x.1      Grouped          |    | V  |
657 +--------------------------------------------------------+----+----+
658 |Load-Type               TBD2  x.2      Enumerated       |    | V  |
659 +--------------------------------------------------------+----+----+
660 |Load-Value              TBD3  x.3      Unsigned64       |    | V  |
661 +--------------------------------------------------------+----+----+
662 |SourceID                TBD4  x.4      DiameterIdentity |    | V  |
663 +--------------------------------------------------------+----+----+
665 As described in the Diameter base protocol [RFC6733], the M-bit usage 666 for a given AVP in a given command may be defined by the application. 668 8. Security Considerations 670 Load information may be sensitive information in some cases. 671 Depending on the mechanism, an unauthorized recipient might be able 672 to infer the topology of a Diameter network from load information. 673 Load information might be useful in identifying targets for Denial of 674 Service (DoS) attacks, where a node known to be already heavily 675 loaded might be a tempting target. Load information might also be 676 useful as feedback about the success of an ongoing DoS attack. 678 Any load information conveyance mechanism will need to allow 679 operators to avoid sending load information to nodes that are not 680 authorized to receive it. Since Diameter currently only offers 681 authentication of nodes at the transport level, any solution that 682 sends load information to non-peer nodes might require a transitive- 683 trust model. 685 9. IANA Considerations 687 9.1. AVP Codes 689 New AVPs defined by this specification are listed in 690 Section 7. All AVP codes are allocated from the 691 'Authentication, Authorization, and Accounting (AAA) Parameters' AVP 692 Codes registry.
694  9.2.  New Registries

696     This document makes no new registry requests of IANA.

698  10.  References

700  10.1.  Normative References

702     [I-D.ietf-dime-agent-overload]
703                Donovan, S., "Diameter Agent Overload",
704                draft-ietf-dime-agent-overload-02 (work in progress),
                   August 2015.

706     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
707                Requirement Levels", BCP 14, RFC 2119,
708                DOI 10.17487/RFC2119, March 1997.

711     [RFC6733]  Fajardo, V., Ed., Arkko, J., Loughney, J., and G. Zorn,
712                Ed., "Diameter Base Protocol", RFC 6733,
713                DOI 10.17487/RFC6733, October 2012.

716     [RFC7068]  McMurry, E. and B. Campbell, "Diameter Overload Control
717                Requirements", RFC 7068, DOI 10.17487/RFC7068, November
718                2013.

720     [RFC7683]  Korhonen, J., Ed., Donovan, S., Ed., Campbell, B., and L.
721                Morand, "Diameter Overload Indication Conveyance",
722                RFC 7683, DOI 10.17487/RFC7683, October 2015.

725  10.2.  Informative References

727     [RFC2782]  Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
728                specifying the location of services (DNS SRV)", RFC 2782,
729                DOI 10.17487/RFC2782, February 2000.

732  Appendix A.  Topology Scenarios

734     This section presents a number of Diameter topology scenarios, and
735     discusses how load information might be used in each scenario.

737  A.1.  No Agent

739     Figure 6 shows a simple client-server scenario, where a client picks
740     from a set of candidate servers available for a particular realm and
741     application.  The client selects the server for a given transaction
742     using the load information received from each server.

744             ------S1
745            /
746          C
747            \
748             ------S2

750          Figure 6: Basic Client Server Scenario

752        If a node supports dynamic discovery, it will not obtain load
753        information from the nodes with which it has no Diameter
754        connection established.  Nevertheless, it might take into account
755        the load information from its connected nodes when deciding whether
756        to add connections to new nodes with the dynamic discovery mechanism.

758     Note: The use of dynamic connections needs to be considered.

760  A.2.  Single Agent

762     Figure 7 shows a client that sends requests to an agent.  The agent
763     selects the request destination from a set of candidate servers,
764     using load information received from each server.  The client does
765     not need to receive load information, since it does not select
766     between multiple agents.

768                 ------S1
769                /
770          C----A
771                \
772                 ------S2

774          Figure 7: Simple Agent Scenario

776  A.3.  Multiple Agents

778     Figure 8 shows a client selecting between multiple agents, and each
779     agent selecting from multiple servers.  The client selects an agent
780     based on the load information received from each agent.  Each agent
781     selects a server based on the load information received from its
782     servers.

784     This scenario adds a complication that one set of servers may be more
785     loaded than the other set.  If, for example, S4 is the least loaded
786     server, C would need to know to select agent A2 to reach S4.  This
787     might require C to receive load information from the servers as well
788     as the agents.  Alternatively, each agent might use the load of its
789     servers as an input into calculating its own load, in effect
790     aggregating upstream load.

792     Similarly, if C sends a host-routed request [RFC7683], it needs to
793     know which agent can deliver requests to the selected server.
794     Without some special, potentially proprietary, knowledge of the
795     topology upstream of A1 and A2, C would select the agent based on the
796     normal peer selection procedures for the realm and application, and
797     perhaps consider the load information from A1 and A2.  If C sends a
798     request to A1 that contains a Destination-Host AVP with a value of
799     S4, A1 will not be able to deliver the request.
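
The aggregation alternative described above can be illustrated with a
short, non-normative Python sketch.  Everything in it is an assumption
made for illustration: the function names, the rule that an agent
reports the smaller of its own Load-Value and the best Load-Value among
its servers, and the sample values.  The draft does not define an
aggregation algorithm.  The only properties taken from the text are the
0-65535 range, the "higher value means lower load" convention, and the
similarity to the DNS SRV weight.

```python
import random

MAX_LOAD_VALUE = 65535  # Load-Value range is 0-65535; higher means less loaded


def aggregate_load_value(own_value, server_values):
    """Hypothetical aggregation rule (not defined by the draft): report the
    smaller of the agent's own Load-Value and the best (highest) Load-Value
    among the servers behind it."""
    if not server_values:
        return own_value
    return min(own_value, max(server_values))


def pick_weighted(load_values, rng=random):
    """Pick a node name with probability proportional to its Load-Value,
    in the spirit of DNS SRV weights.  Falls back to a uniform choice
    when every candidate reports 0."""
    names = sorted(load_values)
    weights = [load_values[name] for name in names]
    if sum(weights) == 0:
        return rng.choice(names)
    return rng.choices(names, weights=weights, k=1)[0]


# Topology of Figure 8: A1 fronts S1 and S3, A2 fronts S2 and S4.
# All numbers are made-up sample Load-Values.
servers = {"S1": 12000, "S3": 9000, "S2": 20000, "S4": 60000}
agents = {
    "A1": aggregate_load_value(50000, [servers["S1"], servers["S3"]]),
    "A2": aggregate_load_value(55000, [servers["S2"], servers["S4"]]),
}
chosen = pick_weighted(agents, random.Random(0))
```

With these sample values, A1 would report 12000 and A2 would report
55000, so C would favor A2 without needing per-server load information.
Note that this does not help the host-routed case: C still needs to
know which agent can reach the server named in Destination-Host.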
801                     -----S3
802                    /
803             ---A1------S1
804            /
805          C
806            \
807             ---A2------S2
808                    \
809                     ---- S4

811          Figure 8: Multiple Agents and Servers

813  A.4.  Linked Agents

815     Figure 9 shows a scenario similar to that of Figure 8, except that
816     the agents are linked, so that A1 can forward a request to A2, and
817     vice-versa.  Each agent could receive load information from the
818     linked agent, as well as its connected servers.

820     This somewhat simplifies the complication from Figure 8, due to the
821     fact that C does not necessarily need to choose a particular agent to
822     reach a particular server.  But it creates a similar question of how,
823     for example, A1 might know that S4 was less loaded than S1 or S3.
824     Additionally, it creates the opportunity for sub-optimal request
825     paths.  For example, [C,A1,A2,S4] vs. [C,A2,S4].

827     A likely application for linked agents is when each agent prefers to
828     route only to directly connected servers and only forwards requests
829     to another agent under exceptional circumstances.  For example, A1
830     might not forward requests to A2 unless both S1 and S3 are
831     overloaded.  In this case, A1 might use the load information from S1
832     and S3 to select between those, and only consider the load
833     information from A2 (and other connected agents) if it needs to
834     divert requests to different agents.

836                     -----S3
837                    /
838             ---A1------S1
839            /    |
840          C      |
841            \    |
842             ---A2------S2
843                    \
844                     ---- S4

846          Figure 9: Linked Agents

848     Figure 10 is a variant of Figure 9.  In this case, C1 sends all
849     traffic through A1 and C2 sends all traffic through A2.  By default,
850     A1 will load balance traffic between S1 and S3 and A2 will load
851     balance traffic between S2 and S4.

853     Now, if S1 and S3 are significantly more loaded than S2 and S4, A1 may
854     route some C1 traffic to A2.  This is a non-optimal path, but it allows
855     better load balancing between the servers.  To achieve this, A1 needs
856     to receive some load information from A2 about the S2/S4 load.
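
As a non-normative illustration of the diversion policy discussed
above, the following Python sketch shows one way an agent such as A1
could decide between its directly connected servers and a linked agent.
The function name, the divert_threshold parameter and its default, and
the sample Load-Values are all hypothetical; the draft leaves this
policy to the implementation.  Load-Values follow Section 7.3: range
0-65535, where a higher value means a lower load.

```python
def choose_next_hop(local_servers, linked_agents, divert_threshold=6553):
    """Return the name of the next hop for a request.

    local_servers and linked_agents map node names to Load-Values
    (0-65535, higher means less loaded).  local_servers is assumed to be
    non-empty.  divert_threshold is a hypothetical local policy knob:
    traffic stays local while the best local server reports at least
    this value.
    """
    best_local = max(local_servers, key=local_servers.get)
    if local_servers[best_local] >= divert_threshold or not linked_agents:
        return best_local
    best_agent = max(linked_agents, key=linked_agents.get)
    # Divert only when the linked agent actually reports a better value.
    if linked_agents[best_agent] > local_servers[best_local]:
        return best_agent
    return best_local


# A1's view in Figure 10: S1 and S3 are local, A2 is the linked agent.
normal = choose_next_hop({"S1": 30000, "S3": 20000}, {"A2": 60000})
diverted = choose_next_hop({"S1": 1000, "S3": 500}, {"A2": 40000})
```

In the first call A1 keeps the request on S1 even though A2 reports a
better value, which matches the "exceptional circumstances only"
behavior.  In the second call both local servers are below the
threshold and A2 looks better, so the request is diverted.  For this to
be meaningful, the value A2 reports would have to reflect the load of
S2 and S4 in some way.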
858                     -----S3
859                    /
860          C1----A1------S1
861                 |
862                 |
863                 |
864          C2----A2------S2
865                    \
866                     ---- S4

868          Figure 10: Linked Agents

870  A.5.  Shared Server Pools

872     Figure 11 is similar to Figure 9, except that instead of a link
873     between agents, each agent is linked to all servers.  (The links to
874     each set of servers should be interpreted as a link to each server.
875     The links are not shown separately due to the limitations of ASCII
876     art.)

877     In this scenario, each agent can select among all of the servers,
878     based on the load information from the servers.  The client need only
879     be concerned with the load information of the agents.

881             ---A1---S[1], S[2]...S[p]
882            /     \ /
883          C        x
884            \     / \
885             ---A2---S[p+1], S[p+2] ...S[n]

887          Figure 11: Shared Server Pools

889  A.6.  Agent Chains

891     The scenario in Figure 12 is similar to that of Figure 8, except
892     that, instead of the client possibly needing to select an agent that
893     can route requests to the least loaded server, in this case A1 and A2
894     need to make similar decisions when selecting between A3 or A4.  As
895     in the former scenario, this could be mitigated if A3 and A4 aggregate
896     upstream loads into the load information they report downstream.

898             ---A1---A3----S[1], S[2]...S[p]
899            /         | \ /
900          C           |  x
901            \         | / \
902             ---A2---A4----S[p+1], S[p+2] ...S[n]

904          Figure 12: Agent Chains

906  A.7.  Fully Meshed Layers

908     Figure 13 extends the scenario in Figure 11 by adding an extra layer
909     of agents.  But since each layer of nodes can reach any node in the
910     next layer, each node only needs to consider the load of its next-hop
911     peer.

913             ---A1---A3---S[1], S[2]...S[p]
914            /   | \ / |\ /
915          C     |  x  | x
916            \   | / \ |/ \
917             ---A2---A4---S[p+1], S[p+2] ...S[n]

919          Figure 13: Full Mesh

921  A.8.  Partitions

923     A Diameter network with multiple servers is said to be "partitioned"
924     when only a subset of available servers can serve a particular
925     realm-routed request.  For example, one group of servers may handle
926     users whose names start with "A" through "M", and another group may
927     handle "N" through "Z".

929     In such a partitioned network, nodes cannot load-balance requests
930     across partitions, since not all servers can handle the request.  A
931     client, or an intermediate agent, may still be able to load-balance
932     between servers inside a partition.

934  A.9.  Active-Standby Nodes

936     The previous scenarios assume that traffic can be load balanced among
937     all peers that are eligible to handle a request.  That is, the peers
938     operate in an "active-active" configuration.  In an "active-standby"
939     configuration, traffic would be load-balanced among active peers.
940     Requests would only be sent to peers in a "standby" state if the
941     active peers became unavailable.  For example, requests might be
942     diverted to a standby peer if one or more active peers become
943     overloaded.

945  Authors' Addresses

947     Ben Campbell
948     Oracle
949     7460 Warren Parkway # 300
950     Frisco, Texas 75034
951     USA

953     Email: ben@nostrum.com

955     Steve Donovan (editor)
956     Oracle
957     7460 Warren Parkway # 300
958     Frisco, Texas 75034
959     United States

961     Email: srdonovan@usdonovans.com

962     Jean-Jacques Trottin
963     Nokia
964     Route de Villejust
965     91620 Nozay
966     France

968     Email: jean-jacques.trottin@nokia.com