Diameter Maintenance and Extensions (DIME)              J. Korhonen, Ed.
Internet-Draft                                                  Broadcom
Intended status: Standards Track                         S. Donovan, Ed.
Expires: April 30, 2015                                      B. Campbell
                                                                  Oracle
                                                               L. Morand
                                                             Orange Labs
                                                        October 27, 2014

                Diameter Overload Indication Conveyance
                      draft-ietf-dime-ovli-04.txt

Abstract

   This specification documents a Diameter Overload Control (DOC) base solution and the dissemination of the overload report information.

Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 30, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
   Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology and Abbreviations
   3.  Solution Overview
     3.1.  Overload Control Endpoints (Non normative)
     3.2.  Piggybacking Principle (Non normative)
     3.3.  DOIC Capability Announcement (Non normative)
     3.4.  DOIC Overload Condition Reporting (Non normative)
     3.5.  DOIC Extensibility (Non normative)
     3.6.  Simplified Example Architecture (Non normative)
     3.7.  Considerations for Applications Integrating the DOIC Solution (Non normative)
       3.7.1.  Application Classification (Non normative)
       3.7.2.  Application Type Overload Implications (Non normative)
       3.7.3.  Request Transaction Classification (Non normative)
       3.7.4.  Request Type Overload Implications (Non normative)
   4.  Solution Procedures (Normative)
     4.1.  Capability Announcement (Normative)
       4.1.1.  Reacting Node Behavior (Normative)
       4.1.2.  Reporting Node Behavior (Normative)
       4.1.3.  Agent Behavior (Normative)
     4.2.  Overload Report Processing (Normative)
       4.2.1.  Overload Control State (Normative)
       4.2.2.  Reacting Node Behavior (Normative)
       4.2.3.  Reporting Node Behavior (Normative)
       4.2.4.  Agent Behavior (Normative)
     4.3.  Protocol Extensibility (Normative)
   5.  Loss Algorithm (Normative)
     5.1.  Overview (Non normative)
     5.2.  Use of OC-Reduction-Percentage AVP
     5.3.  Reporting Node Behavior (Normative)
     5.4.  Reacting Node Behavior (Normative)
   6.  Attribute Value Pairs (Normative)
     6.1.  OC-Supported-Features AVP
     6.2.  OC-Feature-Vector AVP
     6.3.  OC-OLR AVP
     6.4.  OC-Sequence-Number AVP
     6.5.  OC-Validity-Duration AVP
     6.6.  OC-Report-Type AVP
     6.7.  OC-Reduction-Percentage AVP
     6.8.  Attribute Value Pair flag rules
   7.  Error Response Codes
   8.  IANA Considerations
     8.1.  AVP codes
     8.2.  New registries
   9.  Security Considerations
     9.1.  Potential Threat Modes
     9.2.  Denial of Service Attacks
     9.3.  Non-Compliant Nodes
     9.4.  End-to End-Security Issues
   10. Contributors
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Issues left for future specifications
     A.1.  Additional traffic abatement algorithms
     A.2.  Agent Overload
     A.3.  DIAMETER_TOO_BUSY clarifications
   Appendix B.  Examples
     B.1.  Mix of Destination-Realm routed requests and Destination-Host routed requests
   Appendix C.  Restructuring of -02 version of the draft
   Authors' Addresses

1.  Introduction

   This specification defines a base solution for Diameter Overload Control (DOC), referred to as Diameter Overload Indication Conveyance (DOIC). The requirements for the solution are described and discussed in the corresponding design requirements document [RFC7068]. Note that the overload control solution defined in this specification does not address all the requirements listed in [RFC7068]. A number of overload control related features are left for future specifications. See Appendix A for a list of extensions that are currently being considered. See Appendix C for an analysis of the conformance to the requirements specified in [RFC7068].

   The solution defined in this specification addresses Diameter overload control between Diameter nodes that support the DOIC solution.
   Furthermore, the solution, which is designed to apply to existing and future Diameter applications, requires no changes to the Diameter base protocol [RFC6733] and is deployable in environments where some Diameter nodes do not implement the Diameter overload control solution defined in this specification.

2.  Terminology and Abbreviations

   Abatement
      Reaction to receipt of an overload report resulting in a reduction in traffic sent to the reporting node. Abatement actions include diversion and throttling.

   Abatement Algorithm
      A mechanism requested by reporting nodes and used by reacting nodes to reduce the amount of traffic sent during an occurrence of overload control.

   Diversion
      Abatement of traffic sent to a reporting node by a reacting node in response to receipt of an overload report. The abatement is achieved by diverting traffic from the reporting node to another Diameter node that is able to process the request.

   Host-Routed Request
      The set of requests that a reacting node knows will be served by a particular host, either due to the presence of a Destination-Host AVP, or by some other local knowledge on the part of the reacting node.

   Overload Control State (OCS)
      Reporting and reacting node internally maintained state describing occurrences of overload control.

   Overload Report (OLR)
      Information sent by a reporting node indicating the start, continuation or end of an occurrence of overload control.

   Reacting Node
      A Diameter node that acts upon an overload report.

   Realm-Routed Request
      The set of requests for which a reacting node does not know which host will service the request.

   Reporting Node
      A Diameter node that generates an overload report. (This may or may not be the overloaded node.)

   Throttling
      Throttling is the reduction of the number of requests sent to an entity. Throttling can include a Diameter Client or Diameter Server dropping requests, or a Diameter Agent rejecting requests with appropriate error responses. In extreme cases reporting nodes can also throttle requests when the requested reduction in traffic does not sufficiently address the overload scenario.

3.  Solution Overview

   The Diameter Overload Information Conveyance (DOIC) solution allows Diameter nodes to request other nodes to perform overload abatement actions, that is, actions to reduce the load offered to the overloaded node or realm.

   A Diameter node that supports DOIC is known as a "DOIC node". Any Diameter node can act as a DOIC node, including clients, servers, and agents. DOIC nodes are further divided into "Reporting Nodes" and "Reacting Nodes." A reporting node requests overload abatement by sending an Overload Report (OLR) to one or more reacting nodes.

   A reacting node acts upon OLRs, and performs whatever actions are needed to fulfill the abatement requests included in the OLRs. A reporting node may report overload on its own behalf, or on behalf of other (typically upstream) nodes.
   Likewise, a reacting node may perform overload abatement on its own behalf, or on behalf of other (typically downstream) nodes.

   A node's role as a DOIC node is independent of its Diameter role. For example, Diameter Relay and Proxy Agents may act as DOIC nodes, even though they are not endpoints in the Diameter sense. Since Diameter enables bi-directional applications, where Diameter Servers can send requests towards Diameter Clients, a given Diameter node can simultaneously act as a reporting node and a reacting node.

   Likewise, a relay or proxy agent may act as a reacting node from the perspective of upstream nodes, and a reporting node from the perspective of downstream nodes.

   DOIC nodes do not generate new messages to carry DOIC related information. Rather, they "piggyback" DOIC information over existing Diameter messages by inserting new AVPs into existing Diameter requests and responses. Nodes indicate support for DOIC, and any needed DOIC parameters, by inserting an OC-Supported-Features AVP (Section 6.1) into existing requests and responses. Reporting nodes send OLRs by inserting OC-OLR AVPs (Section 6.3).

   A given OLR applies to the Diameter realm and application of the Diameter message that carries it. If a reporting node supports more than one realm and/or application, it reports independently for each combination of realm and application. Similarly, the OC-Supported-Features AVP applies to the realm and application of the enclosing message. This implies that a node may support DOIC for one application and/or realm, but not another, and may indicate different DOIC parameters for each application and realm for which it supports DOIC.

   Reacting nodes perform overload abatement according to an agreed-upon abatement algorithm. An abatement algorithm defines the meaning of the parameters of an OLR and the procedures required for overload abatement.
   This document specifies a single must-support algorithm, namely the "loss" algorithm (Section 5). Future specifications may introduce new algorithms.

   Overload conditions may vary in scope. For example, a single Diameter node may be overloaded, in which case reacting nodes may reasonably attempt to send requests to other destinations or via other agents. On the other hand, an entire Diameter realm may be overloaded, in which case such attempts would do harm. DOIC OLRs have a concept of "report type" (Section 6.6), where the type defines such behaviors. Report types are extensible. This document defines report types for overload of a specific server, and for overload of an entire realm.

   A report of type host is sent to indicate the overload of a specific server for the application-id indicated in the transaction. When receiving an OLR of type host, a reacting node applies overload abatement to what is referred to in this document as host-routed requests. This is the set of requests that the reacting node knows will be served by a particular host, either due to the presence of a Destination-Host AVP, or by some other local knowledge on the part of the reacting node. The reacting node applies overload abatement on those host-routed requests which the reacting node knows will be served by the server that matches the Origin-Host AVP of the received message that contained the received OLR of type host.

   A report of type realm is sent to indicate the overload of all servers in a realm for the application-id. When receiving an OLR of type realm, a reacting node applies overload abatement to what is referred to in this document as realm-routed requests. This is the set of requests that are not host-routed as defined in the previous paragraph.

   While a reporting node sends OLRs to "adjacent" reacting nodes, nodes that are "adjacent" for DOIC purposes may not be adjacent from a Diameter, or transport, perspective.
   For example, one or more Diameter agents that do not support DOIC may exist between a given pair of reporting and reacting nodes, as long as those agents pass unknown AVPs through unchanged.

   The report types described in this document can safely pass through non-supporting agents. This may not be true for report types defined in future specifications. Documents that introduce new report types MUST describe any limitations on their use across non-supporting agents.

3.1.  Overload Control Endpoints (Non normative)

   The overload control solution can be considered as an overlay on top of an arbitrary Diameter network. The overload control information is exchanged over a "DOIC association" established between two communication endpoints. The endpoints, namely the "reacting node" and the "reporting node", do not need to be adjacent Diameter peer nodes, nor do they need to be the end-to-end Diameter nodes in a typical "client-server" deployment with multiple intermediate Diameter agent nodes in between. The overload control endpoints are the two Diameter nodes that decide to exchange overload control information between each other. How the endpoints are determined is specific to a deployment, a Diameter node role in that deployment and local configuration.

   The following diagrams illustrate the concept of Diameter Overload Endpoints and how they differ from the standard [RFC6733] defined client, server and agent Diameter nodes. The following is the key to the elements in the diagrams:

   C
      Diameter client as defined in [RFC6733].

   S
      Diameter server as defined in [RFC6733].

   A
      Diameter agent, in either a relay or proxy mode, as defined in [RFC6733].

   DEP
      Diameter Overload Endpoint as defined in this document. In the following figures a DEP may terminate two different DOIC associations, being a reporter and reactor at the same time.

   Diameter Session
      A Diameter session as defined in [RFC6733].

   DOIC Association
      A DOIC association exists between two Diameter Overload Endpoints.
      One of the endpoints is the overload reporter and the other is the overload reactor.

   Figure 1 illustrates the most basic configuration where a client is connected directly to a server. In this case, the Diameter session and the DOIC association are both between the client and server.

    +-----+            +-----+
    |  C  |            |  S  |
    +-----+            +-----+
    | DEP |            | DEP |
    +--+--+            +--+--+
       |                  |
       |                  |
       |{Diameter Session}|
       |                  |
       |{DOIC Association}|
       |                  |

                    Figure 1: Basic DOIC deployment

   In Figure 2 there is an agent that is not participating directly in the exchange of overload reports. As a result, the Diameter session and the DOIC association are still established between the client and the server.

    +-----+            +-----+            +-----+
    |  C  |            |  A  |            |  S  |
    +-----+            +--+--+            +-----+
    | DEP |               |               | DEP |
    +--+--+               |               +--+--+
       |                  |                  |
       |                  |                  |
       |----------{Diameter Session}---------|
       |                  |                  |
       |----------{DOIC Association}---------|
       |                  |                  |

         Figure 2: DOIC deployment with non participating agent

   Figure 3 illustrates the case where the client does not support Diameter overload. In this case, the DOIC association is between the agent and the server. The agent handles the role of the reactor for overload reports generated by the server.

    +-----+            +-----+            +-----+
    |  C  |            |  A  |            |  S  |
    +--+--+            +-----+            +-----+
       |               | DEP |            | DEP |
       |               +--+--+            +--+--+
       |                  |                  |
       |                  |                  |
       |----------{Diameter Session}---------|
       |                  |                  |
       |                  |{DOIC Association}|
       |                  |                  |

   Figure 3: DOIC deployment with non-DOIC client and DOIC enabled agent

   In Figure 4 there is a DOIC association between the client and the agent and a second DOIC association between the agent and the server. One use case requiring this configuration is when the agent is serving as a Server Front End (SFE) for a set of servers.

    +-----+            +-----+            +-----+
    |  C  |            |  A  |            |  S  |
    +-----+            +-----+            +-----+
    | DEP |            | DEP |            | DEP |
    +--+--+            +--+--+            +--+--+
       |                  |                  |
       |                  |                  |
       |----------{Diameter Session}---------|
       |                  |                  |
       |{DOIC Association}|{DOIC Association}|
       |                  |                  |
                        and/or
       |----------{DOIC Association}---------|
       |                  |                  |

          Figure 4: A deployment where all nodes support DOIC

   Figure 5 illustrates a deployment where some clients support Diameter overload control and some do not. In this case the agent must support Diameter overload control for the non supporting client. It might also need to have a DOIC association with the server, as shown here, to handle overload for a server farm and/or for managing Realm overload.

    +-----+            +-----+            +-----+            +-----+
    | C1  |            | C2  |            |  A  |            |  S  |
    +-----+            +--+--+            +-----+            +-----+
    | DEP |               |               | DEP |            | DEP |
    +--+--+               |               +--+--+            +--+--+
       |                  |                  |                  |
       |                  |                  |                  |
       |-------------------{Diameter Session}-------------------|
       |                  |                  |                  |
       |                  |--------{Diameter Session}-----------|
       |                  |                  |                  |
       |---------{DOIC Association}----------|{DOIC Association}|
       |                  |                  |                  |
                                  and/or
       |-------------------{DOIC Association}-------------------|
       |                  |                  |                  |

    Figure 5: A deployment with DOIC and non-DOIC supporting clients

   Editor's note: Propose to remove C1, which is already shown in a previous figure. Have this focus just on the non supporting client scenario.

   Figure 6 illustrates a deployment where some agents support Diameter overload control and others do not.

    +-----+            +-----+            +-----+            +-----+
    |  C  |            |  A  |            |  A  |            |  S  |
    +-----+            +--+--+            +-----+            +-----+
    | DEP |               |               | DEP |            | DEP |
    +--+--+               |               +--+--+            +--+--+
       |                  |                  |                  |
       |                  |                  |                  |
       |-------------------{Diameter Session}-------------------|
       |                  |                  |                  |
       |                  |                  |                  |
       |---------{DOIC Association}----------|{DOIC Association}|
       |                  |                  |                  |
                                  and/or
       |-------------------{DOIC Association}-------------------|
       |                  |                  |                  |

     Figure 6: A deployment with DOIC and non-DOIC supporting agents

   Editor's note: Propose to add a non supporting server scenario.
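   The way DOIC associations span non-supporting nodes in the figures of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the (name, supports DOIC) tuple and the function name doic_associations are hypothetical, every DOIC-capable node is assumed to take part, and non-supporting nodes are assumed to forward unknown AVPs unchanged. Real deployments choose endpoints by node role and local configuration, so the end-to-end "and/or" variants in Figures 4 to 6 are not produced by this sketch.

```python
from typing import List, Tuple

# Hypothetical representation of a node on the request path:
# (node name, whether the node supports DOIC).
Node = Tuple[str, bool]


def doic_associations(path: List[Node]) -> List[Tuple[str, str]]:
    """Pair each DOIC-capable node with the next DOIC-capable node on the path.

    Nodes without DOIC support are skipped. The sketch assumes they forward
    unknown AVPs unchanged, so an association simply spans them.
    """
    capable = [name for name, supports_doic in path if supports_doic]
    return list(zip(capable, capable[1:]))


# Figure 2: the agent does not participate, so the association is end-to-end.
figure_2 = doic_associations([("C", True), ("A", False), ("S", True)])
# Figure 3: the client lacks DOIC, so the agent acts as the reacting node.
figure_3 = doic_associations([("C", False), ("A", True), ("S", True)])
# Figure 4, hop-by-hop variant: one association per adjacent capable pair.
figure_4 = doic_associations([("C", True), ("A", True), ("S", True)])
```

   With these inputs the sketch yields a single client-to-server association for Figure 2, a single agent-to-server association for Figure 3, and two associations for the hop-by-hop reading of Figure 4.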
3.2.  Piggybacking Principle (Non normative)

   The overload control AVPs defined in this specification have been designed to be piggybacked on top of existing application messages. This is made possible by adding overload control top-level AVPs, the OC-OLR AVP and the OC-Supported-Features AVP, as optional AVPs into existing commands when the corresponding Command Code Format (CCF) specification allows adding new optional AVPs (see Section 1.3.4 of [RFC6733]).

   Reacting nodes indicate support for DOIC by including the OC-Supported-Features AVP in all request messages originated or relayed by the reacting node.

   Reporting nodes indicate support for DOIC by including the OC-Supported-Features AVP in all answer messages originated or relayed by the reporting node. Reporting nodes also include overload reports using the OC-OLR AVP in answer messages.

   Note: There is no new Diameter application defined to carry overload related AVPs. The DOIC AVPs are carried in existing Diameter application messages.

   Note that the overload control solution does not have fixed server and client roles. The DOIC node role is determined based on the message type: whether the message is a request (i.e. sent by a "reacting node") or an answer (i.e. sent by a "reporting node"). Therefore, in a typical "client-server" deployment, the Diameter Client MAY report its overload condition to the Diameter Server for any Diameter Server initiated message exchange. An example of such is the Diameter Server requesting a re-authentication from a Diameter Client.

3.3.  DOIC Capability Announcement (Non normative)

   The DOIC solution supports the ability for Diameter nodes to determine if other nodes in the path of a request support the solution. This capability is referred to as DOIC Capability Announcement (DCA) and is separate from Diameter Capability Exchange.
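   A minimal Python sketch of the piggybacking principle follows. Messages are modelled as plain dictionaries keyed by AVP name, which is an assumption made purely for illustration, as are the function names add_supported_features and piggyback_on_answer and the sample AVP values. The constant OLR_DEFAULT_ALGO stands for the loss algorithm bit of the OC-Feature-Vector AVP; its numeric value here should be checked against Section 6.2. The sketch also assumes, in line with the capability announcement rules of this section, that a reporting node adds DOIC AVPs only to answers whose request announced DOIC support.

```python
from typing import Optional

# Loss-algorithm bit of OC-Feature-Vector; value assumed, see Section 6.2.
OLR_DEFAULT_ALGO = 0x0000000000000001


def add_supported_features(request: dict) -> dict:
    """Reacting side: announce DOIC support on an outgoing or relayed request.

    An OC-Supported-Features AVP that is already present is left untouched.
    """
    out = dict(request)
    out.setdefault("OC-Supported-Features",
                   {"OC-Feature-Vector": OLR_DEFAULT_ALGO})
    return out


def piggyback_on_answer(request: dict, answer: dict,
                        olr: Optional[dict] = None) -> dict:
    """Reporting side: add DOIC AVPs to an existing answer message.

    Nothing is added when the request carried no OC-Supported-Features AVP,
    because no reacting node announced itself on that path.
    """
    out = dict(answer)
    if "OC-Supported-Features" not in request:
        return out
    out["OC-Supported-Features"] = {"OC-Feature-Vector": OLR_DEFAULT_ALGO}
    if olr is not None:
        out["OC-OLR"] = olr  # overload report rides on the ordinary answer
    return out


# Illustrative values only; "host" is a placeholder for a report type.
sample_olr = {"OC-Sequence-Number": 1, "OC-Report-Type": "host",
              "OC-Validity-Duration": 30, "OC-Reduction-Percentage": 20}
request = add_supported_features({"Destination-Realm": "example.com"})
answer = piggyback_on_answer(request, {"Result-Code": 2001}, sample_olr)
plain = piggyback_on_answer({}, {"Result-Code": 2001}, sample_olr)
```

   No new message type appears anywhere in the sketch: both functions only decorate messages the application was going to send anyway, and the answer to a request without the announcement is returned unchanged.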
   The DCA mechanism is built around the piggybacking principle used for transporting Diameter overload AVPs. This includes both DCA AVPs and AVPs associated with Diameter overload reports. This allows for the DCA AVPs to be carried across Diameter nodes that do not support the DOIC solution.

   The DCA solution uses the OC-Supported-Features AVP to indicate the Diameter overload features supported.

   The first node in the path of a Diameter request that supports the DOIC solution inserts the OC-Supported-Features AVP in the request message. This includes an indication that it supports the loss overload abatement algorithm defined in this specification (see Section 5). This ensures that there is at least one commonly supported overload abatement algorithm between the reporting node and the reacting nodes in the path of the request.

   DOIC must support deployments where Diameter Clients and/or Diameter Servers do not support the DOIC solution. In this scenario, it is assumed that Diameter Agents that support the DOIC solution will handle overload abatement for the non supporting Diameter nodes. In this case the DOIC agent will insert the OC-Supported-Features AVP in requests that do not already contain one, telling the reporting node that there is a DOIC node that will handle overload abatement.

   The reporting node inserts the OC-Supported-Features AVP in all answer messages to requests that contained the OC-Supported-Features AVP. The contents of the reporting node's OC-Supported-Features AVP indicate the set of Diameter overload features supported by the reporting node, with one exception.

   The reporting node only includes an indication of support for one overload abatement algorithm. This is the algorithm that the reporting node intends to use should it enter an overload condition, or that it requests while it actually is in an overload condition.
   Reacting nodes can use the indicated overload abatement algorithm to prepare for possible overload reports and must use the indicated overload abatement algorithm if traffic reduction is actually requested.

   Note that the loss algorithm defined in this document is a stateless abatement algorithm. As a result it does not require any actions by reacting nodes prior to the receipt of an overload report. Stateful abatement algorithms that base the abatement logic on a history of request messages sent might require reacting nodes to maintain state to ensure that overload reports can be properly handled.

   The individual features supported by the DOIC nodes are indicated in the OC-Feature-Vector AVP. Any semantics associated with the features will be defined in extension specifications that introduce the features.

   The DCA mechanism must also support the scenario where the sets of features supported by the sender of a request and by agents in the path of a request differ. In this case, the agent updates the OC-Supported-Features AVP to reflect the mixture of the two sets of supported features.

   The logic to determine the content of the modified OC-Supported-Features AVP is out-of-scope for this specification and is left to implementation decisions. Care must be taken not to introduce interoperability issues for downstream or upstream DOIC nodes.

3.4.  DOIC Overload Condition Reporting (Non normative)

   As with DOIC Capability Announcement, Overload Condition Reporting uses new AVPs (Section 6.3) to indicate an overload condition.

   The OC-OLR AVP is referred to as an overload report. The OC-OLR AVP includes the type of report, a sequence number, the length of time that the report is valid and abatement algorithm specific AVPs.

   Two types of overload reports are defined in this document, host reports and realm reports.

   A report of type host is sent to indicate the overload of a specific Diameter node for the application-id indicated in the transaction. When receiving an OLR of type host, a reacting node applies overload abatement to what is referred to in this document as host-routed requests. This is the set of requests that the reacting node knows will be served by a particular host, either due to the presence of a Destination-Host AVP, or by some other local knowledge on the part of the reacting node. The reacting node applies overload abatement on those host-routed requests which the reacting node knows will be served by the server that matches the Origin-Host AVP of the received message that contained the received OLR of type host.

   Realm reports apply to realm-routed requests for a specific realm as indicated in the Destination-Realm AVP.

   Reporting nodes are responsible for determining the need for a reduction of traffic. The method for making this determination is implementation specific and depends on the type of overload report being generated. A host report, for instance, will generally be generated by tracking utilization of resources required by the host to handle transactions for the Diameter application. A realm report will generally impact the traffic sent to multiple hosts and, as such, will typically require tracking the capacity of the servers able to handle realm-routed requests for the application.

   Once a reporting node determines the need for a reduction in traffic, it uses the DOIC defined AVPs to report on the condition. These AVPs are included in answer messages sent or relayed by the reporting node. The reporting node indicates the overload abatement algorithm that is to be used to handle the traffic reduction in the OC-Supported-Features AVP.
   The OC-OLR AVP is used to communicate information about the requested reduction.

   Reacting nodes, upon receipt of an overload report, are responsible for applying the abatement algorithm to traffic impacted by the overload report. The method used for that abatement is dependent on the abatement algorithm. The loss abatement algorithm is defined in this document (Section 5). Other abatement algorithms can be defined in extensions to the DOIC solution.

   As the conditions that lead to the generation of the overload report change, the reporting node can send new overload reports requesting greater reduction if the condition gets worse or less reduction if the condition improves. The reporting node sends an overload report with a duration of zero to indicate that the overload condition has ended and use of the abatement algorithm is no longer needed.

   The reacting node also determines when the overload report expires based on the OC-Validity-Duration AVP in the overload report and stops applying the abatement algorithm when the report expires.

3.5.  DOIC Extensibility (Non normative)

   The DOIC solution is designed to be extensible. This extensibility is based on existing Diameter based extensibility mechanisms.

   There are multiple categories of extensions that are expected. These include the definition of new overload abatement algorithms, the definition of new report types and new definitions of the scope of messages impacted by an overload report.

   The DOIC solution uses the OC-Supported-Features AVP for DOIC nodes to communicate supported features. The specific features supported by the DOIC node are indicated in the OC-Feature-Vector AVP. DOIC extensions must define new values for the OC-Feature-Vector AVP. DOIC extensions also have the ability to add new AVPs to the OC-Supported-Features AVP, if additional information about the new feature is required.
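   To make the role of the OC-Feature-Vector AVP in extensions more concrete, the following Python sketch treats the vector as a set of bit flags and shows how a reporting node might pick the single abatement algorithm it echoes back. The value of OLR_DEFAULT_ALGO is assumed from Section 6.2. EXAMPLE_RATE_ALGO is an invented bit that no specification defines, and both select_algorithm and its preference-order policy are hypothetical choices made for this illustration.

```python
from enum import IntFlag
from typing import Sequence


class Feature(IntFlag):
    """Bits of the OC-Feature-Vector AVP as used in this sketch."""
    OLR_DEFAULT_ALGO = 0x1   # loss algorithm; value assumed from Section 6.2
    EXAMPLE_RATE_ALGO = 0x2  # invented extension bit, illustration only


def select_algorithm(advertised: Feature,
                     preference: Sequence[Feature]) -> Feature:
    """Pick the one algorithm a reporting node indicates in its answers.

    `advertised` is the feature vector received in the request and
    `preference` is the reporting node's own ordering of algorithms.
    The loss algorithm is the fallback, since every DOIC node must
    support it.
    """
    for algorithm in preference:
        if algorithm & advertised:
            return algorithm
    return Feature.OLR_DEFAULT_ALGO


order = [Feature.EXAMPLE_RATE_ALGO, Feature.OLR_DEFAULT_ALGO]
both = select_algorithm(
    Feature.OLR_DEFAULT_ALGO | Feature.EXAMPLE_RATE_ALGO, order)
loss_only = select_algorithm(Feature.OLR_DEFAULT_ALGO, order)
```

   A reacting node that advertises only the loss bit is therefore always answered with the loss algorithm, while a node that also advertises the invented extension bit may be answered with it, depending on the reporting node's preference.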
Reporting nodes use the OC-OLR AVP to communicate overloadoccurances.occurrences. This AVP can also be extended to add new AVPs allowing a reporting nodes to communicate additional information about handling an overload condition. If necessary, new extensions can also define newtop leveltop-level AVPs. It is, however, recommended that DOIC extensions use the OC-Supported- Features and OC-OLR to carry all DOIC related AVPs.3.6.3.5. Simplified Example Architecture(Non normative)Figure71 illustrates the simplified architecture for Diameter overload information conveyance.See Section 3.1 for more discussion and details how different Diameter nodes fit into the architecture from the DOIC point of view.Realm X Same or other Realms <--------------------------------------> <----------------------> +--^-----+ : (optional) : |Diameter| : : |Server A|--+ .--. : +---^----+ : .--. +--------+ | _( `. : |Diameter| : _( `. +---^----+ +--( )--:-| Agent |-:--( )--|Diameter| +--------+ | ( ` . ) ) : +-----^--+ : ( ` . ) ) | Client | |Diameter|--+ `--(___.-' : : `--(___.-' +-----^--+ |Server B| : : +---^----+ : : End-to-end Overload Indication 1) <-----------------------------------------------> Diameter Application Y Overload Indication A Overload Indication A' 2) <----------------------> <----------------------> standard base protocol standard base protocol Figure7:1: Simplified architecture choices for overload indication delivery In Figure7,1, the Diameter overload indication can be conveyed (1) end-to-end between servers and clients or (2) between servers and Diameter agent inside the realm and then between the Diameter agent and theclients when the Diameter agent acting as back-to-back-agent for DOIC purposes. 3.7. Considerations for Applications Integrating the DOIC Solution (Non normative) THis section outlines considerations to be taken into account when integrating the DOIC solution into Diameter applications. 3.7.1. 
Application Classification (Non normative)

The following is a classification of Diameter applications and requests. This discussion is meant to document factors that play into decisions made by the Diameter identity responsible for handling overload reports.

Section 8.1 of [RFC6733] defines two state machines that imply two types of applications, session-less and session-based applications. The primary difference between these types of applications is the lifetime of Session-Ids.

For session-based applications, the Session-Id is used to tie multiple requests into a single session.

In session-less applications, the lifetime of the Session-Id is a single Diameter transaction, i.e. the session is implicitly terminated after a single Diameter transaction and a new Session-Id is generated for each Diameter request.

For the purposes of this discussion, session-less applications are further divided into two types of applications:

Stateless applications: Requests within a stateless application have no relationship to each other. The 3GPP defined S13 application is an example of a stateless application [S13], where only a Diameter command is defined between a client and a server and no state is maintained between two consecutive transactions.

Pseudo-session applications: Applications that do not rely on the Session-Id AVP for correlation of application messages related to the same session but use other session-related information in the Diameter requests for this purpose. The 3GPP defined Cx application [Cx] is an example of a pseudo-session application.

The Credit-Control application defined in [RFC4006] is an example of a Diameter session-based application.

The handling of overload reports must take the type of application into consideration, as discussed in Section 3.7.2.

3.7.2. Application Type Overload Implications (Non normative)

This section discusses considerations for mitigating overload reported by a Diameter entity.
This discussion focuses on the type of application. Section 3.7.3 discusses considerations for handling various request types when the target server is known to be in an overloaded state. These discussions assume that the strategy for mitigating the reported overload is to reduce the overall workload sent to the overloaded entity. The concept of applying overload treatment to requests targeted for an overloaded Diameter entity is inherent to this discussion. The method used to reduce offered load is not specified here but could include routing requests to another Diameter entity known to be able to handle them, or it could mean rejecting certain requests. For a Diameter agent, rejecting requests will usually mean generating appropriate Diameter error responses. For a Diameter client, rejecting requests will depend upon the application. For example, it could mean giving an indication to the entity requesting the Diameter service that the network is busy and to try again later. Stateless applications: By definition there is no relationship between individual requests in a stateless application. As a result, when a request is sent or relayed to an overloaded Diameter entity - either a Diameter Server or a Diameter Agent - the sending or relaying entity can choose to apply the overload treatment to any request targeted for the overloaded entity. Pseudo-session applications: For pseudo-session applications, there is an implied ordering of requests. As a result, decisions about which requests towards an overloaded entity to reject could take the command code of the request into consideration. This generally means that transactions later in the sequence of transactions should be given more favorable treatment than messages earlier in the sequence. This is because more work has already been done by the Diameter network for those transactions that occur later in the sequence. 
Rejecting them could result in increasing the load on the network as the transactions earlier in the sequence might also need to be repeated.

Session-based applications: Overload handling for session-based applications must take into consideration the work load associated with setting up and maintaining a session. As such, the entity sending requests towards an overloaded Diameter entity for a session-based application might tend to reject new session requests prior to rejecting intra-session requests. In addition, session ending requests might be given a lower probability of being rejected as rejecting session ending requests could result in session status being out of sync between the Diameter clients and servers. Application designers who decide to reject mid-session requests will need to consider whether the rejection invalidates the session and any resulting session clean-up procedures.

3.7.3. Request Transaction Classification (Non normative)

Independent Request: An independent request is not correlated to any other requests and, as such, the lifetime of the Session-Id is constrained to an individual transaction.

Session-Initiating Request: A session-initiating request is the initial message that establishes a Diameter session. The ACR message defined in [RFC6733] is an example of a session-initiating request.

Correlated Session-Initiating Request: There are cases when multiple session-initiating requests must be correlated and managed by the same Diameter server. This is notably the case in the 3GPP PCC architecture [PCC], where multiple apparently independent Diameter application sessions are actually correlated and must be handled by the same Diameter server.

Intra-Session Request: An intra-session request is a request that uses the same Session-Id as the one used in a previous request. An intra-session request generally needs to be delivered to the server that handled the session-creating request for the session.
The STR message defined in [RFC6733] is an example of an intra-session request.

Pseudo-Session Requests: Pseudo-session requests are independent requests and do not use the same Session-Id but are correlated by other session-related information contained in the request. There exist Diameter applications that define an expected ordering of transactions. This sequencing of independent transactions results in a pseudo session. The AIR, MAR and SAR requests in the 3GPP defined Cx [Cx] application are examples of pseudo-session requests.

3.7.4. Request Type Overload Implications (Non normative)

The request classes identified in Section 3.7.3 have implications on decisions about which requests should be throttled first. The following list of request treatment regarding throttling is provided as guidelines for application designers when implementing the Diameter overload control mechanism described in this document. The exact behavior regarding throttling is a matter of local policy, unless specifically defined for the application.

Independent requests: Independent requests can be given equal treatment when making throttling decisions.

Session-initiating requests: Session-initiating requests represent more work than independent or intra-session requests. Moreover, session-initiating requests are typically followed by other session-related requests. As such, as the main objective of the overload control is to reduce the total number of requests sent to the overloaded entity, throttling decisions might favor allowing intra-session requests over session-initiating requests. Individual session-initiating requests can be given equal treatment when making throttling decisions.

Correlated session-initiating requests: A request that results in a new binding, where the binding is used for routing of subsequent session-initiating requests to the same server, represents more work load than other requests.
As such, these requests might be throttled more frequently than other request types.

Pseudo-session requests: Throttling decisions for pseudo-session requests can take into consideration where individual requests fit into the overall sequence of requests within the pseudo session. Requests that are earlier in the sequence might be throttled more aggressively than requests that occur later in the sequence.

Intra-session requests: There are two classes of intra-session requests. The first class consists of requests that terminate a session. The second one contains the set of requests that are used by the Diameter client and server to maintain the ongoing session state. Session terminating requests should be throttled less aggressively in order to gracefully terminate sessions, allow clean-up of the related resources (e.g. session state) and get rid of the need for other intra-session requests, reducing the session management impact on the overloaded entity. The default handling of other intra-session requests might be to treat them equally when making throttling decisions. There might also be application level considerations whether some request types are favored over others.

4. Solution Procedures

This section outlines the normative behavior associated with the DOIC solution.

4.1. Capability Announcement

This section defines DOIC Capability Announcement (DCA) behavior.

The DCA procedures are used to indicate support for DOIC and support for DOIC features. The DOIC features include overload abatement algorithms supported. It might also include new report types or other extensions documented in the future.

Diameter nodes indicate support for DOIC by including the OC-Supported-Features AVP in messages sent or handled by the node.

Diameter agents that support DOIC MUST ensure that all messages have the OC-Supported-Features AVP.
If a message handled by the DOIC agent does not include the OC-Supported-Features AVP then the DOIC agent inserts the AVP. If the message already has the AVP then the agent either leaves it unchanged in the relayed message or modifies it to reflect a mixed set of DOIC features.

4.1.1. Reacting Node Behavior

A reacting node MUST include the OC-Supported-Features AVP in all request messages.

A reacting node MAY include the OC-Feature-Vector AVP with an indication of the loss algorithm. A reacting node MUST include the OC-Feature-Vector AVP to indicate support for abatement algorithms in addition to the loss algorithm.

A reacting node SHOULD indicate support for all other DOIC features it supports.

Not all DOIC features will necessarily apply to all transactions. For instance, there may be a future extension that only applies to session based applications. A reacting node that supports this extension can choose to not include it for non session based applications.

An OC-Supported-Features AVP in answer messages indicates there is a reporting node for the transaction. The reacting node MAY take action based on the features indicated in the OC-Feature-Vector AVP.

Note that the loss abatement algorithm is the only feature described in this document and it does not require action to be taken when there is an active overload report. This behavior is described in Section 4.2 and Section 5.

4.1.2. Reporting Node Behavior

Upon receipt of a request message, a reporting node determines if there is a reacting node for the transaction based on the presence of the OC-Supported-Features AVP.

If the request message contains an OC-Supported-Features AVP then the reporting node MUST include the OC-Supported-Features AVP in the answer message for that transaction.
The reporting node MUST NOT include the OC-Supported-Features AVP, OC-OLR AVP or any other overload control AVPs defined in extension drafts in response messages for transactions where the request message does not include the OC-Supported-Features AVP. Lack of the OC-Supported-Features AVP in the request message indicates that there is no reacting node for the transaction.

Based on the content of the OC-Supported-Features AVP in the request message, the reporting node knows what overload control functionality is supported by the reacting node. The reporting node then acts accordingly for the subsequent answer messages it initiates.

The reporting node MUST indicate support for one and only one abatement algorithm in the OC-Feature-Vector AVP. The abatement algorithm included MUST be from the set of abatement algorithms contained in the request message's OC-Supported-Features AVP. The abatement algorithm included MUST indicate the abatement algorithm the reporting node wants the reacting node to use when the reporting node enters an overload condition. For an ongoing overload state, a reacting node MUST keep the algorithm that was selected by the reporting node in further requests towards the reporting node.

The reporting node SHOULD NOT change the selected algorithm during a period of time that it is in an overload condition and, as a result, is sending OC-OLR AVPs in answer messages.

The reporting node SHOULD indicate support for other DOIC features defined in extension drafts that it supports and that apply to the transaction.

Note that not all DOIC features will apply to all Diameter applications or deployment scenarios. The features included in the OC-Feature-Vector AVP are based on local reporting node policy.

4.1.3.
Agent Behavior

Diameter agents that support DOIC MUST ensure that all messages have the OC-Supported-Features AVP. If a message handled by the DOIC agent does not include the OC-Supported-Features AVP then the DOIC agent inserts the AVP. If the message already has the AVP then the agent either leaves it unchanged in the relayed message or modifies it to reflect a mixed set of DOIC features.

An agent MAY modify the OC-Supported-Features AVP carried in answer messages.

For instance, if the agent supports a superset of the features reported by the reacting node then the agent might choose, based on local policy, to advertise that superset of features to the reporting node.

If the agent modifies the OC-Supported-Features AVP sent to the reporting node then it might also need to modify the OC-Supported-Features AVP sent to a reacting node in the subsequent answer message, as it cannot send an indication of support for features that are not supported by the reacting node.

Editor's note: There is an open issue on the wording around agent behavior in this case that needs to be resolved prior to finishing this document.

4.2. Overload Report Processing

4.2.1. Overload Control State

Both reacting and reporting nodes maintain Overload Control State (OCS) for active overload conditions.

4.2.1.1.
Overload Control State for Reacting Nodes

A reacting node SHOULD maintain the following OCS per supported Diameter application:

o A host-type OCS entry for each Destination-Host to which it sends host-type requests and

o A realm-type OCS entry for each Destination-Realm to which it sends realm-type requests.

A host-type OCS entry is identified by the pair of Application-Id and Host-Id.

A realm-type OCS entry is identified by the pair of Application-Id and Realm-Id.

The host-type and realm-type OCS entries MAY include the following information (the actual information stored is an implementation decision):

o Sequence number (as received in the OC-OLR AVP)

o Time of expiry (derived from the OC-Validity-Duration AVP received in the OC-OLR AVP and the time of reception of the message carrying the OC-OLR AVP)

o Selected Abatement Algorithm (as received in the OC-Supported-Features AVP)

o Abatement Algorithm specific input data (as received within the OC-OLR AVP, for example, OC-Reduction-Percentage for the Loss abatement algorithm)

4.2.1.2. Overload Control State for Reporting Nodes

A reporting node SHOULD maintain OCS entries per supported Diameter application, per supported (and eventually selected) Abatement Algorithm and per report-type.

An OCS entry is identified by the pair of Application-Id and Abatement Algorithm.
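As a non-normative illustration, the reacting-node OCS entry of Section 4.2.1.1 could be represented as below. This is a minimal sketch: the class, field and function names are assumptions chosen for the example, and the information actually stored remains an implementation decision. The time of expiry is derived by adding the validity duration to the reception time, both expressed in seconds on the same clock.

```python
from dataclasses import dataclass, field


@dataclass
class ReactingOcsEntry:
    # Identification: Application-Id plus Host-Id or Realm-Id.
    app_id: int
    target_id: str
    report_type: str            # "host" or "realm" (illustrative labels)
    # Stored information.
    sequence_number: int
    expiry_time: float          # reception time + validity duration
    algorithm: str              # e.g. "loss"
    algorithm_data: dict = field(default_factory=dict)

    def is_active(self, now: float) -> bool:
        """An entry is active until its time of expiry is reached."""
        return now < self.expiry_time


def make_entry(app_id, target_id, report_type, sequence_number,
               validity_duration, received_at, algorithm, algorithm_data):
    """Build an entry, deriving the time of expiry from the report."""
    return ReactingOcsEntry(
        app_id=app_id,
        target_id=target_id,
        report_type=report_type,
        sequence_number=sequence_number,
        expiry_time=received_at + validity_duration,
        algorithm=algorithm,
        algorithm_data=dict(algorithm_data),
    )
```

The host name and application identifier used when exercising this sketch are arbitrary example values.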
The OCS entry for a given pair of Application and Abatement Algorithm MAY include the following information (the actual information stored is an implementation decision):

o Report type

o Sequence number

o Validity Duration

o Expiration Time

o Algorithm specific input data (for example, the Reduction Percentage for the Loss Abatement Algorithm)

4.2.1.3. Reacting Node Maintenance of Overload Control State

When a reacting node receives an OC-OLR AVP, it MUST determine if it is for an existing or new overload condition.

For the remainder of this section the term OLR refers to the combination of the contents of the received OC-OLR AVP and the abatement algorithm indicated in the received OC-Supported-Features AVP.

The OLR is for an existing overload condition if the reacting node has an OCS that matches the received OLR.

For a host report-type this means it matches the app-id and host-id in an existing host OCS entry.

For a realm report-type this means it matches the app-id and realm-id in an existing realm OCS entry.

If the OLR is for an existing overload condition then the reacting node MUST determine if the OLR is a retransmission or an update to the existing OLR.

If the sequence number for the received OLR is greater than the sequence number stored in the matching OCS entry then the reacting node MUST update the matching OCS entry.

If the sequence number for the received OLR is less than or equal to the sequence number in the matching OCS entry then the reacting node MUST silently ignore the received OLR. The matching OCS MUST NOT be updated in this case.

If the received OLR is for a new overload condition then the reacting node MUST generate a new OCS entry for the overload condition.

For a host report-type this means it creates an OCS entry with the app-id of the application-id in the received message and the host-id of the Origin-Host in the received message.

Note: This solution assumes that the Origin-Host AVP in the answer message included by the reporting node is not changed along the path to the reacting node.
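The decision between a new overload condition, an update and a retransmission can be sketched as follows. This is an illustrative sketch, not a definitive implementation: the table layout, the key tuple and the function name are assumptions, the OLR is modeled as a plain dictionary, and sequence number wrap-around is deliberately not handled.

```python
from typing import Dict, Tuple

# Assumed key layout: (report type, app-id, host-id or realm-id).
OcsKey = Tuple[str, int, str]


def process_olr(ocs_table: Dict[OcsKey, dict], key: OcsKey, olr: dict) -> str:
    """Apply a received OLR to the reacting node's OCS table.

    Returns "created", "updated" or "ignored" so that the caller can
    see which branch was taken.
    """
    entry = ocs_table.get(key)
    if entry is None:
        # No matching entry: this is a new overload condition.
        ocs_table[key] = dict(olr)
        return "created"
    if olr["sequence_number"] > entry["sequence_number"]:
        # Greater sequence number: an update to the existing OLR.
        entry.update(olr)
        return "updated"
    # Less than or equal: silently ignore, leaving the entry untouched.
    return "ignored"
```

Returning a status string is only a convenience for the example; an implementation would more likely act directly on the OCS entry.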
For a realm report-type this means it creates an OCS entry with the app-id of the application-id in the received message and the realm-id of the Origin-Realm in the received message.

If the received OLR contains a validity duration of zero ("0") then the reacting node MUST update the OCS entry as being expired.

Note that it is not necessarily appropriate to delete the OCS entry, as there is recommended behavior that the reacting node slowly returns to full traffic when ending an overload abatement period.

The reacting node does not delete an OCS when receiving an answer message that does not contain an OC-OLR AVP (i.e. absence of OLR means "no change").

4.2.1.4. Reporting Node Maintenance of Overload Control State

A reporting node SHOULD create a new OCS entry when entering an overload condition.

If the reporting node knows through absence of the OC-Supported-Features AVP in received messages that there are no reacting nodes supporting DOIC then the reporting node can choose to not create OCS entries.

When generating a new OCS entry the sequence number MAY be set to any value if there is no unexpired overload report for previous overload conditions sent to any reacting node for the same application and report-type.

When generating sequence numbers for new overload conditions, the new sequence number MUST be greater than any sequence number in an active (unexpired) overload report previously sent by the reporting node.
This property MUST hold over a reboot of the reporting node.

The reporting node MUST update an OCS entry when it needs to adjust the validity duration of the overload condition at reacting nodes.

For instance, this applies if the reporting node wishes to instruct reacting nodes to continue overload abatement for a longer period of time than originally communicated. This also applies if the reporting node wishes to shorten the period of time that overload abatement is to continue.

A reporting node MUST NOT update the abatement algorithm in an active OCS entry.

A reporting node MUST update an OCS entry when it wishes to adjust any abatement algorithm specific parameters, including the reduction percentage used for the Loss abatement algorithm.

For instance, if the reporting node wishes to change the reduction percentage either higher, if the overload condition has worsened, or lower, if the overload condition has improved, then the reporting node would update the appropriate OCS entry.
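The reporting-node update rules in this section can be sketched as below. This is a hedged illustration only: the class and method names are assumptions, raising an exception on an attempted algorithm change is one possible way to enforce the rule, and incrementing by one is just one way to obtain a new, greater sequence number whenever the entry contents change.

```python
from dataclasses import dataclass, field


@dataclass
class ReportingOcsEntry:
    app_id: int
    algorithm: str
    report_type: str
    sequence_number: int
    validity_duration: int
    algorithm_data: dict = field(default_factory=dict)

    def update(self, validity_duration=None, algorithm_data=None,
               algorithm=None) -> bool:
        """Change the entry; return True if its contents changed."""
        if algorithm is not None and algorithm != self.algorithm:
            # The abatement algorithm of an active entry is not updated.
            raise ValueError("abatement algorithm must not change")
        changed = False
        if (validity_duration is not None
                and validity_duration != self.validity_duration):
            self.validity_duration = validity_duration
            changed = True
        if algorithm_data is not None and algorithm_data != self.algorithm_data:
            self.algorithm_data = dict(algorithm_data)
            changed = True
        if changed:
            # Any content change is signaled with a new sequence number.
            self.sequence_number += 1
        return changed
```

A call that leaves the contents unchanged keeps the sequence number as it is, so reacting nodes treat a resent report as a retransmission.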
The reporting node MUST update the sequence number associated with the OCS entry anytime the contents of the OCS entry are changed. This will result in a new sequence number being sent to reacting nodes, instructing the reacting nodes to process the OC-OLR AVP.

A reporting node SHOULD update an OCS entry with a validity duration of zero ("0") when the overload condition ends.

If the reporting node knows that the OCS entries in the reacting nodes are near expiration then the reporting node can decide to delete the OCS entry.

The reporting node MUST keep an OCS entry with a validity duration of zero ("0") for a period of time long enough to ensure that any non-expired reacting node's OCS entry created as a result of the overload condition in the reporting node is deleted.

4.2.2. Reacting Node Behavior

When a reacting node sends a request it MUST determine if that request matches an active OCS.

If the request matches an active OCS then the reacting node MUST apply abatement treatment on the request. The abatement treatment applied depends on the abatement algorithm stored in the OCS.

For the Loss abatement algorithm defined in this specification, see Section 5 for the abatement logic applied.

If the abatement treatment results in throttling of the request and if the reacting node is an agent then the agent MUST send an appropriate error as defined in Section 7.
In the case that the OCS entry validity duration expires or has a validity duration of zero ("0"), meaning that the reporting node has explicitly signaled the end of the overload condition, then abatement associated with the overload condition MUST be ended in a controlled fashion.

4.2.3. Reporting Node Behavior

The operation on the reporting node is straightforward.

If there is an active OCS entry then the reporting node SHOULD include the OC-OLR AVP in all answer messages to requests that contain the OC-Supported-Features AVP and that match the active OCS entry.

A request matches if the application-id in the request matches the application-id in any active OCS entry and if the report-type in the OCS entry matches a report-type supported by the reporting node as indicated in the OC-Supported-Features AVP.

The contents of the OC-OLR AVP MUST contain all information necessary for the abatement algorithm indicated in the OC-Supported-Features AVP that is also included in the answer message.

A reporting node MAY choose to not resend an overload report to a reacting node if it can guarantee that this overload report is already active in the reacting node.

Note: In some cases (e.g. when there are one or more agents in the path between reporting and reacting nodes, or when overload reports are discarded by reacting nodes) the reporting node may not be able to guarantee that the reacting node has received the report.

A reporting node MUST NOT send overload reports of a type that has not been advertised as supported by the reacting node.

Note that a reacting node advertises support for the host and realm report types by including the OC-Supported-Features AVP in the request. Support for other report types must be explicitly indicated by new feature bits in the OC-Feature-Vector AVP.

A reporting node MAY rely on the OC-Validity-Duration AVP values for the implicit overload control state cleanup on the reacting node. However, it is RECOMMENDED that the reporting node always explicitly indicates the end of an overload condition.

The reporting node SHOULD indicate the end of an overload occurrence by sending a new OLR with OC-Validity-Duration set to a value of zero ("0"). The reporting node SHOULD ensure that all reacting nodes receive the updated overload report.

All OLRs sent have an expiration time calculated by adding the validity-duration contained in the OLR to the time the message was sent. Transit time for the OLR can be safely ignored. The reporting node can ensure that all reacting nodes have received the OLR by continuing to send it in answer messages until all OLRs sent for that overload condition have expired.

When a reporting node sends an OLR, it effectively delegates any necessary throttling to downstream nodes.
Therefore, the reporting node SHOULD NOT apply throttling to the set of messages to which the OLR applies. That is, the same candidate set of messages SHOULD NOT be throttled multiple times.

However, when the reporting node sends an OLR downstream, it MAY still be responsible to apply other abatement methods such as diversion. The reporting node might also need to throttle requests for reasons other than overload. For example, an agent or server might have a configured rate limit for each client, and throttle requests that exceed that limit, even if such requests had already been candidates for throttling by downstream nodes.

This document assumes that there is a single source for realm-reports for a given realm, or that if multiple nodes can send realm reports, that each such node has full knowledge of the overload state of the entire realm. A reacting node cannot distinguish between receiving realm-reports from a single node, or from multiple nodes.

Editor's Note: There is not yet consensus on the above two paragraphs. Two alternatives are under consideration: synchronization of sequence numbers and attribution of reports. If no consensus is reached then it will be left to be addressed as an extension.

4.3. Protocol Extensibility

The overload control solution can be extended, e.g. with new traffic abatement algorithms, new report types or other new functionality.

When defining a new extension a new feature bit MUST be defined for the OC-Feature-Vector. This feature bit is used to communicate support for the new feature.

The extension MAY define new AVPs for use in DOIC Capability Announcement and for use in DOIC Overload reporting. These new AVPs SHOULD be defined to be extensions to the OC-Supported-Features and OC-OLR AVPs defined in this document.

It should be noted that the Grouped AVP extension mechanisms defined in [RFC6733] apply.
This allows, for example, defining a new feature that is mandatory to be understood even when piggybacked on an existing application.

The handling of feature bits in the OC-Feature-Vector AVP that are not associated with overload abatement algorithms MUST be specified by the extensions that define the features.

When defining new report type values, the corresponding specification MUST define the semantics of the new report types and how they affect the OC-OLR AVP handling. The specification MUST also reserve a corresponding new feature bit in the OC-Feature-Vector AVP.

The OC-OLR AVP can be expanded with optional sub-AVPs only if a legacy DOIC implementation can safely ignore them without breaking backward compatibility for the given OC-Report-Type AVP value. If the new sub-AVPs imply new semantics for handling the indicated report type, then a new OC-Report-Type AVP value MUST be defined.

New features (feature bits in the OC-Feature-Vector AVP) and report types (in the OC-Report-Type AVP) MUST be registered with IANA. As with any Diameter specification, new AVPs MUST also be registered with IANA. See Section 8 for the required procedures.

5. Loss Algorithm

This section documents the Diameter overload loss abatement algorithm.

5.1. Overview

The DOIC specification supports the ability for multiple overload abatement algorithms to be specified. The abatement algorithm used for any instance of overload is determined by the Diameter Overload Capability Announcement process documented in Section 4.1.
The loss algorithm described in this section is the default algorithm that must be supported by all Diameter nodes that support DOIC.

The loss algorithm is designed to be a straightforward and stateless overload abatement algorithm. It is used by reporting nodes to request a percentage reduction in the amount of traffic sent. The traffic impacted by the requested reduction depends on the type of overload report.

Reporting nodes request that reacting nodes apply abatement logic to the requested percentage of the request messages that the reacting node sends (or handles, in the case of agents) and that are impacted by the overload report.

From a conceptual level, the logic at the reacting node could be outlined as follows. In this discussion assume that the reacting node is also the sending node.

1. An overload report is received and the associated overload state is either saved or updated (if required) by the reacting node.

2. A new Diameter request is generated by the application running on the reacting node.

3. The reacting node determines that an active overload report applies to the request, as indicated by the corresponding OCS entry.

4. The reacting node determines if abatement should be applied to the request. One approach that could be taken for each request is to select a random number between 1 and 100. If the random number is less than or equal to the indicated reduction percentage then the request is given abatement treatment, otherwise the request is given normal routing treatment.

5.2. Reporting Node Behavior

The method a reporting node uses to determine the amount of traffic reduction required to address an overload condition is an implementation decision.
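The per-request decision outlined in step 4 of Section 5.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a normative procedure: the function names needs_abatement and abated_fraction are hypothetical, and the sketch assumes the reduction percentage has already been validated to lie between 0 and 100. The comparison is inclusive so that a reduction of 100 abates every request and a reduction of 0 abates none.

```python
import random


def needs_abatement(reduction_percentage: int, rng: random.Random) -> bool:
    """Return True if a request should receive abatement treatment.

    Hypothetical helper: draws a number in 1..100 and compares it with
    the requested reduction percentage.
    """
    if not 0 <= reduction_percentage <= 100:
        raise ValueError("reduction percentage must be within 0..100")
    # randint is inclusive on both ends, so the draw is uniform over 1..100.
    return rng.randint(1, 100) <= reduction_percentage


def abated_fraction(reduction_percentage: int, requests: int, seed: int = 1) -> float:
    """Simulate independent per-request decisions and return the abated share."""
    rng = random.Random(seed)
    abated = sum(needs_abatement(reduction_percentage, rng) for _ in range(requests))
    return abated / requests
```

Because each decision is independent of the previous ones, no per-request state is kept, which matches the stateless character of the loss algorithm: the abated share approaches the requested percentage only over a large number of requests.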
When a reporting node that has selected the loss abatement algorithm determines the need to request a traffic reduction it includes an OC-OLR AVP in response messages as described in Section 4.2.3.

The reporting node MUST indicate a percentage reduction in the OC-Reduction-Percentage AVP.

The reporting node MAY change the reduction percentage in subsequent overload reports. When doing so the reporting node must conform to the overload report handling specified in Section 4.2.3.

When the reporting node determines it no longer needs a reduction in traffic the reporting node SHOULD send an overload report indicating the overload report is no longer valid, as specified in Section 4.2.3.

5.3. Reacting Node Behavior

The method a reacting node uses to determine which request messages are given abatement treatment is an implementation decision.

When receiving an OC-OLR in an answer message where the algorithm indicated in the OC-Supported-Features AVP is the loss algorithm, the reacting node MUST apply abatement treatment to the requested percentage of request messages sent.

Note: the loss algorithm is a stateless algorithm. As a result, the reacting node does not guarantee that there will be an absolute reduction in traffic sent. Rather, it guarantees that the requested percentage of new requests will be given abatement treatment.

When applying overload abatement treatment for the loss abatement algorithm, the reacting node MUST abate, either by throttling or diversion, the requested percentage of requests that would have otherwise been sent to the reporting host or realm.

If the reacting node comes out of the 100 percent traffic reduction as a result of the overload report timing out, the following concerns are RECOMMENDED to be applied.
The reacting node sending the traffic should be conservative and, for example, first send "probe" messages to learn the overload condition of the overloaded node before converging to any traffic amount/rate decided by the sender. Similar concerns apply in all cases when the overload report times out unless the previous overload report stated 0 percent reduction.

If the reacting node does not receive an OLR in messages sent to the formerly overloaded node then the reacting node SHOULD slowly increase the rate of traffic sent to the overloaded node.

It is suggested that the reacting node decrease the amount of traffic given abatement treatment by 20% each second until the reduction is completely removed and no traffic is given abatement treatment.

The goal of this behavior is to reduce the probability of overload condition thrashing where an immediate transition from 100% reduction to 0% reduction results in the reporting node moving quickly back into an overload condition.

6. Attribute Value Pairs

This section describes the encoding and semantics of the Diameter Overload Indication Attribute Value Pairs (AVPs) defined in this document.

A new application specification can incorporate the overload control mechanism specified in this document by making it mandatory to implement for the application and referencing this specification normatively.
It is the responsibility of the Diameter application designers to define how overload control mechanisms work on that application.

6.1. OC-Supported-Features AVP

The OC-Supported-Features AVP (AVP code TBD1) is of type Grouped and serves two purposes. First, it announces a node's support for the DOIC solution in general. Second, it contains the description of the supported DOIC features of the sending node. The OC-Supported-Features AVP MUST be included in every Diameter request message a DOIC supporting node sends.

   OC-Supported-Features ::= < AVP Header: TBD1 >
                             [ OC-Feature-Vector ]
                           * [ AVP ]

The OC-Feature-Vector sub-AVP is used to announce the DOIC features supported by the DOIC node, in the form of a flag bits field in which each bit announces one feature or capability supported by the node (see Section 6.2). The absence of the OC-Feature-Vector AVP indicates that only the default traffic abatement algorithm described in this specification is supported.

A reacting node includes this AVP to indicate its capabilities to a reporting node. For example, the endpoint (reacting node) may indicate which (future defined) traffic abatement algorithms it supports in addition to the default.

During the message exchange the overload control endpoints express their common set of supported capabilities. The reacting node includes the OC-Supported-Features AVP that announces what it supports. The reporting node that sends the answer also includes the OC-Supported-Features AVP that describes the capabilities it supports. The set of capabilities advertised by the reporting node depends on local policies. At least one of the announced capabilities MUST match. If there is no single matching capability the reacting node MUST act as if it does not implement DOIC and cease inserting any DOIC related AVPs into any Diameter messages with this specific reacting node.

Editor's note: The last sentence conflicts with the last sentence two paragraphs up.
In reality, there will always be at least one matching capability as all nodes supporting DOIC must support the loss algorithm. Suggest removing the last sentence.

6.2. OC-Feature-Vector AVP

The OC-Feature-Vector AVP (AVP code TBD6) is of type Unsigned64 and contains a 64 bit flags field of announced capabilities of a DOIC node. The value of zero (0) is reserved.

The following capabilities are defined in this document:

OLR_DEFAULT_ALGO (0x0000000000000001)

   When this flag is set by the DOIC node it means that the default traffic abatement (loss) algorithm is supported.

6.3. OC-OLR AVP

The OC-OLR AVP (AVP code TBD2) is of type Grouped and contains the information necessary to convey an overload report on an overload condition at the reporting node. The OC-OLR AVP does not explicitly contain all information needed by the reacting node to decide whether a subsequent request must undergo a throttling process with the received reduction percentage. The value of the OC-Report-Type AVP within the OC-OLR AVP indicates which implicit information is relevant for this decision (see Section 6.6). The application the OC-OLR AVP applies to is the same as the Application-Id found in the Diameter message header. The host or realm the OC-OLR AVP concerns is determined from the Origin-Host AVP and/or Origin-Realm AVP found in the encapsulating Diameter command. The OC-OLR AVP is intended to be sent only by a reporting node.

   OC-OLR ::= < AVP Header: TBD2 >
              < OC-Sequence-Number >
              < OC-Report-Type >
              [ OC-Reduction-Percentage ]
              [ OC-Validity-Duration ]
            * [ AVP ]

The OC-Validity-Duration AVP indicates the validity time of the overload report associated with a specific sequence number, measured after reception of the OC-OLR AVP. The validity time MUST NOT be updated after reception of subsequent OC-OLR AVPs with the same sequence number.
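As an illustration of the AVP definitions in Sections 6.1 through 6.3, the following Python sketch models the two grouped AVPs as plain data classes. The class and field names are hypothetical and no Diameter encoding is attempted. Optional AVPs are represented as None when absent, so that a receiver can apply whatever default the corresponding AVP section defines.

```python
from dataclasses import dataclass
from typing import Optional

# Feature bit defined in Section 6.2.
OLR_DEFAULT_ALGO = 0x0000000000000001


@dataclass(frozen=True)
class SupportedFeatures:
    """Hypothetical model of the OC-Supported-Features grouped AVP."""

    feature_vector: Optional[int] = None  # OC-Feature-Vector, None if absent

    def supports_loss(self) -> bool:
        # An absent OC-Feature-Vector means that only the default (loss)
        # algorithm is supported; otherwise the flag bit is checked.
        if self.feature_vector is None:
            return True
        return bool(self.feature_vector & OLR_DEFAULT_ALGO)


@dataclass(frozen=True)
class OverloadReport:
    """Hypothetical model of the OC-OLR grouped AVP."""

    sequence_number: int                        # OC-Sequence-Number (required)
    report_type: int                            # OC-Report-Type (required)
    reduction_percentage: Optional[int] = None  # None if the AVP is absent
    validity_duration: Optional[int] = None     # None if the AVP is absent
```

Keeping the optional fields as None rather than filling in defaults at parse time keeps the model independent of the default values themselves, which are specified per AVP.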
Note that if a Diameter command were to contain multiple OC-OLR AVPs they all MUST have different OC-Report-Type AVP values. OC-OLR AVPs with unknown values SHOULD be silently discarded by reacting nodes and the event SHOULD be logged.

Editor's note: Need to specify what happens when two reports of the same type are received.

6.4. OC-Sequence-Number AVP

The OC-Sequence-Number AVP (AVP code TBD3) is of type Unsigned64. Its usage in the context of overload control is described in Section 4.2.

From the functionality point of view, the OC-Sequence-Number AVP MUST be used as a non-volatile increasing counter for a sequence of overload reports between two DOIC nodes for the same overload occurrence. The sequence number is only required to be unique between two DOIC nodes. Sequence numbers are treated in a uni-directional manner, i.e. two sequence numbers on each direction between two DOIC nodes are not related or correlated.

When generating sequence numbers, the new sequence number MUST be greater than any sequence number in an active overload report previously sent by the reporting node. This property MUST hold over a reboot of the reporting node.

6.5. OC-Validity-Duration AVP

The OC-Validity-Duration AVP (AVP code TBD4) is of type Unsigned32 and indicates in milliseconds the validity time of the overload report. The number of milliseconds is measured after reception of the first OC-OLR AVP with a given value of OC-Sequence-Number AVP. The default value for the OC-Validity-Duration AVP is 5000 (i.e., 5 seconds). When the OC-Validity-Duration AVP is not present in the OC-OLR AVP, the default value applies. Validity duration with values above 86400 (i.e., 24 hours) MUST NOT be used.
Invalid duration values are treated as if the OC-Validity-Duration AVP were not present and result in the default value being used.

Editor's note: There is an open discussion on whether to have an upper limit on the OC-Validity-Duration value, beyond that which can be indicated by an Unsigned32.

A timeout of the overload report has specific concerns that need to be taken into account by the DOIC node acting on the earlier received overload report(s). Section 6.7 discusses the impacts of timeout in the scope of the traffic abatement algorithms.

When a reporting node has recovered from overload, it SHOULD invalidate any existing overload reports in a timely manner. This can be achieved by sending an updated overload report (meaning the OLR contains a new sequence number) with the OC-Validity-Duration AVP value set to zero ("0"). If the overload report is about to expire naturally, the reporting node MAY choose to simply let it do so.

A reacting node MUST invalidate and remove an overload report that expires without an explicit overload report containing an OC-Validity-Duration value set to zero ("0").

6.6. OC-Report-Type AVP

The OC-Report-Type AVP (AVP code TBD5) is of type Enumerated. The value of the AVP describes what the overload report concerns. The following values are initially defined:

0  A host report. The overload treatment should apply to requests for which all of the following conditions are true:

   Either the Destination-Host AVP is present in the request and its value matches the value of the Origin-Host AVP of the received message that contained the OC-OLR AVP; or the Destination-Host is not present in the request but the value of the peer identity associated with the connection used to send the request matches the value of the Origin-Host AVP of the received message that contained the OC-OLR AVP.

   The value of the Destination-Realm AVP in the request matches the value of the Origin-Realm AVP of the received message that contained the OC-OLR AVP.
   The value of the Application-ID in the Diameter Header of the request matches the value of the Application-ID of the Diameter Header of the received message that contained the OC-OLR AVP.

1  A realm report. The overload treatment should apply to requests for which all of the following conditions are true:

   The Destination-Host AVP is absent in the request and the value of the peer identity associated with the connection used to send the request does not match a server that could serve the request.

   The value of the Destination-Realm AVP in the request matches the value of the Origin-Realm AVP of the received message that contained the OC-OLR AVP.

   The value of the Application-ID in the Diameter Header of the request matches the value of the Application-ID of the Diameter Header of the received message that contained the OC-OLR AVP.

Editor's note: There is still an open issue on the definition of Realm reports and on which report types should be supported. There is consensus that host reports should be supported. There is discussion on Realm reports and Realm-Routed-Request reports. The above definition applies to Realm-Routed-Request reports, whereas Realm reports are defined to apply to all requests that match the realm, independent of the presence, absence or value of the Destination-Host AVP.

The default value of the OC-Report-Type AVP is 0 (i.e. the host report).

The OC-Report-Type AVP is envisioned to be useful for situations where a reacting node needs to apply different overload treatments for different overload contexts. For example, the reacting node(s) might need to throttle differently requests sent to a specific server (identified by the Destination-Host AVP in the request) and requests that can be handled by any server in a realm.

6.7. OC-Reduction-Percentage AVP

The OC-Reduction-Percentage AVP (AVP code TBD8) is of type Unsigned32 and describes the percentage of the traffic that the sender is requested to reduce, compared to what it otherwise would send. The OC-Reduction-Percentage AVP applies to the default (loss) algorithm specified in this specification. However, the AVP can be reused for future abatement algorithms, if its semantics fit into the new algorithm.

The value of the OC-Reduction-Percentage AVP is between zero (0) and one hundred (100). Values greater than 100 are ignored. The value of 100 means that all traffic is to be throttled, i.e. the reporting node is under a severe load and ceases to process any new messages. The value of 0 means that the reporting node is in a stable state and has no need for the reacting node to apply any traffic abatement. The default value of the OC-Reduction-Percentage AVP is 0. When the OC-Reduction-Percentage AVP is not present in the overload report, the default value applies.

6.8. Attribute Value Pair flag rules

                                                   +---------+
                                                   |AVP flag |
                                                   |rules    |
                                                   +----+----+
                         AVP   Section             |    |MUST|
 Attribute Name          Code  Defined  Value Type |MUST| NOT|
+--------------------------------------------------+----+----+
|OC-Supported-Features   TBD1  x.x      Grouped    |    | V  |
+--------------------------------------------------+----+----+
|OC-OLR                  TBD2  x.x      Grouped    |    | V  |
+--------------------------------------------------+----+----+
|OC-Sequence-Number      TBD3  x.x      Unsigned64 |    | V  |
+--------------------------------------------------+----+----+
|OC-Validity-Duration    TBD4  x.x      Unsigned32 |    | V  |
+--------------------------------------------------+----+----+
|OC-Report-Type          TBD5  x.x      Enumerated |    | V  |
+--------------------------------------------------+----+----+
|OC-Reduction-Percentage TBD8  x.x      Unsigned32 |    | V  |
+--------------------------------------------------+----+----+
|OC-Feature-Vector       TBD6  x.x      Unsigned64 |    | V  |
+--------------------------------------------------+----+----+

As described in the Diameter base protocol [RFC6733], the M-bit setting for a given AVP is relevant to an application and each command within that application that includes the AVP.

The Diameter overload control AVPs SHOULD always be sent with the M-bit cleared when used within existing Diameter applications to avoid backward compatibility issues. Otherwise, when reused in newly defined Diameter applications, the DOC related AVPs SHOULD have the M-bit set.

7. Error Response Codes

When a DOIC node rejects a Diameter request due to overload, the DOIC node MUST select an appropriate error response code. This determination is made based on the probability of the request succeeding if retried on a different path.

A reporting node rejecting a Diameter request due to an overload condition SHOULD send a DIAMETER_TOO_BUSY error response, if it can assume that the same request may succeed on a different path.
If a reporting node knows or assumes that the same request will not succeed on a different path, a DIAMETER_UNABLE_TO_COMPLY error response SHOULD be used. Retrying would consume valuable resources during an occurrence of overload.

For instance, if the request arrived at the reporting node without a Destination-Host AVP then the reporting node might determine that there is an alternative Diameter node that could successfully process the request and that retrying the transaction would not negatively impact the reporting node. DIAMETER_TOO_BUSY would be sent in this case.

Conversely, if the request arrived at the reporting node with a Destination-Host AVP populated with its own Diameter identity then the reporting node can assume that retrying the request would result in it coming to the same reporting node. DIAMETER_UNABLE_TO_COMPLY would be sent in this case.

A second example is when an agent that supports the DOIC solution is performing the role of a reacting node for a non supporting client. Requests that are rejected as a result of DOIC throttling by the agent in this scenario would generally be rejected with a DIAMETER_UNABLE_TO_COMPLY response code.

8. IANA Considerations

8.1. AVP codes

New AVPs defined by this specification are listed in Section 6. All AVP codes are allocated from the 'Authentication, Authorization, and Accounting (AAA) Parameters' AVP Codes registry.

8.2. New registries

Two new registries are needed under the 'Authentication, Authorization, and Accounting (AAA) Parameters' registry.

Section 6.2 defines a new "Overload Control Feature Vector" registry including the initial assignments. New values can be added into the registry using the Specification Required policy [RFC5226].

Section 6.6 defines a new "Overload Report Type" registry with its initial assignments. New types can be added using the Specification Required policy [RFC5226].

9. Security Considerations

This mechanism gives Diameter nodes the ability to request that downstream nodes send fewer Diameter requests. Nodes do this by exchanging overload reports that directly affect this reduction. This exchange is potentially subject to multiple methods of attack, and has the potential to be used as a Denial-of-Service (DoS) attack vector.

Overload reports may contain information about the topology and current status of a Diameter network. This information is potentially sensitive. Network operators may wish to control disclosure of overload reports to unauthorized parties to avoid its use for competitive intelligence or to target attacks.

Diameter does not include features to provide end-to-end authentication, integrity protection, or confidentiality. This may cause complications when sending overload reports between non-adjacent nodes.

9.1. Potential Threat Modes

The Diameter protocol involves transactions in the form of requests and answers exchanged between clients and servers. These clients and servers may be peers, that is, they may share a direct transport (e.g. TCP or SCTP) connection, or the messages may traverse one or more intermediaries, known as Diameter Agents. Diameter nodes use TLS, DTLS, or IPsec to authenticate peers, and to provide confidentiality and integrity protection of traffic between peers. Nodes can make authorization decisions based on the peer identities authenticated at the transport layer.

When agents are involved, this presents an effectively hop-by-hop trust model. That is, a Diameter client or server can authorize an agent for certain actions, but it must trust that agent to make appropriate authorization decisions about its peers, and so on.

Since confidentiality and integrity protection occur at the transport layer, agents can read, and perhaps modify, any part of a Diameter message, including an overload report.

There are several ways an attacker might attempt to exploit the overload control mechanism.
An unauthorized third party might inject an overload report into the network. If this third party is upstream of an agent, and that agent fails to apply proper authorization policies, downstream nodes may mistakenly trust the report. This attack is at least partially mitigated by the assumption that nodes include overload reports in Diameter answers but not in requests. This requires an attacker to have knowledge of the original request in order to construct a response. Therefore, implementations SHOULD validate that an answer containing an overload report is a properly constructed response to a pending request prior to acting on the overload report.

A similar attack involves an otherwise authorized Diameter node that sends an inappropriate overload report. For example, a server for the realm "example.com" might send an overload report indicating that a competitor's realm "example.net" is overloaded. If other nodes act on the report, they may falsely believe that "example.net" is overloaded, effectively reducing that realm's capacity. Therefore, it's critical that nodes validate that an overload report received from a peer actually falls within that peer's responsibility before acting on the report or forwarding the report to other peers. For example, an overload report from a peer that applies to a realm not handled by that peer is suspect.

An attacker might use the information in an overload report to assist in certain attacks. For example, an attacker could use information about current overload conditions to time a DoS attack for maximum effect, or use subsequent overload reports as a feedback mechanism to learn the results of a previous or ongoing attack.

9.2. Denial of Service Attacks

Diameter overload reports can cause a node to cease sending some or all Diameter requests for an extended period. This makes them a tempting vector for DoS attacks.
Furthermore, since Diameter is almost always used in support of other protocols, a DoS attack on Diameter is likely to impact those protocols as well. Therefore, Diameter nodes MUST NOT honor or forward overload reports from unauthorized or otherwise untrusted sources.

9.3. Non-Compliant Nodes

When a Diameter node sends an overload report, it cannot assume that all nodes will comply. A non-compliant node might continue to send requests with no reduction in load. Requirement 28 [RFC7068] indicates that the overload control solution cannot assume that all Diameter nodes in a network are necessarily trusted, and that malicious nodes not be allowed to take advantage of the overload control mechanism to get more than their fair share of service.

In the absence of an overload control mechanism, Diameter nodes need to implement strategies to protect themselves from floods of requests, and to make sure that a disproportionate load from one source does not prevent other sources from receiving service. For example, a Diameter server might reject a certain percentage of requests from sources that exceed certain limits. Overload control can be thought of as an optimization for such strategies, where downstream nodes never send the excess requests in the first place. However, the presence of an overload control mechanism does not remove the need for these other protection strategies.

9.4. End-to-End Security Issues

The lack of end-to-end security features makes it far more difficult to establish trust in overload reports that originate from non-adjacent nodes. Any agents in the message path may insert or modify overload reports. Nodes must trust that their adjacent peers perform proper checks on overload reports from their peers, and so on, creating a transitive-trust requirement extending for potentially long chains of nodes. Network operators must determine if this transitive trust requirement is acceptable for their deployments.
Nodes supporting Diameter overload control MUST give operators the ability to select which peers are trusted to deliver overload reports, and whether they are trusted to forward overload reports from non-adjacent nodes.

The lack of end-to-end confidentiality protection means that any Diameter agent in the path of an overload report can view the contents of that report. In addition to the requirement to select which peers are trusted to send overload reports, operators MUST be able to select which peers are authorized to receive reports. A node MUST NOT send an overload report to a peer not authorized to receive it. Furthermore, an agent MUST remove any overload reports that might have been inserted by other nodes before forwarding a Diameter message to a peer that is not authorized to receive overload reports.

At the time of this writing, the DIME working group is studying requirements for adding end-to-end security [I-D.ietf-dime-e2e-sec-req] features to Diameter. These features, when they become available, might make it easier to establish trust in non-adjacent nodes for overload control purposes. Readers should be reminded, however, that the overload control mechanism encourages Diameter agents to modify AVPs in, or insert additional AVPs into, existing messages that are originated by other nodes. If end-to-end security is enabled, there is a risk that such modification could violate integrity protection. The details of using any future Diameter end-to-end security mechanism with overload control will require careful consideration, and are beyond the scope of this document.

10. Contributors

The following people contributed substantial ideas, feedback, and discussion to this document:

o Eric McMurry
o Hannes Tschofenig
o Ulrich Wiehe
o Jean-Jacques Trottin
o Maria Cruz Bartolome
o Martin Dolly
o Nirav Salot
o Susan Shishufeng

11. References

11.1.
Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 5226, May 2008. [RFC5905] Mills, D., Martin, J., Burbank, J., and W. Kasch, "Network Time Protocol Version 4: Protocol and Algorithms Specification", RFC 5905, June 2010. [RFC6733] Fajardo, V., Arkko, J., Loughney, J., and G. Zorn, "Diameter Base Protocol", RFC 6733, October 2012. 11.2. Informative References [Cx] 3GPP, , "ETSI TS 129 229 V11.4.0", August 2013. [I-D.ietf-dime-e2e-sec-req] Tschofenig, H., Korhonen, J., Zorn, G., and K. Pillay, "Diameter AVP Level Security: Scenarios and Requirements", draft-ietf-dime-e2e-sec-req-00 (work in progress), September 2013. [PCC] 3GPP, , "ETSI TS 123 203 V11.12.0", December 2013. [RFC4006] Hakala, H., Mattila, L., Koskinen, J-P., Stura, M., and J. Loughney, "Diameter Credit-Control Application", RFC 4006, August 2005. [RFC5729] Korhonen, J., Jones, M., Morand, L., and T. Tsou, "Clarifications on the Routing of Diameter Requests Based on the Username and the Realm", RFC 5729, December 2009. [RFC7068] McMurry, E. and B. Campbell, "Diameter Overload Control Requirements", RFC 7068, November 2013. [S13] 3GPP, , "ETSI TS 129 272 V11.9.0", December 2012. Appendix A. Issues left for future specifications The base solution for the overload control does not cover all possible use cases. A number of solution aspects were intentionally left for future specification and protocol work. A.1. Additional traffic abatement algorithms This specification describes only means for a simple loss based algorithm. Future algorithms can be added using the designed solution extension mechanism. The new algorithms need to be registered with IANA. See Sections 6.1 and 8 for the required IANA steps. A.2. Agent Overload This specification focuses on Diameter endpoint (server or client) overload. 
   A separate extension will be required to outline the handling of
   the case of agent overload.

A.3.  New Error Diagnostic AVP

   The proposal was made to add a new Error Diagnostic AVP to
   supplement the error responses to be able to indicate that overload
   was the reason for the rejection of the message.

Appendix B.  Deployment Considerations

   Non-supporting agents

      Due to the way that realm-routed requests are handled in Diameter
      networks, with the server selection for the request done by an
      agent, it is recommended that deployments enable all agents that
      do server selection to support the DOIC solution prior to
      enabling the DOIC solution in the Diameter network.

   Topology hiding interactions

      There exist proxies that implement what is referred to as
      Topology Hiding.  This can include cases where the agent modifies
      the Origin-Host in answer messages.  The behavior of the DOIC
      solution is not well understood when this happens.  As such, the
      DOIC solution does not address this scenario.

Appendix C.  Requirements Conformance Analysis

   This section contains the result of an analysis of the DOIC
   solution's conformance to the requirements defined in [RFC7068].

   To be completed.

Appendix D.  Considerations for Applications Integrating the DOIC
             Solution

   This section outlines considerations to be taken into account when
   integrating the DOIC solution into Diameter applications.

D.1.  Application Classification

   The following is a classification of Diameter applications and
   request types.  This discussion is meant to document factors that
   play into decisions made by the Diameter identity responsible for
   handling overload reports.

   Section 8.1 of [RFC6733] defines two state machines that imply two
   types of applications, session-less and session-based applications.
   The primary difference between these types of applications is the
   lifetime of Session-Ids.

   For session-based applications, the Session-Id is used to tie
   multiple requests into a single session.

   The Credit-Control application defined in [RFC4006] is an example of
   a Diameter session-based application.

   In session-less applications, the lifetime of the Session-Id is a
   single Diameter transaction, i.e. the session is implicitly
   terminated after a single Diameter transaction and a new Session-Id
   is generated for each Diameter request.

   For the purposes of this discussion, session-less applications are
   further divided into two types of applications:

   Stateless applications:

      Requests within a stateless application have no relationship to
      each other.  The 3GPP defined S13 application is an example of a
      stateless application [S13], where only a Diameter command is
      defined between a client and a server and no state is maintained
      between two consecutive transactions.

   Pseudo-session applications:

      Applications that do not rely on the Session-Id AVP for
      correlation of application messages related to the same session
      but use other session-related information in the Diameter
      requests for this purpose.  The 3GPP defined Cx application [Cx]
      is an example of a pseudo-session application.

   The handling of overload reports must take the type of application
   into consideration, as discussed in Appendix D.2.

D.2.  Application Type Overload Implications

   This section discusses considerations for mitigating overload
   reported by a Diameter entity.  This discussion focuses on the type
   of application.  Appendix D.3 discusses considerations for handling
   various request types when the target server is known to be in an
   overloaded state.

   These discussions assume that the strategy for mitigating the
   reported overload is to reduce the overall workload sent to the
   overloaded entity.  The concept of applying overload treatment to
   requests targeted for an overloaded Diameter entity is inherent to
   this discussion.  The method used to reduce offered load is not
   specified here but could include routing requests to another
   Diameter entity known to be able to handle them, or it could mean
   rejecting certain requests.  For a Diameter agent, rejecting
   requests will usually mean generating appropriate Diameter error
   responses.  For a Diameter client, rejecting requests will depend
   upon the application.  For example, it could mean giving an
   indication to the entity requesting the Diameter service that the
   network is busy and to try again later.

   Stateless applications:

      By definition there is no relationship between individual
      requests in a stateless application.  As a result, when a request
      is sent or relayed to an overloaded Diameter entity (either a
      Diameter Server or a Diameter Agent), the sending or relaying
      entity can choose to apply the overload treatment to any request
      targeted for the overloaded entity.

   Pseudo-session applications:

      For pseudo-session applications, there is an implied ordering of
      requests.  As a result, decisions about which requests towards an
      overloaded entity to reject could take the command code of the
      request into consideration.  This generally means that
      transactions later in the sequence of transactions should be
      given more favorable treatment than messages earlier in the
      sequence.  This is because more work has already been done by the
      Diameter network for those transactions that occur later in the
      sequence.  Rejecting them could result in increasing the load on
      the network as the transactions earlier in the sequence might
      also need to be repeated.

   Session-based applications:

      Overload handling for session-based applications must take into
      consideration the work load associated with setting up and
      maintaining a session.  As such, the entity sending requests
      towards an overloaded Diameter entity for a session-based
      application might tend to reject new session requests prior to
      rejecting intra-session requests.  In addition, session ending
      requests might be given a lower probability of being rejected as
      rejecting session ending requests could result in session status
      being out of sync between the Diameter clients and servers.
      Application designers that would decide to reject mid-session
      requests will need to consider whether the rejection invalidates
      the session and any resulting session clean-up procedures.

D.3.  Request Transaction Classification

   Independent Request:

      An independent request is not correlated to any other requests
      and, as such, the lifetime of the Session-Id is constrained to an
      individual transaction.

   Session-Initiating Request:

      A session-initiating request is the initial message that
      establishes a Diameter session.  The ACR message defined in
      [RFC6733] is an example of a session-initiating request.

   Correlated Session-Initiating Request:

      There are cases when multiple session-initiating requests must be
      correlated and managed by the same Diameter server.  This is
      notably the case in the 3GPP PCC architecture [PCC], where
      multiple apparently independent Diameter application sessions are
      actually correlated and must be handled by the same Diameter
      server.

   Intra-Session Request:

      An intra-session request is a request that uses the same
      Session-Id as the one used in a previous request.  An
      intra-session request generally needs to be delivered to the
      server that handled the session creating request for the session.
      The STR message defined in [RFC6733] is an example of an
      intra-session request.

   Pseudo-Session Requests:

      Pseudo-session requests are independent requests and do not use
      the same Session-Id but are correlated by other session-related
      information contained in the request.  There exist Diameter
      applications that define an expected ordering of transactions.
      This sequencing of independent transactions results in a pseudo
      session.  The AIR, MAR and SAR requests in the 3GPP defined Cx
      [Cx] application are examples of pseudo-session requests.

D.4.  Request Type Overload Implications

   The request classes identified in Appendix D.3 have implications on
   decisions about which requests should be throttled first.  The
   following list of request treatment regarding throttling is provided
   as guidelines for application designers when implementing the
   Diameter overload control mechanism described in this document.  The
   exact behavior regarding throttling is a matter of local policy,
   unless specifically defined for the application.

   Independent requests:

      Independent requests can generally be given equal treatment when
      making throttling decisions, unless otherwise indicated by
      application requirements or local policy.

   Session-initiating requests:

      Session-initiating requests often represent more work than
      independent or intra-session requests.  Moreover,
      session-initiating requests are typically followed by other
      session-related requests.  Since the main objective of the
      overload control is to reduce the total number of requests sent
      to the overloaded entity, throttling decisions might favor
      allowing intra-session requests over session-initiating requests.
      In the absence of local policies or application specific
      requirements to the contrary, individual session-initiating
      requests can be given equal treatment when making throttling
      decisions.

   Correlated session-initiating requests:

      A request that results in a new binding, where the binding is
      used for routing of subsequent session-initiating requests to the
      same server, represents more work load than other requests.  As
      such, these requests might be throttled more frequently than
      other request types.

   Pseudo-session requests:

      Throttling decisions for pseudo-session requests can take into
      consideration where individual requests fit into the overall
      sequence of requests within the pseudo session.  Requests that
      are earlier in the sequence might be throttled more aggressively
      than requests that occur later in the sequence.

   Intra-session requests:

      There are two types of intra-session requests, requests that
      terminate a session and the remainder of intra-session requests.
      Implementors and operators may choose to throttle
      session-terminating requests less aggressively in order to
      gracefully terminate sessions, allow clean-up of the related
      resources (e.g. session state) and avoid the need for additional
      intra-session requests.  Favoring session-termination requests
      may reduce the session management impact on the overloaded
      entity.  The default handling of other intra-session requests
      might be to treat them equally when making throttling decisions.
      There might also be application level considerations whether some
      request types are favored over others.

Authors' Addresses

   Jouni Korhonen (editor)
   Broadcom
   Porkkalankatu 24
   Helsinki  FIN-00180
   Finland

   Email: jouni.nospam@gmail.com


   Steve Donovan (editor)
   Oracle
   7460 Warren Parkway
   Frisco, Texas  75034
   United States

   Email: srdonovan@usdonovans.com


   Ben Campbell
   Oracle
   7460 Warren Parkway
   Frisco, Texas  75034
   United States

   Email: ben@nostrum.com


   Lionel Morand
   Orange Labs
   38/40 rue du General Leclerc
   Issy-Les-Moulineaux Cedex 9  92794
   France

   Phone: +33145296257
   Email: lionel.morand@orange.com
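Illustrative sketch (non-normative)

   The throttling guidelines for the request classes in the "Request
   Type Overload Implications" appendix can be illustrated with a small
   sketch.  It assumes a reacting node that has been asked to shed a
   given percentage of traffic and that wants to spread the reduction
   unevenly across request classes.  The class names, the weight
   values, and the functions drop_probabilities() and should_drop() are
   hypothetical and are not defined by this specification; the actual
   behavior is a matter of local policy.  Pseudo-session requests are
   treated as a single class here, ignoring their position in the
   sequence.

```python
import random

# Hypothetical local-policy weights (NOT defined by this document).
# A larger weight means the class is throttled more aggressively.
DEFAULT_WEIGHTS = {
    "correlated-session-initiating": 2.0,
    "session-initiating": 1.5,
    "independent": 1.0,
    "pseudo-session": 1.0,
    "intra-session": 0.75,
    "session-terminating": 0.25,
}


def drop_probabilities(reduction_percentage, traffic_mix, weights=None):
    """Return a per-class drop probability.

    The traffic-weighted average of the returned probabilities
    approximates reduction_percentage / 100, while classes with a
    larger weight receive a proportionally larger share of the drops.
    traffic_mix maps each class to its share of the offered load.
    """
    weights = weights or DEFAULT_WEIGHTS
    target = reduction_percentage / 100.0
    total = sum(traffic_mix.values())
    shares = {cls: share / total for cls, share in traffic_mix.items()}

    def overall(scale):
        # Overall fraction dropped when each class uses scale * weight,
        # capped at 1.0 because a probability cannot exceed 100%.
        return sum(s * min(1.0, scale * weights[cls])
                   for cls, s in shares.items())

    # overall() is monotonic in scale, so bisection finds the scale
    # that meets the target even when some classes hit the cap.
    low, high = 0.0, 1.0 / min(weights[cls] for cls in shares)
    for _ in range(60):
        mid = (low + high) / 2.0
        if overall(mid) < target:
            low = mid
        else:
            high = mid
    return {cls: min(1.0, high * weights[cls]) for cls in shares}


def should_drop(request_class, probabilities, rng=random.random):
    """Decide whether to throttle one request of the given class."""
    return rng() < probabilities[request_class]
```

   With a 30 percent reduction and a mix dominated by session-initiating
   and intra-session requests, this sketch drops session-terminating
   requests least often and correlated session-initiating requests most
   often, while the overall reduction stays close to the requested
   value.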