Diameter Maintenance and Extensions                     J. Korhonen, Ed.
(DIME)                                                          Broadcom
Internet-Draft                                                S. Donovan
Intended status: Standards Track                             B. Campbell
Expires: June 20, 2014                                            Oracle
                                                               L. Morand
                                                             Orange Labs
                                                       December 17, 2013

                Diameter Overload Indication Conveyance
                      draft-ietf-dime-ovli-01.txt

Abstract

   This specification documents a Diameter Overload Control (DOC) base solution and the dissemination of the overload report information.

Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 20, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
   Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology and Abbreviations
   3.  Solution Overview
     3.1.  Architectural Assumptions
       3.1.1.  Application Classification
       3.1.2.  Application Type Overload Implications
       3.1.3.  Request Transaction Classification
       3.1.4.  Request Type Overload Implications
       3.1.5.  Diameter Agent Behaviour
       3.1.6.  Simplified Example Architecture
     3.2.  Conveyance of the Overload Indication
       3.2.1.  DOIC Capability Discovery
     3.3.  Overload Condition Indication
   4.  Attribute Value Pairs
     4.1.  OC-Supported-Features AVP
     4.2.  OC-Feature-Vector AVP
     4.3.  OC-OLR AVP
     4.4.  OC-Sequence-Number AVP
     4.5.  OC-Validity-Duration AVP
     4.6.  OC-Report-Type AVP
     4.7.  OC-Reduction-Percentage AVP
     4.8.  Attribute Value Pair flag rules
   5.  Overload Control Operation
     5.1.  Overload Control Endpoints
     5.2.  Piggybacking Principle
     5.3.  Capability Announcement
       5.3.1.  Reacting Node Endpoint Considerations
       5.3.2.  Reporting Node Endpoint Considerations
     5.4.  Protocol Extensibility
     5.5.  Overload Report Processing
       5.5.1.  Overload Control State
       5.5.2.  Reacting Node Considerations
       5.5.3.  Reporting Node Considerations
   6.  Transport Considerations
   7.  IANA Considerations
     7.1.  AVP codes
     7.2.  New registries
   8.  Security Considerations
     8.1.  Potential Threat Modes
     8.2.  Denial of Service Attacks
     8.3.  Non-Compliant Nodes
     8.4.  End-to End-Security Issues
   9.  Contributors
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  Issues left for future specifications
     A.1.  Additional traffic abatement algorithms
     A.2.  Agent Overload
     A.3.  DIAMETER_TOO_BUSY clarifications
   Appendix B.  Examples
     B.1.  Mix of Destination-Realm routed requests and Destination-Host routed requests
   Authors' Addresses

1.  Introduction

   This specification defines a base solution for Diameter Overload Control (DOC). The requirements for the solution are described and discussed in the corresponding design requirements document [RFC7068]. Note that the overload control solution defined in this specification does not address all the requirements listed in [RFC7068]. A number of overload control related features are left for future specifications. See Appendix A for a more detailed discussion of those.

   The solution defined in this specification addresses Diameter overload control between two endpoints (see Section 5.1). Furthermore, the solution is designed to apply to existing and future Diameter applications, requires no changes to the Diameter base protocol [RFC6733] and is deployable in environments where some Diameter nodes do not implement the Diameter overload control solution defined in this specification.

2.  Terminology and Abbreviations

   Server Farm

      A set of Diameter servers that can handle any request for a given set of Diameter applications. While these servers support the same set of applications, they do not necessarily all have the same capacity. An individual server farm might also support a subset of the users for a Diameter Realm. A server farm may host a single realm or multiple realms.

   Diameter Routing:

      Diameter Routing between non-adjacent nodes relies on the Destination-Realm AVP to determine the realm in which the request needs to be processed. A Destination-Host AVP may also be present in the request to address a specific server inside the Diameter realm. This function is defined in [RFC6733]. However, it is possible to enhance the routing decisions with application level knowledge, as is done in 3GPP PCC [3GPP.23.203] and NAI-based source routing [RFC5729].

   Diameter layer Load Balancing:

      Diameter layer load balancing allows Diameter requests to be distributed across the set of servers. Definition of this function is outside the scope of this document.

   Topology Hiding:

      Topology Hiding is loosely defined as ensuring that no Diameter topology information about a Diameter network can be discovered from Diameter messages sent outside a predefined boundary (typically an administrative domain). This includes obfuscating identifiers and address information of Diameter entities in the Diameter network. It can also include hiding the number of various Diameter entities in the Diameter network. Identifying information can occur in many Diameter Attribute-Value Pairs (AVPs), including Origin-Host, Destination-Host, Route-Record, Proxy-Info, Session-Id and other AVPs.

   Throttling:

      Throttling is the reduction of the number of requests sent to an entity. Throttling can include a client dropping requests, or an agent rejecting requests with appropriate error responses. Clients and agents can also choose to redirect throttled requests to some other entity or entities capable of handling them.

   Reporting Node

      A Diameter node that generates an overload report. (This may or may not be the actually overloaded node.)

   Reacting Node

      A Diameter node that consumes and acts upon a report.
      Note that "act upon" does not necessarily mean the reacting node applies an abatement algorithm; it might decide to delegate that downstream, in which case it also becomes a "reporting node".

   OLR

      Overload Report.

3.  Solution Overview

3.1.  Architectural Assumptions

   This section describes the high-level architectural and semantic assumptions that underlie the Diameter Overload Control Mechanism.

3.1.1.  Application Classification

   The following is a classification of Diameter applications and requests. This discussion is meant to document factors that play into decisions made by the Diameter identity responsible for handling overload reports.

   Section 8.1 of [RFC6733] defines two state machines that imply two types of applications, session-less and session-based applications. The primary difference between these types of applications is the lifetime of Session-Ids.

   For session-based applications, the Session-Id is used to tie multiple requests into a single session.

   In session-less applications, the lifetime of the Session-Id is a single Diameter transaction, i.e. the session is implicitly terminated after a single Diameter transaction and a new Session-Id is generated for each Diameter request.

   For the purposes of this discussion, session-less applications are further divided into two types of applications:

   Stateless applications:

      Requests within a stateless application have no relationship to each other. The 3GPP defined S13 application is an example of a stateless application [3GPP.29.272], where only a Diameter command is defined between a client and a server and no state is maintained between two consecutive transactions.

   Pseudo-session applications:

      Applications that do not rely on the Session-Id AVP for correlation of application messages related to the same session but use other session-related information in the Diameter requests for this purpose. The 3GPP defined Cx application [3GPP.29.229] is an example of a pseudo-session application.

   The Credit-Control application defined in [RFC4006] is an example of a Diameter session-based application.

   The handling of overload reports must take the type of application into consideration, as discussed in Section 3.1.2.

3.1.2.  Application Type Overload Implications

   This section discusses considerations for mitigating overload reported by a Diameter entity.
   This discussion focuses on the type of application. Section 3.1.3 discusses considerations for handling various request types when the target server is known to be in an overloaded state.

   These discussions assume that the strategy for mitigating the reported overload is to reduce the overall workload sent to the overloaded entity. The concept of applying overload treatment to requests targeted for an overloaded Diameter entity is inherent to this discussion. The method used to reduce offered load is not specified here but could include routing requests to another Diameter entity known to be able to handle them, or it could mean rejecting certain requests. For a Diameter agent, rejecting requests will usually mean generating appropriate Diameter error responses. For a Diameter client, rejecting requests will depend upon the application. For example, it could mean giving an indication to the entity requesting the Diameter service that the network is busy and to try again later.

   Stateless applications:

      By definition there is no relationship between individual requests in a stateless application. As a result, when a request is sent or relayed to an overloaded Diameter entity - either a Diameter Server or a Diameter Agent - the sending or relaying entity can choose to apply the overload treatment to any request targeted for the overloaded entity.

   Pseudo-session applications:

      For pseudo-session applications, there is an implied ordering of requests. As a result, decisions about which requests towards an overloaded entity to reject could take the command code of the request into consideration.
      This generally means that transactions later in the sequence of transactions should be given more favorable treatment than transactions earlier in the sequence. This is because more work has already been done by the Diameter network for those transactions that occur later in the sequence. Rejecting them could result in increasing the load on the network as the transactions earlier in the sequence might also need to be repeated.

   Session-based applications:

      Overload handling for session-based applications must take into consideration the work load associated with setting up and maintaining a session. As such, the entity sending requests towards an overloaded Diameter entity for a session-based application might tend to reject new session requests prior to rejecting intra-session requests. In addition, session ending requests might be given a lower probability of being rejected as rejecting session ending requests could result in session status being out of sync between the Diameter clients and servers. Application designers that decide to reject mid-session requests will need to consider whether the rejection invalidates the session and any resulting session clean-up procedures.

3.1.3.  Request Transaction Classification

   Independent Request:

      An independent request is not correlated to any other requests and, as such, the lifetime of the Session-Id is constrained to an individual transaction.

   Session-Initiating Request:

      A session-initiating request is the initial message that establishes a Diameter session. The ACR message defined in [RFC6733] is an example of a session-initiating request.

   Correlated Session-Initiating Request:

      There are cases when multiple session-initiating requests must be correlated and managed by the same Diameter server.
      It is notably the case in the 3GPP PCC architecture [3GPP.23.203], where multiple apparently independent Diameter application sessions are actually correlated and must be handled by the same Diameter server.

   Intra-Session Request:

      An intra-session request is a request that uses the same Session-Id as the one used in a previous request. An intra-session request generally needs to be delivered to the server that handled the session creating request for the session. The STR message defined in [RFC6733] is an example of an intra-session request.

   Pseudo-Session Requests:

      Pseudo-session requests are independent requests and do not use the same Session-Id but are correlated by other session-related information contained in the request. There exist Diameter applications that define an expected ordering of transactions. This sequencing of independent transactions results in a pseudo session. The AIR, MAR and SAR requests in the 3GPP defined Cx application are examples of pseudo-session requests.

3.1.4.  Request Type Overload Implications

   The request classes identified in Section 3.1.3 have implications on decisions about which requests should be throttled first. The following list of request treatment regarding throttling is provided as guidelines for application designers when implementing the Diameter overload control mechanism described in this document. Exact behavior regarding throttling must be defined per application.
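
   As an illustration only, the request classes of Section 3.1.3 could be represented in an implementation roughly as follows. This is a minimal sketch in Python and not part of the specification: the RequestInfo fields and the classify() function are hypothetical names, and a real application would have to derive these inputs from its own command codes and session state.

```python
from dataclasses import dataclass
from enum import Enum, auto


class RequestClass(Enum):
    INDEPENDENT = auto()
    SESSION_INITIATING = auto()
    CORRELATED_SESSION_INITIATING = auto()
    INTRA_SESSION = auto()
    PSEUDO_SESSION = auto()


@dataclass(frozen=True)
class RequestInfo:
    # Hypothetical inputs; the document does not define how they are obtained.
    session_id_seen_before: bool    # same Session-Id as a previous request
    creates_session: bool           # initial message establishing a session
    bound_to_other_sessions: bool   # must reach the server of related sessions
    correlated_by_other_info: bool  # tied to other requests without Session-Id


def classify(req: RequestInfo) -> RequestClass:
    """Map a request onto one of the classes listed in Section 3.1.3."""
    if req.session_id_seen_before:
        return RequestClass.INTRA_SESSION
    if req.creates_session:
        if req.bound_to_other_sessions:
            return RequestClass.CORRELATED_SESSION_INITIATING
        return RequestClass.SESSION_INITIATING
    if req.correlated_by_other_info:
        return RequestClass.PSEUDO_SESSION
    return RequestClass.INDEPENDENT
```

   An application-specific throttling policy could then key its decisions on the returned class.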
   Independent requests:

      Independent requests can be given equal treatment when making throttling decisions.

   Session-initiating requests:

      Session-initiating requests represent more work than independent or intra-session requests. Moreover, session-initiating requests are typically followed by other session-related requests. As such, as the main objective of the overload control is to reduce the total number of requests sent to the overloaded entity, throttling decisions might favor allowing intra-session requests over session-initiating requests. Individual session-initiating requests can be given equal treatment when making throttling decisions.

   Correlated session-initiating requests:

      A request that results in a new binding, where the binding is used for routing of subsequent session-initiating requests to the same server, represents more work load than other requests. As such, these requests might be throttled more frequently than other request types.

   Pseudo-session requests:

      Throttling decisions for pseudo-session requests can take into consideration where individual requests fit into the overall sequence of requests within the pseudo session. Requests that are earlier in the sequence might be throttled more aggressively than requests that occur later in the sequence.

   Intra-session requests:

      There are two classes of intra-session requests. The first class consists of requests that terminate a session. The second one contains the set of requests that are used by the Diameter client and server to maintain the ongoing session state. Session terminating requests should be throttled less aggressively in order to gracefully terminate sessions, allow clean-up of the related resources (e.g.
      session state) and get rid of the need for other intra-session requests, reducing the session management impact on the overloaded entity. The default handling of other intra-session requests might be to treat them equally when making throttling decisions. There might also be application level considerations whether some request types are favored over others.

3.1.5.  Diameter Agent Behaviour

   In the context of the Diameter Overload Indication Conveyance (DOIC) and reacting to the overload information, the functional behaviour of Diameter agents in front of servers, especially Diameter proxies, needs to be common. This is important because agents may actively participate in the handling of overload conditions. For example, they may make intelligent next hop selection decisions based on overload conditions, or aggregate overload information to be disseminated downstream. Diameter agents may have other deployment related tasks that are not defined in the Diameter base protocol [RFC6733]. These include, among other tasks, topology hiding, or an agent acting as a Server Front End (SFE) for a farm of Diameter servers.

   Since the solution defined in this specification must not break the Diameter base protocol [RFC6733] at any time, great care has to be taken not to assume functionality from the Diameter agents that would break base protocol behavior, or to assume agent functionality beyond the Diameter base protocol. Effectively this means the following from a Diameter agent:

   o  If a Diameter agent presents itself as the "end node", as an agent acting as a topology hiding SFE, the agent is the final destination of requests initiated by Diameter clients, the original source for the corresponding answers and server-initiated requests. As a consequence, the DOIC mechanism MUST NOT leak information of the Diameter nodes behind it. This requirement means that such a Diameter agent acts as a back-to-back-agent for DOIC purposes. How the Diameter agent in this case appears to the Diameter servers in the farm is specific to the implementation and deployment within the realm in which the Diameter agent is deployed.

   o  If the Diameter agent does not impersonate the servers behind it, the Diameter dialogue is established between clients and servers and any overload information received by a client would be from the server identified by the Origin-Host identity contained in the Diameter message.

3.1.6.  Simplified Example Architecture

   Figure 1 illustrates the simplified architecture for Diameter overload information conveyance. See Section 5.1 for more discussion and details on how different Diameter nodes fit into the architecture from the DOIC point of view.

                  Realm X                     Same or other Realms
   <--------------------------------------> <---------------------->

   +--^-----+                 : (optional) :
   |Diameter|                 :            :
   |Server A|--+     .--.     : +---^----+ :     .--.
   +--------+  |   _(    `.   : |Diameter| :   _(    `.   +---^----+
               +--(        )--:-|  Agent |-:--(        )--|Diameter|
   +--------+  | ( `  .  )  ) : +-----^--+ : ( `  .  )  ) | Client |
   |Diameter|--+  `--(___.-'  :            :  `--(___.-'  +-----^--+
   |Server B|                 :            :
   +---^----+                 :            :

                       End-to-end Overload Indication
         1)  <----------------------------------------------->
                           Diameter Application Y

              Overload Indication A    Overload Indication A'
         2)  <----------------------> <---------------------->
              standard base protocol   standard base protocol

   Figure 1: Simplified architecture choices for overload indication delivery

   In Figure 1, the Diameter overload indication can be conveyed (1) end-to-end between servers and clients or (2) between servers and the Diameter agent inside the realm and then between the Diameter agent and the clients, when the Diameter agent is acting as a back-to-back-agent for DOIC purposes.

3.2.  Conveyance of the Overload Indication

   The following sections describe new Diameter AVPs used for sending overload reports, and for declaring support for certain DOIC features.

3.2.1.  DOIC Capability Discovery

   Support of DOIC may be specified as part of the functionality supported by a new Diameter application. In this way, support of the considered Diameter application (discovered during the capabilities exchange phase as defined in the Diameter base protocol [RFC6733]) indicates implicit support of the DOIC mechanism.

   When the DOIC mechanism is introduced in existing Diameter applications, a specific capability discovery mechanism is required.
   The "DOIC capability discovery mechanism" is based on the presence of specific optional AVPs in the Diameter messages, such as the OC-Supported-Features AVP (see Section 4.1). Although the OC-Supported-Features AVP can be used to advertise a certain set of new or existing Diameter overload control capabilities, it is not a versioning solution per se; however, it can be used to achieve the same result.

   From the Diameter overload control functionality point of view, the "Reacting node" is the requester of the overload report information and the "Reporting node" is the provider of the overload report. The OC-Supported-Features AVP in the request message is always interpreted as an announcement of "DOIC supported capabilities". The OC-Supported-Features AVP in the answer is also interpreted as a report of "DOIC supported capabilities" and at least one of the supported capabilities MUST be common with the "Reacting node" (see Section 4.1).

3.3.  Overload Condition Indication

   Diameter nodes can request a reduction in offered load by indicating an overload condition in the form of an overload report. The overload report contains information about how much load should be reduced, and may contain other information about the overload condition. This information is conveyed in Diameter Attribute Value Pairs (AVPs).
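
   As a rough illustration of what "reducing offered load" can mean for a node that receives an overload report, the sketch below withholds a requested fraction of requests at random. It is an assumption made purely for illustration and not behaviour defined in this section: the function names, the percentage parameter and the random-drop strategy are all hypothetical.

```python
import random


def should_throttle(reduction_percentage: int, rng: random.Random) -> bool:
    """Return True if this request should be withheld from the overloaded node."""
    if not 0 <= reduction_percentage <= 100:
        raise ValueError("reduction_percentage must be between 0 and 100")
    # rng.random() is uniform in [0.0, 1.0), so 0 never throttles and 100 always does.
    return rng.random() * 100 < reduction_percentage


def offered_load(requests: int, reduction_percentage: int, seed: int = 0) -> int:
    """Count how many of `requests` would still be sent after throttling."""
    rng = random.Random(seed)
    return sum(
        1 for _ in range(requests)
        if not should_throttle(reduction_percentage, rng)
    )
```

   With a requested reduction of 0 every request is still sent, and with 100 none is; values in between reduce the offered load only on average.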
Certain new AVPs may also be used to declare certain DOIC capabilities and extensions.

4.  Attribute Value Pairs

This section describes the encoding and semantics of the Diameter Overload Indication Attribute Value Pairs (AVPs) defined in this document.

4.1.  OC-Supported-Features AVP

The OC-Supported-Features AVP (AVP code TBD1) is of type Grouped and serves two purposes. First, it announces a node's support for DOIC in general. Second, it contains the description of the supported DOIC features of the sending node. The OC-Supported-Features AVP SHOULD be included in every Diameter message a DOIC supporting node sends (and intends to use for DOIC purposes).

      OC-Supported-Features ::= < AVP Header: TBD1 >
                                < OC-Sequence-Number >
                                [ OC-Feature-Vector ]
                              * [ AVP ]

The OC-Sequence-Number AVP is used to indicate whether the contents of the OC-Supported-Features AVP have changed since the last time the node included the OC-Supported-Features AVP (see Section 4.4). Although sending the OC-Sequence-Number AVP is mandatory in the OC-Supported-Features AVP, the receiving node MAY always choose to ignore the sequence number if it can determine the feature support changes otherwise.

The OC-Feature-Vector sub-AVP is used to announce the DOIC features supported by the endpoint, in the form of a flag bits field in which each bit announces one feature or capability supported by the node (see Section 4.2). The absence of the OC-Feature-Vector AVP indicates that only the default traffic abatement algorithm described in this specification is supported.

A reacting node includes this AVP to indicate its capabilities to a reporting node.
For example, the endpoint (reacting node) may indicate which (future defined) traffic abatement algorithms it supports in addition to the default.

During the message exchange the overload control endpoints express their common set of supported capabilities. The reacting node includes the OC-Supported-Features AVP that announces what it supports. The reporting node that sends the answer also includes the OC-Supported-Features AVP that describes the capabilities it supports. The set of capabilities advertised by the reporting node depends on local policies. At least one of the announced capabilities MUST match mutually. If there is no single matching capability, the reacting node MUST act as if it does not implement DOIC and cease inserting any DOIC related AVPs into any Diameter messages exchanged with this specific reporting node.

4.2.  OC-Feature-Vector AVP

The OC-Feature-Vector AVP (AVP code TBD6) is of type Unsigned64 and contains a 64-bit flags field of announced capabilities of an overload control endpoint. The value of zero (0) is reserved.

The following capabilities are defined in this document:

   OLR_DEFAULT_ALGO (0x0000000000000001)
      When this flag is set by the overload control endpoint, it means that the default traffic abatement (loss) algorithm is supported.

4.3.  OC-OLR AVP

The OC-OLR AVP (AVP code TBD2) is of type Grouped and contains the necessary information to convey an overload report. The OC-OLR AVP does not contain explicit information about which application it applies to, who inserted the AVP, or whom the specific OC-OLR AVP concerns. This information is learned implicitly from the encapsulating Diameter message/command. The application the OC-OLR AVP applies to is the same as the Application-Id found in the Diameter message header. The identity the OC-OLR AVP concerns is determined from the Origin-Host AVP (and the Origin-Realm AVP as well) found in the encapsulating Diameter command. The OC-OLR AVP is intended to be sent only by a reporting node.

      OC-OLR ::= < AVP Header: TBD2 >
                 < OC-Sequence-Number >
                 [ OC-Report-Type ]
                 [ OC-Reduction-Percentage ]
                 [ OC-Validity-Duration ]
               * [ AVP ]

The OC-Sequence-Number AVP indicates the "freshness" of the OC-OLR AVP. It is possible to replay the same OC-OLR AVP multiple times between the overload control endpoints. However, when the OC-OLR AVP content changes, or the sending endpoint otherwise wants the receiving endpoint to update its overload control information, the OC-Sequence-Number AVP MUST contain a new value that is greater than the previously received one. The receiver SHOULD discard an OC-OLR AVP with a sequence number that is less than the previously received one.

Note that if a Diameter command were to contain multiple OC-OLR AVPs, they all MUST have different OC-Report-Type AVP values. OC-OLR AVPs with unknown values SHOULD be silently discarded and the event SHOULD be logged.

The OC-OLR AVP can be expanded with optional sub-AVPs only if a legacy implementation can safely ignore them without breaking backward compatibility for the report handling semantics implied by the given OC-Report-Type AVP value. If the new sub-AVPs imply new semantics for the report handling, then a new OC-Report-Type AVP value MUST be defined.

4.4.  OC-Sequence-Number AVP

The OC-Sequence-Number AVP (AVP code TBD3) is of type Time. Its usage in the context of the overload control is described in Sections 4.1 and 4.3.

From the functionality point of view, the OC-Sequence-Number AVP MUST be used as a non-volatile increasing counter between two overload control endpoints (neglecting the fact that the content of the AVP is a 64-bit NTP timestamp [RFC5905]). The sequence number is only required to be unique between two overload control endpoints. Sequence numbers are treated in a uni-directional manner, i.e. the two sequence numbers, one in each direction between two endpoints, are not related or correlated.

When generating sequence numbers, the new sequence number MUST be greater than any sequence number previously seen between two endpoints within a time window that tolerates the wraparound of the NTP timestamp (i.e. approximately 68 years).

4.5.  OC-Validity-Duration AVP

The OC-Validity-Duration AVP (AVP code TBD4) is of type Unsigned32 and describes the number of seconds the "new and fresh" OC-OLR AVP and its content are valid since the reception of the new OC-OLR AVP. The default value for the OC-Validity-Duration AVP is 5 (i.e., 5 seconds). When the OC-Validity-Duration AVP is not present in the OC-OLR AVP, the default value applies.
Validity duration values of 0 (i.e., 0 seconds) and above 86400 (i.e., 24 hours) MUST NOT be used. Invalid validity duration values are treated as if the OC-Validity-Duration AVP were not present.

A timeout of the overload report has specific concerns that need to be taken into account by the endpoint acting on the earlier received overload report(s). Section 4.7 discusses the impacts of timeout in the scope of the traffic abatement algorithms.

As general guidance for implementations, it is RECOMMENDED never to let any overload report time out. Following this rule, an overload endpoint should explicitly signal the end of the overload condition and not rely on the expiration of the validity time of the overload report in the reacting node. This leaves no need for the reacting node to reason about or guess the overload condition of the reporting node.

4.6.  OC-Report-Type AVP

The OC-Report-Type AVP (AVP code TBD5) is of type Enumerated. The value of the AVP describes what the overload report concerns. The following values are initially defined:

   0  A host report. The overload treatment should apply to requests that the reacting node knows will reach the overloaded node, for example, requests with a Destination-Host AVP indicating the endpoint. The reacting node learns the "host" implicitly from the Origin-Host AVP of the received message that contained the OC-OLR AVP.

   1  A realm report. The overload treatment should apply to all requests bound for the overloaded realm. The reacting node learns the "realm" implicitly from the Origin-Realm AVP of the received message that contained the OC-OLR AVP.

The default value of the OC-Report-Type AVP is 0 (i.e. the host report).
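As an illustration only, the defaulting rules for the validity duration and the report type could be sketched as follows. The OverloadReport class, its field names, and the helper functions are hypothetical names for an already-decoded OC-OLR AVP and are not defined by this specification. Returning None to model a silently discarded report of unknown type is likewise an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

DEFAULT_VALIDITY_SECONDS = 5   # applies when OC-Validity-Duration is absent
MAX_VALIDITY_SECONDS = 86400   # values above 24 hours MUST NOT be used
HOST_REPORT = 0                # OC-Report-Type: host report (the default)
REALM_REPORT = 1               # OC-Report-Type: realm report


@dataclass
class OverloadReport:
    """Hypothetical decoded view of an OC-OLR AVP (None means sub-AVP absent)."""
    sequence_number: int
    report_type: Optional[int] = None
    validity_duration: Optional[int] = None


def effective_validity(report: OverloadReport) -> int:
    """Return the validity in seconds, treating invalid values as absent."""
    value = report.validity_duration
    if value is None or value == 0 or value > MAX_VALIDITY_SECONDS:
        return DEFAULT_VALIDITY_SECONDS
    return value


def report_target(report: OverloadReport, origin_host: str,
                  origin_realm: str) -> Optional[Tuple[str, str]]:
    """Map a report to the host or realm it concerns.

    The identity is taken from the Origin-Host or Origin-Realm AVP of the
    message that carried the OC-OLR AVP. Unknown report types yield None
    (assumption: the caller discards the report and logs the event).
    """
    report_type = HOST_REPORT if report.report_type is None else report.report_type
    if report_type == HOST_REPORT:
        return ("host", origin_host)
    if report_type == REALM_REPORT:
        return ("realm", origin_realm)
    return None
```

For example, a report that carries no OC-Report-Type and no OC-Validity-Duration sub-AVPs would be handled as a host report that is valid for 5 seconds.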
The OC-Report-Type AVP is envisioned to be useful for situations where a reacting node needs to apply different overload treatments for different "types" of overload. For example, the reacting node(s) might need to throttle requests sent to a specific server (identified by the Destination-Host AVP in the request) differently from requests that can be handled by any server in a realm. The example in Appendix B.1 illustrates this usage.

When defining new report type values, the corresponding specification MUST define the semantics of the new report types and how they affect the OC-OLR AVP handling. The specification MUST also reserve a corresponding new feature; see the OC-Supported-Features and OC-Feature-Vector AVPs.

4.7.  OC-Reduction-Percentage AVP

The OC-Reduction-Percentage AVP (AVP code TBD8) is of type Unsigned32 and describes the percentage of the traffic that the sender is requested to reduce, compared to what it otherwise would have sent. The OC-Reduction-Percentage AVP applies to the default (loss like) algorithm specified in this specification. However, the AVP can be reused for future abatement algorithms if its semantics fit into the new algorithm.

The value of the OC-Reduction-Percentage AVP is between zero (0) and one hundred (100). Values greater than 100 are interpreted as 100. The value of 100 means that no traffic is expected, i.e. the reporting node is under a severe load and ceases to process any new messages. The value of 0 means that the reporting node is in a stable state and has no request for the other endpoint to apply any traffic abatement. The default value of the OC-Reduction-Percentage AVP is 0. When the OC-Reduction-Percentage AVP is not present in the overload report, the default value applies.

If an overload control endpoint comes out of the 100 percent traffic reduction as a result of the overload report timing out, the following is RECOMMENDED. The reacting node sending the traffic should be conservative and, for example, first send "probe" messages to learn the overload condition of the overloaded node before converging to any traffic amount/rate decided by the sender. Similar concerns apply in all cases when the overload report times out, unless the previous overload report stated 0 percent reduction.

4.8.  Attribute Value Pair flag rules

                                                      +---------+
                                                      |AVP flag |
                                                      |rules    |
                                                      +----+----+
                          AVP   Section               |    |MUST|
    Attribute Name        Code  Defined  Value Type   |MUST| NOT|
   +--------------------------------------------------+----+----+
   |OC-Supported-Features TBD1  x.x      Grouped      |    | V  |
   +--------------------------------------------------+----+----+
   |OC-OLR                TBD2  x.x      Grouped      |    | V  |
   +--------------------------------------------------+----+----+
   |OC-Sequence-Number    TBD3  x.x      Time         |    | V  |
   +--------------------------------------------------+----+----+
   |OC-Validity-Duration  TBD4  x.x      Unsigned32   |    | V  |
   +--------------------------------------------------+----+----+
   |OC-Report-Type        TBD5  x.x      Enumerated   |    | V  |
   +--------------------------------------------------+----+----+
   |OC-Reduction                                      |    |    |
   |  -Percentage         TBD8  x.x      Unsigned32   |    | V  |
   +--------------------------------------------------+----+----+
   |OC-Feature-Vector     TBD6  x.x      Unsigned64   |    | V  |
   +--------------------------------------------------+----+----+

As described in the Diameter base protocol [RFC6733], the M-bit setting for a given AVP is relevant to an application and each command within that application that includes the AVP.

The Diameter overload control AVPs SHOULD always be sent with the M-bit cleared when used within existing Diameter applications to avoid backward compatibility issues. Otherwise, when reused in newly defined Diameter applications, the DOIC related AVPs SHOULD have the M-bit set.

5.  Overload Control Operation

5.1.  Overload Control Endpoints

The overload control solution can be considered as an overlay on top of an arbitrary Diameter network. The overload control information is exchanged over a "DOIC association" established between two communication endpoints.
The endpoints, namely the "reacting node" and the "reporting node", do not need to be adjacent Diameter peer nodes, nor do they need to be the end-to-end Diameter nodes in a typical "client-server" deployment with multiple intermediate Diameter agent nodes in between. The overload control endpoints are the two Diameter nodes that decide to exchange overload control information with each other. How the endpoints are determined is specific to a deployment, a Diameter node's role in that deployment, and local configuration.

The following diagrams illustrate the concept of Diameter Overload End-Points and how they differ from the standard [RFC6733] defined client, server and agent Diameter nodes. The following is the key to the elements in the diagrams:

   C  Diameter client as defined in [RFC6733].

   S  Diameter server as defined in [RFC6733].

   A  Diameter agent, in either a relay or proxy mode, as defined in [RFC6733].

   DEP  Diameter Overload End-Point as defined in this document. In the following figures a DEP may terminate two different DOIC associations, being a reporter and a reactor at the same time.

   Diameter Session  A Diameter session as defined in [RFC6733].

   DOIC Association  A DOIC association exists between two Diameter Overload End-Points. One of the end-points is the overload reporter and the other is the overload reactor.

Figure 2 illustrates the most basic configuration, where a client is connected directly to a server. In this case, the Diameter session and the DOIC association are both between the client and the server.

   +-----+            +-----+
   |  C  |            |  S  |
   +-----+            +-----+
   | DEP |            | DEP |
   +--+--+            +--+--+
      |                  |
      |                  |
      |{Diameter Session}|
      |                  |
      |{DOIC Association}|
      |                  |

   Figure 2: Basic DOIC deployment

In Figure 3 there is an agent that is not participating directly in the exchange of overload reports. As a result, the Diameter session and the DOIC association are still established between the client and the server.
   +-----+            +-----+            +-----+
   |  C  |            |  A  |            |  S  |
   +-----+            +--+--+            +-----+
   | DEP |               |               | DEP |
   +--+--+               |               +--+--+
      |                  |                  |
      |                  |                  |
      |----------{Diameter Session}---------|
      |                  |                  |
      |----------{DOIC Association}---------|
      |                  |                  |

   Figure 3: DOIC deployment with non participating agent

Figure 4 illustrates the case where the client does not support Diameter overload. In this case, the DOIC association is between the agent and the server. The agent handles the role of the reactor for overload reports generated by the server.

   +-----+            +-----+            +-----+
   |  C  |            |  A  |            |  S  |
   +--+--+            +-----+            +-----+
      |               | DEP |            | DEP |
      |               +--+--+            +--+--+
      |                  |                  |
      |                  |                  |
      |----------{Diameter Session}---------|
      |                  |                  |
      |                  |{DOIC Association}|
      |                  |                  |

   Figure 4: DOIC deployment with non-DOIC client and DOIC enabled agent

In Figure 5 there is a DOIC association between the client and the agent and a second DOIC association between the agent and the server. One use case requiring this configuration is when the agent is serving as an SFE for a set of servers.

   +-----+            +-----+            +-----+
   |  C  |            |  A  |            |  S  |
   +-----+            +-----+            +-----+
   | DEP |            | DEP |            | DEP |
   +--+--+            +--+--+            +--+--+
      |                  |                  |
      |                  |                  |
      |----------{Diameter Session}---------|
      |                  |                  |
      |{DOIC Association}|{DOIC Association}|
      |                  |      and/or
      |----------{DOIC Association}---------|
      |                  |                  |

   Figure 5: A deployment where all nodes support DOIC

Figure 6 illustrates a deployment where some clients support Diameter overload control and some do not. In this case the agent must support Diameter overload control for the non supporting client. It might also need to have a DOIC association with the server, as shown here, to handle overload for a server farm and/or for managing Realm overload.
   +-----+            +-----+            +-----+            +-----+
   | C1  |            | C2  |            |  A  |            |  S  |
   +-----+            +--+--+            +-----+            +-----+
   | DEP |               |               | DEP |            | DEP |
   +--+--+               |               +--+--+            +--+--+
      |                  |                  |                  |
      |                  |                  |                  |
      |-------------------{Diameter Session}-------------------|
      |                  |                  |                  |
      |                  |--------{Diameter Session}-----------|
      |                  |                  |                  |
      |---------{DOIC Association}----------|{DOIC Association}|
      |                  |                  |      and/or
      |-------------------{DOIC Association}-------------------|
      |                  |                  |                  |

   Figure 6: A deployment with DOIC and non-DOIC supporting clients

Figure 7 illustrates a deployment where some agents support Diameter overload control and others do not.

   +-----+            +-----+            +-----+            +-----+
   |  C  |            |  A  |            |  A  |            |  S  |
   +-----+            +--+--+            +-----+            +-----+
   | DEP |               |               | DEP |            | DEP |
   +--+--+               |               +--+--+            +--+--+
      |                  |                  |                  |
      |                  |                  |                  |
      |-------------------{Diameter Session}-------------------|
      |                  |                  |                  |
      |                  |                  |                  |
      |---------{DOIC Association}----------|{DOIC Association}|
      |                  |                  |      and/or
      |-------------------{DOIC Association}-------------------|
      |                  |                  |                  |

   Figure 7: A deployment with DOIC and non-DOIC supporting agents

5.2.  Piggybacking Principle

The overload control AVPs defined in this specification have been designed to be piggybacked on top of existing application message exchanges. This is made possible by adding the overload control top level AVPs, the OC-OLR AVP and the OC-Supported-Features AVP, as optional AVPs into existing commands when the corresponding Command Code Format (CCF) specification allows adding new optional AVPs (see Section 1.3.4 of [RFC6733]). When added to existing commands, both OC-Feature-Vector and OC-OLR AVPs SHOULD have the M-bit flag cleared to avoid backward compatibility issues.

A new application specification can incorporate the overload control mechanism specified in this document by making it mandatory to implement for the application and referencing this specification normatively.
In such a case, the OC-Feature-Vector and OC-OLR AVPs reused in newly defined Diameter applications SHOULD have the M-bit flag set. However, it is the responsibility of the Diameter application designers to define how the overload control mechanisms work on that application.

Note that the overload control solution does not have fixed server and client roles. The endpoint role is determined based on the message type: whether the message is a request (i.e. sent by a "reacting node") or an answer (i.e. sent by a "reporting node"). Therefore, in a typical "client-server" deployment, the "client" MAY report its overload condition to the "server" for any server initiated message exchange. An example of such is the server requesting a re-authentication from a client.

5.3.  Capability Announcement

Since the overload control solution relies on the piggybacking principle for the overload reporting, and the overload control endpoints are likely not adjacent peers, an endpoint needs a way to find out whether the other endpoint supports overload control and which common traffic abatement algorithm to apply to the traffic. The approach defined in this specification for the end-to-end capability announcement relies on the exchange of the OC-Supported-Features AVP between the endpoints. The feature announcement solution also works when carried out on existing applications. For a newly defined application the negotiation can be more exact, based on the application specification. The announced set of capabilities MUST NOT change during the lifetime of the Diameter session (or transaction in the case of non-session maintaining applications).

5.3.1.  Reacting Node Endpoint Considerations

The basic principle is that the request message initiating endpoint (i.e. the "reacting node") announces its support for the overload control mechanism by including in the request message the OC-Supported-Features AVP with those capabilities it supports and is willing to use for this Diameter session (or transaction in the case of non-session state maintaining applications; see Section 3.1.2 for more details on Diameter sessions). It is RECOMMENDED that the request message initiating endpoint includes the capability announcement in every request, regardless of whether it has had prior message exchanges with the given remote endpoint. In the case of a Diameter session maintaining application, sending the OC-Supported-Features AVP in every message is not really necessary after the initial capability announcement, or until there is a change in supported features.

Once the endpoint that initiated the request message receives an answer message from the remote endpoint, it can detect from the received answer message whether the remote endpoint supports the overload control solution and, if it does, what features are supported. For existing applications, support for the overload control solution is based on the presence of the OC-Supported-Features AVP in the Diameter answer. For newly defined applications, support for the overload control MAY already be part of the application specification.
Based on this capability knowledge, the request message initiating endpoint can select the preferred common traffic abatement algorithm and act accordingly for the subsequent message exchanges.

5.3.2.  Reporting Node Endpoint Considerations

When a remote endpoint (i.e. a "reporting node") receives a request message, it can detect whether the request message initiating endpoint supports the overload control solution based on the presence of the OC-Supported-Features AVP. For newly defined applications, the overload control solution support can be part of the application specification. Based on the content of the OC-Supported-Features AVP, the request message receiving endpoint knows what overload control functionality the other endpoint supports and then acts accordingly for the subsequent answer messages it initiates.

The answer message initiating endpoint MAY announce as many supported capabilities as it has (the announced set is subject to local policy and configuration). However, at least one of the announced capabilities MUST be the same as received in the request message.

The answer message initiating endpoint MUST NOT include any overload control solution defined AVPs in its answer messages if the request message initiating endpoint has not indicated support at the beginning of the created session (or transaction in the case of non-session state maintaining applications). The same also applies if none of the announced capabilities match between the two endpoints.

5.4.  Protocol Extensibility

The overload control solution can be extended, e.g. with new traffic abatement algorithms or new functionality.
New features and algorithms MUST be registered with IANA for possible use with the OC-Supported-Features AVP to announce support for them (see Section 7 for the required procedures).

It should be noted that the Grouped AVP extension mechanisms defined in [RFC6733] also apply. This allows, for example, defining a new feature that is mandatory to understand even when piggybacked on an existing application. More specifically, the sub-AVPs inside the OC-OLR AVP MAY have the M-bit set. However, when overload control AVPs are piggybacked on top of an existing application, setting the M-bit in sub-AVPs is NOT RECOMMENDED.

5.5.  Overload Report Processing

5.5.1.  Overload Control State

Both reacting and reporting nodes maintain an overload condition state for each endpoint (a host or a realm) they communicate with, provided that both endpoints have announced support for DOIC. See Sections 4.1 and 5.3 for discussion about how the support for DOIC is determined. The overload condition state SHOULD be able to distinguish between a realm and a specific host in that realm.

The overload condition state could include the following information (per host or realm):

   o  The endpoint information (Diameter identity of the realm and/or host, application identifier, etc.)

   o  Reduction percentage

   o  Validity period timer

   o  Sequence number

   o  Supported/selected traffic abatement algorithm

The overload control state information SHOULD be maintained as long as the other endpoint is known to support DOIC (based on the presence of the DOIC AVPs or by a future application specification).

5.5.2.  Reacting Node Considerations

Once a reacting node receives an OC-OLR AVP from a reporting node, it applies the traffic abatement based on the algorithm commonly supported with the reporting node and the current overload condition. The reacting node learns the abatement algorithms supported by the reporting node directly from the received answer message containing the OC-Supported-Features AVP, or indirectly by remembering the traffic abatement algorithm previously used with the given reporting node.

The received OC-Supported-Features AVP does not change the existing overload condition and/or traffic abatement algorithm settings if the OC-Sequence-Number AVP contains a value that is equal to the previously received/recorded one. If the OC-Supported-Features AVP is received for the first time for the reporting node or the OC-Sequence-Number AVP value is less than the previously received/recorded one (and is outside the valid overflow window), then the sequence number is stale (e.g. an intentional or unintentional replay) and the AVP SHOULD be silently discarded.

The OC-OLR AVP contains the necessary information about the overload condition on the reporting node. Similarly to the sequence numbering of the OC-Supported-Features AVP, the OC-OLR AVP also has the OC-Sequence-Number AVP, and its handling is similar to the one in the OC-Supported-Features AVP. The reacting node MUST update its overload condition state whenever it receives the OC-OLR AVP for the first time or the OC-Sequence-Number sub-AVP indicates a change in the OC-OLR AVP.

As described in Section 4.3, the OC-OLR AVP contains the necessary information about the overload condition on the reporting node.

From the OC-Report-Type AVP contained in the OC-OLR AVP, the reacting node learns whether the overload condition report concerns a specific host (as identified by the Origin-Host AVP of the answer message containing the OC-OLR AVP) or the entire realm (as identified by the Origin-Realm AVP of the answer message containing the OC-OLR AVP). The reacting node learns the Diameter application to which the overload report applies from the Application-ID of the answer message containing the OC-OLR AVP. The reacting node MUST use this information as an input for its traffic abatement algorithm. The idea is that the reacting node applies different handling of the traffic abatement depending on whether sent request messages are targeted to a specific host (identified by the Destination-Host AVP in the request) or to any host in a realm (when only the Destination-Realm AVP is present in the request). Note that future specifications MAY define new OC-Report-Type AVP values that imply different handling of the OC-OLR AVP, for example in the form of new additional AVPs inside the Grouped OC-OLR AVP that would define the report target at a finer granularity than just a host.

In the context of this specification and the default traffic abatement algorithm, the OC-Reduction-Percentage AVP value MUST be interpreted in the following way:

   value == 0
      Indicates explicitly the end of the overload condition; the reacting node SHOULD NOT apply the traffic abatement algorithm procedures anymore for the given reporting node (or realm).

   value == 100
      Indicates that the reporting node (or realm) does not want to receive any traffic from the reacting node for the application the report concerns. The reacting node MUST take all measures not to send traffic to the reporting node (or realm) until the overload condition changes or expires.

   0 < value < 100
      Indicates that the reporting node urges the reacting node to reduce its traffic by the given percentage. For example, if the reacting node has been sending 100 packets per second to the reporting node, then reception of an OC-Reduction-Percentage value of 10 would mean that from now on the reacting node MUST only send 90 packets per second.
How the reacting node achieves this reduction in the transactions that lead to sent request messages is up to the implementation. The reacting node MAY simply drop every 10th packet from its output queue and let the generic application logic try to recover from it.

If the OC-OLR AVP is received for the first time, the reacting node MUST create an overload condition state associated with the related realm or a specific host in the realm identified in the message carrying the OC-OLR AVP, as described in Section 5.5.1.

If the value of the OC-Sequence-Number AVP contained in the received OC-OLR AVP is equal to or less than the value stored in an existing overload condition state, the received OC-OLR AVP SHOULD be silently discarded. If the value of the OC-Sequence-Number AVP contained in the received OC-OLR AVP is greater than the value stored in an existing overload condition state, or there is no previously recorded sequence number, the reacting node MUST update the overload condition state associated with the realm or the specific node in the realm.

When an overload condition state is created or updated, the reacting node MUST apply the traffic abatement requested in the OC-OLR AVP using the algorithm announced in the OC-Supported-Features AVP contained in the received answer message along with the OC-OLR AVP.

The validity duration of the overload information contained in the OC-OLR AVP is either explicitly indicated in the OC-Validity-Duration AVP or implicitly equal to the default value (5 seconds) if the OC-Validity-Duration AVP is absent from the OC-OLR AVP. The reacting node MUST maintain the validity duration in the overload condition state. Once the validity duration times out, the reacting node MUST assume the overload condition reported in a previous OC-OLR AVP has ended.

5.5.3.  Reporting Node Considerations

A reporting node is a Diameter node inserting an OC-OLR AVP in a Diameter message in order to inform a reacting node about an overload condition and request Diameter traffic abatement.

The operation on the reporting node is rather straightforward. The reporting node learns the capabilities of the reacting node when it receives the OC-Supported-Features AVP as part of any Diameter request message. If the reporting node shares at least one common feature with the reacting node, then DOIC can be enabled between these two endpoints. See Section 5.3 for further discussion on the capability and feature announcement between two endpoints.

When a traffic reduction is required due to an overload condition and the overload control solution is supported by the sender of the Diameter request, the reporting node MUST include an OC-Supported-Features AVP and an OC-OLR AVP in the corresponding Diameter answer. The OC-OLR AVP contains the required traffic reduction and the OC-Supported-Features AVP indicates the traffic abatement algorithm to apply. This algorithm MUST be one of the algorithms advertised by the request sender.

A reporting node MAY rely on the OC-Validity-Duration AVP values for the implicit overload condition state cleanup on the reacting node. However, it is RECOMMENDED that the reporting node always explicitly indicates the end of an overload condition.

6.  Transport Considerations

In order to reduce the additional AVP and message processing introduced by overload control, it might be beneficial to signal whether a Diameter command carries overload control information that is of interest to an overload-aware Diameter node. Whether such an indication should be included is not part of this specification. Nor has it been concluded at what layer such an indication should be placed. Obvious candidates include transport layer protocols (e.g., SCTP PPID or TCP flags) or Diameter command header flags.

7.  IANA Considerations

7.1.  AVP codes

New AVPs defined by this specification are listed in Section 4. All AVP codes are allocated from the 'Authentication, Authorization, and Accounting (AAA) Parameters' AVP Codes registry.

7.2.  New registries

Three new registries are needed under the 'Authentication, Authorization, and Accounting (AAA) Parameters' registry.

Section 4.2 defines a new "Overload Control Feature Vector" registry including the initial assignments. New values can be added into the registry using the Specification Required policy [RFC5226]. See Section 4.2 for the initial assignment in the registry.

Section 4.6 defines a new "Overload Report Type" registry with its initial assignments. New types can be added using the Specification Required policy [RFC5226].

8.  Security Considerations

This mechanism gives Diameter nodes the ability to request that downstream nodes send fewer Diameter requests. Nodes do this by exchanging overload reports that directly affect this reduction. This exchange is potentially subject to multiple methods of attack, and has the potential to be used as a Denial-of-Service (DoS) attack vector.

Overload reports may contain information about the topology and current status of a Diameter network. This information is potentially sensitive. Network operators may wish to control disclosure of overload reports to unauthorized parties to avoid its use for competitive intelligence or to target attacks.

Diameter does not include features to provide end-to-end authentication, integrity protection, or confidentiality. This may cause complications when sending overload reports between non-adjacent nodes.

8.1.  Potential Threat Modes

The Diameter protocol involves transactions in the form of requests and answers exchanged between clients and servers. These clients and servers may be peers, that is, they may share a direct transport (e.g. TCP or SCTP) connection, or the messages may traverse one or more intermediaries, known as Diameter Agents.
Diameter nodes use TLS, DTLS, or IPsec to authenticate peers, and to provide confidentiality and integrity protection of traffic between peers. Nodes can make authorization decisions based on the peer identities authenticated at the transport layer.

When agents are involved, this presents an effectively hop-by-hop trust model. That is, a Diameter client or server can authorize an agent for certain actions, but it must trust that agent to make appropriate authorization decisions about its peers, and so on.

Since confidentiality and integrity protection occur at the transport layer, agents can read, and perhaps modify, any part of a Diameter message, including an overload report.

There are several ways an attacker might attempt to exploit the overload control mechanism. An unauthorized third party might inject an overload report into the network. If this third party is upstream of an agent, and that agent fails to apply proper authorization policies, downstream nodes may mistakenly trust the report. This attack is at least partially mitigated by the assumption that nodes include overload reports in Diameter answers but not in requests. This requires an attacker to have knowledge of the original request in order to construct a response. Therefore, implementations SHOULD validate that an answer containing an overload report is a properly constructed response to a pending request prior to acting on the overload report.

A similar attack involves an otherwise authorized Diameter node that sends an inappropriate overload report. For example, a server for the realm "example.com" might send an overload report indicating that a competitor's realm "example.net" is overloaded. If other nodes act on the report, they may falsely believe that "example.net" is overloaded, effectively reducing that realm's capacity.
Therefore, it's critical that nodes validate that an overload report received from a peer actually falls within that peer's responsibility before acting on the report or forwarding the report to other peers. For example, an overload report from a peer that applies to a realm not handled by that peer is suspect.

An attacker might use the information in an overload report to assist in certain attacks. For example, an attacker could use information about current overload conditions to time a DoS attack for maximum effect, or use subsequent overload reports as a feedback mechanism to learn the results of a previous or ongoing attack.

8.2.  Denial of Service Attacks

Diameter overload reports can cause a node to cease sending some or all Diameter requests for an extended period. This makes them a tempting vector for DoS attacks. Furthermore, since Diameter is almost always used in support of other protocols, a DoS attack on Diameter is likely to impact those protocols as well. Therefore, Diameter nodes MUST NOT honor or forward overload reports from unauthorized or otherwise untrusted sources.

8.3.  Non-Compliant Nodes

When a Diameter node sends an overload report, it cannot assume that all nodes will comply. A non-compliant node might continue to send requests with no reduction in load. Requirement 28 [RFC7068] indicates that the overload control solution cannot assume that all Diameter nodes in a network are necessarily trusted, and that malicious nodes must not be allowed to take advantage of the overload control mechanism to get more than their fair share of service.

In the absence of an overload control mechanism, Diameter nodes need to implement strategies to protect themselves from floods of requests, and to make sure that a disproportionate load from one source does not prevent other sources from receiving service. For example, a Diameter server might reject a certain percentage of requests from sources that exceed certain limits.
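One possible form of the self-protection strategy mentioned above is a per-source request limit. The following is only a sketch under stated assumptions: the class name `PerSourceLimiter`, the fixed-window counting, and the use of the Origin-Host value as the source key are all illustrative choices, not part of this specification. It rejects everything above the limit within a window, which is the simplest variant; an implementation could equally reject only a share of the excess.

```python
from collections import defaultdict


class PerSourceLimiter:
    """Fixed-window request counter per source (hypothetical helper)."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._window_start = defaultdict(float)  # source -> start of window
        self._count = defaultdict(int)           # source -> requests in window

    def allow(self, origin_host: str, now: float) -> bool:
        """Return True if a request from origin_host may be processed.

        'now' is passed in explicitly (seconds on any monotonic scale)
        so the behaviour is easy to test.
        """
        if now - self._window_start[origin_host] >= self.window_seconds:
            # The previous window has elapsed: start a new one.
            self._window_start[origin_host] = now
            self._count[origin_host] = 0
        self._count[origin_host] += 1
        return self._count[origin_host] <= self.max_requests
```

A server using such a limiter would answer rejected requests with an appropriate Diameter error instead of processing them, so that one source cannot crowd out the others.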
Overload control can be thought of as an optimization for such strategies, where downstream nodes never send the excess requests in the first place. However, the presence of an overload control mechanism does not remove the need for these other protection strategies.

8.4.  End-to-End Security Issues

The lack of end-to-end security features makes it far more difficult to establish trust in overload reports that originate from non-adjacent nodes. Any agents in the message path may insert or modify overload reports. Nodes must trust that their adjacent peers perform proper checks on overload reports from their peers, and so on, creating a transitive-trust requirement extending for potentially long chains of nodes. Network operators must determine if this transitive trust requirement is acceptable for their deployments. Nodes supporting Diameter overload control MUST give operators the ability to select which peers are trusted to deliver overload reports, and whether they are trusted to forward overload reports from non-adjacent nodes.

[OpenIssue: This requires that a responding node be able to tell a peer-generated OLR from one generated by a non-adjacent node. One way of doing this would be to include the identity of the node that generated the report as part of the OLR]

[OpenIssue: Do we need further language about what rules an agent should apply before forwarding an OLR?]

The lack of end-to-end protection creates a tension between two requirements from the overload control requirements document [RFC7068]. Requirement 34 requires the ability to send overload reports across intermediaries (i.e. agents) that do not support the overload control mechanism. Requirement 27 forbids the mechanism from adding new vulnerabilities or increasing the severity of existing ones. A non-supporting agent will most likely forward overload reports without inspecting them or applying any sort of validation or authorization.
This makes the transitive trust issue considerably more of a problem. Without the ability to authenticate and integrity protect overload reports across a non-supporting agent, the mechanism cannot comply with both requirements.

[OpenIssue: What do we want to do about this? Req27 is a normative MUST, while Req34 is "merely" a SHOULD. This would seem to imply that 27 has to take precedence. Can we say that overload reports MUST NOT be sent to and/or accepted from non-supporting agents until such time we can use end-to-end security?]

The lack of end-to-end confidentiality protection means that any Diameter agent in the path of an overload report can view the contents of that report. In addition to the requirement to select which peers are trusted to send overload reports, operators MUST be able to select which peers are authorized to receive reports. A node MUST NOT send an overload report to a peer not authorized to receive it. Furthermore, an agent MUST remove any overload reports that might have been inserted by other nodes before forwarding a Diameter message to a peer that is not authorized to receive overload reports.

At the time of this writing, the DIME working group is studying requirements for adding end-to-end security [I-D.ietf-dime-e2e-sec-req] features to Diameter. These features, when they become available, might make it easier to establish trust in non-adjacent nodes for overload control purposes. Readers should be reminded, however, that the overload control mechanism encourages Diameter agents to modify AVPs in, or insert additional AVPs into, existing messages that are originated by other nodes. If end-to-end security is enabled, there is a risk that such modification could violate integrity protection. The details of using any future Diameter end-to-end security mechanism with overload control will require careful consideration, and are beyond the scope of this document.

9.  Contributors

The following people contributed substantial ideas, feedback, and discussion to this document:

o  Eric McMurry
o  Hannes Tschofenig
o  Ulrich Wiehe
o  Jean-Jacques Trottin
o  Maria Cruz Bartolome
o  Martin Dolly
o  Nirav Salot
o  Susan Shishufeng

10.  References

10.1.  Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.

[RFC5905]  Mills, D., Martin, J., Burbank, J., and W. Kasch, "Network Time Protocol Version 4: Protocol and Algorithms Specification", RFC 5905, June 2010.

[RFC6733]  Fajardo, V., Arkko, J., Loughney, J., and G. Zorn, "Diameter Base Protocol", RFC 6733, October 2012.

10.2.  Informative References

[3GPP.23.203]  3GPP, "Policy and charging control architecture", 3GPP TS 23.203 10.9.0, September 2013.

[3GPP.29.229]  3GPP, "Cx and Dx interfaces based on the Diameter protocol; Protocol details", 3GPP TS 29.229 10.5.0, March 2013.

[3GPP.29.272]  3GPP, "Evolved Packet System (EPS); Mobility Management Entity (MME) and Serving GPRS Support Node (SGSN) related interfaces based on Diameter protocol", 3GPP TS 29.272 10.8.0, June 2013.

[I-D.ietf-dime-e2e-sec-req]  Tschofenig, H., Korhonen, J., Zorn, G., and K. Pillay, "Diameter AVP Level Security: Scenarios and Requirements", draft-ietf-dime-e2e-sec-req-00 (work in progress), September 2013.

[RFC4006]  Hakala, H., Mattila, L., Koskinen, J-P., Stura, M., and J. Loughney, "Diameter Credit-Control Application", RFC 4006, August 2005.

[RFC5729]  Korhonen, J., Jones, M., Morand, L., and T.
Tsou, "Clarifications on the Routing of Diameter Requests Based on the Username and the Realm", RFC 5729, December 2009.

[RFC7068]  McMurry, E. and B. Campbell, "Diameter Overload Control Requirements", RFC 7068, November 2013.

Appendix A.  Issues left for future specifications

The base solution for the overload control does not cover all possible use cases. A number of solution aspects were intentionally left for future specification and protocol work.

A.1.  Additional traffic abatement algorithms

This specification describes only the means for a simple loss-based algorithm. Future algorithms can be added using the designed solution extension mechanism. The new algorithms need to be registered with IANA. See Sections 4.1 and 7 for the required IANA steps.

A.2.  Agent Overload

This specification focuses on Diameter end-point (server or client) overload. A separate extension will be required to outline the handling of agent overload.

A.3.  DIAMETER_TOO_BUSY clarifications

The current [RFC6733] behaviour in the case of DIAMETER_TOO_BUSY is somewhat underspecified. For example, there is no information on how long the specific Diameter node is willing to be unavailable. A specification updating [RFC6733] should clarify the handling of DIAMETER_TOO_BUSY from the point of view of the Diameter node that initiates the error answer and from the point of view of the Diameter node that initiated the original request. Further, the inclusion of AVPs providing additional information should be discussed and possibly recommended.

Appendix B.  Examples

B.1.  Mix of Destination-Realm routed requests and Destination-Host routed requests

Diameter allows a client to optionally select the destination server of a request, even if there are agents between the client and the server. The client does this using the Destination-Host AVP. In cases where the client does not care if a specific server receives the request, it can omit Destination-Host and route the request using the Destination-Realm and Application Id, effectively letting an agent select the server.

Clients commonly send mixtures of Destination-Host and Destination-Realm routed requests. For example, in an application that uses user sessions, a client typically won't care which server handles a session-initiating request. But once the session is initiated, the client will send all subsequent requests in that session to the same server. Therefore it would send the initial request with no Destination-Host AVP. If it receives a successful answer, the client would copy the Origin-Host value from the answer message into a Destination-Host AVP in each subsequent request in the session.

An agent has very limited options in applying overload abatement to requests that contain Destination-Host AVPs. It typically cannot route the request to a different server than the one identified in Destination-Host. Its only remaining options are to throttle such requests locally, or to send an overload report back towards the client so the client can throttle the requests. The second choice is usually more efficient, since it prevents any throttled requests from being sent in the first place, and removes the agent's need to send errors back to the client for each dropped request.

On the other hand, an agent has much more leeway to apply overload abatement for requests that do not contain Destination-Host AVPs. If the agent has multiple servers in its peer table for the given realm and application, it can route such requests to other, less overloaded servers.

If the overload severity increases, the agent may reach a point where there is not sufficient capacity across all servers to handle even realm-routed requests. In this case, the realm itself can be considered overloaded. The agent may need the client to throttle realm-routed requests in addition to Destination-Host routed requests. The overload severity may be different for each server, and the severity for the realm is likely to be different than for any specific server. Therefore, an agent may need to forward, or originate, multiple overload reports with differing ReportType and Reduction-Percentage values.

Figure 8 illustrates such a mixed-routing scenario. In this example, the servers S1, S2, and S3 handle requests for the realm "realm". Any of the three can handle requests that are not part of a user session (i.e. routed by Destination-Realm).
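The agent-side distinction between the two routing modes can be sketched as follows. This is a simplified illustration, not a normative algorithm: the function `select_server`, its parameters, and the "pick the least overloaded server" policy are assumptions made for the example. A real agent would follow its local policy and would typically spread realm-routed traffic rather than always choosing a single server.

```python
from typing import Dict, Optional


def select_server(destination_host: Optional[str],
                  reduction_by_server: Dict[str, int]) -> Optional[str]:
    """Pick the server an agent forwards a request to (hypothetical helper).

    destination_host is the value of the Destination-Host AVP, or None
    for a Destination-Realm routed request. reduction_by_server maps each
    server in the peer table to the last Reduction-Percentage it reported
    (0 when the server is not overloaded).
    """
    if destination_host is not None:
        # Destination-Host routed: the agent cannot divert the request.
        # Throttling of this traffic is left to the client, which the
        # agent informs by forwarding the overload report.
        return destination_host
    # Destination-Realm routed: any server that still accepts traffic
    # is a candidate.
    candidates = [s for s, pct in reduction_by_server.items() if pct < 100]
    if not candidates:
        # No capacity left anywhere: the realm itself is overloaded.
        return None
    # Assumed policy: least overloaded first, name as tie-breaker.
    return min(candidates, key=lambda s: (reduction_by_server[s], s))
```

For instance, with S1 reporting a 50 percent reduction and S2 and S3 reporting none, a realm-routed request goes to S2, while a request carrying Destination-Host S1 still goes to S1.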
Once a session is established, however, all requests in that session must go to the same server.

Client     Agent      S1        S2        S3
   |         |         |         |         |
   |(1) Request (DR:realm)       |         |
   |-------->|         |         |         |
   |         |         |         |         |
   |         |Agent selects S1   |         |
   |         |         |         |         |
   |         |(2) Request (DR:realm)       |
   |         |-------->|         |         |
   |         |         |         |         |
   |         |         |S1 overloaded, returns OLR
   |         |         |         |         |
   |         |(3) Answer (OR:realm,OH:S1,OLR:RT=DH)
   |         |<--------|         |         |
   |         |         |         |         |
   |         |sees OLR,routes DR traffic to S2&S3
   |         |         |         |         |
   |(4) Answer (OR:realm,OH:S1, OLR:RT=DH) |
   |<--------|         |         |         |
   |         |         |         |         |
   |Client throttles requests with DH:S1   |
   |         |         |         |         |
   |(5) Request (DR:realm)       |         |
   |-------->|         |         |         |
   |         |         |         |         |
   |         |Agent selects S2   |         |
   |         |         |         |         |
   |         |(6) Request (DR:realm)       |
   |         |------------------>|         |
   |         |         |         |         |
   |         |         |         |S2 is overloaded...
   |         |         |         |         |
   |         |(7) Answer (OH:S2, OLR:RT=DH)|
   |         |<------------------|         |
   |         |         |         |         |
   |         |Agent sees OLR, realm now overloaded
   |         |         |         |         |
   |(8) Answer (OR:realm,OH:S2, OLR:RT=DH, OLR: RT=R)
   |<--------|         |         |         |
   |         |         |         |         |
   |Client throttles DH:S1, DH:S2, and DR:realm
   |         |         |         |         |

Figure 8: Mix of Destination-Host and Destination-Realm Routed Requests

1.  The client sends a request with no Destination-Host AVP (that is, a Destination-Realm routed request).

2.  The agent follows local policy to select a server from its peer table. In this case, the agent selects S1 and forwards the request.

3.  S1 is overloaded. It sends an answer indicating success, but also includes an overload report. Since the overload report only applies to S1, the ReportType is "Destination-Host".

4.  The agent sees the overload report, and records that S1 is overloaded by the value in the Reduction-Percentage AVP. It begins diverting the indicated percentage of realm-routed traffic from S1 to S2 and S3. Since it can't divert Destination-Host routed traffic, it forwards the overload report to the client. This effectively delegates the throttling of traffic with Destination-Host:S1 to the client.

5.  The client sends another Destination-Realm routed request.

6.  The agent selects S2, and forwards the request.

7.  It turns out that S2 is also overloaded, perhaps due to all that traffic it took over for S1. S2 returns a successful answer containing an overload report. Since this report only applies to S2, the ReportType is "Destination-Host".

8.  The agent sees that S2 is also overloaded by the value in Reduction-Percentage. This value is probably different than the value from S1's report. The agent diverts the remaining traffic to S3 as best as it can, but it calculates that the remaining capacity across all three servers is no longer sufficient to
but such | | | | algorithms are out of scopehandle all ofDOIC. | | | | - | | REQ | ? | Initial text with P: The DOIC solution currently | | 33 | | definestheloss algorithm asrealm-routed traffic. This means thedefault algorithm. | | | | It does not specify it as mandatory to implement. | | | | Comment 1: Then I think that's a "n".realm itself is overloaded. TheMTI part | | | |realm's overload percentage isthe crux of the requirement. Comment 2: Suggested | | | | wording: In the DOIC baseline solution, the reacting | | | | node has to apply the received Reduction-Percentage, | | | | andmost likely different than that forachieving this, the reacting node can do | | | | requests rerouting (when it is possible)either S1 or| | | | drop/reject requests. This DOIC baseline solution is | | | | a loss algorithm and DOIC should not require further | | | | specification.S2. Theanswer to REQ32 indicates the | | | | possibilityagent forward's S2's report back toadd other algorithms on top ofthe| | | | DOIC baseline solution. The DOIC solution currently | | | | defines this loss algorithm asclient in thedefault algorithm. | | | | It is still under discussion to make it as mandatory | | | | to implement. | | | | - | | REQ | P | The ability to communicate overload reports between | | 34 | | supportingDiameternodes does not require agents to | | | | supportanswer. Additionally, theDOIC solution. Load information exchange | | | | is not currently defined. | +------+----+-------------------------------------------------------+ Table 1 Appendix C. Examples C.1. 3GPP S6a interface overload indication [TBD: Would cover S6a MME-HSS communication with several topology choices (such as with or without DRA,agent generates a new report for the realm of "realm", and inserts that report into the answer. The client throttles requests with"generic" agents).] C.2. 3GPP PCC interfaces overload indication [TBD: Would cover Gx/Rx and maybe S9..] C.3. 
Mix of Destination-Realm routedDestination-Host:S1 at one rate, requests with Destination-Host:S2 at another rate, andDestination-Host reoutedrequests[TBD: Add example showing the use ofwith no Destination-Hosttype OLRs and Realm type OLRs.]AVP at yet a third rate. (Since S3 has not indicated overload, the client does not throttle requests with Destination-Host:S3.) Authors' Addresses Jouni Korhonen (editor) Broadcom Porkkalankatu 24 Helsinki FIN-00180 Finland Email: jouni.nospam@gmail.com Steve Donovan Oracle 17210 Campbell Road Dallas, Texas 75254 United States Email: srdonovan@usdonovans.com Ben Campbell Oracle 17210 Campbell Road Dallas, Texas 75254 United States Email: ben@nostrum.com Lionel Morand Orange Labs 38/40 rue du General Leclerc Issy-Les-Moulineaux Cedex 9 92794 France Phone: +33145296257 Email: lionel.morand@orange.com