idnits 2.17.1

draft-polli-ratelimit-headers-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 6 instances of too long lines in the document, the longest one
     being 38 characters in excess of 72.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (26 November 2020) is 1239 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'UNIX' is defined on line 1022, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 7234 (ref. 'CACHING') (Obsoleted by
     RFC 9111)

  ** Obsolete normative reference: RFC 7230 (ref. 'MESSAGING') (Obsoleted by
     RFC 9110, RFC 9112)

  ** Obsolete normative reference: RFC 7231 (ref. 'SEMANTICS') (Obsoleted by
     RFC 9110)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'UNIX'

  -- Duplicate reference: RFC7234, mentioned in 'RFC7234', was also mentioned
     in 'CACHING'.

  -- Obsolete informational reference (is this intentional?): RFC 7234
     (Obsoleted by RFC 9111)

     Summary: 4 errors (**), 0 flaws (~~), 2 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

2  HTTP                                                            R. Polli
3  Internet-Draft                         Team Digitale, Italian Government
4  Intended status: Standards Track                             A. Martinez
5  Expires: 30 May 2021                                             Red Hat
6                                                          26 November 2020

8                      RateLimit Header Fields for HTTP
9                      draft-polli-ratelimit-headers-05

11  Abstract

13     This document defines the RateLimit-Limit, RateLimit-Remaining, and
14     RateLimit-Reset fields for HTTP, thus allowing servers to publish
15     current request quotas and clients to shape their request policy and
16     avoid being throttled out.

18  Note to Readers

20     _RFC EDITOR: please remove this section before publication_

22     Discussion of this draft takes place on the HTTP working group
23     mailing list (ietf-http-wg@w3.org), which is archived at
24     https://lists.w3.org/Archives/Public/ietf-http-wg/.

27     The source code and issues list for this draft can be found at
28     https://github.com/ioggstream/draft-polli-ratelimit-headers.

31  Status of This Memo

33     This Internet-Draft is submitted in full conformance with the
34     provisions of BCP 78 and BCP 79.

36     Internet-Drafts are working documents of the Internet Engineering
37     Task Force (IETF).  Note that other groups may also distribute
38     working documents as Internet-Drafts.  The list of current Internet-
39     Drafts is at https://datatracker.ietf.org/drafts/current/.

41     Internet-Drafts are draft documents valid for a maximum of six months
42     and may be updated, replaced, or obsoleted by other documents at any
43     time.  It is inappropriate to use Internet-Drafts as reference
44     material or to cite them other than as "work in progress."

46     This Internet-Draft will expire on 30 May 2021.

48  Copyright Notice

50     Copyright (c) 2020 IETF Trust and the persons identified as the
51     document authors.  All rights reserved.
53     This document is subject to BCP 78 and the IETF Trust's Legal
54     Provisions Relating to IETF Documents (https://trustee.ietf.org/
55     license-info) in effect on the date of publication of this document.
56     Please review these documents carefully, as they describe your rights
57     and restrictions with respect to this document.  Code Components
58     extracted from this document must include Simplified BSD License text
59     as described in Section 4.e of the Trust Legal Provisions and are
60     provided without warranty as described in the Simplified BSD License.

62  Table of Contents

64     1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
65       1.1.  Rate-limiting and quotas  . . . . . . . . . . . . . . . .   3
66       1.2.  Current landscape of rate-limiting headers  . . . . . . .   4
67         1.2.1.  Interoperability issues . . . . . . . . . . . . . . .   4
68       1.3.  This proposal . . . . . . . . . . . . . . . . . . . . . .   5
69       1.4.  Goals . . . . . . . . . . . . . . . . . . . . . . . . . .   5
70       1.5.  Notational Conventions  . . . . . . . . . . . . . . . . .   6
71     2.  Expressing rate-limit policies  . . . . . . . . . . . . . . .   6
72       2.1.  Time window . . . . . . . . . . . . . . . . . . . . . . .   6
73       2.2.  Request quota . . . . . . . . . . . . . . . . . . . . . .   6
74       2.3.  Quota policy  . . . . . . . . . . . . . . . . . . . . . .   7
75     3.  Header Specifications . . . . . . . . . . . . . . . . . . . .   8
76       3.1.  RateLimit-Limit . . . . . . . . . . . . . . . . . . . . .   8
77       3.2.  RateLimit-Remaining . . . . . . . . . . . . . . . . . . .   9
78       3.3.  RateLimit-Reset . . . . . . . . . . . . . . . . . . . . .   9
79     4.  Providing RateLimit headers . . . . . . . . . . . . . . . . .  10
80     5.  Intermediaries  . . . . . . . . . . . . . . . . . . . . . . .  11
81     6.  Caching . . . . . . . . . . . . . . . . . . . . . . . . . . .  11
82     7.  Receiving RateLimit headers . . . . . . . . . . . . . . . . .  11
83     8.  Examples  . . . . . . . . . . . . . . . . . . . . . . . . . .  12
84       8.1.  Unparameterized responses . . . . . . .
. . . . . . . . . 12
85         8.1.1.  Throttling information in responses . . . . . . . . .  12
86         8.1.2.  Use in conjunction with custom headers  . . . . . . .  13
87         8.1.3.  Use for limiting concurrency  . . . . . . . . . . . .  13
88         8.1.4.  Use in throttled responses  . . . . . . . . . . . . .  14
89       8.2.  Parameterized responses . . . . . . . . . . . . . . . . .  15
90         8.2.1.  Throttling window specified via parameter . . . . . .  15
91         8.2.2.  Dynamic limits with parameterized windows . . . . . .  15
92         8.2.3.  Dynamic limits for pushing back and slowing down  . .  16
93       8.3.  Dynamic limits for pushing back with Retry-After and slow
94             down  . . . . . . . . . . . . . . . . . . . . . . . . . .  17
95         8.3.1.  Missing Remaining information . . . . . . . . . . . .  17
96         8.3.2.  Use with multiple windows . . . . . . . . . . . . . .  18
97     9.  Security Considerations . . . . . . . . . . . . . . . . . . .  19
98       9.1.  Throttling does not prevent clients from issuing
99             requests  . . . . . . . . . . . . . . . . . . . . . . . .  19
100       9.2.  Information disclosure  . . . . . . . . . . . . . . . . .  19
101       9.3.  Remaining quota-units are not granted requests  . . . . .  19
102       9.4.  Reliability of RateLimit-Reset  . . . . . . . . . . . . .  20
103       9.5.  Resource exhaustion . . . . . . . . . . . . . . . . . . .  20
104       9.6.  Denial of Service . . . . . . . . . . . . . . . . . . . .  20
105     10. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  20
106       10.1.  RateLimit-Limit Field Registration . . . . . . . . . . .  21
107       10.2.  RateLimit-Remaining Field Registration . . . . . . . . .  21
108       10.3.  RateLimit-Reset Field Registration . . . . . . . . . . .  21
109     11. References  . . . . . . . . . . . . . . . . . . . . . . . . .  21
110       11.1.  Normative References . . . . . . . . . . . . . . . . . .  21
111       11.2.  Informative References . . . . . . . . . . . . . . . . .  22
112     Appendix A.  Change Log  . . . . . . . . . . . . . . . . . . . . .  23
113     Appendix B.  Acknowledgements  . . . . . . . . . . . . . . . . . .
23
114     Appendix C.  RateLimit headers currently used on the web . . . .  23
115     Appendix D.  FAQ . . . . . . . . . . . . . . . . . . . . . . . .  24
116     Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  27

118  1.  Introduction

120     The widespread use of HTTP as a distributed computation protocol
121     requires an explicit way of communicating service status and usage
122     quotas.

124     This was partially addressed with the "Retry-After" header field
125     defined in [SEMANTICS] to be returned in "429 Too Many Requests" or
126     "503 Service Unavailable" responses.

128     Still, there is no standard way to communicate service quotas so
129     that the client can throttle its requests and prevent 4xx or 5xx
130     responses.

132  1.1.  Rate-limiting and quotas

134     Servers use quota mechanisms to avoid system overload, to ensure an
135     equitable distribution of computational resources or to enforce other
136     policies, e.g. monetization.

138     A basic quota mechanism limits the number of acceptable requests in a
139     given time window, e.g. 10 requests per second.

141     When quota is exceeded, servers usually do not serve the request,
142     replying instead with a "4xx" HTTP status code (e.g. 429 or 403) or
143     adopting more aggressive policies like dropping connections.

145     Quotas may be enforced on different bases (e.g. per user, per IP, per
146     geographic area, etc.) and at different levels.  For example, a user
147     may be allowed to issue:

149     *  10 requests per second;

151     *  but no more than 60 requests per minute;

153     *  and no more than 1000 requests per hour.

155     Moreover, system metrics, statistics and heuristics can be used to
156     implement more complex policies, where the number of acceptable
157     requests and the time window are computed dynamically.

159  1.2.  Current landscape of rate-limiting headers

161     To help clients throttle their requests, servers may expose the
162     counters used to evaluate quota policies via HTTP header fields.
164     Those response headers may be added by HTTP intermediaries such as
165     API gateways and reverse proxies.

167     On the web we can find many different rate-limit headers, usually
168     containing the number of allowed requests in a given time window, and
169     when the window is reset.

171     The common choice is to return three headers containing:

173     *  the maximum number of allowed requests in the time window;

175     *  the number of remaining requests in the current window;

177     *  the time remaining in the current window expressed in seconds or
178        as a timestamp.

180  1.2.1.  Interoperability issues

182     A major interoperability issue in throttling is the lack of standard
183     headers, because:

185     *  each implementation associates different semantics to the same
186        header field names;

188     *  header field names proliferate.

190     Client applications interfacing with different servers may thus need
191     to process different headers, or the very same application interface
192     that sits behind different reverse proxies may reply with different
193     throttling headers.

195  1.3.  This proposal

197     This proposal defines syntax and semantics for the following fields:

199     *  "RateLimit-Limit": containing the requests quota in the time
200        window;

202     *  "RateLimit-Remaining": containing the remaining requests quota in
203        the current window;

205     *  "RateLimit-Reset": containing the time remaining in the current
206        window, specified in seconds.

208     The behavior of "RateLimit-Reset" is compatible with the "delta-
209     seconds" notation of "Retry-After".

211     The field definitions allow describing complex policies, including
212     the ones using multiple and variable time windows and dynamic quotas,
213     or implementing concurrency limits.

215  1.4.  Goals

217     The goals of this proposal are:

219     1.  Standardize the names and semantics of rate-limit headers;

221     2.  Improve the resiliency of HTTP infrastructures by simplifying
222         the enforcement and the adoption of rate-limit headers;

224     3.
Simplify API documentation by removing the need to explain
225         rate-limit field semantics there.

227     The goals do not include:

229     Authorization:  The rate-limit headers described here are not meant
230        to support authorization or other kinds of access controls.

232     Throttling scope:  This specification does not cover the throttling
233        scope, which may be the given resource-target, its parent path or
234        the whole Origin [RFC6454] section 7.

236     Response status code:  The rate-limit headers may be returned in both
237        Successful and non-Successful responses.  This specification does
238        not cover whether non-Successful responses count towards quota
           usage.

240     Throttling policy:  This specification does not mandate a specific
241        throttling policy.  The values published in the headers, including
242        the window size, can be statically or dynamically evaluated.

244     Service Level Agreement:  Conveyed quota hints do not imply any
245        service guarantee.  The server is free to throttle respectful
246        clients under certain circumstances.

248  1.5.  Notational Conventions

250     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
251     "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
252     "OPTIONAL" in this document are to be interpreted as described in
253     BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
254     capitals, as shown here.

256     This document uses the Augmented BNF defined in [RFC5234] and updated
257     by [RFC7405] along with the "#rule" extension defined in Section 7 of
258     [MESSAGING].

260     The term Origin is to be interpreted as described in [RFC6454]
261     section 7.

263     The "delta-seconds" rule is defined in [CACHING] section 1.2.1.

265  2.  Expressing rate-limit policies

267  2.1.  Time window

269     Rate limit policies limit the number of acceptable requests in a
270     given time window.

272     A time window is expressed in seconds, using the following syntax:

274        time-window = delta-seconds

276     Subsecond precision is not supported.

278  2.2.
Request quota

280     The request-quota is a value associated with the maximum number of
281     requests that the server is willing to accept from one or more
282     clients on a given basis (originating IP, authenticated user,
283     geographical area, etc.) during a "time-window" as defined in
        Section 2.1.

285     The "request-quota" is expressed in "quota-units" and has the
286     following syntax:

288        request-quota = quota-units
289        quota-units = 1*DIGIT

291     The "request-quota" SHOULD match the maximum number of acceptable
292     requests.

294     The "request-quota" MAY differ from the total number of acceptable
295     requests when weight mechanisms, bursts, or other server policies are
296     implemented.

298     If the "request-quota" does not match the maximum number of
299     acceptable requests, the relation between the two SHOULD be
300     communicated out-of-band.

302     Example: A server could

304     *  count requests like "/books/{id}" once;

306     *  count search requests like "/books?author=Camilleri" twice;

308     so that we have the following counters:

310     GET /books/123                  ; request-quota=4, remaining: 3, status=200
311     GET /books?author=Camilleri     ; request-quota=4, remaining: 1, status=200
312     GET /books?author=Eco           ; request-quota=4, remaining: 0, status=429

314  2.3.  Quota policy

316     This specification allows describing a quota policy with the
317     following syntax:

319        quota-policy = request-quota ";" "w" "=" time-window
320                       *( OWS ";" OWS quota-comment)
321        quota-comment = token "=" (token / quoted-string)

323     quota-policy parameters like "w" and quota-comment tokens MUST NOT
324     occur multiple times within the same quota-policy.

326     An example policy of 100 quota-units per minute:

328        100;w=60

330     Two examples of providing further details via custom parameters in
331     "quota-comments":

333        100;w=60;comment="fixed window"
334        12;w=1;burst=1000;policy="leaky bucket"

336  3.  Header Specifications

338     The following "RateLimit" response fields are defined.

340  3.1.
RateLimit-Limit

342     The "RateLimit-Limit" response field indicates the "request-quota"
343     associated with the client in the current "time-window".

345     If the client exceeds that limit, it might not be served.

347     The header value is:

349        RateLimit-Limit = expiring-limit [ OWS "," OWS 1#quota-policy ]
350        expiring-limit = request-quota

352     The "expiring-limit" value MUST be set to the "request-quota" that is
353     closest to reaching its limit.

355     The "quota-policy" is defined in Section 2.3, and its values are
356     informative.

358        RateLimit-Limit: 100

360     A "time-window" associated with "expiring-limit" can be communicated
361     via an optional "quota-policy" value, as shown in the following
362     example:

364        RateLimit-Limit: 100, 100;w=10

366     If the "expiring-limit" is not associated with a "time-window", the
367     "time-window" MUST either be:

369     *  inferred from the value of "RateLimit-Reset" at the moment of the
370        reset, or

372     *  communicated out-of-band (e.g. in the documentation).

374     Policies using multiple quota limits MAY be returned using multiple
375     "quota-policy" items, as shown in the following two examples:

377        RateLimit-Limit: 10, 10;w=1, 50;w=60, 1000;w=3600, 5000;w=86400
378        RateLimit-Limit: 10, 10;w=1;burst=1000, 1000;w=3600

380     This header MUST NOT occur multiple times and can be sent in a
381     trailer section.

383  3.2.  RateLimit-Remaining

385     The "RateLimit-Remaining" response field indicates the remaining
386     "quota-units" defined in Section 2.2 associated with the client.

388     The header value is:

390        RateLimit-Remaining = quota-units

392     This header MUST NOT occur multiple times and can be sent in a
393     trailer section.

395     Clients MUST NOT assume that a positive "RateLimit-Remaining" value
396     is a guarantee of being served.

398     A low "RateLimit-Remaining" value is like a yellow traffic-light: the
399     red light may arrive suddenly.

401     One example of "RateLimit-Remaining" use is below.

403        RateLimit-Remaining: 50

405  3.3.
RateLimit-Reset

407     The "RateLimit-Reset" response field indicates the number of seconds
409     until the quota resets.

411     The header value is:

413        RateLimit-Reset = delta-seconds

415     The delta-seconds format is used because:

417     *  it does not rely on clock synchronization and is resilient to
418        clock adjustment and clock skew between client and server (see
419        [SEMANTICS] Section 4.1.1.1);

421     *  it mitigates the risk of a thundering herd when too many
422        clients are serviced with the same timestamp.

424     This header MUST NOT occur multiple times and can be sent in a
425     trailer section.

427     An example of "RateLimit-Reset" use is below.

429        RateLimit-Reset: 50

431     The client MUST NOT assume that all its "request-quota" will be
432     restored after the moment referenced by "RateLimit-Reset".  The
433     server MAY arbitrarily alter the "RateLimit-Reset" value between
434     subsequent requests, e.g. in case of resource saturation or to
435     implement sliding window policies.

437  4.  Providing RateLimit headers

439     A server MAY use one or more "RateLimit" response fields defined in
440     this document to communicate its quota policies.

442     The returned values refer to the metrics used to evaluate if the
443     current request respects the quota policy and might not apply to
444     subsequent requests.

446     Example: a successful response with the following fields

448        RateLimit-Limit: 10
449        RateLimit-Remaining: 1
450        RateLimit-Reset: 7

452     does not guarantee that the next request will be successful.  Server
453     metrics may be subject to other conditions like the one shown in the
454     example from Section 2.2.

456     A server MAY return "RateLimit" response fields independently of the
457     response status code.  This includes throttled responses.

459     If a response contains both the "Retry-After" and the "RateLimit-
460     Reset" fields, the value of "RateLimit-Reset" SHOULD reference the
461     same point in time as "Retry-After".
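
As a rough illustration of the alignment recommended above, the following Python sketch converts a "Retry-After" value (either delta-seconds or an HTTP-date) into seconds counted from the response's "Date" field, so that it can be compared with "RateLimit-Reset". The helper name `retry_after_seconds` is hypothetical and not part of this specification; the sample values are taken from the throttled response in Section 8.1.4.

```python
from email.utils import parsedate_to_datetime


def retry_after_seconds(retry_after: str, date_header: str) -> int:
    """Return Retry-After as seconds counted from the response's Date field.

    Hypothetical helper: the name and signature are assumptions made for
    this sketch, not something defined by the draft.
    """
    value = retry_after.strip()
    if value.isascii() and value.isdigit():
        # delta-seconds form, e.g. "Retry-After: 20"
        return int(value)
    # HTTP-date form, e.g. "Retry-After: Mon, 05 Aug 2019 09:27:05 GMT"
    target = parsedate_to_datetime(value)
    sent = parsedate_to_datetime(date_header)
    return max(0, int((target - sent).total_seconds()))


# Field values from the example in Section 8.1.4.
delay = retry_after_seconds(
    "Mon, 05 Aug 2019 09:27:05 GMT",
    "Mon, 05 Aug 2019 09:27:00 GMT",
)
ratelimit_reset = 5
same_moment = delay == ratelimit_reset
```

A server could run a check of this kind before sending a response, and a client could use it to notice when the two fields disagree.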
463     When using a policy involving more than one "time-window", the server
464     MUST reply with the "RateLimit" headers related to the window with
465     the lowest "RateLimit-Remaining" value.

467     Under certain conditions, a server MAY artificially lower "RateLimit"
468     field values between subsequent requests, e.g. to respond to Denial of
469     Service attacks or in case of resource saturation.

471     Servers usually establish whether the request is in-quota before
472     creating a response, so the RateLimit field values should already be
473     available at that moment.  Nonetheless servers MAY decide to send the
474     "RateLimit" fields in a trailer section.

476  5.  Intermediaries

478     This section documents the considerations advised in Section 15.3.3
479     of [SEMANTICS].

481     An intermediary that is not part of the originating service
482     infrastructure and is not aware of the quota-policy semantics used by
483     the Origin Server SHOULD NOT alter the RateLimit fields' values in
484     such a way as to communicate a more permissive quota-policy; this
485     includes removing the RateLimit fields.

487     An intermediary MAY alter the RateLimit fields in such a way as to
488     communicate a more restrictive quota-policy when:

490     *  it is aware of the quota-unit semantics used by the Origin Server;

492     *  it implements this specification and enforces a quota-policy which
493        is more restrictive than the one conveyed in the fields.

495     An intermediary SHOULD forward a request even when presuming that it
496     might not be serviced; the service returning the RateLimit fields is
497     solely responsible for enforcing the communicated quota-policy, and
498     it is always free to service incoming requests.

500     This specification does not mandate any behavior on intermediaries
501     with respect to retries, nor does it require that intermediaries
502     have any role in respecting quota-policies.
For example, it is legitimate for a proxy
503     to retransmit a request without notifying the client, thus
504     consuming quota-units.

506  6.  Caching

508     As is the ordinary case for HTTP caching ([RFC7234]), a response with
509     RateLimit fields might be cached and re-used for subsequent requests.
510     A cached response containing RateLimit fields does not modify quota
511     counters but could contain stale information.  Clients interested in
512     determining the freshness of the RateLimit fields could rely on
513     fields such as "Date" and on the "w" parameter of a "quota-policy".

515  7.  Receiving RateLimit headers

517     A client MUST process the received "RateLimit" headers.

519     A client MUST validate the values received in the "RateLimit" headers
520     before using them and check if there are significant discrepancies
521     with the expected ones.  This includes a "RateLimit-Reset" moment too
522     far in the future or a "request-quota" too high.

524     Malformed "RateLimit" headers MAY be ignored.

526     A client SHOULD NOT exceed the "quota-units" expressed in "RateLimit-
527     Remaining" before the "time-window" expressed in "RateLimit-Reset"
        has elapsed.

529     A client MAY still probe the server if the "RateLimit-Reset" is
530     considered too high.

532     The value of "RateLimit-Reset" is generated at response time: a
533     client aware of a significant network latency MAY behave accordingly
534     and use other information (e.g. the "Date" response header, or
535     otherwise gathered metrics) to better estimate the "RateLimit-Reset"
536     moment intended by the server.

538     The "quota-policy" values and comments provided in "RateLimit-Limit"
539     are informative and MAY be ignored.

541     If a response contains both the "RateLimit-Reset" and "Retry-After"
542     fields, the "Retry-After" header field MUST take precedence and the
543     "RateLimit-Reset" field MAY be ignored.

545  8.  Examples

547  8.1.  Unparameterized responses

549  8.1.1.
Throttling information in responses

551     The client exhausted its request-quota for the next 50 seconds.  The
552     "time-window" is communicated out-of-band or inferred from the header
553     values.

555     Request:

557     GET /items/123

559     Response:

561     HTTP/1.1 200 OK
562     Content-Type: application/json
563     RateLimit-Limit: 100
564     RateLimit-Remaining: 0
565     RateLimit-Reset: 50

567     {"hello": "world"}

569  8.1.2.  Use in conjunction with custom headers

571     The server uses two custom headers, namely "acme-RateLimit-DayLimit"
572     and "acme-RateLimit-HourLimit", to expose the following policy:

574     *  5000 daily quota-units;

576     *  1000 hourly quota-units.

578     The client consumed 4900 quota-units in the first 14 hours.

580     Despite the next hourly limit of 1000 quota-units, the closest limit
581     to reach is the daily one.

583     The server then exposes the "RateLimit-*" headers to inform the
584     client that:

586     *  it has only 100 quota-units left;

588     *  the window will reset in 10 hours.

590     Request:

592     GET /items/123

594     Response:

596     HTTP/1.1 200 OK
597     Content-Type: application/json
598     acme-RateLimit-DayLimit: 5000
599     acme-RateLimit-HourLimit: 1000
600     RateLimit-Limit: 5000
601     RateLimit-Remaining: 100
602     RateLimit-Reset: 36000

604     {"hello": "world"}

606  8.1.3.  Use for limiting concurrency

608     Throttling headers may be used to limit concurrency, advertising
609     limits that are lower than the usual ones in case of saturation, thus
610     increasing availability.

612     The server adopted a basic policy of 100 quota-units per minute, and
613     in case of resource exhaustion adapts the returned values by reducing
614     both "RateLimit-Limit" and "RateLimit-Remaining".
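
The adaptation described above could take many forms. As one hedged sketch, the Python function below scales the advertised values by a saturation level between 0.0 and 1.0. The function name `advertised_values`, the `saturation` parameter and the linear scaling rule are all assumptions made for illustration; this specification does not prescribe how a server lowers its advertised values.

```python
from typing import Tuple


def advertised_values(limit: int, remaining: int,
                      saturation: float) -> Tuple[int, int]:
    """Scale down the advertised limit and remaining quota-units.

    Assumption: linear scaling by (1 - saturation). This is only one
    possible policy and is not defined by the specification.
    """
    if not 0.0 <= saturation <= 1.0:
        raise ValueError("saturation must be between 0.0 and 1.0")
    factor = 1.0 - saturation
    adv_limit = int(limit * factor)
    # Never advertise more remaining quota-units than the advertised limit.
    adv_remaining = min(int(remaining * factor), adv_limit)
    return adv_limit, adv_remaining


# Nominal policy of 100 quota-units per minute, 60 still unused.
normal = advertised_values(100, 60, 0.0)
saturated = advertised_values(100, 60, 0.5)
```

With no saturation the nominal values are returned unchanged; at a saturation of 0.5 both values are halved.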
616     After 2 seconds the client consumed 40 quota-units.

617     Request:

619     GET /items/123

621     Response:

623     HTTP/1.1 200 OK
624     Content-Type: application/json
625     RateLimit-Limit: 100
626     RateLimit-Remaining: 60
627     RateLimit-Reset: 58

629     {"elapsed": 2, "issued": 40}

631     At the subsequent request, due to resource exhaustion, the server
632     advertises only "RateLimit-Remaining: 20".

634     Request:

636     GET /items/123

638     Response:

640     HTTP/1.1 200 OK
641     Content-Type: application/json
642     RateLimit-Limit: 100
643     RateLimit-Remaining: 20
644     RateLimit-Reset: 56

646     {"elapsed": 4, "issued": 41}

648  8.1.4.  Use in throttled responses

650     A client exhausted its quota and the server throttles the request
651     sending the "Retry-After" response header field.

653     In this example, the values of "Retry-After" and "RateLimit-Reset"
654     reference the same moment, but this is not a requirement.

656     The "429 Too Many Requests" HTTP status code is just used as an
657     example.

659     Request:

661     GET /items/123

663     Response:

665     HTTP/1.1 429 Too Many Requests
666     Content-Type: application/json
667     Date: Mon, 05 Aug 2019 09:27:00 GMT
668     Retry-After: Mon, 05 Aug 2019 09:27:05 GMT
669     RateLimit-Reset: 5
670     RateLimit-Limit: 100
671     RateLimit-Remaining: 0

673     {
674     "title": "Too Many Requests",
675     "status": 429,
676     "detail": "You have exceeded your quota"
677     }

679  8.2.  Parameterized responses

681  8.2.1.  Throttling window specified via parameter

683     The client has 99 "quota-units" left for the next 50 seconds.  The
684     "time-window" is communicated by the "w" parameter, so we know the
685     throughput is 100 "quota-units" per minute.

687     Request:

689     GET /items/123

691     Response:

693     HTTP/1.1 200 OK
694     Content-Type: application/json
695     RateLimit-Limit: 100, 100;w=60
696     RateLimit-Remaining: 99
697     RateLimit-Reset: 50

699     {"hello": "world"}

701  8.2.2.
Dynamic limits with parameterized windows

703     The policy conveyed by "RateLimit-Limit" states that the server
704     accepts 100 quota-units per minute.

706     To avoid resource exhaustion, the server artificially lowers the
707     actual limits returned in the throttling headers.

709     The "RateLimit-Remaining" then advertises only 9 quota-units for the
710     next 50 seconds to slow down the client.

712     Note that the server could also have lowered the other values in
713     "RateLimit-Limit": this specification does not mandate any relation
714     between the field values contained in subsequent responses.

716     Request:

718     GET /items/123

720     Response:

722     HTTP/1.1 200 OK
723     Content-Type: application/json
724     RateLimit-Limit: 10, 100;w=60
725     RateLimit-Remaining: 9
726     RateLimit-Reset: 50

728     {
729       "status": 200,
730       "detail": "Just slow down without waiting."
731     }

733  8.2.3.  Dynamic limits for pushing back and slowing down

735     Continuing the previous example, let's say the client waits 10
736     seconds and performs a new request which, due to resource exhaustion,
737     the server rejects and pushes back, advertising "RateLimit-Remaining:
738     0" for the next 20 seconds.

740     The server advertises a smaller window with a lower limit to slow
741     down the client for the rest of its original window after the 20
742     seconds elapse.

744     Request:

746     GET /items/123

748     Response:

750     HTTP/1.1 429 Too Many Requests
751     Content-Type: application/json
752     RateLimit-Limit: 0, 15;w=20
753     RateLimit-Remaining: 0
754     RateLimit-Reset: 20

756     {
757       "status": 429,
758       "detail": "Wait 20 seconds, then slow down!"
759     }

761  8.3.  Dynamic limits for pushing back with Retry-After and slow down

763     Alternatively, given the same context where the previous example
764     starts, we can convey the same information to the client via the
765     Retry-After header, with the advantage that the server can now
766     specify the policy's nominal limit and window that will apply after
767     the reset, i.e.
assuming the resource exhaustion is likely to be gone
768     by then, so the advertised policy does not need to be adjusted, yet
769     we managed to stop requests for a while and slow down the rest of the
770     current window.

772     Request:

774     GET /items/123

776     Response:

778     HTTP/1.1 429 Too Many Requests
779     Content-Type: application/json
780     Retry-After: 20
781     RateLimit-Limit: 15, 100;w=60
782     RateLimit-Remaining: 15
783     RateLimit-Reset: 40

785     {
786       "status": 429,
787       "detail": "Wait 20 seconds, then slow down!"
788     }

790     Note that in this last response the client is expected to honor the
791     "Retry-After" header and perform no requests for the specified amount
792     of time, whereas the previous example would not force the client to
793     stop requests before the reset time has elapsed, as it would still be
794     free to query the server again even if it is likely to have the
795     request rejected.

797  8.3.1.  Missing Remaining information

799     The server does not expose "RateLimit-Remaining" values, but resets
800     the limit counter every second.

802     It communicates to the client the limit of 10 quota-units per second
803     by always returning the pair "RateLimit-Limit" and "RateLimit-Reset".

805     Request:

807     GET /items/123

808     Response:

810     HTTP/1.1 200 OK
811     Content-Type: application/json
812     RateLimit-Limit: 10
813     RateLimit-Reset: 1

815     {"first": "request"}

817     Request:

819     GET /items/123

821     Response:

823     HTTP/1.1 200 OK
824     Content-Type: application/json
825     RateLimit-Limit: 10
826     RateLimit-Reset: 1

828     {"second": "request"}

830  8.3.2.  Use with multiple windows

832     This is a standardized way of describing the policy detailed in
833     Section 8.1.2:

835     *  5000 daily quota-units;

837     *  1000 hourly quota-units.

839     The client consumed 4900 quota-units in the first 14 hours.

841     Despite the next hourly limit of 1000 quota-units, the closest limit
842     to reach is the daily one.
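
The choice of the daily window in this example follows the rule in Section 4 that the server reports the window with the lowest remaining quota. A minimal Python sketch of that selection is given below. The `Window` class and the `ratelimit_fields` helper are hypothetical names introduced for illustration, and the hourly usage (300) and hourly reset (1800) are assumed values that the example does not state.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Window:
    limit: int    # request-quota for this window, in quota-units
    seconds: int  # time-window length, sent as the "w" parameter
    used: int     # quota-units consumed so far in this window
    reset: int    # seconds until this window resets

    @property
    def remaining(self) -> int:
        return max(0, self.limit - self.used)


def ratelimit_fields(windows: List[Window]) -> Dict[str, str]:
    """Build RateLimit fields from the window with the lowest remaining quota."""
    expiring = min(windows, key=lambda w: w.remaining)
    # List every policy as an informative quota-policy, shortest window first.
    policies = ", ".join(
        f"{w.limit};w={w.seconds}"
        for w in sorted(windows, key=lambda w: w.seconds)
    )
    return {
        "RateLimit-Limit": f"{expiring.limit}, {policies}",
        "RateLimit-Remaining": str(expiring.remaining),
        "RateLimit-Reset": str(expiring.reset),
    }


# 4900 of 5000 daily quota-units used after 14 hours (from the example);
# the hourly figures are assumptions for illustration only.
fields = ratelimit_fields([
    Window(limit=1000, seconds=3600, used=300, reset=1800),
    Window(limit=5000, seconds=86400, used=4900, reset=36000),
])
```

Under these assumptions the daily window has 100 quota-units left against 700 for the hourly one, so the daily window supplies the "expiring-limit", the remaining quota and the reset value.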
844     The server then exposes the "RateLimit" headers to inform the client
845     that:

847     *  it has only 100 quota-units left;

849     *  the window will reset in 10 hours;

851     *  the "expiring-limit" is 5000.

853     Request:

855     GET /items/123

856     Response:

858     HTTP/1.1 200 OK
859     Content-Type: application/json
860     RateLimit-Limit: 5000, 1000;w=3600, 5000;w=86400
861     RateLimit-Remaining: 100
862     RateLimit-Reset: 36000

864     {"hello": "world"}

866  9.  Security Considerations

868  9.1.  Throttling does not prevent clients from issuing requests

870     This specification does not prevent clients from making over-quota
871     requests.

873     Servers should always implement mechanisms to prevent resource
874     exhaustion.

876  9.2.  Information disclosure

878     Servers should not disclose operational capacity information that
879     can be used to saturate their resources.

881     While this specification does not mandate whether non-2xx responses
882     consume quota, if 401 and 403 responses count towards quota a
883     malicious client could probe the endpoint to get traffic information
884     about another user.

886     As intermediaries might retransmit requests and consume quota-units
887     without prior knowledge of the User Agent, RateLimit headers might
888     reveal the existence of an intermediary to the User Agent.

890  9.3.  Remaining quota-units are not granted requests

892     "RateLimit-*" headers convey hints from the server to the clients to
893     help them avoid being throttled out.

895     Clients MUST NOT consider the "quota-units" returned in "RateLimit-
896     Remaining" as a service level agreement.

898     In case of resource saturation, the server MAY artificially lower the
899     returned values or not serve the request anyway.

901  9.4.  Reliability of RateLimit-Reset

903     Consider that "request-quota" may not be restored after the moment
904     referenced by "RateLimit-Reset", and the "RateLimit-Reset" value
905     should not be considered fixed nor constant.
907 Subsequent requests may return a higher "RateLimit-Reset" value to 908 limit concurrency or implement dynamic or adaptive throttling 909 policies. 911 9.5. Resource exhaustion 913 When returning "RateLimit-Reset" you must be aware that many 914 throttled clients may come back at the very moment specified. 916 This is true for "Retry-After" too. 918 For example, if the quota resets every day at "18:00:00" and your 919 server returns the "RateLimit-Reset" accordingly 921 Date: Tue, 15 Nov 1994 08:00:00 GMT 922 RateLimit-Reset: 36000 924 there's a high probability that all clients will show up at 925 "18:00:00". 927 This could be mitigated by adding some jitter to the field-value. 929 9.6. Denial of Service 931 "RateLimit" fields may assume unexpected values by accident or on purpose. 932 For example, an excessively high "RateLimit-Remaining" value may be: 934 * used by a malicious intermediary to trigger a Denial of Service 935 attack or consume client resources by boosting its requests; 937 * passed by a misconfigured server; 939 or a high "RateLimit-Reset" value could prevent clients from contacting 940 the server. 942 Clients MUST validate the received values to mitigate those risks. 944 10. IANA Considerations 945 10.1. RateLimit-Limit Field Registration 947 This section registers the "RateLimit-Limit" field in the "Hypertext 948 Transfer Protocol (HTTP) Field Name Registry" registry ([SEMANTICS]). 950 Field name: "RateLimit-Limit" 952 Status: permanent 954 Specification document(s): Section 3.1 of this document 956 10.2. RateLimit-Remaining Field Registration 958 This section registers the "RateLimit-Remaining" field in the 959 "Hypertext Transfer Protocol (HTTP) Field Name Registry" registry 960 ([SEMANTICS]). 962 Field name: "RateLimit-Remaining" 964 Status: permanent 966 Specification document(s): Section 3.2 of this document 968 10.3.
RateLimit-Reset Field Registration 970 This section registers the "RateLimit-Reset" field in the "Hypertext 971 Transfer Protocol (HTTP) Field Name Registry" registry ([SEMANTICS]). 973 Field name: "RateLimit-Reset" 975 Status: permanent 977 Specification document(s): Section 3.3 of this document 979 11. References 981 11.1. Normative References 983 [CACHING] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, 984 Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching", 985 RFC 7234, DOI 10.17487/RFC7234, June 2014, 986 . 988 [MESSAGING] 989 Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer 990 Protocol (HTTP/1.1): Message Syntax and Routing", 991 RFC 7230, DOI 10.17487/RFC7230, June 2014, 992 . 994 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 995 Requirement Levels", BCP 14, RFC 2119, 996 DOI 10.17487/RFC2119, March 1997, 997 . 999 [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax 1000 Specifications: ABNF", STD 68, RFC 5234, 1001 DOI 10.17487/RFC5234, January 2008, 1002 . 1004 [RFC6454] Barth, A., "The Web Origin Concept", RFC 6454, 1005 DOI 10.17487/RFC6454, December 2011, 1006 . 1008 [RFC7405] Kyzivat, P., "Case-Sensitive String Support in ABNF", 1009 RFC 7405, DOI 10.17487/RFC7405, December 2014, 1010 . 1012 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 1013 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 1014 May 2017, . 1016 [SEMANTICS] 1017 Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer 1018 Protocol (HTTP/1.1): Semantics and Content", RFC 7231, 1019 DOI 10.17487/RFC7231, June 2014, 1020 . 1022 [UNIX] The Open Group, ., "The Single UNIX Specification, Version 1023 2 - 6 Vol Set for UNIX 98", February 1997. 1025 11.2. Informative References 1027 [RFC3339] Klyne, G. and C. Newman, "Date and Time on the Internet: 1028 Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002, 1029 . 1031 [RFC6585] Nottingham, M. and R. 
Fielding, "Additional HTTP Status 1032 Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012, 1033 . 1035 [RFC7234] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, 1036 Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching", 1037 RFC 7234, DOI 10.17487/RFC7234, June 2014, 1038 . 1040 Appendix A. Change Log 1042 RFC EDITOR PLEASE DELETE THIS SECTION. 1044 Appendix B. Acknowledgements 1046 Thanks to Willi Schoenborn, Alejandro Martinez Ruiz, Alessandro 1047 Ranellucci, Amos Jeffries, Martin Thomson, Erik Wilde and Mark 1048 Nottingham for being the initial contributors of these 1049 specifications. Kudos to the first community implementors: Aapo 1050 Talvensaari, Nathan Friedly and Sanyam Dogra. 1052 Appendix C. RateLimit headers currently used on the web 1054 RFC EDITOR PLEASE DELETE THIS SECTION. 1056 Commonly used header field names are: 1058 * "X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset"; 1060 * "X-Rate-Limit-Limit", "X-Rate-Limit-Remaining", "X-Rate-Limit-Reset". 1061 1063 There are variants too, where the window is specified in the header 1064 field name, e.g.: 1066 * "x-ratelimit-limit-minute", "x-ratelimit-limit-hour", "x-ratelimit-limit-day" 1067 1069 * "x-ratelimit-remaining-minute", "x-ratelimit-remaining-hour", "x-ratelimit-remaining-day" 1070 1072 Here are some interoperability issues: 1074 * "X-RateLimit-Reset" references different values, depending on 1075 the implementation: 1077 - seconds remaining to the window expiration 1079 - milliseconds remaining to the window expiration 1081 - seconds since UTC, in UNIX Timestamp 1083 - a datetime, either "IMF-fixdate" [SEMANTICS] or [RFC3339] 1085 * different headers, with the same semantics, are used by different 1086 implementers: 1088 - X-RateLimit-Limit and X-Rate-Limit-Limit 1090 - X-RateLimit-Remaining and X-Rate-Limit-Remaining 1092 - X-RateLimit-Reset and X-Rate-Limit-Reset 1094 The semantics of RateLimit-Remaining depend on the windowing 1095 algorithm.
A sliding window policy, for example, may result in a 1096 RateLimit-Remaining value related to the ratio between the current 1097 and the maximum throughput. E.g.: 1099 RateLimit-Limit: 12, 12;w=1 1100 RateLimit-Remaining: 6 ; using 50% of throughput, that is 6 units/s 1101 RateLimit-Reset: 1 1103 If this is the case, the optimal solution is to achieve 1105 RateLimit-Limit: 12, 12;w=1 1106 RateLimit-Remaining: 1 ; using 100% of throughput, that is 12 units/s 1107 RateLimit-Reset: 1 1109 At this point you should stop increasing your request rate. 1111 Appendix D. FAQ 1113 1. Why define standard headers for throttling? 1115 To simplify enforcement of throttling policies. 1117 2. Can I use RateLimit-* in throttled responses (e.g. with status code 1118 429)? 1120 Yes, you can. 1122 3. Are those specs tied to RFC 6585? 1124 No. [RFC6585] defines the "429" status code and we use it just 1125 as an example of a throttled request, which could instead use 1126 403 or any other status code. 1128 4. Why not pass the throttling scope as a parameter? 1130 After a discussion on a similar thread 1131 (https://github.com/httpwg/http-core/pull/317#issuecomment-585868767) 1132 we will probably add a new "RateLimit-Scope" header to 1133 this spec. 1135 I'm open to suggestions: comment on this issue 1136 (https://github.com/ioggstream/draft-polli-ratelimit-headers/issues/70) 1137 1139 5. Why use delta-seconds instead of a UNIX Timestamp? Why not 1140 use subsecond precision? 1142 Using delta-seconds aligns with "Retry-After", which is returned 1143 in similar contexts, e.g. on 429 responses. 1145 delta-seconds as defined in [CACHING] section 1.2.1 clarifies 1146 some parsing rules too. 1148 Timestamps require a clock synchronization protocol (see 1149 [SEMANTICS] section 4.1.1.1). This may be problematic (e.g. clock 1150 adjustment, clock skew, failure of hardcoded clock 1151 synchronization servers, IoT devices, ..).
Moreover, timestamps 1152 may not be monotonically increasing due to clock adjustment. See 1153 Another NTP client failure story 1154 (https://community.ntppool.org/t/another-ntp-client-failure-story/1014/) 1155 1157 We did not use subsecond precision because: 1159 * that is more subject to system clock correction like the one 1160 implemented via the adjtimex() Linux system call; 1162 * response-time latency may not make it worthwhile. A brief 1163 discussion on the subject is on the httpwg ml 1164 (https://lists.w3.org/Archives/Public/ietf-http-wg/2019JulSep/0202.html) 1165 1167 * almost all rate-limit header implementations do not use it. 1169 6. Why not support multiple remaining quotas? 1171 While this might be of some value, my experience suggests that 1172 overly-complex quota implementations result in lower 1173 effectiveness of this policy. This spec allows the client to 1174 easily focus on RateLimit-Remaining and RateLimit-Reset. 1176 7. Shouldn't I limit concurrency instead of request rate? 1178 You can use this specification to limit concurrency at the HTTP 1179 level (see {#use-for-limiting-concurrency}) and help clients 1180 shape their requests to avoid being throttled out. 1182 A problematic way to limit concurrency is connection dropping, 1183 especially when connections are multiplexed (e.g. HTTP/2) because 1184 this results in unserviced client requests, which is something we 1185 want to avoid. 1187 A semantic way to limit concurrency is to return 503 + Retry-After 1188 in case of resource saturation (e.g. thrashing, connection 1189 queues too long, Service Level Objectives not met, ..). 1190 Saturation conditions can be either dynamic or static: all this 1191 is out of scope for this document. 1193 8. Does a positive value of "RateLimit-Remaining" imply any service 1194 guarantee for my future requests to be served? 1196 No.
The returned values were used to decide whether or not to serve 1197 _the current request_ and do not imply any guarantee that 1198 future requests will be successful. 1200 Instead they help to understand when future requests will 1201 probably be throttled. A low value for "RateLimit-Remaining" 1202 should be interpreted as a yellow traffic-light for either the 1203 number of requests issued in the "time-window" or the request 1204 throughput. 1206 9. Is the quota-policy definition in Section 2.3 too complex? 1208 You can always return the simplest form of the 3 headers 1210 RateLimit-Limit: 100 1211 RateLimit-Remaining: 50 1212 RateLimit-Reset: 60 1214 The key runtime value is the first element of the list: "expiring-limit"; 1215 the other "quota-policy" items are informative. So for the 1216 following header: 1218 RateLimit-Limit: 100, 100;w=60;burst=1000;comment="sliding window", 5000;w=3600;burst=0;comment="fixed window" 1220 the key value is the one referencing the lowest limit: "100" 1222 10. Can we use shorter names? Why not put everything in one 1223 header? 1225 The most common syntax we found on the web is "X-RateLimit-*" and 1226 when starting this I-D we opted for it 1227 (https://github.com/ioggstream/draft-polli-ratelimit-headers/issues/34#issuecomment-519366481) 1228 1229 The basic form of those headers is easily parseable, even by 1230 implementers processing responses with technologies such as dynamic 1231 interpreters with limited syntax. 1233 Using a single header complicates parsing and takes a significantly 1234 different approach from the existing ones: this can limit adoption. 1236 11. Why not mention connections? 1238 Beware of the term "connection": - it is just 1239 _one_ possible saturation cause. Once you go down that path 1240 you will expose other infrastructural details (bandwidth, CPU, ..
1241 see Section 9.2) and complicate client compliance; 1242 - it is an infrastructural detail defined in terms of 1243 server and network rather than the consumed service. 1244 This specification protects the services first, and then the 1245 infrastructures through client cooperation (see Section 9.1). 1246 RateLimit headers enable sending _on the same 1247 connection_ different limit values on each response, 1248 depending on the policy scope (e.g. per-user, per-custom-key, ..) 1249 1251 12. Can intermediaries alter RateLimit fields? 1253 Generally, they should not because it might result in unserviced 1254 requests. There are reasonable use cases for intermediaries 1255 mangling RateLimit fields though, e.g. when they enforce stricter 1256 quota-policies, or when they are an active component of the 1257 service. In those cases we will consider them as part of the 1258 originating infrastructure. 1260 Authors' Addresses 1262 Roberto Polli 1263 Team Digitale, Italian Government 1265 Email: robipolli@gmail.com 1267 Alejandro Martinez Ruiz 1268 Red Hat 1270 Email: amr@redhat.com