idnits 2.17.1

draft-polli-ratelimit-headers-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 6 instances of too long lines in the document, the longest one
     being 41 characters in excess of 72.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document date (26 May 2020) is 1431 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'UNIX' is defined on line 986, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 7230 (Obsoleted by RFC 9110, RFC 9112)

  ** Obsolete normative reference: RFC 7231 (Obsoleted by RFC 9110)

  ** Obsolete normative reference: RFC 7234 (Obsoleted by RFC 9111)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'UNIX'

     Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2     HTTP                                                          R. Polli
3     Internet-Draft                       Team Digitale, Italian Government
4     Intended status: Standards Track                           A. Martinez
5     Expires: 27 November 2020                                      Red Hat
6                                                                26 May 2020

8                       RateLimit Header Fields for HTTP
9                       draft-polli-ratelimit-headers-03

11    Abstract

13       This document defines the RateLimit-Limit, RateLimit-Remaining,
14       RateLimit-Reset header fields for HTTP, thus allowing servers to
15       publish current request quotas and clients to shape their request
16       policy and avoid being throttled out.

18    Note to Readers

20       _RFC EDITOR: please remove this section before publication_

22       Discussion of this draft takes place on the HTTP working group
23       mailing list (ietf-http-wg@w3.org), which is archived at
24       https://lists.w3.org/Archives/Public/ietf-http-wg/
25       (https://lists.w3.org/Archives/Public/ietf-http-wg/).

27       The source code and issues list for this draft can be found at
28       https://github.com/ioggstream/draft-polli-ratelimit-headers
29       (https://github.com/ioggstream/draft-polli-ratelimit-headers).

31    Status of This Memo

33       This Internet-Draft is submitted in full conformance with the
34       provisions of BCP 78 and BCP 79.

36       Internet-Drafts are working documents of the Internet Engineering
37       Task Force (IETF). Note that other groups may also distribute
38       working documents as Internet-Drafts. The list of current
39       Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

41       Internet-Drafts are draft documents valid for a maximum of six months
42       and may be updated, replaced, or obsoleted by other documents at any
43       time. It is inappropriate to use Internet-Drafts as reference
44       material or to cite them other than as "work in progress."

46       This Internet-Draft will expire on 27 November 2020.
48    Copyright Notice

50       Copyright (c) 2020 IETF Trust and the persons identified as the
51       document authors. All rights reserved.

53       This document is subject to BCP 78 and the IETF Trust's Legal
54       Provisions Relating to IETF Documents
55       (https://trustee.ietf.org/license-info) in effect on the date of
56       publication of this document. Please review these documents
57       carefully, as they describe your rights and restrictions with
58       respect to this document. Code Components extracted from this
59       document must include Simplified BSD License text as described in
60       Section 4.e of the Trust Legal Provisions and are provided without
         warranty as described in the Simplified BSD License.

62    Table of Contents

64       1.  Introduction
65         1.1.  Rate-limiting and quotas
66         1.2.  Current landscape of rate-limiting headers
67           1.2.1.  Interoperability issues
68         1.3.  This proposal
69         1.4.  Goals
70         1.5.  Notational Conventions
71       2.  Expressing rate-limit policies
72         2.1.  Time window
73         2.2.  Request quota
74         2.3.  Quota policy
75       3.  Header Specifications
76         3.1.  RateLimit-Limit
77         3.2.  RateLimit-Remaining
78         3.3.  RateLimit-Reset
79       4.  Providing RateLimit headers
80       5.  Receiving RateLimit headers
81       6.  Examples
82         6.1.  Unparameterized responses
83           6.1.1.  Throttling information in responses
84           6.1.2.  Use in conjunction with custom headers
85           6.1.3.  Use for limiting concurrency
86           6.1.4.  Use in throttled responses
87         6.2.  Parameterized responses
88           6.2.1.  Throttling window specified via parameter
89           6.2.2.  Dynamic limits with parameterized windows
90           6.2.3.  Dynamic limits for pushing back and slowing down
91         6.3.  Dynamic limits for pushing back with Retry-After and slow
92               down
93           6.3.1.  Missing Remaining information
94           6.3.2.  Use with multiple windows
95       7.  Security Considerations
96         7.1.  Throttling does not prevent clients from issuing
97               requests
98         7.2.  Information disclosure
99         7.3.  Remaining quota-units are not granted requests
100        7.4.  Reliability of RateLimit-Reset
101        7.5.  Resource exhaustion
102        7.6.  Denial of Service
103      8.  IANA Considerations
104        8.1.  RateLimit-Limit Header Field Registration
105        8.2.  RateLimit-Remaining Header Field Registration
106        8.3.  RateLimit-Reset Header Field Registration
107      9.  References
108        9.1.  Normative References
109        9.2.  Informative References
110      Appendix A.  Change Log
111      Appendix B.  Acknowledgements
112      Appendix C.  RateLimit headers currently used on the web
113      Appendix D.  FAQ
114      Authors' Addresses

116   1.  Introduction

118      The widespread use of HTTP as a distributed computation protocol
119      requires an explicit way of communicating service status and usage
120      quotas.

122      This was partially addressed with the "Retry-After" header field
123      defined in [RFC7231] to be returned in "429 Too Many Requests" or
124      "503 Service Unavailable" responses.

126      Still, there is no standard way to communicate service quotas so
127      that the client can throttle its requests and prevent 4xx or 5xx
128      responses.

130   1.1.  Rate-limiting and quotas

132      Servers use quota mechanisms to avoid system overload, to ensure an
133      equitable distribution of computational resources or to enforce other
134      policies (e.g. monetization).

136      A basic quota mechanism limits the number of acceptable requests in a
137      given time window, e.g. 10 requests per second.

139      When the quota is exceeded, servers usually do not serve the request,
140      replying instead with a "4xx" HTTP status code (e.g. 429 or 403), or
141      adopt more aggressive policies such as dropping connections.

143      Quotas may be enforced on different bases (e.g. per user, per IP, per
144      geographic area) and at different levels. For example, a user
145      may be allowed to issue:

147      *  10 requests per second;

149      *  limited to 60 requests per minute;

151      *  limited to 1000 requests per hour.

153      Moreover, system metrics, statistics and heuristics can be used to
154      implement more complex policies, where the number of acceptable
155      requests and the time window are computed dynamically.

157   1.2.  Current landscape of rate-limiting headers

159      To help clients throttle their requests, servers may expose the
160      counters used to evaluate quota policies via HTTP header fields.
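As a non-normative illustration of such counters, the Python sketch below models a basic fixed-window quota like the one described in Section 1.1 and returns the three values a server might expose. The class name, the dictionary keys and the choice of a fixed-window algorithm are assumptions made for this example only; this document does not mandate any of them.

```python
import math
from dataclasses import dataclass


@dataclass
class FixedWindowCounter:
    """Hypothetical counter allowing `limit` requests per `window` seconds."""

    limit: int                  # maximum requests accepted in one window
    window: int                 # window length, in seconds
    window_start: float = 0.0   # time at which the current window began
    used: int = 0               # requests already counted in this window

    def hit(self, now: float) -> dict:
        """Count one request at time `now` and return the exposed counters."""
        if now - self.window_start >= self.window:
            # The previous window expired: open a new one at this request.
            self.window_start = now
            self.used = 0
        allowed = self.used < self.limit
        if allowed:
            self.used += 1
        # Whole seconds left in the window (no sub-second precision).
        seconds_left = math.ceil(self.window - (now - self.window_start))
        return {
            "allowed": allowed,
            "limit": self.limit,
            "remaining": self.limit - self.used,
            "reset": seconds_left,
        }


counter = FixedWindowCounter(limit=2, window=10)
first = counter.hit(0.0)    # accepted, 1 unit remaining, 10 seconds to reset
second = counter.hit(3.5)   # accepted, 0 units remaining, 7 seconds to reset
third = counter.hit(4.0)    # over quota, a server would typically throttle
```

A real deployment would keep one such counter per throttling basis (user, IP address, and so on) and could combine several counters with different windows.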
162      Those response headers may be added by HTTP intermediaries such as
163      API gateways and reverse proxies.

165      On the web we can find many different rate-limit headers, usually
166      containing the number of allowed requests in a given time window, and
167      when the window is reset.

169      The common choice is to return three headers containing:

171      *  the maximum number of allowed requests in the time window;

173      *  the number of remaining requests in the current window;

175      *  the time remaining in the current window expressed in seconds or
176         as a timestamp.

178   1.2.1.  Interoperability issues

180      A major interoperability issue in throttling is the lack of standard
181      headers, because:

183      *  each implementation associates different semantics with the same
184         header field names;

186      *  header field names proliferate.

188      Client applications interfacing with different servers may thus need
189      to process different headers, or the very same application interface
190      that sits behind different reverse proxies may reply with different
191      throttling headers.

193   1.3.  This proposal

195      This proposal defines syntax and semantics for the following header
196      fields:

198      *  "RateLimit-Limit": containing the requests quota in the time
199         window;

201      *  "RateLimit-Remaining": containing the remaining requests quota in
202         the current window;

204      *  "RateLimit-Reset": containing the time remaining in the current
205         window, specified in seconds.

207      The behavior of "RateLimit-Reset" is compatible with the
208      "delta-seconds" notation of "Retry-After".

210      The header field definitions allow describing complex policies,
211      including the ones using multiple and variable time windows and
212      dynamic quotas, or implementing concurrency limits.

214   1.4.  Goals

216      The goals of this proposal are:

218      1.  Standardize the names and semantics of rate-limit headers;

220      2.  Improve the resiliency of HTTP infrastructures by simplifying
221          the enforcement and the adoption of rate-limit headers;

223      3.  Simplify API documentation by removing the need to spell out
224          rate-limit header field semantics in it.

226      The goals do not include:

228      Authorization:  The rate-limit headers described here are not meant
229         to support authorization or other kinds of access controls.

231      Throttling scope:  This specification does not cover the throttling
232         scope, which may be the given resource-target, its parent path or
233         the whole Origin [RFC6454] section 7.

235      Response status code:  The rate-limit headers may be returned in both
236         Successful and non-Successful responses. This specification does
237         not cover whether non-Successful responses count against quota
            usage.

239      Throttling policy:  This specification does not mandate a specific
240         throttling policy. The values published in the headers, including
241         the window size, can be statically or dynamically evaluated.

243      Service Level Agreement:  Conveyed quota hints do not imply any
244         service guarantee. A server is free to throttle respectful clients
245         under certain circumstances.

247   1.5.  Notational Conventions

249      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
250      "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
251      "OPTIONAL" in this document are to be interpreted as described in BCP
252      14 ([RFC2119] and [RFC8174]) when, and only when, they appear in all
253      capitals, as shown here.

255      This document uses the Augmented BNF defined in [RFC5234] and updated
256      by [RFC7405] along with the "#rule" extension defined in Section 7 of
257      [RFC7230].

259      The term Origin is to be interpreted as described in [RFC6454]
260      section 7.

262      The "delta-seconds" rule is defined in [RFC7234] section 1.2.1.

264   2.  Expressing rate-limit policies

266   2.1.  Time window

268      Rate limit policies limit the number of acceptable requests in a
269      given time window.
271      A time window is expressed in seconds, using the following syntax:

273      time-window = delta-seconds

275      Subsecond precision is not supported.

277   2.2.  Request quota

279      The request-quota is a value associated with the maximum number of
280      requests that the server is willing to accept from one or more
281      clients on a given basis (originating IP, authenticated user,
282      geographic area, etc.) during a "time-window" as defined in
         Section 2.1.

284      The "request-quota" is expressed in "quota-units" and has the
285      following syntax:

287      request-quota = quota-units
288      quota-units = 1*DIGIT

290      The "request-quota" SHOULD match the maximum number of acceptable
291      requests.

293      The "request-quota" MAY differ from the total number of acceptable
294      requests when weight mechanisms, bursts, or other server policies are
295      implemented.

297      If the "request-quota" does not match the maximum number of
298      acceptable requests, the relation between the two SHOULD be
299      communicated out-of-band.

301      Example: A server could

303      *  count once requests like "/books/{id}"

305      *  count twice search requests like "/books?author=Camilleri"

307      so that we have the following counters

309      GET /books/123                  ; request-quota=4, remaining: 3, status=200
310      GET /books?author=Camilleri     ; request-quota=4, remaining: 1, status=200
311      GET /books?author=Eco           ; request-quota=4, remaining: 0, status=429

313   2.3.  Quota policy

315      This specification allows describing a quota policy with the
316      following syntax:

318      quota-policy = request-quota OWS ";" OWS "w" "=" time-window
319                     *( OWS ";" OWS quota-comment)
320      quota-comment = token "=" (token / quoted-string)

322      quota-policy parameters like "w" and quota-comment tokens MUST NOT
323      occur multiple times within the same quota-policy.

325      An example policy of 100 quota-units per minute:

327      100;w=60

329      Two examples of providing further details via custom parameters in
330      "quota-comments":

332      100;w=60;comment="fixed window"
333      12;w=1;burst=1000;policy="leaky bucket"

335   3.  Header Specifications

337      The following "RateLimit" response header fields are defined.

339   3.1.  RateLimit-Limit

341      The "RateLimit-Limit" response header field indicates the
342      "request-quota" associated with the client in the current
         "time-window".

344      If the client exceeds that limit, the server MAY decline to serve it.

346      The header value is

348      RateLimit-Limit = expiring-limit [ OWS "," OWS 1#quota-policy ]
349      expiring-limit = request-quota

351      The "expiring-limit" value MUST be set to the "request-quota" that is
352      closest to reaching its limit.

354      The "quota-policy" is defined in Section 2.3, and its values are
355      informative.

357      RateLimit-Limit: 100

359      A "time-window" associated with "expiring-limit" can be communicated
360      via an optional "quota-policy" value, as shown in the following
361      example:

363      RateLimit-Limit: 100, 100;w=10

365      If the "expiring-limit" is not associated with a "time-window", the
366      "time-window" MUST either be:

368      *  inferred from the value of "RateLimit-Reset" at the moment of the
369         reset, or

371      *  communicated out-of-band (e.g. in the documentation).

373      Policies using multiple quota limits MAY be returned using multiple
374      "quota-policy" items, as shown in the following two examples:

376      RateLimit-Limit: 10, 10;w=1, 50;w=60, 1000;w=3600, 5000;w=86400
377      RateLimit-Limit: 10, 10;w=1;burst=1000, 1000;w=3600

379      This header MUST NOT occur multiple times.

381   3.2.  RateLimit-Remaining

383      The "RateLimit-Remaining" response header field indicates the
384      remaining "quota-units" defined in Section 2.2 associated with the
385      client.

387      The header value is

389      RateLimit-Remaining = quota-units

391      This header MUST NOT occur multiple times.

393      Clients MUST NOT assume that a positive "RateLimit-Remaining" value
394      is a guarantee of being served.

396      A low "RateLimit-Remaining" value is like a yellow traffic-light: the
397      red light may arrive suddenly.

399      One example of "RateLimit-Remaining" use is below.

401      RateLimit-Remaining: 50

403   3.3.  RateLimit-Reset

405      The "RateLimit-Reset" response header field indicates the
407      number of seconds until the quota resets.

409      The header value is

411      RateLimit-Reset = delta-seconds

413      The delta-seconds format is used because:

415      *  it does not rely on clock synchronization and is resilient to
416         clock adjustment and clock skew between client and server (see
417         [RFC7231] Section 4.1.1.1);

419      *  it mitigates the risk related to thundering herd when too many
420         clients are serviced with the same timestamp.

422      This header MUST NOT occur multiple times.

424      An example of "RateLimit-Reset" use is below.

426      RateLimit-Reset: 50

428      The client MUST NOT assume that all its "request-quota" will be
429      restored after the moment referenced by "RateLimit-Reset". The
430      server MAY arbitrarily alter the "RateLimit-Reset" value between
431      subsequent requests, e.g. in case of resource saturation or to
432      implement sliding window policies.

434   4.  Providing RateLimit headers

436      A server MAY use one or more "RateLimit" response header fields
437      defined in this document to communicate its quota policies.

439      The returned values refer to the metrics used to evaluate if the
440      current request respects the quota policy and might not apply to
441      subsequent requests.

443      Example: a successful response with the following header fields

445      RateLimit-Limit: 10
446      RateLimit-Remaining: 1
447      RateLimit-Reset: 7

449      does not guarantee that the next request will be successful. Server
450      metrics may be subject to other conditions like the one shown in the
451      example from Section 2.2.

453      A server MAY return "RateLimit" response header fields independently
454      of the response status code. This includes throttled responses.

456      If a response contains both the "Retry-After" and the
457      "RateLimit-Reset" header fields, the value of "RateLimit-Reset"
458      SHOULD reference the same point in time as "Retry-After".
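One way to keep the two fields aligned is to derive both from a single reset instant. The non-normative Python sketch below illustrates this for a throttled response; the function name, its parameters and the use of a dictionary for the header fields are assumptions of this example, not part of this specification.

```python
import math
from datetime import datetime, timezone
from email.utils import format_datetime


def throttled_headers(now: datetime, reset_at: datetime, limit: int) -> dict:
    """Build header fields for a throttled response (hypothetical helper).

    Both time-related fields are computed from the same `reset_at` instant,
    so that they reference the same point in time.
    """
    # delta-seconds has no sub-second precision: round up, never below zero.
    delta = max(0, math.ceil((reset_at - now).total_seconds()))
    return {
        "Date": format_datetime(now, usegmt=True),
        # Retry-After accepts either an HTTP-date or delta-seconds.
        "Retry-After": format_datetime(reset_at, usegmt=True),
        "RateLimit-Limit": str(limit),
        "RateLimit-Remaining": "0",
        "RateLimit-Reset": str(delta),
    }


now = datetime(2019, 8, 5, 9, 27, 0, tzinfo=timezone.utc)
reset_at = datetime(2019, 8, 5, 9, 27, 5, tzinfo=timezone.utc)
headers = throttled_headers(now, reset_at, limit=100)
# headers["Retry-After"] is "Mon, 05 Aug 2019 09:27:05 GMT"
# headers["RateLimit-Reset"] is "5"
```

The datetimes must be timezone-aware and in UTC, since `format_datetime` only emits the "GMT" form for such values.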
460      When using a policy involving more than one "time-window", the server
461      MUST reply with the "RateLimit" headers related to the window with
462      the lowest "RateLimit-Remaining" value.

464      Under certain conditions, a server MAY artificially lower "RateLimit"
465      field values between subsequent requests, e.g. to respond to Denial of
466      Service attacks or in case of resource saturation.

468   5.  Receiving RateLimit headers

470      A client MUST process the received "RateLimit" headers.

472      A client MUST validate the values received in the "RateLimit" headers
473      before using them and check if there are significant discrepancies
474      with the expected ones. This includes a "RateLimit-Reset" moment too
475      far in the future or a "request-quota" too high.

477      Malformed "RateLimit" headers MAY be ignored.

479      A client SHOULD NOT exceed the "quota-units" expressed in
480      "RateLimit-Remaining" before the "time-window" expressed in
         "RateLimit-Reset".

482      A client MAY still probe the server if the "RateLimit-Reset" is
483      considered too high.

485      The value of "RateLimit-Reset" is generated at response time: a
486      client aware of a significant network latency MAY behave accordingly
487      and use other information (e.g. the "Date" response header, or
488      otherwise gathered metrics) to better estimate the "RateLimit-Reset"
489      moment intended by the server.

491      The "quota-policy" values and comments provided in "RateLimit-Limit"
492      are informative and MAY be ignored.

494      If a response contains both the "RateLimit-Reset" and "Retry-After"
495      header fields, the "Retry-After" header field MUST take precedence
496      and the "RateLimit-Reset" header field MAY be ignored.

498   6.  Examples

500   6.1.  Unparameterized responses

502   6.1.1.  Throttling information in responses

504      The client exhausted its request-quota for the next 50 seconds. The
505      "time-window" is communicated out-of-band or inferred from the header
506      values.
508      Request:

510      GET /items/123

512      Response:

514      HTTP/1.1 200 OK
515      Content-Type: application/json
516      RateLimit-Limit: 100
517      RateLimit-Remaining: 0
518      RateLimit-Reset: 50

520      {"hello": "world"}

522   6.1.2.  Use in conjunction with custom headers

524      The server uses two custom headers, namely "acme-RateLimit-DayLimit"
525      and "acme-RateLimit-HourLimit", to expose the following policy:

527      *  5000 daily quota-units;

529      *  1000 hourly quota-units.

531      The client consumed 4900 quota-units in the first 14 hours.

533      Despite the next hourly limit of 1000 quota-units, the closest limit
534      to reach is the daily one.

536      The server then exposes the "RateLimit-*" headers to inform the
537      client that:

539      *  it has only 100 quota-units left;

541      *  the window will reset in 10 hours.

543      Request:

545      GET /items/123

547      Response:

549      HTTP/1.1 200 OK
550      Content-Type: application/json
551      acme-RateLimit-DayLimit: 5000
552      acme-RateLimit-HourLimit: 1000
553      RateLimit-Limit: 5000
554      RateLimit-Remaining: 100
555      RateLimit-Reset: 36000

557      {"hello": "world"}

559   6.1.3.  Use for limiting concurrency

561      Throttling headers may be used to limit concurrency, advertising
562      limits that are lower than the usual ones in case of saturation, thus
563      increasing availability.

565      The server adopted a basic policy of 100 quota-units per minute, and
566      in case of resource exhaustion adapts the returned values, reducing
567      both "RateLimit-Limit" and "RateLimit-Remaining".

569      After 2 seconds the client consumed 40 quota-units.

570      Request:

572      GET /items/123

574      Response:

576      HTTP/1.1 200 OK
577      Content-Type: application/json
578      RateLimit-Limit: 100
579      RateLimit-Remaining: 60
580      RateLimit-Reset: 58

582      {"elapsed": 2, "issued": 40}

584      At the subsequent request, due to resource exhaustion, the server
585      advertises only "RateLimit-Remaining: 20".
587      Request:

589      GET /items/123

591      Response:

593      HTTP/1.1 200 OK
594      Content-Type: application/json
595      RateLimit-Limit: 100
596      RateLimit-Remaining: 20
597      RateLimit-Reset: 56

599      {"elapsed": 4, "issued": 41}

601   6.1.4.  Use in throttled responses

603      A client exhausted its quota and the server throttles the request
604      sending the "Retry-After" response header field.

606      In this example, the values of "Retry-After" and "RateLimit-Reset"
607      reference the same moment, but this is not a requirement.

609      The "429 Too Many Requests" HTTP status code is just used as an
610      example.

612      Request:

614      GET /items/123

616      Response:

618      HTTP/1.1 429 Too Many Requests
619      Content-Type: application/json
620      Date: Mon, 05 Aug 2019 09:27:00 GMT
621      Retry-After: Mon, 05 Aug 2019 09:27:05 GMT
622      RateLimit-Reset: 5
623      RateLimit-Limit: 100
624      RateLimit-Remaining: 0

626      {
627        "title": "Too Many Requests",
628        "status": 429,
629        "detail": "You have exceeded your quota"
630      }

632   6.2.  Parameterized responses

634   6.2.1.  Throttling window specified via parameter

636      The client has 99 "quota-units" left for the next 50 seconds. The
637      "time-window" is communicated by the "w" parameter, so we know the
638      throughput is 100 "quota-units" per minute.

640      Request:

642      GET /items/123

644      Response:

646      HTTP/1.1 200 OK
647      Content-Type: application/json
648      RateLimit-Limit: 100, 100;w=60
649      RateLimit-Remaining: 99
650      RateLimit-Reset: 50

652      {"hello": "world"}

654   6.2.2.  Dynamic limits with parameterized windows

656      The policy conveyed by "RateLimit-Limit" states that the server
657      accepts 100 quota-units per minute.

659      To avoid resource exhaustion, the server artificially lowers the
660      actual limits returned in the throttling headers.

662      The "RateLimit-Remaining" then advertises only 9 quota-units for the
663      next 50 seconds to slow down the client.
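On the receiving side, a parameterized "RateLimit-Limit" value such as the ones used in these examples can be split into its "expiring-limit" and its informative "quota-policy" items. The Python sketch below is a simplified, non-normative illustration: the function name and the returned structure are assumptions of this example, and it assumes that quoted-string parameter values contain neither "," nor ";".

```python
def parse_ratelimit_limit(value: str):
    """Split a RateLimit-Limit field value (hypothetical helper).

    Returns (expiring_limit, policies). Each policy is a dict holding the
    "quota", the mandatory "w" window in seconds, and any extra parameters
    as strings.

    Simplification: quoted-string values must not contain ',' or ';'.
    Raises ValueError or KeyError on malformed input, which a caller is
    free to treat as "ignore this header".
    """
    items = [item.strip() for item in value.split(",")]
    expiring_limit = int(items[0])
    policies = []
    for item in items[1:]:
        quota, *params = [part.strip() for part in item.split(";")]
        policy = {"quota": int(quota)}
        for param in params:
            name, _, raw = param.partition("=")
            policy[name.strip()] = raw.strip().strip('"')
        policy["w"] = int(policy["w"])  # "w" is required in a quota-policy
        policies.append(policy)
    return expiring_limit, policies


limit, policies = parse_ratelimit_limit("10, 100;w=60")
# limit is 10 and policies is [{"quota": 100, "w": 60}]
```

Since the policy items are informative, a client can also choose to read only the first item and skip the rest.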
665      Note that the server could also have lowered the other values in
666      "RateLimit-Limit": this specification does not mandate any relation
667      between the field values contained in subsequent responses.

669      Request:

671      GET /items/123

673      Response:

675      HTTP/1.1 200 OK
676      Content-Type: application/json
677      RateLimit-Limit: 10, 100;w=60
678      RateLimit-Remaining: 9
679      RateLimit-Reset: 50

681      {
682        "status": 200,
683        "detail": "Just slow down without waiting."
684      }

686   6.2.3.  Dynamic limits for pushing back and slowing down

688      Continuing the previous example, let's say the client waits 10
689      seconds and performs a new request which, due to resource exhaustion,
690      the server rejects and pushes back, advertising
691      "RateLimit-Remaining: 0" for the next 20 seconds.

693      The server advertises a smaller window with a lower limit to slow
694      down the client for the rest of its original window after the 20
695      seconds elapse.

697      Request:

699      GET /items/123

701      Response:

703      HTTP/1.1 429 Too Many Requests
704      Content-Type: application/json
705      RateLimit-Limit: 0, 15;w=20
706      RateLimit-Remaining: 0
707      RateLimit-Reset: 20

709      {
710        "status": 429,
711        "detail": "Wait 20 seconds, then slow down!"
712      }

714   6.3.  Dynamic limits for pushing back with Retry-After and slow down

716      Alternatively, given the same context where the previous example
717      starts, we can convey the same information to the client via the
718      Retry-After header. The advantage is that the server can now
719      specify the policy's nominal limit and window that will apply after
720      the reset, i.e. assuming the resource exhaustion is likely to be gone
721      by then. The advertised policy therefore does not need to be
722      adjusted, yet we managed to stop requests for a while and slow down
723      the rest of the current window.
725      Request:

727      GET /items/123

729      Response:

731      HTTP/1.1 429 Too Many Requests
732      Content-Type: application/json
733      Retry-After: 20
734      RateLimit-Limit: 15, 100;w=60
735      RateLimit-Remaining: 15
736      RateLimit-Reset: 40

738      {
739        "status": 429,
740        "detail": "Wait 20 seconds, then slow down!"
741      }

743      Note that in this last response the client is expected to honor the
744      "Retry-After" header and perform no requests for the specified amount
745      of time, whereas the previous example would not force the client to
746      stop requests before the reset time has elapsed, as it would still be
747      free to query the server again even if it is likely to have the
748      request rejected.

750   6.3.1.  Missing Remaining information

752      The server does not expose "RateLimit-Remaining" values, but resets
753      the limit counter every second.

755      It communicates to the client the limit of 10 quota-units per second
756      by always returning the pair "RateLimit-Limit" and "RateLimit-Reset".

758      Request:

760      GET /items/123

761      Response:

763      HTTP/1.1 200 OK
764      Content-Type: application/json
765      RateLimit-Limit: 10
766      RateLimit-Reset: 1

768      {"first": "request"}

770      Request:

772      GET /items/123

774      Response:

776      HTTP/1.1 200 OK
777      Content-Type: application/json
778      RateLimit-Limit: 10
779      RateLimit-Reset: 1

781      {"second": "request"}

783   6.3.2.  Use with multiple windows

785      This is a standardized way of describing the policy detailed in
786      Section 6.1.2:

788      *  5000 daily quota-units;

790      *  1000 hourly quota-units.

792      The client consumed 4900 quota-units in the first 14 hours.

794      Despite the next hourly limit of 1000 quota-units, the closest limit
795      to reach is the daily one.

797      The server then exposes the "RateLimit" headers to inform the client
798      that:

800      *  it has only 100 quota-units left;

802      *  the window will reset in 10 hours;

804      *  the "expiring-limit" is 5000.
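The selection described above, where the server advertises the window with the fewest remaining quota-units and lists every window as an informative policy, can be sketched in Python as follows. This is a non-normative illustration: the function name and the per-window dictionaries are assumptions of this example, and the hourly "remaining" and "reset" figures are invented, since the scenario only states the daily ones.

```python
def ratelimit_headers(windows: list) -> dict:
    """Pick the window closest to exhaustion (hypothetical helper).

    Each window is a dict with "limit", "window" (seconds), "remaining"
    and "reset" (seconds until that window resets).
    """
    # The window with the lowest remaining quota drives the three fields.
    tightest = min(windows, key=lambda w: w["remaining"])
    # Every window is also listed as an informative quota-policy item.
    policies = ", ".join(f'{w["limit"]};w={w["window"]}' for w in windows)
    return {
        "RateLimit-Limit": f'{tightest["limit"]}, {policies}',
        "RateLimit-Remaining": str(tightest["remaining"]),
        "RateLimit-Reset": str(tightest["reset"]),
    }


windows = [
    # Hourly figures below are assumed for the sake of the example.
    {"limit": 1000, "window": 3600, "remaining": 700, "reset": 1800},
    # Daily window: 4900 of 5000 units used, 10 hours until reset.
    {"limit": 5000, "window": 86400, "remaining": 100, "reset": 36000},
]
headers = ratelimit_headers(windows)
# headers["RateLimit-Limit"] is "5000, 1000;w=3600, 5000;w=86400"
```

With these inputs the three fields carry the same values as the response that follows.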
806      Request:

808      GET /items/123

809      Response:

811      HTTP/1.1 200 OK
812      Content-Type: application/json
813      RateLimit-Limit: 5000, 1000;w=3600, 5000;w=86400
814      RateLimit-Remaining: 100
815      RateLimit-Reset: 36000

817      {"hello": "world"}

819   7.  Security Considerations

821   7.1.  Throttling does not prevent clients from issuing requests

823      This specification does not prevent clients from making over-quota
824      requests.

826      Servers should always implement mechanisms to prevent resource
827      exhaustion.

829   7.2.  Information disclosure

831      Servers should not disclose operational capacity information that
832      can be used to saturate their resources.

834      While this specification does not mandate whether non-2xx responses
835      consume quota, if 401 and 403 responses count against the quota a
836      malicious client could probe the endpoint to get traffic information
837      about another user.

839   7.3.  Remaining quota-units are not granted requests

841      "RateLimit-*" headers convey hints from the server to the clients in
842      order to avoid being throttled out.

844      Clients MUST NOT consider the "quota-units" returned in
845      "RateLimit-Remaining" as a service level agreement.

847      In case of resource saturation, the server MAY artificially lower the
848      returned values or not serve the request anyway.

850   7.4.  Reliability of RateLimit-Reset

852      Consider that "request-quota" may not be restored after the moment
853      referenced by "RateLimit-Reset", and the "RateLimit-Reset" value
854      should not be considered fixed or constant.

856      Subsequent requests may return a higher "RateLimit-Reset" value to
857      limit concurrency or implement dynamic or adaptive throttling
858      policies.

860   7.5.  Resource exhaustion

862      When returning "RateLimit-Reset" you must be aware that many
863      throttled clients may come back at the very moment specified.

865      This is true for "Retry-After" too.
867      For example, if the quota resets every day at "18:00:00" and your
868      server returns the "RateLimit-Reset" accordingly

870      Date: Tue, 15 Nov 1994 08:00:00 GMT
871      RateLimit-Reset: 36000

873      there's a high probability that all clients will show up at
874      "18:00:00".

876      This could be mitigated by adding some jitter to the field-value.

878   7.6.  Denial of Service

880      "RateLimit" header fields may assume unexpected values by accident or
881      on purpose. For example, an excessively high "RateLimit-Remaining"
882      value may be:

884      *  used by a malicious intermediary to trigger a Denial of Service
885         attack or consume client resources by boosting its requests;

887      *  passed by a misconfigured server.

889      Likewise, a high "RateLimit-Reset" value could inhibit clients from
890      contacting the server.

892      Clients MUST validate the received values to mitigate those risks.

894   8.  IANA Considerations

896   8.1.  RateLimit-Limit Header Field Registration

898      This section registers the "RateLimit-Limit" header field in the
899      "Permanent Message Header Field Names" registry ([RFC3864]).

901      Header field name:  "RateLimit-Limit"

903      Applicable protocol:  http

904      Status:  standard

906      Author/Change controller:  IETF

908      Specification document(s):  Section 3.1 of this document

910   8.2.  RateLimit-Remaining Header Field Registration

912      This section registers the "RateLimit-Remaining" header field in the
913      "Permanent Message Header Field Names" registry ([RFC3864]).

915      Header field name:  "RateLimit-Remaining"

917      Applicable protocol:  http

919      Status:  standard

921      Author/Change controller:  IETF

923      Specification document(s):  Section 3.2 of this document

925   8.3.  RateLimit-Reset Header Field Registration

927      This section registers the "RateLimit-Reset" header field in the
928      "Permanent Message Header Field Names" registry ([RFC3864]).

   Header field name:  "RateLimit-Reset"

   Applicable protocol:  http

   Status:  standard

   Author/Change controller:  IETF

   Specification document(s):  Section 3.3 of this document

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3864]  Klyne, G., Nottingham, M., and J. Mogul, "Registration
              Procedures for Message Header Fields", BCP 90, RFC 3864,
              DOI 10.17487/RFC3864, September 2004.

   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008.

   [RFC6454]  Barth, A., "The Web Origin Concept", RFC 6454,
              DOI 10.17487/RFC6454, December 2011.

   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Message Syntax and Routing",
              RFC 7230, DOI 10.17487/RFC7230, June 2014.

   [RFC7231]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Semantics and Content", RFC 7231,
              DOI 10.17487/RFC7231, June 2014.

   [RFC7234]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
              Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching",
              RFC 7234, DOI 10.17487/RFC7234, June 2014.

   [RFC7405]  Kyzivat, P., "Case-Sensitive String Support in ABNF",
              RFC 7405, DOI 10.17487/RFC7405, December 2014.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [UNIX]     The Open Group, "The Single UNIX Specification, Version
              2 - 6 Vol Set for UNIX 98", February 1997.

9.2.  Informative References

   [RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet:
              Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002.

   [RFC6585]  Nottingham, M. and R.
              Fielding, "Additional HTTP Status Codes", RFC 6585,
              DOI 10.17487/RFC6585, April 2012.

Appendix A.  Change Log

   RFC EDITOR PLEASE DELETE THIS SECTION.

Appendix B.  Acknowledgements

   Thanks to Willi Schoenborn, Alejandro Martinez Ruiz, Alessandro
   Ranellucci, Amos Jeffries, Martin Thomson, Erik Wilde and Mark
   Nottingham for being the initial contributors to these
   specifications.  Kudos to the first community implementors: Aapo
   Talvensaari, Nathan Friedly and Sanyam Dogra.

Appendix C.  RateLimit headers currently used on the web

   RFC EDITOR PLEASE DELETE THIS SECTION.

   Commonly used header field names are:

   *  "X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset";

   *  "X-Rate-Limit-Limit", "X-Rate-Limit-Remaining",
      "X-Rate-Limit-Reset".

   There are variants too, where the window is specified in the header
   field name, e.g.:

   *  "x-ratelimit-limit-minute", "x-ratelimit-limit-hour",
      "x-ratelimit-limit-day"

   *  "x-ratelimit-remaining-minute", "x-ratelimit-remaining-hour",
      "x-ratelimit-remaining-day"

   Here are some interoperability issues:

   *  "X-RateLimit-Reset" references different values, depending on
      the implementation:

      -  seconds remaining to the window expiration

      -  milliseconds remaining to the window expiration

      -  seconds since the Epoch, as a UNIX timestamp

      -  a datetime, either "IMF-fixdate" [RFC7231] or [RFC3339]

   *  different headers, with the same semantics, are used by different
      implementers:

      -  X-RateLimit-Limit and X-Rate-Limit-Limit

      -  X-RateLimit-Remaining and X-Rate-Limit-Remaining

      -  X-RateLimit-Reset and X-Rate-Limit-Reset

   The semantics of RateLimit-Remaining depend on the windowing
   algorithm.  A sliding window policy, for example, may yield a
   ratelimit-remaining value related to the ratio between the current
   and the maximum throughput:

     RateLimit-Limit: 12, 12;w=1
     RateLimit-Remaining: 6   ; using 50% of throughput, that is 6 units/s
     RateLimit-Reset: 1

   If this is the case, the optimal solution is to achieve

     RateLimit-Limit: 12, 12;w=1
     RateLimit-Remaining: 1   ; using 100% of throughput, that is 12 units/s
     RateLimit-Reset: 1

   At this point you should stop increasing your request rate.

Appendix D.  FAQ

   1.  Why define standard headers for throttling?

       To simplify enforcement of throttling policies.

   2.  Can I use RateLimit-* in throttled responses (e.g. with status
       code 429)?

       Yes, you can.

   3.  Are those specs tied to RFC 6585?

       No.  [RFC6585] defines the "429" status code and we use it just
       as an example of a throttled request, which could instead use
       403 or any other status code.

   4.  Why not pass the throttling scope as a parameter?

       I'm open to suggestions.  File an issue if you think it's worth
       it ;).

   5.  Why use delta-seconds instead of a UNIX timestamp?  Why not use
       subsecond precision?

       Using delta-seconds aligns with "Retry-After", which is returned
       in similar contexts, e.g. on 429 responses.

       delta-seconds as defined in [RFC7234] section 1.2.1 clarifies
       some parsing rules too.

       Timestamps require a clock synchronization protocol (see
       [RFC7231] section 4.1.1.1).  This may be problematic (e.g. clock
       adjustment, clock skew, failure of hardcoded clock
       synchronization servers, IoT devices, ...).  Moreover, timestamps
       may not be monotonically increasing due to clock adjustment.  See
       Another NTP client failure story
       (https://community.ntppool.org/t/another-ntp-client-failure-story/1014/)

       We did not use subsecond precision because:

       *  that is more subject to system clock correction like the one
          implemented via the adjtimex() Linux system call;

       *  response-time latency may not make it worthwhile.
          A brief discussion on the subject is on the HTTP WG mailing
          list
          (https://lists.w3.org/Archives/Public/ietf-http-wg/2019JulSep/0202.html);

       *  almost all rate-limit header implementations do not use it.

   6.  Why not support multiple remaining quotas?

       While this might be of some value, my experience suggests that
       overly complex quota implementations result in lower
       effectiveness of this policy.  This spec allows the client to
       easily focus on RateLimit-Remaining and RateLimit-Reset.

   7.  Shouldn't I limit concurrency instead of request rate?

       You can do both.  The goal of this spec is to provide guidance
       for clients in shaping their requests without being throttled
       out.

       Limiting concurrency results in unserviced client requests, which
       is something we want to avoid.

       A standard way to limit concurrency is to return 503 +
       Retry-After in case of resource saturation (e.g. thrashing,
       connection queues too long, Service Level Objectives not met,
       ...).

       Availability can be improved by dynamically lowering the values
       returned by the "RateLimit-*" headers to slow down clients, and
       "Retry-After" can be used to push them back.

       Saturation conditions can be either dynamic or static: all this
       is out of scope for the current document.

   8.  Does a positive value of "RateLimit-Remaining" imply any service
       guarantee for my future requests to be served?

       No.  The returned values were used to decide whether or not to
       serve _the current request_ and do not imply any guarantee that
       future requests will be successful.

       Instead they help to understand when future requests will
       probably be throttled.  A low value for "RateLimit-Remaining"
       should be interpreted as a yellow traffic-light for either the
       number of requests issued in the "time-window" or the request
       throughput.

   9.  Is the quota-policy definition in Section 2.3 too complex?

       You can always return the simplest form of the 3 headers

         RateLimit-Limit: 100
         RateLimit-Remaining: 50
         RateLimit-Reset: 60

       The key runtime value is the first element of the list:
       "expiring-limit"; the other "quota-policy" items are
       informative.  So for the following header:

         RateLimit-Limit: 100, 100;w=60;burst=1000;comment="sliding window", 5000;w=3600;burst=0;comment="fixed window"

       the key value is the one referencing the lowest limit: "100"

   10. Can we use shorter names?  Why not put everything in one
       header?

       The most common syntax we found on the web is "X-RateLimit-*"
       and when starting this I-D we opted for it
       (https://github.com/ioggstream/draft-polli-ratelimit-headers/issues/34#issuecomment-519366481).

       The basic form of those headers is easily parseable, even by
       implementors processing responses using technologies like
       dynamic interpreters with limited syntax.

       Using a single header complicates parsing and takes a
       significantly different approach from the existing ones: this
       can limit adoption.

Authors' Addresses

   Roberto Polli
   Team Digitale, Italian Government

   Email: robipolli@gmail.com


   Alejandro Martinez Ruiz
   Red Hat

   Email: amr@redhat.com