idnits 2.17.1 draft-polli-ratelimit-headers-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 6 instances of too long lines in the document, the longest one being 38 characters in excess of 72. ** The abstract seems to contain references ([2], [1]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords -- however, there's a paragraph with a matching beginning. Boilerplate error? (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document date (October 21, 2019) is 1648 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '1' on line 923 -- Looks like a reference, but probably isn't: '2' on line 925 -- Looks like a reference, but probably isn't: '3' on line 1037 -- Looks like a reference, but probably isn't: '4' on line 1045 == Unused Reference: 'UNIX' is defined on line 908, but no explicit reference was found in the text ** Obsolete normative reference: RFC 7230 (Obsoleted by RFC 9110, RFC 9112) ** Obsolete normative reference: RFC 7231 (Obsoleted by RFC 9110) ** Obsolete normative reference: RFC 7234 (Obsoleted by RFC 9111) -- Possible downref: Non-RFC (?) normative reference: ref. 'UNIX' Summary: 5 errors (**), 0 flaws (~~), 3 warnings (==), 6 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group R. Polli 3 Internet-Draft Team Digitale, Italian Government 4 Intended status: Standards Track A. Martinez 5 Expires: April 23, 2020 Red Hat 6 October 21, 2019 8 RateLimit Header Fields for HTTP 9 draft-polli-ratelimit-headers-01 11 Abstract 13 This document defines the RateLimit-Limit, RateLimit-Remaining, 14 RateLimit-Reset header fields for HTTP, thus allowing servers to 15 publish current request quotas and clients to shape their request 16 policy and avoid being throttled out. 18 Note to Readers 20 _RFC EDITOR: please remove this section before publication_ 22 Discussion of this draft takes place on the HTTP working group 23 mailing list (ietf-http-wg@w3.org), which is archived at 24 https://lists.w3.org/Archives/Public/ietf-http-wg/ [1]. 
   The source code and issues list for this draft can be found at
   https://github.com/ioggstream/draft-polli-ratelimit-headers [2].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 23, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Rate-limiting and quotas
     1.2.  Current landscape of rate-limiting headers
       1.2.1.  Interoperability issues
     1.3.  This proposal
     1.4.  Goals
     1.5.  Notational Conventions
   2.  Expressing rate-limit policies
     2.1.  Time window
     2.2.  Request quota
     2.3.  Quota policy
   3.  Header Specifications
     3.1.  RateLimit-Limit
     3.2.  RateLimit-Remaining
     3.3.  RateLimit-Reset
   4.  Providing RateLimit headers
   5.  Receiving RateLimit headers
   6.  Examples
     6.1.  Unparameterized responses
       6.1.1.  Throttling information in responses
       6.1.2.  Use in conjunction with custom headers
       6.1.3.  Use for limiting concurrency
       6.1.4.  Use in throttled responses
     6.2.  Parameterized responses
       6.2.1.  Throttling window specified via parameter
       6.2.2.  Dynamic limits with parameterized windows
       6.2.3.  Missing Remaining information
       6.2.4.  Use with multiple windows
   7.  Security Considerations
     7.1.  Throttling does not prevent clients from issuing requests
     7.2.  Information disclosure
     7.3.  Remaining quota-units are not granted requests
     7.4.  Reliability of RateLimit-Reset
     7.5.  Resource exhaustion
     7.6.  Denial of Service
   8.  IANA Considerations
     8.1.  RateLimit-Limit Header Field Registration
     8.2.  RateLimit-Remaining Header Field Registration
     8.3.  RateLimit-Reset Header Field Registration
   9.  References
     9.1.  Normative References
     9.2.  Informative References
     9.3.  URIs
   Appendix A.  Change Log
   Appendix B.  Acknowledgements
   Appendix C.  RateLimit headers currently used on the web
   Appendix D.  FAQ
   Authors' Addresses

1.  Introduction

   The widespread use of HTTP as a distributed computation protocol
   requires an explicit way of communicating service status and usage
   quotas.

   This was partially addressed with the "Retry-After" header field
   defined in [RFC7231] to be returned in "429 Too Many Requests" or
   "503 Service Unavailable" responses.

   Still, there is no standard way to communicate service quotas so
   that the client can throttle its requests and prevent 4xx or 5xx
   responses.

1.1.  Rate-limiting and quotas

   Servers use quota mechanisms to avoid system overload, to ensure an
   equitable distribution of computational resources or to enforce other
   policies (e.g. monetization).

   A basic quota mechanism limits the number of acceptable requests in a
   given time window, e.g. 10 requests per second.

   When the quota is exceeded, servers usually do not serve the request,
   replying instead with a "4xx" HTTP status code (e.g. 429 or 403), or
   adopt more aggressive policies such as dropping connections.

   Quotas may be enforced on different bases (e.g. per user, per IP, per
   geographic area) and at different levels.  For example, a user may be
   allowed to issue:

   o  10 requests per second;

   o  limited to 60 requests per minute;

   o  limited to 1000 requests per hour.

   Moreover, system metrics, statistics and heuristics can be used to
   implement more complex policies, where the number of acceptable
   requests and the time window are computed dynamically.

1.2.  Current landscape of rate-limiting headers

   To help clients throttle their requests, servers may expose the
   counters used to evaluate quota policies via HTTP header fields.

   Those response headers may be added by HTTP intermediaries such as
   API gateways and reverse proxies.

   On the web we can find many different rate-limit headers, usually
   containing the number of allowed requests in a given time window, and
   when the window is reset.

   The common choice is to return three headers containing:

   o  the maximum number of allowed requests in the time window;

   o  the number of remaining requests in the current window;

   o  the time remaining in the current window expressed in seconds or
      as a timestamp.

1.2.1.  Interoperability issues

   A major interoperability issue in throttling is the lack of standard
   headers, because:

   o  each implementation associates different semantics to the same
      header field names;

   o  header field names proliferate.

   Client applications interfacing with different servers may thus need
   to process different headers, or the very same application interface
   that sits behind different reverse proxies may reply with different
   throttling headers.

1.3.  This proposal

   This proposal defines syntax and semantics for the following header
   fields:

   o  "RateLimit-Limit": containing the request quota in the time
      window;

   o  "RateLimit-Remaining": containing the remaining request quota in
      the current window;

   o  "RateLimit-Reset": containing the time remaining in the current
      window, specified in seconds.

   The behavior of "RateLimit-Reset" is compatible with the
   "delta-seconds" notation of "Retry-After".

   The header field definitions allow describing complex policies,
   including the ones using multiple and variable time windows and
   dynamic quotas, or implementing concurrency limits.

1.4.  Goals

   The goals of this proposal are:

   1.  Standardize the names and semantics of rate-limit headers;

   2.  Improve the resiliency of HTTP infrastructures by simplifying the
       enforcement and the adoption of rate-limit headers;

   3.  Simplify API documentation by removing the need to spell out the
       semantics of rate-limit header fields.

   The goals do not include:

   Authorization:  The rate-limit headers described here are not meant
      to support authorization or other kinds of access controls.

   Throttling scope:  This specification does not cover the throttling
      scope, which may be the given resource-target, its parent path or
      the whole Origin [RFC6454] section 7.

   Response status code:  The rate-limit headers may be returned in both
      Successful and non-Successful responses.  This specification does
      not cover whether non-Successful responses count towards quota
      usage.

   Throttling policy:  This specification does not mandate a specific
      throttling policy.  The values published in the headers, including
      the window size, can be statically or dynamically evaluated.

   Service Level Agreement:  Conveyed quota hints do not imply any
      service guarantee.  The server is free to throttle respectful
      clients under certain circumstances.

1.5.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 ([RFC2119] and [RFC8174]) when, and only when, they appear in all
   capitals, as shown here.

   This document uses the Augmented BNF defined in [RFC5234] and updated
   by [RFC7405] along with the "#rule" extension defined in Section 7 of
   [RFC7230].

   The term Origin is to be interpreted as described in [RFC6454]
   section 7.

   The "delta-seconds" rule is defined in [RFC7234] section 1.2.1.

2.  Expressing rate-limit policies

2.1.  Time window

   Rate limit policies limit the number of acceptable requests in a
   given time window.

   A time window is expressed in seconds, using the following syntax:

       time-window = delta-seconds

   Subsecond precision is not supported.

2.2.  Request quota

   The request-quota is a value associated with the maximum number of
   requests that the server is willing to accept from one or more
   clients on a given basis (originating IP, authenticated user,
   geographical area, etc.) during a "time-window" as defined in
   Section 2.1.

   The "request-quota" is expressed in "quota-units" and has the
   following syntax:

       request-quota = quota-units
       quota-units = 1*DIGIT

   The "request-quota" SHOULD match the maximum number of acceptable
   requests.

   The "request-quota" MAY differ from the total number of acceptable
   requests when weight mechanisms, bursts, or other server policies are
   implemented.

   If the "request-quota" does not match the maximum number of
   acceptable requests, the relation between the two SHOULD be
   communicated out-of-band.
   Example: A server could

   o  count once requests like "/books/{id}"

   o  count twice search requests like "/books?author=Camilleri"

   so that we have the following counters

   GET /books/123                  ; request-quota=4, remaining: 3, status=200
   GET /books?author=Camilleri     ; request-quota=4, remaining: 1, status=200
   GET /books?author=Eco           ; request-quota=4, remaining: 0, status=429

2.3.  Quota policy

   This specification allows describing a quota policy with the
   following syntax:

       quota-policy = request-quota ";" "w" "=" time-window
                      *( OWS ";" OWS quota-comment)
       quota-comment = token "=" (token / quoted-string)

   An example policy of 100 quota-units per minute:

       100;w=60

   Two examples of providing further details via custom parameters in
   "quota-comment":

       100;w=60;comment="fixed window"
       12;w=1;burst=1000;policy="leaky bucket"

3.  Header Specifications

   The following "RateLimit" response header fields are defined.

3.1.  RateLimit-Limit

   The "RateLimit-Limit" response header field indicates the
   "request-quota" associated with the client in the current
   "time-window".

   If the client exceeds that limit, the server MAY decline to serve its
   requests.

   The header value is

       RateLimit-Limit = expiring-limit [ "," 1#quota-policy ]
       expiring-limit = request-quota

   The "expiring-limit" value MUST be set to the "request-quota" that is
   closest to reaching its limit.

   The "quota-policy" is defined in Section 2.3, and its values are
   informative.

       RateLimit-Limit: 100

   A "time-window" associated with "expiring-limit" can be communicated
   via an optional "quota-policy" value, as shown in the following
   example:

       RateLimit-Limit: 100, 100;w=10

   If the "expiring-limit" is not associated with a "time-window", the
   "time-window" MUST either be:

   o  inferred from the value of "RateLimit-Reset" at the moment of the
      reset, or

   o  communicated out-of-band (e.g. in the documentation).
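
   As a non-normative illustration, a "RateLimit-Limit" field value
   could be split into its "expiring-limit" and "quota-policy" items as
   sketched below.  The names "QuotaPolicy" and "parse_ratelimit_limit"
   are hypothetical, and the sketch assumes that quoted-string comments
   contain neither "," nor ";", which a complete parser would have to
   handle.

```python
from dataclasses import dataclass, field


@dataclass
class QuotaPolicy:
    quota: int     # request-quota, in quota-units
    window: int    # time-window ("w" parameter), in seconds
    comments: dict = field(default_factory=dict)  # quota-comment parameters


def _digits(text):
    """Convert a 1*DIGIT value, rejecting signs, blanks and non-ASCII digits."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a 1*DIGIT value: {text!r}")
    return int(text)


def parse_ratelimit_limit(value):
    """Return (expiring_limit, [QuotaPolicy, ...]) for a field value.

    Assumption: no "," or ";" appears inside quoted-string comments.
    """
    items = [item.strip() for item in value.split(",")]
    expiring_limit = _digits(items[0])
    policies = []
    for item in items[1:]:
        quota, *params = [part.strip() for part in item.split(";")]
        parsed = {}
        for param in params:
            name, _, raw = param.partition("=")
            parsed[name.strip()] = raw.strip().strip('"')
        if "w" not in parsed:
            raise ValueError(f"quota-policy without time-window: {item!r}")
        window = _digits(parsed.pop("w"))
        policies.append(QuotaPolicy(_digits(quota), window, parsed))
    return expiring_limit, policies
```

   Since the "quota-policy" items are informative, a client that only
   needs the "expiring-limit" could stop after the first item.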
   Policies using multiple quota limits MAY be returned using multiple
   "quota-policy" items, as shown in the following two examples:

       RateLimit-Limit: 10, 10;w=1, 50;w=60, 1000;w=3600, 5000;w=86400
       RateLimit-Limit: 10, 10;w=1;burst=1000, 1000;w=3600

3.2.  RateLimit-Remaining

   The "RateLimit-Remaining" response header field indicates the
   remaining "quota-units" defined in Section 2.2 associated with the
   client.

   The header value is

       RateLimit-Remaining = quota-units

   Clients MUST NOT assume that a positive "RateLimit-Remaining" value
   is a guarantee of being served.

   A low "RateLimit-Remaining" value is like a yellow traffic-light: the
   red light may arrive suddenly.

   One example of "RateLimit-Remaining" use is below.

       RateLimit-Remaining: 50

3.3.  RateLimit-Reset

   The "RateLimit-Reset" response header field indicates the number of
   seconds until the quota resets.

   The header value is

       RateLimit-Reset = delta-seconds

   The delta-seconds format is used because:

   o  it does not rely on clock synchronization and is resilient to
      clock adjustment and clock skew between client and server (see
      [RFC7231] Section 7.1.1.1);

   o  it mitigates the risk related to thundering herd when too many
      clients are serviced with the same timestamp.

   An example of "RateLimit-Reset" use is below.

       RateLimit-Reset: 50

   The client MUST NOT assume that all its "request-quota" will be
   restored after the moment referenced by "RateLimit-Reset".  The
   server MAY arbitrarily alter the "RateLimit-Reset" value between
   subsequent requests, e.g. in case of resource saturation or to
   implement sliding window policies.

4.  Providing RateLimit headers

   A server MAY use one or more "RateLimit" response header fields
   defined in this document to communicate its quota policies.
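
   As a non-normative sketch, a server enforcing a single fixed window
   could derive the three field values from its counters as shown
   below.  The function name and its parameters are hypothetical;
   rounding the reset value up to whole seconds is an assumption made
   here so that clients are not invited back before the window has
   actually reset.

```python
import math


def ratelimit_headers(limit, window, used, window_start, now):
    """Build the RateLimit fields for one fixed window.

    limit:        request-quota, in quota-units
    window:       time-window, in seconds
    used:         quota-units consumed in the current window
    window_start: start of the current window, in seconds
    now:          current time on the same clock, in seconds
    """
    remaining = max(limit - used, 0)
    # delta-seconds has no sub-second precision: round up, never below 0.
    reset = max(math.ceil(window_start + window - now), 0)
    return {
        "RateLimit-Limit": f"{limit}, {limit};w={window}",
        "RateLimit-Remaining": str(remaining),
        "RateLimit-Reset": str(reset),
    }
```

   With a quota of 100 quota-units per minute, one unit consumed and
   ten seconds elapsed, this yields a limit of "100, 100;w=60", 99
   remaining quota-units and a reset of 50 seconds.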
   The returned values refer to the metrics used to evaluate whether the
   current request respects the quota policy and might not apply to
   subsequent requests.

   Example: a successful response with the following header fields

       RateLimit-Limit: 10
       RateLimit-Remaining: 1
       RateLimit-Reset: 7

   does not guarantee that the next request will be successful.  Server
   metrics may be subject to other conditions like the one shown in the
   example from Section 2.2.

   A server MAY return "RateLimit" response header fields independently
   of the response status code.  This includes throttled responses.

   If a response contains both the "Retry-After" and the
   "RateLimit-Reset" header fields, the value of "RateLimit-Reset" MUST
   be consistent with the one of "Retry-After".

   When using a policy involving more than one "time-window", the server
   MUST reply with the "RateLimit" headers related to the window with
   the lowest "RateLimit-Remaining" value.

   Under certain conditions, a server MAY artificially lower "RateLimit"
   field values between subsequent requests, e.g. to respond to Denial
   of Service attacks or in case of resource saturation.

5.  Receiving RateLimit headers

   A client MUST process the received "RateLimit" headers.

   A client MUST validate the values received in the "RateLimit" headers
   before using them and check if there are significant discrepancies
   with the expected ones.  This includes a "RateLimit-Reset" moment too
   far in the future or a "request-quota" too high.

   Malformed "RateLimit" headers MAY be ignored.

   A client SHOULD NOT exceed the "quota-units" expressed in
   "RateLimit-Remaining" before the "time-window" expressed in
   "RateLimit-Reset" has elapsed.

   A client MAY still probe the server if the "RateLimit-Reset" is
   considered too high.
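
   The client-side rules above could be sketched, non-normatively, as
   follows.  The function name and the "max_reset" bound are
   hypothetical: the bound stands for whatever a given client considers
   a plausible reset, and capping the pause at that bound is one
   possible reading of "probing" the server.

```python
def seconds_to_wait(headers, max_reset=86400):
    """Return how long to pause before the next request, in seconds.

    headers:   mapping of response field names to field values
    max_reset: largest RateLimit-Reset this client considers plausible
               (an assumption of this sketch, not a value from the draft)
    """
    # Field names are case-insensitive, so compare them in lowercase.
    fields = {name.lower(): value.strip() for name, value in headers.items()}

    def digits(name):
        value = fields.get(name, "")
        return int(value) if value.isascii() and value.isdigit() else None

    remaining = digits("ratelimit-remaining")
    reset = digits("ratelimit-reset")
    if remaining is None or reset is None:
        return 0  # missing or malformed fields are ignored
    if remaining > 0:
        return 0  # quota-units are left: no pause suggested
    # Quota exhausted: wait for the reset, but probe the server again
    # after max_reset if the advertised value looks too high.
    return min(reset, max_reset)
```

   A client would typically call this after every response and sleep
   for the returned number of seconds before issuing its next request.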
   The value of "RateLimit-Reset" is generated at response time: a
   client aware of a significant network latency MAY behave accordingly
   and use other information (e.g. the "Date" response header, or
   otherwise gathered metrics) to better estimate the "RateLimit-Reset"
   moment intended by the server.

   The "quota-policy" values and comments provided in "RateLimit-Limit"
   are informative and MAY be ignored.

   If a response contains both the "RateLimit-Reset" and "Retry-After"
   header fields, the "Retry-After" header field MUST take precedence
   and the "RateLimit-Reset" header field MAY be ignored.

6.  Examples

6.1.  Unparameterized responses

6.1.1.  Throttling information in responses

   The client exhausted its request-quota for the next 50 seconds.  The
   "time-window" is communicated out-of-band or inferred from the header
   values.

   Request:

   GET /items/123

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 100
   RateLimit-Remaining: 0
   RateLimit-Reset: 50

   {"hello": "world"}

6.1.2.  Use in conjunction with custom headers

   The server uses two custom headers, namely "acme-RateLimit-DayLimit"
   and "acme-RateLimit-HourLimit", to expose the following policy:

   o  5000 daily quota-units;

   o  1000 hourly quota-units.

   The client consumed 4900 quota-units in the first 14 hours.

   Despite the next hourly limit of 1000 quota-units, the closest limit
   to reach is the daily one.

   The server then exposes the "RateLimit-*" headers to inform the
   client that:

   o  it has only 100 quota-units left;

   o  the window will reset in 10 hours.

   Request:

   GET /items/123

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   acme-RateLimit-DayLimit: 5000
   acme-RateLimit-HourLimit: 1000
   RateLimit-Limit: 5000
   RateLimit-Remaining: 100
   RateLimit-Reset: 36000

   {"hello": "world"}

6.1.3.  Use for limiting concurrency

   Throttling headers may be used to limit concurrency, advertising
   limits that are lower than the usual ones in case of saturation, thus
   increasing availability.

   The server adopted a basic policy of 100 quota-units per minute, and
   in case of resource exhaustion adapts the returned values, reducing
   both "RateLimit-Limit" and "RateLimit-Remaining".

   After 2 seconds the client consumed 40 quota-units.

   Request:

   GET /items/123

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 100
   RateLimit-Remaining: 60
   RateLimit-Reset: 58

   {"elapsed": 2, "issued": 40}

   At the subsequent request, due to resource exhaustion, the server
   advertises only "RateLimit-Remaining: 20".

   Request:

   GET /items/123

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 100
   RateLimit-Remaining: 20
   RateLimit-Reset: 56

   {"elapsed": 4, "issued": 41}

6.1.4.  Use in throttled responses

   A client exhausted its quota and the server throttles the request,
   sending the "Retry-After" response header field.

   The values of "Retry-After" and "RateLimit-Reset" are consistent as
   they reference the same moment.

   The "429 Too Many Requests" HTTP status code is just used as an
   example.

   Request:

   GET /items/123

   Response:

   HTTP/1.1 429 Too Many Requests
   Content-Type: application/json
   Date: Mon, 05 Aug 2019 09:27:00 GMT
   Retry-After: Mon, 05 Aug 2019 09:27:05 GMT
   RateLimit-Reset: 5
   RateLimit-Limit: 100
   RateLimit-Remaining: 0

   {
     "title": "Too Many Requests",
     "status": 429,
     "detail": "You have exceeded your quota"
   }

6.2.  Parameterized responses

6.2.1.  Throttling window specified via parameter

   The client has 99 "quota-units" left for the next 50 seconds.  The
   "time-window" is communicated by the "w" parameter, so we know the
   throughput is 100 "quota-units" per minute.

   Request:

   GET /items/123

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 100, 100;w=60
   RateLimit-Remaining: 99
   RateLimit-Reset: 50

   {"hello": "world"}

6.2.2.  Dynamic limits with parameterized windows

   The policy conveyed by "RateLimit-Limit" states that the server
   accepts 100 quota-units per minute.

   Due to resource exhaustion, the server artificially lowers the actual
   limits returned in the throttling headers.

   The current policy then advertises only 9 quota-units for the next 50
   seconds.

   Note that the server could have lowered even the other values in
   "RateLimit-Limit": this specification does not mandate any relation
   between the field values contained in subsequent responses.

   Request:

   GET /items/123

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 10, 100;w=60
   RateLimit-Remaining: 9
   RateLimit-Reset: 50

   {"hello": "world"}

6.2.3.  Missing Remaining information

   The server does not expose "RateLimit-Remaining" values, but resets
   the limit counter every second.

   It communicates to the client the limit of 10 quota-units per second
   by always returning the pair "RateLimit-Limit" and "RateLimit-Reset".

   Request:

   GET /items/123

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 10
   RateLimit-Reset: 1

   {"first": "request"}

   Request:

   GET /items/123

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 10
   RateLimit-Reset: 1

   {"second": "request"}

6.2.4.  Use with multiple windows

   This is a standardized way of describing the policy detailed in
   Section 6.1.2:

   o  5000 daily quota-units;

   o  1000 hourly quota-units.
   The client consumed 4900 quota-units in the first 14 hours.

   Despite the next hourly limit of 1000 quota-units, the closest limit
   to reach is the daily one.

   The server then exposes the "RateLimit" headers to inform the client
   that:

   o  it has only 100 quota-units left;

   o  the window will reset in 10 hours;

   o  the "expiring-limit" is 5000.

   Request:

   GET /items/123

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 5000, 1000;w=3600, 5000;w=86400
   RateLimit-Remaining: 100
   RateLimit-Reset: 36000

   {"hello": "world"}

7.  Security Considerations

7.1.  Throttling does not prevent clients from issuing requests

   This specification does not prevent clients from making over-quota
   requests.

   Servers should always implement mechanisms to prevent resource
   exhaustion.

7.2.  Information disclosure

   Servers should not disclose operational capacity information that
   can be used to saturate their resources.

   While this specification does not mandate whether non-2xx responses
   consume quota, if 401 and 403 responses count towards quota, a
   malicious client could probe the endpoint to obtain traffic
   information about another user.

7.3.  Remaining quota-units are not granted requests

   "RateLimit-*" headers convey hints from the server to the clients in
   order to avoid being throttled out.

   Clients MUST NOT consider the "quota-units" returned in
   "RateLimit-Remaining" as a service level agreement.

   In case of resource saturation, the server MAY artificially lower the
   returned values or not serve the request anyway.

7.4.  Reliability of RateLimit-Reset

   Consider that "request-quota" may not be restored after the moment
   referenced by "RateLimit-Reset", and the "RateLimit-Reset" value
   should not be considered fixed or constant.
   Subsequent requests may return a higher "RateLimit-Reset" value to
   limit concurrency or implement dynamic or adaptive throttling
   policies.

7.5.  Resource exhaustion

   When returning "RateLimit-Reset", servers must be aware that many
   throttled clients may come back at the very moment specified.

   This is true for "Retry-After" too.

   For example, if the quota resets every day at "18:00:00" and your
   server returns the "RateLimit-Reset" accordingly

       Date: Tue, 15 Nov 1994 08:00:00 GMT
       RateLimit-Reset: 36000

   there's a high probability that all clients will show up at
   "18:00:00".

   This could be mitigated by adding some jitter to the field-value.

7.6.  Denial of Service

   "RateLimit" header fields may assume unexpected values by accident or
   on purpose.  For example, an excessively high "RateLimit-Remaining"
   value may be:

   o  used by a malicious intermediary to trigger a Denial of Service
      attack or consume client resources by boosting its requests;

   o  passed by a misconfigured server;

   or a high "RateLimit-Reset" value could inhibit clients from
   contacting the server.

   Clients MUST validate the received values to mitigate those risks.

8.  IANA Considerations

8.1.  RateLimit-Limit Header Field Registration

   This section registers the "RateLimit-Limit" header field in the
   "Permanent Message Header Field Names" registry ([RFC3864]).

   Header field name:  "RateLimit-Limit"

   Applicable protocol:  http

   Status:  standard

   Author/Change controller:  IETF

   Specification document(s):  Section 3.1 of this document

8.2.  RateLimit-Remaining Header Field Registration

   This section registers the "RateLimit-Remaining" header field in the
   "Permanent Message Header Field Names" registry ([RFC3864]).

   Header field name:  "RateLimit-Remaining"

   Applicable protocol:  http

   Status:  standard

   Author/Change controller:  IETF

   Specification document(s):  Section 3.2 of this document

8.3.
RateLimit-Reset Header Field Registration 849 This section registers the "RateLimit-Reset" header field in the 850 "Permanent Message Header Field Names" registry ([RFC3864]). 852 Header field name: "RateLimit-Reset" 854 Applicable protocol: http 856 Status: standard 858 Author/Change controller: IETF 860 Specification document(s): Section 3.3 of this document 862 9. References 864 9.1. Normative References 866 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 867 Requirement Levels", BCP 14, RFC 2119, 868 DOI 10.17487/RFC2119, March 1997, 869 . 871 [RFC3864] Klyne, G., Nottingham, M., and J. Mogul, "Registration 872 Procedures for Message Header Fields", BCP 90, RFC 3864, 873 DOI 10.17487/RFC3864, September 2004, 874 . 876 [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax 877 Specifications: ABNF", STD 68, RFC 5234, 878 DOI 10.17487/RFC5234, January 2008, 879 . 881 [RFC6454] Barth, A., "The Web Origin Concept", RFC 6454, 882 DOI 10.17487/RFC6454, December 2011, 883 . 885 [RFC7230] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer 886 Protocol (HTTP/1.1): Message Syntax and Routing", 887 RFC 7230, DOI 10.17487/RFC7230, June 2014, 888 . 890 [RFC7231] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer 891 Protocol (HTTP/1.1): Semantics and Content", RFC 7231, 892 DOI 10.17487/RFC7231, June 2014, 893 . 895 [RFC7234] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, 896 Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching", 897 RFC 7234, DOI 10.17487/RFC7234, June 2014, 898 . 900 [RFC7405] Kyzivat, P., "Case-Sensitive String Support in ABNF", 901 RFC 7405, DOI 10.17487/RFC7405, December 2014, 902 . 904 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 905 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 906 May 2017, . 908 [UNIX] The Open Group, ., "The Single UNIX Specification, Version 909 2 - 6 Vol Set for UNIX 98", February 1997. 911 9.2. Informative References 913 [RFC3339] Klyne, G. 
              and C. Newman, "Date and Time on the Internet:
              Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002.

   [RFC6585]  Nottingham, M. and R. Fielding, "Additional HTTP Status
              Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012.

9.3.  URIs

   [1] https://lists.w3.org/Archives/Public/ietf-http-wg/

   [2] https://github.com/ioggstream/draft-polli-ratelimit-headers

   [3] https://community.ntppool.org/t/another-ntp-client-failure-story/1014/

   [4] https://lists.w3.org/Archives/Public/ietf-http-wg/2019JulSep/0202.html

Appendix A.  Change Log

   RFC EDITOR PLEASE DELETE THIS SECTION.

Appendix B.  Acknowledgements

   Thanks to Willi Schoenborn, Alejandro Martinez Ruiz, Alessandro
   Ranellucci, Erik Wilde and Mark Nottingham for being the initial
   contributors of these specifications.

Appendix C.  RateLimit headers currently used on the web

   RFC EDITOR PLEASE DELETE THIS SECTION.

   Commonly used header field names are:

   o  "X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset";

   o  "X-Rate-Limit-Limit", "X-Rate-Limit-Remaining",
      "X-Rate-Limit-Reset".
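
   As an illustration only, a client that wants to treat the two
   spellings listed above uniformly could map them onto the field
   names defined in this document.  The sketch below is not part of
   this specification: the function name "canonical_name" and the
   regular expression are hypothetical, and the mapping covers field
   names only.  It makes no assumption about the meaning of the
   values carried by the legacy fields.

```python
import re

# Hypothetical helper, not defined by this document.
# Matches exactly the two legacy spellings listed above
# (X-RateLimit-* and X-Rate-Limit-*), case-insensitively.
_LEGACY_NAME = re.compile(
    r"^x-rate-?limit-(limit|remaining|reset)$", re.IGNORECASE
)


def canonical_name(field_name):
    """Return the RateLimit-* name for a legacy field name, else None."""
    match = _LEGACY_NAME.match(field_name.strip())
    if match is None:
        return None
    # capitalize() turns "LIMIT" or "limit" into "Limit".
    return "RateLimit-" + match.group(1).capitalize()
```

   Any other field name, including names that embed a time window,
   is deliberately left unmapped and yields None.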
   There are variants too, where the window is specified in the header
   field name, e.g.:

   o  "x-ratelimit-limit-minute", "x-ratelimit-limit-hour",
      "x-ratelimit-limit-day"

   o  "x-ratelimit-remaining-minute", "x-ratelimit-remaining-hour",
      "x-ratelimit-remaining-day"

   Here are some interoperability issues:

   o  "X-RateLimit-Reset" references different values, depending on
      the implementation:

      *  seconds remaining to the window expiration

      *  milliseconds remaining to the window expiration

      *  seconds since the Epoch, as a UNIX timestamp

      *  a datetime, either "IMF-fixdate" [RFC7231] or [RFC3339]

   o  different headers, with the same semantics, are used by different
      implementers:

      *  X-RateLimit-Limit and X-Rate-Limit-Limit

      *  X-RateLimit-Remaining and X-Rate-Limit-Remaining

      *  X-RateLimit-Reset and X-Rate-Limit-Reset

   The semantics of RateLimit-Remaining depend on the windowing
   algorithm.  A sliding window policy, for example, may result in a
   RateLimit-Remaining value related to the ratio between the current
   and the maximum throughput, e.g.:

      RateLimit-Limit: 12, 12;w=1
      RateLimit-Remaining: 6 ; using 50% of throughput, that is 6 units/s
      RateLimit-Reset: 1

   If this is the case, the optimal solution is to achieve

      RateLimit-Limit: 12, 12;w=1
      RateLimit-Remaining: 1 ; using 100% of throughput, that is 12 units/s
      RateLimit-Reset: 1

   At this point you should stop increasing your request rate.

Appendix D.  FAQ

   1.  Why define standard headers for throttling?

       To simplify enforcement of throttling policies.

   2.  Can I use RateLimit-* in throttled responses (e.g. with status
       code 429)?

       Yes, you can.

   3.  Are those specs tied to RFC 6585?

       No.  [RFC6585] defines the "429" status code and we use it just
       as an example of a throttled request, which could instead use
       403 or any other status code.

   4.  Why not pass the throttling scope as a parameter?

       I'm open to suggestions.  File an issue if you think it's
       worthwhile ;).

   5.  Why use delta-seconds instead of a UNIX timestamp?  Why not
       use subsecond precision?

       Using delta-seconds aligns with "Retry-After", which is returned
       in similar contexts, e.g. on 429 responses.

       delta-seconds as defined in [RFC7234] section 1.2.1 clarifies
       some parsing rules too.

       Timestamps require a clock synchronization protocol (see
       [RFC7231] section 4.1.1.1).  This may be problematic (e.g. clock
       adjustment, clock skew, failure of hardcoded clock
       synchronization servers, IoT devices, ...).  Moreover,
       timestamps may not be monotonically increasing due to clock
       adjustment.  See Another NTP client failure story [3].

       We did not use subsecond precision because:

       *  that is more subject to system clock correction like the one
          implemented via the adjtimex() Linux system call;

       *  response-time latency may make it not worthwhile.  A brief
          discussion on the subject is on the httpwg ml [4];

       *  almost all rate-limit header implementations do not use it.

   6.  Why not support multiple quota remaining?

       While this might be of some value, my experience suggests that
       overly complex quota implementations result in lower
       effectiveness of this policy.  This spec allows the client to
       easily focus on RateLimit-Remaining and RateLimit-Reset.

   7.  Shouldn't I limit concurrency instead of request rate?

       You can do both.  The goal of this spec is to provide guidance
       for clients in shaping their requests without being throttled
       out.

       Limiting concurrency results in unserviced client requests,
       which is something we want to avoid.

       A standard way to limit concurrency is to return 503 +
       Retry-After in case of resource saturation (e.g. thrashing,
       connection queues too long, Service Level Objectives not met,
       ...).

       Dynamically lowering the values returned by the rate-limit
       headers, and returning Retry-After along with them, can improve
       availability.

       Saturation conditions can be either dynamic or static: all this
       is out of scope for the current document.

   8.  Does a positive value of "RateLimit-Remaining" imply any service
       guarantee for my future requests to be served?

       No.  The returned values were used to decide whether or not to
       serve _the current request_ and do not imply any guarantee that
       future requests will be successful.

       Instead they help to understand when future requests will
       probably be throttled.  A low value for "RateLimit-Remaining"
       should be interpreted as a yellow traffic-light for either the
       number of requests issued in the "time-window" or the request
       throughput.

   9.  Is the quota-policy definition in Section 2.3 too complex?

       You can always return the simplest form of the 3 headers

          RateLimit-Limit: 100
          RateLimit-Remaining: 50
          RateLimit-Reset: 60

       The key runtime value is the first element of the list:
       "expiring-limit"; the other "quota-policy" items are
       informative.  So for the following header:

          RateLimit-Limit: 100, 100;w=60;burst=1000;comment="sliding window", 5000;w=3600;burst=0;comment="fixed window"

       the key value is the one referencing the lowest limit: "100"

Authors' Addresses

   Roberto Polli
   Team Digitale, Italian Government

   Email: robipolli@gmail.com

   Alejandro Martinez Ruiz
   Red Hat

   Email: amr@redhat.com