idnits 2.17.1

draft-ietf-httpapi-ratelimit-headers-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------

     No issues found here.

  Checking nits according to
  https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------

  ** There are 6 instances of too long lines in the document, the
     longest one being 41 characters in excess of 72.

  Miscellaneous warnings:
  ----------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line
     does not match the current year

  -- The document date (11 May 2021) is 1078 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative
     references to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-19) exists of
     draft-ietf-httpbis-semantics-15

  -- Possible downref: Normative reference to a draft: ref. 'SEMANTICS'

  -- Obsolete informational reference (is this intentional?): RFC 7234
     (Obsoleted by RFC 9111)

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 3 comments
     (--).

     Run idnits with the --verbose option for more detailed information
     about the items above.

------------------------------------------------------------------------

HTTPAPI                                                         R. Polli
Internet-Draft                         Team Digitale, Italian Government
Intended status: Standards Track                             A. Martinez
Expires: 12 November 2021                                        Red Hat
                                                             11 May 2021


                    RateLimit Header Fields for HTTP
                 draft-ietf-httpapi-ratelimit-headers-01

Abstract

   This document defines the RateLimit-Limit, RateLimit-Remaining,
   RateLimit-Reset fields for HTTP, thus allowing servers to publish
   current service limits and clients to shape their request policy and
   avoid being throttled out.

Note to Readers

   _RFC EDITOR: please remove this section before publication_

   Discussion of this draft takes place on the HTTP working group
   mailing list (httpapi@ietf.org), which is archived at
   https://lists.w3.org/Archives/Public/ietf-httpapi-wg/.

   The source code and issues list for this draft can be found at
   https://github.com/ietf-wg-httpapi/ratelimit-headers.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 12 November 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Rate-limiting and quotas
     1.2.  Current landscape of rate-limiting headers
       1.2.1.  Interoperability issues
     1.3.  This proposal
     1.4.  Goals
     1.5.  Notational Conventions
   2.  Expressing rate-limit policies
     2.1.  Time window
     2.2.  Service limit
     2.3.  Quota policy
   3.  Header Specifications
     3.1.  RateLimit-Limit
     3.2.  RateLimit-Remaining
     3.3.  RateLimit-Reset
   4.  Providing RateLimit fields
     4.1.  Performance considerations
   5.  Intermediaries
   6.  Caching
   7.  Receiving RateLimit fields
   8.  Examples
     8.1.  Unparameterized responses
       8.1.1.  Throttling information in responses
       8.1.2.  Use in conjunction with custom fields
       8.1.3.  Use for limiting concurrency
       8.1.4.  Use in throttled responses
     8.2.  Parameterized responses
       8.2.1.  Throttling window specified via parameter
       8.2.2.  Dynamic limits with parameterized windows
       8.2.3.  Dynamic limits for pushing back and slowing down
     8.3.  Dynamic limits for pushing back with Retry-After and slow
           down
       8.3.1.  Missing Remaining information
       8.3.2.  Use with multiple windows
   9.  Security Considerations
     9.1.  Throttling does not prevent clients from issuing requests
     9.2.  Information disclosure
     9.3.  Remaining quota-units are not granted requests
     9.4.  Reliability of RateLimit-Reset
     9.5.  Resource exhaustion
     9.6.  Denial of Service
   10. IANA Considerations
     10.1.  RateLimit-Limit Field Registration
     10.2.  RateLimit-Remaining Field Registration
     10.3.  RateLimit-Reset Field Registration
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Acknowledgements
   Appendix B.  FAQ
   RateLimit fields currently used on the web
   Changes
     Since draft-ietf-httpapi-ratelimit-headers-00
   Authors' Addresses

1.  Introduction

   The widespread use of HTTP as a distributed computation protocol
   requires an explicit way of communicating service status and usage
   quotas.

   This was partially addressed by the "Retry-After" header field
   defined in [SEMANTICS] to be returned in "429 Too Many Requests" (see
   [STATUS429]) or "503 Service Unavailable" responses.

   Still, there is no standard way to communicate service quotas so
   that the client can throttle its requests and prevent 4xx or 5xx
   responses.

1.1.  Rate-limiting and quotas

   Servers use quota mechanisms to avoid system overload, to ensure an
   equitable distribution of computational resources or to enforce other
   policies, e.g. monetization.

   A basic quota mechanism limits the number of acceptable requests in a
   given time window, e.g. 10 requests per second.

   When the quota is exceeded, servers usually do not serve the request,
   replying instead with a "4xx" HTTP status code (e.g. 429 or 403), or
   adopt more aggressive policies like dropping connections.

   Quotas may be enforced on different bases (e.g. per user, per IP, per
   geographic area, ...) and at different levels.  For example, a user
   may be allowed to issue:

   *  10 requests per second;

   *  limited to 60 requests per minute;

   *  limited to 1000 requests per hour.

   Moreover, system metrics, statistics and heuristics can be used to
   implement more complex policies, where the number of acceptable
   requests and the time window are computed dynamically.

1.2.  Current landscape of rate-limiting headers

   To help clients throttle their requests, servers may expose the
   counters used to evaluate quota policies via HTTP header fields.

   Those response headers may be added by HTTP intermediaries such as
   API gateways and reverse proxies.

   On the web we can find many different rate-limit headers, usually
   containing the number of allowed requests in a given time window, and
   when the window is reset.

   The common choice is to return three headers containing:

   *  the maximum number of allowed requests in the time window;

   *  the number of remaining requests in the current window;

   *  the time remaining in the current window expressed in seconds or
      as a timestamp.

1.2.1.  Interoperability issues

   A major interoperability issue in throttling is the lack of standard
   headers, because:

   *  each implementation associates different semantics to the same
      header field names;

   *  header field names proliferate.

   User Agents interfacing with different servers may thus need to
   process different headers, or the very same application interface
   that sits behind different reverse proxies may reply with different
   throttling headers.

1.3.  This proposal

   This proposal defines syntax and semantics for the following fields:

   *  "RateLimit-Limit": containing the requests quota in the time
      window;

   *  "RateLimit-Remaining": containing the remaining requests quota in
      the current window;

   *  "RateLimit-Reset": containing the time remaining in the current
      window, specified in seconds.

   The behavior of "RateLimit-Reset" is compatible with the "delay-
   seconds" notation of "Retry-After".

   The field definitions allow describing complex policies, including
   the ones using multiple and variable time windows and dynamic quotas,
   or implementing concurrency limits.

1.4.  Goals

   The goals of this proposal are:

   1.  Standardize the names and semantics of rate-limit headers to
       ease their enforcement and adoption;

   2.  Improve resiliency of HTTP infrastructure by providing clients
       with information useful to throttle their requests and prevent
       4xx or 5xx responses;

   3.  Simplify API documentation by eliminating the need to include
       detailed quota limits and related header fields in API
       documentation.

   The goals do not include:

   Authorization:  The rate-limit fields described here are not meant to
      support authorization or other kinds of access controls.

   Throttling scope:  This specification does not cover the throttling
      scope, which may be the given resource-target, its parent path or
      the whole Origin (see Section 7 of [RFC6454]).

   Response status code:  The rate-limit fields may be returned in both
      successful (see Section 15.3 of [SEMANTICS]) and non-successful
      responses.  This specification does not cover whether non-
      successful responses count toward quota usage, nor does it mandate
      any correlation between the RateLimit values and the returned
      status code.

   Throttling policy:  This specification does not mandate a specific
      throttling policy.  The values published in the fields, including
      the window size, can be statically or dynamically evaluated.

   Service Level Agreement:  Conveyed quota hints do not imply any
      service guarantee.  A server is free to throttle respectful
      clients under certain circumstances.

1.5.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   This document uses the Augmented BNF defined in [RFC5234] and updated
   by [RFC7405] along with the "#rule" extension defined in
   Section 5.6.1 of [SEMANTICS].

   The term Origin is to be interpreted as described in Section 7 of
   [RFC6454].

   The "delay-seconds" rule is defined in Section 10.2.4 of [SEMANTICS].

2.  Expressing rate-limit policies

2.1.  Time window

   Rate limit policies limit the number of acceptable requests in a
   given time window.

   A time window is expressed in seconds, using the following syntax:

       time-window = delay-seconds

   Subsecond precision is not supported.

2.2.  Service limit

   The service-limit is a value associated with the maximum number of
   requests that the server is willing to accept from one or more
   clients on a given basis (originating IP, authenticated user,
   geographical area, ...) during a "time-window" as defined in
   Section 2.1.

   The "service-limit" is expressed in "quota-units" and has the
   following syntax:

       service-limit = quota-units
       quota-units = 1*DIGIT

   The "service-limit" SHOULD match the maximum number of acceptable
   requests.

   The "service-limit" MAY differ from the total number of acceptable
   requests when weight mechanisms, bursts, or other server policies are
   implemented.

   If the "service-limit" does not match the maximum number of
   acceptable requests, the relation between the two SHOULD be
   communicated out-of-band.

   Example: A server could

   *  count once requests like "/books/{id}"

   *  count twice search requests like "/books?author=Camilleri"

   so that we have the following counters

   GET /books/123                  ; service-limit=4, remaining: 3, status=200
   GET /books?author=Camilleri     ; service-limit=4, remaining: 1, status=200
   GET /books?author=Eco           ; service-limit=4, remaining: 0, status=429

2.3.  Quota policy

   This specification allows describing a quota policy with the
   following syntax:

       quota-policy = service-limit OWS ";" OWS "w" "=" time-window
                      *( OWS ";" OWS quota-comment )
       quota-comment = token "=" (token / quoted-string)

   quota-policy parameters like "w" and quota-comment tokens MUST NOT
   occur multiple times within the same quota-policy.

   An example policy of 100 quota-units per minute:

       100;w=60

   The definition of a quota-policy does not imply any specific
   distribution of quota-units over time.  Such service-specific details
   can be conveyed in the "quota-comments".

   Two examples of providing further details via custom parameters in
   "quota-comments":

       100;w=60;comment="fixed window"
       12;w=1;burst=1000;policy="leaky bucket"

3.  Header Specifications

   The following "RateLimit" response fields are defined.

3.1.  RateLimit-Limit

   The "RateLimit-Limit" response field indicates the "service-limit"
   associated with the client in the current "time-window".

   If the client exceeds that limit, the server MAY decline to serve it.

   The field value is

       RateLimit-Limit = expiring-limit [ OWS "," OWS 1#quota-policy ]
       expiring-limit = service-limit

   The "expiring-limit" value MUST be set to the "service-limit" that is
   closest to being reached.

   The "quota-policy" is defined in Section 2.3, and its values are
   informative.

       RateLimit-Limit: 100

   A "time-window" associated with "expiring-limit" can be communicated
   via an optional "quota-policy" value, as shown in the following
   example:

       RateLimit-Limit: 100, 100;w=10

   If the "expiring-limit" is not associated with a "time-window", the
   "time-window" MUST either be:

   *  inferred from the value of "RateLimit-Reset" at the moment of the
      reset, or

   *  communicated out-of-band (e.g. in the documentation).

   Policies using multiple quota limits MAY be returned using multiple
   "quota-policy" items, as shown in the following two examples:

       RateLimit-Limit: 10, 10;w=1, 50;w=60, 1000;w=3600, 5000;w=86400
       RateLimit-Limit: 10, 10;w=1;burst=1000, 1000;w=3600

   This field MUST NOT occur multiple times and can be sent in a trailer
   section.

3.2.  RateLimit-Remaining

   The "RateLimit-Remaining" response field indicates the remaining
   "quota-units" defined in Section 2.2 associated with the client.

   The field value is

       RateLimit-Remaining = quota-units

   This field MUST NOT occur multiple times and can be sent in a trailer
   section.

   Clients MUST NOT assume that a positive "RateLimit-Remaining" value
   is a guarantee that further requests will be served.

   A low "RateLimit-Remaining" value is like a yellow traffic-light for
   either the number of requests issued in the "time-window" or the
   request throughput: the red light may arrive suddenly (see
   Section 4).

   One example of "RateLimit-Remaining" use is below.

       RateLimit-Remaining: 50

3.3.  RateLimit-Reset

   The "RateLimit-Reset" response field indicates the number of seconds
   until the quota resets.

   The field value is

       RateLimit-Reset = delay-seconds

   The delay-seconds format is used because:

   *  it does not rely on clock synchronization and is resilient to
      clock adjustment and clock skew between client and server (see
      Section 5.6.7 of [SEMANTICS]);

   *  it mitigates the risk related to thundering herd when too many
      clients are serviced with the same timestamp.

   This field MUST NOT occur multiple times and can be sent in a trailer
   section.

   An example of "RateLimit-Reset" use is below.

       RateLimit-Reset: 50

   The client MUST NOT assume that its entire "service-limit" will be
   restored after the moment referenced by "RateLimit-Reset".  The
   server MAY arbitrarily alter the "RateLimit-Reset" value between
   subsequent requests, e.g. in case of resource saturation or to
   implement sliding window policies.

4.  Providing RateLimit fields

   A server MAY use one or more "RateLimit" response fields defined in
   this document to communicate its quota policies.

   The returned values refer to the metrics used to evaluate whether the
   current request respects the quota policy and might not apply to
   subsequent requests.

   Example: a successful response with the following fields

       RateLimit-Limit: 10
       RateLimit-Remaining: 1
       RateLimit-Reset: 7

   does not guarantee that the next request will be successful.  Server
   metrics may be subject to other conditions like the one shown in the
   example from Section 2.2.

   A server MAY return "RateLimit" response fields independently of the
   response status code.  This includes throttled responses.

   This document does not mandate any correlation between the
   "RateLimit" values and the returned status code.

   Servers should be careful when returning "RateLimit" fields in
   redirection responses (e.g. 3xx status codes) because a low
   "RateLimit-Remaining" value could prevent the client from issuing
   requests.  For example, given the rate limiting fields below, a
   client could decide to wait 10 seconds before following the
   "Location" header, because "RateLimit-Remaining" is 0.

   HTTP/1.1 301 Moved Permanently
   Location: /foo/123
   RateLimit-Remaining: 0
   RateLimit-Limit: 10
   RateLimit-Reset: 10

   If a response contains both the "Retry-After" and the "RateLimit-
   Reset" fields, the value of "RateLimit-Reset" SHOULD reference the
   same point in time as "Retry-After".

   When using a policy involving more than one "time-window", the server
   MUST reply with the "RateLimit" fields related to the window with the
   lowest "RateLimit-Remaining" value.

   A service returning "RateLimit" fields MUST NOT convey values
   exposing an unwanted volume of requests and SHOULD implement
   mechanisms to cap the ratio between "RateLimit-Remaining" and
   "RateLimit-Reset" (see Section 9.5); this is especially important
   when quota-policies use a large "time-window".

   Under certain conditions, a server MAY artificially lower "RateLimit"
   field values between subsequent requests, e.g. to respond to Denial
   of Service attacks or in case of resource saturation.

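   As a rough illustration of the multi-window rule stated earlier in
   this section, the sketch below picks, among several quota windows,
   the one with the lowest remaining quota-units and renders the three
   fields for it.  The "Window" class and the "ratelimit_fields"
   function are hypothetical names introduced only for this sketch;
   this document does not define them, and listing every window as a
   quota-policy item is optional.  The sample values mirror the
   hourly/daily policy used in the examples of Section 8.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Hypothetical per-window counter kept by the server."""
    limit: int      # service-limit, in quota-units
    seconds: int    # time-window length (the "w" parameter)
    used: int       # quota-units consumed in the current window
    reset_in: int   # seconds until this window resets

    @property
    def remaining(self) -> int:
        return max(self.limit - self.used, 0)


def ratelimit_fields(windows: list[Window]) -> dict[str, str]:
    """Render the fields for the window with the lowest remaining quota."""
    expiring = min(windows, key=lambda w: w.remaining)
    # Informative quota-policy items, one per window.
    policies = ", ".join(f"{w.limit};w={w.seconds}" for w in windows)
    return {
        "RateLimit-Limit": f"{expiring.limit}, {policies}",
        "RateLimit-Remaining": str(expiring.remaining),
        "RateLimit-Reset": str(expiring.reset_in),
    }


# Hourly and daily windows, 14 hours into the day (assumed sample data).
fields = ratelimit_fields([
    Window(limit=1000, seconds=3600, used=0, reset_in=3600),
    Window(limit=5000, seconds=86400, used=4900, reset_in=36000),
])
for name, value in fields.items():
    print(f"{name}: {value}")
```

   A real server would also need to decide how to cap the values it
   exposes (see Section 9.5); that is deliberately left out here.
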
   Servers usually establish whether the request is in-quota before
   creating a response, so the RateLimit field values should already be
   available at that moment.  Nonetheless servers MAY decide to send the
   "RateLimit" fields in a trailer section.

4.1.  Performance considerations

   Servers are not required to return "RateLimit" fields in every
   response, and clients need to take this into account.  For example,
   an implementer concerned with performance might provide "RateLimit"
   fields only when a given quota is going to expire.

   Implementers concerned with the size of response fields might take
   into account their ratio with respect to the payload data, or use
   HTTP header-compression features such as [HPACK].

5.  Intermediaries

   This section documents the considerations advised in Section 16.3.3
   of [SEMANTICS].

   An intermediary that is not part of the originating service
   infrastructure and is not aware of the quota-policy semantic used by
   the Origin Server SHOULD NOT alter the RateLimit fields' values in
   such a way as to communicate a more permissive quota-policy; this
   includes removing the RateLimit fields.

   An intermediary MAY alter the RateLimit fields in such a way as to
   communicate a more restrictive quota-policy when:

   *  it is aware of the quota-unit semantic used by the Origin Server;

   *  it implements this specification and enforces a quota-policy which
      is more restrictive than the one conveyed in the fields.

   An intermediary SHOULD forward a request even when presuming that it
   might not be serviced; the service returning the RateLimit fields is
   solely responsible for enforcing the communicated quota-policy, and
   it is always free to service incoming requests.

   This specification does not mandate any behavior on intermediaries
   with respect to retries, nor does it require that intermediaries have
   any role in respecting quota-policies.  For example, it is legitimate
   for a proxy to retransmit a request without notifying the client, and
   thus consume quota-units.

6.  Caching

   As is the ordinary case for HTTP caching ([RFC7234]), a response with
   RateLimit fields might be cached and re-used for subsequent requests.
   A cached "RateLimit" response does not modify quota counters but
   could contain stale information.  Clients interested in determining
   the freshness of the "RateLimit" fields could rely on fields such as
   "Date" and on the "time-window" of a "quota-policy".

7.  Receiving RateLimit fields

   A client MUST process the received "RateLimit" fields.

   A client MUST validate the values received in the "RateLimit" fields
   before using them and check if there are significant discrepancies
   with the expected ones.  This includes a "RateLimit-Reset" moment too
   far in the future or a "service-limit" too high.

   A client receiving "RateLimit" fields MUST NOT assume that subsequent
   responses contain the same "RateLimit" fields, or any "RateLimit"
   fields at all.

   Malformed "RateLimit" fields MAY be ignored.

   A client SHOULD NOT exceed the "quota-units" expressed in "RateLimit-
   Remaining" before the "time-window" expressed in "RateLimit-Reset".

   A client MAY still probe the server if the "RateLimit-Reset" is
   considered too high.

   The value of "RateLimit-Reset" is generated at response time: a
   client aware of a significant network latency MAY behave accordingly
   and use other information (e.g. the "Date" response header field, or
   otherwise gathered metrics) to better estimate the "RateLimit-Reset"
   moment intended by the server.

   The "quota-policy" values and comments provided in "RateLimit-Limit"
   are informative and MAY be ignored.

   If a response contains both the "RateLimit-Reset" and "Retry-After"
   fields, "Retry-After" MUST take precedence and "RateLimit-Reset" MAY
   be ignored.

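   One way a client might apply the rules above is sketched below: it
   validates the received values, ignores malformed fields, gives
   "Retry-After" precedence over "RateLimit-Reset", and otherwise waits
   only when no quota-units remain.  All function names, the one-day
   plausibility bound and the decision to probe immediately when the
   reset looks implausible are assumptions of this sketch, not
   requirements of this document.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional

MAX_REASONABLE_RESET = 86400  # assumption: distrust resets beyond one day


def parse_digits(value: Optional[str]) -> Optional[int]:
    """Parse a 1*DIGIT value; return None if absent or malformed."""
    if value is None:
        return None
    value = value.strip()
    if value.isascii() and value.isdigit():
        return int(value)
    return None


def retry_after_seconds(value: Optional[str], now: datetime) -> Optional[int]:
    """Convert Retry-After (delay-seconds or HTTP-date) to seconds."""
    if value is None:
        return None
    seconds = parse_digits(value)
    if seconds is not None:
        return seconds
    try:
        delta = parsedate_to_datetime(value.strip()) - now
    except (TypeError, ValueError):
        return None  # malformed date: treat the field as absent
    return max(int(delta.total_seconds()), 0)


def seconds_to_wait(headers: Mapping[str, str], now: datetime) -> int:
    """Suggest a pause before the next request (0 means send now)."""
    # Field names are case-insensitive, so normalize them first.
    fields = {name.lower(): value for name, value in headers.items()}
    retry_after = retry_after_seconds(fields.get("retry-after"), now)
    if retry_after is not None:
        return retry_after  # Retry-After takes precedence
    remaining = parse_digits(fields.get("ratelimit-remaining"))
    reset = parse_digits(fields.get("ratelimit-reset"))
    if remaining is None or reset is None:
        return 0  # missing or malformed fields are ignored
    if reset > MAX_REASONABLE_RESET:
        return 0  # implausible reset: the client may still probe
    return reset if remaining == 0 else 0


now = datetime(2019, 8, 5, 9, 27, 0, tzinfo=timezone.utc)
throttled = {
    "Retry-After": "Mon, 05 Aug 2019 09:27:05 GMT",
    "RateLimit-Reset": "50",
    "Ratelimit-Remaining": "0",
}
exhausted = {"RateLimit-Limit": "100", "Ratelimit-Remaining": "0",
             "Ratelimit-Reset": "50"}
wait_throttled = seconds_to_wait(throttled, now)
wait_exhausted = seconds_to_wait(exhausted, now)
```

   The sketch implements the "consume the quota, then wait" behavior;
   a client preferring to slow down gradually could instead spread its
   remaining quota-units over the seconds left before the reset.
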
   This specification does not mandate a specific throttling behavior
   and implementers can adopt their preferred policies, including:

   *  slowing down or preemptively backing off their request rate when
      approaching quota limits;

   *  consuming all the quota according to the exposed limits and then
      waiting.

8.  Examples

8.1.  Unparameterized responses

8.1.1.  Throttling information in responses

   The client exhausted its service-limit for the next 50 seconds.  The
   "time-window" is communicated out-of-band or inferred from the field
   values.

   Request:

   GET /items/123 HTTP/1.1
   Host: api.example

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 100
   RateLimit-Remaining: 0
   RateLimit-Reset: 50

   {"hello": "world"}

   Since the field values are not necessarily correlated with the
   response status code, a subsequent request is not required to fail.
   The example below shows that the server decided to serve the request
   even if "RateLimit-Remaining" is 0.  Another server, or the same
   server under other load conditions, could have decided to throttle
   the request instead.

   Request:

   GET /items/456 HTTP/1.1
   Host: api.example

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 100
   RateLimit-Remaining: 0
   RateLimit-Reset: 48

   {"still": "successful"}

8.1.2.  Use in conjunction with custom fields

   The server uses two custom fields, namely "acme-RateLimit-DayLimit"
   and "acme-RateLimit-HourLimit" to expose the following policy:

   *  5000 daily quota-units;

   *  1000 hourly quota-units.

   The client consumed 4900 quota-units in the first 14 hours.

   Despite the next hourly limit of 1000 quota-units, the closest limit
   to reach is the daily one.

   The server then exposes the "RateLimit-*" fields to inform the client
   that:

   *  it has only 100 quota-units left;

   *  the window will reset in 10 hours.

   Request:

   GET /items/123 HTTP/1.1
   Host: api.example

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   acme-RateLimit-DayLimit: 5000
   acme-RateLimit-HourLimit: 1000
   RateLimit-Limit: 5000
   RateLimit-Remaining: 100
   RateLimit-Reset: 36000

   {"hello": "world"}

8.1.3.  Use for limiting concurrency

   Throttling fields may be used to limit concurrency, advertising
   limits that are lower than the usual ones in case of saturation, thus
   increasing availability.

   The server adopted a basic policy of 100 quota-units per minute, and
   in case of resource exhaustion adapts the returned values, reducing
   both "RateLimit-Limit" and "RateLimit-Remaining".

   After 2 seconds the client consumed 40 quota-units.

   Request:

   GET /items/123 HTTP/1.1
   Host: api.example

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 100
   RateLimit-Remaining: 60
   RateLimit-Reset: 58

   {"elapsed": 2, "issued": 40}

   At the subsequent request, due to resource exhaustion, the server
   advertises only "RateLimit-Remaining: 20".

   Request:

   GET /items/123 HTTP/1.1
   Host: api.example

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 100
   RateLimit-Remaining: 20
   RateLimit-Reset: 56

   {"elapsed": 4, "issued": 41}

8.1.4.  Use in throttled responses

   A client exhausted its quota and the server throttles it, sending
   "Retry-After".

   In this example, the values of "Retry-After" and "RateLimit-Reset"
   reference the same moment, but this is not a requirement.

   The "429 Too Many Requests" HTTP status code is just used as an
   example.

745 Request: 747 GET /items/123 HTTP/1.1 748 Host: api.example 750 Response: 752 HTTP/1.1 429 Too Many Requests 753 Content-Type: application/json 754 Date: Mon, 05 Aug 2019 09:27:00 GMT 755 Retry-After: Mon, 05 Aug 2019 09:27:05 GMT 756 RateLimit-Reset: 5 757 RateLimit-Limit: 100 758 Ratelimit-Remaining: 0 760 { 761 "title": "Too Many Requests", 762 "status": 429, 763 "detail": "You have exceeded your quota" 764 } 766 8.2. Parameterized responses 768 8.2.1. Throttling window specified via parameter 770 The client has 99 "quota-units" left for the next 50 seconds. The 771 "time-window" is communicated by the "w" parameter, so we know the 772 throughput is 100 "quota-units" per minute. 774 Request: 776 GET /items/123 HTTP/1.1 777 Host: api.example 779 Response: 781 HTTP/1.1 200 Ok 782 Content-Type: application/json 783 RateLimit-Limit: 100, 100;w=60 784 Ratelimit-Remaining: 99 785 Ratelimit-Reset: 50 787 {"hello": "world"} 789 8.2.2. Dynamic limits with parameterized windows 791 The policy conveyed by "RateLimit-Limit" states that the server 792 accepts 100 quota-units per minute. 794 To avoid resource exhaustion, the server artificially lowers the 795 actual limits returned in the throttling headers. 797 The "RateLimit-Remaining" then advertises only 9 quota-units for the 798 next 50 seconds to slow down the client. 800 Note that the server could have lowered even the other values in 801 "RateLimit-Limit": this specification does not mandate any relation 802 between the field values contained in subsequent responses. 804 Request: 806 GET /items/123 HTTP/1.1 807 Host: api.example 809 Response: 811 HTTP/1.1 200 Ok 812 Content-Type: application/json 813 RateLimit-Limit: 10, 100;w=60 814 Ratelimit-Remaining: 9 815 Ratelimit-Reset: 50 817 { 818 "status": 200, 819 "detail": "Just slow down without waiting." 820 } 822 8.2.3. 
Dynamic limits for pushing back and slowing down 824 Continuing the previous example, let's say the client waits 10 825 seconds and performs a new request which, due to resource exhaustion, 826 the server rejects and pushes back, advertising "RateLimit-Remaining: 827 0" for the next 20 seconds. 829 The server advertises a smaller window with a lower limit to slow 830 down the client for the rest of its original window after the 20 831 seconds elapse. 833 Request: 835 GET /items/123 HTTP/1.1 836 Host: api.example 838 Response: 840 HTTP/1.1 429 Too Many Requests 841 Content-Type: application/json 842 RateLimit-Limit: 0, 15;w=20 843 Ratelimit-Remaining: 0 844 Ratelimit-Reset: 20 846 { 847 "status": 429, 848 "detail": "Wait 20 seconds, then slow down!" 849 } 851 8.3. Dynamic limits for pushing back with Retry-After and slow down 853 Alternatively, given the same context where the previous example 854 starts, we can convey the same information to the client via "Retry- 855 After", with the advantage that the server can now specify the 856 policy's nominal limit and window that will apply after the reset, 857 ie. assuming the resource exhaustion is likely to be gone by then, so 858 the advertised policy does not need to be adjusted, yet we managed to 859 stop requests for a while and slow down the rest of the current 860 window. 862 Request: 864 GET /items/123 HTTP/1.1 865 Host: api.example 867 Response: 869 HTTP/1.1 429 Too Many Requests 870 Content-Type: application/json 871 Retry-After: 20 872 RateLimit-Limit: 15, 100;w=60 873 Ratelimit-Remaining: 15 874 Ratelimit-Reset: 40 876 { 877 "status": 429, 878 "detail": "Wait 20 seconds, then slow down!" 
   }

   Note that in this last response the client is expected to honor
   "Retry-After" and perform no requests for the specified amount of
   time, whereas the previous example would not force the client to stop
   requests before the reset time has elapsed, as it would still be free
   to query the server again even if it is likely to have the request
   rejected.

8.3.1.  Missing Remaining information

   The server does not expose "RateLimit-Remaining" values, but resets
   the limit counter every second.

   It communicates to the client the limit of 10 quota-units per second
   by always returning the pair "RateLimit-Limit" and "RateLimit-Reset".

   Request:

   GET /items/123 HTTP/1.1
   Host: api.example

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 10
   RateLimit-Reset: 1

   {"first": "request"}

   Request:

   GET /items/123 HTTP/1.1
   Host: api.example

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 10
   RateLimit-Reset: 1

   {"second": "request"}

8.3.2.  Use with multiple windows

   This is a standardized way of describing the policy detailed in
   Section 8.1.2:

   *  5000 daily quota-units;

   *  1000 hourly quota-units.

   The client consumed 4900 quota-units in the first 14 hours.

   Despite the next hourly limit of 1000 quota-units, the closest limit
   to reach is the daily one.

   The server then exposes the "RateLimit" fields to inform the client
   that:

   *  it has only 100 quota-units left;

   *  the window will reset in 10 hours;

   *  the "expiring-limit" is 5000.

   Request:

   GET /items/123 HTTP/1.1
   Host: api.example

   Response:

   HTTP/1.1 200 OK
   Content-Type: application/json
   RateLimit-Limit: 5000, 1000;w=3600, 5000;w=86400
   RateLimit-Remaining: 100
   RateLimit-Reset: 36000

   {"hello": "world"}

9.  Security Considerations

9.1.
Throttling does not prevent clients from issuing requests

   This specification does not prevent clients from making over-quota
   requests.

   Servers should always implement mechanisms to prevent resource
   exhaustion.

9.2.  Information disclosure

   Servers should not disclose operational capacity information that can
   be used to saturate their resources.

   While this specification does not mandate whether non-2xx responses
   consume quota, if 401 and 403 responses count against the quota, a
   malicious client could probe the endpoint to get traffic information
   about another user.

   As intermediaries might retransmit requests and consume quota-units
   without prior knowledge of the User Agent, RateLimit fields might
   reveal the existence of an intermediary to the User Agent.

9.3.  Remaining quota-units are not granted requests

   "RateLimit-*" fields convey hints from the server to the clients in
   order to avoid being throttled out.

   Clients MUST NOT consider the "quota-units" returned in
   "RateLimit-Remaining" as a service level agreement.

   In case of resource saturation, the server MAY artificially lower the
   returned values or not serve the request anyway.

9.4.  Reliability of RateLimit-Reset

   Consider that "service-limit" may not be restored after the moment
   referenced by "RateLimit-Reset", and the "RateLimit-Reset" value
   should not be considered fixed or constant.

   Subsequent requests may return a higher "RateLimit-Reset" value to
   limit concurrency or implement dynamic or adaptive throttling
   policies.

9.5.  Resource exhaustion

   When returning "RateLimit-Reset" you must be aware that many
   throttled clients may come back at the very moment specified.

   This is true for "Retry-After" too.
   For example, if the quota resets every day at "18:00:00" and your
   server returns the "RateLimit-Reset" accordingly

   Date: Tue, 15 Nov 1994 08:00:00 GMT
   RateLimit-Reset: 36000

   there's a high probability that all clients will show up at
   "18:00:00".

   This could be mitigated by adding some jitter to the field-value.

   Resource exhaustion issues can be associated with quota policies
   using a large "time-window", because a user agent might, by chance or
   on purpose, consume most of its quota-units in a significantly
   shorter interval.

   This behavior can even be triggered by the provided "RateLimit"
   fields.  The following example describes a service with an unconsumed
   quota-policy of 10000 quota-units per 1000 seconds.

   RateLimit-Limit: 10000, 10000;w=1000
   RateLimit-Remaining: 10000
   RateLimit-Reset: 10

   A client implementing a simple ratio between "RateLimit-Remaining"
   and "RateLimit-Reset" could infer an average throughput of 1000
   quota-units per second, while "RateLimit-Limit" conveys a
   quota-policy with an average of 10 quota-units per second.  If the
   service cannot handle such load, it should return either a lower
   "RateLimit-Remaining" value or a higher "RateLimit-Reset" value.
   Moreover, complementing large "time-window" quota-policies with a
   short "time-window" one mitigates those risks.

9.6.  Denial of Service

   "RateLimit" fields may assume unexpected values by chance or on
   purpose.  For example, an excessively high "RateLimit-Remaining"
   value may be:

   *  used by a malicious intermediary to trigger a Denial of Service
      attack or consume client resources boosting its requests;

   *  passed by a misconfigured server;

   or a high "RateLimit-Reset" value could inhibit clients from
   contacting the server.

   Clients MUST validate the received values to mitigate those risks.

10.  IANA Considerations

10.1.
RateLimit-Limit Field Registration

   This section registers the "RateLimit-Limit" field in the "Hypertext
   Transfer Protocol (HTTP) Field Name Registry" ([SEMANTICS]).

   Field name: "RateLimit-Limit"

   Status: permanent

   Specification document(s): Section 3.1 of this document

10.2.  RateLimit-Remaining Field Registration

   This section registers the "RateLimit-Remaining" field in the
   "Hypertext Transfer Protocol (HTTP) Field Name Registry"
   ([SEMANTICS]).

   Field name: "RateLimit-Remaining"

   Status: permanent

   Specification document(s): Section 3.2 of this document

10.3.  RateLimit-Reset Field Registration

   This section registers the "RateLimit-Reset" field in the "Hypertext
   Transfer Protocol (HTTP) Field Name Registry" ([SEMANTICS]).

   Field name: "RateLimit-Reset"

   Status: permanent

   Specification document(s): Section 3.3 of this document

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008.

   [RFC6454]  Barth, A., "The Web Origin Concept", RFC 6454,
              DOI 10.17487/RFC6454, December 2011.

   [RFC7405]  Kyzivat, P., "Case-Sensitive String Support in ABNF",
              RFC 7405, DOI 10.17487/RFC7405, December 2014.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [SEMANTICS]
              Fielding, R. T., Nottingham, M., and J. Reschke, "HTTP
              Semantics", Work in Progress, Internet-Draft,
              draft-ietf-httpbis-semantics-15, 30 March 2021.

11.2.  Informative References

   [HPACK]    Peon, R. and H.
              Ruellan, "HPACK: Header Compression for
              HTTP/2", RFC 7541, DOI 10.17487/RFC7541, May 2015.

   [RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet:
              Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002.

   [RFC6585]  Nottingham, M. and R. Fielding, "Additional HTTP Status
              Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012.

   [RFC7234]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
              Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching",
              RFC 7234, DOI 10.17487/RFC7234, June 2014.

   [STATUS429]
              Stewart, R., Tuexen, M., and P. Lei, "Stream Control
              Transmission Protocol (SCTP) Stream Reconfiguration",
              RFC 6525, DOI 10.17487/RFC6525, February 2012.

   [UNIX]     The Open Group, "The Single UNIX Specification, Version
              2 - 6 Vol Set for UNIX 98", February 1997.

Appendix A.  Acknowledgements

   Thanks to Willi Schoenborn, Alejandro Martinez Ruiz, Alessandro
   Ranellucci, Amos Jeffries, Martin Thomson, Erik Wilde and Mark
   Nottingham for being the initial contributors of these
   specifications.  Kudos to the first community implementors: Aapo
   Talvensaari, Nathan Friedly and Sanyam Dogra.

Appendix B.  FAQ

   1.  Why define standard fields for throttling?

       To simplify enforcement of throttling policies.

   2.  Can I use RateLimit-* in throttled responses (e.g., with status
       code 429)?

       Yes, you can.

   3.  Are those specs tied to RFC 6585?

       No. [RFC6585] defines the "429" status code and we use it just
       as an example of a throttled request, which could instead use
       "403" or any other status code.  The goal of this specification
       is to standardize the name and semantics of three ratelimit
       fields widely used on the internet.  Stricter relations with
       status codes or error response payloads would impose behaviors
       on all the existing implementations, making adoption more
       complex.

   4.
       Why not pass the throttling scope as a parameter?

       After a discussion on a similar thread
       (https://github.com/httpwg/http-core/pull/317#issuecomment-585868767)
       we will probably add a new "RateLimit-Scope" field to this spec.

       I'm open to suggestions: comment on this issue
       (https://github.com/ioggstream/draft-polli-ratelimit-headers/issues/70)

   5.  Why use delay-seconds instead of a UNIX Timestamp?  Why not
       use subsecond precision?

       Using delay-seconds aligns with "Retry-After", which is returned
       in similar contexts, e.g., on 429 responses.

       Timestamps require a clock synchronization protocol (see
       Section 5.6.7 of [SEMANTICS]).  This may be problematic (e.g.,
       clock adjustment, clock skew, failure of hardcoded clock
       synchronization servers, IoT devices, ...).  Moreover, timestamps
       may not be monotonically increasing due to clock adjustment.  See
       "Another NTP client failure story"
       (https://community.ntppool.org/t/another-ntp-client-failure-story/1014/)

       We did not use subsecond precision because:

       *  it is more subject to system clock correction, like the one
          implemented via the adjtimex() Linux system call;

       *  response-time latency may not make it worthwhile.  A brief
          discussion on the subject is on the httpwg mailing list
          (https://lists.w3.org/Archives/Public/ietf-http-wg/2019JulSep/0202.html)

       *  almost no rate-limit header implementations use it.

   6.  Why not support multiple quota remaining?

       While this might be of some value, my experience suggests that
       overly complex quota implementations result in lower
       effectiveness of this policy.  This spec allows the client to
       easily focus on RateLimit-Remaining and RateLimit-Reset.

   7.  Shouldn't I limit concurrency instead of request rate?
       You can use this specification to limit concurrency at the HTTP
       level (see {#use-for-limiting-concurrency}) and help clients to
       shape their requests to avoid being throttled out.

       A problematic way to limit concurrency is connection dropping,
       especially when connections are multiplexed (e.g., HTTP/2),
       because this results in unserviced client requests, which is
       something we want to avoid.

       A semantic way to limit concurrency is to return 503 +
       Retry-After in case of resource saturation (e.g., thrashing,
       connection queues too long, Service Level Objectives not met,
       ...).  Saturation conditions can be either dynamic or static:
       all this is out of the scope of the current document.

   8.  Does a positive value of "RateLimit-Remaining" imply any service
       guarantee for my future requests to be served?

       No.  FAQ integrated in Section 3.2.

   9.  Is the quota-policy definition in Section 2.3 too complex?

       You can always return the simplest form of the 3 fields

   RateLimit-Limit: 100
   RateLimit-Remaining: 50
   RateLimit-Reset: 60

   The key runtime value is the first element of the list:
   "expiring-limit"; the other "quota-policy" items are informative.
   So for the following field:

   RateLimit-Limit: 100, 100;w=60;burst=1000;comment="sliding window", 5000;w=3600;burst=0;comment="fixed window"

   the key value is the one referencing the lowest limit: "100"

   10. Can we use shorter names?  Why not put everything in one field?

   The most common syntax we found on the web is "X-RateLimit-*" and
   when starting this I-D we opted for it
   (https://github.com/ioggstream/draft-polli-ratelimit-headers/issues/34#issuecomment-519366481)

   The basic form of those fields is easily parseable, even by
   implementers processing responses with technologies such as dynamic
   interpreters with limited syntax.
   Using a single field complicates parsing and takes a significantly
   different approach from the existing ones: this can limit adoption.

   11. Why not mention connections?

       Beware of the term "connection":

       -  it is just _one_ possible saturation cause.  Once you go down
          that path you will expose other infrastructural details
          (bandwidth, CPU, ...; see Section 9.2) and complicate client
          compliance;

       -  it is an infrastructural detail defined in terms of server
          and network rather than the consumed service.  This
          specification protects the services first, and then the
          infrastructures through client cooperation (see Section 9.1).

       RateLimit fields enable sending _on the same connection_
       different limit values on each response, depending on the policy
       scope (e.g., per-user, per-custom-key, ...).

   12. Can intermediaries alter RateLimit fields?

       Generally, they should not because it might result in unserviced
       requests.  There are reasonable use cases for intermediaries
       mangling RateLimit fields though, e.g., when they enforce
       stricter quota-policies, or when they are an active component of
       the service.  In those cases we will consider them as part of
       the originating infrastructure.

   13. Why is the "w" parameter just informative?  Could it be used by
       a client to determine the request rate?

       A non-informative "w" parameter might be fine in an environment
       where clients and servers are tightly coupled.  Conveying
       policies with this detail on a large scale would be very complex
       and implementations would likely not be interoperable.  We thus
       decided to leave "w" as an informational parameter and only rely
       on "RateLimit-Limit", "RateLimit-Remaining" and "RateLimit-Reset"
       for defining the throttling behavior.
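
   As an illustration of the last answer, the following is a minimal
   client-side sketch (not part of this specification) that reads only
   the "expiring-limit", "RateLimit-Remaining" and "RateLimit-Reset"
   values and ignores the informative quota-policies, "w" included.
   The names "RateLimitState", "parse_state" and "pacing_delay" are
   hypothetical, and the even-spacing heuristic with an optional
   "min_delay" floor is an assumption of this sketch: it is the simple
   ratio discussed in Section 9.5, so a real client may want a floor.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class RateLimitState:
    expiring_limit: int  # first list item of RateLimit-Limit
    remaining: int       # RateLimit-Remaining, in quota-units
    reset: int           # RateLimit-Reset, in delay-seconds


def parse_expiring_limit(limit_field: str) -> int:
    """Return the first list item, skipping the informative quota-policies."""
    first_item = limit_field.split(",", 1)[0]
    # Defensive: drop parameters if any are attached to the first item.
    return int(first_item.split(";", 1)[0].strip())


def parse_state(headers: Mapping[str, str]) -> RateLimitState:
    """Build the state from a response's fields (names are case-insensitive)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitState(
        expiring_limit=parse_expiring_limit(lowered["ratelimit-limit"]),
        remaining=int(lowered["ratelimit-remaining"].strip()),
        reset=int(lowered["ratelimit-reset"].strip()),
    )


def pacing_delay(state: RateLimitState, min_delay: float = 0.0) -> float:
    """Hypothetical heuristic: spread the remaining quota-units evenly
    over the seconds left before the reset, never below min_delay."""
    if state.remaining <= 0:
        # Nothing left: wait for the whole reset interval.
        return max(min_delay, float(state.reset))
    return max(min_delay, state.reset / state.remaining)
```

   With the response of Section 8.2.2 ("RateLimit-Limit: 10, 100;w=60",
   9 quota-units remaining, reset in 50 seconds) this sketch spaces
   requests about 5.6 seconds apart, without ever reading "w".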
RateLimit fields currently used on the web

   _RFC Editor: Please remove this section before publication._

   Commonly used header field names are:

   *  "X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset";

   *  "X-Rate-Limit-Limit", "X-Rate-Limit-Remaining",
      "X-Rate-Limit-Reset".

   There are variants too, where the window is specified in the header
   field name, e.g.:

   *  "x-ratelimit-limit-minute", "x-ratelimit-limit-hour",
      "x-ratelimit-limit-day"

   *  "x-ratelimit-remaining-minute", "x-ratelimit-remaining-hour",
      "x-ratelimit-remaining-day"

   Here are some interoperability issues:

   *  "X-RateLimit-Reset" references different values, depending on
      the implementation:

      -  seconds remaining to the window expiration

      -  milliseconds remaining to the window expiration

      -  seconds since the epoch, as a UNIX Timestamp [UNIX]

      -  a datetime, either "IMF-fixdate" [SEMANTICS] or [RFC3339]

   *  different headers, with the same semantics, are used by different
      implementers:

      -  X-RateLimit-Limit and X-Rate-Limit-Limit

      -  X-RateLimit-Remaining and X-Rate-Limit-Remaining

      -  X-RateLimit-Reset and X-Rate-Limit-Reset

   The semantics of RateLimit-Remaining depend on the windowing
   algorithm.  A sliding window policy, for example, may result in
   having a "RateLimit-Remaining" value related to the ratio between the
   current and the maximum throughput.  E.g.

   RateLimit-Limit: 12, 12;w=1
   RateLimit-Remaining: 6          ; using 50% of throughput, that is 6 units/s
   RateLimit-Reset: 1

   If this is the case, the optimal solution is to achieve

   RateLimit-Limit: 12, 12;w=1
   RateLimit-Remaining: 1          ; using 100% of throughput, that is 12 units/s
   RateLimit-Reset: 1

   At this point you should stop increasing your request rate.
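
   The naming variants listed in this appendix can be folded onto the
   field names defined by this document.  The following sketch is only
   an illustration: the function name "canonical_name" and the regular
   expression are assumptions of the example, not something deployed
   implementations are known to share.  It maps names only; values are
   deliberately left untouched, because the legacy reset value has
   different meanings across implementations, as noted above.

```python
import re
from typing import Optional

# Matches "X-RateLimit-<suffix>" and "X-Rate-Limit-<suffix>" in any case.
# Window-specific variants such as "x-ratelimit-limit-minute" do not match.
_LEGACY_NAME = re.compile(r"^x-rate-?limit-(limit|remaining|reset)$", re.IGNORECASE)

_CANONICAL = {
    "limit": "RateLimit-Limit",
    "remaining": "RateLimit-Remaining",
    "reset": "RateLimit-Reset",
}


def canonical_name(field_name: str) -> Optional[str]:
    """Return the RateLimit-* name for a legacy X- field name, or None
    when the name is not one of the two common legacy spellings."""
    match = _LEGACY_NAME.match(field_name.strip())
    if match is None:
        return None
    return _CANONICAL[match.group(1).lower()]
```

   A caller would typically keep the original value next to the mapped
   name and decide separately how to interpret it.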
Changes

   _RFC Editor: Please remove this section before publication._

Since draft-ietf-httpapi-ratelimit-headers-00

   *  Use I-D.httpbis-semantics, which includes referencing
      "delay-seconds" instead of "delta-seconds". #5

Authors' Addresses

   Roberto Polli
   Team Digitale, Italian Government
   Italy

   Email: robipolli@gmail.com


   Alejandro Martinez Ruiz
   Red Hat

   Email: amr@redhat.com