idnits 2.17.1 

draft-polli-ratelimit-headers-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 9 instances of too long lines in the document, the longest one
     being 14 characters in excess of 72.

  ** The abstract seems to contain references ([2], [1]), which it shouldn't.
      Please replace those with straight textual mentions of the documents in
     question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document date (September 05, 2019) is 1694 days in the past.  Is
     this intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 925

  -- Looks like a reference, but probably isn't: '2' on line 927

  -- Looks like a reference, but probably isn't: '3' on line 1039

  -- Looks like a reference, but probably isn't: '4' on line 1044

  == Unused Reference: 'UNIX' is defined on line 910, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 7230 (Obsoleted by RFC 9110, RFC 9112)

  ** Obsolete normative reference: RFC 7231 (Obsoleted by RFC 9110)

  ** Obsolete normative reference: RFC 7234 (Obsoleted by RFC 9111)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'UNIX'


     Summary: 5 errors (**), 0 flaws (~~), 3 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                           R. Polli
3	Internet-Draft                         Team Digitale, Italian Government
4	Intended status: Standards Track                      September 05, 2019
5	Expires: March 8, 2020

7	                    RateLimit Header Fields for HTTP
8	                    draft-polli-ratelimit-headers-00

10	Abstract

12	   This document defines the RateLimit-Limit, RateLimit-Remaining, and
13	   RateLimit-Reset header fields for HTTP, thus allowing servers to
14	   publish current request quotas and clients to shape their request
15	   policy and avoid being throttled out.

17	Note to Readers

19	   _RFC EDITOR: please remove this section before publication_

21	   Discussion of this draft takes place on the HTTP working group
22	   mailing list (ietf-http-wg@w3.org), which is archived at
23	   https://lists.w3.org/Archives/Public/ietf-http-wg/ [1].

25	   The source code and issues list for this draft can be found at
26	   https://github.com/ioggstream/draft-polli-ratelimit-headers [2].

28	Status of This Memo

30	   This Internet-Draft is submitted in full conformance with the
31	   provisions of BCP 78 and BCP 79.

33	   Internet-Drafts are working documents of the Internet Engineering
34	   Task Force (IETF).  Note that other groups may also distribute
35	   working documents as Internet-Drafts.  The list of current Internet-
36	   Drafts is at https://datatracker.ietf.org/drafts/current/.

38	   Internet-Drafts are draft documents valid for a maximum of six months
39	   and may be updated, replaced, or obsoleted by other documents at any
40	   time.  It is inappropriate to use Internet-Drafts as reference
41	   material or to cite them other than as "work in progress."

43	   This Internet-Draft will expire on March 8, 2020.

45	Copyright Notice

47	   Copyright (c) 2019 IETF Trust and the persons identified as the
48	   document authors.  All rights reserved.

50	   This document is subject to BCP 78 and the IETF Trust's Legal
51	   Provisions Relating to IETF Documents
52	   (https://trustee.ietf.org/license-info) in effect on the date of
53	   publication of this document.  Please review these documents
54	   carefully, as they describe your rights and restrictions with respect
55	   to this document.  Code Components extracted from this document must
56	   include Simplified BSD License text as described in Section 4.e of
57	   the Trust Legal Provisions and are provided without warranty as
58	   described in the Simplified BSD License.

60	Table of Contents

62	   1.  Introduction
63	     1.1.  Rate-limiting and quotas
64	     1.2.  Current landscape of rate-limiting headers
65	       1.2.1.  Interoperability issues
66	     1.3.  This proposal
67	     1.4.  Goals
68	     1.5.  Notational Conventions
69	   2.  Expressing rate-limit policies
70	     2.1.  Time window
71	     2.2.  Request quota
72	     2.3.  Quota policy
73	   3.  Header Specifications
74	     3.1.  RateLimit-Limit
75	     3.2.  RateLimit-Remaining
76	     3.3.  RateLimit-Reset
77	   4.  Providing RateLimit headers
78	   5.  Receiving RateLimit headers
79	   6.  Examples
80	     6.1.  Unparameterized responses
81	       6.1.1.  Throttling information in responses
82	       6.1.2.  Use in conjunction with custom headers
83	       6.1.3.  Use for limiting concurrency
84	       6.1.4.  Use in throttled responses
85	     6.2.  Parameterized responses
86	       6.2.1.  Throttling window specified via parameter
87	       6.2.2.  Dynamic limits with parameterized windows
88	       6.2.3.  Missing Remaining information
89	       6.2.4.  Use with multiple windows
90	   7.  Security Considerations
91	     7.1.  Throttling does not prevent clients from issuing requests
92	     7.2.  Information disclosure
93	     7.3.  Remaining quota-units are not granted requests
94	     7.4.  Reliability of RateLimit-Reset
95	     7.5.  Resource exhaustion and clock skew
96	     7.6.  Denial of Service
97	   8.  IANA Considerations
98	     8.1.  RateLimit-Limit Header Field Registration
99	     8.2.  RateLimit-Remaining Header Field Registration
100	     8.3.  RateLimit-Reset Header Field Registration
101	   9.  References
102	     9.1.  Normative References
103	     9.2.  Informative References
104	     9.3.  URIs
105	   Appendix A.  Change Log
106	   Appendix B.  Acknowledgements
107	   Appendix C.  RateLimit headers currently used on the web
108	   Appendix D.  FAQ
109	   Author's Address

111	1.  Introduction

113	   The widespread use of HTTP as a distributed computation protocol
114	   requires an explicit way of communicating service status and usage
115	   quotas.

117	   This was partially addressed with the "Retry-After" header field
118	   defined in [RFC7231] to be returned in "429 Too Many Requests" or
119	   "503 Service Unavailable" responses.

121	   Still, there is no standard way to communicate service quotas so
122	   that the client can throttle its requests and prevent 4xx or 5xx
123	   responses.

125	1.1.  Rate-limiting and quotas

127	   Servers use quota mechanisms to avoid system overload, to ensure an
128	   equitable distribution of computational resources, or to enforce
129	   other policies, e.g. monetization.

131	   A basic quota mechanism limits the number of acceptable requests in a
132	   given time window, e.g. 10 requests per second.

134	   When the quota is exceeded, servers usually do not serve the request
135	   and reply instead with a "4xx" HTTP status code (e.g. 429 or 403), or
136	   adopt more aggressive policies like dropping connections.

138	   Quotas may be enforced on different bases (e.g. per user, per IP, per
139	   geographic area, ...) and at different levels.  For example, a user
140	   may be allowed to issue:

142	   o  10 requests per second;

144	   o  limited to 60 requests per minute;

146	   o  limited to 1000 requests per hour.

148	   Moreover, system metrics, statistics, and heuristics can be used to
149	   implement more complex policies, where the number of acceptable
150	   requests and the time window are computed dynamically.

152	1.2.  Current landscape of rate-limiting headers

154	   To help clients throttle their requests, servers may expose the
155	   counters used to evaluate quota policies via HTTP header fields.

157	   Those response headers may be added by HTTP intermediaries such as
158	   API gateways and reverse proxies.

160	   On the web we can find many different rate-limit headers, usually
161	   containing the number of allowed requests in a given time window, and
162	   when the window is reset.

164	   The common choice is to return three headers containing:

166	   o  the maximum number of allowed requests in the time window;

168	   o  the number of remaining requests in the current window;

170	   o  the time remaining in the current window expressed in seconds or
171	      as a timestamp;

173	1.2.1.  Interoperability issues

175	   A major interoperability issue in throttling is the lack of standard
176	   headers, because:

178	   o  each implementation associates different semantics to the same
179	      header field names;

181	   o  header field names proliferate.

183	   Client applications interfacing with different servers may thus need
184	   to process different headers, or the very same application interface
185	   that sits behind different reverse proxies may reply with different
186	   throttling headers.

188	1.3.  This proposal

190	   This proposal defines syntax and semantics for the following header
191	   fields:

193	   o  "RateLimit-Limit": containing the requests quota in the time
194	      window;

196	   o  "RateLimit-Remaining": containing the remaining requests quota in
197	      the current window;

199	   o  "RateLimit-Reset": containing the time remaining in the current
200	      window, specified in seconds or as a timestamp;

202	   The behavior of "RateLimit-Reset" is compatible with the one of
203	   "Retry-After".

205	   For "RateLimit-Reset", the seconds notation is preferred over the
206	   timestamp notation.

208	   The header field definitions allow describing complex policies,
209	   including the ones using multiple and variable time windows and
210	   dynamic quotas, or implementing concurrency limits.

212	1.4.  Goals

214	   The goals of this proposal are:

216	   1.  Standardize the names and semantics of rate-limit headers;

218	   2.  Improve the resiliency of HTTP infrastructures by simplifying
219	       the enforcement and the adoption of rate-limit headers;

221	   3.  Simplify API documentation by removing the need to spell out
222	       the semantics of rate-limit header fields.

224	   The goals do not include:

226	   Authorization:  The rate-limit headers described here are not meant
227	      to support authorization or other kinds of access controls.

229	   Throttling scope:  This specification does not cover the throttling
230	      scope, that may be the given resource-target, its parent path or
231	      the whole Origin [RFC6454] section 7.

233	   Response status code:  The rate-limit headers may be returned in both
234	      Successful and non Successful responses.  This specification does
235	      not cover whether non Successful responses count on quota usage.

237	   Throttling policy:  This specification does not mandate a specific
238	      throttling policy.  The values published in the headers, including
239	      the window size, can be statically or dynamically evaluated.

241	   Service Level Agreement:  Conveyed quota hints do not imply any
242	      service guarantee.  The server is free to throttle respectful
243	      clients under certain circumstances.

245	1.5.  Notational Conventions

247	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
248	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
249	   "OPTIONAL" in this document are to be interpreted as described in BCP
250	   14 ([RFC2119] and [RFC8174]) when, and only when, they appear in all
251	   capitals, as shown here.

253	   This document uses the Augmented BNF defined in [RFC5234] and updated
254	   by [RFC7405] along with the "#rule" extension defined in Section 7 of
255	   [RFC7230].

257	   The term Origin is to be interpreted as described in [RFC6454]
258	   section 7.

260	   The "delta-seconds" rule is defined in [RFC7234] section 1.2.1.

262	2.  Expressing rate-limit policies

264	2.1.  Time window

266	   Rate limit policies limit the number of acceptable requests in a
267	   given time window.

269	   A time window is expressed in seconds, using the following syntax:

271	   time-window = delta-seconds

273	   Subsecond precision is not supported.

275	2.2.  Request quota

277	   The request-quota is a value associated with the maximum number of
278	   requests that the server is willing to accept from one or more
279	   clients on a given basis (originating IP, authenticated user,
280	   geographical, ...) during a "time-window" as defined in Section 2.1.

282	   The "request-quota" is expressed in "quota-units" and has the
283	   following syntax:

285	   request-quota = quota-units
286	   quota-units = 1*DIGIT

288	   The "request-quota" SHOULD match the maximum number of acceptable
289	   requests.

291	   The "request-quota" MAY differ from the total number of acceptable
292	   requests when weight mechanisms, bursts, or other server policies are
293	   implemented.

295	   If the "request-quota" does not match the maximum number of
296	   acceptable requests, the relation between the two SHOULD be
297	   communicated out-of-band.

299	   Example: A server could

301	   o  count requests like "/books/{id}" once

303	   o  count search requests like "/books?author=Camilleri" twice

305	   so that we have the following counters

307	GET /books/123                  ; request-quota=4, remaining: 3, status=200
308	GET /books?author=Camilleri     ; request-quota=4, remaining: 1, status=200
309	GET /books?author=Eco           ; request-quota=4, remaining: 0, status=429
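
   The counters above can be reproduced with a minimal Python sketch.
   The "WeightedQuota" class and its "cost" rule are hypothetical names
   introduced only for illustration.  The sketch also assumes that a
   rejected request drains the leftover quota-unit, which is one way to
   obtain the "remaining: 0" shown for the last request.

```python
from urllib.parse import urlsplit


class WeightedQuota:
    """Fixed quota where each request consumes a weight in quota-units."""

    def __init__(self, request_quota):
        self.request_quota = request_quota
        self.remaining = request_quota

    @staticmethod
    def cost(target):
        # Assumed weighting: search requests (with a query) count twice.
        return 2 if urlsplit(target).query else 1

    def handle(self, target):
        """Return (status, remaining) for one request."""
        weight = self.cost(target)
        if weight > self.remaining:
            # Assumption: a rejected request drains whatever is left.
            self.remaining = 0
            return 429, self.remaining
        self.remaining -= weight
        return 200, self.remaining


quota = WeightedQuota(4)
results = [quota.handle(target) for target in (
    "/books/123", "/books?author=Camilleri", "/books?author=Eco")]
```

   A real server would also reset "remaining" when the time-window
   expires; that part is left out to keep the weighting visible.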

311	2.3.  Quota policy

313	   This specification allows describing a quota policy with the
314	   following syntax:

316	quota-policy = request-quota; "window" "=" time-window *( OWS ";" OWS quota-comment)
317	quota-comment = token "=" (token / quoted-string)

319	   An example policy of 100 quota-units per minute.

321	   100;window=60

323	   Two examples of providing further details via custom parameters in
324	   "quota-comments".

326	   100;window=60;comment="fixed window"
327	   12;window=1; burst=1000;policy="leaky bucket"
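
   As an illustration, a single "quota-policy" item could be parsed
   along these lines.  This is a lenient sketch, not a validating ABNF
   parser: the function name "parse_quota_policy" is hypothetical, and
   the "window" parameter is accepted in any position even though the
   grammar above places it first.

```python
import re

# A ";name=value" parameter; the value is a token or a quoted-string.
_PARAM = re.compile(r'\s*;\s*([^\s;,="]+)=("(?:[^"\\]|\\.)*"|[^\s;,"]+)')


def parse_quota_policy(text):
    """Parse one quota-policy item into (request_quota, window, comments)."""
    text = text.strip()
    head = re.match(r'[0-9]+', text)
    if head is None:
        raise ValueError("quota-policy must start with a request-quota")
    pos = head.end()
    params = {}
    while pos < len(text):
        match = _PARAM.match(text, pos)
        if match is None:
            raise ValueError("malformed parameter at offset %d" % pos)
        name, value = match.groups()
        if value.startswith('"'):
            # Strip the surrounding quotes and undo quoted-pair escapes.
            value = re.sub(r'\\(.)', r'\1', value[1:-1])
        params[name] = value
        pos = match.end()
    window = params.pop("window", None)
    if window is None or not re.fullmatch(r'[0-9]+', window):
        raise ValueError("quota-policy requires window=delta-seconds")
    return int(head.group()), int(window), params
```

   The remaining parameters are returned as a dictionary of
   quota-comments, which a client is free to ignore.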

329	3.  Header Specifications

331	   The following "RateLimit" response header fields are defined

333	3.1.  RateLimit-Limit

335	   The "RateLimit-Limit" response header field indicates the "request-
336	   quota" associated with the client in the current "time-window".

338	   If the client exceeds that limit, it might not be served.

340	   The header value is

342	   RateLimit-Limit = expiring-limit [, 1#quota-policy ]
343	   expiring-limit = request-quota

345	   The "expiring-limit" value MUST be set to the "request-quota" that is
346	   closest to reaching its limit.

348	   The "quota-policy" is defined in Section 2.3, and its values are
349	   informative.

351	   RateLimit-Limit: 100

353	   A "time-window" associated with "expiring-limit" can be communicated
354	   via an optional "quota-policy" value, as shown in the following
355	   example

357	      RateLimit-Limit: 100, 100;window=10

359	   If the "expiring-limit" is not associated with a "time-window", the
360	   "time-window" MUST either be:

362	   o  inferred from the value of "RateLimit-Reset" at the moment of the
363	      reset, or

365	   o  communicated out-of-band (e.g. in the documentation).

367	   Policies using multiple quota limits MAY be returned using multiple
368	   "quota-policy" items, as shown in the following two examples:

370	   RateLimit-Limit: 10, 10;window=1, 50;window=60, 1000;window=3600, 5000;window=86400
371	   RateLimit-Limit: 10, 10;window=1;burst=1000, 1000;window=3600
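
   A client that wants the policy list could split the header value as
   sketched below.  The function name "parse_ratelimit_limit" is
   hypothetical, and the sketch assumes that no quoted-string in the
   value contains a comma or a semicolon; quota-comments other than
   "window" are dropped because they are informative.

```python
def parse_ratelimit_limit(value):
    """Split a RateLimit-Limit value into (expiring_limit, policies).

    `policies` is a list of (request_quota, window_seconds) pairs.
    """
    items = [item.strip() for item in value.split(",")]
    expiring_limit = int(items[0])
    policies = []
    for item in items[1:]:
        quota, *params = [part.strip() for part in item.split(";")]
        window = None
        for param in params:
            name, _, raw = param.partition("=")
            if name == "window":
                window = int(raw)
        if window is None:
            raise ValueError("quota-policy without a window: %r" % item)
        policies.append((int(quota), window))
    return expiring_limit, policies
```

   For the second example above, the call returns 10 as the
   expiring-limit together with the two advertised policies.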

373	3.2.  RateLimit-Remaining

375	   The "RateLimit-Remaining" response header field indicates the
376	   remaining "quota-units" defined in Section 2.2 associated with the
377	   client.

379	   The header syntax is:

381	   RateLimit-Remaining = quota-units

383	   Clients MUST NOT assume that a positive "RateLimit-Remaining" value
384	   is a guarantee of being served.

386	   A low "RateLimit-Remaining" value is like a yellow traffic-light: the
387	   red light may arrive suddenly.

389	   One example of "RateLimit-Remaining" use is below.

391	      RateLimit-Remaining: 50

393	3.3.  RateLimit-Reset

395	   The "RateLimit-Reset" response header field indicates either

397	   o  the number of seconds until the quota resets, or

399	   o  the timestamp when the quota resets.

401	   The header value is:

403	   RateLimit-Reset = delta-seconds / IMF-fixdate

405	   The "IMF-fixdate" format is defined in [RFC7231] appendix D.

407	   The "RateLimit-Reset" value:

409	   o  SHOULD use the "delta-seconds" format;

411	   o  MAY use the "IMF-fixdate" format.

413	   The "IMF-fixdate" format is NOT RECOMMENDED.

415	   The preferred format is the "delta-seconds" one, because:

417	   o  it does not rely on clock synchronization and is resilient to
418	      clock skew between client and server;

420	   o  it mitigates the risk related to thundering herd when too many
421	      clients are serviced with the same timestamp.

423	   Two examples of "RateLimit-Reset" use are below.

425	   RateLimit-Reset: 50                              ; preferred delta-seconds notation
426	   RateLimit-Reset: Tue, 15 Nov 1994 08:12:31 GMT   ; IMF-fixdate notation

428	   The client MUST NOT assume that all its "request-quota" will be
429	   restored after the moment referenced by "RateLimit-Reset".  The
430	   server MAY arbitrarily alter the "RateLimit-Reset" value between
431	   subsequent requests, e.g. in case of resource saturation or to
432	   implement sliding window policies.
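
   Both notations can be normalized to a number of seconds on the
   client side.  The following Python sketch assumes a hypothetical
   helper named "seconds_until_reset" and relies on the standard
   library to parse the IMF-fixdate form; the optional "now" argument
   exists only to make the result reproducible.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def seconds_until_reset(value, now=None):
    """Return the non-negative number of seconds until the quota resets."""
    value = value.strip()
    if value.isascii() and value.isdigit():
        return int(value)  # delta-seconds notation
    reset_at = parsedate_to_datetime(value)  # IMF-fixdate notation
    if reset_at.tzinfo is None:
        # Treat a date without zone information as UTC.
        reset_at = reset_at.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    # A timestamp in the past means the reset already happened.
    return max(0, int((reset_at - now).total_seconds()))
```

   With the timestamp notation the result depends on the client clock,
   which is the clock-skew weakness described above.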

434	4.  Providing RateLimit headers

436	   A server MAY use one or more "RateLimit" response header fields
437	   defined in this document to communicate its quota policies.

439	   The returned values refer to the metrics used to evaluate whether the
440	   current request respects the quota policy and might not apply to
441	   subsequent requests.

443	   Example: a successful response with the following header fields

445	   RateLimit-Limit: 10
446	   RateLimit-Remaining: 1
447	   RateLimit-Reset: 7

449	   does not guarantee that the next request will be successful.  Server
450	   metrics may be subject to other conditions like the one shown in the
451	   example from Section 2.2.

453	   A server MAY return "RateLimit" response header fields independently
454	   of the response status code.  This includes throttled responses.

456	   If a response contains both the "Retry-After" and the "RateLimit-
457	   Reset" header fields, the value of "RateLimit-Reset" MUST be
458	   consistent with the one of "Retry-After".

460	   When using a policy involving more than one "time-window", the server
461	   MUST reply with the "RateLimit" headers related to the window with
462	   the lowest "RateLimit-Remaining" value.
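
   A server tracking several windows could select the one to advertise
   as in the sketch below.  The function name and the dictionary keys
   are hypothetical, and the hourly counters in the usage example are
   assumed values; the daily counters match Section 6.1.2.

```python
def ratelimit_headers(windows):
    """Build RateLimit header fields from per-window counters.

    `windows` is a list of dicts with the hypothetical keys "limit",
    "remaining", and "reset" (seconds until that window resets).
    """
    # Advertise the window with the lowest number of remaining units.
    closest = min(windows, key=lambda window: window["remaining"])
    return {
        "RateLimit-Limit": str(closest["limit"]),
        "RateLimit-Remaining": str(closest["remaining"]),
        "RateLimit-Reset": str(closest["reset"]),
    }


headers = ratelimit_headers([
    {"limit": 1000, "remaining": 1000, "reset": 3600},   # hourly (assumed)
    {"limit": 5000, "remaining": 100, "reset": 36000},   # daily
])
```

   Here the daily window is selected because only 100 quota-units are
   left in it, even though the hourly window resets sooner.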

464	   Under certain conditions, a server MAY artificially lower "RateLimit"
465	   header values between subsequent requests, e.g. to respond to Denial
466	   of Service attacks or in case of resource saturation.

468	5.  Receiving RateLimit headers

470	   A client MUST process the received "RateLimit" headers.

472	   A client MUST validate the values received in the "RateLimit" headers
473	   before using them and check if there are significant discrepancies
474	   with the expected ones.  This includes a "RateLimit-Reset" moment too
475	   far in the future or a "request-quota" too high.

477	   Malformed "RateLimit" headers MAY be ignored.

479	   A client SHOULD NOT exceed the "quota-units" expressed in "RateLimit-
480	   Remaining" before the "time-window" expressed in "RateLimit-Reset".

482	   A client MAY still probe the server if the "RateLimit-Reset" is
483	   considered too high.

485	   The "quota-policy" values and comments provided in "RateLimit-Limit"
486	   are informative and MAY be ignored.

488	   If a response contains both the "RateLimit-Reset" and "Retry-After"
489	   header fields, the "Retry-After" header field MUST take precedence
490	   and the "RateLimit-Reset" header field MAY be ignored.
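
   The precedence rule can be sketched as a small client-side helper.
   The function name "seconds_to_wait" is hypothetical, and its
   arguments are assumed to be already parsed and validated integers.

```python
def seconds_to_wait(remaining, reset, retry_after=None):
    """Return how long a client should wait before its next request.

    `remaining` is the RateLimit-Remaining value, `reset` the
    RateLimit-Reset value in seconds, and `retry_after` the Retry-After
    value in seconds, or None when that header field is absent.
    """
    if retry_after is not None:
        # Retry-After takes precedence over RateLimit-Reset.
        return retry_after
    if remaining > 0:
        # Quota-units are still available, so no wait is needed.
        return 0
    return reset
```

   A positive "remaining" value only means that the client need not
   wait; it does not guarantee that the next request will be served.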

492	6.  Examples

494	6.1.  Unparameterized responses

496	6.1.1.  Throttling information in responses

498	   The client exhausted its request-quota for the next 50 seconds.  The
499	   "time-window" is communicated out-of-band or inferred from the header
500	   values.

502	   Request:

504	     GET /items/123

506	   Response:

508	     HTTP/1.1 200 Ok
509	     Content-Type: application/json
510	     RateLimit-Limit: 100
511	     RateLimit-Remaining: 0
512	     RateLimit-Reset: 50

514	     {"hello": "world"}

516	6.1.2.  Use in conjunction with custom headers

518	   The server uses two custom headers, namely "acme-RateLimit-DayLimit"
519	   and "acme-RateLimit-HourLimit" to expose the following policy:

521	   o  5000 daily quota-units;

523	   o  1000 hourly quota-units.

525	   The client consumed 4900 quota-units in the first 14 hours.

527	   Despite the next hourly limit of 1000 quota-units, the closest limit
528	   to reach is the daily one.

530	   The server then exposes the "RateLimit-*" headers to inform the
531	   client that:

533	   o  it has only 100 quota-units left;

535	   o  the window will reset in 10 hours.

537	   Request:

539	     GET /items/123

541	   Response:

543	     HTTP/1.1 200 Ok
544	     Content-Type: application/json
545	     acme-RateLimit-DayLimit: 5000
546	     acme-RateLimit-HourLimit: 1000
547	     RateLimit-Limit: 5000
548	     RateLimit-Remaining: 100
549	     RateLimit-Reset: 36000

551	     {"hello": "world"}

553	6.1.3.  Use for limiting concurrency

555	   Throttling headers may be used to limit concurrency, advertising
556	   limits that are lower than the usual ones in case of saturation, thus
557	   increasing availability.

559	   The server adopted a basic policy of 100 quota-units per minute, and
560	   in case of resource exhaustion adapts the returned values reducing
561	   both "RateLimit-Limit" and "RateLimit-Remaining".

563	   After 2 seconds the client consumed 40 quota-units.

564	   Request:

566	     GET /items/123

568	   Response:

570	     HTTP/1.1 200 Ok
571	     Content-Type: application/json
572	     RateLimit-Limit: 100
573	     RateLimit-Remaining: 60
574	     RateLimit-Reset: 58

576	     {"elapsed": 2, "issued": 40}

578	   At the subsequent request - due to resource exhaustion - the server
579	   advertises only "RateLimit-Remaining: 20".

581	   Request:

583	     GET /items/123

585	   Response:

587	     HTTP/1.1 200 Ok
588	     Content-Type: application/json
589	     RateLimit-Limit: 100
590	     RateLimit-Remaining: 20
591	     RateLimit-Reset: 56

593	     {"elapsed": 4, "issued": 41}

595	6.1.4.  Use in throttled responses

597	   A client exhausted its quota and the server throttles the request
598	   sending the "Retry-After" response header field.

600	   The values of "Retry-After" and "RateLimit-Reset" are consistent as
601	   they reference the same moment.

603	   The "429 Too Many Requests" HTTP status code is just used as an
604	   example.

606	   Request:

608	     GET /items/123

610	   Response:

612	     HTTP/1.1 429 Too Many Requests
613	     Content-Type: application/json
614	     Date: Mon, 05 Aug 2019 09:27:00 GMT
615	     Retry-After: Mon, 05 Aug 2019 09:27:05 GMT
616	     RateLimit-Reset: 5
617	     RateLimit-Limit: 100
618	     RateLimit-Remaining: 0

620	     {
621	       "title": "Too Many Requests",
622	       "status": 429,
623	       "detail": "You have exceeded your quota"
624	     }

626	6.2.  Parameterized responses

628	6.2.1.  Throttling window specified via parameter

630	   The client has 99 "quota-units" left for the next 50 seconds.  The
631	   "time-window" is communicated by the "window" parameter, so we know
632	   the throughput is 100 "quota-units" per minute.

634	   Request:

636	     GET /items/123

638	   Response:

640	     HTTP/1.1 200 Ok
641	     Content-Type: application/json
642	     RateLimit-Limit: 100, 100;window=60
643	     RateLimit-Remaining: 99
644	     RateLimit-Reset: 50

646	     {"hello": "world"}

648	6.2.2.  Dynamic limits with parameterized windows

650	   The policy conveyed by "RateLimit-Limit" states that the server
651	   accepts 100 quota-units per minute.

653	   Due to resource exhaustion, the server artificially lowers the actual
654	   limits returned in the throttling headers.

656	   The current policy then advertises only 9 quota-units for the next
657	   50 seconds.

659	   Note that the server could also have lowered the other values in
660	   "RateLimit-Limit": this specification does not mandate any relation
661	   between the header values in subsequent responses.

663	   Request:

665	     GET /items/123

667	   Response:

669	     HTTP/1.1 200 Ok
670	     Content-Type: application/json
671	     RateLimit-Limit: 10, 100;window=60
672	     RateLimit-Remaining: 9
673	     RateLimit-Reset: 50

675	     {"hello": "world"}

677	6.2.3.  Missing Remaining information

679	   The server does not expose "RateLimit-Remaining" values, but resets
680	   the limit counter every second.

682	   It communicates to the client the limit of 10 quota-units per second
683	   by always returning the pair "RateLimit-Limit" and "RateLimit-Reset".

685	   Request:

687	     GET /items/123

689	   Response:

691	     HTTP/1.1 200 Ok
692	     Content-Type: application/json
693	     RateLimit-Limit: 10
694	     RateLimit-Reset: 1

696	     {"first": "request"}

698	   Request:

700	     GET /items/123

702	   Response:

704	     HTTP/1.1 200 Ok
705	     Content-Type: application/json
706	     RateLimit-Limit: 10
707	     RateLimit-Reset: 1

709	     {"second": "request"}

711	6.2.4.  Use with multiple windows

713	   This is a standardized way of describing the policy detailed in
714	   Section 6.1.2:

716	   o  5000 daily quota-units;

718	   o  1000 hourly quota-units.

720	   The client consumed 4900 quota-units in the first 14 hours.

722	   Despite the next hourly limit of 1000 quota-units, the closest limit
723	   to reach is the daily one.

725	   The server then exposes the "RateLimit" headers to inform the client
726	   that:

728	   o  it has only 100 quota-units left;

730	   o  the window will reset in 10 hours;

732	   o  the "expiring-limit" is 5000.

734	   Request:

736	     GET /items/123

738	   Response:

740	     HTTP/1.1 200 Ok
741	     Content-Type: application/json
742	     RateLimit-Limit: 5000, 1000;window=3600, 5000;window=86400
743	     RateLimit-Remaining: 100
744	     RateLimit-Reset: 36000

746	     {"hello": "world"}

748	7.  Security Considerations

750	7.1.  Throttling does not prevent clients from issuing requests

752	   This specification does not prevent clients from making over-quota
753	   requests.

755	   Servers should always implement mechanisms to prevent resource
756	   exhaustion.

758	7.2.  Information disclosure

760	   Servers should not disclose operational capacity information that
761	   can be used to saturate their resources.

763	   While this specification does not mandate whether non-2xx responses
764	   consume quota, if 401 and 403 responses count on quota a malicious
765	   client could probe the endpoint to get traffic information about
766	   another user.

768	7.3.  Remaining quota-units are not granted requests

770	   "RateLimit-*" headers convey hints from the server to the clients in
771	   order to avoid being throttled out.

773	   Clients MUST NOT consider the "quota-units" returned in "RateLimit-
774	   Remaining" as a service level agreement.

776	   In case of resource saturation, the server MAY artificially lower the
777	   returned values or not serve the request anyway.

779	7.4.  Reliability of RateLimit-Reset

781	   Consider that "request-quota" may not be restored after the moment
782	   referenced by "RateLimit-Reset", and the "RateLimit-Reset" value
783	   should not be considered fixed or constant.

785	   Subsequent requests may return a higher "RateLimit-Reset" value to
786	   limit concurrency or implement dynamic or adaptive throttling
787	   policies.

789	7.5.  Resource exhaustion and clock skew

791	   Implementers returning "RateLimit-Reset" must be aware that many
792	   throttled clients may come back at the very moment specified.  For
793	   example, when returning

795	   "RateLimit-Reset: Tue, 15 Nov 1994 08:00:00 GMT"

797	   there's a high probability that all clients will show up at
798	   "08:00:00".

800	   This could be mitigated by adding some jitter to the header value.
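
   One possible way to add jitter on the server side is sketched
   below.  The function name "reset_with_jitter" and the default bound
   of 5 seconds are assumptions made for illustration.  The jitter is
   only added, never subtracted, so that no client is invited back
   before the quota actually resets.

```python
import random


def reset_with_jitter(reset_seconds, max_jitter=5, rng=random):
    """Return a RateLimit-Reset value in delta-seconds with jitter added.

    `max_jitter` bounds the extra delay in seconds; `rng` can be any
    object with a randint(a, b) method, which helps in tests.
    """
    return reset_seconds + rng.randint(0, max_jitter)


# Each client receives a value between 50 and 55 seconds, inclusive.
value = reset_with_jitter(50)
```

   Spreading the advertised reset moments this way trades a slightly
   longer wait for some clients against a smaller burst at the reset.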

802	7.6.  Denial of Service

804	   "RateLimit" header fields may assume unexpected values by accident
805	   or on purpose.  For example, an excessively high
806	   "RateLimit-Remaining" value may be:

808	   o  used by a malicious intermediary to trigger a Denial of Service
809	      attack or consume client resources by boosting its requests;

811	   o  passed by a misconfigured server;

813	   or a high "RateLimit-Reset" value could prevent clients from
814	   contacting the server.

816	   Clients MUST validate the received values to mitigate those risks.
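
   A minimal validation step on the client could look like the sketch
   below.  The function name and both thresholds ("max_limit" and
   "max_reset") are hypothetical; a real client would derive them from
   the quotas it expects for the service.  Clamping "remaining" to
   "limit" is likewise a design choice of this sketch, not a
   requirement of this specification.

```python
def validate_ratelimit(limit, remaining, reset,
                       max_limit=10_000, max_reset=86_400):
    """Return sanitized (limit, remaining, reset), or None to ignore them.

    All three inputs are integers already parsed from the header fields.
    """
    if min(limit, remaining, reset) < 0:
        return None
    if limit > max_limit or reset > max_reset:
        # Values far beyond what the client expects are not trusted.
        return None
    # Never act on more remaining quota-units than the advertised limit.
    return limit, min(remaining, limit), reset
```

   When the function returns None, the client can fall back to its own
   default request pacing instead of trusting the received values.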

818	8.  IANA Considerations

820	8.1.  RateLimit-Limit Header Field Registration

822	   This section registers the "RateLimit-Limit" header field in the
823	   "Permanent Message Header Field Names" registry ([RFC3864]).

825	   Header field name: "RateLimit-Limit"

826	   Applicable protocol: http

828	   Status: standard

830	   Author/Change controller: IETF

832	   Specification document(s): Section 3.1 of this document

834	8.2.  RateLimit-Remaining Header Field Registration

836	   This section registers the "RateLimit-Remaining" header field in the
837	   "Permanent Message Header Field Names" registry ([RFC3864]).

839	   Header field name: "RateLimit-Remaining"

841	   Applicable protocol: http

843	   Status: standard

845	   Author/Change controller: IETF

847	   Specification document(s): Section 3.2 of this document

849	8.3.  RateLimit-Reset Header Field Registration

851	   This section registers the "RateLimit-Reset" header field in the
852	   "Permanent Message Header Field Names" registry ([RFC3864]).

854	   Header field name: "RateLimit-Reset"

856	   Applicable protocol: http

858	   Status: standard

860	   Author/Change controller: IETF

862	   Specification document(s): Section 3.3 of this document

864	9.  References

866	9.1.  Normative References

868	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
869	              Requirement Levels", BCP 14, RFC 2119,
870	              DOI 10.17487/RFC2119, March 1997,
871	              <https://www.rfc-editor.org/info/rfc2119>.

873	   [RFC3864]  Klyne, G., Nottingham, M., and J. Mogul, "Registration
874	              Procedures for Message Header Fields", BCP 90, RFC 3864,
875	              DOI 10.17487/RFC3864, September 2004,
876	              <https://www.rfc-editor.org/info/rfc3864>.

878	   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
879	              Specifications: ABNF", STD 68, RFC 5234,
880	              DOI 10.17487/RFC5234, January 2008,
881	              <https://www.rfc-editor.org/info/rfc5234>.

883	   [RFC6454]  Barth, A., "The Web Origin Concept", RFC 6454,
884	              DOI 10.17487/RFC6454, December 2011,
885	              <https://www.rfc-editor.org/info/rfc6454>.

887	   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
888	              Protocol (HTTP/1.1): Message Syntax and Routing",
889	              RFC 7230, DOI 10.17487/RFC7230, June 2014,
890	              <https://www.rfc-editor.org/info/rfc7230>.

892	   [RFC7231]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
893	              Protocol (HTTP/1.1): Semantics and Content", RFC 7231,
894	              DOI 10.17487/RFC7231, June 2014,
895	              <https://www.rfc-editor.org/info/rfc7231>.

897	   [RFC7234]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
898	              Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching",
899	              RFC 7234, DOI 10.17487/RFC7234, June 2014,
900	              <https://www.rfc-editor.org/info/rfc7234>.

902	   [RFC7405]  Kyzivat, P., "Case-Sensitive String Support in ABNF",
903	              RFC 7405, DOI 10.17487/RFC7405, December 2014,
904	              <https://www.rfc-editor.org/info/rfc7405>.

906	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
907	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
908	              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

910	   [UNIX]     The Open Group, "The Single UNIX Specification, Version
911	              2 - 6 Vol Set for UNIX 98", February 1997.

913	9.2.  Informative References

915	   [RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet:
916	              Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002,
917	              <https://www.rfc-editor.org/info/rfc3339>.

919	   [RFC6585]  Nottingham, M. and R. Fielding, "Additional HTTP Status
920	              Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012,
921	              <https://www.rfc-editor.org/info/rfc6585>.

923	9.3.  URIs

925	   [1] https://lists.w3.org/Archives/Public/ietf-http-wg/

927	   [2] https://github.com/ioggstream/draft-polli-ratelimit-headers

929	   [3] https://community.ntppool.org/t/another-ntp-client-failure-
930	       story/1014/

932	   [4] https://lists.w3.org/Archives/Public/ietf-http-
933	       wg/2019JulSep/0202.html

935	Appendix A.  Change Log

937	   RFC EDITOR PLEASE DELETE THIS SECTION.

939	Appendix B.  Acknowledgements

941	   Thanks to Willi Schoenborn, Alessandro Ranellucci, Erik Wilde and
942	   Mark Nottingham for being the initial contributors to this
943	   specification.

945	Appendix C.  RateLimit headers currently used on the web

947	   RFC EDITOR PLEASE DELETE THIS SECTION.

949	   Commonly used header field names are:

951	   o  "X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset";

953	   o  "X-Rate-Limit-Limit", "X-Rate-Limit-Remaining", "X-Rate-Limit-
954	      Reset".

956	   There are variants too, where the window is specified in the header
957	   field name, e.g.:

959	   o  "x-ratelimit-limit-minute", "x-ratelimit-limit-hour", "x-
960	      ratelimit-limit-day"

962	   o  "x-ratelimit-remaining-minute", "x-ratelimit-remaining-hour", "x-
963	      ratelimit-remaining-day"

965	   Here are some interoperability issues:

967	   o  "X-RateLimit-Reset" references different values, depending on
968	      the implementation:

970	      *  seconds remaining to the window expiration

972	      *  milliseconds remaining to the window expiration

974	      *  seconds since the UNIX Epoch, as a UNIX timestamp

976	      *  a datetime, either "IMF-fixdate" [RFC7231] or [RFC3339]

978	   o  different headers, with the same semantics, are used by different
979	      implementers:

981	      *  X-RateLimit-Limit and X-Rate-Limit-Limit

983	      *  X-RateLimit-Remaining and X-Rate-Limit-Remaining

985	      *  X-RateLimit-Reset and X-Rate-Limit-Reset

987	   The semantics of RateLimit-Remaining depend on the windowing
988	   algorithm.  A sliding window policy, for example, may result in a
989	   ratelimit-remaining value related to the ratio between the current
990	   and the maximum throughput, e.g.:

992	RateLimit-Limit: 12, 12;window=1
993	RateLimit-Remaining: 6             ; using 50% of throughput, that is 6 units/s
994	RateLimit-Reset: 1

996	   If this is the case, the optimal solution is to achieve

998	RateLimit-Limit: 12, 12;window=1
999	RateLimit-Remaining: 1             ; using 100% of throughput, that is 12 units/s
1000	RateLimit-Reset: 1

1002	   At this point you should stop increasing your request rate.

1004	Appendix D.  FAQ

1006	   1.  Why define standard headers for throttling?

1008	       To simplify enforcement of throttling policies.

1010	   2.  Can I use RateLimit-* in throttled responses (e.g. with status
1011	       code 429)?

1013	       Yes, you can.

1015	   3.  Are those specs tied to RFC 6585?

1017	       No.  [RFC6585] defines the "429" status code and we use it just
1018	       as an example of a throttled request, which could instead use 403
1019	       or any other status code.

1021	   4.  Why not pass the throttling scope as a parameter?

1023	       I'm open to suggestions.  File an issue if you think it's worth
1024	       it ;).

1026	   5.  Why use delta-seconds instead of a UNIX timestamp?  Why is
1027	       IMF-fixdate NOT RECOMMENDED?  Why not use subsecond precision?

1029	       Using delta-seconds aligns with "Retry-After", which is returned
1030	       in similar contexts, e.g. on 429 responses.

1032	       delta-seconds as defined in [RFC7234] section 1.2.1 clarifies
1033	       some parsing rules too.

1035	       As explained in [RFC7231] section 4.1.1.1 using "IMF-fixdate"
1036	       requires a clock synchronization protocol.  This may be
1037	       problematic (e.g. clock skew, failure of hardcoded clock
1038	       synchronization servers, IoT devices, ...).  See Another NTP
1039	       client failure story [3].

1041	       We did not use subsecond precision because almost all rate-limit
1042	       header implementations do not use it.  Conveyed values are
1043	       subject to response-time latency.  A brief discussion on the
1044	       subject is on the httpwg mailing list [4].

1046	   6.  Why not support multiple remaining quotas?

1048	       While this might be of some value, my experience suggests that
1049	       overly complex quota implementations result in lower
1050	       effectiveness of this policy.  This spec allows the client to
1051	       easily focus on RateLimit-Remaining and RateLimit-Reset.

1053	   7.  Shouldn't I limit concurrency instead of request rate?

1055	       You can do both.  The goal of this spec is to provide guidance
1056	       for clients in shaping their requests without being throttled
1057	       out.

1059	       Limiting concurrency results in unserviced client requests, which
1060	       is something we want to avoid.

1062	       A standard way to limit concurrency is to return 503 +
1063	       Retry-After in case of resource saturation (e.g. thrashing,
1064	       connection queues too long, Service Level Objectives not met, ...).

1066	       Dynamically lowering the values returned by the rate-limit
1067	       headers, and returning Retry-After along with them, can improve
1068	       availability.

1070	       Saturation conditions can be either dynamic or static: all this
1071	       is out of scope for the current document.

1073	   8.  Does a positive value of "RateLimit-Remaining" imply any service
1074	       guarantee for my future requests to be served?

1076	       No.  The returned values were used to decide whether to serve or
1077	       not _the current request_ and do not imply any guarantee that
1078	       future requests will be successful.

1080	       Instead, they help clients understand when future requests will
1081	       probably be throttled.  A low value for "RateLimit-Remaining"
1082	       should be interpreted as a yellow traffic-light for either the
1083	       number of requests issued in the "time-window" or the request
1084	       throughput.

1086	   9.  Is the quota-policy definition in Section 2.3 too complex?

1088	       You can always return the simplest form of the 3 headers:

1090	       "RateLimit-Limit: 100", "RateLimit-Remaining: 50",
1091	       "RateLimit-Reset: 60"

1093	       The key runtime value is the first element of the list:
1094	       "expiring-limit"; the other "quota-policy" items are informative.
1095	       So for the following header:

1097	       "RateLimit-Limit: 100, 100;window=60;burst=1000;comment="sliding
1098	       window", 5000;window=3600;burst=0;comment="fixed window""

1100	       the key value is the one referencing the lowest limit: "100".
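
       A client that only needs that key value could read it as in the
       Python sketch below.  The function name "expiring_limit" is
       hypothetical, and the sketch assumes, as in the examples in this
       document, that the first list element is a bare integer without
       parameters; everything after the first comma is ignored.

```python
def expiring_limit(field_value):
    """Return the first list element of a RateLimit-Limit value.

    Only the text before the first comma is inspected, so quoted
    comments in the later quota-policy items cannot confuse the split.
    """
    first = field_value.split(",", 1)[0].strip()
    if not (first.isascii() and first.isdigit()):
        raise ValueError("malformed expiring-limit: %r" % first)
    return int(first)


value = ('100, 100;window=60;burst=1000;comment="sliding window", '
         '5000;window=3600;burst=0;comment="fixed window"')
limit = expiring_limit(value)
```

       For the header shown above, "limit" is 100; the simplest form
       "RateLimit-Limit: 100" yields the same result.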

1102	Author's Address

1104	   Roberto Polli
1105	   Team Digitale, Italian Government

1107	   Email: robipolli@gmail.com