idnits 2.17.1 

draft-polli-ratelimit-headers-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 6 instances of too long lines in the document, the longest one
     being 41 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document date (26 May 2020) is 1431 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'UNIX' is defined on line 986, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 7230 (Obsoleted by RFC 9110, RFC 9112)

  ** Obsolete normative reference: RFC 7231 (Obsoleted by RFC 9110)

  ** Obsolete normative reference: RFC 7234 (Obsoleted by RFC 9111)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'UNIX'


     Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	HTTP                                                            R. Polli
3	Internet-Draft                         Team Digitale, Italian Government
4	Intended status: Standards Track                             A. Martinez
5	Expires: 27 November 2020                                        Red Hat
6	                                                             26 May 2020

8	                    RateLimit Header Fields for HTTP
9	                    draft-polli-ratelimit-headers-03

11	Abstract

13	   This document defines the RateLimit-Limit, RateLimit-Remaining,
14	   RateLimit-Reset header fields for HTTP, thus allowing servers to
15	   publish current request quotas and clients to shape their request
16	   policy and avoid being throttled out.

18	Note to Readers

20	   _RFC EDITOR: please remove this section before publication_

22	   Discussion of this draft takes place on the HTTP working group
23	   mailing list (ietf-http-wg@w3.org), which is archived at
24	   https://lists.w3.org/Archives/Public/ietf-http-wg/.

27	   The source code and issues list for this draft can be found at
28	   https://github.com/ioggstream/draft-polli-ratelimit-headers.

31	Status of This Memo

33	   This Internet-Draft is submitted in full conformance with the
34	   provisions of BCP 78 and BCP 79.

36	   Internet-Drafts are working documents of the Internet Engineering
37	   Task Force (IETF).  Note that other groups may also distribute
38	   working documents as Internet-Drafts.  The list of current Internet-
39	   Drafts is at https://datatracker.ietf.org/drafts/current/.

41	   Internet-Drafts are draft documents valid for a maximum of six months
42	   and may be updated, replaced, or obsoleted by other documents at any
43	   time.  It is inappropriate to use Internet-Drafts as reference
44	   material or to cite them other than as "work in progress."

46	   This Internet-Draft will expire on 27 November 2020.

48	Copyright Notice

50	   Copyright (c) 2020 IETF Trust and the persons identified as the
51	   document authors.  All rights reserved.

53	   This document is subject to BCP 78 and the IETF Trust's Legal
54	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
55	   license-info) in effect on the date of publication of this document.
56	   Please review these documents carefully, as they describe your rights
57	   and restrictions with respect to this document.  Code Components
58	   extracted from this document must include Simplified BSD License text
59	   as described in Section 4.e of the Trust Legal Provisions and are
60	   provided without warranty as described in the Simplified BSD License.

62	Table of Contents

64	   1.  Introduction
65	     1.1.  Rate-limiting and quotas
66	     1.2.  Current landscape of rate-limiting headers
67	       1.2.1.  Interoperability issues
68	     1.3.  This proposal
69	     1.4.  Goals
70	     1.5.  Notational Conventions
71	   2.  Expressing rate-limit policies
72	     2.1.  Time window
73	     2.2.  Request quota
74	     2.3.  Quota policy
75	   3.  Header Specifications
76	     3.1.  RateLimit-Limit
77	     3.2.  RateLimit-Remaining
78	     3.3.  RateLimit-Reset
79	   4.  Providing RateLimit headers
80	   5.  Receiving RateLimit headers
81	   6.  Examples
82	     6.1.  Unparameterized responses
83	       6.1.1.  Throttling information in responses
84	       6.1.2.  Use in conjunction with custom headers
85	       6.1.3.  Use for limiting concurrency
86	       6.1.4.  Use in throttled responses
87	     6.2.  Parameterized responses
88	       6.2.1.  Throttling window specified via parameter
89	       6.2.2.  Dynamic limits with parameterized windows
90	       6.2.3.  Dynamic limits for pushing back and slowing down
91	     6.3.  Dynamic limits for pushing back with Retry-After and slow
92	           down
93	       6.3.1.  Missing Remaining information
94	       6.3.2.  Use with multiple windows
95	   7.  Security Considerations
96	     7.1.  Throttling does not prevent clients from issuing
97	           requests
98	     7.2.  Information disclosure
99	     7.3.  Remaining quota-units are not granted requests
100	     7.4.  Reliability of RateLimit-Reset
101	     7.5.  Resource exhaustion
102	     7.6.  Denial of Service
103	   8.  IANA Considerations
104	     8.1.  RateLimit-Limit Header Field Registration
105	     8.2.  RateLimit-Remaining Header Field Registration
106	     8.3.  RateLimit-Reset Header Field Registration
107	   9.  References
108	     9.1.  Normative References
109	     9.2.  Informative References
110	   Appendix A.  Change Log
111	   Appendix B.  Acknowledgements
112	   Appendix C.  RateLimit headers currently used on the web
113	   Appendix D.  FAQ
114	   Authors' Addresses

116	1.  Introduction

118	   The widespread use of HTTP as a distributed computation protocol
119	   requires an explicit way of communicating service status and usage
120	   quotas.

122	   This was partially addressed with the "Retry-After" header field
123	   defined in [RFC7231] to be returned in "429 Too Many Requests" or
124	   "503 Service Unavailable" responses.

126	   Still, there is not a standard way to communicate service quotas so
127	   that the client can throttle its requests and prevent 4xx or 5xx
128	   responses.

130	1.1.  Rate-limiting and quotas

132	   Servers use quota mechanisms to avoid system overload, to ensure an
133	   equitable distribution of computational resources, or to enforce
134	   other policies, e.g. monetization.

136	   A basic quota mechanism limits the number of acceptable requests in a
137	   given time window, e.g. 10 requests per second.

139	   When the quota is exceeded, servers usually do not serve the request
140	   and reply instead with a "4xx" HTTP status code (e.g. 429 or 403), or
141	   adopt more aggressive policies like dropping connections.

143	   Quotas may be enforced on different bases (e.g. per user, per IP, per
144	   geographic area) and at different levels.  For example, a user may
145	   be allowed to issue:

147	   *  10 requests per second;

149	   *  limited to 60 requests per minute;

151	   *  limited to 1000 requests per hour.

153	   Moreover, system metrics, statistics and heuristics can be used to
154	   implement more complex policies, where the number of acceptable
155	   requests and the time window are computed dynamically.

157	1.2.  Current landscape of rate-limiting headers

159	   To help clients throttle their requests, servers may expose the
160	   counters used to evaluate quota policies via HTTP header fields.

162	   Those response headers may be added by HTTP intermediaries such as
163	   API gateways and reverse proxies.

165	   On the web we can find many different rate-limit headers, usually
166	   containing the number of allowed requests in a given time window, and
167	   when the window is reset.

169	   The common choice is to return three headers containing:

171	   *  the maximum number of allowed requests in the time window;

173	   *  the number of remaining requests in the current window;

175	   *  the time remaining in the current window expressed in seconds or
176	      as a timestamp.

178	1.2.1.  Interoperability issues

180	   A major interoperability issue in throttling is the lack of standard
181	   headers, because:

183	   *  each implementation associates different semantics to the same
184	      header field names;

186	   *  header field names proliferate.

188	   Client applications interfacing with different servers may thus need
189	   to process different headers, or the very same application interface
190	   that sits behind different reverse proxies may reply with different
191	   throttling headers.

193	1.3.  This proposal

195	   This proposal defines syntax and semantics for the following header
196	   fields:

198	   *  "RateLimit-Limit": containing the requests quota in the time
199	      window;

201	   *  "RateLimit-Remaining": containing the remaining requests quota in
202	      the current window;

204	   *  "RateLimit-Reset": containing the time remaining in the current
205	      window, specified in seconds.

207	   The behavior of "RateLimit-Reset" is compatible with the "delta-
208	   seconds" notation of "Retry-After".

210	   The header field definitions allow describing complex policies,
211	   including the ones using multiple and variable time windows and
212	   dynamic quotas, or implementing concurrency limits.

214	1.4.  Goals

216	   The goals of this proposal are:

218	   1.  Standardize the names and semantics of rate-limit headers;

220	   2.  Improve resiliency of HTTP infrastructures by simplifying the
221	       enforcement and the adoption of rate-limit headers;

223	   3.  Simplify API documentation by removing the need to spell out
224	       the semantics of rate-limit header fields.

226	   The goals do not include:

228	   Authorization:  The rate-limit headers described here are not meant
229	      to support authorization or other kinds of access controls.

231	   Throttling scope:  This specification does not cover the throttling
232	      scope, which may be the given resource-target, its parent path or
233	      the whole Origin (see Section 7 of [RFC6454]).

235	   Response status code:  The rate-limit headers may be returned in both
236	      Successful and non-Successful responses.  This specification does
237	      not cover whether non-Successful responses count against the quota.

239	   Throttling policy:  This specification does not mandate a specific
240	      throttling policy.  The values published in the headers, including
241	      the window size, can be statically or dynamically evaluated.

243	   Service Level Agreement:  Conveyed quota hints do not imply any
244	      service guarantee.  The server is free to throttle respectful
245	      clients under certain circumstances.

247	1.5.  Notational Conventions

249	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
250	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
251	   "OPTIONAL" in this document are to be interpreted as described in BCP
252	   14 ([RFC2119] and [RFC8174]) when, and only when, they appear in all
253	   capitals, as shown here.

255	   This document uses the Augmented BNF defined in [RFC5234] and updated
256	   by [RFC7405] along with the "#rule" extension defined in Section 7 of
257	   [RFC7230].

259	   The term Origin is to be interpreted as described in [RFC6454]
260	   section 7.

262	   The "delta-seconds" rule is defined in [RFC7234] section 1.2.1.

264	2.  Expressing rate-limit policies

266	2.1.  Time window

268	   Rate limit policies limit the number of acceptable requests in a
269	   given time window.

271	   A time window is expressed in seconds, using the following syntax:

273	   time-window = delta-seconds

275	   Subsecond precision is not supported.

277	2.2.  Request quota

279	   The request-quota is a value associated with the maximum number of
280	   requests that the server is willing to accept from one or more
281	   clients on a given basis (originating IP, authenticated user,
282	   geographical area, etc.) during a "time-window" as defined in
283	   Section 2.1.

284	   The "request-quota" is expressed in "quota-units" and has the
285	   following syntax:

287	      request-quota = quota-units
288	      quota-units = 1*DIGIT

290	   The "request-quota" SHOULD match the maximum number of acceptable
291	   requests.

293	   The "request-quota" MAY differ from the total number of acceptable
294	   requests when weight mechanisms, bursts, or other server policies are
295	   implemented.

297	   If the "request-quota" does not match the maximum number of
298	   acceptable requests, the relation between the two SHOULD be
299	   communicated out-of-band.

301	   Example: A server could

303	   *  count requests like "/books/{id}" once;

305	   *  count search requests like "/books?author=Camilleri" twice;

307	   so that we have the following counters:

309	   GET /books/123                  ; request-quota=4, remaining: 3, status=200
310	   GET /books?author=Camilleri     ; request-quota=4, remaining: 1, status=200
311	   GET /books?author=Eco           ; request-quota=4, remaining: 0, status=429
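
   The counters above can be reproduced with a small sketch.  It is not
   part of this specification: the "WeightedQuota" class, the "cost_of"
   helper and the rule that a rejected request drains the remaining
   quota-units are assumptions made only to match the example.

```python
from dataclasses import dataclass


@dataclass
class WeightedQuota:
    """Hypothetical counter where a request may cost several quota-units."""

    request_quota: int
    remaining: int

    def consume(self, cost: int) -> int:
        """Charge `cost` units and return the status the server would send."""
        if cost > self.remaining:
            # Assumption: a rejected request drains whatever was left.
            self.remaining = 0
            return 429
        self.remaining -= cost
        return 200


def cost_of(target: str) -> int:
    # Assumption: any request-target with a query string is a search.
    return 2 if "?" in target else 1


quota = WeightedQuota(request_quota=4, remaining=4)
log = []
for target in ("/books/123", "/books?author=Camilleri", "/books?author=Eco"):
    status = quota.consume(cost_of(target))
    log.append((target, quota.remaining, status))
```

   A server could equally keep the last quota-unit after a rejection;
   the relation between requests and quota-units is a server policy.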

313	2.3.  Quota policy

315	   This specification allows describing a quota policy with the
316	   following syntax:

318	      quota-policy = request-quota ";" "w" "=" time-window
319	                     *( OWS ";" OWS quota-comment)
320	      quota-comment = token "=" (token / quoted-string)

322	   quota-policy parameters like "w" and quota-comment tokens MUST NOT
323	   occur multiple times within the same quota-policy.

325	   An example policy of 100 quota-units per minute:

327	      100;w=60

329	   Two examples of providing further details via custom parameters in
330	   "quota-comment" items:

332	      100;w=60;comment="fixed window"
333	      12;w=1;burst=1000;policy="leaky bucket"
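
   A minimal sketch of how a recipient might parse a single
   "quota-policy" item follows.  The function name and return shape are
   hypothetical, quoted-string handling is simplified, and the parser is
   more lenient than the grammar in that it accepts "w" in any position.

```python
import re

# token characters as in RFC 7230; this is a simplified reading.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
_PARAM = re.compile(
    r';[ \t]*(' + TOKEN + r')=("(?:[^"\\]|\\.)*"|' + TOKEN + r')[ \t]*'
)


def parse_quota_policy(text: str) -> tuple[int, int, dict[str, str]]:
    """Return (request_quota, window_seconds, comments) for one item."""
    head = re.match(r"[ \t]*(\d+)[ \t]*", text)
    if head is None:
        raise ValueError("quota-policy must start with a request-quota")
    params: dict[str, str] = {}
    pos = head.end()
    while pos < len(text):
        match = _PARAM.match(text, pos)
        if match is None:
            raise ValueError(f"malformed parameter at offset {pos}")
        name, value = match.group(1), match.group(2)
        if name in params:
            # Parameters MUST NOT occur multiple times in a quota-policy.
            raise ValueError(f"parameter {name!r} occurs more than once")
        if value.startswith('"'):
            value = re.sub(r"\\(.)", r"\1", value[1:-1])  # unquote
        params[name] = value
        pos = match.end()
    if not params.get("w", "").isdigit():
        raise ValueError("missing or non-numeric w parameter")
    window = int(params.pop("w"))
    return int(head.group(1)), window, params
```

   For instance, parsing 12;w=1;burst=1000;policy="leaky bucket" with
   this sketch yields a quota of 12, a window of 1 second and the two
   comments "burst" and "policy".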

335	3.  Header Specifications

337	   The following "RateLimit" response header fields are defined.

339	3.1.  RateLimit-Limit

341	   The "RateLimit-Limit" response header field indicates the "request-
342	   quota" associated with the client in the current "time-window".

344	   If the client exceeds that limit, it might not be served.

346	   The header value is

348	      RateLimit-Limit = expiring-limit [ OWS "," OWS 1#quota-policy ]
349	      expiring-limit = request-quota

351	   The "expiring-limit" value MUST be set to the "request-quota" that is
352	   closest to reaching its limit.

354	   The "quota-policy" is defined in Section 2.3, and its values are
355	   informative.

357	      RateLimit-Limit: 100

359	   A "time-window" associated with "expiring-limit" can be communicated
360	   via an optional "quota-policy" value, as shown in the following
361	   example:

363	      RateLimit-Limit: 100, 100;w=10

365	   If the "expiring-limit" is not associated with a "time-window", the
366	   "time-window" MUST either be:

368	   *  inferred from the value of "RateLimit-Reset" at the moment of the
369	      reset, or

371	   *  communicated out-of-band (e.g. in the documentation).

373	   Policies using multiple quota limits MAY be returned using multiple
374	   "quota-policy" items, as shown in the following two examples:

376	      RateLimit-Limit: 10, 10;w=1, 50;w=60, 1000;w=3600, 5000;w=86400
377	      RateLimit-Limit: 10, 10;w=1;burst=1000, 1000;w=3600
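
   As an illustration only, a client could split a "RateLimit-Limit"
   value into its "expiring-limit" and the advertised windows as
   sketched below.  The function name is hypothetical; the sketch skips
   quota-comment parameters and does not handle commas or semicolons
   inside quoted strings.

```python
def parse_ratelimit_limit(value: str) -> tuple[int, list[tuple[int, int]]]:
    """Return (expiring_limit, [(request_quota, window_seconds), ...])."""
    items = [item.strip() for item in value.split(",")]
    if not items[0].isdigit():
        raise ValueError("expiring-limit must be a non-negative integer")
    policies = []
    for item in items[1:]:
        quota, *params = [part.strip() for part in item.split(";")]
        # Keep only the "w" parameter; other parameters are informative.
        windows = [p[2:] for p in params if p.startswith("w=")]
        if not quota.isdigit() or len(windows) != 1 or not windows[0].isdigit():
            raise ValueError(f"malformed quota-policy: {item!r}")
        policies.append((int(quota), int(windows[0])))
    return int(items[0]), policies
```

   Applied to the second example above, the sketch returns an
   "expiring-limit" of 10 and the windows (10, 1) and (1000, 3600).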

379	   This header MUST NOT occur multiple times.

381	3.2.  RateLimit-Remaining

383	   The "RateLimit-Remaining" response header field indicates the
384	   remaining "quota-units" defined in Section 2.2 associated with the
385	   client.

387	   The header value is

389	      RateLimit-Remaining = quota-units

391	   This header MUST NOT occur multiple times.

393	   Clients MUST NOT assume that a positive "RateLimit-Remaining" value
394	   is a guarantee of being served.

396	   A low "RateLimit-Remaining" value is like a yellow traffic-light: the
397	   red light may arrive suddenly.

399	   One example of "RateLimit-Remaining" use is below.

401	      RateLimit-Remaining: 50

403	3.3.  RateLimit-Reset

405	   The "RateLimit-Reset" response header field indicates the number of
406	   seconds until the quota resets.

409	   The header value is

411	      RateLimit-Reset = delta-seconds

413	   The delta-seconds format is used because:

415	   *  it does not rely on clock synchronization and is resilient to
416	      clock adjustment and clock skew between client and server (see
417	      [RFC7231] Section 4.1.1.1);

419	   *  it mitigates the risk related to thundering herd when too many
420	      clients are serviced with the same timestamp.

422	   This header MUST NOT occur multiple times.

424	   An example of "RateLimit-Reset" use is below.

426	      RateLimit-Reset: 50

428	   The client MUST NOT assume that all its "request-quota" will be
429	   restored after the moment referenced by "RateLimit-Reset".  The
430	   server MAY arbitrarily alter the "RateLimit-Reset" value between
431	   subsequent requests, e.g. in case of resource saturation or to
432	   implement sliding window policies.
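
   Because the value is a delta, a client can track the reset moment
   on a local monotonic clock without knowing the server's wall-clock
   time.  The helper names below are hypothetical, and the sketch
   assumes the response is timestamped when it is received.

```python
import time
from typing import Optional


def reset_deadline(reset_seconds: int, received_at: Optional[float] = None) -> float:
    """Return the local monotonic time at which the quota may reset."""
    if received_at is None:
        received_at = time.monotonic()
    return received_at + reset_seconds


def seconds_until_reset(deadline: float, now: Optional[float] = None) -> float:
    """Remaining wait in seconds, never negative."""
    if now is None:
        now = time.monotonic()
    return max(0.0, deadline - now)
```

   The deadline is only a hint: a later response can carry a different
   "RateLimit-Reset" value, in which case the client would recompute it.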

434	4.  Providing RateLimit headers

436	   A server MAY use one or more "RateLimit" response header fields
437	   defined in this document to communicate its quota policies.

439	   The returned values refer to the metrics used to evaluate whether the
440	   current request respects the quota policy and might not apply to
441	   subsequent requests.

443	   Example: a successful response with the following header fields

445	      RateLimit-Limit: 10
446	      RateLimit-Remaining: 1
447	      RateLimit-Reset: 7

449	   does not guarantee that the next request will be successful.  Server
450	   metrics may be subject to other conditions like the one shown in the
451	   example from Section 2.2.

453	   A server MAY return "RateLimit" response header fields independently
454	   of the response status code.  This includes throttled responses.

456	   If a response contains both the "Retry-After" and the "RateLimit-
457	   Reset" header fields, the value of "RateLimit-Reset" SHOULD reference
458	   the same point in time as "Retry-After".

460	   When using a policy involving more than one "time-window", the server
461	   MUST reply with the "RateLimit" headers related to the window with
462	   the lowest "RateLimit-Remaining" value.
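
   A server-side sketch of this selection is given below.  The "Window"
   class and "ratelimit_headers" function are hypothetical, and the
   state of the hourly window (1000 units left, reset in 3600 seconds)
   is an assumed value chosen to reproduce the response in
   Section 6.3.2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Hypothetical server-side state for one time-window."""

    quota: int      # request-quota for this window
    seconds: int    # time-window size, sent as the "w" parameter
    remaining: int  # quota-units left in this window
    reset: int      # seconds until this window resets


def ratelimit_headers(windows: list[Window]) -> dict[str, str]:
    """Build the fields from the window with the lowest remaining units."""
    tightest = min(windows, key=lambda w: w.remaining)
    policies = ", ".join(f"{w.quota};w={w.seconds}" for w in windows)
    return {
        "RateLimit-Limit": f"{tightest.quota}, {policies}",
        "RateLimit-Remaining": str(tightest.remaining),
        "RateLimit-Reset": str(tightest.reset),
    }


headers = ratelimit_headers([
    Window(quota=1000, seconds=3600, remaining=1000, reset=3600),
    Window(quota=5000, seconds=86400, remaining=100, reset=36000),
])
```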

464	   Under certain conditions, a server MAY artificially lower "RateLimit"
465	   field values between subsequent requests, e.g. to respond to Denial of
466	   Service attacks or in case of resource saturation.

468	5.  Receiving RateLimit headers

470	   A client MUST process the received "RateLimit" headers.

472	   A client MUST validate the values received in the "RateLimit" headers
473	   before using them and check if there are significant discrepancies
474	   with the expected ones.  This includes a "RateLimit-Reset" moment too
475	   far in the future or a "request-quota" too high.

477	   Malformed "RateLimit" headers MAY be ignored.
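
   One possible validation step is sketched below.  The bounds and the
   "validated_field" helper are assumptions, not values suggested by
   this document, and treating an implausible value as if the field
   were absent is only one of the reactions a client could choose.

```python
from typing import Optional

# Assumed sanity bounds; a real client would derive them from the
# documentation of the service it talks to.
MAX_RESET_SECONDS = 24 * 60 * 60
MAX_REQUEST_QUOTA = 1_000_000


def validated_field(headers: dict[str, str], name: str, upper: int) -> Optional[int]:
    """Return a RateLimit field as an int, or None if it should be ignored."""
    # Field names are case-insensitive, so compare them lower-cased.
    lowered = {key.lower(): value for key, value in headers.items()}
    raw = lowered.get(name.lower(), "").strip()
    if not (raw.isascii() and raw.isdigit()):
        return None  # absent or malformed
    value = int(raw)
    return value if value <= upper else None  # implausibly large
```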

479	   A client SHOULD NOT exceed the "quota-units" expressed in "RateLimit-
480	   Remaining" before the "time-window" expressed in "RateLimit-Reset".

482	   A client MAY still probe the server if the "RateLimit-Reset" is
483	   considered too high.

485	   The value of "RateLimit-Reset" is generated at response time: a
486	   client aware of a significant network latency MAY behave accordingly
487	   and use other information (e.g. the "Date" response header, or
488	   otherwise gathered metrics) to better estimate the "RateLimit-Reset"
489	   moment intended by the server.
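
   As a sketch of the "Date"-based estimate, the reset moment can be
   anchored to the time at which the server generated the response.
   The function name is hypothetical, and the result is only useful if
   the client's clock roughly agrees with the server's.

```python
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime


def estimated_reset_moment(date_header: str, reset_seconds: int) -> datetime:
    """Add RateLimit-Reset to the server's Date rather than to local time."""
    generated_at = parsedate_to_datetime(date_header)
    return generated_at + timedelta(seconds=reset_seconds)


moment = estimated_reset_moment("Mon, 05 Aug 2019 09:27:00 GMT", 5)
```

   A client that distrusts its wall clock could instead compare "Date"
   with the local receipt time merely to estimate the latency.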

491	   The "quota-policy" values and comments provided in "RateLimit-Limit"
492	   are informative and MAY be ignored.

494	   If a response contains both the "RateLimit-Reset" and "Retry-After"
495	   header fields, the "Retry-After" header field MUST take precedence
496	   and the "RateLimit-Reset" header field MAY be ignored.
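
   The precedence rule can be sketched as follows.  The "wait_seconds"
   name is hypothetical; the sketch assumes field names have already
   been lower-cased and that "now" is a timezone-aware datetime.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional


def wait_seconds(headers: dict[str, str], now: datetime) -> Optional[float]:
    """Seconds to wait before the next request; Retry-After wins if present."""
    retry_after = headers.get("retry-after")
    if retry_after is not None:
        retry_after = retry_after.strip()
        if retry_after.isdigit():
            return float(retry_after)  # delta-seconds form
        # HTTP-date form; an unparsable date raises an exception here.
        when = parsedate_to_datetime(retry_after)
        return max(0.0, (when - now).total_seconds())
    reset = headers.get("ratelimit-reset", "").strip()
    return float(reset) if reset.isdigit() else None
```

   With "Retry-After: 20" and "RateLimit-Reset: 40" in the same
   response, this sketch returns 20 seconds.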

498	6.  Examples

500	6.1.  Unparameterized responses

502	6.1.1.  Throttling information in responses

504	   The client exhausted its request-quota for the next 50 seconds.  The
505	   "time-window" is communicated out-of-band or inferred from the header
506	   values.

508	   Request:

510	   GET /items/123

512	   Response:

514	   HTTP/1.1 200 OK
515	   Content-Type: application/json
516	   RateLimit-Limit: 100
517	   RateLimit-Remaining: 0
518	   RateLimit-Reset: 50

520	   {"hello": "world"}

522	6.1.2.  Use in conjunction with custom headers

524	   The server uses two custom headers, namely "acme-RateLimit-DayLimit"
525	   and "acme-RateLimit-HourLimit" to expose the following policy:

527	   *  5000 daily quota-units;

529	   *  1000 hourly quota-units.

531	   The client consumed 4900 quota-units in the first 14 hours.

533	   Despite the next hourly limit of 1000 quota-units, the closest limit
534	   to reach is the daily one.

536	   The server then exposes the "RateLimit-*" headers to inform the
537	   client that:

539	   *  it has only 100 quota-units left;

541	   *  the window will reset in 10 hours.

543	   Request:

545	   GET /items/123

547	   Response:

549	   HTTP/1.1 200 OK
550	   Content-Type: application/json
551	   acme-RateLimit-DayLimit: 5000
552	   acme-RateLimit-HourLimit: 1000
553	   RateLimit-Limit: 5000
554	   RateLimit-Remaining: 100
555	   RateLimit-Reset: 36000

557	   {"hello": "world"}

559	6.1.3.  Use for limiting concurrency

561	   Throttling headers may be used to limit concurrency, advertising
562	   limits that are lower than the usual ones in case of saturation, thus
563	   increasing availability.

565	   The server adopted a basic policy of 100 quota-units per minute, and
566	   in case of resource exhaustion adapts the returned values reducing
567	   both "RateLimit-Limit" and "RateLimit-Remaining".

569	   After 2 seconds the client consumed 40 quota-units.

570	   Request:

572	   GET /items/123

574	   Response:

576	   HTTP/1.1 200 OK
577	   Content-Type: application/json
578	   RateLimit-Limit: 100
579	   RateLimit-Remaining: 60
580	   RateLimit-Reset: 58

582	   {"elapsed": 2, "issued": 40}

584	   At the subsequent request - due to resource exhaustion - the server
585	   advertises only "RateLimit-Remaining: 20".

587	   Request:

589	   GET /items/123

591	   Response:

593	   HTTP/1.1 200 OK
594	   Content-Type: application/json
595	   RateLimit-Limit: 100
596	   RateLimit-Remaining: 20
597	   RateLimit-Reset: 56

599	   {"elapsed": 4, "issued": 41}

601	6.1.4.  Use in throttled responses

603	   A client exhausted its quota and the server throttles the request
604	   sending the "Retry-After" response header field.

606	   In this example, the values of "Retry-After" and "RateLimit-Reset"
607	   reference the same moment, but this is not a requirement.

609	   The "429 Too Many Requests" HTTP status code is just used as an
610	   example.

612	   Request:

614	   GET /items/123

616	   Response:

618	   HTTP/1.1 429 Too Many Requests
619	   Content-Type: application/json
620	   Date: Mon, 05 Aug 2019 09:27:00 GMT
621	   Retry-After: Mon, 05 Aug 2019 09:27:05 GMT
622	   RateLimit-Reset: 5
623	   RateLimit-Limit: 100
624	   RateLimit-Remaining: 0

626	   {
627	   "title": "Too Many Requests",
628	   "status": 429,
629	   "detail": "You have exceeded your quota"
630	   }

632	6.2.  Parameterized responses

634	6.2.1.  Throttling window specified via parameter

636	   The client has 99 "quota-units" left for the next 50 seconds.  The
637	   "time-window" is communicated by the "w" parameter, so we know the
638	   throughput is 100 "quota-units" per minute.

640	   Request:

642	   GET /items/123

644	   Response:

646	   HTTP/1.1 200 OK
647	   Content-Type: application/json
648	   RateLimit-Limit: 100, 100;w=60
649	   RateLimit-Remaining: 99
650	   RateLimit-Reset: 50

652	   {"hello": "world"}

654	6.2.2.  Dynamic limits with parameterized windows

656	   The policy conveyed by "RateLimit-Limit" states that the server
657	   accepts 100 quota-units per minute.

659	   To avoid resource exhaustion, the server artificially lowers the
660	   actual limits returned in the throttling headers.

662	   The "RateLimit-Remaining" then advertises only 9 quota-units for the
663	   next 50 seconds to slow down the client.

665	   Note that the server could have lowered even the other values in
666	   "RateLimit-Limit": this specification does not mandate any relation
667	   between the field values contained in subsequent responses.

669	   Request:

671	   GET /items/123

673	   Response:

675	   HTTP/1.1 200 OK
676	   Content-Type: application/json
677	   RateLimit-Limit: 10, 100;w=60
678	   RateLimit-Remaining: 9
679	   RateLimit-Reset: 50

681	   {
682	     "status": 200,
683	     "detail": "Just slow down without waiting."
684	   }
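
   One way a client could "just slow down" is to spread the remaining
   quota-units evenly over the time left before the reset.  This even
   pacing and the "pacing_interval" name are assumptions; the
   specification does not prescribe any client-side algorithm.

```python
def pacing_interval(remaining: int, reset_seconds: int) -> float:
    """Seconds to leave between requests so the quota lasts until reset."""
    if remaining <= 0:
        return float(reset_seconds)  # nothing left: wait for the reset
    return reset_seconds / remaining
```

   With the values of this response (9 quota-units, 50 seconds), the
   sketch suggests issuing one request roughly every 5.6 seconds.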

686	6.2.3.  Dynamic limits for pushing back and slowing down

688	   Continuing the previous example, let's say the client waits 10
689	   seconds and performs a new request which, due to resource exhaustion,
690	   the server rejects and pushes back, advertising "RateLimit-Remaining:
691	   0" for the next 20 seconds.

693	   The server advertises a smaller window with a lower limit to slow
694	   down the client for the rest of its original window after the 20
695	   seconds elapse.

697	   Request:

699	   GET /items/123

701	   Response:

703	   HTTP/1.1 429 Too Many Requests
704	   Content-Type: application/json
705	   RateLimit-Limit: 0, 15;w=20
706	   RateLimit-Remaining: 0
707	   RateLimit-Reset: 20

709	   {
710	     "status": 429,
711	     "detail": "Wait 20 seconds, then slow down!"
712	   }

714	6.3.  Dynamic limits for pushing back with Retry-After and slow down

716	   Alternatively, starting from the same context as the previous
717	   example, the server can convey the same information via the
718	   "Retry-After" header field.  The advantage is that the server can
719	   now advertise the policy's nominal limit and window that will apply
720	   after the reset, assuming the resource exhaustion is likely to be
721	   gone by then.  The advertised policy does not need to be adjusted,
722	   yet requests stop for a while and slow down for the rest of the
723	   current window.

725	   Request:

727	   GET /items/123

729	   Response:

731	   HTTP/1.1 429 Too Many Requests
732	   Content-Type: application/json
733	   Retry-After: 20
734	   RateLimit-Limit: 15, 100;w=60
735	   RateLimit-Remaining: 15
736	   RateLimit-Reset: 40

738	   {
739	     "status": 429,
740	     "detail": "Wait 20 seconds, then slow down!"
741	   }

743	   Note that in this last response the client is expected to honor the
744	   "Retry-After" header and perform no requests for the specified amount
745	   of time, whereas the previous example would not force the client to
746	   stop requests before the reset time has elapsed: it would still be
747	   free to query the server again, even if the request is likely to
748	   be rejected.

750	6.3.1.  Missing Remaining information

752	   The server does not expose "RateLimit-Remaining" values, but resets
753	   the limit counter every second.

755	   It communicates to the client the limit of 10 quota-units per second
756	   by always returning the pair "RateLimit-Limit" and "RateLimit-Reset".

758	   Request:

760	   GET /items/123

761	   Response:

763	   HTTP/1.1 200 OK
764	   Content-Type: application/json
765	   RateLimit-Limit: 10
766	   RateLimit-Reset: 1

768	   {"first": "request"}

770	   Request:

772	   GET /items/123

774	   Response:

776	   HTTP/1.1 200 OK
777	   Content-Type: application/json
778	   RateLimit-Limit: 10
779	   RateLimit-Reset: 1

781	   {"second": "request"}

783	6.3.2.  Use with multiple windows

785	   This is a standardized way of describing the policy detailed in
786	   Section 6.1.2:

788	   *  5000 daily quota-units;

790	   *  1000 hourly quota-units.

792	   The client consumed 4900 quota-units in the first 14 hours.

794	   Despite the next hourly limit of 1000 quota-units, the closest limit
795	   to reach is the daily one.

797	   The server then exposes the "RateLimit" headers to inform the client
798	   that:

800	   *  it has only 100 quota-units left;

802	   *  the window will reset in 10 hours;

804	   *  the "expiring-limit" is 5000.

806	   Request:

808	   GET /items/123

809	   Response:

811	   HTTP/1.1 200 OK
812	   Content-Type: application/json
813	   RateLimit-Limit: 5000, 1000;w=3600, 5000;w=86400
814	   RateLimit-Remaining: 100
815	   RateLimit-Reset: 36000

817	   {"hello": "world"}

819	7.  Security Considerations

821	7.1.  Throttling does not prevent clients from issuing requests

823	   This specification does not prevent clients from making over-quota
824	   requests.

826	   Servers should always implement mechanisms to prevent resource
827	   exhaustion.

829	7.2.  Information disclosure

831	   Servers should not disclose operational capacity information that
832	   can be used to saturate their resources.

834	   While this specification does not mandate whether non-2xx responses
835	   consume quota, if 401 and 403 responses count against the quota, a
836	   malicious client could probe the endpoint to get traffic information
837	   about another user.

839	7.3.  Remaining quota-units are not granted requests

841	   "RateLimit-*" headers convey hints from the server to clients to
842	   help them avoid being throttled.

844	   Clients MUST NOT consider the "quota-units" returned in "RateLimit-
845	   Remaining" as a service level agreement.

847	   In case of resource saturation, the server MAY artificially lower the
848	   returned values or not serve the request anyway.

850	7.4.  Reliability of RateLimit-Reset

852	   Consider that "request-quota" may not be restored after the moment
853	   referenced by "RateLimit-Reset", and the "RateLimit-Reset" value
854	   should not be considered fixed or constant.

856	   Subsequent requests may return a higher "RateLimit-Reset" value to
857	   limit concurrency or implement dynamic or adaptive throttling
858	   policies.

860	7.5.  Resource exhaustion

862	   When returning "RateLimit-Reset" you must be aware that many
863	   throttled clients may come back at the very moment specified.

865	   This is true for "Retry-After" too.

867	   For example, if the quota resets every day at "18:00:00" and your
868	   server returns the "RateLimit-Reset" accordingly

870	      Date: Tue, 15 Nov 1994 08:00:00 GMT
871	      RateLimit-Reset: 36000

873	   there's a high probability that all clients will show up at
874	   "18:00:00".

876	   This could be mitigated by adding some jitter to the field-value.
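
   One possible way for a server to add jitter is sketched below in
   Python.  The function name "reset_with_jitter" and the default bound
   of 30 seconds are assumptions made for illustration; this document
   does not prescribe any particular jitter strategy.

```python
import random


def reset_with_jitter(seconds_to_reset, max_jitter=30, rng=random):
    """Return a RateLimit-Reset value delayed by 0..max_jitter seconds.

    Jitter is only added, never subtracted, so a client is never invited
    back before the quota is actually restored.
    """
    if seconds_to_reset < 0 or max_jitter < 0:
        raise ValueError("values must be non-negative")
    return seconds_to_reset + rng.randint(0, max_jitter)


# A seeded generator makes the illustration reproducible.
value = reset_with_jitter(36000, max_jitter=30, rng=random.Random(0))
```

   With the example above, clients receive values between 36000 and
   36030 instead of all receiving 36000.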

878	7.6.  Denial of Service

880	   "RateLimit" header fields may assume unexpected values by accident or
881	   on purpose.  For example, an excessively high "RateLimit-Remaining"
882	   value may be:

884	   *  used by a malicious intermediary to trigger a Denial of Service
885	      attack or to consume client resources by boosting its requests;

887	   *  passed by a misconfigured server;

889	   or a high "RateLimit-Reset" value could inhibit clients from
890	   contacting the server.

892	   Clients MUST validate the received values to mitigate those risks.
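
   A client-side check could look like the following Python sketch.
   The function names and the two upper bounds ("max_remaining" and
   "max_reset") are hypothetical: this document does not define such
   thresholds, and each client has to choose values suited to its use
   case.

```python
def parse_non_negative_int(raw):
    """Return raw as an int, or None if it is not a plain ASCII integer."""
    raw = raw.strip()
    if not (raw.isascii() and raw.isdigit()):
        return None
    return int(raw)


def validated_hints(remaining_raw, reset_raw,
                    max_remaining=10_000, max_reset=86_400):
    """Return (remaining, reset), or None when the hints look suspicious.

    max_remaining and max_reset are assumed client-side sanity bounds.
    """
    remaining = parse_non_negative_int(remaining_raw)
    reset = parse_non_negative_int(reset_raw)
    if remaining is None or reset is None:
        return None
    if remaining > max_remaining or reset > max_reset:
        return None
    return remaining, reset


accepted = validated_hints("100", "36000")
too_high = validated_hints("999999999", "1")
malformed = validated_hints("-1", "1")
```

   A client receiving "None" would ignore the hints and fall back to
   its own default pacing.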

894	8.  IANA Considerations

896	8.1.  RateLimit-Limit Header Field Registration

898	   This section registers the "RateLimit-Limit" header field in the
899	   "Permanent Message Header Field Names" registry ([RFC3864]).

901	   Header field name: "RateLimit-Limit"

903	   Applicable protocol: http
904	   Status: standard

906	   Author/Change controller: IETF

908	   Specification document(s): Section 3.1 of this document

910	8.2.  RateLimit-Remaining Header Field Registration

912	   This section registers the "RateLimit-Remaining" header field in the
913	   "Permanent Message Header Field Names" registry ([RFC3864]).

915	   Header field name: "RateLimit-Remaining"

917	   Applicable protocol: http

919	   Status: standard

921	   Author/Change controller: IETF

923	   Specification document(s): Section 3.2 of this document

925	8.3.  RateLimit-Reset Header Field Registration

927	   This section registers the "RateLimit-Reset" header field in the
928	   "Permanent Message Header Field Names" registry ([RFC3864]).

930	   Header field name: "RateLimit-Reset"

932	   Applicable protocol: http

934	   Status: standard

936	   Author/Change controller: IETF

938	   Specification document(s): Section 3.3 of this document

940	9.  References

942	9.1.  Normative References

944	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
945	              Requirement Levels", BCP 14, RFC 2119,
946	              DOI 10.17487/RFC2119, March 1997,
947	              <https://www.rfc-editor.org/info/rfc2119>.

949	   [RFC3864]  Klyne, G., Nottingham, M., and J. Mogul, "Registration
950	              Procedures for Message Header Fields", BCP 90, RFC 3864,
951	              DOI 10.17487/RFC3864, September 2004,
952	              <https://www.rfc-editor.org/info/rfc3864>.

954	   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
955	              Specifications: ABNF", STD 68, RFC 5234,
956	              DOI 10.17487/RFC5234, January 2008,
957	              <https://www.rfc-editor.org/info/rfc5234>.

959	   [RFC6454]  Barth, A., "The Web Origin Concept", RFC 6454,
960	              DOI 10.17487/RFC6454, December 2011,
961	              <https://www.rfc-editor.org/info/rfc6454>.

963	   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
964	              Protocol (HTTP/1.1): Message Syntax and Routing",
965	              RFC 7230, DOI 10.17487/RFC7230, June 2014,
966	              <https://www.rfc-editor.org/info/rfc7230>.

968	   [RFC7231]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
969	              Protocol (HTTP/1.1): Semantics and Content", RFC 7231,
970	              DOI 10.17487/RFC7231, June 2014,
971	              <https://www.rfc-editor.org/info/rfc7231>.

973	   [RFC7234]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
974	              Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching",
975	              RFC 7234, DOI 10.17487/RFC7234, June 2014,
976	              <https://www.rfc-editor.org/info/rfc7234>.

978	   [RFC7405]  Kyzivat, P., "Case-Sensitive String Support in ABNF",
979	              RFC 7405, DOI 10.17487/RFC7405, December 2014,
980	              <https://www.rfc-editor.org/info/rfc7405>.

982	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
983	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
984	              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

986	   [UNIX]     The Open Group, "The Single UNIX Specification, Version
987	              2 - 6 Vol Set for UNIX 98", February 1997.

989	9.2.  Informative References

991	   [RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet:
992	              Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002,
993	              <https://www.rfc-editor.org/info/rfc3339>.

995	   [RFC6585]  Nottingham, M. and R. Fielding, "Additional HTTP Status
996	              Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012,
997	              <https://www.rfc-editor.org/info/rfc6585>.

999	Appendix A.  Change Log

1001	   RFC EDITOR PLEASE DELETE THIS SECTION.

1003	Appendix B.  Acknowledgements

1005	   Thanks to Willi Schoenborn, Alejandro Martinez Ruiz, Alessandro
1006	   Ranellucci, Amos Jeffries, Martin Thomson, Erik Wilde and Mark
1007	   Nottingham for being the initial contributors of these
1008	   specifications.  Kudos to the first community implementors: Aapo
1009	   Talvensaari, Nathan Friedly and Sanyam Dogra.

1011	Appendix C.  RateLimit headers currently used on the web

1013	   RFC EDITOR PLEASE DELETE THIS SECTION.

1015	   Commonly used header field names are:

1017	   *  "X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset";

1019	   *  "X-Rate-Limit-Limit", "X-Rate-Limit-Remaining", "X-Rate-Limit-
1020	      Reset".

1022	   There are variants too, where the window is specified in the header
1023	   field name, e.g.:

1025	   *  "x-ratelimit-limit-minute", "x-ratelimit-limit-hour", "x-
1026	      ratelimit-limit-day"

1028	   *  "x-ratelimit-remaining-minute", "x-ratelimit-remaining-hour", "x-
1029	      ratelimit-remaining-day"

1031	   Here are some interoperability issues:

1033	   *  "X-RateLimit-Reset" references different values, depending on
1034	      the implementation:

1036	      -  seconds remaining to the window expiration

1038	      -  milliseconds remaining to the window expiration

1040	      -  seconds since the Epoch, as a UNIX Timestamp

1042	      -  a datetime, either "IMF-fixdate" [RFC7231] or [RFC3339]

1044	   *  different headers, with the same semantics, are used by different
1045	      implementers:

1047	      -  X-RateLimit-Limit and X-Rate-Limit-Limit

1049	      -  X-RateLimit-Remaining and X-Rate-Limit-Remaining

1051	      -  X-RateLimit-Reset and X-Rate-Limit-Reset

1053	   The semantics of RateLimit-Remaining depend on the windowing
1054	   algorithm.  A sliding window policy may, for instance, result in a
1055	   RateLimit-Remaining value related to the ratio between the current
1056	   and the maximum throughput.  For example:

1058	   RateLimit-Limit: 12, 12;w=1
1059	   RateLimit-Remaining: 6          ; using 50% of throughput, that is 6 units/s
1060	   RateLimit-Reset: 1

1062	   If this is the case, the optimal solution is to achieve

1064	   RateLimit-Limit: 12, 12;w=1
1065	   RateLimit-Remaining: 1          ; using 100% of throughput, that is 12 units/s
1066	   RateLimit-Reset: 1

1068	   At this point you should stop increasing your request rate.

1070	Appendix D.  FAQ

1072	   1.  Why define standard headers for throttling?

1074	       To simplify enforcement of throttling policies.

1076	   2.  Can I use RateLimit-* in throttled responses (e.g., with status
1077	       code 429)?

1079	       Yes, you can.

1081	   3.  Are those specs tied to RFC 6585?

1083	       No.  [RFC6585] defines the "429" status code and we use it just
1084	       as an example of a throttled request, which could instead use
1085	       403 or any other status code.

1087	   4.  Why not pass the throttling scope as a parameter?

1089	       I'm open to suggestions.  File an issue if you think it's worth
1090	       it ;).

1092	   5.  Why use delta-seconds instead of a UNIX Timestamp?  Why not
1093	       use subsecond precision?

1095	       Using delta-seconds aligns with "Retry-After", which is returned
1096	       in similar contexts, e.g., on 429 responses.

1098	       delta-seconds as defined in [RFC7234] section 1.2.1 clarifies
1099	       some parsing rules too.

1101	       Timestamps require a clock synchronization protocol (see
1102	       [RFC7231] section 4.1.1.1).  This may be problematic (e.g., clock
1103	       adjustment, clock skew, failure of hardcoded clock
1104	       synchronization servers, IoT devices).  Moreover, timestamps
1105	       may not be monotonically increasing due to clock adjustment.  See
1106	       Another NTP client failure story
1107	       (https://community.ntppool.org/t/another-ntp-client-failure-
1108	       story/1014/).

1110	       We did not use subsecond precision because:

1112	       *  that is more subject to system clock correction like the one
1113	          implemented via the adjtimex() Linux system call;

1115	       *  response-time latency may not make it worthwhile.  A brief
1116	          discussion on the subject is on the httpwg mailing list
1117	          (https://lists.w3.org/Archives/Public/ietf-http-
1118	          wg/2019JulSep/0202.html);

1120	       *  almost all rate-limit header implementations do not use it.

1122	   6.  Why not support multiple remaining quotas?

1124	       While this might be of some value, my experience suggests that
1125	       overly complex quota implementations result in lower
1126	       effectiveness of this policy.  This spec allows the client to
1127	       easily focus on RateLimit-Remaining and RateLimit-Reset.

1129	   7.  Shouldn't I limit concurrency instead of request rate?

1131	       You can do both.  The goal of this spec is to provide guidance
1132	       for clients in shaping their requests without being throttled
1133	       out.

1135	       Limiting concurrency results in unserviced client requests, which
1136	       is something we want to avoid.

1138	       A standard way to limit concurrency is to return 503 + Retry-
1139	       After in case of resource saturation (e.g., thrashing, connection
1140	       queues too long, Service Level Objectives not met).

1142	       Availability can be improved by dynamically lowering the values
1143	       returned by the "RateLimit-*" headers to slow down clients, and
1144	       "Retry-After" can be used to push them back.

1146	       Saturation conditions can be either dynamic or static: all this
1147	       is out of scope for the current document.

1149	   8.  Does a positive value of "RateLimit-Remaining" imply any service
1150	       guarantee for my future requests to be served?

1152	       No.  The returned values were used to decide whether to serve or
1153	       not _the current request_ and do not imply any guarantee that
1154	       future requests will be successful.

1156	       Instead they help to understand when future requests will
1157	       probably be throttled.  A low value for "RateLimit-Remaining"
1158	       should be interpreted as a yellow traffic-light for either the
1159	       number of requests issued in the "time-window" or the request
1160	       throughput.

1162	   9.  Is the quota-policy definition (Section 2.3) too complex?

1164	       You can always return the simplest form of the 3 headers:

1166	   RateLimit-Limit: 100
1167	   RateLimit-Remaining: 50
1168	   RateLimit-Reset: 60

1170	   The key runtime value is the first element of the list: "expiring-
1171	   limit"; the other "quota-policy" items are informative.  So for the
1172	   following header:

1174	   RateLimit-Limit: 100, 100;w=60;burst=1000;comment="sliding window", 5000;w=3600;burst=0;comment="fixed window"

1176	   the key value is the one referencing the lowest limit: "100".

1178	   10. Can we use shorter names?  Why not put everything in one
1179	       header?

1181	   The most common syntax we found on the web is "X-RateLimit-*" and
1182	   when starting this I-D we opted for it
1183	   (https://github.com/ioggstream/draft-polli-ratelimit-headers/
1184	   issues/34#issuecomment-519366481).
1185	   The basic form of those headers is easily parseable, even by
1186	   implementors processing responses using technologies like dynamic
1187	   interpreters with limited syntax.

1189	   Using a single header complicates parsing and takes a significantly
1190	   different approach from the existing ones: this can limit adoption.
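
   Regarding the choice of delta-seconds discussed in this FAQ, a
   client can turn a "RateLimit-Reset" value into a local deadline
   without relying on a synchronized wall clock.  The Python sketch
   below uses the monotonic clock of the standard library; the function
   names "reset_deadline" and "seconds_until" are hypothetical.

```python
import time


def reset_deadline(reset_delta_seconds, now=None):
    """Convert a RateLimit-Reset delta into a local monotonic deadline."""
    if now is None:
        now = time.monotonic()
    return now + reset_delta_seconds


def seconds_until(deadline, now=None):
    """Return how long to wait before the deadline, never less than zero."""
    if now is None:
        now = time.monotonic()
    return max(deadline - now, 0.0)


# Fixed clock readings are passed in to keep the illustration deterministic.
deadline = reset_deadline(60, now=100.0)
wait_midway = seconds_until(deadline, now=130.0)
wait_after = seconds_until(deadline, now=200.0)
```

   Because the monotonic clock is not affected by clock adjustments,
   the computed waiting time does not go backwards when the system time
   is corrected.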

1192	Authors' Addresses

1194	   Roberto Polli
1195	   Team Digitale, Italian Government

1197	   Email: robipolli@gmail.com

1199	   Alejandro Martinez Ruiz
1200	   Red Hat

1202	   Email: amr@redhat.com