idnits 2.17.1 

draft-ietf-httpapi-ratelimit-headers-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 3 instances of too long lines in the document, the longest one
     being 37 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (30 May 2022) is 697 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Normative reference to a draft: ref. 'SEMANTICS' 

  -- Obsolete informational reference (is this intentional?): RFC 7234
     (Obsoleted by RFC 9111)


     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	HTTPAPI                                                         R. Polli
3	Internet-Draft                         Team Digitale, Italian Government
4	Intended status: Standards Track                             A. Martinez
5	Expires: 1 December 2022                                         Red Hat
6	                                                             30 May 2022

8	                       RateLimit Fields for HTTP
9	                draft-ietf-httpapi-ratelimit-headers-04

11	Abstract

13	   This document defines the RateLimit-Limit, RateLimit-Remaining, and
14	   RateLimit-Reset fields for HTTP, thus allowing servers to publish
15	   current service limits and clients to shape their request policy and
16	   avoid being throttled out.

18	Note to Readers

20	   _RFC EDITOR: please remove this section before publication_

22	   Discussion of this draft takes place on the HTTP working group
23	   mailing list (httpapi@ietf.org), which is archived at
24	   https://mailarchive.ietf.org/arch/browse/httpapi/.

27	   The source code and issues list for this draft can be found at
28	   https://github.com/ietf-wg-httpapi/ratelimit-headers.

31	   References to ThisRFC in the IANA Considerations section would be
32	   replaced with the RFC number when assigned.

34	Status of This Memo

36	   This Internet-Draft is submitted in full conformance with the
37	   provisions of BCP 78 and BCP 79.

39	   Internet-Drafts are working documents of the Internet Engineering
40	   Task Force (IETF).  Note that other groups may also distribute
41	   working documents as Internet-Drafts.  The list of current Internet-
42	   Drafts is at https://datatracker.ietf.org/drafts/current/.

44	   Internet-Drafts are draft documents valid for a maximum of six months
45	   and may be updated, replaced, or obsoleted by other documents at any
46	   time.  It is inappropriate to use Internet-Drafts as reference
47	   material or to cite them other than as "work in progress."
48	   This Internet-Draft will expire on 1 December 2022.

50	Copyright Notice

52	   Copyright (c) 2022 IETF Trust and the persons identified as the
53	   document authors.  All rights reserved.

55	   This document is subject to BCP 78 and the IETF Trust's Legal
56	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
57	   license-info) in effect on the date of publication of this document.
58	   Please review these documents carefully, as they describe your rights
59	   and restrictions with respect to this document.  Code Components
60	   extracted from this document must include Revised BSD License text as
61	   described in Section 4.e of the Trust Legal Provisions and are
62	   provided without warranty as described in the Revised BSD License.

64	Table of Contents

66	   1.  Introduction
67	     1.1.  Goals
68	     1.2.  Notational Conventions
69	   2.  Expressing rate-limit policies
70	     2.1.  Time window
71	     2.2.  Service limit
72	     2.3.  Quota policy
73	   3.  Providing RateLimit fields
74	     3.1.  Performance considerations
75	   4.  Receiving RateLimit fields
76	     4.1.  Intermediaries
77	     4.2.  Caching
78	   5.  Fields definition
79	     5.1.  RateLimit-Limit
80	     5.2.  RateLimit-Policy
81	     5.3.  RateLimit-Remaining
82	     5.4.  RateLimit-Reset
83	   6.  Security Considerations
84	     6.1.  Throttling does not prevent clients from issuing
85	           requests
86	     6.2.  Information disclosure
87	     6.3.  Remaining quota-units are not granted requests
88	     6.4.  Reliability of RateLimit-Reset
89	     6.5.  Resource exhaustion
90	     6.6.  Denial of Service
91	   7.  Privacy Considerations
92	   8.  IANA Considerations
93	     8.1.  RateLimit Parameters Registration
94	   9.  References
95	     9.1.  Normative References
96	     9.2.  Informative References
97	   Appendix A.  Rate-limiting and quotas
98	     A.1.  Interoperability issues
99	   Appendix B.  Examples
100	     B.1.  Unparameterized responses
101	       B.1.1.  Throttling information in responses
102	       B.1.2.  Use in conjunction with custom fields
103	       B.1.3.  Use for limiting concurrency
104	       B.1.4.  Use in throttled responses
105	     B.2.  Parameterized responses
106	       B.2.1.  Throttling window specified via parameter
107	       B.2.2.  Dynamic limits with parameterized windows
108	       B.2.3.  Dynamic limits for pushing back and slowing down
109	     B.3.  Dynamic limits for pushing back with Retry-After and slow
110	           down
111	       B.3.1.  Missing Remaining information
112	       B.3.2.  Use with multiple windows
113	   FAQ
114	   RateLimit fields currently used on the web
115	   Acknowledgements
116	   Changes
117	     Since draft-ietf-httpapi-ratelimit-headers-03
118	     Since draft-ietf-httpapi-ratelimit-headers-02
119	     Since draft-ietf-httpapi-ratelimit-headers-01
120	     Since draft-ietf-httpapi-ratelimit-headers-00
121	   Authors' Addresses

123	1.  Introduction

125	   The widespread use of HTTP as a distributed computation protocol
126	   requires an explicit way of communicating service status and usage
127	   quotas.

129	   This was partially addressed by the Retry-After header field defined
130	   in [SEMANTICS] to be returned in 429 (Too Many Requests) (see
131	   [STATUS429]) or 503 (Service Unavailable) responses.

133	   Widely deployed quota mechanisms limit the number of acceptable
134	   requests in a given time window, e.g. 10 requests per second;
135	   currently, there is no standard way to communicate service quotas so
136	   that the client can throttle its requests and prevent 4xx or 5xx
137	   responses.  See Appendix A for further information on the current
138	   usage of rate limiting in HTTP.

140	   This document defines syntax and semantics for the following fields:

142	   *  RateLimit-Limit: containing the requests quota in the time window;
143	   *  RateLimit-Remaining: containing the remaining requests quota in
144	      the current window;

146	   *  RateLimit-Reset: containing the time remaining in the current
147	      window, specified in seconds;

149	   *  RateLimit-Policy: containing the quota policy information.

151	   The behavior of RateLimit-Reset is compatible with the delay-seconds
152	   notation of Retry-After.

154	   The field definitions allow describing complex policies, including
155	   those using multiple and variable time windows and dynamic quotas,
156	   or implementing concurrency limits.

158	1.1.  Goals

160	   The goals of the RateLimit fields are:

162	   Interoperability:  Standardization of the names and semantics of
163	      rate-limit headers to ease their enforcement and adoption;

165	   Resiliency:  Improve resiliency of HTTP infrastructure by providing
166	      clients with information useful to throttle their requests and
167	      prevent 4xx or 5xx responses;

169	   Documentation:  Simplify API documentation by eliminating the need to
170	      include detailed quota limits and related fields in API
171	      documentation.

173	   The following features are out of the scope of this document:

175	   Authorization:  RateLimit fields are not meant to support
176	      authorization or other kinds of access controls.

178	   Throttling scope:  This specification does not cover the throttling
179	      scope, which may be the given resource-target, its parent path or
180	      the whole Origin (see Section 7 of [RFC6454]).  This can be
181	      addressed using extensibility mechanisms such as the parameter
182	      registry (see Section 8.1).

184	   Response status code:  RateLimit fields may be returned in both
185	      successful (see Section 15.3 of [SEMANTICS]) and non-successful
186	      responses.  This specification does not cover whether
187	      non-successful responses count toward quota usage, nor does it
188	      mandate any correlation between the RateLimit values and the
189	      returned status code.

191	   Throttling policy:  This specification does not mandate a specific
192	      throttling policy.  The values published in the fields, including
193	      the window size, can be statically or dynamically evaluated.

195	   Service Level Agreement:  Conveyed quota hints do not imply any
196	      service guarantee.  A server is free to throttle respectful
197	      clients under certain circumstances.

199	1.2.  Notational Conventions

201	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
202	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
203	   "OPTIONAL" in this document are to be interpreted as described in
204	   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
205	   capitals, as shown here.

207	   This document uses the Augmented BNF defined in [RFC5234] and updated
208	   by [RFC7405] along with the "#rule" extension defined in
209	   Section 5.6.1 of [SEMANTICS].

211	   The term Origin is to be interpreted as described in Section 7 of
212	   [RFC6454].

214	   This document uses the following terminology from Section 3 of [SF]
215	   to specify syntax and parsing: List, Item, String, Token and Integer
216	   together with the concept of bare item.

218	2.  Expressing rate-limit policies

220	2.1.  Time window

222	   Rate limit policies limit the number of acceptable requests in a
223	   given time window.

225	   A time window is expressed in seconds, using the following syntax:

227	   time-window = delay-seconds
228	   delay-seconds = sf-integer

230	   Where delay-seconds is a non-negative Integer compatible with the
231	   "delay-seconds" rule defined in Section 10.2.3 of [SEMANTICS].

233	   Subsecond precision is not supported.

235	2.2.  Service limit

237	   The service-limit is a value associated with the maximum number of
238	   requests that the server is willing to accept from one or more
239	   clients on a given basis (originating IP, authenticated user,
240	   geographical, etc.) during a time-window as defined in Section 2.1.

242	   The service-limit is expressed in quota-units and has the following
243	   syntax:

245	      service-limit = quota-units
246	      quota-units = sf-integer

248	   where quota-units is a non-negative Integer.

250	   The service-limit SHOULD match the maximum number of acceptable
251	   requests.

253	   The service-limit MAY differ from the total number of acceptable
254	   requests when weight mechanisms, bursts, or other server policies are
255	   implemented.

257	   If the service-limit does not match the maximum number of acceptable
258	   requests, their relation SHOULD be communicated out-of-band.

260	   Example: A server could

262	   *  count requests like /books/{id} once

264	   *  count search requests like /books?author=WuMing twice

266	   so that we have the following counters:

268	   GET /books/123           ; service-limit=4, remaining: 3, status=200
269	   GET /books?author=WuMing ; service-limit=4, remaining: 1, status=200
270	   GET /books?author=Eco    ; service-limit=4, remaining: 0, status=429
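
   The counters above can be reproduced with a minimal sketch.  The
   weights (1 for a plain lookup, 2 for any target carrying a query
   string), the names request_cost and QuotaCounter, and the choice to
   report zero remaining quota-units on a throttled response are
   assumptions made for illustration; this document does not mandate
   any of them.

```python
from typing import Tuple


def request_cost(target: str) -> int:
    """Quota-units consumed by a request target (assumed weights)."""
    # Assumption: search requests are recognized by their query string.
    return 2 if "?" in target else 1


class QuotaCounter:
    """Tracks remaining quota-units within a single time window."""

    def __init__(self, service_limit: int) -> None:
        self.service_limit = service_limit
        self.remaining = service_limit

    def handle(self, target: str) -> Tuple[int, int]:
        """Return (status, remaining) after evaluating one request."""
        cost = request_cost(target)
        if cost > self.remaining:
            # Not enough quota-units left: throttle the request and
            # report no remaining quota, mirroring the example above.
            self.remaining = 0
            return 429, self.remaining
        self.remaining -= cost
        return 200, self.remaining
```

   With a service-limit of 4, the three requests listed above yield
   the same remaining values and status codes as the example.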

272	2.3.  Quota policy

274	   This specification allows describing a quota policy with the
275	   following syntax:

277	      quota-policy = sf-item

279	   where the associated bare-item is a service-limit and parameters are
280	   supported.

282	   The following parameters are defined:

284	   w:  The REQUIRED "w" parameter specifies a time window.  Its syntax
285	      is a "time-window" defined in Section 2.1.

287	   Other parameters are allowed and can be regarded as comments.  They
288	   ought to be registered within the "Hypertext Transfer Protocol (HTTP)
289	   RateLimit Parameters Registry", as described in Section 8.1.

291	   An example policy of 100 quota-units per minute:

293	      100;w=60

295	   The definition of a quota-policy does not imply any specific
296	   distribution of quota-units over time.  Such service specific details
297	   can be conveyed as parameters.

299	   Two policy examples containing further details via custom parameters:

301	      100;w=60;comment="fixed window"
302	      12;w=1;burst=1000;policy="leaky bucket"

304	   To avoid clashes, implementers SHOULD prefix unregistered parameters
305	   with an x-<vendor> identifier, e.g. x-acme-policy, x-acme-burst.
306	   While it is useful to define a clear syntax and semantics even for
307	   custom parameters, it is important to note that user agents are not
308	   required to process quota policy information.
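
   As an illustration of the syntax in this section, the following
   sketch reads a single quota-policy item.  It is deliberately
   simplified and is not a complete Structured Fields parser (see
   [SF]); the function name parse_quota_policy and the shape of the
   returned dictionary are assumptions made for this example.

```python
import re

# One parameter: ";name" or ";name=value", where value is either a
# quoted String or a run of characters up to the next ";".
_PARAM = re.compile(
    r';\s*([a-z*][a-z0-9_.*-]*)(?:=("(?:[^"\\]|\\.)*"|[^;]+))?'
)


def parse_quota_policy(item: str) -> dict:
    """Parse one quota-policy item such as '100;w=60'."""
    item = item.strip()
    head = re.match(r"\d+", item)
    if head is None:
        raise ValueError("service-limit must be a non-negative Integer")
    params = {}
    pos = head.end()
    while pos < len(item):
        m = _PARAM.match(item, pos)
        if m is None:
            raise ValueError(f"malformed parameter at offset {pos}")
        name, raw = m.group(1), m.group(2)
        if raw is None:
            value = True                  # bare parameter
        elif raw.startswith('"'):
            value = re.sub(r"\\(.)", r"\1", raw[1:-1])  # unescape
        elif re.fullmatch(r"-?\d+", raw):
            value = int(raw)
        else:
            value = raw                   # kept as an opaque Token
        params[name] = value
        pos = m.end()
    window = params.get("w")
    if isinstance(window, bool) or not isinstance(window, int) \
            or window < 0:
        raise ValueError('"w" is REQUIRED and must be non-negative')
    return {"limit": int(head.group()), "window": window,
            "params": params}
```

   Parameters other than "w" are collected but not interpreted, in
   line with the note that user agents are not required to process
   quota policy information.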

310	3.  Providing RateLimit fields

312	   A server uses the RateLimit response fields defined in this document
313	   to communicate its quota policies according to the following rules:

315	   *  RateLimit-Limit and RateLimit-Reset are REQUIRED;

317	   *  RateLimit-Remaining is RECOMMENDED.

319	   The returned values refer to the metrics used to evaluate if the
320	   current request respects the quota policy and might not apply to
321	   subsequent requests.

323	   Example: a successful response with the following fields

325	      RateLimit-Limit: 10
326	      RateLimit-Remaining: 1
327	      RateLimit-Reset: 7

329	   does not guarantee that the next request will be successful.  Server
330	   metrics may be subject to other conditions like the one shown in the
331	   example from Section 2.2.

333	   A server MAY return RateLimit response fields independently of the
334	   response status code.  This includes throttled responses.

336	   This document does not mandate any correlation between the RateLimit
337	   values and the returned status code.

339	   Servers should be careful in returning RateLimit fields in
340	   redirection responses (e.g. 3xx status codes) because a low
341	   RateLimit-Remaining value could prevent the client from issuing
342	   requests.  For example, given the rate limiting fields below, a
343	   client could decide to wait 10 seconds before following the Location
344	   header, because RateLimit-Remaining is 0.

346	   HTTP/1.1 301 Moved Permanently
347	   Location: /foo/123
348	   RateLimit-Remaining: 0
349	   RateLimit-Limit: 10
350	   RateLimit-Reset: 10

352	   If a response contains both the Retry-After and the RateLimit-Reset
353	   fields, the value of RateLimit-Reset SHOULD reference the same point
354	   in time as Retry-After.

356	   When using a policy involving more than one time-window, the server
357	   MUST reply with the RateLimit fields related to the window with the
358	   lowest RateLimit-Remaining value.
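
   One possible way for a server to apply this rule is sketched below.
   The WindowState record and the select_fields function are
   hypothetical names; how a server tracks per-window counters is not
   defined by this document.  When two windows have the same remaining
   value, this sketch simply picks the first one listed.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WindowState:
    limit: int      # service-limit for this window, in quota-units
    window: int     # time-window length, in seconds
    remaining: int  # quota-units left in this window
    reset: int      # seconds until this window resets


def select_fields(states: List[WindowState]) -> Dict[str, str]:
    """Build RateLimit fields from the most constrained window."""
    tightest = min(states, key=lambda s: s.remaining)
    policy = ", ".join(f"{s.limit};w={s.window}" for s in states)
    return {
        "RateLimit-Limit": str(tightest.limit),
        "RateLimit-Remaining": str(tightest.remaining),
        "RateLimit-Reset": str(tightest.reset),
        "RateLimit-Policy": policy,
    }
```

   For a per-second window with 9 of 10 quota-units left and a
   per-minute window with 3 of 50 left, the fields describe the
   per-minute window, while RateLimit-Policy lists both.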

360	   A service returning RateLimit fields MUST NOT convey values exposing
361	   an unwanted volume of requests and SHOULD implement mechanisms to cap
362	   the ratio between RateLimit-Remaining and RateLimit-Reset (see
363	   Section 6.5); this is especially important when quota-policies use a
364	   large time-window.

366	   Under certain conditions, a server MAY artificially lower RateLimit
367	   field values between subsequent requests, e.g. to respond to Denial
368	   of Service attacks or in case of resource saturation.

370	   Servers usually establish whether the request is in-quota before
371	   creating a response, so the RateLimit field values should already
372	   be available at that moment.  Nonetheless, servers MAY decide to
373	   send the RateLimit fields in a trailer section.

375	3.1.  Performance considerations

377	   Servers are not required to return RateLimit fields in every
378	   response, and clients need to take this into account.  For example,
379	   an implementer concerned with performance might provide RateLimit
380	   fields only when a given quota is going to expire.

382	   Implementers concerned with the size of response fields might take
383	   into account their ratio with respect to the content length, or use
384	   header-compression HTTP features such as [HPACK].

386	4.  Receiving RateLimit fields

388	   A client MUST process the received RateLimit fields.

390	   A client MUST validate the values received in the RateLimit fields
391	   before using them and check if there are significant discrepancies
392	   with the expected ones.  This includes a RateLimit-Reset moment too
393	   far in the future or a service-limit too high.
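
   A minimal sketch of such a check follows.  The thresholds max_limit
   and max_reset are assumptions that each client would choose from
   its own expectations of the service; this document does not define
   them, and the function name plausible is hypothetical.

```python
def plausible(limit: int, remaining: int, reset: int,
              max_limit: int, max_reset: int) -> bool:
    """Return True if the received values look trustworthy."""
    # All three fields are defined as non-negative Integers.
    if min(limit, remaining, reset) < 0:
        return False
    # A service-limit or remaining quota far above expectations.
    if limit > max_limit or remaining > max_limit:
        return False
    # A reset moment too far in the future.
    if reset > max_reset:
        return False
    return True
```

   A client could fall back to its default request policy whenever
   the check fails, rather than acting on the received values.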

395	   A client receiving RateLimit fields MUST NOT assume that subsequent
396	   responses contain the same RateLimit fields, or any RateLimit fields
397	   at all.

399	   Malformed RateLimit fields MAY be ignored.

401	   A client SHOULD NOT exceed the quota-units expressed in RateLimit-
402	   Remaining before the time-window expressed in RateLimit-Reset.

404	   A client MAY still probe the server if the RateLimit-Reset is
405	   considered too high.

407	   The value of RateLimit-Reset is generated at response time: a client
408	   aware of a significant network latency MAY behave accordingly and use
409	   other information (e.g. the Date response header field, or otherwise
410	   gathered metrics) to better estimate the RateLimit-Reset moment
411	   intended by the server.
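
   As a sketch of this estimate, a client could anchor RateLimit-Reset
   to the Date response header field instead of to the time of
   receipt.  The function names are hypothetical, and the approach
   assumes that the client clock is reasonably close to the server
   clock, which this document does not guarantee.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def estimate_reset(date_header: str, reset_seconds: int) -> datetime:
    """Absolute reset time, counted from the server's Date value."""
    generated_at = parsedate_to_datetime(date_header)
    return generated_at + timedelta(seconds=reset_seconds)


def seconds_to_wait(date_header: str, reset_seconds: int,
                    now: Optional[datetime] = None) -> float:
    """Seconds left until the estimated reset, never negative."""
    if now is None:
        now = datetime.now(timezone.utc)
    left = estimate_reset(date_header, reset_seconds) - now
    return max(0.0, left.total_seconds())
```

   If a response generated at 08:00:00 GMT with RateLimit-Reset: 10 is
   processed three seconds later, the client waits seven seconds
   rather than ten.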

413	   The details provided in RateLimit-Policy are informative and MAY be
414	   ignored.

416	   If a response contains both the RateLimit-Reset and Retry-After
417	   fields, Retry-After MUST take precedence and RateLimit-Reset MAY be
418	   ignored.
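
   The precedence rule can be sketched as follows.  The sketch handles
   only the delay-seconds form of Retry-After (an HTTP-date value
   raises an error here) and assumes the caller provides field names
   with exactly this capitalization; wait_seconds is a hypothetical
   name.

```python
from typing import Mapping, Optional


def _delay_seconds(value: str) -> int:
    """Parse a non-negative Integer number of seconds."""
    text = value.strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a delay-seconds value: {value!r}")
    return int(text)


def wait_seconds(headers: Mapping[str, str]) -> Optional[int]:
    """Delay the client honors, or None if no field is present."""
    # Retry-After takes precedence whenever it is present.
    if "Retry-After" in headers:
        return _delay_seconds(headers["Retry-After"])
    if "RateLimit-Reset" in headers:
        return _delay_seconds(headers["RateLimit-Reset"])
    return None
```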

420	   This specification does not mandate a specific throttling behavior
421	   and implementers can adopt their preferred policies, including:

423	   *  slowing down or preemptively backing off their request rate when
424	      approaching quota limits;

426	   *  consuming all the quota according to the exposed limits and then
427	      waiting.
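
   The two policies above can be contrasted with a short sketch.  Both
   functions are hypothetical and assume that every request costs
   exactly one quota-unit, which a server is not required to honor
   (see Section 2.2).

```python
def pacing_delay(remaining: int, reset: int) -> float:
    """Delay between requests when spreading the remaining
    quota-units evenly over the rest of the window."""
    if remaining <= 0:
        return float(reset)  # nothing left: wait for the reset
    return reset / remaining


def burst_delay(remaining: int, reset: int) -> float:
    """Delay when consuming all the quota first and then waiting."""
    return 0.0 if remaining > 0 else float(reset)
```

   With RateLimit-Remaining: 5 and RateLimit-Reset: 10, the first
   policy sends one request every two seconds, while the second sends
   immediately until the quota is used up.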

429	4.1.  Intermediaries

431	   This section documents the considerations advised in Section 16.3.2
432	   of [SEMANTICS].

434	   An intermediary that is not part of the originating service
435	   infrastructure and is not aware of the quota-policy semantic used by
436	   the Origin Server SHOULD NOT alter the RateLimit fields' values in
437	   such a way as to communicate a more permissive quota-policy; this
438	   includes removing the RateLimit fields.

440	   An intermediary MAY alter the RateLimit fields in such a way as to
441	   communicate a more restrictive quota-policy when:

443	   *  it is aware of the quota-unit semantic used by the Origin Server;

445	   *  it implements this specification and enforces a quota-policy which
446	      is more restrictive than the one conveyed in the fields.

448	   An intermediary SHOULD forward a request even when presuming that it
449	   might not be serviced; the service returning the RateLimit fields is
450	   solely responsible for enforcing the communicated quota-policy, and
451	   it is always free to service incoming requests.

453	   This specification does not mandate any behavior on intermediaries
454	   with respect to retries, nor does it require that intermediaries have
455	   any role in respecting quota-policies.  For example, it is legitimate
456	   for a proxy to retransmit a request without notifying the client,
457	   thus consuming quota-units.

459	4.2.  Caching

461	   As is the ordinary case for HTTP caching ([RFC7234]), a response with
462	   RateLimit fields might be cached and re-used for subsequent requests.
463	   A cached RateLimit response does not modify quota counters but could
464	   contain stale information.  Clients interested in determining the
465	   freshness of the RateLimit fields could rely on fields such as Date
466	   and on the time-window of a quota-policy.

468	5.  Fields definition

470	   The following RateLimit response fields are defined.

472	5.1.  RateLimit-Limit

474	   The RateLimit-Limit response field indicates the service-limit
475	   associated with the client in the current time-window.

477	   If the client exceeds that limit, it might not be served.

479	   The field is a non-negative Integer.  Its value is named expiring-
480	   limit.

482	      RateLimit-Limit = expiring-limit
483	      expiring-limit = service-limit

485	   The expiring-limit value MUST be set to the service-limit that is
486	   closest to reaching its limit, and the associated time-window MUST
487	   either be:

489	   *  inferred by the value of RateLimit-Reset at the moment of the
490	      reset, or

492	   *  communicated out-of-band (e.g. in the documentation).

494	   The RateLimit-Policy field (see Section 5.2) might contain
495	   information on the associated time-window.

497	      RateLimit-Limit: 100

499	   This field MUST NOT occur multiple times and can be sent in a trailer
500	   section.

502	5.2.  RateLimit-Policy

504	   The RateLimit-Policy response field indicates the quota associated
505	   with the client and its value is informative.

507	   The field is a non-empty List of quota policies (see Section 2.3).

509	      RateLimit-Policy = sf-list

511	   A time-window associated with expiring-limit can be communicated via
512	   RateLimit-Policy, as shown in the following example.

514	      RateLimit-Policy: 100;w=10
515	      RateLimit-Limit: 100

517	   Policies using multiple quota limits MAY be returned using multiple
518	   quota-policy items, as shown in the following two examples:

520	      RateLimit-Policy: 10;w=1, 50;w=60, 1000;w=3600, 5000;w=86400
521	      RateLimit-Policy: 10;w=1;burst=1000, 1000;w=3600

523	   This field MUST NOT occur multiple times and can be sent in a trailer
524	   section.

526	5.3.  RateLimit-Remaining

528	   The RateLimit-Remaining response field indicates the remaining
529	   quota-units defined in Section 2.2 associated with the client.

531	   The field is a non-negative Integer expressed in quota-units.

533	      RateLimit-Remaining = quota-units

535	   This field MUST NOT occur multiple times and can be sent in a trailer
536	   section.

538	   Clients MUST NOT assume that a positive RateLimit-Remaining value is
539	   a guarantee that further requests will be served.

541	   A low RateLimit-Remaining value is like a yellow traffic-light for
542	   either the number of requests issued in the time-window or the
543	   request throughput: the red light may arrive suddenly (see
544	   Section 3).

546	   One example of RateLimit-Remaining use is below.

548	      RateLimit-Remaining: 50

550	5.4.  RateLimit-Reset

552	   The RateLimit-Reset response field indicates the number of seconds
553	   until the quota resets.

555	   The field is a non-negative Integer.

557	      RateLimit-Reset = delay-seconds

559	   The delay-seconds format is used because:

561	   *  it does not rely on clock synchronization and is resilient to
562	      clock adjustment and clock skew between client and server (see
563	      Section 5.6.7 of [SEMANTICS]);

565	   *  it mitigates the risk related to thundering herd when too many
566	      clients are serviced with the same timestamp.

568	   This field MUST NOT occur multiple times and can be sent in a trailer
569	   section.

571	   An example of RateLimit-Reset use is below.

573	      RateLimit-Reset: 50

575	   The client MUST NOT assume that all its service-limit will be
576	   restored after the moment referenced by RateLimit-Reset.  The server
577	   MAY arbitrarily alter the RateLimit-Reset value between subsequent
578	   requests e.g. in case of resource saturation or to implement sliding
579	   window policies.

581	6.  Security Considerations

583	6.1.  Throttling does not prevent clients from issuing requests

585	   This specification does not prevent clients from making over-quota
586	   requests.

588	   Servers should always implement mechanisms to prevent resource
589	   exhaustion.

591	6.2.  Information disclosure

593	   Servers should not disclose to untrusted parties operational capacity
594	   information that can be used to saturate their infrastructural
595	   resources.

597	   While this specification does not mandate whether non-2xx responses
598	   consume quota, if 401 and 403 responses count toward the quota, a
599	   malicious client could probe the endpoint to get traffic information
600	   about another user.

602	   As intermediaries might retransmit requests and consume quota-units
603	   without prior knowledge of the User Agent, RateLimit fields might
604	   reveal the existence of an intermediary to the User Agent.

606	6.3.  Remaining quota-units are not granted requests

608	   RateLimit-* fields convey hints from the server to the clients to
609	   help them avoid being throttled out.

611	   Clients MUST NOT consider the quota-units returned in RateLimit-
612	   Remaining as a service level agreement.

614	   In case of resource saturation, the server MAY artificially lower the
615	   returned values or not serve the request regardless of the advertised
616	   quotas.

618	6.4.  Reliability of RateLimit-Reset

620	   Consider that service-limit may not be restored after the moment
621	   referenced by RateLimit-Reset, and the RateLimit-Reset value should
622	   not be considered fixed nor constant.

624	   Subsequent requests may return a higher RateLimit-Reset value to
625	   limit concurrency or implement dynamic or adaptive throttling
626	   policies.

628	6.5.  Resource exhaustion

630	   When returning RateLimit-Reset you must be aware that many throttled
631	   clients may come back at the very moment specified.

633	   This is true for Retry-After too.

635	   For example, if the quota resets every day at 18:00:00 and your
636	   server returns the RateLimit-Reset accordingly

638	      Date: Tue, 15 Nov 1994 08:00:00 GMT
639	      RateLimit-Reset: 36000

641	   there's a high probability that all clients will show up at 18:00:00.

643	   This could be mitigated by adding some jitter to the field-value.
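
   A minimal sketch of such jitter is shown below.  The function name
   and the max_jitter bound are assumptions; the sketch only ever adds
   seconds, so that no client is told to return before the quota has
   actually reset.

```python
import random


def jittered_reset(reset: int, max_jitter: int,
                   rng: random.Random) -> int:
    """RateLimit-Reset value plus 0..max_jitter extra seconds."""
    return reset + rng.randint(0, max_jitter)
```

   Returning, for instance, values between 36000 and 36030 to
   different clients spreads their return over half a minute instead
   of a single second.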

645	   Resource exhaustion issues can be associated with quota policies
646	   using a large time-window, because a user agent by chance or on
647	   purpose might consume most of its quota-units in a significantly
648	   shorter interval.

650	   This behavior can even be triggered by the provided RateLimit fields.
651	   The following example describes a service with an unconsumed quota-
652	   policy of 10000 quota-units per 1000 seconds.

654	   RateLimit-Limit: 10000
655	   RateLimit-Policy: 10000;w=1000
656	   RateLimit-Remaining: 10000
657	   RateLimit-Reset: 10

659	   A client implementing a simple ratio between RateLimit-Remaining and
660	   RateLimit-Reset could infer an average throughput of 1000 quota-units
661	   per second, while RateLimit-Limit conveys a quota-policy with an
662	   average of 10 quota-units per second.  If the service cannot handle
663	   such load, it should return either a lower RateLimit-Remaining value
664	   or a higher RateLimit-Reset value.  Moreover, complementing large
665	   time-window quota-policies with a short time-window one mitigates
666	   those risks.
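
   The arithmetic of this example, together with one possible
   server-side cap, can be sketched as follows.  The function names
   and the max_rate parameter (the highest throughput, in quota-units
   per second, that the service is willing to advertise) are
   assumptions made for illustration.

```python
import math


def implied_rate(remaining: int, reset: int) -> float:
    """Throughput a naive client infers from the two fields."""
    return remaining / max(reset, 1)


def cap_remaining(remaining: int, reset: int,
                  max_rate: float) -> int:
    """Lower the advertised remaining quota so that the implied
    rate does not exceed max_rate."""
    return min(remaining, math.floor(max_rate * max(reset, 1)))
```

   For the fields above, the implied rate is 10000 / 10 = 1000
   quota-units per second, while the policy averages 10000 / 1000 =
   10.  Capping at 10 quota-units per second would advertise a
   RateLimit-Remaining of 100 for a RateLimit-Reset of 10.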

668	6.6.  Denial of Service

670	   RateLimit fields may assume unexpected values by chance or on purpose.
671	   For example, an excessively high RateLimit-Remaining value may be:

673	   *  used by a malicious intermediary to trigger a Denial of Service
674	      attack or consume client resources by boosting its requests;

676	   *  passed by a misconfigured server;

678	   or a high RateLimit-Reset value could inhibit clients from
679	   contacting the server.

681	   Clients MUST validate the received values to mitigate those risks.

683	7.  Privacy Considerations

685	   Clients that act upon a request to rate limit are potentially re-
686	   identifiable (see Section 7.1 of [DNS-PRIVACY]) because they react to
687	   information that might only be given to them.  Note that this might
688	   apply to other fields too (e.g.  Retry-After).

690	   Since rate limiting is usually implemented in contexts where clients
691	   are either identified or profiled (e.g. assigning different quota
692	   units to different users), this is rarely a concern.

694	   Privacy enhancing infrastructures using RateLimit fields can define
695	   specific techniques to mitigate the risks of re-identification.

697	8.  IANA Considerations

699	   IANA is requested to update one registry and create one new registry.

701	   Please add the following entries to the "Hypertext Transfer Protocol
702	   (HTTP) Field Name Registry" registry ([SEMANTICS]):

704	       +=====================+===========+========================+
705	       | Field Name          | Status    | Specification          |
706	       +=====================+===========+========================+
707	       | RateLimit-Limit     | permanent | Section 5.1 of ThisRFC |
708	       +---------------------+-----------+------------------------+
709	       | RateLimit-Remaining | permanent | Section 5.3 of ThisRFC |
710	       +---------------------+-----------+------------------------+
711	       | RateLimit-Reset     | permanent | Section 5.4 of ThisRFC |
712	       +---------------------+-----------+------------------------+
713	       | RateLimit-Policy    | permanent | Section 5.2 of ThisRFC |
714	       +---------------------+-----------+------------------------+

716	                                 Table 1

718	8.1.  RateLimit Parameters Registration

720	   IANA is requested to create a new registry to be called "Hypertext
721	   Transfer Protocol (HTTP) RateLimit Parameters Registry", to be
722	   located at
723	   https://www.iana.org/assignments/http-ratelimit-parameters.
724	   Registration is done on the advice of a Designated Expert, appointed
725	   by the IESG or their delegate.  All entries are Specification
726	   Required ([IANA], Section 4.6).

728	   Registration requests consist of the following information:

730	   *  Parameter name: The parameter name, conforming to [SF].

732	   *  Field name: The RateLimit field for which the parameter is
733	      registered.  If a parameter is intended to be used with multiple
734	      fields, it has to be registered for each one.

736	   *  Description: A brief description of the parameter.

738	   *  Specification document: A reference to the document that specifies
739	      the parameter, preferably including a URI that can be used to
740	      retrieve a copy of the document.

742	   *  Comments (optional): Any additional information that can be
743	      useful.

745	   The initial contents of this registry should be:

747	   +==================+=========+===========+===============+==========+
748	   | Field Name       |Parameter|Description|Specification  |Comments  |
749	   |                  |name     |           |               |(optional)|
750	   +==================+=========+===========+===============+==========+
751	   | RateLimit-Policy |w        |Time window|Section 2.3 of |          |
752	   |                  |         |           |ThisRFC        |          |
753	   +------------------+---------+-----------+---------------+----------+

755	                                  Table 2

757	9.  References

759	9.1.  Normative References

761	   [IANA]     Cotton, M., Leiba, B., and T. Narten, "Guidelines for
762	              Writing an IANA Considerations Section in RFCs", BCP 26,
763	              RFC 8126, DOI 10.17487/RFC8126, June 2017,
764	              <https://www.rfc-editor.org/rfc/rfc8126>.

766	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
767	              Requirement Levels", BCP 14, RFC 2119,
768	              DOI 10.17487/RFC2119, March 1997,
769	              <https://www.rfc-editor.org/rfc/rfc2119>.

771	   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
772	              Specifications: ABNF", STD 68, RFC 5234,
773	              DOI 10.17487/RFC5234, January 2008,
774	              <https://www.rfc-editor.org/rfc/rfc5234>.

776	   [RFC6454]  Barth, A., "The Web Origin Concept", RFC 6454,
777	              DOI 10.17487/RFC6454, December 2011,
778	              <https://www.rfc-editor.org/rfc/rfc6454>.

780	   [RFC7405]  Kyzivat, P., "Case-Sensitive String Support in ABNF",
781	              RFC 7405, DOI 10.17487/RFC7405, December 2014,
782	              <https://www.rfc-editor.org/rfc/rfc7405>.

784	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
785	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
786	              May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.

788	   [SEMANTICS]
789	              Fielding, R. T., Nottingham, M., and J. Reschke, "HTTP
790	              Semantics", Work in Progress, Internet-Draft, draft-ietf-
791	              httpbis-semantics-19, 12 September 2021,
792	              <https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-
793	              semantics-19>.

795	   [SF]       Nottingham, M. and P-H. Kamp, "Structured Field Values for
796	              HTTP", RFC 8941, DOI 10.17487/RFC8941, February 2021,
797	              <https://www.rfc-editor.org/rfc/rfc8941>.

799	9.2.  Informative References

801	   [DNS-PRIVACY]
802	              Wicinski, T., Ed., "DNS Privacy Considerations", RFC 9076,
803	              DOI 10.17487/RFC9076, July 2021,
804	              <https://www.rfc-editor.org/rfc/rfc9076>.

806	   [HPACK]    Peon, R. and H. Ruellan, "HPACK: Header Compression for
807	              HTTP/2", RFC 7541, DOI 10.17487/RFC7541, May 2015,
808	              <https://www.rfc-editor.org/rfc/rfc7541>.

810	   [RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet:
811	              Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002,
812	              <https://www.rfc-editor.org/rfc/rfc3339>.

814	   [RFC6585]  Nottingham, M. and R. Fielding, "Additional HTTP Status
815	              Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012,
816	              <https://www.rfc-editor.org/rfc/rfc6585>.

818	   [RFC7234]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
819	              Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching",
820	              RFC 7234, DOI 10.17487/RFC7234, June 2014,
821	              <https://www.rfc-editor.org/rfc/rfc7234>.

823	   [STATUS429]
824	              Nottingham, M. and R. Fielding, "Additional HTTP Status
825	              Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012,
826	              <https://www.rfc-editor.org/rfc/rfc6585>.

829	   [UNIX]     The Open Group, "The Single UNIX Specification, Version 2
830	              - 6 Vol Set for UNIX 98", February 1997.

832	Appendix A.  Rate-limiting and quotas

834	   Servers use quota mechanisms to avoid system overload, to ensure an
835	   equitable distribution of computational resources or to enforce other
836	   policies - e.g. monetization.

838	   A basic quota mechanism limits the number of acceptable requests in a
839	   given time window, e.g. 10 requests per second.

841	   When the quota is exceeded, servers usually do not serve the request,
842	   replying instead with a 4xx HTTP status code (e.g. 429 or 403), or
843	   adopt more aggressive policies like dropping connections.

845	   Quotas may be enforced on different bases (e.g. per user, per IP, per
846	   geographic area, ..) and at different levels.  For example, a user
847	   may be allowed to issue:

849	   *  10 requests per second;

851	   *  limited to 60 requests per minute;

853	   *  limited to 1000 requests per hour.
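
The layered quotas listed above can be sketched with one fixed-window
counter per limit.  This is only one possible algorithm, chosen here
for brevity; the class name FixedWindowQuota and its interface are
hypothetical, and servers are free to use sliding windows or other
strategies.

```python
import time
from typing import Dict, Optional, Tuple


class FixedWindowQuota:
    """Fixed-window counters for several (limit, window_seconds) pairs."""

    def __init__(self, policies: Tuple[Tuple[int, int], ...]) -> None:
        self.policies = policies
        # Maps window length to (window index, requests counted in it).
        self.counters: Dict[int, Tuple[int, int]] = {}

    def allow(self, now: Optional[float] = None) -> bool:
        """Count one request if every window still has quota left."""
        now = time.time() if now is None else now
        updated = {}
        for limit, window in self.policies:
            index = int(now // window)
            seen_index, count = self.counters.get(window, (index, 0))
            if seen_index != index:
                count = 0  # a new window started: reset the counter
            if count >= limit:
                return False  # reject without consuming any quota
            updated[window] = (index, count + 1)
        self.counters.update(updated)
        return True


# 10 requests per second, 60 per minute, 1000 per hour.
quota = FixedWindowQuota(((10, 1), (60, 60), (1000, 3600)))
```

With these policies, the eleventh call to quota.allow() within the same
second is rejected even though the per-minute and per-hour limits still
have room.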

855	   Moreover, system metrics, statistics and heuristics can be used to
856	   implement more complex policies, where the number of acceptable
857	   requests and the time window are computed dynamically.

859	   To help clients throttle their requests, servers may expose the
860	   counters used to evaluate quota policies via HTTP header fields.

862	   Those response headers may be added by HTTP intermediaries such as
863	   API gateways and reverse proxies.

865	   On the web we can find many different rate-limit headers, usually
866	   containing the number of allowed requests in a given time window, and
867	   when the window is reset.

869	   The common choice is to return three headers containing:

871	   *  the maximum number of allowed requests in the time window;

873	   *  the number of remaining requests in the current window;

875	   *  the time remaining in the current window expressed in seconds or
876	      as a timestamp.

878	A.1.  Interoperability issues

880	   A major interoperability issue in throttling is the lack of standard
881	   headers, because:

883	   *  each implementation associates different semantics with the same
884	      header field names;

886	   *  header field names proliferate.

888	   User Agents interfacing with different servers may thus need to
889	   process different headers, or the very same application interface
890	   that sits behind different reverse proxies may reply with different
891	   throttling headers.

893	Appendix B.  Examples

895	B.1.  Unparameterized responses

897	B.1.1.  Throttling information in responses

899	   The client exhausted its service-limit for the next 50 seconds.  The
900	   time-window is communicated out-of-band or inferred from the field
901	   values.

903	   Request:

905	   GET /items/123 HTTP/1.1
906	   Host: api.example

908	   Response:

910	   HTTP/1.1 200 OK
911	   Content-Type: application/json
912	   RateLimit-Limit: 100
913	   RateLimit-Remaining: 0
914	   RateLimit-Reset: 50

916	   {"hello": "world"}

918	   Since the field values are not necessarily correlated with the
919	   response status code, a subsequent request is not required to fail.
920	   The example below shows that the server decided to serve the request
921	   even if RateLimit-Remaining is 0.  Another server, or the same server
922	   under other load conditions, could have decided to throttle the
923	   request instead.

925	   Request:

927	   GET /items/456 HTTP/1.1
928	   Host: api.example

930	   Response:

932	   HTTP/1.1 200 OK
933	   Content-Type: application/json
934	   RateLimit-Limit: 100
935	   RateLimit-Remaining: 0
936	   RateLimit-Reset: 48

938	   {"still": "successful"}

940	B.1.2.  Use in conjunction with custom fields

942	   The server uses two custom fields, namely acme-RateLimit-DayLimit and
943	   acme-RateLimit-HourLimit to expose the following policy:

945	   *  5000 daily quota-units;

947	   *  1000 hourly quota-units.

949	   The client consumed 4900 quota-units in the first 14 hours.

951	   Despite the next hourly limit of 1000 quota-units, the closest limit
952	   to reach is the daily one.

954	   The server then exposes the RateLimit-* fields to inform the client
955	   that:

957	   *  it has only 100 quota-units left;
958	   *  the window will reset in 10 hours.

960	   Request:

962	   GET /items/123 HTTP/1.1
963	   Host: api.example

965	   Response:

967	   HTTP/1.1 200 OK
968	   Content-Type: application/json
969	   acme-RateLimit-DayLimit: 5000
970	   acme-RateLimit-HourLimit: 1000
971	   RateLimit-Limit: 5000
972	   RateLimit-Remaining: 100
973	   RateLimit-Reset: 36000

975	   {"hello": "world"}
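
A server-side sketch of how the advertised values in this example could
be derived is shown below.  The function name, the tuple layout and
the rule "advertise the window with the fewest remaining quota-units"
are assumptions made for illustration, as are the hourly figures in
the usage note, which the example above does not state.

```python
from typing import Dict, List, Tuple


def expiring_fields(windows: List[Tuple[int, int, int]]) -> Dict[str, str]:
    """Build RateLimit fields from (limit, consumed, seconds_to_reset) tuples.

    The window with the fewest remaining quota-units is treated as the
    closest limit to reach; this selection rule is an assumption.
    """
    limit, consumed, reset = min(windows, key=lambda w: w[0] - w[1])
    return {
        "RateLimit-Limit": str(limit),
        "RateLimit-Remaining": str(max(limit - consumed, 0)),
        "RateLimit-Reset": str(reset),
    }
```

Calling expiring_fields([(5000, 4900, 36000), (1000, 350, 1800)]), where
the second tuple is a made-up hourly state, selects the daily window and
yields the three RateLimit field values shown in the response above.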

977	B.1.3.  Use for limiting concurrency

979	   Throttling fields may be used to limit concurrency, advertising
980	   limits that are lower than the usual ones in case of saturation, thus
981	   increasing availability.

983	   The server adopted a basic policy of 100 quota-units per minute, and
984	   in case of resource exhaustion adapts the returned values reducing
985	   both RateLimit-Limit and RateLimit-Remaining.

987	   After 2 seconds the client consumed 40 quota-units.

989	   Request:

991	   GET /items/123 HTTP/1.1
992	   Host: api.example

994	   Response:

996	   HTTP/1.1 200 OK
997	   Content-Type: application/json
998	   RateLimit-Limit: 100
999	   RateLimit-Remaining: 60
1000	   RateLimit-Reset: 58

1002	   {"elapsed": 2, "issued": 40}

1004	   At the subsequent request - due to resource exhaustion - the server
1005	   advertises only RateLimit-Remaining: 20.

1007	   Request:

1009	   GET /items/123 HTTP/1.1
1010	   Host: api.example

1012	   Response:

1014	   HTTP/1.1 200 OK
1015	   Content-Type: application/json
1016	   RateLimit-Limit: 100
1017	   RateLimit-Remaining: 20
1018	   RateLimit-Reset: 56

1020	   {"elapsed": 4, "issued": 41}

1022	B.1.4.  Use in throttled responses

1024	   A client exhausted its quota and the server throttles it, sending
1025	   Retry-After.

1027	   In this example, the values of Retry-After and RateLimit-Reset
1028	   reference the same moment, but this is not a requirement.

1030	   The 429 (Too Many Requests) HTTP status code is just used as an
1031	   example.

1033	   Request:

1035	   GET /items/123 HTTP/1.1
1036	   Host: api.example

1038	   Response:

1040	   HTTP/1.1 429 Too Many Requests
1041	   Content-Type: application/json
1042	   Date: Mon, 05 Aug 2019 09:27:00 GMT
1043	   Retry-After: Mon, 05 Aug 2019 09:27:05 GMT
1044	   RateLimit-Reset: 5
1045	   RateLimit-Limit: 100
1046	   RateLimit-Remaining: 0

1048	   {
1049	   "title": "Too Many Requests",
1050	   "status": 429,
1051	   "detail": "You have exceeded your quota"
1052	   }
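
To see that the two fields in this response reference the same moment,
a client can convert the HTTP-date form of Retry-After into seconds
relative to the Date field.  The helper below is a minimal sketch with
a hypothetical name; it handles only the HTTP-date form and relies on
the Python standard library.

```python
from email.utils import parsedate_to_datetime


def retry_after_seconds(retry_after: str, date: str) -> int:
    """Convert an HTTP-date Retry-After into seconds relative to Date."""
    delta = parsedate_to_datetime(retry_after) - parsedate_to_datetime(date)
    return max(int(delta.total_seconds()), 0)


seconds = retry_after_seconds(
    "Mon, 05 Aug 2019 09:27:05 GMT",
    "Mon, 05 Aug 2019 09:27:00 GMT",
)
print(seconds)  # 5, the same moment that RateLimit-Reset: 5 refers to
```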

1054	B.2.  Parameterized responses

1056	B.2.1.  Throttling window specified via parameter

1058	   The client has 99 quota-units left for the next 50 seconds.  The
1059	   time-window is communicated by the w parameter, so we know the
1060	   throughput is 100 quota-units per minute.

1062	   Request:

1064	   GET /items/123 HTTP/1.1
1065	   Host: api.example

1067	   Response:

1069	   HTTP/1.1 200 OK
1070	   Content-Type: application/json
1071	   RateLimit-Limit: 100
1072	   RateLimit-Policy: 100;w=60
1073	   RateLimit-Remaining: 99
1074	   RateLimit-Reset: 50

1076	   {"hello": "world"}

1078	B.2.2.  Dynamic limits with parameterized windows

1080	   The policy conveyed by RateLimit-Policy states that the server
1081	   accepts 100 quota-units per minute.

1083	   To avoid resource exhaustion, the server artificially lowers the
1084	   actual limits returned in the throttling headers.

1086	   The RateLimit-Remaining then advertises only 9 quota-units for the
1087	   next 50 seconds to slow down the client.

1089	   Note that the server could have lowered even the other values in
1090	   RateLimit-Limit: this specification does not mandate any relation
1091	   between the field values contained in subsequent responses.

1093	   Request:

1095	   GET /items/123 HTTP/1.1
1096	   Host: api.example

1098	   Response:

1100	   HTTP/1.1 200 OK
1101	   Content-Type: application/json
1102	   RateLimit-Limit: 10
1103	   RateLimit-Policy: 100;w=60
1104	   RateLimit-Remaining: 9
1105	   RateLimit-Reset: 50

1107	   {
1108	     "status": 200,
1109	     "detail": "Just slow down without waiting."
1110	   }
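
A cooperative client could react to this response by spacing its
requests evenly over the remaining window.  The sketch below shows one
such pacing strategy; it is an assumption made for illustration, and
this specification does not require clients to behave this way.

```python
def pacing_delay(remaining: int, reset_seconds: int) -> float:
    """Seconds to wait between requests so the quota lasts until reset."""
    if remaining <= 0:
        return float(reset_seconds)  # nothing left: wait for the reset
    return reset_seconds / remaining


print(round(pacing_delay(remaining=9, reset_seconds=50), 2))  # 5.56
```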

1112	B.2.3.  Dynamic limits for pushing back and slowing down

1114	   Continuing the previous example, let's say the client waits 10
1115	   seconds and performs a new request which, due to resource exhaustion,
1116	   the server rejects and pushes back, advertising RateLimit-Remaining:
1117	   0 for the next 20 seconds.

1119	   The server advertises a smaller window with a lower limit to slow
1120	   down the client for the rest of its original window after the 20
1121	   seconds elapse.

1123	   Request:

1125	   GET /items/123 HTTP/1.1
1126	   Host: api.example

1128	   Response:

1130	   HTTP/1.1 429 Too Many Requests
1131	   Content-Type: application/json
1132	   RateLimit-Limit: 0
1133	   RateLimit-Policy: 15;w=20
1134	   RateLimit-Remaining: 0
1135	   RateLimit-Reset: 20

1137	   {
1138	     "status": 429,
1139	     "detail": "Wait 20 seconds, then slow down!"
1140	   }

1142	B.3.  Dynamic limits for pushing back with Retry-After and slow down

1144	   Alternatively, starting from the same context as the previous
1145	   example, the same information can be conveyed to the client via
1146	   Retry-After.  The advantage is that the server can now specify the
1147	   policy's nominal limit and window that will apply after the reset,
1148	   e.g. assuming the resource exhaustion is likely to be gone by then.
1149	   The advertised policy therefore does not need to be adjusted, yet
1150	   the server still stops requests for a while and slows down the rest
1151	   of the current window.

1153	   Request:

1155	   GET /items/123 HTTP/1.1
1156	   Host: api.example

1158	   Response:

1160	   HTTP/1.1 429 Too Many Requests
1161	   Content-Type: application/json
1162	   Retry-After: 20
1163	   RateLimit-Limit: 15
1164	   RateLimit-Policy: 100;w=60
1165	   RateLimit-Remaining: 15
1166	   RateLimit-Reset: 40

1168	   {
1169	     "status": 429,
1170	     "detail": "Wait 20 seconds, then slow down!"
1171	   }

1173	   Note that in this last response the client is expected to honor
1174	   Retry-After and perform no requests for the specified amount of time,
1175	   whereas the previous example would not force the client to stop
1176	   requests before the reset time has elapsed: it would still be free
1177	   to query the server again, even if the request is likely to be
1178	   rejected.
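
The difference between the two responses can be expressed as a small
client-side decision function.  This is a sketch under stated
assumptions: the function name is hypothetical, only the delay-seconds
form of Retry-After is handled, and waiting for RateLimit-Reset when
no Retry-After is present is a voluntary choice of this client, not a
requirement.

```python
from typing import Mapping


def seconds_to_wait(status: int, headers: Mapping[str, str]) -> int:
    """Return how long a cooperative client pauses before the next request.

    Only the delay-seconds form of Retry-After is handled in this sketch.
    """
    fields = {name.lower(): value.strip() for name, value in headers.items()}
    retry_after = fields.get("retry-after", "")
    if retry_after.isdigit():
        return int(retry_after)  # explicit instruction to stop requests
    if status == 429 or fields.get("ratelimit-remaining") == "0":
        reset = fields.get("ratelimit-reset", "")
        return int(reset) if reset.isdigit() else 0
    return 0
```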

1180	B.3.1.  Missing Remaining information

1182	   The server does not expose RateLimit-Remaining values (for example,
1183	   because the underlying counters are not available).  Instead, it
1184	   resets the limit counter every second.

1186	   It communicates to the client the limit of 10 quota-units per second
1187	   by always returning the pair RateLimit-Limit and RateLimit-Reset.

1189	   Request:

1191	   GET /items/123 HTTP/1.1
1192	   Host: api.example

1194	   Response:

1196	   HTTP/1.1 200 OK
1197	   Content-Type: application/json
1198	   RateLimit-Limit: 10
1199	   RateLimit-Reset: 1

1201	   {"first": "request"}

1203	   Request:

1205	   GET /items/123 HTTP/1.1
1206	   Host: api.example

1208	   Response:

1210	   HTTP/1.1 200 OK
1211	   Content-Type: application/json
1212	   RateLimit-Limit: 10
1213	   RateLimit-Reset: 1

1215	   {"second": "request"}

1217	B.3.2.  Use with multiple windows

1219	   This is a standardized way of describing the policy detailed in
1220	   Appendix B.1.2:

1222	   *  5000 daily quota-units;

1224	   *  1000 hourly quota-units.

1226	   The client consumed 4900 quota-units in the first 14 hours.

1228	   Despite the next hourly limit of 1000 quota-units, the closest limit
1229	   to reach is the daily one.

1231	   The server then exposes the RateLimit fields to inform the client
1232	   that:

1234	   *  it has only 100 quota-units left;

1236	   *  the window will reset in 10 hours;

1238	   *  the expiring-limit is 5000.

1240	   Request:

1242	   GET /items/123 HTTP/1.1
1243	   Host: api.example

1245	   Response:

1247	   HTTP/1.1 200 OK
1248	   Content-Type: application/json
1249	   RateLimit-Limit: 5000
1250	   RateLimit-Policy: 1000;w=3600, 5000;w=86400
1251	   RateLimit-Remaining: 100
1252	   RateLimit-Reset: 36000

1254	   {"hello": "world"}
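
A client that wants to relate the expiring-limit to its quota-policy
could parse the RateLimit-Policy value as sketched below.  This is a
deliberately simplified parser, not a complete Structured Fields
implementation: it assumes that every parameter has a value and that
no commas or semicolons appear inside quoted strings.  The function
names are hypothetical.

```python
from typing import Dict, List, Tuple

Policy = Tuple[int, Dict[str, str]]


def parse_policies(value: str) -> List[Policy]:
    """Parse a RateLimit-Policy value into (limit, parameters) pairs.

    Simplified: assumes no commas or semicolons inside quoted strings,
    so it is not a complete Structured Fields parser.
    """
    policies = []
    for member in value.split(","):
        limit, *params = [part.strip() for part in member.split(";")]
        parsed = dict(param.split("=", 1) for param in params)
        policies.append((int(limit), parsed))
    return policies


def window_for(limit: int, policies: List[Policy]) -> int:
    """Return the w parameter of the first policy matching the given limit."""
    for policy_limit, params in policies:
        if policy_limit == limit:
            return int(params["w"])
    raise LookupError("no quota-policy matches the expiring limit")


policies = parse_policies("1000;w=3600, 5000;w=86400")
print(window_for(5000, policies))  # 86400
```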

1256	FAQ

1258	   This section is to be removed before publishing as an RFC.

1260	   1.  Why define standard fields for throttling?

1262	       To simplify enforcement of throttling policies.

1264	   2.  Can I use RateLimit-* in throttled responses (e.g. with status code
1265	       429)?

1267	       Yes, you can.

1269	   3.  Are those specs tied to RFC 6585?

1271	       No.  [RFC6585] defines the 429 status code and we use it just as
1272	       an example of a throttled request, which could instead use 403
1273	       or any other status code.  The goal of this specification is to
1274	       standardize the name and semantics of three ratelimit fields
1275	       widely used on the internet.  Stricter relations with status
1276	       codes or error response payloads would impose behaviors on all
1277	       the existing implementations, making adoption more complex.

1279	   4.  Why not pass the throttling scope as a parameter?

1281	       The word "scope" can have different meanings: for example it can
1282	       be a URL, or an authorization scope.  Since authorization is out
1283	       of the scope of this document (see Section 1.1), and we rely
1284	       only on [SEMANTICS], in Section 1.1 we defined "scope" in terms
1285	       of URL.

1287	       Since clients are not required to process quota policies (see
1288	       Section 4), we could add a new "RateLimit-Scope" field to this
1289	       spec.  See this discussion on a similar thread
1290	       (https://github.com/httpwg/http-core/pull/317#issuecomment-
1291	       585868767)

1293	       Specific ecosystems can still bake their own prefixed parameters,
1294	       such as acme-auth-scope or acme-url-scope and ensure that clients
1295	       process them.  This behavior cannot be relied upon when
1296	       communicating between different ecosystems.

1298	       We are open to suggestions: comment on this issue
1299	       (https://github.com/ioggstream/draft-polli-ratelimit-headers/
1300	       issues/70)

1302	   5.  Why use delay-seconds instead of a UNIX Timestamp?  Why not use
1303	       subsecond precision?

1305	       Using delay-seconds aligns with Retry-After, which is returned in
1306	       similar contexts, e.g. on 429 responses.

1308	       Timestamps require a clock synchronization protocol (see
1309	       Section 5.6.7 of [SEMANTICS]).  This may be problematic (e.g.
1310	       clock adjustment, clock skew, failure of hardcoded clock
1311	       synchronization servers, IoT devices, ..).  Moreover, timestamps
1312	       may not be monotonically increasing due to clock adjustment.  See
1313	       Another NTP client failure story
1314	       (https://community.ntppool.org/t/another-ntp-client-failure-
1315	       story/1014/)

1317	       We did not use subsecond precision because:

1319	       *  that is more subject to system clock correction like the one
1320	          implemented via the adjtimex() Linux system call;

1322	       *  response-time latency may not make it worthwhile.  A brief
1323	          discussion on the subject is on the httpwg ml
1324	          (https://lists.w3.org/Archives/Public/ietf-http-
1325	          wg/2019JulSep/0202.html)

1327	       *  almost all rate-limit header implementations do not use it.

1329	   6.  Why not support multiple quota remaining?

1331	       While this might be of some value, my experience suggests that
1332	       overly-complex quota implementations result in lower
1333	       effectiveness of this policy.  This spec allows the client to
1334	       easily focus on RateLimit-Remaining and RateLimit-Reset.

1336	   7.  Shouldn't I limit concurrency instead of request rate?

1338	       You can use this specification to limit concurrency at the HTTP
1339	       level (see Appendix B.1.3) and help clients to shape their
1340	       requests, avoiding being throttled out.

1342	       A problematic way to limit concurrency is connection dropping,
1343	       especially when connections are multiplexed (e.g.  HTTP/2)
1344	       because this results in unserviced client requests, which is
1345	       something we want to avoid.

1347	       A semantic way to limit concurrency is to return 503 + Retry-
1348	       After in case of resource saturation (e.g. thrashing, connection
1349	       queues too long, Service Level Objectives not met, ..).
1350	       Saturation conditions can be either dynamic or static: all this
1351	       is out of scope for the current document.

1353	   8.  Does a positive value of RateLimit-Remaining imply any service
1354	       guarantee for my future requests to be served?

1356	       No.  FAQ integrated in Section 5.3.

1358	   9.  Is the quota-policy definition (Section 2.3) too complex?

1360	       You can always return the simplest form of the 3 fields:

1362	   RateLimit-Limit: 100
1363	   RateLimit-Remaining: 50
1364	   RateLimit-Reset: 60

1366	   The key runtime value is the first element of the list, the
1367	   expiring-limit; the other quota-policy items are informative.  So for
1368	   the following fields:

1370	   RateLimit-Limit: 100
1371	   RateLimit-Policy: 100;w=60;burst=1000;comment="sliding window", 5000;w=3600;burst=0;comment="fixed window"

1373	   the key value is the one referencing the lowest limit: 100

1375	   10. Can we use shorter names?  Why not put everything in one field?

1377	   The most common syntax we found on the web is X-RateLimit-* and when
1378	   starting this I-D we opted for it (https://github.com/ioggstream/
1379	   draft-polli-ratelimit-headers/issues/34#issuecomment-519366481)

1381	   The basic form of those fields is easily parseable, even by
1382	   implementers processing responses using technologies like dynamic
1383	   interpreters with limited syntax.

1385	   Using a single field complicates parsing and takes a significantly
1386	   different approach from the existing ones: this can limit adoption.

1388	   11. Why not mention connections?

1390	       Beware of the term "connection":
1391	       *  it is just _one_ possible saturation cause.  Once you go that
1392	          path you will expose other infrastructural details (bandwidth,
1393	          CPU, .. see Section 6.2) and complicate client compliance;
1394	       *  it is an infrastructural detail defined in terms of server and
1395	          network rather than the consumed service.  This specification
1396	          protects the services first, and then the infrastructures
1397	          through client cooperation (see Section 6.1).
1398	       RateLimit fields enable sending _on the same connection_
1399	       different limit values on each response, depending on the policy
1400	       scope (e.g. per-user, per-custom-key, ..).

1403	   12. Can intermediaries alter RateLimit fields?

1405	       Generally, they should not because it might result in unserviced
1406	       requests.  There are reasonable use cases for intermediaries
1407	       mangling RateLimit fields though, e.g. when they enforce stricter
1408	       quota-policies, or when they are an active component of the
1409	       service.  In those cases we will consider them as part of the
1410	       originating infrastructure.

1412	   13. Why is the w parameter just informative?  Could it be used by a
1413	       client to determine the request rate?

1415	       A non-informative w parameter might be fine in an environment
1416	       where clients and servers are tightly coupled.  Conveying
1417	       policies with this detail on a large scale would be very complex
1418	       and implementations would likely not be interoperable.  We thus
1419	       decided to leave w as an informational parameter and only rely on
1420	       RateLimit-Limit, RateLimit-Remaining and RateLimit-Reset for
1421	       defining the throttling behavior.
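
The rationale for delay-seconds given in the FAQ above can be
illustrated with a short sketch: a client turns RateLimit-Reset into a
deadline on its own monotonic clock, so no clock synchronization with
the server is needed.  The function names are hypothetical.

```python
import time


def deadline_from_reset(reset_seconds: int, received_at: float) -> float:
    """Translate RateLimit-Reset into a deadline on the local monotonic clock."""
    return received_at + reset_seconds


def seconds_left(deadline: float, now: float) -> float:
    """Time remaining before the window resets, never negative."""
    return max(deadline - now, 0.0)


received_at = time.monotonic()  # unaffected by wall-clock adjustments
deadline = deadline_from_reset(50, received_at)
print(seconds_left(deadline, received_at))  # 50.0
```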

1423	RateLimit fields currently used on the web

1425	   This section is to be removed before publishing as an RFC.

1427	   Commonly used header field names are:

1429	   *  X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset;

1431	   *  X-Rate-Limit-Limit, X-Rate-Limit-Remaining, X-Rate-Limit-Reset.

1433	   There are variants too, where the window is specified in the header
1434	   field name, e.g.:

1436	   *  x-ratelimit-limit-minute, x-ratelimit-limit-hour, x-ratelimit-
1437	      limit-day

1439	   *  x-ratelimit-remaining-minute, x-ratelimit-remaining-hour, x-
1440	      ratelimit-remaining-day

1442	   Here are some interoperability issues:

1444	   *  X-RateLimit-Reset references different values, depending on the
1445	      implementation:

1447	      -  seconds remaining to the window expiration

1449	      -  milliseconds remaining to the window expiration

1451	      -  seconds since the Epoch, as a UNIX Timestamp [UNIX]

1453	      -  a datetime, either IMF-fixdate [SEMANTICS] or [RFC3339]

1455	   *  different headers, with the same semantics, are used by different
1456	      implementers:

1458	      -  X-RateLimit-Limit and X-Rate-Limit-Limit

1460	      -  X-RateLimit-Remaining and X-Rate-Limit-Remaining

1462	      -  X-RateLimit-Reset and X-Rate-Limit-Reset
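
A client that talks to servers using the legacy spellings listed above
could map them to the names defined in this document, as in the sketch
below.  The function name is hypothetical, and only field names are
mapped: the values keep whatever semantics the legacy implementation
gave them, so the interoperability issues about units still apply.

```python
import re
from typing import Dict

# Matches the legacy spellings listed above, case-insensitively.
LEGACY = re.compile(r"^x-rate-?limit-(limit|remaining|reset)$", re.IGNORECASE)


def normalize_names(headers: Dict[str, str]) -> Dict[str, str]:
    """Rename X-RateLimit-* and X-Rate-Limit-* fields to RateLimit-*.

    Only names are mapped; values keep their legacy semantics, which may
    differ between implementations (for example the unit of the reset).
    """
    normalized = {}
    for name, value in headers.items():
        match = LEGACY.match(name)
        if match:
            name = "RateLimit-" + match.group(1).capitalize()
        normalized[name] = value
    return normalized
```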

1464	   The semantics of RateLimit-Remaining depend on the windowing
1465	   algorithm.  A sliding window policy for example may result in having
1466	   a RateLimit-Remaining value related to the ratio between the current
1467	   and the maximum throughput, e.g.:

1469	   RateLimit-Limit: 12
1470	   RateLimit-Policy: 12;w=1
1471	   RateLimit-Remaining: 6          ; using 50% of throughput, that is 6 units/s
1472	   RateLimit-Reset: 1

1474	   If this is the case, the optimal solution is to achieve

1476	   RateLimit-Limit: 12
1477	   RateLimit-Policy: 12;w=1
1478	   RateLimit-Remaining: 1          ; using 100% of throughput, that is 12 units/s
1479	   RateLimit-Reset: 1

1480	   At this point you should stop increasing your request rate.

1482	Acknowledgements

1484	   Thanks to Willi Schoenborn, Alejandro Martinez Ruiz, Alessandro
1485	   Ranellucci, Amos Jeffries, Martin Thomson, Erik Wilde and Mark
1486	   Nottingham for being the initial contributors of these
1487	   specifications.  Kudos to the first community implementers: Aapo
1488	   Talvensaari, Nathan Friedly and Sanyam Dogra.

1490	   In addition to the people above, this document owes a lot to the
1491	   extensive discussion in the HTTPAPI workgroup, including Rich Salz,
1492	   Darrel Miller and Julian Reschke.

1494	Changes

1496	   This section is to be removed before publishing as an RFC.

1498	Since draft-ietf-httpapi-ratelimit-headers-03

1500	   This section is to be removed before publishing as an RFC.

1502	   *  Split policy information in RateLimit-Policy #81

1504	Since draft-ietf-httpapi-ratelimit-headers-02

1506	   This section is to be removed before publishing as an RFC.

1508	   *  Address throttling scope #83

1510	Since draft-ietf-httpapi-ratelimit-headers-01

1512	   This section is to be removed before publishing as an RFC.

1514	   *  Update IANA considerations #60

1516	   *  Use Structured fields #58

1518	   *  Reorganize document #67

1520	Since draft-ietf-httpapi-ratelimit-headers-00

1522	   This section is to be removed before publishing as an RFC.

1524	   *  Use I-D.httpbis-semantics, which includes referencing delay-
1525	      seconds instead of delta-seconds. #5

1527	Authors' Addresses

1529	   Roberto Polli
1530	   Team Digitale, Italian Government
1531	   Italy
1532	   Email: robipolli@gmail.com

1534	   Alejandro Martinez Ruiz
1535	   Red Hat
1536	   Email: alex@flawedcode.org