idnits 2.17.1 

draft-ietf-httpapi-ratelimit-headers-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 3 instances of too long lines in the document, the longest one
     being 41 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (7 March 2022) is 774 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Normative reference to a draft: ref. 'SEMANTICS' 

  -- Obsolete informational reference (is this intentional?): RFC 7234
     (Obsoleted by RFC 9111)


     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	HTTPAPI                                                         R. Polli
3	Internet-Draft                         Team Digitale, Italian Government
4	Intended status: Standards Track                             A. Martinez
5	Expires: 8 September 2022                                        Red Hat
6	                                                            7 March 2022

8	                       RateLimit Fields for HTTP
9	                draft-ietf-httpapi-ratelimit-headers-03

11	Abstract

13	   This document defines the RateLimit-Limit, RateLimit-Remaining,
14	   RateLimit-Reset fields for HTTP, thus allowing servers to publish
15	   current service limits and clients to shape their request policy and
16	   avoid being throttled out.

18	Note to Readers

20	   _RFC EDITOR: please remove this section before publication_

22	   Discussion of this draft takes place on the HTTPAPI working group
23	   mailing list (httpapi@ietf.org), which is archived at
24	   https://mailarchive.ietf.org/arch/browse/httpapi/.

27	   The source code and issues list for this draft can be found at
28	   https://github.com/ietf-wg-httpapi/ratelimit-headers.

31	   References to ThisRFC in the IANA Considerations section will be
32	   replaced with the RFC number when assigned.

34	Status of This Memo

36	   This Internet-Draft is submitted in full conformance with the
37	   provisions of BCP 78 and BCP 79.

39	   Internet-Drafts are working documents of the Internet Engineering
40	   Task Force (IETF).  Note that other groups may also distribute
41	   working documents as Internet-Drafts.  The list of current Internet-
42	   Drafts is at https://datatracker.ietf.org/drafts/current/.

44	   Internet-Drafts are draft documents valid for a maximum of six months
45	   and may be updated, replaced, or obsoleted by other documents at any
46	   time.  It is inappropriate to use Internet-Drafts as reference
47	   material or to cite them other than as "work in progress."
48	   This Internet-Draft will expire on 8 September 2022.

50	Copyright Notice

52	   Copyright (c) 2022 IETF Trust and the persons identified as the
53	   document authors.  All rights reserved.

55	   This document is subject to BCP 78 and the IETF Trust's Legal
56	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
57	   license-info) in effect on the date of publication of this document.
58	   Please review these documents carefully, as they describe your rights
59	   and restrictions with respect to this document.  Code Components
60	   extracted from this document must include Revised BSD License text as
61	   described in Section 4.e of the Trust Legal Provisions and are
62	   provided without warranty as described in the Revised BSD License.

64	Table of Contents

66	   1.  Introduction
67	     1.1.  Goals
68	     1.2.  Notational Conventions
69	   2.  Expressing rate-limit policies
70	     2.1.  Time window
71	     2.2.  Service limit
72	     2.3.  Quota policy
73	   3.  Providing RateLimit fields
74	     3.1.  Performance considerations
75	   4.  Receiving RateLimit fields
76	     4.1.  Intermediaries
77	     4.2.  Caching
78	   5.  Fields definition
79	     5.1.  RateLimit-Limit
80	     5.2.  RateLimit-Remaining
81	     5.3.  RateLimit-Reset
82	   6.  Security Considerations
83	     6.1.  Throttling does not prevent clients from issuing
84	           requests
85	     6.2.  Information disclosure
86	     6.3.  Remaining quota-units are not granted requests
87	     6.4.  Reliability of RateLimit-Reset
88	     6.5.  Resource exhaustion
89	     6.6.  Denial of Service
90	   7.  IANA Considerations
91	     7.1.  RateLimit Parameters Registration
92	   8.  References
93	     8.1.  Normative References
94	     8.2.  Informative References
95	   Appendix A.  Rate-limiting and quotas
96	     A.1.  Interoperability issues
97	   Appendix B.  Examples
98	     B.1.  Unparameterized responses
99	       B.1.1.  Throttling information in responses
100	       B.1.2.  Use in conjunction with custom fields
101	       B.1.3.  Use for limiting concurrency
102	       B.1.4.  Use in throttled responses
103	     B.2.  Parameterized responses
104	       B.2.1.  Throttling window specified via parameter
105	       B.2.2.  Dynamic limits with parameterized windows
106	       B.2.3.  Dynamic limits for pushing back and slowing down
107	     B.3.  Dynamic limits for pushing back with Retry-After and slow
108	           down
109	       B.3.1.  Missing Remaining information
110	       B.3.2.  Use with multiple windows
111	   FAQ
112	   RateLimit fields currently used on the web
113	   Acknowledgements
114	   Changes
115	     Since draft-ietf-httpapi-ratelimit-headers-01
116	     Since draft-ietf-httpapi-ratelimit-headers-00
117	   Authors' Addresses

119	1.  Introduction

121	   The widespread use of HTTP as a distributed computation protocol
122	   requires an explicit way of communicating service status and usage
123	   quotas.

125	   This was partially addressed by the Retry-After header field defined
126	   in [SEMANTICS] to be returned in 429 Too Many Requests (see
127	   [STATUS429]) or 503 Service Unavailable responses.

129	   Widely deployed quota mechanisms limit the number of acceptable
130	   requests in a given time window, e.g. 10 requests per second;
131	   currently, there is no standard way to communicate service quotas so
132	   that the client can throttle its requests and prevent 4xx or 5xx
133	   responses.  See Appendix A for further information on the current
134	   usage of rate limiting in HTTP.

136	   This document defines syntax and semantics for the following fields:

138	   *  RateLimit-Limit: containing the requests quota in the time window;

140	   *  RateLimit-Remaining: containing the remaining requests quota in
141	      the current window;

143	   *  RateLimit-Reset: containing the time remaining in the current
144	      window, specified in seconds.

146	   The behavior of RateLimit-Reset is compatible with the delay-seconds
147	   notation of Retry-After.

149	   The field definitions allow describing complex policies, including
150	   those using multiple and variable time windows and dynamic quotas,
151	   or implementing concurrency limits.

153	1.1.  Goals

155	   The goals of the RateLimit fields are:

157	   Interoperability:  Standardization of the names and semantics of
158	      rate-limit headers to ease their enforcement and adoption;

160	   Resiliency:  Improve resiliency of HTTP infrastructure by providing
161	      clients with information useful to throttle their requests and
162	      prevent 4xx or 5xx responses;

164	   Documentation:  Simplify API documentation by eliminating the need to
165	      include detailed quota limits and related fields in API
166	      documentation.

168	   The following features are out of the scope of this document:

170	   Authorization:  RateLimit fields are not meant to support
171	      authorization or other kinds of access controls.

173	   Throttling scope:  This specification does not cover the throttling
174	      scope, which may be the given resource-target, its parent path or
175	      the whole Origin (see Section 7 of [RFC6454]).  This can be
176	      addressed using extensibility mechanisms such as the parameter
177	      registry (see Section 7.1).

179	   Response status code:  RateLimit fields may be returned in both
180	      successful (see Section 15.3 of [SEMANTICS]) and non-successful
181	      responses.  This specification does not cover whether
182	      non-successful responses count toward quota usage, nor does it
183	      mandate any correlation between the RateLimit values and the
184	      returned status code.

186	   Throttling policy:  This specification does not mandate a specific
187	      throttling policy.  The values published in the fields, including
188	      the window size, can be statically or dynamically evaluated.

190	   Service Level Agreement:  Conveyed quota hints do not imply any
191	      service guarantee.  A server is free to throttle clients that
192	      respect their quota under certain circumstances.

194	1.2.  Notational Conventions

196	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
197	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
198	   "OPTIONAL" in this document are to be interpreted as described in
199	   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
200	   capitals, as shown here.

202	   This document uses the Augmented BNF defined in [RFC5234] and updated
203	   by [RFC7405] along with the "#rule" extension defined in
204	   Section 5.6.1 of [SEMANTICS].

206	   The term Origin is to be interpreted as described in Section 7 of
207	   [RFC6454].

209	   This specification uses Structured Fields [SF] to specify syntax.

211	   The terms sf-list, sf-item, sf-string, sf-token, sf-integer, bare-
212	   item and key refer to the structured types defined therein.

214	2.  Expressing rate-limit policies

216	2.1.  Time window

218	   Rate limit policies limit the number of acceptable requests in a
219	   given time window.

221	   A time window is expressed in seconds, using the following syntax:

223	   time-window = delay-seconds
224	   delay-seconds = sf-integer

226	   Where delay-seconds is a non-negative sf-integer compatible with the
227	   "delay-seconds" rule defined in Section 10.2.3 of [SEMANTICS].

229	   Subsecond precision is not supported.

231	2.2.  Service limit

233	   The service-limit is a value associated with the maximum number of
234	   requests that the server is willing to accept from one or more
235	   clients on a given basis (e.g. originating IP, authenticated user or
236	   geographical area) during a time-window as defined in Section 2.1.

238	   The service-limit is expressed in quota-units and has the following
239	   syntax:

241	      service-limit = quota-units
242	      quota-units = sf-integer

244	   where quota-units is a non-negative sf-integer.

246	   The service-limit SHOULD match the maximum number of acceptable
247	   requests.

249	   The service-limit MAY differ from the total number of acceptable
250	   requests when weight mechanisms, bursts, or other server policies are
251	   implemented.

253	   If the service-limit does not match the maximum number of acceptable
254	   requests, their relation SHOULD be communicated out-of-band.

256	   Example: A server could

258	   *  count requests like /books/{id} once;

260	   *  count search requests like /books?author=WuMing twice;

262	   so that the counters evolve as follows:

264	   GET /books/123           ; service-limit=4, remaining=3, status=200
265	   GET /books?author=WuMing ; service-limit=4, remaining=1, status=200
266	   GET /books?author=Eco    ; service-limit=4, remaining=0, status=429
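
The counters above can be reproduced with a small simulation. This is
only a sketch: the WeightedQuota class, the request_cost helper and the
rule that a rejected request drains the leftover quota-units are
assumptions chosen to match the example, not behavior mandated by this
document.

```python
from dataclasses import dataclass


@dataclass
class WeightedQuota:
    """Quota counter where each request costs a number of quota-units."""
    service_limit: int
    remaining: int

    def consume(self, cost: int) -> int:
        """Charge `cost` quota-units and return the HTTP status to send."""
        served = cost <= self.remaining
        # Assumption: a rejected request still drains the leftover units,
        # which is one way to obtain remaining=0 in the example above.
        self.remaining = max(0, self.remaining - cost)
        return 200 if served else 429


def request_cost(target: str) -> int:
    # Assumption: search requests (with a query string) cost 2 units.
    return 2 if "?" in target else 1


quota = WeightedQuota(service_limit=4, remaining=4)
log = []
for target in ["/books/123", "/books?author=WuMing", "/books?author=Eco"]:
    status = quota.consume(request_cost(target))
    log.append((target, quota.remaining, status))
```

After the loop, log holds one (target, remaining, status) tuple per
request, in the same order as the three lines of the example.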

268	2.3.  Quota policy

270	   This specification allows describing a quota policy with the
271	   following syntax:

273	      quota-policy = sf-item

275	   where the associated bare-item is a service-limit and parameters are
276	   supported.

278	   The following parameters are defined:

280	   w:  The REQUIRED "w" parameter specifies a time window.  Its syntax
281	      is a "time-window" defined in Section 2.1.

283	   Other parameters are allowed and can be regarded as comments.  They
284	   ought to be registered within the "Hypertext Transfer Protocol (HTTP)
285	   RateLimit Parameters Registry", as described in Section 7.1.

287	   An example policy of 100 quota-units per minute:

289	      100;w=60

291	   The definition of a quota-policy does not imply any specific
292	   distribution of quota-units over time.  Such service specific details
293	   can be conveyed as parameters.

295	   Two policy examples containing further details via custom parameters:

297	      100;w=60;comment="fixed window"
298	      12;w=1;burst=1000;policy="leaky bucket"

300	   To avoid clashes, implementers SHOULD prefix unregistered parameters
301	   with an x-<vendor> identifier, e.g. x-acme-policy, x-acme-burst.
302	   While it is useful to define a clear syntax and semantics even for
303	   custom parameters, it is important to note that user agents are not
304	   required to process quota policy information.
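
A receiver that does choose to process a quota-policy could read it
roughly as follows. This is a simplified sketch, not a complete
Structured Fields parser: the parse_quota_policy name is hypothetical,
and the code assumes that quoted strings never contain ";" or escaped
characters.

```python
def parse_quota_policy(item: str) -> tuple[int, dict]:
    """Split a quota-policy such as '100;w=60' into (service-limit, params)."""
    head, *raw_params = (part.strip() for part in item.split(";"))
    service_limit = int(head)
    if service_limit < 0:
        raise ValueError("service-limit must be non-negative")

    params = {}
    for raw in raw_params:
        key, sep, value = raw.partition("=")
        if not sep:
            params[key] = True            # parameter without a value
        elif len(value) >= 2 and value[0] == value[-1] == '"':
            params[key] = value[1:-1]     # quoted string
        else:
            try:
                params[key] = int(value)  # integer
            except ValueError:
                params[key] = value       # token, kept as text

    window = params.get("w")
    # type() is used because bool is a subclass of int in Python.
    if type(window) is not int or window < 0:
        raise ValueError('"w" is REQUIRED and must be a non-negative integer')
    return service_limit, params
```

For instance, parse_quota_policy('100;w=60') yields the service-limit
100 and a parameter dictionary containing only w=60, while a policy
without "w" is rejected with ValueError.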

306	3.  Providing RateLimit fields

308	   A server uses the RateLimit response fields defined in this document
309	   to communicate its quota policies according to the following rules:

311	   *  RateLimit-Limit and RateLimit-Reset are REQUIRED;

313	   *  RateLimit-Remaining is RECOMMENDED.

315	   The returned values refer to the metrics used to evaluate if the
316	   current request respects the quota policy and might not apply to
317	   subsequent requests.

319	   Example: a successful response with the following fields

321	      RateLimit-Limit: 10
322	      RateLimit-Remaining: 1
323	      RateLimit-Reset: 7

325	   does not guarantee that the next request will be successful.  Server
326	   metrics may be subject to other conditions like the one shown in the
327	   example from Section 2.2.

329	   A server MAY return RateLimit response fields independently of the
330	   response status code.  This includes throttled responses.

332	   This document does not mandate any correlation between the RateLimit
333	   values and the returned status code.

335	   Servers should be careful when returning RateLimit fields in
336	   redirection responses (i.e. 3xx status codes) because a low
337	   RateLimit-Remaining value could prevent the client from issuing
338	   requests.  For example, given the rate limiting fields below, a
339	   client could decide to wait 10 seconds before following the Location
340	   header, because RateLimit-Remaining is 0.

342	   HTTP/1.1 301 Moved Permanently
343	   Location: /foo/123
344	   RateLimit-Remaining: 0
345	   RateLimit-Limit: 10
346	   RateLimit-Reset: 10

348	   If a response contains both the Retry-After and the RateLimit-Reset
349	   fields, the value of RateLimit-Reset SHOULD reference the same point
350	   in time as Retry-After.

352	   When using a policy involving more than one time-window, the server
353	   MUST reply with the RateLimit fields related to the window with the
354	   lowest RateLimit-Remaining value.
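
One way a server could apply the multiple-window rule is sketched
below. The WindowState record and the ratelimit_fields helper are
hypothetical names; the sketch assumes the server already tracks the
remaining quota-units and the seconds to reset for each window.

```python
from dataclasses import dataclass


@dataclass
class WindowState:
    limit: int      # service-limit of this window, in quota-units
    window: int     # time-window size, in seconds
    remaining: int  # quota-units left in this window
    reset: int      # seconds until this window resets


def ratelimit_fields(windows: list[WindowState]) -> dict[str, str]:
    """Build the RateLimit fields from the window with the least quota left."""
    tightest = min(windows, key=lambda w: w.remaining)
    # Every window is also advertised as an informative quota-policy.
    policies = ", ".join(f"{w.limit};w={w.window}" for w in windows)
    return {
        "RateLimit-Limit": f"{tightest.limit}, {policies}",
        "RateLimit-Remaining": str(tightest.remaining),
        "RateLimit-Reset": str(tightest.reset),
    }


fields = ratelimit_fields([
    WindowState(limit=10, window=1, remaining=7, reset=1),
    WindowState(limit=50, window=60, remaining=3, reset=42),
])
```

Here the 60-second window has fewer quota-units left, so its limit,
remaining and reset values are the ones placed in the fields.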

356	   A service returning RateLimit fields MUST NOT convey values exposing
357	   an unwanted volume of requests and SHOULD implement mechanisms to cap
358	   the ratio between RateLimit-Remaining and RateLimit-Reset (see
359	   Section 6.5); this is especially important when quota-policies use a
360	   large time-window.

362	   Under certain conditions, a server MAY artificially lower RateLimit
363	   field values between subsequent requests, e.g. to respond to Denial
364	   of Service attacks or in case of resource saturation.

366	   Servers usually establish whether the request is in-quota before
367	   creating a response, so the RateLimit field values should already be
368	   available at that moment.  Nonetheless, servers MAY decide to send
369	   the RateLimit fields in a trailer section.

371	   To ease the migration from existing rate limit headers, a server
372	   SHOULD be able to provide the RateLimit-Limit field even without the
373	   optional quota-policy section.

375	3.1.  Performance considerations

377	   Servers are not required to return RateLimit fields in every
378	   response, and clients need to take this into account.  For example,
379	   an implementer concerned with performance might provide RateLimit
380	   fields only when a given quota is going to expire.

382	   Implementers concerned with the size of response fields might take
383	   into account their ratio with respect to the payload data, or use
384	   HTTP header-compression features such as [HPACK].

386	4.  Receiving RateLimit fields

388	   A client MUST process the received RateLimit fields.

390	   A client MUST validate the values received in the RateLimit fields
391	   before using them and check if there are significant discrepancies
392	   with the expected ones.  This includes a RateLimit-Reset moment too
393	   far in the future or a service-limit too high.
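
A minimal sketch of such a validation step follows. The
validate_ratelimit name and the max_limit and max_reset thresholds are
assumptions: this document does not define what counts as a significant
discrepancy, so each client has to pick bounds that fit the service it
talks to.

```python
from typing import Mapping, Optional


def _non_negative_int(text: str) -> int:
    """Parse a non-negative decimal integer, rejecting anything else."""
    text = text.strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a non-negative integer: {text!r}")
    return int(text)


def validate_ratelimit(
    headers: Mapping[str, str],
    max_limit: int = 10_000,  # hypothetical ceiling for a plausible limit
    max_reset: int = 3_600,   # hypothetical ceiling for a reset, in seconds
) -> Optional[dict]:
    """Return the parsed values, or None when the fields look unusable."""
    try:
        # expiring-limit is the first member of the RateLimit-Limit list.
        first_member = headers["RateLimit-Limit"].split(",")[0]
        limit = _non_negative_int(first_member.split(";")[0])
        reset = _non_negative_int(headers["RateLimit-Reset"])
        remaining = None
        if "RateLimit-Remaining" in headers:
            remaining = _non_negative_int(headers["RateLimit-Remaining"])
    except (KeyError, ValueError):
        return None  # malformed or incomplete fields are ignored
    if limit > max_limit or reset > max_reset:
        return None  # service-limit too high or reset too far away
    if remaining is not None and remaining > max_limit:
        return None
    return {"limit": limit, "remaining": remaining, "reset": reset}
```

A caller would fall back to its default request policy whenever the
function returns None.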

395	   A client receiving RateLimit fields MUST NOT assume that subsequent
396	   responses contain the same RateLimit fields, or any RateLimit fields
397	   at all.

399	   Malformed RateLimit fields MAY be ignored.

401	   A client SHOULD NOT exceed the quota-units expressed in RateLimit-
402	   Remaining before the time-window expressed in RateLimit-Reset.

404	   A client MAY still probe the server if the RateLimit-Reset is
405	   considered too high.

407	   The value of RateLimit-Reset is generated at response time: a client
408	   aware of a significant network latency MAY behave accordingly and use
409	   other information (e.g. the Date response header field, or otherwise
410	   gathered metrics) to better estimate the RateLimit-Reset moment
411	   intended by the server.
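
As an illustration of the latency adjustment, a client could anchor
RateLimit-Reset to the Date response header field instead of to the
local time of receipt. The function names below are hypothetical, and
the sketch assumes that the client clock is close enough to the server
clock for the comparison with "now" to be meaningful.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def estimate_reset_moment(date_header: str, reset_seconds: int) -> datetime:
    """Estimate the reset moment as Date plus RateLimit-Reset."""
    generated_at = parsedate_to_datetime(date_header)
    return generated_at + timedelta(seconds=reset_seconds)


def seconds_until_reset(date_header: str, reset_seconds: int,
                        now: datetime) -> float:
    """Seconds still to wait at `now`, never negative."""
    moment = estimate_reset_moment(date_header, reset_seconds)
    return max(0.0, (moment - now).total_seconds())


moment = estimate_reset_moment("Tue, 15 Nov 1994 08:00:00 GMT", 36000)
received = datetime(1994, 11, 15, 8, 0, 5, tzinfo=timezone.utc)
wait = seconds_until_reset("Tue, 15 Nov 1994 08:00:00 GMT", 36000, received)
```

With a response received five seconds after it was generated, the
client waits five seconds less than the raw RateLimit-Reset value.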

413	   The quota-policy values and comments provided in RateLimit-Limit are
414	   informative and MAY be ignored.

416	   If a response contains both the RateLimit-Reset and Retry-After
417	   fields, Retry-After MUST take precedence and RateLimit-Reset MAY be
418	   ignored.
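
The precedence rule can be sketched as follows. The seconds_to_wait
name is hypothetical; the sketch assumes that header names are given in
the exact case shown and that an unparseable Retry-After is treated as
absent, which is a choice of this example rather than a requirement of
this document.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def seconds_to_wait(headers: Mapping[str, str],
                    now: Optional[datetime] = None) -> Optional[float]:
    """Pick the delay to honor, giving Retry-After precedence."""
    now = now or datetime.now(timezone.utc)
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        value = retry_after.strip()
        if value.isdigit():
            return float(value)  # delay-seconds form
        try:
            when = parsedate_to_datetime(value)  # HTTP-date form
            return max(0.0, (when - now).total_seconds())
        except (TypeError, ValueError):
            pass  # assumption: unparseable Retry-After is treated as absent
    reset = headers.get("RateLimit-Reset")
    if reset is not None and reset.strip().isdigit():
        return float(reset.strip())
    return None  # no usable hint in this response
```

When both fields are present, as in a response carrying Retry-After: 20
and RateLimit-Reset: 100, the function returns 20.0.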

420	   This specification does not mandate a specific throttling behavior
421	   and implementers can adopt their preferred policies, including:

423	   *  slowing down or preemptively backing off their request rate when
424	      approaching quota limits;

426	   *  consuming all the quota according to the exposed limits and then
427	      waiting.
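
The two policies listed above can be compared with a short sketch. Both
function names are hypothetical, and the even spreading used by the
first one is just one possible way of slowing down.

```python
def spread_delay(remaining: int, reset: int) -> float:
    """Pause between requests so the quota lasts until the reset."""
    if remaining <= 0:
        return float(reset)  # nothing left: wait for the window to reset
    return reset / remaining


def burst_then_wait_delay(remaining: int, reset: int) -> float:
    """Send at full speed while quota is left, then wait for the reset."""
    return 0.0 if remaining > 0 else float(reset)
```

With RateLimit-Remaining: 5 and RateLimit-Reset: 10, the first policy
pauses 2 seconds between requests, while the second sends immediately
and waits 10 seconds only once the quota is used up.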

429	4.1.  Intermediaries

431	   This section documents the considerations advised in Section 16.3.2
432	   of [SEMANTICS].

434	   An intermediary that is not part of the originating service
435	   infrastructure and is not aware of the quota-policy semantic used by
436	   the Origin Server SHOULD NOT alter the RateLimit fields' values in
437	   such a way as to communicate a more permissive quota-policy; this
438	   includes removing the RateLimit fields.

440	   An intermediary MAY alter the RateLimit fields in such a way as to
441	   communicate a more restrictive quota-policy when:

443	   *  it is aware of the quota-unit semantic used by the Origin Server;

445	   *  it implements this specification and enforces a quota-policy which
446	      is more restrictive than the one conveyed in the fields.

448	   An intermediary SHOULD forward a request even when presuming that it
449	   might not be serviced; the service returning the RateLimit fields is
450	   solely responsible for enforcing the communicated quota-policy, and
451	   it is always free to service incoming requests.

453	   This specification does not mandate any behavior on intermediaries
454	   with respect to retries, nor does it require that intermediaries
455	   have any role in respecting quota-policies.  For example, it is
456	   legitimate for a proxy to retransmit a request without notifying the
457	   client, thus consuming quota-units.

459	4.2.  Caching

461	   As is the ordinary case for HTTP caching ([RFC7234]), a response with
462	   RateLimit fields might be cached and re-used for subsequent requests.
463	   A cached RateLimit response does not modify quota counters but could
464	   contain stale information.  Clients interested in determining the
465	   freshness of the RateLimit fields could rely on fields such as Date
466	   and on the time-window of a quota-policy.

468	5.  Fields definition

470	   The following RateLimit response fields are defined.

472	5.1.  RateLimit-Limit

474	   The RateLimit-Limit response field indicates the service-limit
475	   associated with the client in the current time-window.

477	   If the client exceeds that limit, it might not be served.

479	   The field is a List Structured Field of positive length.  The first
480	   member is named expiring-limit and its syntax is service-limit, while
481	   the syntax of the other optional members is quota-policy:

483	      RateLimit-Limit = sf-list

485	   The expiring-limit value MUST be set to the service-limit that is
486	   closest to reaching its limit.

488	   The quota-policy is defined in Section 2.3, and its values are
489	   informative.

491	      RateLimit-Limit: 100

493	   A time-window associated with expiring-limit can be communicated via
494	   an optional quota-policy value, as shown in the following example:

496	      RateLimit-Limit: 100, 100;w=10

498	   If the expiring-limit is not associated with a time-window, the
499	   time-window MUST either be:

501	   *  inferred by the value of RateLimit-Reset at the moment of the
502	      reset, or

504	   *  communicated out-of-band (e.g. in the documentation).

506	   Policies using multiple quota limits MAY be returned using multiple
507	   quota-policy items, as shown in the following two examples:

509	      RateLimit-Limit: 10, 10;w=1, 50;w=60, 1000;w=3600, 5000;w=86400
510	      RateLimit-Limit: 10, 10;w=1;burst=1000, 1000;w=3600

512	   This field MUST NOT occur multiple times and can be sent in a trailer
513	   section.

515	5.2.  RateLimit-Remaining

517	   The RateLimit-Remaining response field indicates the remaining
518	   quota-units defined in Section 2.2 associated with the client.

520	   The field is an Integer Structured Field and its value is

522	      RateLimit-Remaining = quota-units

524	   This field MUST NOT occur multiple times and can be sent in a trailer
525	   section.

527	   Clients MUST NOT assume that a positive RateLimit-Remaining value is
528	   a guarantee that further requests will be served.

530	   A low RateLimit-Remaining value is like a yellow traffic-light for
531	   either the number of requests issued in the time-window or the
532	   request throughput: the red light may arrive suddenly (see
533	   Section 3).

535	   One example of RateLimit-Remaining use is below.

537	      RateLimit-Remaining: 50

539	5.3.  RateLimit-Reset

541	   The RateLimit-Reset response field indicates the number of seconds
542	   until the quota resets.

545	   The field is an Integer Structured Field and its value is

547	      RateLimit-Reset = delay-seconds

549	   The delay-seconds format is used because:

551	   *  it does not rely on clock synchronization and is resilient to
552	      clock adjustment and clock skew between client and server (see
553	      Section 5.6.7 of [SEMANTICS]);

555	   *  it mitigates the risk related to thundering herd when too many
556	      clients are serviced with the same timestamp.

558	   This field MUST NOT occur multiple times and can be sent in a trailer
559	   section.

561	   An example of RateLimit-Reset use is below.

563	      RateLimit-Reset: 50

565	   The client MUST NOT assume that its entire service-limit will be
566	   restored after the moment referenced by RateLimit-Reset.  The server
567	   MAY arbitrarily alter the RateLimit-Reset value between subsequent
568	   requests e.g. in case of resource saturation or to implement sliding
569	   window policies.

571	6.  Security Considerations

573	6.1.  Throttling does not prevent clients from issuing requests

575	   This specification does not prevent clients from making over-quota
576	   requests.

578	   Servers should always implement mechanisms to prevent resource
579	   exhaustion.

581	6.2.  Information disclosure

583	   Servers should not disclose to untrusted parties operational capacity
584	   information that can be used to saturate their infrastructure
585	   resources.

587	   While this specification does not mandate whether non-2xx responses
588	   consume quota, if 401 and 403 responses count toward quota, a
589	   malicious client could probe the endpoint to get traffic information
590	   about another user.

592	   As intermediaries might retransmit requests and consume quota-units
593	   without prior knowledge of the User Agent, RateLimit fields might
594	   reveal the existence of an intermediary to the User Agent.

596	6.3.  Remaining quota-units are not granted requests

598	   RateLimit-* fields convey hints from the server to the clients so
599	   that they can avoid being throttled out.

601	   Clients MUST NOT consider the quota-units returned in RateLimit-
602	   Remaining as a service level agreement.

604	   In case of resource saturation, the server MAY artificially lower the
605	   returned values or not serve the request regardless of the advertised
606	   quotas.

608	6.4.  Reliability of RateLimit-Reset

610	   Consider that the service-limit might not be restored after the
611	   moment referenced by RateLimit-Reset, and that the RateLimit-Reset
612	   value should not be considered fixed or constant.

614	   Subsequent requests may return a higher RateLimit-Reset value to
615	   limit concurrency or implement dynamic or adaptive throttling
616	   policies.

618	6.5.  Resource exhaustion

620	   When returning RateLimit-Reset you must be aware that many throttled
621	   clients may come back at the very moment specified.

623	   This is true for Retry-After too.

625	   For example, if the quota resets every day at 18:00:00 and your
626	   server returns the RateLimit-Reset accordingly

628	      Date: Tue, 15 Nov 1994 08:00:00 GMT
629	      RateLimit-Reset: 36000

631	   there's a high probability that all clients will show up at 18:00:00.

633	   This could be mitigated by adding some jitter to the field-value.
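
A minimal sketch of such jitter is shown below. The jittered_reset name
and the 30-second ceiling are assumptions; the jitter is only added,
never subtracted, so that no client is invited back before the quota
has actually been restored.

```python
import random
from typing import Optional


def jittered_reset(reset: int, max_jitter: int = 30,
                   rng: Optional[random.Random] = None) -> int:
    """Return `reset` plus a random delay of 0..max_jitter seconds."""
    rng = rng or random.Random()
    # randint is inclusive on both ends.
    return reset + rng.randint(0, max_jitter)


# Each response gets its own value, spreading clients over 30 seconds.
advertised = jittered_reset(36000)
```

With the values of the example above, clients would come back at
different moments between 18:00:00 and 18:00:30.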

635	   Resource exhaustion issues can be associated with quota policies
636	   using a large time-window, because a user agent by chance or on
637	   purpose might consume most of its quota-units in a significantly
638	   shorter interval.

640	   This behavior can even be triggered by the provided RateLimit fields.
641	   The following example describes a service with an unconsumed quota-
642	   policy of 10000 quota-units per 1000 seconds.

644	   RateLimit-Limit: 10000, 10000;w=1000
645	   RateLimit-Remaining: 10000
646	   RateLimit-Reset: 10

648	   A client implementing a simple ratio between RateLimit-Remaining and
649	   RateLimit-Reset could infer an average throughput of 1000 quota-units
650	   per second, while RateLimit-Limit conveys a quota-policy with an
651	   average of 10 quota-units per second.  If the service cannot handle
652	   such load, it should return either a lower RateLimit-Remaining value
653	   or a higher RateLimit-Reset value.  Moreover, complementing large
654	   time-window quota-policies with a short time-window one mitigates
655	   those risks.
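
The arithmetic above, and one possible cap on the server side, can be
sketched as follows. The cap_remaining name and the max_rate value of
50 quota-units per second are assumptions made for the sake of the
example.

```python
def naive_rate(remaining: int, reset: int) -> float:
    """Throughput a client could infer from Remaining / Reset."""
    return remaining / max(reset, 1)


def cap_remaining(remaining: int, reset: int, max_rate: float) -> int:
    """Lower RateLimit-Remaining so that remaining / reset <= max_rate."""
    ceiling = int(max_rate * max(reset, 1))
    return min(remaining, ceiling)


inferred = naive_rate(10000, 10)       # rate implied by the example fields
policy_average = 10000 / 1000          # average rate of the quota-policy
advertised = cap_remaining(10000, 10, max_rate=50.0)
```

Here inferred is 1000 quota-units per second against a policy average
of 10, and the capped RateLimit-Remaining value becomes 500.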

657	6.6.  Denial of Service

659	   RateLimit fields may take unexpected values by chance or on purpose.
660	   For example, an excessively high RateLimit-Remaining value may be:

662	   *  used by a malicious intermediary to trigger a Denial of Service
663	      attack or to consume client resources by boosting its requests;

665	   *  passed by a misconfigured server.

666	   Likewise, a high RateLimit-Reset value could prevent clients from
667	   contacting the server.

669	   Clients MUST validate the received values to mitigate those risks.

671	7.  IANA Considerations

673	   IANA is requested to update one registry and create one new registry.

675	   Please add the following entries to the "Hypertext Transfer Protocol
676	   (HTTP) Field Name Registry" ([SEMANTICS]):

678	       +=====================+===========+========================+
679	       | Field Name          | Status    | Specification          |
680	       +=====================+===========+========================+
681	       | RateLimit-Limit     | permanent | Section 5.1 of ThisRFC |
682	       +---------------------+-----------+------------------------+
683	       | RateLimit-Remaining | permanent | Section 5.2 of ThisRFC |
684	       +---------------------+-----------+------------------------+
685	       | RateLimit-Reset     | permanent | Section 5.3 of ThisRFC |
686	       +---------------------+-----------+------------------------+

688	                                 Table 1

690	7.1.  RateLimit Parameters Registration

692	   IANA is requested to create a new registry to be called "Hypertext
693	   Transfer Protocol (HTTP) RateLimit Parameters Registry", to be
694	   located at
695	   https://www.iana.org/assignments/http-ratelimit-parameters.
696	   Registration is done on the advice of a Designated Expert, appointed
697	   by the IESG or their delegate.  All entries are Specification
698	   Required ([IANA], Section 4.6).

700	   Registration requests consist of the following information:

702	   *  Parameter name: The parameter name, conforming to [SF].

704	   *  Field name: The RateLimit field for which the parameter is
705	      registered.  If a parameter is intended to be used with multiple
706	      fields, it has to be registered for each one.

708	   *  Description: A brief description of the parameter.

710	   *  Specification document: A reference to the document that specifies
711	      the parameter, preferably including a URI that can be used to
712	      retrieve a copy of the document.

714	   *  Comments (optional): Any additional information that can be
715	      useful.

717	   The initial contents of this registry should be:

719	   +=================+=========+============+===============+==========+
720	   | Field Name      |Parameter|Description |Specification  |Comments  |
721	   |                 |name     |            |               |(optional)|
722	   +=================+=========+============+===============+==========+
723	   | RateLimit-Limit |w        |Time window |Section 2.3 of |          |
724	   |                 |         |            |ThisRFC        |          |
725	   +-----------------+---------+------------+---------------+----------+

727	                                  Table 2

729	8.  References

731	8.1.  Normative References

733	   [IANA]     Cotton, M., Leiba, B., and T. Narten, "Guidelines for
734	              Writing an IANA Considerations Section in RFCs", BCP 26,
735	              RFC 8126, DOI 10.17487/RFC8126, June 2017,
736	              <https://www.rfc-editor.org/rfc/rfc8126>.

738	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
739	              Requirement Levels", BCP 14, RFC 2119,
740	              DOI 10.17487/RFC2119, March 1997,
741	              <https://www.rfc-editor.org/rfc/rfc2119>.

743	   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
744	              Specifications: ABNF", STD 68, RFC 5234,
745	              DOI 10.17487/RFC5234, January 2008,
746	              <https://www.rfc-editor.org/rfc/rfc5234>.

748	   [RFC6454]  Barth, A., "The Web Origin Concept", RFC 6454,
749	              DOI 10.17487/RFC6454, December 2011,
750	              <https://www.rfc-editor.org/rfc/rfc6454>.

752	   [RFC7405]  Kyzivat, P., "Case-Sensitive String Support in ABNF",
753	              RFC 7405, DOI 10.17487/RFC7405, December 2014,
754	              <https://www.rfc-editor.org/rfc/rfc7405>.

756	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
757	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
758	              May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.

760	   [SEMANTICS]
761	              Fielding, R. T., Nottingham, M., and J. Reschke, "HTTP
762	              Semantics", Work in Progress, Internet-Draft, draft-ietf-
763	              httpbis-semantics-19, 12 September 2021,
764	              <https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-
765	              semantics-19>.

767	   [SF]       Nottingham, M. and P-H. Kamp, "Structured Field Values for
768	              HTTP", RFC 8941, DOI 10.17487/RFC8941, February 2021,
769	              <https://www.rfc-editor.org/rfc/rfc8941>.

771	8.2.  Informative References

773	   [HPACK]    Peon, R. and H. Ruellan, "HPACK: Header Compression for
774	              HTTP/2", RFC 7541, DOI 10.17487/RFC7541, May 2015,
775	              <https://www.rfc-editor.org/rfc/rfc7541>.

777	   [RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet:
778	              Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002,
779	              <https://www.rfc-editor.org/rfc/rfc3339>.

781	   [RFC6585]  Nottingham, M. and R. Fielding, "Additional HTTP Status
782	              Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012,
783	              <https://www.rfc-editor.org/rfc/rfc6585>.

785	   [RFC7234]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
786	              Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching",
787	              RFC 7234, DOI 10.17487/RFC7234, June 2014,
788	              <https://www.rfc-editor.org/rfc/rfc7234>.

790	   [STATUS429]
791	              Nottingham, M. and R. Fielding, "Additional HTTP Status
792	              Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012,
793	              <https://www.rfc-editor.org/rfc/rfc6585>.

796	   [UNIX]     The Open Group, "The Single UNIX Specification, Version 2
797	              - 6 Vol Set for UNIX 98", February 1997.

799	Appendix A.  Rate-limiting and quotas

801	   Servers use quota mechanisms to avoid system overload, to ensure an
802	   equitable distribution of computational resources, or to enforce
803	   other policies, e.g. monetization.

805	   A basic quota mechanism limits the number of acceptable requests in a
806	   given time window, e.g. 10 requests per second.

808	   When the quota is exceeded, servers usually do not serve the request
809	   and instead reply with a 4xx HTTP status code (e.g. 429 or 403) or
810	   adopt more aggressive policies like dropping connections.

812	   Quotas may be enforced on different bases (e.g. per user, per IP, per
813	   geographic area) and at different levels.  For example, a user may
814	   be allowed to issue:

816	   *  10 requests per second;

818	   *  limited to 60 requests per minute;

820	   *  limited to 1000 requests per hour.
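
   The layered policy above can be sketched as a set of fixed-window
   counters that must all have room before a request is accepted.  The
   class and method names are hypothetical, fixed windows aligned to
   time zero are an assumption, and real servers may use sliding
   windows or other algorithms.

```python
class FixedWindowQuota:
    """Accept a request only if every configured window has room."""

    def __init__(self, policies):
        # policies: list of (limit, window_seconds) pairs
        self.policies = policies
        # (window_seconds, window_index) -> requests counted so far.
        # Old windows are never purged in this sketch.
        self.counters = {}

    def allow(self, now):
        keys = [(window, int(now // window)) for _, window in self.policies]
        for (limit, _), key in zip(self.policies, keys):
            if self.counters.get(key, 0) >= limit:
                return False
        for key in keys:
            self.counters[key] = self.counters.get(key, 0) + 1
        return True


quota = FixedWindowQuota([(10, 1), (60, 60), (1000, 3600)])
```

   With these limits, a client sending 10 requests in each of the first
   six seconds is rejected at second 6 by the per-minute limit, even
   though the per-second counter has just been reset.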

822	   Moreover, system metrics, statistics and heuristics can be used to
823	   implement more complex policies, where the number of acceptable
824	   requests and the time window are computed dynamically.

826	   To help clients throttle their requests, servers may expose the
827	   counters used to evaluate quota policies via HTTP header fields.

829	   Those response headers may be added by HTTP intermediaries such as
830	   API gateways and reverse proxies.

832	   On the web we can find many different rate-limit headers, usually
833	   containing the number of allowed requests in a given time window, and
834	   when the window is reset.

836	   The common choice is to return three headers containing:

838	   *  the maximum number of allowed requests in the time window;

840	   *  the number of remaining requests in the current window;

842	   *  the time remaining in the current window expressed in seconds or
843	      as a timestamp.

845	A.1.  Interoperability issues

847	   A major interoperability issue in throttling is the lack of standard
848	   headers, because:

850	   *  each implementation associates different semantics to the same
851	      header field names;

853	   *  header field names proliferate.

855	   User Agents interfacing with different servers may thus need to
856	   process different headers, or the very same application interface
857	   that sits behind different reverse proxies may reply with different
858	   throttling headers.

860	Appendix B.  Examples

862	B.1.  Unparameterized responses

864	B.1.1.  Throttling information in responses

866	   The client exhausted its service-limit for the next 50 seconds.  The
867	   time-window is communicated out-of-band or inferred by the field
868	   values.

870	   Request:

872	   GET /items/123 HTTP/1.1
873	   Host: api.example

875	   Response:

877	   HTTP/1.1 200 Ok
878	   Content-Type: application/json
879	   RateLimit-Limit: 100
880	   Ratelimit-Remaining: 0
881	   Ratelimit-Reset: 50

883	   {"hello": "world"}

885	   Since the field values are not necessarily correlated with the
886	   response status code, a subsequent request is not required to fail.
887	   The example below shows that the server decided to serve the request
888	   even if RateLimit-Remaining is 0.  Another server, or the same server
889	   under other load conditions, could have decided to throttle the
890	   request instead.

892	   Request:

894	   GET /items/456 HTTP/1.1
895	   Host: api.example

897	   Response:

899	   HTTP/1.1 200 Ok
900	   Content-Type: application/json
901	   RateLimit-Limit: 100
902	   Ratelimit-Remaining: 0
903	   Ratelimit-Reset: 48

905	   {"still": "successful"}
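
   The responses above spell the field names with different
   capitalization (RateLimit-Limit, Ratelimit-Remaining).  HTTP field
   names are case-insensitive, so a client is expected to treat them
   alike.  A minimal sketch, with a hypothetical function name and
   headers given as (name, value) pairs:

```python
def ratelimit_fields(headers):
    """Collect the three RateLimit fields, ignoring field-name case."""
    wanted = ("ratelimit-limit", "ratelimit-remaining", "ratelimit-reset")
    found = {}
    for name, value in headers:
        key = name.lower()
        if key in wanted:
            found[key] = value.strip()
    return found


response_headers = [
    ("Content-Type", "application/json"),
    ("RateLimit-Limit", "100"),
    ("Ratelimit-Remaining", "0"),
    ("Ratelimit-Reset", "48"),
]
fields = ratelimit_fields(response_headers)
```

   Note that a remaining value of 0 in fields does not by itself tell
   the client that the next request will fail.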

907	B.1.2.  Use in conjunction with custom fields

909	   The server uses two custom fields, namely acme-RateLimit-DayLimit and
910	   acme-RateLimit-HourLimit to expose the following policy:

912	   *  5000 daily quota-units;

914	   *  1000 hourly quota-units.

916	   The client consumed 4900 quota-units in the first 14 hours.

918	   Despite the next hourly limit of 1000 quota-units, the closest limit
919	   to reach is the daily one.

921	   The server then exposes the RateLimit-* fields to inform the client
922	   that:

924	   *  it has only 100 quota-units left;

926	   *  the window will reset in 10 hours.

928	   Request:

930	   GET /items/123 HTTP/1.1
931	   Host: api.example

933	   Response:

935	   HTTP/1.1 200 Ok
936	   Content-Type: application/json
937	   acme-RateLimit-DayLimit: 5000
938	   acme-RateLimit-HourLimit: 1000
939	   RateLimit-Limit: 5000
940	   RateLimit-Remaining: 100
941	   RateLimit-Reset: 36000

943	   {"hello": "world"}
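
   The arithmetic behind these values can be sketched as follows.  The
   function name is hypothetical, and the sketch assumes fixed windows
   that all started at the same instant and an hourly counter that has
   just been reset; neither assumption is mandated by this document.

```python
def closest_limit(policies, elapsed_seconds):
    """Pick the policy with the fewest remaining quota-units.

    policies: iterable of (limit, window_seconds, consumed) tuples.
    Returns (limit, remaining, reset_seconds).
    """
    best = None
    for limit, window, consumed in policies:
        remaining = max(limit - consumed, 0)
        reset = window - (elapsed_seconds % window)
        if best is None or remaining < best[1]:
            best = (limit, remaining, reset)
    return best


daily = (5000, 86400, 4900)   # 4900 units consumed in the first 14 hours
hourly = (1000, 3600, 0)      # assumed: the hourly counter just reset
limit, remaining, reset = closest_limit([daily, hourly], 14 * 3600)
```

   Here remaining is 5000 - 4900 = 100 and reset is 86400 - 50400 =
   36000 seconds (10 hours), matching the response above.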

945	B.1.3.  Use for limiting concurrency

947	   Throttling fields may be used to limit concurrency, advertising
948	   limits that are lower than the usual ones in case of saturation, thus
949	   increasing availability.

951	   The server adopted a basic policy of 100 quota-units per minute, and
952	   in case of resource exhaustion adapts the returned values reducing
953	   both RateLimit-Limit and RateLimit-Remaining.

955	   After 2 seconds the client consumed 40 quota-units.

957	   Request:

959	   GET /items/123 HTTP/1.1
960	   Host: api.example

962	   Response:

964	   HTTP/1.1 200 Ok
965	   Content-Type: application/json
966	   RateLimit-Limit: 100
967	   RateLimit-Remaining: 60
968	   RateLimit-Reset: 58

970	   {"elapsed": 2, "issued": 40}

972	   At the subsequent request - due to resource exhaustion - the server
973	   advertises only RateLimit-Remaining: 20.

975	   Request:

977	   GET /items/123 HTTP/1.1
978	   Host: api.example

980	   Response:

982	   HTTP/1.1 200 Ok
983	   Content-Type: application/json
984	   RateLimit-Limit: 100
985	   RateLimit-Remaining: 20
986	   RateLimit-Reset: 56

988	   {"elapsed": 4, "issued": 41}

990	B.1.4.  Use in throttled responses

992	   A client exhausted its quota and the server throttles it by sending
993	   Retry-After.

995	   In this example, the values of Retry-After and RateLimit-Reset
996	   reference the same moment, but this is not a requirement.

998	   The 429 Too Many Requests HTTP status code is just used as an
999	   example.

1001	   Request:

1003	   GET /items/123 HTTP/1.1
1004	   Host: api.example

1006	   Response:

1008	   HTTP/1.1 429 Too Many Requests
1009	   Content-Type: application/json
1010	   Date: Mon, 05 Aug 2019 09:27:00 GMT
1011	   Retry-After: Mon, 05 Aug 2019 09:27:05 GMT
1012	   RateLimit-Reset: 5
1013	   RateLimit-Limit: 100
1014	   Ratelimit-Remaining: 0

1016	   {
1017	   "title": "Too Many Requests",
1018	   "status": 429,
1019	   "detail": "You have exceeded your quota"
1020	   }
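
   To compare the two fields, a client can convert a Retry-After value
   into a delay in seconds.  The sketch below uses the standard Python
   email.utils.parsedate_to_datetime function; the helper name is
   hypothetical, and measuring the delay against the Date field of the
   same response is an assumption of this sketch.

```python
import re
from email.utils import parsedate_to_datetime


def retry_after_seconds(retry_after, date):
    """Return Retry-After as a delay in seconds relative to Date."""
    value = retry_after.strip()
    if re.fullmatch(r"[0-9]+", value):
        return int(value)    # already expressed as delay-seconds
    delta = parsedate_to_datetime(value) - parsedate_to_datetime(date)
    return max(int(delta.total_seconds()), 0)


delay = retry_after_seconds(
    "Mon, 05 Aug 2019 09:27:05 GMT",
    "Mon, 05 Aug 2019 09:27:00 GMT",
)
```

   For the response above, delay is 5, the same value carried by
   RateLimit-Reset.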

1022	B.2.  Parameterized responses

1024	B.2.1.  Throttling window specified via parameter

1026	   The client has 99 quota-units left for the next 50 seconds.  The
1027	   time-window is communicated by the w parameter, so we know the
1028	   throughput is 100 quota-units per minute.

1030	   Request:

1032	   GET /items/123 HTTP/1.1
1033	   Host: api.example

1035	   Response:

1037	   HTTP/1.1 200 Ok
1038	   Content-Type: application/json
1039	   RateLimit-Limit: 100, 100;w=60
1040	   Ratelimit-Remaining: 99
1041	   Ratelimit-Reset: 50

1043	   {"hello": "world"}
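
   A simplified sketch of how a client might read such a value follows.
   The function name is hypothetical, and the naive splitting on commas
   and semicolons is an assumption that only holds for integer items
   with unquoted parameters; a complete implementation would use a
   Structured Fields parser as defined in [SF].

```python
def parse_ratelimit_limit(value):
    """Split a RateLimit-Limit value into (expiring_limit, policies)."""
    items = [item.strip() for item in value.split(",")]
    expiring_limit = int(items[0].split(";")[0])
    policies = []
    for item in items[1:]:
        limit, *params = [part.strip() for part in item.split(";")]
        parsed = {}
        for param in params:
            name, _, raw = param.partition("=")
            parsed[name] = raw
        policies.append((int(limit), parsed))
    return expiring_limit, policies


expiring_limit, policies = parse_ratelimit_limit("100, 100;w=60")
limit, params = policies[0]
units_per_minute = limit * 60 / int(params["w"])
```

   With w=60, the policy of 100 quota-units corresponds to 100
   quota-units per minute.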

1045	B.2.2.  Dynamic limits with parameterized windows

1047	   The policy conveyed by RateLimit-Limit states that the server accepts
1048	   100 quota-units per minute.

1050	   To avoid resource exhaustion, the server artificially lowers the
1051	   actual limits returned in the throttling headers.

1053	   The RateLimit-Remaining then advertises only 9 quota-units for the
1054	   next 50 seconds to slow down the client.

1056	   Note that the server could have lowered even the other values in
1057	   RateLimit-Limit: this specification does not mandate any relation
1058	   between the field values contained in subsequent responses.

1060	   Request:

1062	   GET /items/123 HTTP/1.1
1063	   Host: api.example

1065	   Response:

1067	   HTTP/1.1 200 Ok
1068	   Content-Type: application/json
1069	   RateLimit-Limit: 10, 100;w=60
1070	   Ratelimit-Remaining: 9
1071	   Ratelimit-Reset: 50

1073	   {
1074	     "status": 200,
1075	     "detail": "Just slow down without waiting."
1076	   }

1078	B.2.3.  Dynamic limits for pushing back and slowing down

1080	   Continuing the previous example, let's say the client waits 10
1081	   seconds and performs a new request which, due to resource exhaustion,
1082	   the server rejects and pushes back, advertising RateLimit-Remaining:
1083	   0 for the next 20 seconds.

1085	   The server advertises a smaller window with a lower limit to slow
1086	   down the client for the rest of its original window after the 20
1087	   seconds elapse.

1089	   Request:

1091	   GET /items/123 HTTP/1.1
1092	   Host: api.example

1094	   Response:

1096	   HTTP/1.1 429 Too Many Requests
1097	   Content-Type: application/json
1098	   RateLimit-Limit: 0, 15;w=20
1099	   Ratelimit-Remaining: 0
1100	   Ratelimit-Reset: 20

1102	   {
1103	     "status": 429,
1104	     "detail": "Wait 20 seconds, then slow down!"
1105	   }

1107	B.3.  Dynamic limits for pushing back with Retry-After and slow down

1109	   Alternatively, starting from the same context as the previous
1110	   example, we can convey the same information to the client via
1111	   Retry-After.  The advantage is that the server can now advertise the
1112	   policy's nominal limit and window that will apply after the reset,
1113	   e.g. assuming the resource exhaustion is likely to be gone by then.
1114	   The advertised policy thus does not need to be adjusted, yet the
1115	   server still stops requests for a while and slows down the client
1116	   for the rest of the current window.

1118	   Request:

1120	   GET /items/123 HTTP/1.1
1121	   Host: api.example

1123	   Response:

1125	   HTTP/1.1 429 Too Many Requests
1126	   Content-Type: application/json
1127	   Retry-After: 20
1128	   RateLimit-Limit: 15, 100;w=60
1129	   Ratelimit-Remaining: 15
1130	   Ratelimit-Reset: 40

1132	   {
1133	     "status": 429,
1134	     "detail": "Wait 20 seconds, then slow down!"
1135	   }

1137	   Note that in this last response the client is expected to honor
1138	   Retry-After and perform no requests for the specified amount of time,
1139	   whereas the previous example would not force the client to stop
1140	   requests before the reset time has elapsed: it would still be free
1141	   to query the server again, even if it is likely to have the request
1142	   rejected.
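
   One possible client strategy for such a response is sketched below:
   wait for Retry-After, then spread the remaining quota-units evenly
   over what is left of the window.  The function name is hypothetical,
   and the sketch assumes that each request costs one quota-unit; this
   document does not mandate any particular pacing.

```python
def pacing_plan(remaining, reset, retry_after=0):
    """Return (initial_wait, interval) in seconds.

    interval is None when no request can be sent before the reset.
    """
    usable = max(reset - retry_after, 0)
    if remaining <= 0 or usable == 0:
        return max(reset, retry_after), None
    return retry_after, usable / remaining


initial_wait, interval = pacing_plan(remaining=15, reset=40, retry_after=20)
```

   For the response above, the client waits 20 seconds and then sends
   at most one request every 20/15 seconds (about 1.33 seconds) until
   the window resets.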

1144	B.3.1.  Missing Remaining information

1146	   The server does not expose RateLimit-Remaining values (for example,
1147	   because the underlying counters are not available).  Instead, it
1148	   resets the limit counter every second.

1150	   It communicates to the client the limit of 10 quota-units per second
1151	   by always returning the pair RateLimit-Limit and RateLimit-Reset.

1153	   Request:

1155	   GET /items/123 HTTP/1.1
1156	   Host: api.example

1158	   Response:

1160	   HTTP/1.1 200 Ok
1161	   Content-Type: application/json
1162	   RateLimit-Limit: 10
1163	   Ratelimit-Reset: 1

1165	   {"first": "request"}

1167	   Request:

1169	   GET /items/123 HTTP/1.1
1170	   Host: api.example

1172	   Response:

1174	   HTTP/1.1 200 Ok
1175	   Content-Type: application/json
1176	   RateLimit-Limit: 10
1177	   Ratelimit-Reset: 1

1179	   {"second": "request"}

1181	B.3.2.  Use with multiple windows

1183	   This is a standardized way of describing the policy detailed in
1184	   Appendix B.1.2:

1186	   *  5000 daily quota-units;

1188	   *  1000 hourly quota-units.

1190	   The client consumed 4900 quota-units in the first 14 hours.

1192	   Despite the next hourly limit of 1000 quota-units, the closest limit
1193	   to reach is the daily one.

1195	   The server then exposes the RateLimit fields to inform the client
1196	   that:

1198	   *  it has only 100 quota-units left;

1200	   *  the window will reset in 10 hours;

1202	   *  the expiring-limit is 5000.

1204	   Request:

1206	   GET /items/123 HTTP/1.1
1207	   Host: api.example

1209	   Response:

1211	   HTTP/1.1 200 OK
1212	   Content-Type: application/json
1213	   RateLimit-Limit: 5000, 1000;w=3600, 5000;w=86400
1214	   RateLimit-Remaining: 100
1215	   RateLimit-Reset: 36000

1217	   {"hello": "world"}
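
   On the server side, the RateLimit-Limit value above can be assembled
   from the expiring-limit and the list of policies.  This is a minimal
   sketch with a hypothetical function name; it only emits the w
   parameter.

```python
def format_ratelimit_limit(expiring_limit, policies):
    """Serialize an expiring-limit and (limit, window_seconds) pairs."""
    parts = [str(expiring_limit)]
    parts += [f"{limit};w={window}" for limit, window in policies]
    return ", ".join(parts)


value = format_ratelimit_limit(5000, [(1000, 3600), (5000, 86400)])
```

   The resulting value is the one shown in the response above.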

1219	FAQ

1221	   _RFC Editor: Please remove this section before publication._

1223	   1.  Why define standard fields for throttling?

1225	       To simplify enforcement of throttling policies.

1227	   2.  Can I use RateLimit-* in throttled responses (e.g. with status
1228	       code 429)?

1230	       Yes, you can.

1232	   3.  Are those specs tied to RFC 6585?

1234	       No.  [RFC6585] defines the 429 status code and we use it just as
1235	       an example of a throttled request, which could instead use 403
1236	       or any other status code.  The goal of this specification is
1237	       to standardize the name and semantics of three ratelimit fields
1238	       widely used on the internet.  Stricter relations with status
1239	       codes or error response payloads would impose behaviors on all
1240	       the existing implementations, making adoption more complex.

1242	   4.  Why not pass the throttling scope as a parameter?

1244	       The word "scope" can have different meanings: for example it can
1245	       be a URL, or an authorization scope.  Since authorization is out
1246	       of the scope of this document (see Section 1.1), and since we
1247	       rely only on [SEMANTICS], in Section 1.1 we defined "scope" in
1248	       terms of URL.

1250	       Since clients are not required to process quota policies (see
1251	       Section 4), we could add a new "RateLimit-Scope" field to this
1252	       spec.  See this discussion on a similar thread
1253	       (https://github.com/httpwg/http-core/pull/317#issuecomment-
1254	       585868767)

1256	       Specific ecosystems can still bake their own prefixed parameters,
1257	       such as acme-auth-scope or acme-url-scope and ensure that clients
1258	       process them.  This behavior cannot be relied upon when
1259	       communicating between different ecosystems.

1261	       We are open to suggestions: comment on this issue
1262	       (https://github.com/ioggstream/draft-polli-ratelimit-headers/
1263	       issues/70)

1265	   5.  Why use delay-seconds instead of a UNIX Timestamp?  Why not use
1266	       subsecond precision?

1267	       Using delay-seconds aligns with Retry-After, which is returned in
1268	       similar contexts, e.g. on 429 responses.

1270	       Timestamps require a clock synchronization protocol (see
1271	       Section 5.6.7 of [SEMANTICS]).  This may be problematic (e.g.
1272	       clock adjustment, clock skew, failure of hardcoded clock
1273	       synchronization servers, IoT devices).  Moreover, timestamps
1274	       may not be monotonically increasing due to clock adjustment.  See
1275	       Another NTP client failure story
1276	       (https://community.ntppool.org/t/another-ntp-client-failure-
1277	       story/1014/)

1279	       We did not use subsecond precision because:

1281	       *  that is more subject to system clock correction like the one
1282	          implemented via the adjtimex() Linux system call;

1284	       *  response-time latency may not make it worthwhile.  A brief
1285	          discussion on the subject is on the httpwg ml
1286	          (https://lists.w3.org/Archives/Public/ietf-http-
1287	          wg/2019JulSep/0202.html)

1289	       *  almost all rate-limit header implementations do not use it.
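
       As an illustration of the first point, a client can turn a
       delay-seconds value into a local deadline on a monotonic clock,
       which is not affected by wall-clock adjustments.  The class name
       is hypothetical; time.monotonic is the standard Python
       monotonic clock.

```python
import time


class ResetTimer:
    """Track a RateLimit-Reset delay against a monotonic clock."""

    def __init__(self, reset_seconds, clock=time.monotonic):
        self._clock = clock
        # Deadline computed when the response is received.
        self._deadline = clock() + reset_seconds

    def seconds_left(self):
        return max(self._deadline - self._clock(), 0.0)


timer = ResetTimer(50)
```

       The clock argument exists only so that the sketch can be
       exercised with a fake clock.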

1291	   6.  Why not support multiple remaining quotas?

1293	       While this might be of some value, my experience suggests that
1294	       overly-complex quota implementations result in lower
1295	       effectiveness of this policy.  This spec allows the client to
1296	       easily focus on RateLimit-Remaining and RateLimit-Reset.

1298	   7.  Shouldn't I limit concurrency instead of request rate?

1300	       You can use this specification to limit concurrency at the HTTP
1301	       level (see Appendix B.1.3) and help clients shape their requests
1302	       to avoid being throttled out.

1304	       A problematic way to limit concurrency is connection dropping,
1305	       especially when connections are multiplexed (e.g. HTTP/2)
1306	       because this results in unserviced client requests, which is
1307	       something we want to avoid.

1309	       A semantic way to limit concurrency is to return 503 + Retry-
1310	       After in case of resource saturation (e.g. thrashing, connection
1311	       queues too long, Service Level Objectives not met).
1312	       Saturation conditions can be either dynamic or static: all this
1313	       is out of scope for the current document.

1315	   8.  Does a positive value of RateLimit-Remaining imply any service
1316	       guarantee for my future requests to be served?

1318	       No.  FAQ integrated in Section 5.2.

1320	   9.  Is the quota-policy definition in Section 2.3 too complex?

1322	       You can always return the simplest form of the 3 fields:

1324	   RateLimit-Limit: 100
1325	   RateLimit-Remaining: 50
1326	   RateLimit-Reset: 60

1328	   The key runtime value is the first element of the list, the
1329	   expiring-limit; the other quota-policy items are informative.  So
1330	   for the following field:

1332	   RateLimit-Limit: 100, 100;w=60;burst=1000;comment="sliding window", 5000;w=3600;burst=0;comment="fixed window"

1334	   the key value is the one referencing the lowest limit: 100.

1336	   10. Can we use shorter names?  Why not put everything in one field?

1338	   The most common syntax we found on the web is X-RateLimit-* and when
1339	   starting this I-D we opted for it (https://github.com/ioggstream/
1340	   draft-polli-ratelimit-headers/issues/34#issuecomment-519366481)

1342	   The basic form of those fields is easily parseable, even by
1343	   implementers processing responses using technologies like dynamic
1344	   interpreters with limited syntax.

1346	   Using a single field complicates parsing and takes a significantly
1347	   different approach from the existing ones: this can limit adoption.

1349	   11. Why not mention connections?

1351	       Beware of the term "connection":

1352	       *  it is just _one_ possible saturation cause.  Once you go down
1353	          that path you will expose other infrastructural details
1354	          (bandwidth, CPU, etc.; see Section 6.2) and complicate client
1355	          compliance;

1356	       *  it is an infrastructural detail defined in terms of server and
1357	          network rather than the consumed service.  This specification
1358	          protects the services first, and then the infrastructures
1359	          through client cooperation (see Section 6.1).

1360	       RateLimit fields enable sending _on the same connection_
1361	       different limit values on each response, depending on the policy
1362	       scope (e.g. per-user, per-custom-key).

1364	   12. Can intermediaries alter RateLimit fields?

1366	       Generally, they should not because it might result in unserviced
1367	       requests.  There are reasonable use cases for intermediaries
1368	       mangling RateLimit fields though, e.g. when they enforce stricter
1369	       quota-policies, or when they are an active component of the
1370	       service.  In those cases we will consider them as part of the
1371	       originating infrastructure.

1373	   13. Why is the w parameter just informative?  Could it be used by a
1374	       client to determine the request rate?

1376	       A non-informative w parameter might be fine in an environment
1377	       where clients and servers are tightly coupled.  Conveying
1378	       policies with this detail on a large scale would be very complex
1379	       and implementations would likely not be interoperable.  We thus
1380	       decided to leave w as an informational parameter and only rely on
1381	       RateLimit-Limit, RateLimit-Remaining and RateLimit-Reset for
1382	       defining the throttling behavior.

1384	RateLimit fields currently used on the web

1386	   _RFC Editor: Please remove this section before publication._

1388	   Commonly used header field names are:

1390	   *  X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset;

1392	   *  X-Rate-Limit-Limit, X-Rate-Limit-Remaining, X-Rate-Limit-Reset.

1394	   There are variants too, where the window is specified in the header
1395	   field name, e.g.:

1397	   *  x-ratelimit-limit-minute, x-ratelimit-limit-hour, x-ratelimit-
1398	      limit-day

1400	   *  x-ratelimit-remaining-minute, x-ratelimit-remaining-hour, x-
1401	      ratelimit-remaining-day

1403	   Here are some interoperability issues:

1405	   *  X-RateLimit-Reset references different values, depending on
1406	      the implementation:

1408	      -  seconds remaining to the window expiration

1410	      -  milliseconds remaining to the window expiration

1411	      -  seconds since the Epoch, as a UNIX Timestamp [UNIX]

1413	      -  a datetime, either IMF-fixdate [SEMANTICS] or [RFC3339]

1415	   *  different headers, with the same semantics, are used by different
1416	      implementers:

1418	      -  X-RateLimit-Limit and X-Rate-Limit-Limit

1420	      -  X-RateLimit-Remaining and X-Rate-Limit-Remaining

1422	      -  X-RateLimit-Reset and X-Rate-Limit-Reset

1424	   The semantics of RateLimit-Remaining depend on the windowing
1425	   algorithm.  A sliding window policy, for example, may result in
1426	   having a RateLimit-Remaining value related to the ratio between the
1427	   current and the maximum throughput, e.g.:

1429	   RateLimit-Limit: 12, 12;w=1
1430	   RateLimit-Remaining: 6          ; using 50% of throughput, that is 6 units/s
1431	   RateLimit-Reset: 1

1433	   If this is the case, the optimal solution is to achieve

1435	   RateLimit-Limit: 12, 12;w=1
1436	   RateLimit-Remaining: 1          ; using 100% of throughput, that is 12 units/s
1437	   RateLimit-Reset: 1

1439	   At this point you should stop increasing your request rate.

1441	Acknowledgements

1443	   Thanks to Willi Schoenborn, Alejandro Martinez Ruiz, Alessandro
1444	   Ranellucci, Amos Jeffries, Martin Thomson, Erik Wilde and Mark
1445	   Nottingham for being the initial contributors to these
1446	   specifications.  Kudos to the first community implementers: Aapo
1447	   Talvensaari, Nathan Friedly and Sanyam Dogra.

1449	   In addition to the people above, this document owes a lot to the
1450	   extensive discussion in the HTTPAPI workgroup, including Rich Salz,
1451	   Darrel Miller and Julian Reschke.

1453	Changes

1455	   _RFC Editor: Please remove this section before publication._

1457	Since draft-ietf-httpapi-ratelimit-headers-01

1458	   *  Update IANA considerations #60

1460	   *  Use Structured fields #58

1462	   *  Reorganize document #67

1464	Since draft-ietf-httpapi-ratelimit-headers-00

1466	   *  Use I-D.httpbis-semantics, which includes referencing delay-
1467	      seconds instead of delta-seconds. #5

1469	Authors' Addresses

1471	   Roberto Polli
1472	   Team Digitale, Italian Government
1473	   Italy
1474	   Email: robipolli@gmail.com

1476	   Alejandro Martinez Ruiz
1477	   Red Hat
1478	   Email: amr@redhat.com