idnits 2.17.1 

draft-ietf-lmap-router-buffer-sizes-ksubram-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  == The document has an IETF Trust Provisions of 28 Dec 2009, Section 6.c(i)
     Publication Limitation clause.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (October 27, 2014) is 3468 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Obsolete informational reference (is this intentional?): RFC  793 (ref.
     'Postel') (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC 2893
     (Obsoleted by RFC 4213)


     Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Internet Engineering Task Force                      K. Subramaniam, Ed.
3	Internet-Draft                                                  D. Loher
4	Intended status: Informational                                 Microsoft
5	Expires: April 30, 2015                                 October 27, 2014

7	                     Router Buffer Sizes In The WAN
8	             draft-ietf-lmap-router-buffer-sizes-ksubram-00

10	Abstract

12	   This draft identifies the set of data that needs to be collected and
13	   analyzed to quantify router buffer sizes used in routers in the Wide
14	   Area Network (WAN).  The scope of this draft is limited to WAN links
15	   that have link latencies of 40 to 150 milliseconds.

17	   Reducing router buffer sizes has many advantages, the most important
18	   being cost.  However, there is not much data available today to
19	   effectively calculate the required size.  This draft details use
20	   cases for the study and lists the data that needs to be taken into
21	   consideration to quantify the size of router buffers.  The details
22	   of the individual measurement metrics are beyond the scope of this
23	   document, and the draft does not identify methods to gather the
24	   data.  What it identifies is a need to be able to collect and report
25	   this empirical data in a readable fashion, thus providing the ability
26	   to study and compare data in a more standardized manner.

28	Status of This Memo

30	   This Internet-Draft is submitted in full conformance with the
31	   provisions of BCP 78 and BCP 79.

33	   Internet-Drafts are working documents of the Internet Engineering
34	   Task Force (IETF).  Note that other groups may also distribute
35	   working documents as Internet-Drafts.  The list of current Internet-
36	   Drafts is at http://datatracker.ietf.org/drafts/current/.

38	   Internet-Drafts are draft documents valid for a maximum of six months
39	   and may be updated, replaced, or obsoleted by other documents at any
40	   time.  It is inappropriate to use Internet-Drafts as reference
41	   material or to cite them other than as "work in progress."

43	   This Internet-Draft will expire on April 30, 2015.

45	Copyright Notice

47	   Copyright (c) 2014 IETF Trust and the persons identified as the
48	   document authors.  All rights reserved.

50	   This document is subject to BCP 78 and the IETF Trust's Legal
51	   Provisions Relating to IETF Documents
52	   (http://trustee.ietf.org/license-info) in effect on the date of
53	   publication of this document.  Please review these documents
54	   carefully, as they describe your rights and restrictions with respect
55	   to this document.  Code Components extracted from this document must
56	   include Simplified BSD License text as described in Section 4.e of
57	   the Trust Legal Provisions and are provided without warranty as
58	   described in the Simplified BSD License.

60	   This document may not be modified, and derivative works of it may not
61	   be created, except to format it for publication as an RFC or to
62	   translate it into languages other than English.

64	Table of Contents

66	   1.  Introduction
67	   2.  Terminology
68	   3.  Use Case
69	     3.1.  Discards with small buffer sizes
70	     3.2.  Discards with large buffer sizes
71	   4.  List of required data for study of router buffer sizes
72	     4.1.  Number of concurrent flows, N
73	     4.2.  Length of a flow, L
74	     4.3.  Packet Discards, D
75	     4.4.  Reason for Packet Discards, R
76	     4.5.  Resolution of time interval, T
77	     4.6.  5 Tuple Flow Identity, I
78	   5.  Conclusion
79	   6.  Acknowledgements
80	   7.  IANA Considerations
81	   8.  Security Considerations
82	   9.  References
83	     9.1.  Normative References
84	     9.2.  Informative References
85	   Appendix A.  Additional Stuff
86	   Authors' Addresses

88	1.  Introduction

90	   "How much buffering do core links need?" is a question that has been
91	   under study for a while.  The question boils down to quantifying the
92	   buffer size that still achieves 100% link utilization with maximum
93	   throughput at a feasible cost.

95	   Buffer design can substantially increase costs.  While
96	   over-buffering seems intuitive, it can complicate the design of
97	   high-speed routers and lead to higher power consumption, more board
98	   space, and lower density.  It can also increase end-to-end delay in
99	   the presence of congestion, which can make congestion more
100	   persistent.  Additionally, there is always a tradeoff between buffer
101	   sizes and the capacity of a router.

103	   On the other hand, under-buffering, while avoiding the above
104	   drawbacks of over-buffering, could lead us away from our primary goal
105	   of 100 percent link utilization.  This could happen in a scenario
106	   using simple Additive Increase Multiplicative Decrease (AIMD) for TCP
107	   flows, when the sender has packets to send but the window size
108	   advertised is small and, as a result, the receiver consumes far less
109	   than it could.

111	   The rule of thumb for router buffers has been defined as
112	   [Villamizar]: B = RTT * C, where B is the buffer size, RTT the Round
113	   Trip Time, and C the capacity of the bottleneck link.  [RFC3429] also
114	   talks about the buffer size being at least one TCP window size.

116	   However, later studies [Appenzeller] show that the rule of thumb
117	   holds only for a single flow or for a large number of perfectly
118	   synchronized flows.  Further, they postulate that the buffer size
119	   needed is actually (RTT * C)/sqrt(n), where n is the number of flows.
120	   This indicates a significant reduction in buffer memory, promoting
121	   lower costs.
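
   As a rough illustration of the difference between the two sizing
   approaches, the following minimal sketch computes both values.  It
   takes the baseline as the product of the round-trip delay and the
   link capacity, as in [Appenzeller]; the function names and the link
   parameters (10 Gb/s, 100 ms, 10,000 flows) are hypothetical and are
   not measurements from any network.

```python
import math


def baseline_buffer_bits(rtt_s: float, capacity_bps: float) -> float:
    """Rule-of-thumb buffer: round-trip delay times link capacity, in bits."""
    return rtt_s * capacity_bps


def reduced_buffer_bits(rtt_s: float, capacity_bps: float, n_flows: int) -> float:
    """Baseline divided by sqrt(n), where n is the number of flows."""
    if n_flows < 1:
        raise ValueError("n_flows must be at least 1")
    return baseline_buffer_bits(rtt_s, capacity_bps) / math.sqrt(n_flows)


# Hypothetical link: 10 Gb/s capacity, 100 ms round-trip time, 10,000 flows.
rtt_s = 0.100
capacity_bps = 10e9
n_flows = 10_000

baseline = baseline_buffer_bits(rtt_s, capacity_bps)
reduced = reduced_buffer_bits(rtt_s, capacity_bps, n_flows)

print(f"baseline: {baseline / 8 / 1e6:.2f} MB")  # about 125 MB
print(f"reduced:  {reduced / 8 / 1e6:.2f} MB")   # about 1.25 MB
```

   With these assumed inputs the reduction is a factor of sqrt(n) = 100,
   which is why the number of concurrent flows matters so much to the
   study.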

123	   As seen, there have been proponents of both large and small buffers.
124	   However, most of these studies are based on theoretical models and
125	   simulations.  Today, there is no model or protocol to mine big data
126	   from a provider's network to be able to answer this question
127	   efficiently.  The nature of WAN traffic can be uncertain and varying.
128	   Furthermore, the traffic can vary vastly between individual ISPs.
129	   This document argues the need for a model for mining empirical big
130	   data in a provider's network, in order to build a network that drives
131	   down the $/GB while maximizing link utilization.

133	   This document outlines use cases for the study of router buffer sizes
134	   in the WAN and identifies the data that needs to be collected and
135	   analyzed.  The study could be extended to the edge and to
136	   datacenters, but that is outside the scope of this draft.

138	2.  Terminology

140	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
141	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
142	   document are to be interpreted as described in RFC 2119 [RFC2119].

144	3.  Use Case

146	   From an operator's perspective it is imperative to monitor discards
147	   and link utilization over WAN links to be able to study the router
148	   buffer sizes.  But these alone will be unable to provide an operator
149	   with enough information as to why the discards happened.  The two use
150	   cases outlined here argue that more data needs to be collected,
151	   reported, and analyzed.

153	3.1.  Discards with small buffer sizes

155	   Trans-Pacific and trans-Atlantic links, with latencies in the range
156	   of 150 ms and 90 ms respectively, low link utilization of 30 percent,
157	   and small buffers, have seen dropped packets.  The most intuitive
158	   response has been to increase the buffer sizes for these links on
159	   noticing packet discards.  While this might alleviate the issue
160	   temporarily, unless the right problem has been identified this could
161	   readily lead to bufferbloat, which has many issues of its own.

163	3.2.  Discards with large buffer sizes

165	   Operators have also observed dropped packets on WAN links within
166	   North America with buffers as large as 125 MB per port and link
167	   utilizations of 60%.  If this happens even though the router has not
168	   been specifically configured to drop certain types of packets and
169	   there are no routing misconfigurations, then clearly the issue here
170	   is not the size of the router buffer.

172	4.  List of required data for study of router buffer sizes

174	   This section lists the absolute minimum set of data that needs to be
175	   collected to be able to effectively quantify router buffer size.

178	   +---+-------------------------+-------------------------------------+
179	   |   |           Data          |               Details               |
180	   +---+-------------------------+-------------------------------------+
181	   | 1 |   Number of concurrent  |        For aggregate traffic        |
182	   |   |         flows, N        |                                     |
183	   | 2 |  Length of the flow, L  |  [Flow end time - flow start time]  |
184	   | 3 |    Packet Discards, D   |            Per Interface            |
185	   | 4 |    Reason for Packet    |   Buffer overflow, configuration,   |
186	   |   |       Discards, R       |                 etc.                |
187	   | 5 |    Resolution of Time   |  [Flow start time - flow end time]  |
188	   |   |       Interval, T       |                                     |
189	   | 6 |  5 tuple flow identity, |   Src IP, Dest IP, Src port, Dest   |
190	   |   |            I            |           Port, Protocol.           |
191	   +---+-------------------------+-------------------------------------+

193	          Table 1: List of required data for Router Buffer Sizes

195	   A service provider needs to take into consideration several
196	   attributes to determine the right buffer size for its WAN routers.
197	   This section explains why the six items above have been identified
198	   as the minimum essential data needed to aid the study of router
199	   buffer sizes.

201	4.1.  Number of concurrent flows, N

203	   Studies [Feldmann] and [Stevens] show that 95% of flows in the
204	   Internet today are TCP [Postel] flows.  The nature of these flows
205	   can vary significantly, not only across time periods but also
206	   between providers.  Flows that spend most of their time in slow-start
207	   require significantly less buffering than flows that live mostly in
208	   congestion avoidance.  For this reason it is important to identify
209	   the types of concurrent flows that can live on a WAN link.

212	   Short (non-persistent) flows are those that live for less than one
213	   RTT, and large (persistent) flows are those whose lifetime is longer
214	   than one RTT with congestion overhead.  Internet measurements [Avra]
215	   show that short flows account for most TCP sessions, while a smaller
216	   number of large flows carry most of the packets and are known to
217	   have a larger effect on buffer sizes.  This mix of flows could in
218	   turn have an effect on Round Trip Time (RTT), loss probability, and
219	   flow lengths.  The ability to detect large flows is necessary
220	   because, while the flows can be constant in steady state, the
221	   aggregate traffic can keep changing due to varying arrival and
222	   departure rates.  There needs to be a way for the number of
223	   concurrent flows to be collected and analyzed with the granularity of
224	   the lifetime of short flows, as low as one millisecond.
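
   One way to picture the collection of N at fine granularity is the
   following minimal sketch, which counts the flows active in each time
   bin.  The input format (integer start and end timestamps in
   milliseconds, treated as a half-open interval) and the function name
   are assumptions made for illustration; this draft does not define an
   export format.

```python
from collections import Counter


def concurrent_flows(flows, bin_ms=1):
    """Count active flows per time bin.

    flows:  iterable of (start_ms, end_ms) integer pairs; each flow is
            treated as active over the half-open interval [start, end).
    bin_ms: bin width in milliseconds (1 ms matches short-flow lifetimes).
    Returns a dict mapping bin index to the number of active flows.
    """
    counts = Counter()
    for start_ms, end_ms in flows:
        first = start_ms // bin_ms
        # A zero-length flow is still counted once, in its starting bin.
        last = max(first, (end_ms - 1) // bin_ms)
        for b in range(first, last + 1):
            counts[b] += 1
    return dict(counts)


# Hypothetical flows: two short ones overlapping a longer one.
example = [(0, 3), (1, 2), (2, 10)]
per_bin = concurrent_flows(example)
print(per_bin[1], per_bin[2])  # 2 flows active in bin 1 and in bin 2
```

   This per-bin loop is the simplest form to read; a collector handling
   many long flows would more likely use a sweep over sorted start and
   end events.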

226	4.2.  Length of a flow, L

228	   Length of a flow can be defined as its duration, [flow stop time -
229	   flow start time], or as the number of packets/bytes sent in this
230	   duration.  Identifying the length of flows in a provider's network
231	   will give information about the mix of short and large flows that are
232	   present in the WAN.  This has implications for modeling TCP flow
233	   control.
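
   The two definitions of flow length can be carried in a single record,
   as in the minimal sketch below.  The record layout, the field names,
   and the classification threshold of exactly one RTT are assumptions
   made for illustration; in particular the sketch ignores the
   "congestion overhead" mentioned in Section 4.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical per-flow record; not an IPFIX or sFlow structure."""
    start_ms: int
    end_ms: int
    packets: int
    octets: int

    @property
    def duration_ms(self) -> int:
        """Flow length as a duration: stop time minus start time."""
        return self.end_ms - self.start_ms


def classify(flow: FlowRecord, rtt_ms: int) -> str:
    """Label a flow 'short' if it lives less than one RTT, else 'large'."""
    return "short" if flow.duration_ms < rtt_ms else "large"


# Hypothetical flows on a link with a 90 ms RTT.
web_fetch = FlowRecord(start_ms=0, end_ms=40, packets=12, octets=9_000)
bulk_copy = FlowRecord(start_ms=0, end_ms=60_000, packets=80_000, octets=120_000_000)
print(classify(web_fetch, rtt_ms=90), classify(bulk_copy, rtt_ms=90))
```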

235	4.3.  Packet Discards, D

237	   The number of packet discards per interface is probably the most
238	   important metric.  Of these, discards on outward (WAN) facing
239	   interfaces are the most relevant to the study of buffer sizes.
240	   Interface discard counters are described in [RFC2893].

242	4.4.  Reason for Packet Discards, R

244	   There can be several reasons for packet discards, especially when
245	   they are observed on lightly utilized links.  Some discards could be
246	   due to routing misconfigurations, or to configurations designed to
247	   drop certain packets.  Clearly stating the reason as insufficient
248	   buffer will help narrow down the data required.  This is especially
249	   true in the case of smart buffer allocation, when some ports run out
250	   of buffers but not others.  For example, a port may have been
251	   allocated only, say, 30 percent of the available total buffer space
252	   while experiencing the highest utilization, and see packet drops as
253	   a result.  This would point to the dynamic buffer allocation scheme
254	   not being adaptive and predictive with respect to the nature of the
255	   WAN traffic.
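
   To illustrate why recording R alongside D narrows the analysis, the
   minimal sketch below tallies discards per interface and reason and
   reports the share attributable to buffer exhaustion.  The reason
   categories are taken from the text above; the enum, the event tuple
   layout, and the interface names are hypothetical.

```python
from collections import Counter
from enum import Enum


class DiscardReason(Enum):
    """Illustrative reason codes; not a standardized registry."""
    BUFFER_OVERFLOW = "buffer overflow"
    CONFIGURED_DROP = "configuration"
    ROUTING_MISCONFIGURATION = "routing misconfiguration"
    OTHER = "other"


def discards_by_reason(events):
    """Sum discarded packets per (interface, reason).

    events: iterable of (interface, DiscardReason, packet_count).
    """
    totals = Counter()
    for interface, reason, packets in events:
        totals[(interface, reason)] += packets
    return totals


def buffer_overflow_share(totals, interface):
    """Fraction of an interface's discards caused by buffer overflow."""
    per_reason = {r: n for (i, r), n in totals.items() if i == interface}
    total = sum(per_reason.values())
    if total == 0:
        return 0.0
    return per_reason.get(DiscardReason.BUFFER_OVERFLOW, 0) / total


# Hypothetical discard events on two WAN-facing interfaces.
events = [
    ("wan0", DiscardReason.BUFFER_OVERFLOW, 30),
    ("wan0", DiscardReason.CONFIGURED_DROP, 10),
    ("wan1", DiscardReason.OTHER, 5),
]
totals = discards_by_reason(events)
print(buffer_overflow_share(totals, "wan0"))  # 0.75
```

   Only the buffer overflow share is relevant to buffer sizing; the
   remaining discards point to configuration or routing and can be
   excluded from the study.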

257	4.5.  Resolution of time interval, T

259	   The time interval should be fine-grained enough to capture not only
260	   the number of concurrent flows in steady state but also the aggregate
261	   traffic over the lifetime of a short flow.  It should also make it
262	   possible to correlate the discards per interface with the number of
263	   concurrent flows.

265	   Today, via IPFIX, we can calculate the number of concurrent flows.
266	   Via sFlow counters or flow samples, we can calculate the discards.
267	   With counters, a change can take up to twice the configured export
268	   interval to become visible, due to the Nyquist rate.  Reducing the
269	   counter export interval would increase the responsiveness, but at the
270	   cost of increased overhead and reduced scalability.  On the other
271	   hand, packet sampling automatically allocates monitoring resources to
272	   busy links, providing a highly scalable way to quickly detect traffic
273	   flows wherever they occur in the network.  Responsiveness is
274	   important for more stable control.
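
   The correlation of discards with concurrent flows, and the cost of a
   coarse counter interval, can be sketched as follows.  The sketch
   assumes that both series have already been reduced to dictionaries
   keyed by the same time-bin index; that representation and the
   function names are hypothetical and are not part of IPFIX or sFlow.

```python
def worst_case_visibility_s(export_interval_s):
    """Upper bound on the time for a change to show up in counters.

    Follows the statement above that a change can take up to twice the
    configured export interval to become visible.
    """
    return 2 * export_interval_s


def join_by_bin(flow_counts, discard_counts):
    """Align N and D on a common time axis.

    flow_counts:    dict of bin index -> number of concurrent flows (N)
    discard_counts: dict of bin index -> discarded packets (D)
    Returns a sorted list of (bin, N, D); missing values default to 0.
    """
    bins = sorted(set(flow_counts) | set(discard_counts))
    return [(b, flow_counts.get(b, 0), discard_counts.get(b, 0)) for b in bins]


# Hypothetical series for three consecutive bins on one interface.
n_per_bin = {0: 3, 1: 5}
d_per_bin = {1: 2, 2: 1}
for row in join_by_bin(n_per_bin, d_per_bin):
    print(row)

print(worst_case_visibility_s(30))  # a 30 s export interval: up to 60 s
```

   The join is only meaningful when T is the same for both series, which
   is the reason the resolution is listed as required data in its own
   right.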

276	4.6.  5 Tuple Flow Identity, I

278	   A 5-tuple flow is identified by source IP, destination IP, source
279	   port, destination port, and protocol, which identify the endpoints
280	   of a unidirectional flow.  Having this functionality gives the
281	   network operator a way to identify offending flows, legitimate
282	   elephant flows, and high-priority flows that may occur at certain
283	   periods during the day.  Being able to separate traffic using the
284	   5-tuple further increases the strength of the sample set of empirical
285	   data available for the study of router buffer sizes.
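
   A minimal sketch of separating traffic by 5-tuple follows.  The key
   type, the sample format of (key, octets) pairs, and the addresses
   (taken from the documentation ranges) are assumptions made for
   illustration only.

```python
from collections import defaultdict
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Identity of a unidirectional flow."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IANA protocol number, e.g. 6 for TCP, 17 for UDP


def octets_per_flow(samples):
    """Sum octets per 5-tuple from an iterable of (FiveTuple, octets)."""
    totals = defaultdict(int)
    for key, octets in samples:
        totals[key] += octets
    return dict(totals)


def top_flows(totals, k=1):
    """Return the k flows carrying the most octets, largest first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:k]


# Hypothetical samples: one large transfer and one small exchange.
bulk = FiveTuple("192.0.2.1", "198.51.100.2", 50000, 443, 6)
dns = FiveTuple("192.0.2.7", "198.51.100.53", 53000, 53, 17)
samples = [(bulk, 1_500_000), (dns, 120), (bulk, 2_500_000)]

totals = octets_per_flow(samples)
print(top_flows(totals, k=1)[0][0].dst_port)  # 443, the bulk transfer
```

   Because the key is unidirectional, the reverse direction of the same
   conversation appears as a separate flow with source and destination
   swapped.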

287	5.  Conclusion

289	   We see that there are numerous issues at different layers that have
290	   an effect (directly or indirectly) on the sizing of router buffers.
291	   We also note that there is no study that takes empirical data into
292	   consideration.  Ideally, what would be required is an all-knowing
293	   oracle that sees the traffic flow on an end-to-end network across all
294	   layers.  In the absence of such a resource, the first step in the
295	   study of router buffer sizes is to effectively mine the big data
296	   repository of a provider for the data identified in this draft.

298	6.  Acknowledgements

300	7.  IANA Considerations

302	   This memo includes no request to IANA.

311	8.  Security Considerations

313	   This document does not introduce new security issues.

315	9.  References
316	9.1.  Normative References

318	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
319	              Requirement Levels", BCP 14, RFC 2119, March 1997.

321	9.2.  Informative References

323	   [Appenzeller]
324	              G. Appenzeller, I. Keslassy, and N. McKeown, "Sizing
325	              Router Buffers", 2004, <SIGCOMM '04 Proceedings of the
326	              2004 conference on Applications, technologies,
327	              architectures, and protocols for computer
328	              communications>.

330	   [Avra]     Konstantin Avrachenkov, INRIA Sophia Antipolis,
331	              "Differentiation Between Short and Long TCP Flows:
332	              Predictability of the Response Time", 2004, <IEEE
333	              INFOCOM>.

335	   [Feldmann]
336	              A. Feldmann, J. Rexford, and R. Caceres, "Efficient
337	              policies for carrying Web traffic over flow-switched
338	              networks", Dec. 1998, <IEEE/ACM Trans. Networking, vol. 6,
339	              pp. 673-685>.

341	   [I-D.narten-iana-considerations-rfc2434bis]
342	              Narten, T. and H. Alvestrand, "Guidelines for Writing an
343	              IANA Considerations Section in RFCs", draft-narten-iana-
344	              considerations-rfc2434bis-09 (work in progress), March
345	              2008.

347	   [Postel]   J. Postel, "Transmission Control Protocol", Sep. 1981,
348	              <RFC 793>.

350	   [RFC2893]  K. McCloghrie, F. Kastenholz, "The Interfaces Group MIB",
351	              Jun. 2000, <RFC 2863>.

353	   [RFC3429]  R. Bush and D. Meyer, "Some Internet Architectural
354	              Guidelines and Philosophy", Dec. 2002, <RFC 3439>.

356	   [Stevens]  W. R. Stevens, "Transmission Control Protocol", 1994,
357	              <TCP/IP Illustrated. Reading, MA: Addison-Wesley, vol. 1>.

359	   [Villamizar]
360	              C. Villamizar and C. Song, "High performance tcp in
361	              ansnet", 1994, <ACM Computer Communications Review,
362	              24(5):45-60>.

364	Appendix A.  Additional Stuff

366	   This becomes an Appendix.

368	Authors' Addresses

370	   Kamala Subramaniam (editor)
371	   Microsoft
372	   Mountain View, CA  94043
373	   US

375	   Phone: +1 919 345 8778
376	   Email: kasubra@microsoft.com

378	   Darren Loher
379	   Microsoft
380	   Redmond, WA  98052
381	   US

383	   Email: daloher@microsoft.com