V6OPS                                                       B. Carpenter
Internet-Draft                                         Univ. of Auckland
Intended status: Informational                                  S. Jiang
Expires: September 7, 2012                  Huawei Technologies Co., Ltd
                                                              W. Tarreau
                                                              Exceliance
                                                           March 6, 2012


          Using the IPv6 Flow Label for Server Load Balancing
                 draft-carpenter-v6ops-label-balance-02

Abstract

   This document describes how the IPv6 flow label can be used in
   support of layer 3/4 load distribution and balancing for large server
   farms.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 7, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Role of the Flow Label
   3.  Possible extended role
   4.  Security Considerations
   5.  IANA Considerations
   6.  Acknowledgements
   7.  Change log [RFC Editor: Please remove]
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   The IPv6 flow label has been redefined [RFC6437] and its use for load
   sharing in multipath routing has been specified [RFC6438]. Another
   scenario in which the flow label could be used is load distribution
   for large server farms. Load distribution is a slightly more general
   term than load balancing, but the latter is more commonly used. This
   document starts with a brief introduction to load balancing
   techniques and then describes how the flow label might be used to
   enhance layer 3/4 flow balancers in particular.

   Load balancing for server farms is achieved by a variety of methods,
   often used in combination [Tarreau]. The flow label is not relevant
   to all of them. The actual load balancing algorithm (the choice of
   server for a new client session) is irrelevant to this discussion.

   o  The simplest method is to use the DNS to return different server
      addresses for a single name such as www.example.com to different
      users. Typically this is done by rotating the order in which
      different addresses are listed by the relevant authoritative DNS
      server, assuming that the client will pick the first one. Routing
      may be configured such that the different addresses are handled
      by different ingress routers. The flow label can have no impact
      on this method and it is not discussed further.
   o  Another method, for HTTP servers, is to operate a layer 7 reverse
      proxy in front of the server farm. The reverse proxy will present
      a single IP address to the world, communicated to clients by a
      single AAAA record. For each new client session (an incoming TCP
      connection and HTTP request), it will pick a particular server
      and proxy the session to it.
      Ideally, the act of proxying will be cheap compared to the act of
      serving the required content. The proxy must retain TCP state and
      proxy state for the duration of the session. This TCP state
      could, potentially, include the incoming flow label value.
   o  A component of some load balancing systems is an SSL reverse proxy
      farm. The individual SSL proxies handle all cryptographic aspects
      and exchange raw HTTP with the actual servers. Thus, from the
      load balancing point of view, this really looks just like a
      server farm, except that it is specialised for HTTPS. Each proxy
      will retain SSL, TCP and possibly HTTP state for the duration of
      the session, and the TCP state could potentially include the flow
      label.
   o  Finally, the "front end" of many load balancing systems is a
      layer 3/4 load balancer. While it can sometimes be dedicated
      hardware, it is also a standard function of some network switches
      or routers (e.g., using ECMP [RFC2991]). In this case, it is the
      layer 3/4 load balancer whose IP address is published as the
      primary AAAA record for the service. All client sessions will
      pass through this device. According to the precise scenario, it
      will spread new sessions across the actual application servers,
      across an SSL proxy farm, or across a set of layer 7 proxies. In
      all cases, the layer 3/4 load balancer has to recognize incoming
      packets as belonging to new or existing client sessions, and
      choose the target server or proxy so as to ensure persistence.
      'Persistence' is defined as guaranteeing that a given session
      will run to completion on a single server. The layer 3/4 load
      balancer therefore needs to inspect each incoming packet to
      identify the session.
      There are two common types of layer 3/4 load balancers:
      stateless ones, which act on each packet independently,
      generally by hashing easy-to-find information such as the source
      address and/or port into a server number; and stateful ones,
      which take the routing decision on the very first packets of a
      session and maintain the same direction for all packets
      belonging to the same session. Clearly, both types of layer 3/4
      balancers could inspect and make use of the flow label value.

      Our focus is on how the balancer identifies a particular flow.
      For clarity, note that two aspects of layer 3/4 load balancers
      are not affected at all by use of the flow label to identify
      sessions.

      1.  Balancers use various techniques to redirect traffic to a
          specific target server.

          -  All servers are configured with the same IP address, they
             are all on the same LAN, and the load balancer sends
             directly to their individual MAC addresses.
          -  Each server has its own IP address, and the balancer uses
             an IP-in-IP tunnel to reach it.
          -  Each server has its own IP address, and the balancer
             performs NAPT (network address and port translation) to
             deliver the client's packets to that address.

          The choice between these methods is not affected by use of
          the flow label.

      2.  A layer 3/4 balancer must correctly handle Path MTU Discovery
          by forwarding relevant ICMPv6 packets in both directions.
          This too is not affected by use of the flow label.

   The following diagram, inspired by [Tarreau], shows a maximal layout
   combining these methods.

        ___________________________________________
       (                                           )
       (          Clients in the Internet          )
       (___________________________________________)
               |                            |
          ------------                 ------------
          | Ingress  |                 | Ingress  |
          | router   |                 | router   |
          ------------                 ------------
            ___|_______DNS-based____________|___
            |          load splitting          |
            |                                  |
            |                                  |
        ------------                     ------------
        | L3/4 ASIC|                     | L3/4 ASIC|
        | balancer |                     | balancer |
        ------------                     ------------
                  |          load          |
                  |       spreading        |
        __________|________________________|___________
           |              |           |           |
      ------------   ------------  --------   --------
      |HTTP proxy|...|HTTP proxy|  | SSL  |...| SSL  |
      | balancer |   | balancer |  | proxy|   | proxy|
      ------------   ------------  --------   --------
      ____|_____________|_____________|_________|_____
         |         |         |         |         |
      --------  --------  --------  --------  --------
      |HTTP  |  |HTTP  |  |HTTP  |  |HTTP  |  |HTTP  |
      |server|  |server|  |server|  |server|  |server|
      --------  --------  --------  --------  --------

   From the previous paragraphs, we can identify several points in this
   diagram where the flow label might be relevant:

   1.  Layer 3/4 load balancers.
   2.  SSL proxies.
   3.  HTTP proxies.

2.  Role of the Flow Label

   The IPv6 flow label is a 20-bit field included in every IPv6 header
   [RFC2460] and it is defined in [RFC6437]. According to this
   definition, it should be set to a constant value for a given traffic
   flow (such as an HTTP connection), but until the standard is widely
   implemented it will often be set to the default value of zero. Any
   device that has access to the IPv6 header has access to the flow
   label, and it is at a fixed position in every IPv6 packet. In
   contrast, transport layer information, such as the port numbers, is
   not always in a fixed position, since it follows any IPv6 extension
   headers that may be present.
   Therefore, within the lifetime of a given transport layer
   connection, the flow label can be a more convenient "handle" than the
   port number for identifying that particular connection.

   According to [RFC6437], source hosts should set the flow label, but
   if they do not (i.e., its value is zero), forwarding nodes may do so
   instead. In both cases, the flow label value must be constant for a
   given transport session, normally identified by the IPv6 and
   Transport header 5-tuple. The flow label should be calculated by a
   stateless algorithm. The value should form part of a statistically
   uniform distribution, making it suitable as part of a hash function
   used for load distribution. Because the label is calculated by a
   stateless algorithm, there is a very low (but non-zero) probability
   that two simultaneous flows from the same source to the same
   destination have the same flow label value despite having different
   transport protocol port numbers.

   A careful reading of RFC 6437 shows that for a given source accessing
   a well-known TCP port at a given destination, the flow label is in
   effect a proxy for the source port number, found at a fixed position
   in the layer 3 header. Thus, the suggested model for using the flow
   label in a load balancing mechanism is as follows:

   o  It is clearly better if the original source, e.g., an HTTP client,
      sets the flow label. However, if the flow label of an incoming
      packet is zero, there are two possibilities:

      1.  The ingress router at the server site could implement the
          stateless mechanism in Section 3 of [RFC6437] to set the flow
          label to an appropriate value. This relieves the subsequent
          load balancers of the need to fully analyse the IPv6 and
          Transport header 5-tuple to identify the packets belonging to
          the same flow.
      2.  Load balancers will use the flow label value as described
          below if it is set, but use the transport header in the
          traditional way otherwise.

      In either case, the idea is that as the use of the flow label
      becomes more prevalent, load balancers will reap a growing
      performance benefit.
   o  The layer 3/4 load balancers can use the 2-tuple {source address,
      flow label} as the session key for whatever load distribution
      algorithm they support, instead of searching for the transport
      port number later in the header. Note that they do not need to
      consider the destination address as it is always the same, i.e.,
      the server address.

      Stateless layer 3/4 load balancers would simply apply a hash
      algorithm to the 2-tuple {source address, flow label} on all
      packets, while stateful load balancers would apply their usual
      load distribution algorithm to the first packet of a session, and
      store the {2-tuple, server} association in a table so that all
      packets belonging to the same session are forwarded to the same
      server. For all subsequent packets of the session, such a
      balancer can then ignore all IPv6 extension headers, which should
      lead to a performance benefit. Whether this benefit is valuable
      will depend on engineering details of the specific load balancer.

      Layer 3/4 balancers that redirect the incoming packets by NAPT are
      not expected to obtain any saving of time by using the flow label,
      because they must in any case follow the extension header chain in
      order to locate and modify the port number and transport checksum.
      The same would apply to balancers that perform TCP state tracking
      for any reason.
   o  Note that correct handling of ICMPv6 for Path MTU Discovery
      requires the layer 3/4 balancer to keep state for the client
      source address, independently of either the port numbers or the
      flow label.
   o  An SSL proxy should forward the flow label between the ciphered
      side and the clear side. Being able to forward data used for
      persistence is very important, as it is the only way to stack
      multiple layers of network components without losing information.
   o  The HTTP proxies may do the same. However, since they have to
      process the transport and application layers in any case, this
      might not lead to any performance benefit.

   Note that in the unlikely event of two simultaneous flows from the
   same source having the same flow label value, the two flows would end
   up assigned to the same server, where they would be distinguished as
   normal by their port numbers. Since this would be a statistically
   rare event, it would not damage the overall load balancing effect.
   Moreover, it is very likely that there will be far fewer servers than
   possible flow label values at most locations (2^20, i.e., about
   1 million, possible values), so it is already expected that many
   different flow label values will end up on the same server for a
   given IP address. In the case where many thousands of clients are
   hidden behind the same large-scale NAT with a single IP address, the
   assumption of low probability of conflicts might become incorrect
   unless flow label values are random enough to avoid following
   similar sequences for all clients. This is not expected to be a
   factor for IPv6 anyway, since there is no valid reason to implement
   NAT [RFC4864]. The statistical assumption is valid for sites that
   implement network prefix translation [RFC6296], since this technique
   provides a different address for each client.

3.  Possible extended role

   A particular aspect of the session persistence issue arises when
   multiple independent transport connections from the same client need
   to be handled by the same server instance.
   This can be an extremely difficult task which often requires ugly
   tricks such as pattern matching within a buffered stream, cookie
   insertion, etc., which most load balancers have to deal with every
   day. If the client application has control over the outgoing flow
   label, then it can itself assign the same label to all transport
   connections related to a single application session.

   A common example is FTP. For a load balancer, passive-mode FTP
   requires parsing the entire control stream (port 21), in order to
   find which incoming packet will initiate a data session on a port
   chosen by the server. This does not always work well, because
   sometimes clients do not connect, or the session is ultimately not
   used (e.g., because no transfer needs to be performed).

   Using a flow label, the client could generate an initial random flow
   label value when a file transfer is expected, and assign the same
   flow label to all data connections related to the same control
   connection. A flow label based load balancer would then by
   definition send the data traffic to the same server as the control
   traffic, and would thus guarantee that the sessions are properly
   associated. Such a mechanism is permitted by [RFC6437], although it
   is not the recommended default.

   The same need is even more prominent with HTTP/HTTPS: while it is
   costly but not difficult to insert a cookie in an HTTP stream to
   identify the server the user was assigned to, it is very difficult
   to do that for HTTPS, because the stream must be deciphered first.
   Deciphering the stream requires a huge amount of centralized
   processing power, since the load balancer needs to see the clear
   stream; this is in fact the main reason for SSL proxies in load
   balancing scenarios.
   If a web client (browser) used the same flow label for any protocol
   targeting a given host (or domain), this could be used by load
   balancers to reach the same server for both HTTP and HTTPS, without
   having to open the stream payload at all or to inspect anything
   beyond layer 3, which clearly is not possible today.

   An additional complication can arise when a single client
   inadvertently generates sessions that appear to originate from
   different IP addresses. This can happen, for example, if an
   enterprise uses a proxy farm for outgoing traffic, or in mobile
   applications where several successive requests come from different
   network cells and thus different IP addresses (for instance, when
   checking a bank account on a train). When two consecutive client
   requests pass through two distinct proxies, a different IP source
   address may be presented to the server load balancer, which then
   cannot rely on address-based persistence. It would be possible and
   desirable in principle to use the same flow label value for
   correlated sessions from the same client, if the proxies were
   transparent to the flow label value.

   In some application scenarios, an inadvertent change in the client IP
   address may have only minor consequences, such as reloading
   transaction context into a new server. In other cases it may be more
   serious and result in a transaction failure. For this reason, a
   reliable solution in which the load balancer would use the flow label
   value on its own would be advantageous.

   Using the flow label in this way would also greatly simplify the
   logging of user sessions. A very common task is to match logs from
   various pieces of equipment to follow a user's activity and decide
   whether it indicates a bug, user error or attack.
   Logging a flow label would help because it makes it easier to find
   the beginning and end of a session and decide whether it is
   legitimate or not.

   Such extensions to the role of the flow label in load balancing are
   theoretically very attractive, but would require a major refresh of
   client software as well as of load balancers themselves. It amounts
   to considering an entire application session, in a broad sense, as a
   single flow for the purposes of RFC 6437.

   It is worth noting, though, that what matters for saving server-side
   resources is sufficiently wide adoption. Most of today's load
   balanced traffic is HTTP originating from a handful of browsers which
   are regularly upgraded for security reasons. Once a mechanism is
   adopted, it can quickly be deployed and become the general case.

   The difficulty of the upgrade path is then on the server side. The
   first step would consist in having layer 7 load balancers be able to
   consider the flow label to avoid costly layer 7 analysis whenever
   possible. This means that if a non-zero flow label is seen, then the
   load balancer would consider it, otherwise it would fall back to its
   default behaviour. The second step would consist in having front
   layer 3/4 load balancers bypass the layer 7 load balancer farms when
   the flow label is found. This step would greatly offload layer 7
   load balancers.

4.  Security Considerations

   Security aspects of the flow label are discussed in [RFC6437]. As
   noted there, a malicious source or man-in-the-middle could disturb
   load balancing by manipulating flow labels. This risk already exists
   today where the source address and port are used as hashing key in
   layer 3/4 load balancers, as well as where a persistence cookie is
   used in HTTP to designate a server.
   It even exists on layer 3 components which rely only on the source
   address to select a destination, making them more DDoS-prone. Still,
   all these methods are currently used because the benefits for load
   balancing and persistence hugely outweigh the risks.

   Specifically, [RFC6437] states that "stateless classifiers should not
   use the flow label alone to control load distribution, and stateful
   classifiers should include explicit methods to detect and ignore
   suspect flow label values." The former point is answered by also
   using the source address. The latter point is more complex. If the
   risk is considered serious, the ingress router mentioned above should
   verify incoming flows with non-zero flow label values. If a flow
   from a given source address and port number does not have a constant
   flow label value, it is suspect and should be dropped.

   The suggestion in Section 3 of using the flow label on its own as a
   session handle is somewhat problematic. It should never be used
   within applications, nor anywhere that resource sharing is not
   desired. For instance, it is not conceivable that an application
   would identify a user session by its flow label value, due to the
   inevitable collisions. Using the flow label on its own should only
   be done where resource sharing is inevitable and desired (for
   instance, load balancing) and by components explicitly designed for
   this task, taking into account all the risks described here, with
   solid protections against misuse and acceptable fallbacks for the
   remaining situations where the flow label values will not be usable.

   The flow label may be of use in protecting against distributed denial
   of service (DDoS) attacks against servers. As noted in RFC 6437, a
   source should generate flow label values that are hard to predict,
   most likely by including a secret nonce in the hash used to generate
   each label.
   The attacker does not know the nonce and therefore has no way to
   invent flow labels which will all target the same server, even with
   knowledge of both the hash algorithm and the load balancing
   algorithm. Still, it is important to understand that it is always
   trivial to force a load balancer to stick to the same server during
   an attack, so the security of the whole solution must not rely on the
   unpredictability of the flow label values alone, but should include
   defensive measures like those most load balancers already have
   against abnormal use of source addresses or session cookies.

   New flows are assigned to a server according to any of the usual
   algorithms available on the load balancer (e.g., least connections,
   round robin, etc.). The association between the flow label value and
   the server is stored in a table (often called a stick table) so that
   future connections using the same flow label can be sent to the same
   server. This method is more robust against the loss of a server and
   also makes it harder for an attacker to target a specific server,
   because the association between a flow label value and a server is
   not known externally.

5.  IANA Considerations

   This document requests no action by IANA.

6.  Acknowledgements

   Valuable comments and contributions were made by Fred Baker, Lorenzo
   Colitti, Joel Jaeggli, Gurudeep Kamat, Julius Volz, and others.

   This document was produced using the xml2rfc tool [RFC2629].

7.  Change log [RFC Editor: Please remove]

   draft-carpenter-v6ops-label-balance-02: clarified after WG
   discussions, 2012-03-06.

   draft-carpenter-v6ops-label-balance-01: updated with community
   comments, additional author, 2012-01-17.

   draft-carpenter-v6ops-label-balance-00: original version, 2011-10-13.

8.  References

8.1.  Normative References

   [RFC2460]  Deering, S. and R.
              Hinden, "Internet Protocol, Version 6 (IPv6)
              Specification", RFC 2460, December 1998.

   [RFC6437]  Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme,
              "IPv6 Flow Label Specification", RFC 6437, November 2011.

8.2.  Informative References

   [RFC2629]  Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629,
              June 1999.

   [RFC2991]  Thaler, D. and C. Hopps, "Multipath Issues in Unicast and
              Multicast Next-Hop Selection", RFC 2991, November 2000.

   [RFC4864]  Van de Velde, G., Hain, T., Droms, R., Carpenter, B., and
              E. Klein, "Local Network Protection for IPv6", RFC 4864,
              May 2007.

   [RFC6296]  Wasserman, M. and F. Baker, "IPv6-to-IPv6 Network Prefix
              Translation", RFC 6296, June 2011.

   [RFC6438]  Carpenter, B. and S. Amante, "Using the IPv6 Flow Label
              for Equal Cost Multipath Routing and Link Aggregation in
              Tunnels", RFC 6438, November 2011.

   [Tarreau]  Tarreau, W., "Making applications scalable with load
              balancing", 2006, .

Authors' Addresses

   Brian Carpenter
   Department of Computer Science
   University of Auckland
   PB 92019
   Auckland, 1142
   New Zealand

   Email: brian.e.carpenter@gmail.com


   Sheng Jiang
   Huawei Technologies Co., Ltd
   Q14, Huawei Campus
   No.156 Beiqing Road
   Hai-Dian District, Beijing 100095
   P.R. China

   Email: jiangsheng@huawei.com


   Willy Tarreau
   Exceliance
   R&D Produits reseau
   3 rue du petit Robinson
   78350 Jouy-en-Josas
   France

   Email: w@1wt.eu
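
   As an informal illustration of the stateless scheme described in
   Section 2, the following Python sketch reads the flow label and
   source address from the fixed 40-byte IPv6 header, hashes the
   2-tuple {source address, flow label} into a server index, and
   returns None when the label is zero so that a caller can fall back
   to the traditional transport-header method. It is not part of any
   specification: the function names, the choice of SHA-256 as the
   hash, and the modulo mapping onto a server index are assumptions
   made for this example only. A real balancer would most likely use a
   much cheaper hash.

```python
import hashlib
import ipaddress
import struct

IPV6_FIXED_HEADER_LEN = 40  # bytes, before any extension headers


def build_header(src: str, dst: str, flow_label: int) -> bytes:
    """Hypothetical helper: build a bare IPv6 fixed header for testing."""
    # First 32 bits: version (4) | traffic class (8) | flow label (20).
    first_word = (6 << 28) | (flow_label & 0xFFFFF)
    # Payload length 0, next header 6 (TCP), hop limit 64.
    return (struct.pack("!IHBB", first_word, 0, 6, 64)
            + ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed)


def parse_fixed_header(packet: bytes):
    """Return (flow_label, source_address_bytes) from the fixed header.

    Both fields sit at fixed offsets, so no extension header chain
    needs to be followed.
    """
    if len(packet) < IPV6_FIXED_HEADER_LEN:
        raise ValueError("truncated IPv6 header")
    first_word = struct.unpack("!I", packet[:4])[0]
    if first_word >> 28 != 6:
        raise ValueError("not an IPv6 packet")
    flow_label = first_word & 0xFFFFF  # low 20 bits of the first word
    source = packet[8:24]              # 128-bit source address
    return flow_label, source


def pick_server(packet: bytes, n_servers: int):
    """Stateless choice of a server index from {source, flow label}.

    Returns None when the flow label is zero; the caller is then
    expected to use the transport header in the traditional way.
    """
    flow_label, source = parse_fixed_header(packet)
    if flow_label == 0:
        return None
    # The source address is always part of the key, never the label alone.
    key = source + flow_label.to_bytes(3, "big")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % n_servers
```

   Every packet of a flow carries the same source address and flow
   label, so repeated calls to pick_server() for that flow yield the
   same index without any per-flow state. A stateful balancer would
   instead store the chosen server in a table keyed by the same
   2-tuple.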