idnits 2.17.1 draft-briscoe-tsvwg-quickstart-rvw-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 14. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1545. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1522. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1529. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1535. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([QSrvw]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 396: '...low state is NOT REQUIRED for Quick-St...' RFC 2119 keyword, line 398: '... state is REQUIRED in an untrusted e...' 
RFC 2119 keyword, line 400: '...flow-state being REQUIRED and NOT REQU...' RFC 2119 keyword, line 403: '... state is REQUIRED. So the authors ...' Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Line 1382 has weird spacing: '... Supply from ...' -- The exact meaning of the all-uppercase expression 'NOT REQUIRED' is not defined in RFC 2119. If it is intended as a requirements expression, it should be rewritten using one of the combinations defined in RFC 2119; otherwise it should not be all-uppercase. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (November 15, 2005) is 6736 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Outdated reference: A later version (-04) exists of draft-briscoe-tsvwg-cl-architecture-01 Summary: 5 errors (**), 0 flaws (~~), 4 warnings (==), 8 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Transport Area Working Group B. 
Briscoe 3 Internet-Draft BT & UCL 4 Expires: May 19, 2006 November 15, 2005 6 Review: Quick-Start for TCP and IP 7 draft-briscoe-tsvwg-quickstart-rvw-00 9 Status of this Memo 11 By submitting this Internet-Draft, each author represents that any 12 applicable patent or other IPR claims of which he or she is aware 13 have been or will be disclosed, and any of which he or she becomes 14 aware will be disclosed, in accordance with Section 6 of BCP 79. 16 Internet-Drafts are working documents of the Internet Engineering 17 Task Force (IETF), its areas, and its working groups. Note that 18 other groups may also distribute working documents as Internet- 19 Drafts. 21 Internet-Drafts are draft documents valid for a maximum of six months 22 and may be updated, replaced, or obsoleted by other documents at any 23 time. It is inappropriate to use Internet-Drafts as reference 24 material or to cite them other than as "work in progress." 26 The list of current Internet-Drafts can be accessed at 27 http://www.ietf.org/ietf/1id-abstracts.txt. 29 The list of Internet-Draft Shadow Directories can be accessed at 30 http://www.ietf.org/shadow.html. 32 This Internet-Draft will expire on May 19, 2006. 34 Copyright Notice 36 Copyright (C) The Internet Society (2005). 38 Abstract 40 This review thoroughly analyses draft 01 of the Quick-Start proposal, 41 focusing mostly on security issues. It is argued that the recent new 42 QS nonce proposal gives insufficient protection against misbehaving 43 receivers, and a new approach is suggested. But it would be perverse 44 to strengthen protection against malicious receivers too much when 45 the protocol only works if all senders can be trusted to comply. The 46 review argues this is an inevitable result of choosing to have 47 routers allocate rate to senders without keeping per-flow state. The 48 paper also questions whether Quick-Start's under-utilisation 49 assumption defines a distinct range of operation where fairness can 50 be ignored. 
Because traffic variance will always blur the boundary, 51 we argue that under-utilisation should be treated as the extreme of a 52 spectrum where fairness is always an issue to some extent. 54 If we are to avoid per-flow state on routers, the review points to an 55 alternative direction where endpoints allocate rate to themselves. 56 Counter-intuitively, this allows scalable security and a spectrum of 57 fairness to be built in from the start, but rate allocation is less 58 deterministic. 60 Issues not related to security are also raised, including the 61 possibility of a catastrophic overload if path delays are atypical. 62 A solution to this is offered, as well as solutions to scalability 63 issues with the range and precision of the Rate Request field. Many 64 other more minor review comments are given. 66 Author's Statement: Status 68 This document will only ever be posted as an Internet-Draft. The 69 intent is that the Quick-Start I-D itself will incorporate some of 70 these review comments before progressing to RFC. Those comments that 71 question the basic design choices of Quick-Start will be available in 72 a similar BT technical report [QSrvw] for archival and citation 73 purposes. 75 Table of Contents 77 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 78 2. Summary of Quick-Start . . . . . . . . . . . . . . . . . . . . 5 79 3. Arguments against basic design choices taken . . . . . . . . . 5 80 3.1. The metric: Capacity or impairment? . . . . . . . . . . . 6 81 3.2. Flow state and no flow state as distinct scenarios . . . . 9 82 3.3. No secure association between control and data . . . . . . 10 83 3.4. Fairness issues . . . . . . . . . . . . . . . . . . . . . 10 84 3.5. Quick-Start and QoS . . . . . . . . . . . . . . . . . . . 13 85 3.6. Conceptual model of under-utilisation . . . . . . . . . . 14 86 3.7. Router algorithms purely local policy? . . . . . . . . . . 16 87 3.8. Applicability statement . . . . . . . . . . . . . . . . . 
16 88 4. Suggested improvements; taking the Quick-Start design 89 choices as given . . . . . . . . . . . . . . . . . . . . . . . 18 90 4.1. No control-data association . . . . . . . . . . . . . . . 18 91 4.2. Alternative rate reduced nonce . . . . . . . . . . . . . . 19 92 4.3. Maximum rate request . . . . . . . . . . . . . . . . . . . 21 93 4.4. Alternate rate encoding . . . . . . . . . . . . . . . . . 22 94 4.5. Request refusal behaviour . . . . . . . . . . . . . . . . 23 95 4.6. Sender's DoS response . . . . . . . . . . . . . . . . . . 23 96 5. Clarity and nits . . . . . . . . . . . . . . . . . . . . . . . 23 97 5.1. For clarity . . . . . . . . . . . . . . . . . . . . . . . 23 98 5.1.1. Qualify "incrementally deployable" . . . . . . . . . . 23 99 5.1.2. Additional rate semantics unclear . . . . . . . . . . 24 100 5.1.3. Other improvements for clarity . . . . . . . . . . . . 25 101 5.2. Nits . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 102 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 28 103 7. Security Considerations . . . . . . . . . . . . . . . . . . . 28 104 8. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 28 105 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 30 106 10. Informative References . . . . . . . . . . . . . . . . . . . . 30 107 Editorial Comments . . . . . . . . . . . . . . . . . . . . . . . . 108 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 34 109 Intellectual Property and Copyright Statements . . . . . . . . . . 35 111 1. Introduction 113 This is a review of the second IETF Transport Area working group 114 draft (01) of Jain _et al_'s Quick-Start [QS_ID]. The review was 115 originally of the 00 draft, but an attempt has been made to update it 116 to take account of changes in the 01 draft. 
I apologise if there are 117 still complaints about things that have already been fixed between 00 118 and 01--I have tried to check all the changes, but may have missed 119 some. External references, for instance to sections of the Quick- 120 Start I-D under review, are rendered like `S.1', while internal 121 references to other sections of this review itself are rendered like 122 `Section 1'. 124 Although the draft is long (76pp) and growing, it still refers at 125 length to a supporting document by Sarolahti, Allman & 126 Floyd [QS_eval], which is still under submission. This paper only 127 reviews the material in the Internet Draft, not the supporting 128 document, on the basis that, if any details were intended for IETF 129 consideration they would have been included in the Internet Draft. 130 For instance, the draft says that Sarolahti _et al_ presents a number 131 of alternative router algorithms, but it is assumed that the single 132 example given in the draft is representative, at least in terms of 133 intent, if not implementation. The draft also claims Sarolahti _et 134 al_ presents an Extreme Quick-Start mechanism with per-flow state, 135 but given that mechanism is not presented in the draft, and is 136 described as `extreme', it is assumed that it is not considered 137 applicable for consideration by the IETF. 139 The final sentence of S.A.6 says, "...as long as the simple 140 mechanisms are not short-term hacks but mechanisms that lead the 141 overall architecture in the fundamentally correct direction." The 142 position of this review will be made clear from the start. Adding 143 rate allocation to interior routers is not considered likely to be 144 the fundamentally correct direction we should take. Section 3.1 145 justifies this position, but an overview is given here. By 146 extension, this is also a disagreement with the similar architecture 147 used in XCP [XCP], but this review will focus on Quick-Start. 
149 It is tempting to get routers to allocate rates directly because that 150 is the final effect we are trying to achieve. But these approaches 151 leave aside the security issue of determining whether to trust the 152 sender until after the design choices have been made. Then solving 153 that problem adds horrible complexity, because routers in networks 154 distant from the sender have to have a security association with the 155 sender. This same problem was at the root of the scalability 156 problems behind Intserv [RFC2208]. This remote security association 157 could potentially be aggregated so a network only needs security 158 associations with neighbouring networks and hosts. But doing that 159 isn't easy. For instance, when Diffserv did that, it had to 160 compromise by losing precision. 162 However, Quick-Start is a valid experimental direction to try out 163 what happens when flows start fast, without concern about security 164 issues. And, as it doesn't preclude any other future direction 165 (other than using up one more IP Option codepoint and one more TCP 166 Option codepoint, which is not a problem because they are not 167 scarce), its move to experimental status shouldn't be opposed as long 168 as its applicability is made absolutely clear. Then Quick-Start 169 experiments can proceed in parallel with further research into how to 170 securely allow senders to start quickly, which is already on the 171 agenda of the IRTF Internet congestion control research 172 group [ICCRG]. 174 So this review is divided into three main parts: 176 o Argumentation about basic design choices taken 178 o A list of suggested technical improvements setting aside disquiet 179 about basic design choices 181 o Suggestions to improve clarity and fix simple typos. 183 2. 
Summary of Quick-Start 185 The Quick-Start protocol allows a Quick-Start sender to request to be 186 allowed to directly start a flow at a high initial rate, rather than 187 slowly probing for network capacity using TCP slow-start. The 188 protocol is intended for controlled environments where all routers 189 are usually under-utilised. The request uses a single hop-by-hop 190 pass from sender to receiver requesting a rate from each router on 191 the way using an IPv4 Option header or an IPv6 extension header. 192 Then the receiver returns the response back to the sender in an e2e 193 TCP Option message. When a request arrives at a router, it 194 determines the initial rate it can support and if necessary reduces 195 the Rate Request field in the protocol to this rate before forwarding 196 it onward. The protocol is designed to detect if any router on the 197 path or the destination is not Quick-Start capable and if so falls 198 back to standard TCP slow-start. A nonce scheme is included to allow 199 the sender to detect if the receiver has tried to feed back a higher 200 allowed rate than it was given by the network. 202 3. Arguments against basic design choices taken 204 Quick-Start routers get the receiver to tell the source what the 205 network can support. But they don't check whether what the source 206 subsequently sends is what it was told it could send. And they can't 207 as they are not required to be stateful. So we assume Quick-Start 208 should also be confined to environments where every source is 209 trusted. This security assumption isn't discussed in the Quick-Start 210 I-D, but this review assumes the authors would agree it is their 211 assumption. A large part of this review is devoted to analysing this 212 issue of trust in the sender. 214 The objection may be made by the authors that other IETF protocols 215 have not had to pass this test. TCP itself relies on trust in the 216 sender. 
The difference with Quick-Start is that it raises 217 considerable issues of fairness (the Quick-Start I-D disagrees, but 218 Section 3.4 below justifies this statement), so it should require the 219 same protections against sender misbehaviour that are expected of 220 other protocols for differential QoS. 222 3.1. The metric: Capacity or impairment? 224 Approaches for resource allocation can be divided into those where 225 endpoints ask routers for a rate, and those where routers tell 226 endpoints their overall state so that endpoints may choose a rate. 227 Quick-Start falls into the former category. If each router divides 228 its available capacity among competing requests, every router must 229 associate the `identity' making the request with some sharing policy. 230 In the case of Quick-Start, the sharing policy is that available 231 capacity is shared equally among all the requests, but other policies 232 can be chosen. However, the problem is how to recognise different 233 `identities'. If routers are not required to be stateful, each new 234 request has to be defined as from a different `identity'. Another 235 way to say this is that the authorisation model of Quick-Start 236 routers has to be very trusting if they are to be stateless. 238 That is fine for a trusted environment. But it is unlikely to "lead 239 the overall architecture in the fundamentally correct direction" 240 (S.A.6) for when we want to move to an untrusted environment. The 241 goal of the IETF (and those that use its standards) is to build in 242 security from the start. 244 Clearly, the stateless model of Quick-Start is vulnerable to repeat 245 requests being assumed to be multiple identities, so sources can get 246 more, simply by asking more often (or, for that matter, by simply 247 taking more without asking)... But that is only the first part of 248 the problem. 
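The repeat-request weakness described above can be illustrated with a toy model. The Python sketch below is not the router algorithm from the Quick-Start I-D; it only assumes, as the text does, that a stateless router shares its spare capacity equally among the requests it sees and treats every request as a new identity. The function name stateless_equal_share, the capacity figure and the source labels are all hypothetical.

```python
from collections import Counter


def stateless_equal_share(capacity, request_sources):
    """Grant every request an equal share of capacity; tally grants per source.

    Assumption: the router keeps no state, so each request counts as a new
    identity. The source labels are used only to tally the outcome afterwards;
    the allocation itself never looks at them.
    """
    if not request_sources:
        return {}
    share = capacity / len(request_sources)
    granted = Counter()
    for source in request_sources:
        granted[source] += share
    return dict(granted)


# Three sources, one request each: capacity is split three ways.
honest = stateless_equal_share(90.0, ["A", "B", "C"])

# Source A simply asks four times; B and C still ask once each.
greedy = stateless_equal_share(90.0, ["A", "A", "A", "A", "B", "C"])

print(honest)
print(greedy)
```

In this model, source A's total grant grows with the number of requests it sends, at the expense of B and C, without the router being able to notice. A source that takes more without asking at all is outside what the sketch models.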
250 The main objection is that even if we did countenance stateful 251 routers, if rate must be allocated on each router, every router (or 252 at least every trust domain) has to establish the identity behind 253 each request. For simplicity, perhaps identity could be based on 254 addressing (flow IDs), rather than cryptographic identifiers 255 (signatures). But we all know the pitfalls of using addressing for 256 security identification. Specifically, in the case of rate 257 allocation we are concerned with sender spoofing and identity 258 splitting: 260 Sender spoofing To allocate rates to senders, routers would need to 261 identify the sender, but for a message to reach a receiver, the 262 sender only needs to truthfully reveal the receiver's address to 263 the network. If it wants the destination to correspond, it need 264 only reveal its own address to the destination, not to the 265 network. 267 Identity splitting If rates are allocated to unique addresses (more 268 generally tuples of addresses), it is relatively easy for the 269 source and destination to conspire to split their identity over 270 multiple addresses between themselves (or via proxy interfaces or 271 hosts) to get more rate. 273 In other words, even stateful Quick-Start routers lead us to per-flow 274 policing at every trust border--the same security premise that 275 Intserv started from. So it seems likely Quick-Start will eventually 276 suffer the same fate as Intserv if it takes this path. 278 So what could "lead the overall architecture in the fundamentally 279 correct direction"? A better hook for endpoint identification at the 280 network/transport layer is to associate a sender with its ingress 281 attachment interface. This sounds obvious, but it still seems hard 282 to be able to identify the source of a flow with precision as it 283 arrives at a router multiple networks downstream. However, recent 284 research has shown how to do this scalably. 
The direction outlined 285 below arguably has better potential for solving this security 286 problem. It should allow responsibility for flows to be taken on 287 recursively by each network as it presents its requests in bulk to 288 the next network, without any per-flow processing in the network 289 interior, even at borders. 291 This new approach requires routers to declare their `impairment' 292 status to endpoints. An example of impairment status is their level 293 of congestion (e.g. ECN), which is a measure of the risk that a 294 packet will not be served. For under-utilised scenarios like that of 295 Quick-Start, impairment would have to be relative to a lower `bar' 296 than that used for congestion control. In the case of speeding up 297 TCP slow-start, the metric might measure the risk that a router will 298 no longer be under-utilised. Xia _et al_ [+1b] is an example of 299 using this mechanism to solve the same problem as Quick-Start. The 300 virtual queue used in AntiECN [AntiECN] and pre-congestion 301 notification [CL-arch] are other examples. 303 Current research has not reached consensus on the best pre-congestion 304 metric to use. This is why we need to research this issue in the 305 recently set-up IRTF Internet Congestion Control Research Group [ICCRG]. But 306 this is likely to be an easier issue to solve than fixing the basic 307 security model of rate allocation schemes like Intserv, Quick-Start 308 and XCP--solutions to the problem of identifying accountability for 309 congestion don't fall out of the sky as often as new congestion 310 control schemes. 312 A field for this impairment status would then be provided in every 313 packet passing through each resource. It would need a few more bits 314 than ECN in order to feed back the path's impairment status quickly 315 to the source, rather than having to code a probability over hundreds 316 of packets using binary marking. 
Then as packets pass through other 317 resources they accumulate the impairment status of the path. 318 ECN [RFC3168] is an example of this model, but for pre-congestion, it 319 is necessary to move the goal posts (the desired utilisation level) 320 downward. 322 We then have impairment information intrinsically associated with 323 rate information, so the two can be traded against each other--cost 324 and benefit. The rate information _is_ the number of bits in packets 325 a source sends. And because the impairment information is carried in 326 each packet, the more packets a source sends, the more impairment can 327 be associated with that source (just as with ECN). 329 But we haven't solved the problem yet. Rate is measurable all along 330 a network path, and it aggregates and de-aggregates nicely. But 331 although impairment information aggregates and de-aggregates too, it 332 accumulates along the path. So, in a connectionless datagram 333 network, impairment seems to be only measurable just before the 334 destination. With direct rate allocation, we said the problem was 335 that every router had to have a security association with the source. 336 But our egress still seems to need a security association with the 337 source (because we can't assume the network can intercept feedback), 338 which isn't much better. And even if the egress can identify the 339 source, it's in the wrong place to allocate the rate to the source, 340 because it can't police whether the source is complying... 342 But we can solve this problem by the source having to declare the 343 impairment it believes will accumulate along the path [re-fb],[re- 344 TCP]. Then rather than accumulating impairment state along the path, 345 routers effectively subtract it. With sufficient bits in the pre- 346 congestion field, after at least one round trip the source can make 347 an accurate prediction of what it should declare in the next round. 
348 If the average of the declarations ends up negative at the end, the 349 egress can tell there must have been cheating upstream and drop the 350 packets. 352 So we have now collected all three items of required information at 353 the ingress to the network: source identity, impairment and rate. 354 Identifying the source has become solely the job of the ingress 355 network, based on the attachment interface of the source. So the 356 identification scalability problem has been solved. And the ingress 357 network can also check that the source is doing a proper cost-benefit 358 trade-off between impairment and rate. 360 The downside is that such a system does not allocate rate 361 deterministically. It is statistical. But without per-flow state, 362 Quick-Start suffers from that failing too, though not as seriously 363 (Section 4.1). The question for the working group is whether 364 improved determinism is more important than improved security, or 365 vice versa. 367 The argument for the more secure approach isn't easy to grasp as it 368 is constructed from a number of steps that have not been put together 369 into a detailed approach for the problem of a faster start to TCP 370 flows. All we currently have is the outline above and the references 371 below. But it seems much more likely to lead the architecture in a 372 future-proof direction. This architectural argument was outlined in 373 a discussion of flow-start incentives in [re-fb] (S.3.3.3). The same 374 overall approach is also used in the Internet-Draft on Re-ECN [re- 375 TCP] (S.1), but in this case for TCP congestion control rather than 376 pre-congestion control. 378 Later this review criticises the Quick-Start I-D for depending 379 overmuch on references out to [QS_eval], while this review is itself 380 open to criticism for hypocrisy given it also refers out overmuch. 
381 The spirit of this review is to point out advances in very recent 382 research that offer what appears to be a more fruitful architectural 383 direction. It is not trying to prevent Quick-Start progressing; it 384 is however wanting applicability to be clarified first. 386 3.2. Flow state and no flow state as distinct scenarios 388 S.2 clearly says "No per-flow state should be required at the router. 389 Note that while per-flow state is not required we also do not 390 preclude a router from storing per-flow state for making Quick-Start 391 decisions." However, there is a phrase at the end of S.9.4.3 on 392 Collusion between Misbehaving Routers saying, "the router between the 393 ingress and egress nodes that denied the request could be monitoring 394 connection performance, actively penalizing nodes that seem to be 395 using Quick-Start after a Quick-Start request was denied." 396 It is true that flow state is NOT REQUIRED for Quick-Start to work in 397 a trusted environment (but see Section 4.1). However, it seems flow- 398 state is REQUIRED in an untrusted environment. If the draft confined 399 itself to the trusted environment assumption, it wouldn't have to 400 slip between flow-state being REQUIRED and NOT REQUIRED. Because if 401 the draft is going to start entertaining the untrusted scenario, 402 there are problems of state exhaustion attacks to deal with, if flow 403 state is REQUIRED. So the authors need to either keep the untrusted 404 scenario out of this draft, or bring it in completely. 406 3.3. No secure association between control and data 408 S.9.4 or S.9.5: There is no association (binding), let alone a secure 409 association between the request/response and the subsequent data it 410 allows. Therefore, it becomes virtually impossible for the network 411 to somehow police the sender to ensure compliance with the response. 
413 Even if there were a secure binding, because of the protocol model of 414 a single pass towards the destination followed by e2e feedback, an 415 upstream router cannot defend a downstream router that has reduced 416 the request, given the response isn't seen again by upstream routers. 418 Worse, the lack of a secure binding between a request and subsequent 419 traffic means that any other node can send a burst of traffic and 420 claim it requested it, with no-one being able to prove it didn't. 421 This problem would preclude a deployment of Quick-Start even if it 422 were confined to a controlled subset of the sources, if other sources 423 that were not part of the experiment could send unaccountable bursts 424 of traffic claiming to be from the authorised sources. 426 Note that this is not an argument that the binding must be secured by 427 cryptography. In Section 3.1 there is an alternative way to do a 428 secure binding based on physical connectivity[Wireless]. 430 3.4. Fairness issues 432 S.A.6: "...More Functionality?" says "...for a mechanism for 433 requesting a initial sending rate in an underutilized environment, 434 the fairness issues of a general congestion control mechanism go 435 away,..." This is unfortunately not true. 437 TCP is designed to seek out the maximum capacity it can on the path. 438 So in an under-utilised environment long-lived TCP flows will 439 continue to rise in rate until they find congestion. But they will 440 finish sooner. So with TCP, under-utilised means there are 441 insufficient long-lived flows to fill capacity and shorter flows end 442 before they have reached congestion. But it doesn't mean that 443 instantaneous utilisation is always low. It only means average 444 utilisation is low. With TCP, the variance of instantaneous 445 utilisation increases greatly in an under-utilised network. Even if 446 traffic is dominated by shorter-lived flows, there will be peaks in 447 congestion as flow arrivals coincide. 
So the network repeatedly 448 moves rapidly from under-utilisation to congestion. 450 So in an under-utilised environment, fairness splits into three 451 issues, the last two of which only appear each time we cross the 452 boundary between instantaneous under-utilisation and instantaneous 453 congestion: 455 o whether routers give differential responses (see Section 3.5, 456 next) 458 o who has most bandwidth during brief periods of congestion, when 459 multiple flows happen to coincide (variance of congestion over 460 time) 462 o how the risk of congestion varies with path length (variance of 463 congestion in space) 465 Congestion variance over time: At the boundary between a time of 466 under-utilisation and one where congestion starts to set in, 467 fairness depends on who asked for most bandwidth before everyone 468 realises congestion has started. Those flows with the highest rate on entry 469 to the period of congestion will be in the most advantageous 470 position during the period of congestion (when we assume normal 471 congestion avoidance takes over). With multiplicative decrease, 472 the higher you start from, the further you fall, but you are still 473 higher after each round trip than everyone else. 475 Congestion variance in space: The single-pass request (with rate 476 allocation being hop by hop along the path, rather than after the 477 request has traversed the whole path) precludes any fairness 478 models that require the rate allocation function to know the state 479 of the whole path. A router only knows what the upstream path can 480 sustain and its own local condition. So rate allocation cannot 481 weigh benefits (rate) against costs (risk of congestion). Consequently, 482 Quick-Start is limited to benefit-only fairnesses like max-min or 483 min-max, and cannot ever achieve cost-benefit fairnesses like 484 proportional fairness or the root-proportional fairness of 485 algorithms like TCP. 
In other words, Quick-Start can only share 486 out benefits (rate) in various ways, without regard to costs (risk 487 of congestion). The under-utilisation assumption essentially says 488 that the risk of congestion is close enough to zero to be 489 negligible (but see Section 3.6). 491 This approach takes no account of the increased risk of congestion 492 the more routers a Quick-Start flow will traverse, even in an under- 493 utilised network. What matters is the risk of congestion across a 494 path, not the utilisation of each individual router. We show below 495 that every QS router on a path can think it is individually below the 496 threshold, but the flow is on a path above the threshold. So all the 497 QS routers give the source the go-ahead when they shouldn't. 499 If the probability of congestion is p at every router (conveniently 500 taken to be all the same), then the probability, P, of congestion 501 across a path of n routers is P = 1-(1-p)^n. For low levels of path 502 congestion (P << 1), the risk of congestion across a path is the sum 503 of the risks of congestion on each router P ~ Sum_n(p) (Table 1).

505   +--------+---------------------------------------------+
506   |    \ n |    1        2        5       10       20     |
507   | p   \  |                                             |
508   +--------+---------------------------------------------+
509   | 1.000% | 1.000%   1.990%   4.901%   9.562%  18.209%  |
510   |        |                                             |
511   | 0.100% | 0.100%   0.200%   0.499%   0.996%   1.981%  |
512   |        +--------+                                    |
513   | 0.010% | 0.010% | 0.020%   0.050%   0.100%   0.200%  |
514   |        |        +--------------------------+         |
515   | 0.001% | 0.001%   0.002%   0.005%   0.010% | 0.020%  |
516   +--------+-----------------------------------+---------+

518   +--+

520 Table 1: Probability of congestion across a path of n routers, each 521 with probability of congestion p.

523 The Quick-Start protocol finds the minimum of the rates each router 524 allows, solely considering itself in isolation. The protocol 525 precludes a combined view across the path. 
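The figures in Table 1 can be reproduced with a few lines of Python. This is a minimal sketch of the formula P = 1-(1-p)^n given above, which treats congestion at each router as independent (an assumption implied by the formula rather than stated in the text). The names path_congestion, table_rows and below_staircase are hypothetical, and the staircase test compares values rounded to the table's three decimal places, which is an assumption made here so that the result matches the staircase as drawn.

```python
def path_congestion(p: float, n: int) -> float:
    """Probability that at least one of n routers is congested.

    Assumes each router is congested independently with probability p.
    """
    return 1.0 - (1.0 - p) ** n


def table_rows(ps, ns):
    """Format P as a percentage with three decimals, one row per p."""
    return [[f"{100 * path_congestion(p, n):.3f}%" for n in ns] for p in ps]


def below_staircase(p: float, n: int, threshold_pct: float = 0.020) -> bool:
    """True if path congestion, rounded as in the table, is under the threshold.

    Assumption: rounding to three decimals of a percent before comparing.
    """
    return round(100 * path_congestion(p, n), 3) < threshold_pct


PS = (0.01, 0.001, 0.0001, 0.00001)   # per-router probabilities: 1% .. 0.001%
NS = (1, 2, 5, 10, 20)                # number of routers on the path

if __name__ == "__main__":
    for p, row in zip(PS, table_rows(PS, NS)):
        print(f"{100 * p:.3f}%", *row)
```

With a 0.020% path threshold, below_staircase accepts only the single-router case for p = 0.010% and the paths of up to 10 routers for p = 0.001%, which is the region under the staircase in the table.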
   A Quick-Start router denies Quick-Start requests once its local
   utilisation is above a threshold.  It should, but cannot, take
   account of whether the risk of whole-path congestion is above a
   threshold.  For instance, if the path congestion threshold for
   allowing Quick-Start requests were 0.020%, then Quick-Start
   requests should only be honoured for the cases below the staircase
   in Table 1.

   Given Quick-Start is more useful for long RTT paths (where TCP
   slow-start takes longer), the above problem is more likely to occur
   wherever Quick-Start is most useful.

   Incidentally, unfairness attacks aren't mentioned in S.9.5.  For
   instance, the router's response gapping defence against a type (1)
   attack, where a source increases the router's processing and state
   load, still reduces the number of successful responses to
   well-behaved nodes, which in turn gives them an incentive to fight
   fire with fire by increasing their rate of requests. [Gapping]
   Identity splitting (Section 3.1) is another form of unfairness
   attack.

   So fairness is an issue.  And, as a consequence, I believe there
   will be pressures and incentives for people to have differential
   treatment from schemes like Quick-Start (see Section 3.5 next).
   Instead of saying, "...there are fewer open issues with
   Quick-Start..." the I-D should say, "...Quick-Start is targeted at
   an experimental environment where the more intractable issues can
   be set aside".  The problem is that the chosen scenario has
   boundaries that the draft recognises will regularly be crossed in
   any practical network, even if it conforms to the assumptions most
   of the time.

   In Section 3.8 I will return to the issue of how a device (host or
   router) knows whether it is on a path where the assumptions are
   currently valid.

3.5.  Quick-Start and QoS

   S.A.4 says, "The Total Rate semantics makes it easier for routers
   to "allocate" the same rate to all connections."  One person's
   logic might say the obvious form of fairness is equality, so
   allocating the same rate to all is the desirable behaviour.
   Another person might say the obvious form of equality is to
   allocate rates in proportion to how valuable the customer is, or
   how much they pay.  As far as I know, no other IETF protocol
   allocates rates equally.  They either allocate the product of rate
   with RTT and the root of congestion equally (TCP-fairness), or they
   don't allocate rate (UDP).  The caution in Tussle in Cyberspace
   [Tussle] should be listened to at this point.  If the design
   choices behind a protocol intrinsically only give network operators
   the ability to treat their customers equally, they will break its
   architecture to be able to sell inequality.

   It is highly likely that differentiated Quick-Start responses will
   arise as a possible business model.  As it stands, Quick-Start
   cannot prevent users selfishly creating this differentiation
   themselves, because they can create multiple identities for
   themselves (aside from the fact that a stateless Quick-Start can't
   even distinguish between requests from the same identity).  The
   approach outlined in Section 3.1 allows service differentiation to
   be added scalably, as a local arrangement between the sender and
   its ingress network.  It also seems possible to make the approach
   recursive from one network to the next, in a similar way to that
   described in [re-fb].

   But QoS is not just about differentiation within a service.  We
   also have QoS between services.
   In the previous (00) draft, what is now S.9.3 "Quick-Start with
   QoS-enabled Traffic" said "...routers should be discouraged from
   granting Quick-Start requests for higher-priority traffic when this
   is likely to result in significant packet loss for lower-priority
   traffic."  This review originally argued that the whole point of
   some higher QoS classes may be to give priority to quick-start
   flows to the detriment of other classes.  However, as this sentence
   has been removed in the 01 draft, it is assumed that the authors
   have already realised this point.

3.6.  Conceptual model of under-utilisation

   Section 3.4 above on fairness raised concerns that the Quick-Start
   authors' conceptual model of under-utilisation perhaps did not take
   full account of the increased variance of instantaneous utilisation
   when TCP (and Quick-Start) traffic dominates an under-utilised
   environment.  This section explores whether there is also an
   assumption that under-utilisation will always be due to host
   interface limitations.

   One of the under-utilisation assumptions I had in my head while
   reading the paper was that any one host is generally able to
   over-fill available capacity, but that, given a high rate, the flow
   would end quickly.  In this case under-utilisation meant that
   generally one big flow like this would finish before the next
   arrival would hit any of the same interfaces on the path.  Surely
   Quick-Start should be designed to handle scenarios where a single
   host can saturate the network capacity?  It seems a feasible
   scenario, particularly in super-computing and data-centre type
   environments where Quick-Start might be useful. [Monthly]

   But then, in this 'one big flow at a time' scenario, if the network
   moves from under-utilisation to moderate utilisation (perhaps on a
   daily cycle), Quick-Start would have to handle the transition
   correctly.
   During the periods of under-utilisation (implying long gaps between
   these big flows) routers would be able to give each request all the
   remaining capacity.  But during any periods when load might move to
   medium-utilisation, new requests might arrive more often during the
   time that a current request was still being served, already filling
   the remaining capacity.

   QS could just do first-come-first-served on the full remaining
   capacity.  Or routers could maintain an average of the arrival rate
   of new requests relative to the amount of capacity available as
   each request arrived.  Then each router could cut down each new
   request to give predicted requests arriving during the flow a
   fairer share of the router's remaining capacity without having to
   push in using standard slow-start.  But usually, during
   under-utilisation, this sharing algorithm would give each new
   request the whole of the remaining capacity.

   The point being made is not that sharing of remaining capacity is a
   correct approach.  It certainly may be pragmatic to say it is more
   important to fill the pipe with known load than leave some spare
   for predicted load.  This review merely asks that the question be
   considered: whether predictive sharing can be `better' relative to
   first-come-first-served.  In turn, the absence of this question
   might imply that an implicit assumption needs to be stated: that
   flow rates are limited by host interface capacities before any
   interior network link capacities can be saturated by one flow.

   The example router algorithm in S.C has no notion of sharing
   capacity--it all goes to each request that arrives.  It was only
   the += operator in the last line of the algorithm in the Appendix
   (S.C) that made it clear that division of remaining capacity wasn't
   being assumed.
   By reverse engineering this algorithm, it was possible to guess
   that there was an assumption that host capacity was smaller than
   the network's, so meeting a request in full would still leave a lot
   of spare capacity for the next request.  This assumption needs to
   be brought out earlier; not at the end of one of the last
   appendices (and not by having to reverse-engineer an algorithm).
   This is a symptom of trying to avoid discussion of router
   algorithms in the draft (see Section 3.7 below).

   As well as medium utilisation, we might very occasionally even get
   high arrival rates of these big requests.  Then nearly every
   request would get zero capacity.  Senders would be repeating
   requests very frequently (cf. congestion collapse on CSMA/CD media
   such as shared Ethernet) to try to catch all the routers on the
   path just as capacity becomes available before someone else gets it
   (see [Gapping]).

   Indeed, in this case of large requests relative to remaining
   capacity, it seems Quick-Start would (probably unwittingly) move
   the Internet towards flow-by-flow capacity allocation rather than
   packet multiplexing.  Such radical thinking is not necessarily
   beyond consideration (for instance, Key and Massoulie make a good
   case for this mode of operation when transferring fixed volume
   objects such as data files [KM99]), but the under-utilisation
   assumption is clearly ambiguous as it stands.

   There is probably a lot more about possible router algorithms and
   the under-utilisation assumptions in [QS_eval].  I could not really
   make full sense of this I-D without having read [QS_eval], which
   implies there is too much reference out to [QS_eval] that ought to
   be included in the I-D.

3.7.  Router algorithms purely local policy?

   S.C: "Possible Router Algorithm" says, "we consider the algorithm a
   particular router uses to be a local policy decision".  Surely this
   approach is insufficient?  Surely this I-D needs to set some
   constraints on possible router algorithms, to enable interworking.
   For instance, one network might decide that the larger the
   difference between the request and available capacity, the smaller
   the response will be (to discourage senders from asking for more
   than they need), while another might decide the opposite policy:
   the more you ask, the more you get.  Unless the two co-ordinate,
   their two policies may fight with unpredictable results for both,
   or the outcome may depend on the order in which the two networks
   deal with a request.

   Alternatively, the I-D could say that experiments are needed (hence
   the need for experimental RFC status) in order to establish
   constraints required on router algorithms for interworking,
   robustness, fairness etc.

3.8.  Applicability statement

   The document doesn't ever mention that the sender is assumed to be
   acting in the interests of the network.  Throughout the history of
   the Internet this assumption has been made.  But that does not mean
   we are allowed to stop admitting that we are making this
   assumption.

   Indeed, the text at the start of the QS nonce section S.3.4 implies
   that the nonce is protecting the sender, rather than allowing the
   sender to protect the network if it chooses to.  It is essential to
   clearly highlight such a major security assumption in an
   applicability section, in a new sub-section of S.9.4 on misbehaving
   senders or in the security considerations section (preferably in
   all three).

   Note that, not only do Quick-Start senders have to be trusted, but
   also other senders who could claim their data had been authorised
   by a Quick-Start response when it hadn't (Section 3.3).
   So the usual allowance for this assumption--that TCP code is
   embedded in the operating system so is difficult to tamper
   with--loses its force.  One could argue that the receiver's TCP
   code is also embedded in the operating system, so why the need for
   the QS nonce?  The usual argument here is that a deployment
   scenario might be where only large download sites were trusted to
   use Quick-Start.  But because Quick-Start does not require state on
   routers, it seems hard for QS routers to determine whether large
   bursts of data were authorised by a previous successful request,
   let alone whether they are from a trusted source or not.

   S.10.3: "Possible Deployment Scenarios" implies that a controlled
   environment is an initial scenario.  It should be put more
   strongly: that a controlled environment is the _only_ scenario that
   could ever be contemplated while there is a need to trust the
   source.  The draft suggests GPRS as a possible scenario, and others
   have been proposed on the mailing list.  It should be clearly
   stated that this proposal is not, and never will be, applicable for
   general public networks without some means to ensure that the
   sender can be trusted.

   The draft also needs to say how devices will know whether they are
   currently part of such a controlled environment.  A router can be
   configured to know that it is part of a network where all requests
   to it are from controlled sources.  But how does a trustworthy
   Quick-Start sender know when it has roamed to a network where the
   trust assumption doesn't hold so it should give up sending
   Quick-Start requests, as routers will always ignore them?
   Otherwise, (trusted) hosts on public networks will be continually
   sending innocent Quick-Start requests, possibly unnecessarily tying
   up router resources further along the path where there is a
   controlled environment.

   Quick-Start needs a way for a router to say to a source "Don't
   bother with Quick-Start any more until you move to another
   network".  Currently, a source may have the Quick-Start request
   option removed, but it doesn't know whether the ingress router did
   that on behalf of all possible future connections on this network,
   or some router further downstream did it, implying only that
   particular path doesn't support Quick-Start.

   We need a way for devices to know the deployment status of what
   they are attached to.  Just because this is a general requirement
   covering many new capabilities (QoS, multicast, IPv6 etc), doesn't
   mean it shouldn't be mentioned each time it is needed.

   Finally, below are some specific points in the I-D where the
   trusted sender assumption is lurking implicitly in the background
   and needs to be brought out explicitly.

   o  S.9.2: The moderate added complexity at routers is only valid if
      senders can be trusted.

   o  S.9.4.3: Scenarios of collusion between networks are fairly
      unlikely within the already limited set of scenarios where the
      sender is trusted.  Collusion between ingress and egress of a
      network (intranet example in the draft) involves allowing a
      request that would have been rejected by an interior node.  If
      the sender is trusted to keep to its promises by its access
      network, why should its access network not be trusted?  So it is
      surely most reasonable to think about collusion between ingress
      and egress as extensions of the sender and receiver.  Given the
      scheme is vulnerable to senders cheating, it will be vulnerable
      to ingress networks cheating.

   o  S.9.4.3: The rather optimistic argument that QS router collusion
      is not as bad as ECN router collusion is false.
      Somehow avoiding congestion drops by a flow being falsely ECN
      capable is seen as better than starting at a higher rate than
      you should, thus causing drops to others as well as yourself.
      The loss to yourself is minimal compared to the loss to others
      as a whole.  This goes back to the lack of secure association
      between request and data.  However, again, the whole idea of
      worrying about colluding routers when the source can do what it
      likes seems moot.

   Discussion of the under-utilisation assumption could also be part
   of this applicability statement.  It is conceded that the
   Quick-Start draft recognises the difficulties caused by the
   interaction between routers allocating capacity to Quick-Start
   traffic and source transports allocating capacity to
   non-Quick-Start traffic under end to end control.  However, the
   rationale given for why it is good to mix these two forms of
   capacity allocation is essentially that Quick-Start is surrounded
   by a set of assumptions (trust and under-utilisation) that rule the
   question out of scope.

   So, in summary, ruling questions like fairness out (see above) is
   fine for a scoped experiment, but in real life we can't set such
   clear bounds on applicability.  The bounds of the experiment merge
   imperceptibly with the scenarios where the experimental assumptions
   break down.

4.  Suggested improvements; taking the Quick-Start design choices as
    given

4.1.  No control-data association

   Because of the requirement of no router flow state, there is no
   association between control messages (Quick-Start
   requests/responses) and subsequent data.  So, the router cannot
   know when or whether data has started to arrive as a result of an
   earlier request.
   The scheme has been worked out to avoid this being a problem, by:

   o  the use of conservative timers--each router conservatively sets
      aside capacity for responses given in the last few time-slots
      irrespective of whether downstream routers reduced the response
      later,

   o  conservative assumptions about all requests being new total
      requests, even if they are actually for additional rate (but see
      Section 5.1.2).

   However, if responses are delayed (perfectly normal for a
   best-efforts service) beyond the conservative router timeout, large
   flows of data could arrive at routers after they had timed out
   their memory of having allocated the capacity requested for the
   data.  After timing out this memory, any router may well have
   allocated more capacity to other requests.  Being stateless, it
   assumes its current load includes the data that is actually still
   approaching in the cloud of dust over the horizon, so to speak.
   The result could be catastrophic overload.

   One solution to this problem is for the timeout used by routers to
   be standardised, and for senders to use the same timeout.  So if a
   request isn't answered within the timeout, the sender re-sends the
   request and ignores the original response if it does arrive.

   There remains a problem where the data is delayed more than the
   request/response was delayed, that is, after the data leaves the
   sender but before arriving at a router N that has timed out the
   request.  This could happen because the data itself builds up
   queues in buffers upstream of N, so its arrival at N gets spread
   over a longer period, delaying some of it past the timeout of N.
   Or there could be external causes of increased data delay upstream
   of router N.
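
   The sender-side rule proposed above (re-send on timeout, then ignore
   the late response) can be sketched as follows.  This is only an
   illustration, not something taken from the Quick-Start draft: the
   class, the method names, the per-request identifier and the timeout
   value are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class QuickStartSender:
    """Hypothetical sender that ignores Quick-Start responses arriving
    later than an (assumed standardised) timeout."""
    timeout: float                                # seconds (assumed value)
    pending: dict = field(default_factory=dict)   # request id -> send time

    def send_request(self, request_id: int, now: float) -> None:
        # Remember when the request left, so lateness can be judged.
        self.pending[request_id] = now

    def expire(self, now: float) -> list:
        """Forget timed-out requests; return their ids so they can be re-sent."""
        late = [rid for rid, sent in self.pending.items()
                if now - sent >= self.timeout]
        for rid in late:
            del self.pending[rid]
        return late

    def accept_response(self, request_id: int, now: float) -> bool:
        """True only if the response matches a pending request and is in time."""
        sent = self.pending.pop(request_id, None)
        return sent is not None and now - sent < self.timeout
```

   A response to a request that has already expired (or was never
   sent) is simply rejected, so the sender never starts at the
   Quick-Start rate on the strength of an allocation that routers may
   already have forgotten.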

   Without completely changing the Quick-Start protocol, an improved,
   but still not completely safe, approach would be for the above
   proposed source timeout to be considerably less than the router
   timeout.

   The likelihood of these race conditions is perhaps the size of a
   pimple on a pimple, given Quick-Start is intended for an
   under-utilised environment in the first place.  It may be an
   acceptable risk to occasionally allow unexpected late arrivals of
   Quick-Start data to overflow a router without any more mechanism
   than these conservative timers, particularly given re-routes can
   already do similar damage.  However, we should not really introduce
   new protocols that can do damage using the excuse that it is no
   worse than the existing situation.  Otherwise, if we solve the
   re-route problem, we will still have the Quick-Start problem.

4.2.  Alternative rate reduced nonce

   S.3.4: The scenario where receivers may be evil, fickle beasts,
   whereas senders are _always_ trusted to act totally and utterly in
   the interests of the commonweal seems very contrived.  Certainly
   some senders might be trusted.  But Quick-Start requires absolutely
   _all_ senders that might use a Quick-Start network to be trusted.

   That being said, we promised to set aside such concerns in this
   section, so let us assume that some rationale for this state of
   affairs has been written in to the draft (Section 3.8).  Then a QS
   nonce can be used to check the receiver is honestly reporting the
   rate response that has managed to emerge from the network.

   The newly proposed scheme in draft 01 means that a dishonest
   receiver has a 25% chance of correctly guessing how to undo a
   reduction of one step in the rate response.

   The ECN nonce of RFC3540 [RFC3540] can get away with allowing a
   receiver to guess how to lie correctly 50% of the time, because the
   guess must be repeated for every CE-marked packet in a data stream.
   The chance, P, of repeatedly making a 50:50 guess correctly reduces
   exponentially with both the number of packets streamed, n, and the
   packet marking rate, p.  That is P = 2^{-np}.  So, the more havoc
   the receiver tries to wreak, the less likely it can remain
   undetected.

   The QS nonce has a different requirement, because it protects a
   control packet that will authorise a rate for a large number of
   future packets.  With the ECN nonce, as soon as a sender detected a
   receiver lying it could stop the transfer, and the longer the
   transfer continued the more challenges were issued to the receiver.
   But with the QS nonce, the challenge is only issued once at the
   start.  If the receiver guesses correctly just once, its gain is
   assured for a large number of future packets.

   If the receiver happens to guess right first time (1:4 chance), it
   will get 2x initial download speed.  It becomes more difficult to
   guess how to undo two or more rate reductions: the chance to get
   2^{r}x the rate is 1:2^{2r}.  But there is insufficient incentive
   to prevent receivers having a go sometimes, particularly if their
   identity is hidden (e.g. behind a large NAT), so the sender cannot
   record the receiver's reputation against its address, in case they
   encounter each other again later.

   A possible alternative QS nonce would work as follows.  A w-bit
   field is set aside for the QS nonce.  The bigger the width w is,
   the (exponentially) harder it is to brute force the undoing of a
   rate reduction.  The sender generates a random nonce, stores it and
   puts it in this field.
   A router that reduces the Rate Request field by r (that is,
   reducing the requested rate by a factor of 2^r) should hash the QS
   nonce r times, using a one way hash function, such as MD5 [RFC1321]
   or the Secure Hash Algorithm 1 (SHA-1) [SHA1].  Which hash function
   to use and its initialisation vector would have to be standardised
   for use in Quick-Start.  Then the receiver simply returns the
   resulting QS nonce to the sender with the rate response.  The
   sender knows what it originally requested and what the rate
   response is, so it can calculate the difference r.  The sender also
   knows what it originally set the QS nonce to, so it hashes that
   value r times and tests whether the result is the same as the QS
   nonce in the response.

   Vulnerability to processor exhaustion attacks can be avoided by
   limiting the queue of responses to be hashed on routers to bound
   the amount of processing.  This does of course still mean that
   attackers create unfairness in the shares of requests that get
   processed (Section 3.4).

   If it were decided to use a floating point representation for the
   rate request (Section 4.4), rather than just the exponent as
   currently defined, this nonce scheme would only be practical if it
   were applied just to the exponent field, ignoring the mantissa.
   This would at least prevent a dishonest receiver from undoing a
   rate reduction to more than 2x the network's desired response.

   But, to be honest, we need to see a rationale for why we should
   always trust senders, before we expend too much effort protecting
   against dishonest receivers.  Especially given that the alternative
   scheme outlined in Section 3.1 moves in the right direction to
   potentially protect against misbehaving senders _and_ receivers,
   without using any cryptography.

4.3.  Maximum rate request

   S.3.1: The max rate request of 1.3Gbps is inadequate for
   future-proofing.
   I know of projects already considering designs for burst-mode
   allocation of the capacity of optical access networks to single
   applications at higher rates than this.

   The Quick-Start protocol seems to go to great lengths to minimise
   the size of the IP Option field required, to the extent that the
   rate request granularity has to be coarse (making a guess by a
   dishonest receiver more likely to be correct--last para S.9.4.2).
   Given the smallest rate Quick-Start can request is 80kbps, worrying
   about keeping to a 32bit header seems a little obsessive and
   unnecessarily limiting [since writing this, draft 01 has changed to
   a 64-bit header, but the Rate Request field is still 4 bits].

   The argument (S.A.2) that TCP window scaling tops out at 1.07Gbps
   is not relevant, as the whole point of Quick-Start is to start to
   solve the scaling problems of TCP.  If we solve the TCP window
   scaling problem, we don't want to be left with the Quick-Start
   scaling problem.

4.4.  Alternate rate encoding

   The powers-of-two coding chosen is too coarse at the top end.
   Given there is no particular need to keep this protocol within a
   32bit option field in total, an obvious alternative is to use a
   mantissa and exponent representation.  The IEEE single-precision
   representation for floating-point numbers seems fairly appropriate
   as it has an 8 bit exponent, meaning it can scale the mantissa by
   up to 2^{255} ~ 6 x 10^{76}.  Nonetheless, note that however large
   the maximum number that can be represented, I would prefer to only
   specify normalised numbers in protocol headers, to avoid Y2K-style
   problems.  This is another reason for my preference for the
   approach outlined in Section 3.1.

   The IEEE float's exponent is preceded by one sign bit and followed
   by a 23 bit mantissa, requiring 32 bits in all.  We could either
   use it directly, or better (I believe) modify it for our purposes.
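
   For reference, the field layout just described can be inspected
   with a few lines of Python.  The sketch is illustrative only: the
   function name is hypothetical, and it simply unpacks a standard
   IEEE 754 single-precision value rather than any format defined for
   Quick-Start.

```python
import struct


def ieee_single_fields(x: float):
    """Split x, encoded as an IEEE 754 single-precision float, into
    (sign bit, 8-bit exponent field, 23-bit mantissa field)."""
    # Pack as a big-endian 32-bit float, then reread the same bytes
    # as an unsigned 32-bit integer to get at the raw bits.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                  # 1 bit
    exponent = (bits >> 23) & 0xFF     # 8 bits, stored with a bias
    mantissa = bits & 0x7FFFFF         # 23 bits
    return sign, exponent, mantissa
```

   Note that the exponent field is returned exactly as stored, that
   is, including the bias that IEEE 754 applies to it.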

   The mantissa is interpreted by treating it as if there is a binary
   point before the first bit of the mantissa which therefore
   represents a binary fraction f, 0 <= f < 1, with the first bit
   representing 1/2, the second bit 1/4 and so on.  Then 1 is always
   added to the resulting fraction.  So, with binary numbers e and f
   in the exponent and mantissa fields, the number represented is
   (1+f) x 2^{(e-b)}, where b is a bias explained below (we recommend
   b = 0).

   Because 1 <= (1+f) < 2 for all f, if two values have different
   exponents, they can always be compared solely by comparing their
   exponents.  Only if the exponents are the same do the mantissas
   need comparing.  This is an important property of the number
   representation, given Quick-Start must be able to compare two
   numbers fast (some other floating point representations contain
   redundancy so they can represent the same number with different
   combinations of mantissa and exponent).

   The IEEE single precision representation has some irrelevant
   features.  For instance, we wouldn't need the sign bit (set to 1
   means a negative mantissa).  It could be used to signal certain
   error conditions, but it would be more correct to have specific
   flags in Quick-Start if we need them.  Also, the IEEE exponent is a
   biased value with a bias of b = 127, meaning 127 is subtracted from
   the value before being used as the exponent of 2 (rather than using
   twos-complement representation of negative numbers).  This allows
   fractions to be represented using negative exponents, which we do
   not need.  It would be sensible to use a bias of b = 0 for
   Quick-Start.

   To reduce Quick-Start's coarseness at the high end of the exponent
   range, we could choose between 8, 16 or 24 bit precision of the
   mantissa rather than the IEEE 23 bit.  Even if we used 24 bits, the
   rate field as a whole would take up 32 bits.
   If we allowed the whole IP option to take up 64 bits, we could
   still also fit in a more robust nonce (see Section 4.2).

4.5.  Request refusal behaviour

   This is a picky point about over-constraining implementation
   choices.

   S.3.3, 2nd para: When a router wishes to deny a Quick-Start Request
   the QS I-D allows it to zero the Rate Request, QS TTL and QS nonce,
   rather than removing the Option altogether, which may be less
   efficient.  Instead, it would be more liberal to say the router
   should zero the Rate Request, and should set both the QS TTL and QS
   nonce to random values, which may be implemented by clearing the
   fields to zero for efficiency.  This is because the value zero for
   the QS TTL or the QS nonce is not a magic value that the sender
   tests for, so there is no need to use it.

   Alternatively, error codes could be placed in these secondary
   fields giving the reason for denying the request.  Perhaps these
   codes could be used to indicate to the sender whether it is
   attached to a network that doesn't support QS at all (Section 3.8)
   as opposed to a temporary refusal.

4.6.  Sender's DoS response

   S.9.4.1: If a sender gets multiple responses to a single request,
   it should stop processing them.

5.  Clarity and nits

5.1.  For clarity

5.1.1.  Qualify "incrementally deployable"

   S.13: "Conclusions" and S.A.5 "Alternate Responses to the Loss of a
   Quick-Start Packet" claim that Quick-Start is incrementally
   deployable.  Although this statement is not strictly incorrect, it
   greatly overstates the true position.  A network continues to work
   while Quick-Start is incrementally deployed.  But Quick-Start
   doesn't work at all until all routers on a path and both the source
   and destination have been upgraded.
   So, if the probability of a host being upgraded is P_h and that of
   a router being upgraded is P_r and the average network diameter is
   d routers, the probability of a Quick-Start request succeeding is
   P_h^2 x P_r^d.  So for example, even for a small network with d=3,
   when 10% of all routers and hosts have been upgraded the
   probability of Quick-Start succeeding (two hosts and three routers
   all upgraded) is (10%)^5 = 0.001%.  Even if half of all routers and
   hosts have been upgraded on such a small network, the chance of
   being able to start quickly is only 3%.  To remain able to start
   quickly while growing to a slightly larger network (with a diameter
   of say 6), still with half of all devices upgraded, the chance of
   success drops again to 0.4%.

   So there is a very late network effect, with the benefits only
   appearing once nearly everyone has upgraded.  This gives no-one any
   incentive to start the upgrade process.  Thus the only realistic
   deployment scenario is where a central administration upgrades
   nearly every router on the network at once (assuming each legacy
   router is capable of upgrade).  Thus the term incrementally
   deployable rather overplays the reality unless it is heavily
   qualified.  Perhaps `backward compatible' would be a better
   description.

   This analysis tends to answer the question of S.A.6 "Why Not
   Include More Functionality?".  If it will take this long to get any
   benefit, it seems sensible to make sure we add other benefits at
   the same time.

5.1.2.  Additional rate semantics unclear

   It is not clear what the semantics are intended to be for a request
   for additional rate.  Reverse engineering the text about gaming the
   system and such like in S.A.4, this is how it seems to work: the
   sender requests the total rate, Z, it wants irrespective of how
   much rate it is already sending, A.
   The router doesn't know or care how much rate is already being sent
   and treats the request as it would treat any completely new
   request.  So it responds giving the source X <= Z.  Then, if X > A
   the source increases its rate by X - A.  Otherwise the source
   reverts to standard congestion control.  Whether this is what is
   meant or not, it needs clarifying (the para in S.3.1 doesn't give
   the whole semantics and there is nothing in S.3.3 or S.A.4 about
   the semantics on a router).

   If the router is stateless (as is proposed), it cannot know how
   much rate is already being used for a flow.  So it cannot decide
   how much extra capacity is required for a request for additional
   capacity when only the total rate is requested.  So it cannot
   decide whether to allow the request, _unless_ it takes the
   conservative approach of assuming the current rate for every
   request is zero.  This might be what the draft intends, but it is
   not clear in either S.3.1 or S.A.4.

   As S.A.4 says, "For either of these alternatives, there would not
   be room to report the current sending rate in the Quick-Start
   Option using the current minimal format for the Quick-Start
   Request."  The I-D says this as a justification for only requesting
   a Total Rate (because an Additional Rate could be gamed without
   knowing the Current Rate as well).  So there seems to be an
   implication that somehow the router can work out the current rate
   in order to know whether to admit a flow, but it can't work out the
   current rate in order to check whether it is being gamed.  The only
   way I could resolve this apparently conflicting logic was to assume
   the router was being conservative (by always assuming the current
   rate is zero).
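
   The semantics reverse-engineered above can be made concrete with a
   small sketch.  It reflects only this review's reading of the draft,
   not a confirmed specification: the function names are hypothetical,
   and the router's choice of X is reduced to a simple cap on an
   assumed figure for spare capacity.

```python
def router_response(requested_total: float, spare_capacity: float) -> float:
    """Stateless router: treats every request as completely new (current
    rate assumed to be zero) and grants X <= Z, here simply capped by
    the spare capacity it believes it has."""
    return max(0.0, min(requested_total, spare_capacity))


def sender_new_rate(current_rate: float, granted_total: float):
    """Return the sender's new rate, or None to signal that it reverts
    to standard congestion control."""
    if granted_total > current_rate:
        # Increase by X - A, which brings the total sending rate to X.
        return current_rate + (granted_total - current_rate)
    return None
```

   For example, a sender already sending at A = 40 that requests
   Z = 100 and is granted X = 60 would move to a total rate of 60,
   whereas a sender already at A = 80 would gain nothing from the same
   response and would fall back to standard congestion control.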

   If we are trusting the sender, why don't we provide two fields for
   the sender to report its current rate and its requested rate, given
   we don't need to be stingy with header size in a high capacity
   network?

5.1.3. Other improvements for clarity

   o  (Picky) The first bullet of S.3.3 says a router approving a
      Quick-Start Request must decrement the QS TTL by one. It would be
      safer to say it should decrement the QS TTL by the same value
      that it decrements the IP TTL, which may not always be 1 even
      though RFC1812 currently says it should be (e.g. the now
      deprecated TTL scoping used on the MBone had a different TTL
      decrement at different types of border gateway--this is not to
      say that Quick-Start should work with multicast, just that there
      may be other operational practices that decrement the TTL by more
      than 1).

   o  S.3.4 "The QS Nonce", first sentence:
      "The QS Nonce gives the Quick-Start sender some protection against
      receivers lying about the value of the received Rate Request...."
      -->
      "The QS Nonce allows the Quick-Start sender to give the network
      some protection against receivers lying about the value of the
      received Rate Request..."

   o  Fig 5: It would make more sense to call the Rate Request field in
      the TCP Option the Rate Response field.

   o  S.9.1: A more complete calculation of the benefit of Quick-Start
      would help here (also necessary for the API in S.10.1). That is,
      in order to weigh the extra complexity against the benefit, we
      need a formula for the benefit relative to TCP flows that would
      last longer than slow-start as well as those that would finish
      within slow-start.

   o  S.9.4.4: "Misbehaving Middleboxes and the IP TTL" should surely
      not be within S.9.4 "Protection against Misbehaving Nodes" as it
      is a feature interaction (albeit due to irritating attempts by
      middleboxes to `improve' packets), not a malicious attack.

   o  Why is S.9.5 "Attacks on Quick-Start" not a sub-section of S.9.4
      "Protection against Misbehaving Nodes"? My suspicion is that this
      structure is due to an assumption that Quick-Start is somehow
      secure as long as Quick-Start senders are trusted, even if other
      senders on the same network aren't trusted (see Section 3.3).

   o  Other related work includes the class of approaches where an
      initial request is sent into a `scavenger' class (Singh _et
      al_ [lowTCP] give a useful set of references). Also Adams _et
      al_ [ARI05].

   o  S.A.7: "The Earlier QuickStart Nonce". This appendix might
      consider other related work that could have provided an
      alternative nonce mechanism, in order to give rationale for why
      they haven't been chosen. The closest example I can think of is
      Yang _et al_'s capability validation approach [DoScapab] and its
      references (Perrig etc.).

   o  (Picky) S.10.1: "Implementation issues...": As well as an
      additional timer, Quick-Start requires the source to hold the
      additional state of the TTL-Diff and QS nonce.

   o  S.A1.1: "ICMP" The ICMP message would need the source and
      destination port numbers to know where to demultiplex to at each
      host.

   o  Both S.A1.1 "ICMP" and S.A1.2 "RSVP" talk of a corresponding
      transport level to be used for the response. But these requests
      at the network layer don't imply any particular transport protocol
      -- unless it is encapsulated inside the ICMP or RSVP header.

   o  S.A.6. "...requires less input to routers than XCP..." [explain
      what this means?]

5.2. Nits

   (mostly found in draft 00, so some may have been fixed)

   Throughout:
   "an Rate Request"
   -->
   "a Rate Request"
   (vestige from an earlier change from "Initial Rate" to "Rate
   Request").

   Contents and S.4.7:

   "...Middle of Connection..."
   -->
   "...Middle of a Connection..."

   S.1 "Introduction", last sentence:
   "In contrast, routers would not use Quick-Start to get congestion
   information,..."
   -->
   "In contrast, routers would not use Quick-Start to give congestion
   information,..."

   S.2 "General Principles", Last bullet:
   "A second practical consideration is that packets could be
   dropped..."
   -->
   "A second practical consideration is that request packets could be
   dropped..."

   S.6(?) "Quick-Start in IP tunnels", last para of way #(1):
   "...then the egress node should remove..."
   -->
   "...then a Quick-Start aware egress node should remove..."

   S.6.2 last sentence says "Section 6.2 discusses..." [It must mean
   some other section].

   S.9.4.1 "Receivers Lying...", second para:
   the the
   -->
   then the

   S.10.2 "Implementation issues..." Last sentence:
   send
   -->
   sent

   S.12. Security considerations:
   "Sections 9.4 and 9.5 discuss..."
   -->
   "Sections 3.4, 9.4 and 9.5 discuss..."

   S.A.5 "Alternate Responses to the Loss of a Quick-Start Packet",
   final sentence:
   "However,..."
   -->
   "In other words,..."

   S.A.6. "...More Functionality?", Last sentence of first para:

   "...that the current congestion..."
   -->
   "...than the current congestion..."

   S.A.6. penultimate para:
   "...a initial sending rate..."
   -->
   "...an initial sending rate..."

   S.A.6. last para:
   "...a positive step of meeting..."
   -->
   "...a positive step towards meeting..."

   S.B. "Quick-Start with DCCP", last bullet numbered (1)
   "...to send more that twice as fast as the receiver has reported
   received..."
   -->
   "...to send more tha*n* twice as fast as the rate that the receiver
   has reported received..."

6. IANA Considerations

   None.

7. Security Considerations

   The whole of Section 3 reviews the security, fairness and policy
   issues of Quick-Start. Section 4.2 and Section 4.6 propose
   alternative mechanisms intended to improve Quick-Start security.

8. Conclusions

   The Quick-Start proposal requires that every sender on a network must
   be trusted to comply with the Quick-Start protocol. We argue this is
   an inevitable consequence of choosing to have senders ask routers to
   allocate their rate. Any protocol that expects routers to do rate
   allocation must also require every trust domain along the path to
   hold per-flow-state in order to police each sender. If instead
   routers merely tell the endpoints their utilisation, we believe it is
   possible for endpoints to allocate their own rate without having to
   trust them--using the policing framework of re-feedback [re-fb]
   (S.3.3.3).

   With endpoint-based rate allocation, the metric used would have to be
   a network impairment (i.e. pre-congestion) rather than rate.
   Although congestion is negligible in an under-utilised environment,
   it is possible to define `the risk of not being under-utilised'
   (`pre-congestion') as a form of congestion.

   Using pre-congestion as the metric makes endpoint-based rate
   allocation less deterministic. But we believe security
   considerations should be paramount--precise rate allocation is
   useless if senders can do what they want anyway. In other words,
   Quick-Start should take note of the classic dictum: `Build in
   security from the start.'

   Whether or not the authors agree about direction, the current draft
   certainly must state very clearly that it assumes all senders are
   completely trusted, especially given all the attention to malicious
   receivers, malicious networks and even malicious senders launching
   DoS attacks (presumably these senders are different ones from those
   trusted to comply with Quick-Start).

   It would make more sense for the Quick-Start specification to
   completely ignore security issues and assume a trusted environment,
   rather than shore up three walls of the castle while leaving the
   fourth unbuilt. This would at least allow us to move forward
   rapidly, to see what happens when flows start quickly in controlled
   experiments with trusted devices and under-utilised capacity.
   However, if that is the chosen way forward, it should be very clearly
   stated that it is a tactical step, not an architectural direction.

   This review also questions whether Quick-Start's under-utilisation
   assumption allows a distinct range of operation to be defined where
   issues like fairness can be ignored, or whether under-utilisation is
   just an extreme of a spectrum, making fairness an issue that must
   still be handled sometimes and in some places, because traffic
   variance will always blur the boundary of the under-utilisation
   assumption. The alternative pre-congestion-based approaches use the
   fact that pre-congestion sits on a spectrum where fairness is not an
   issue at one end, but becomes an issue as pre-congestion increasingly
   turns into congestion.

   When someone asks for directions, a favourite response in Ireland is,
   "If you wanted to get there, I wouldn't have started from here." The
   bulk of this review of Quick-Start [QS_ID] is in that spirit.
   However, rather than being so unhelpful, pains have been taken to
   explain why it would have been better to start from somewhere else.
   But also many suggestions for improving the protocol within its own
   terms of reference have been made, by temporarily setting aside
   disquiet with the underlying assumptions.

   This review argues that the recent nonce proposal gives insufficient
   protection against misbehaving receivers, and a new approach is
   suggested. Issues not related to security are also raised, including
   the possibility of a catastrophic overload if path delays are
   atypical. A solution to this is offered, as well as an improved
   encoding of the Rate Request field giving better scaling of range and
   precision. Many other more minor review comments are given.

9. Acknowledgements

   Thanks to Alessandro Salvatori and Louise Burness (BT) for numerous
   useful review comments mainly improving clarity and to Martin Koyabe
   (BT) for pointing out the standard IEEE float encoding. Also thanks
   go to Mark Handley (UCL) for pointing out that solutions to lack of
   trust in the sender come along less often than congestion control
   solutions, which therefore need to be built around the trust
   solutions we have. Also Sally Floyd and Pasi Sarolahti have helped
   by reviewing this review and clarifying the intentions of
   Quick-Start.

10. Informative References

   [+1b]      Xia, Y., Subramanian, L., Stoica, I., and S. Kalyanaraman,
              "One more bit is enough", ACM SIGCOMM CCR 35(4)37--48,
              August 2005.

   [ARI05]    Adams, J., Roberts, L., and A. IJsselmuiden, "Changing the
              Internet to Support Real-Time Content Supply from a Large
              Fraction of Broadband Residential Users", BT Technology
              Journal (BTTJ) 23(2), April 2005.

   [AntiECN]  Kunniyur, S., "AntiECN marking: A marking scheme for high
              bandwidth delay", Proc. IEEE ICC'03, May 2003.

   [CL-arch]  Briscoe, B., Eardley, P., Songhurst, D., Le Faucheur, F.,
              Charny, A., Babiarz, J., and K-H. Chan, "A framework for
              admission control over Diffserv using Pre-congestion
              notification",
              I-D draft-briscoe-tsvwg-cl-architecture-01.txt, July 2005.

   [DoScapab]
              Yang, X., Wetherall, D., and T. Anderson, "A DoS-limiting
              network architecture", ACM SIGCOMM CCR 35(4)241--252,
              August 2005.

   [ICCRG]    Handley, M., "Internet Congestion Control Research Group",
              IRTF working group charter, July 2005.

              (Proposal) (Continuously updated)

   [KM99]     Key, P. and L. Massoulie, "User policies in a network
              implementing congestion pricing", Proc. Workshop on
              Internet Service Quality and Economics, MIT,
              December 1999.

   [QS_ID]    Jain, A., Floyd, S., Allman, M., and P. Sarolahti, "Quick-
              Start for TCP and IP", draft-ietf-tsvwg-quickstart-01
              (work in progress), October 2005.

   [QS_eval]  Sarolahti, P., Allman, M., and S. Floyd, "Evaluating
              Quick-Start for TCP", February 2005.

              (under submission)

   [QSrvw]    Briscoe, B., "Review: Quick-Start for TCP and IP", BT
              Technical Report TR-CXR9-2005-007, November 2005.

   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
              April 1992.

   [RFC2208]  Mankin, A., Baker, F., Braden, B., Bradner, S., O'Dell,
              M., Romanow, A., Weinrib, A., and L. Zhang, "Resource
              ReSerVation Protocol (RSVP) Version 1 Applicability
              Statement Some Guidelines on Deployment", RFC 2208,
              September 1997.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, September 2001.

   [RFC3540]  Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
              Congestion Notification (ECN) Signaling with Nonces",
              RFC 3540, June 2003.

   [SHA1]     "Secure hash standard", FIPS, U.S. Department of Commerce,
              Washington, D.C. publication 180-1, April 1995.

   [Tussle]   Clark, D., Sollins, K., Wroclawski, J., and R. Braden,
              "Tussle in cyberspace: Defining tomorrow's Internet", ACM
              SIGCOMM CCR 32(4)347--356, October 2002.

   [XCP]      Katabi, D., Handley, M., and C. Rohrs, "Congestion control
              for high bandwidth-delay product networks", ACM SIGCOMM
              CCR 32(4)89--102, October 2002.

   [lowTCP]   Singh, M., Guha, S., and P. Francis, "Utilizing spare
              network bandwidth to improve TCP performance", ACM SIGCOMM
              2005 Work in Progress session, August 2005.

   [re-TCP]   Briscoe, B., Jacquet, A., and A. Salvatori, "Re-ECN:
              Adding Accountability for Causing Congestion to TCP/IP",
              I-D draft-briscoe-tsvwg-re-ecn-tcp-00.txt, October 2005.

   [re-fb]    Briscoe, B., Jacquet, A., Di Cairano-Gilfedder, C.,
              Salvatori, A., Soppera, A., and M. Koyabe, "Policing
              Congestion Response in an Internetwork Using Re-Feedback",
              ACM SIGCOMM CCR 35(4)277--288, August 2005.

Editorial Comments

   [Gapping]  This requires some way to prevent each source from
              frequently repeating requests (the draft only discusses
              how a router can enforce gapping for the aggregate of
              signals, not how to discriminate against particularly
              persistent sources. See also RSVP blockade-state or call-
              gapping in the PSTN).

   [Monthly]  For instance, we once had a request to supply a network
              for moving very large amounts of astronomy observation
              data across multiple countries just one day in each
              month. Here, one host alone could saturate the network
              path we were considering. This is a trivial example that
              would be made more realistic by thinking of multiple, but
              infrequent, competing requests like this across a
              network.

   [Wireless] Wireless connectivity can still be a problem, but local
              physical range constraints or link-local cryptographic
              authentication can solve this without a global PKI.

Author's Address

   Bob Briscoe
   BT & UCL
   B54/77, Adastral Park
   Martlesham Heath
   Ipswich IP5 3RE
   UK

   Phone: +44 1473 645196
   Email: bob.briscoe@bt.com
   URI:   http://www.cs.ucl.ac.uk/staff/B.Briscoe/

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2005). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.