Internet Engineering Task Force                             C. Griffiths
Internet-Draft                                         J. Livingood, Ed.
Intended status: Informational                                   Comcast
Expires: November 20, 2009                                     L. Popkin
                                                                   Pando
                                                          R. Woundy, Ed.
                                                                 Comcast
                                                                 Y. Yang
                                                                    Yale
                                                            May 19, 2009


           Comcast's ISP Experiences In a P4P Technical Trial
               draft-livingood-woundy-p4p-experiences-07

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on November 20, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document describes the experiences of Comcast, a large cable
   broadband Internet Service Provider (ISP) in the U.S., in a Proactive
   Network Provider Participation for P2P (P4P) technical trial in July
   2008. This trial used P4P iTracker technology being considered by
   the IETF, as part of the Application-Layer Traffic Optimization
   (ALTO) working group.

Table of Contents

   1.  Requirements Language
   2.  Introduction
   3.  High-Level Details
   4.  Differences Between the P4P iTrackers Used
     4.1.  P4P Fine Grain
     4.2.  P4P Coarse Grain
     4.3.  P4P Generic Weighted
   5.  High-Level Trial Results
     5.1.  Swarm Size
     5.2.  Impact on Download Speed
     5.3.  General Impacts on Upstream and Downstream Traffic and
           Other Interesting Data
   6.  Important Notes on Data Collected
   7.  Next Steps
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2. Introduction

   Comcast is a large broadband ISP, based in the U.S., serving the
   majority of its customers via cable modem technology. A trial was
   conducted in July 2008 with Pando Networks, Yale, and several ISP
   members of the P4P Working Group, which is part of the Distributed
   Computing Industry Association (DCIA). Comcast is a member of the
   DCIA's P4P Working Group, whose mission is to work with Internet
   service providers (ISPs), peer-to-peer (P2P) companies, and
   technology researchers to develop "P4P" mechanisms, such as so-called
   "iTrackers" (hereafter P4P iTrackers), that accelerate distribution
   of content and optimize utilization of ISP network resources. P4P
   iTrackers theoretically allow P2P networks to optimize traffic within
   each ISP, reducing the volume of data traversing the ISP's
   infrastructure and creating a more manageable flow of data. P4P
   iTrackers can also accelerate P2P downloads for end users.

   P4P's so-called "iTracker" technology was conceptually discussed with
   the IETF at the Peer-to-Peer Infrastructure (P2Pi) Workshop held on
   May 28, 2008, at the Massachusetts Institute of Technology (MIT), as
   documented in [I-D.p2pi-cooper-workshop-report]. This work was
   discussed in greater detail at the 72nd meeting of the IETF, in
   Dublin, Ireland, in the ALTO BoF on July 29, 2008.
   Due to interest from the community, Comcast shared P4P iTracker trial
   data at the 73rd meeting of the IETF, in Minneapolis, Minnesota, in
   the ALTO BoF on November 18, 2008. Since that time, discussion of
   P4P iTrackers and alternative technologies has continued among
   participants of the ALTO working group.

   The P4P iTracker trial was conducted, in cooperation with Pando,
   Yale, and three other P4P member ISPs, from July 2 to July 17, 2008.
   This was the first P4P iTracker trial over a cable broadband network.
   The trial used a Pando P2P client, and Pando distributed a special 21
   MB licensed video file in order to measure the effectiveness of P4P
   iTrackers. A primary objective of the trial was to measure the
   effects that increasing the localization of P2P swarms would have on
   P2P uploads, P2P downloads, and ISP networks, in comparison to normal
   P2P activity.

3. High-Level Details

   There were five different swarms for the content used in the trial.
   The first was a random P2P swarm, as a control group. The second,
   third, and fourth used different P4P iTrackers: Generic, Coarse
   Grained, and Fine Grained, all of which are described in Section 4.
   The fifth was a proprietary Pando mechanism. (The results of the
   fifth swarm, while very good, are not included here since our focus
   is on open standards and a mechanism which may be leveraged for the
   benefit of the entire community of P2P clients.) Comcast deployed a
   P4P iTracker server in its production network to support this trial,
   and configured multiple iTracker files to provide varying levels of
   localization to clients.

   In the trial itself, a P2P client begins a P2P session by querying a
   pTracker, which runs and manages the P2P network. The pTracker
   occasionally queries the P4P iTracker, which in this case was
   maintained by Comcast, the ISP. Other ISPs either managed their own
   P4P iTracker or used Pando or Yale to host their P4P iTracker files.
   The P4P iTracker returns network topology information to the
   pTracker, which then communicates with P2P clients, in order to
   enable P2P clients to make network-aware decisions regarding peers.

   The Pando client was enabled to capture extended logging, when the
   version of the client included support for it. The extended logging
   included the source and destination IP address of all P2P transfers,
   the number of bytes transferred, and the start and end timestamps.
   This information gives a precise measurement of the data flow in the
   network, allowing computation of data transfer volumes as well as
   data flow rates at each point in time. With standard logging, Pando
   captured the start and completion times of every download, as well as
   the average transfer rate observed by the client for the download.

   Pando served the data from an origin server external to Comcast's
   network. This server served about 10 copies of the file, after which
   all transfers (about 1 million downloads across all ISPs) were
   performed purely via P2P.

   The P2P clients in the trial start with tracker-provided peers, then
   use peer exchange to discover additional peers. Thus, the initial
   peers were provided according to P4P iTracker guidance (90% guidance
   based on P4P iTracker topology, and 10% random guidance), then later
   peers discover the entire swarm via either additional announces or
   peer exchange.

4. Differences Between the P4P iTrackers Used

   Given the size of the Comcast network, we felt that truly evaluating
   the P4P iTracker application required testing various network
   topologies that reflected that network and would help gauge the level
   of effort and design requirements necessary to get correct
   statistical data out of the trial.
   In all cases, P4P iTrackers were configured with automation in mind,
   so that any successful P4P iTracker configuration would be
   automatically updating, rather than manually configured on an ongoing
   basis. All P4P iTrackers were hosted on the same small server, and
   it appeared to be relatively easy and inexpensive to scale up a P4P
   iTracker infrastructure should P4P iTracker-like mechanisms become
   standardized and widely adopted.

4.1. P4P Fine Grain

   The Fine Grain topology was the first and most complex P4P iTracker
   that we built for this trial. It was a detailed mapping of Comcast
   backbone-connected network Autonomous System Numbers (ASN) to IP
   Aggregates which were weighted based on priority and distance from
   each other. Included in this design was a prioritization of all Peer
   and Internet transit connected ASNs to the Comcast backbone to ensure
   that P4P traffic would prefer settlement free and lower cost networks
   first, and then more expensive transit links. This attempted to
   optimize and lower transit costs associated with this traffic. We
   then took the additional step of detailing each ASN and IP aggregate
   into IP subnets down to Optical Transport Nodes (OTN) where all Cable
   Modem Termination Systems (CMTS) reside. This design gave a highly
   localized and detailed description of the Comcast network for the
   iTracker to disseminate. This design defined 1,182 P4P iTracker node
   identifiers, and resulted in a 107,357 line configuration file.

   This P4P iTracker was obviously the most time-consuming to create and
   the most complex to maintain. Trial results indicated that this
   level of localization was too high, and was less effective compared
   to lower levels of localization.

4.2. P4P Coarse Grain

   Given the level of detail in the Fine Grain design, it was important
   that we also enable a high-level design which still used priority and
   weighting mechanisms for the Comcast backbone and transit links. The
   Coarse Grain design was a limited or summarized version of the Fine
   Grain design, which used the ASN to IP Aggregate and weighted data
   for transit links, but removed all additional localization data.
   This ensured we would get similar data sets from the Fine Grain
   design, but without the more detailed localization of each of the
   networks off of the Comcast backbone. This design defined 22 P4P
   iTracker node identifiers, and resulted in a 998 line configuration
   file.

   From an overall cost, complexity, risk, and effectiveness standpoint,
   this was judged to be the optimal P4P iTracker for Comcast.
   Importantly, this did not require revealing the complex, internal
   network topology that the Fine Grain did. Updates to this iTracker
   were also far simpler to automate, which will better ensure that it
   is accurate over time, and keeps administrative overhead relatively
   low. However, the differences, costs, and benefits of Coarse Grain
   and Generic Weighted (see below) likely merit further study.

4.3. P4P Generic Weighted

   The Generic Weighted design was a copy of the Coarse Grained design
   but instead of using ISP-designated priority and weights, all weights
   were defaulted to pre-determined parameters that the Yale team had
   designed. All other data was replicated from the Coarse Grain
   design. Providing the information necessary to support the Generic
   Weighted iTracker was roughly the same as for Coarse Grain.

5. High-Level Trial Results

   Trial data was collected by Pando Networks and Yale University, and
   raw trial results were shared with Comcast and all of the other ISPs
   involved in the trial.
   Analysis of the raw results was performed by Pando and Yale, and
   these organizations delivered an analysis of the P4P iTracker trial.
   Using the raw data, Comcast also analyzed the trial results.
   Furthermore, the raw trial results for Comcast were shared with
   NetForecast, Inc., which performed an independent analysis of the
   trial for Comcast.

5.1. Swarm Size

   During the trial, downloads peaked at 24,728 per day, per swarm, or
   nearly 124,000 per day for all five swarms. The swarm size peaked at
   11,703 peers per swarm, or nearly 57,000 peers for all five swarms.
   We observed a comparable number of downloads in each of the five
   swarms.

   For each swarm, Table 1 below gives the number of downloaders per
   swarm from Comcast that finished downloading, and the number of
   downloaders from Comcast that canceled downloading before finishing.

   Characteristics of P4P iTracker Swarms:

   +-----------+-----------+---------------+------------+--------------+
   | Swarm     | Completed | Cancellations | Total      | Cancellation |
   |           | Downloads |               | Attempts   | Rate         |
   +-----------+-----------+---------------+------------+--------------+
   | Random    | 2,719     | 89            | 2,808      | 3.17%        |
   | (Control) |           |               |            |              |
   | --------- | --------- | -----------   | ---------- | -----------  |
   | P4P Fine  | 2,846     | 64            | 2,910      | 2.20%        |
   | Grained   |           |               |            |              |
   | --------- | --------- | -----------   | ---------- | -----------  |
   | P4P       | 2,775     | 63            | 2,838      | 2.22%        |
   | Generic   |           |               |            |              |
   | Weight    |           |               |            |              |
   | --------- | --------- | -----------   | ---------- | -----------  |
   | P4P       | 2,886     | 52            | 2,938      | 1.77%        |
   | Coarse    |           |               |            |              |
   | Grained   |           |               |            |              |
   +-----------+-----------+---------------+------------+--------------+

           Table 1: Per-Swarm Size and Cancellation Rates

5.2. Impact on Download Speed

   The results of the trial indicated that P4P iTrackers can improve the
   speed of downloads to P2P clients.
   In addition, P4P iTrackers were effective in localizing P2P traffic
   within the Comcast network.

   Impact of P4P iTrackers on Downloads:

   +--------------+------------+------------+-------------+------------+
   | Swarm        | Global Avg | Change     | Comcast Avg | Change     |
   |              | bps        |            | bps         |            |
   +--------------+------------+------------+-------------+------------+
   | Random       | 144,045    | n/a        | 254,671     | n/a        |
   | (Control)    |            |            |             |            |
   | ----------   | ---------- | ---------- | ----------  | ---------- |
   | P4P Fine     | 162,344    | +13%       | 402,043     | +57%       |
   | Grained      |            |            |             |            |
   | ----------   | ---------- | ---------- | ----------  | ---------- |
   | P4P Generic  | 163,205    | +13%       | 463,782     | +82%       |
   | Weight       |            |            |             |            |
   | ----------   | ---------- | ---------- | ----------  | ---------- |
   | P4P Coarse   | 166,273    | +15%       | 471,218     | +85%       |
   | Grained      |            |            |             |            |
   +--------------+------------+------------+-------------+------------+

        Table 2: Per-Swarm Global and Comcast Download Speeds

5.3. General Impacts on Upstream and Downstream Traffic and Other
     Interesting Data

   An analysis of the effects of P4P iTracker use on upstream
   utilization and Internet transit was also interesting. It did not
   appear that P4P iTrackers significantly increased upstream
   utilization in the Comcast access network; in essence uploading was
   already occurring no matter what and a P4P iTracker in and of itself
   did not appear to materially increase uploading for this specific,
   licensed content. (A P4P iTracker is not intended as a solution for
   potential network congestion.) Upload volume in the access network
   was 143,236 MB for Random and 143,143 MB for P4P Generic Weight,
   while P4P Coarse Grained was 139,669 MB. We also observed that using
   a P4P iTracker reduced outgoing Internet traffic by an average of 34%
   at peering points. Outgoing traffic at peering points was 134,219 MB
   for Random and 91,979 MB for P4P Generic Weight, while P4P Coarse
   Grained was 86,652 MB.

   In terms of downstream utilization, we observed that the use of a P4P
   iTracker reduced incoming Internet traffic by an average of 80% at
   peering points. Random was 47,013 MB, P4P Generic Weight was 8,610
   MB, and P4P Coarse Grained was 7,764 MB. However, we did notice that
   download activity in the Comcast access network increased somewhat,
   from 56,030 MB for Random, to 59,765 MB for P4P Generic Weight, and
   60,781 MB for P4P Coarse Grained. Note that for each swarm, the
   number of downloaded bytes according to logging reports is very close
   to the number of downloaders multiplied by file size, but the two do
   not exactly match due to log report errors and duplicated chunks.
   One factor contributing to the differences in access network download
   activity is that different swarms have different numbers of
   downloaders, due to random variations during uniform random
   assignment of downloaders to swarms (see Table 1). One interesting
   observation is that Random has a higher cancellation rate (3.17%)
   than the guided swarms (1.77%-2.22%). Whether guided swarms achieve
   a lower cancellation rate is an interesting issue for future
   research.

6. Important Notes on Data Collected

   Raw data is presented in this document. We did not normalize traffic
   volume data (e.g., upload and download) by the number of downloads in
   order to preserve this underlying raw data.

   We also recommend that readers not focus too much on the absolute
   numbers, such as bytes downloaded from internal sources and bytes
   downloaded from external sources. Instead, we recommend readers
   focus on ratios such as the percentage of bytes downloaded that came
   from internal sources in each swarm.
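
   As an illustration of the ratio-based reading recommended above, the
   following Python sketch derives an internal-source share for three
   swarms from the downstream volumes reported in Section 5.3. It rests
   on an assumption that the trial data does not state outright: that
   incoming traffic at peering points approximates bytes downloaded from
   external sources, and that access network download activity
   approximates total bytes downloaded. The names SWARMS,
   internal_share, and internal_shares are hypothetical and exist only
   for this example.

```python
# Downstream volumes in MB, copied from Section 5.3 of this document.
# ASSUMPTION (not stated in the trial data): incoming traffic at peering
# points stands in for bytes downloaded from external sources, and access
# network download activity stands in for total bytes downloaded.
SWARMS = {
    "Random (Control)": {"external_mb": 47_013, "total_mb": 56_030},
    "P4P Generic Weight": {"external_mb": 8_610, "total_mb": 59_765},
    "P4P Coarse Grained": {"external_mb": 7_764, "total_mb": 60_781},
}


def internal_share(external_mb: int, total_mb: int) -> float:
    """Return the fraction of downloaded bytes that came from internal sources."""
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    return (total_mb - external_mb) / total_mb


def internal_shares(swarms: dict) -> dict:
    """Map each swarm name to its internal-source share."""
    return {name: internal_share(**volumes) for name, volumes in swarms.items()}


if __name__ == "__main__":
    for name, share in internal_shares(SWARMS).items():
        # Print each share as a percentage with one decimal place.
        print(f"{name:<20} {share:6.1%}")
```

   Under these assumptions, the internal share comes out at roughly 16%
   for Random and roughly 86% to 87% for the two guided swarms, which is
   the kind of shift that the ratios are meant to expose regardless of
   small differences in download counts.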
   With this focus on ratios, the small random variation between the
   number of downloads in each swarm does not distract readers from
   important metrics like shifting traffic from external to internal
   sources, among other things.

   We also wish to note that the data was collected from a sample of the
   total swarm. Specifically, there were some peers running older
   versions of the Pando client that did not implement the extended
   transfer logging. For those nodes, which participated in the swarms
   but did not report their data transfers, we have only download
   counts. The result of this is that, for example, the download counts
   generated from the standard logging are a bit higher than the
   download counts generated by the extended logging. That being said,
   over 90% of downloads were by peers running the newer software, which
   we believe shows that the transfer records are highly representative
   of the total data flow.

   As for which analysis used standard logging and which used extended
   logging, all of the data flow analysis was performed using the
   extended logging. Pando's download counts and performance numbers
   were generated via standard logging (i.e., all peers report download
   complete/cancel, data volumes, and measured download speed on the
   client). Yale's download counts and performance numbers were derived
   via extended logging (e.g., by summing the transfer records, counting
   IP addresses reported, etc.).

   One benefit of having two data sources is that we can compare the
   two. In this case, the two approaches both reported comparable
   impacts.

7. Next Steps

   One objective of this document is to share with the IETF community
   the results of one P4P iTracker trial in a large broadband network,
   given skepticism regarding the benefits to P2P users as well as to
   ISPs. From the perspective of P2P users, P4P iTrackers potentially
   deliver faster P2P downloads.
   At the same time, ISPs can increase the localization of swarms,
   enabling them to reduce bytes flowing over transit points, while also
   delivering an optimized P2P experience to customers. However, an
   internal analysis of varying levels of P4P iTracker adoption by ISPs
   leads us to believe that, while P4P iTracker-type mechanisms are
   valuable on a single ISP basis, the value of P4P iTrackers increases
   dramatically as many ISPs choose to deploy them.

   We believe these results can inform the technical discussion in the
   IETF over how to use P4P iTracker mechanisms. Should such a
   mechanism be standardized, the use of ISP-provided P4P iTrackers
   should probably be an opt-in feature for P2P users, or at least a
   feature of which they are explicitly aware and which has been enabled
   by default in a particular P2P client. In this way, P2P users could
   opt in, either explicitly or by their choice of P2P client, to use
   the P4P iTracker to improve performance, which benefits both the user
   and the ISP at the same time. Importantly in terms of privacy, the
   P4P iTracker makes available only network topology information, and
   would not in its current form enable an ISP, via the P4P iTracker, to
   determine which P2P clients were downloading which content.

   It is also possible that a P4P iTracker type of mechanism, in
   combination with a P2P cache, could further improve P2P download
   performance, which merits further study. In addition, this was a
   limited trial that, while very promising, indicates a need for
   additional technical investigation and trial work. Such follow-up
   study should explore the effects of P4P iTrackers when more P2P
   client software variants are involved, with larger swarms, and with
   additional and more technically diverse content (file size, file
   type, duration of content, etc.).

8. Security Considerations

   There are no security considerations in this document.

9. IANA Considerations

   There are no IANA considerations in this document.

10. Acknowledgements

   The authors wish to acknowledge the hard work of all of the P4P
   working group members, and specifically the focused efforts of the
   teams at both Pando and Yale for the trial itself. Finally, the
   authors recognize and appreciate Peter Sevcik and John Bartlett, of
   NetForecast, Inc., for their valued independent analysis of the trial
   results.

11. References

11.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

11.2. Informative References

   [I-D.p2pi-cooper-workshop-report]
              Peterson, J. and A. Cooper, "Report from the IETF workshop
              on P2P Infrastructure, May 28, 2008",
              draft-p2pi-cooper-workshop-report-01 (work in progress),
              February 2009.

   [SIGCOMM]  Xie, H., Yang, Y., Krishnamurthy, A., Liu, Y., and A.
              Silberschatz, "ACM SIGCOMM 2008 - P4P: Provider Portal for
              Applications", Association for Computing Machinery SIGCOMM
              2008 Proceedings, August 2008.

Authors' Addresses

   Chris Griffiths
   Comcast Cable Communications
   One Comcast Center
   1701 John F. Kennedy Boulevard
   Philadelphia, PA 19103
   US

   Email: chris_griffiths@cable.comcast.com
   URI:   http://www.comcast.com

   Jason Livingood (editor)
   Comcast Cable Communications
   One Comcast Center
   1701 John F. Kennedy Boulevard
   Philadelphia, PA 19103
   US

   Email: jason_livingood@cable.comcast.com
   URI:   http://www.comcast.com

   Laird Popkin
   Pando Networks
   520 Broadway Street
   10th Floor
   New York, NY 10012
   US

   Email: laird@pando.com
   URI:   http://www.pando.com

   Richard Woundy (editor)
   Comcast Cable Communications
   27 Industrial Avenue
   Chelmsford, MA 01824
   US

   Email: richard_woundy@cable.comcast.com
   URI:   http://www.comcast.com

   Richard Yang
   Yale University
   51 Prospect Street
   New Haven, CT 06520
   US

   Email: yry@cs.yale.edu
   URI:   http://www.cs.yale.edu