Internet Engineering Task Force                             C. Griffiths
Internet-Draft                                         J. Livingood, Ed.
Intended status: Informational                                   Comcast
Expires: October 29, 2009                                      L. Popkin
                                                                   Pando
                                                          R. Woundy, Ed.
                                                                 Comcast
                                                                 Y. Yang
                                                                    Yale
                                                          April 27, 2009

           Comcast's ISP Experiences In a P4P Technical Trial
                draft-livingood-woundy-p4p-experiences-04

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on October 29, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.
Abstract

   This document describes the experiences of Comcast, a large cable
   broadband Internet Service Provider (ISP) in the U.S., in a
   Proactive Network Provider Participation for P2P (P4P) technical
   trial in July 2008.  This trial used iTracker technology being
   considered by the IETF, as part of the Application Layer Transport
   Optimization (ALTO) working group.

Table of Contents

   1.  Requirements Language
   2.  Introduction
   3.  High-Level Details
   4.  High-Level Trial Results
     4.1.  Swarm Size
     4.2.  Impact on Downloads, or Downstream Traffic
     4.3.  Other Impacts and Interesting Data
   5.  Differences Between the P4P iTrackers Used
     5.1.  P4P Fine Grain
     5.2.  P4P Coarse Grain
     5.3.  P4P Generic Weighted
   6.  Important Notes on Data Collected
   7.  Next Steps
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgements
   11. Normative References
   Authors' Addresses

1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119
   [RFC2119].

2.  Introduction
   Comcast is a large broadband ISP, based in the U.S., serving the
   majority of its customers via cable modem technology.  A trial was
   conducted in July 2008 with Pando Networks, Yale, and several ISP
   members of the P4P Working Group, which is part of the Distributed
   Computing Industry Association (DCIA).  Comcast is a member of the
   P4P Working Group, whose mission is to work with Internet service
   providers (ISPs), peer-to-peer (P2P) companies, and technology
   researchers to develop "P4P" mechanisms that accelerate the
   distribution of content and optimize the utilization of ISP network
   resources.  P4P theoretically allows P2P networks to optimize
   traffic within each ISP, reducing the volume of data traversing the
   ISP's infrastructure and creating a more manageable flow of data.
   P4P can also accelerate P2P downloads for end users.

   P4P's so-called "iTracker" technology was conceptually discussed
   with the IETF at the Peer-to-Peer Infrastructure (P2Pi) Workshop
   held on May 22, 2008, at the Massachusetts Institute of Technology
   (MIT).  This work was discussed in greater detail at the 72nd
   meeting of the IETF, in Dublin, Ireland, in the ALTO BoF on July
   29, 2008.  Due to interest from the community, Comcast shared P4P
   trial data at the 73rd meeting of the IETF, in Minneapolis,
   Minnesota, in the ALTO BoF on November 18, 2008.  Since that time,
   discussion of iTrackers and alternative technologies has continued
   among participants of the ALTO working group.

   The P4P trial was conducted, in cooperation with Pando, Yale, and
   three other P4P member ISPs, from July 2 to July 17, 2008.  This
   was the first P4P trial over a cable broadband network.  The trial
   used a Pando P2P client, and Pando distributed a special 21 MB
   licensed video file in order to measure the effectiveness of P4P
   iTrackers.
   A primary objective of the trial was to measure the effects that
   increasing the localization of P2P swarms would have on P2P
   uploads, P2P downloads, and ISP networks, in comparison to normal
   P2P activity.

3.  High-Level Details

   There were five different swarms for the content used in the trial.
   The first was a random P2P swarm, as a control group.  The second,
   third, and fourth used different P4P iTrackers: Generic, Coarse
   Grained, and Fine Grained.  The fifth was a proprietary Pando
   mechanism.  (The results of the fifth swarm, while very good, are
   not included here, since our focus is on open standards and a
   mechanism which may be leveraged for the benefit of the entire
   community of P2P clients.)  Comcast deployed an iTracker server in
   its production network to support this trial, and configured
   multiple iTracker files to provide varying levels of localization
   to clients.

   In the trial itself, a P2P client began a P2P session by querying a
   pTracker, which runs and manages the P2P network.  The pTracker
   occasionally queried the iTracker, which in this case was
   maintained by Comcast, the ISP.  Other ISPs either managed their
   own iTracker or used Pando or Yale to host their iTracker files.
   The iTracker returns network topology information to the pTracker,
   which then communicates with P2P clients, enabling the P2P clients
   to make network-aware decisions regarding peers.

   The Pando client captured extended logging when the version of the
   client included support for it.  The extended logging included the
   source and destination IP addresses of all P2P transfers, the
   number of bytes transferred, and the start and end timestamps.
   This information gives a precise measurement of the data flow in
   the network, allowing computation of data transfer volumes as well
   as data flow rates at each point in time.
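   As an illustration of the kind of computation such extended logs
   permit, the sketch below aggregates transfer records into a total
   volume and an instantaneous flow rate.  The record fields (source,
   destination, bytes, start/end timestamps) are taken from the
   description above; the function names and the uniform-rate
   assumption for each transfer are our own simplifications, not part
   of the trial's actual tooling.

```python
from dataclasses import dataclass

@dataclass
class Transfer:
    src: str      # source IP of the P2P transfer
    dst: str      # destination IP
    bytes: int    # bytes transferred
    start: float  # start timestamp (seconds)
    end: float    # end timestamp (seconds)

def total_volume(records):
    """Total bytes moved across all logged transfers."""
    return sum(r.bytes for r in records)

def flow_rate_at(records, t):
    """Aggregate flow rate (bytes/sec) at time t, assuming each
    transfer proceeds at a uniform rate between its timestamps."""
    rate = 0.0
    for r in records:
        if r.start <= t < r.end:
            rate += r.bytes / (r.end - r.start)
    return rate

records = [
    Transfer("10.0.0.1", "10.0.0.2", 1_000_000, 0.0, 10.0),
    Transfer("10.0.0.3", "10.0.0.2", 2_000_000, 5.0, 15.0),
]
print(total_volume(records))       # 3000000
print(flow_rate_at(records, 7.0))  # 300000.0 (both transfers active)
```

   Summing such per-transfer rates over all records is what allows the
   per-point-in-time flow measurements described above.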
   With standard logging, Pando captured the start and completion
   times of every download, as well as the average transfer rate
   observed by the client for the download.

   Pando served the data from an origin server external to Comcast's
   network.  This server served about 10 copies of the file, after
   which all transfers (about 1 million downloads across all ISPs)
   were performed purely via P2P.

   The P2P clients in the trial started with tracker-provided peers,
   then used peer exchange to discover additional peers.  Thus, the
   initial peers were provided according to P4P guidance (90% guidance
   based on P4P topology, and 10% random guidance); later, peers
   discovered the entire swarm via either additional announces or peer
   exchange.

4.  High-Level Trial Results

   Trial data was collected by Pando Networks and Yale University, and
   raw trial results were shared with Comcast and all of the other
   ISPs involved in the trial.  Analysis of the raw results was
   performed by Pando and Yale, and these organizations delivered an
   analysis of the P4P trial.  Using the raw data, Comcast also
   analyzed the trial results.  Furthermore, the raw trial results for
   Comcast were shared with NetForecast, Inc., which performed an
   independent analysis of the trial for Comcast.

4.1.  Swarm Size

   During the trial, downloads peaked at 24,728 per day, per swarm, or
   nearly 124,000 per day for all five swarms.  The swarm size peaked
   at 11,703 peers per swarm, or nearly 57,000 peers for all five
   swarms.  We observed a comparable number of downloads in each of
   the five swarms.

   For each swarm, Table 1 below gives the number of downloaders from
   Comcast that finished downloading, and the number of downloaders
   from Comcast that canceled downloading before finishing.
   Characteristics of P4P Swarms:

   +-----------+-----------+---------------+----------+--------------+
   | Swarm     | Completed | Cancellations | Total    | Cancellation |
   |           | Downloads |               | Attempts | Rate         |
   +-----------+-----------+---------------+----------+--------------+
   | Random    |     2,719 |            89 |    2,808 |        3.17% |
   | (Control) |           |               |          |              |
   +-----------+-----------+---------------+----------+--------------+
   | P4P Fine  |     2,846 |            64 |    2,910 |        2.20% |
   | Grained   |           |               |          |              |
   +-----------+-----------+---------------+----------+--------------+
   | P4P       |     2,775 |            63 |    2,838 |        2.22% |
   | Generic   |           |               |          |              |
   | Weight    |           |               |          |              |
   +-----------+-----------+---------------+----------+--------------+
   | P4P       |     2,886 |            52 |    2,938 |        1.77% |
   | Coarse    |           |               |          |              |
   | Grained   |           |               |          |              |
   +-----------+-----------+---------------+----------+--------------+

            Table 1: Per-Swarm Size and Cancellation Rates

4.2.  Impact on Downloads, or Downstream Traffic

   The results of the trial indicated that P4P can improve the speed
   of downloads to P2P clients.  In addition, P4P was effective in
   localizing P2P traffic within the Comcast network.

   However, we did notice that download activity in our access network
   increased somewhat, from 56,030 MB for Random, to 59,765 MB for P4P
   Generic Weight, and 60,781 MB for P4P Coarse Grained.  Note that
   for each swarm, the number of downloaded bytes our logs report is
   very close to the number of downloaders multiplied by the file
   size; they do not match exactly due to log report errors and
   duplicated chunks.  One factor contributing to the differences in
   access network download activity is that different swarms had
   different numbers of downloaders, due to random variations during
   the uniform random assignment of downloaders to swarms (see Table
   1).
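   The cancellation rates in Table 1 follow directly from the raw
   counts; a quick sketch (the helper name is our own):

```python
def cancellation_rate(completed, canceled):
    """Cancellation rate as a percentage of total download attempts."""
    attempts = completed + canceled
    return round(100.0 * canceled / attempts, 2)

# Per-swarm (completed, canceled) counts from Table 1.
swarms = {
    "Random (Control)":   (2719, 89),
    "P4P Fine Grained":   (2846, 64),
    "P4P Generic Weight": (2775, 63),
    "P4P Coarse Grained": (2886, 52),
}
for name, (done, canceled) in swarms.items():
    print(f"{name}: {cancellation_rate(done, canceled)}%")
# prints 3.17, 2.2, 2.22, and 1.77, matching the table
```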
   One interesting observation is that Random had a higher
   cancellation rate (3.17%) than the guided swarms (1.77% to 2.22%).
   Whether guided swarms achieve a lower cancellation rate is an
   interesting issue for future investigation.

   Impact of P4P on Downloads:

   +--------------+-------------+--------+--------------+--------+
   | Swarm        | Global Avg  | Change | Comcast Avg  | Change |
   |              | bps         |        | bps          |        |
   +--------------+-------------+--------+--------------+--------+
   | Random       | 144,045 bps | n/a    | 254,671 bps  | n/a    |
   | (Control)    |             |        |              |        |
   +--------------+-------------+--------+--------------+--------+
   | P4P Fine     | 162,344 bps | +13%   | 402,043 bps  | +57%   |
   | Grained      |             |        |              |        |
   +--------------+-------------+--------+--------------+--------+
   | P4P Generic  | 163,205 bps | +13%   | 463,782 bps  | +82%   |
   | Weight       |             |        |              |        |
   +--------------+-------------+--------+--------------+--------+
   | P4P Coarse   | 166,273 bps | +15%   | 471,218 bps  | +85%   |
   | Grained      |             |        |              |        |
   +--------------+-------------+--------+--------------+--------+

        Table 2: Per-Swarm Global and Comcast Download Speeds

4.3.  Other Impacts and Interesting Data

   An analysis of the effects of P4P on upstream utilization and
   Internet transit was also interesting.  It did not appear that P4P
   significantly increased upstream utilization in the Comcast access
   network; in essence, uploading was already occurring regardless,
   and P4P in and of itself did not appear to materially increase
   uploading for this specific, licensed content.  (P4P is not
   intended as a solution for network congestion.)  Upstream
   utilization for Random was 143,236 MB and for P4P Generic Weight
   was 143,143 MB, while P4P Coarse Grained was 139,669 MB.  We also
   observed that P4P reduced outgoing Internet traffic by an average
   of 34% at peering points.
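   As a rough consistency check on the reported reductions at peering
   points, the sketch below recomputes them from the per-swarm volumes
   quoted in this section.  Note that the averages in the text are
   approximate; a simple mean of the two guided swarms' reductions
   lands near, but not exactly on, the quoted figures.

```python
def reduction(control_mb, guided_mb):
    """Fractional traffic reduction of a guided swarm vs. the control."""
    return (control_mb - guided_mb) / control_mb

# Outgoing Internet traffic at peering points (MB), from Section 4.3.
out_random, out_generic, out_coarse = 134_219, 91_979, 86_652
# Incoming Internet traffic at peering points (MB), from Section 4.3.
in_random, in_generic, in_coarse = 47_013, 8_610, 7_764

out_avg = (reduction(out_random, out_generic) +
           reduction(out_random, out_coarse)) / 2
in_avg = (reduction(in_random, in_generic) +
          reduction(in_random, in_coarse)) / 2
print(f"outgoing reduction ~{out_avg:.0%}")  # roughly one-third
print(f"incoming reduction ~{in_avg:.0%}")   # roughly four-fifths
```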
   Outgoing traffic for Random was 134,219 MB and for P4P Generic
   Weight was 91,979 MB, while P4P Coarse Grained was 86,652 MB.

   In terms of downstream utilization, we observed that P4P reduced
   incoming Internet traffic by an average of 80% at peering points.
   Random was 47,013 MB, P4P Generic Weight was 8,610 MB, and P4P
   Coarse Grained was 7,764 MB.  As noted in Section 4.2, download
   activity in the Comcast access network nevertheless increased
   somewhat across the guided swarms.

5.  Differences Between the P4P iTrackers Used

   Given the size of the Comcast network, we felt that in order to
   truly evaluate the iTracker application we would need to test
   various network topologies that reflected that network and would
   help gauge the level of effort and design requirements necessary to
   get correct statistical data out of the trial.  In all cases,
   iTrackers were configured with automation in mind, so that any
   successful iTracker configuration would be automatically updated,
   rather than manually configured on an ongoing basis.
   All iTrackers were hosted on the same small server, and it appeared
   that it would be relatively easy and inexpensive to scale up an
   iTracker infrastructure should P4P-like mechanisms become
   standardized and widely adopted.

5.1.  P4P Fine Grain

   The Fine Grain topology was the first and most complex iTracker
   that we built for this trial.  It was a detailed mapping of
   Comcast's backbone-connected network Autonomous System Numbers
   (ASNs) to IP aggregates, which were weighted based on priority and
   distance from each other.  Included in this design was a
   prioritization of all peering and Internet transit ASNs connected
   to the Comcast backbone, to ensure that P4P traffic would prefer
   settlement-free and lower-cost networks first, and then more
   expensive transit links.  This attempted to optimize and lower the
   transit costs associated with this traffic.  We then took the
   additional step of detailing each ASN and IP aggregate into IP
   subnets, down to the Optical Transport Nodes (OTNs) where all Cable
   Modem Termination Systems (CMTSs) reside.  This design gave the
   iTracker a highly localized and detailed description of the
   Comcast network to disseminate.  It defined 1,182 iTracker node
   identifiers, and resulted in a 210,727-line configuration file.

   This iTracker was obviously the most time-consuming to create and
   the most complex to maintain.  Trial results indicated that this
   level of localization was too high, and was less effective than
   lower levels of localization.

5.2.  P4P Coarse Grain

   Given the level of detail in the Fine Grain design, it was
   important that we also enable a high-level design which still used
   priority and weighting mechanisms for the Comcast backbone and
   transit links.
   The Coarse Grain design was a limited, or summarized, version of
   the Fine Grain design: it used the ASN-to-IP-aggregate mapping and
   weighted data for transit links, but removed all additional
   localization data.  This ensured we would get data sets similar to
   those from the Fine Grain design, but without the more detailed
   localization of each of the networks off of the Comcast backbone.
   This design defined 22 iTracker node identifiers, and resulted in
   a 1,461-line configuration file.

   From an overall cost, complexity, risk, and effectiveness
   standpoint, this was judged to be the optimal iTracker for Comcast.
   Importantly, it did not require revealing the complex internal
   network topology that the Fine Grain design did.  Updates to this
   iTracker were also far simpler to automate, which will better
   ensure that it remains accurate over time while keeping
   administrative overhead relatively low.  However, the differences,
   costs, and benefits of Coarse Grain and Generic Weighted (see
   below) likely merit further study.

5.3.  P4P Generic Weighted

   The Generic Weighted design was a copy of the Coarse Grained
   design, but instead of using ISP-designated priorities and weights,
   all weights were defaulted to pre-determined parameters that the
   Yale team had designed.  All other data was replicated from the
   Coarse Grain design.  Providing the information necessary to
   support the Generic Weighted iTracker took roughly the same effort
   as for Coarse Grain.

6.  Important Notes on Data Collected

   Raw data is presented in this document.  We did not normalize
   traffic volume data (e.g., upload and download volumes) by the
   number of downloads, in order to preserve the underlying raw data.

   We also recommend that readers not focus too much on the absolute
   numbers, such as bytes downloaded from internal sources and bytes
   downloaded from external sources.
   Instead, we recommend that readers focus on ratios, such as the
   percentage of downloaded bytes that came from internal sources in
   each swarm.  In this way, the small random variation in the number
   of downloads in each swarm does not distract from important
   metrics, such as the shift of traffic from external to internal
   sources.

   We also wish to note that the data was collected from a sample of
   the total swarm.  Specifically, some peers ran older versions of
   the Pando client that did not implement the extended transfer
   logging.  For those nodes, which participated in the swarms but did
   not report their data transfers, we have download counts only.  As
   a result, the download counts generated from the standard logging
   are somewhat higher than the download counts generated by the
   extended logging.  That being said, over 90% of downloads were by
   peers running the newer software, which we believe shows that the
   transfer records are highly representative of the total data flow.

   In terms of which analysis was performed from the standard logging
   compared to the extended logging: all of the data flow analysis was
   performed using the extended logging.  Pando's download counts and
   performance numbers were generated via standard logging (i.e., all
   peers report download complete/cancel, data volumes, and measured
   download speed on the client).  Yale's download counts and
   performance numbers were derived via extended logging (e.g., by
   summing the transfer records, counting IP addresses reported,
   etc.).

   One benefit of having two data sources is that we can compare the
   two.  In this case, the two approaches reported comparable impacts.

7.  Next Steps
   One objective of this document is to share with the IETF community
   the results of one P4P trial in a large broadband network, given
   skepticism regarding the benefits to P2P users as well as to ISPs.
   From the perspective of P2P users, P4P potentially delivers faster
   P2P downloads.  At the same time, ISPs can increase the
   localization of swarms, enabling them to reduce the bytes flowing
   over transit points while also delivering an optimized P2P
   experience to customers.  However, an internal analysis of varying
   levels of iTracker adoption by ISPs leads us to believe that, while
   P4P-type mechanisms are valuable on a single-ISP basis, the value
   of P4P increases dramatically as many ISPs choose to deploy it.

   We believe these results can inform the technical discussion in
   the IETF over how to use iTracker mechanisms.  Should such a
   mechanism be standardized, the use of ISP-provided iTrackers should
   probably be an opt-in feature for P2P users, or at least a feature
   of which they are explicitly aware and which has been enabled by
   default in a particular P2P client.  In this way, P2P users could
   opt in, either explicitly or by their choice of P2P client, to use
   the iTracker to improve performance, which benefits both the user
   and the ISP at the same time.  Importantly, in terms of privacy,
   the iTracker makes available only network topology information,
   and would not in its current form enable an ISP, via the iTracker,
   to determine which P2P clients were downloading which content.

   It is also possible that an iTracker type of mechanism, in
   combination with a P2P cache, could further improve P2P download
   performance; this merits further study.  In addition, this was a
   limited trial that, while very promising, indicates a need for
   additional technical investigation and trial work.
   Such follow-up study should explore the effects of P4P when more
   P2P client software variants are involved, with larger swarms, and
   with additional and more technically diverse content (file size,
   file type, duration of content, etc.).

8.  Security Considerations

   There are no security considerations to include at this time.

9.  IANA Considerations

   There are no IANA considerations in this document.

10.  Acknowledgements

   The authors wish to acknowledge the hard work of all of the P4P
   working group members, and specifically the focused efforts of the
   teams at both Pando and Yale for the trial itself.  Finally, the
   authors recognize and appreciate Peter Sevcik and John Bartlett, of
   NetForecast, Inc., for their valued independent analysis of the
   trial results.

11.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

Authors' Addresses

   Chris Griffiths
   Comcast Cable Communications
   One Comcast Center
   1701 John F. Kennedy Boulevard
   Philadelphia, PA 19103
   US

   Email: chris_griffiths@cable.comcast.com
   URI:   http://www.comcast.com

   Jason Livingood (editor)
   Comcast Cable Communications
   One Comcast Center
   1701 John F. Kennedy Boulevard
   Philadelphia, PA 19103
   US

   Email: jason_livingood@cable.comcast.com
   URI:   http://www.comcast.com

   Laird Popkin
   Pando Networks
   520 Broadway Street
   10th Floor
   New York, NY 10012
   US

   Email: laird@pando.com
   URI:   http://www.pando.com

   Richard Woundy (editor)
   Comcast Cable Communications
   27 Industrial Avenue
   Chelmsford, MA 01824
   US

   Email: richard_woundy@cable.comcast.com
   URI:   http://www.comcast.com

   Richard Yang
   Yale University
   51 Prospect Street
   New Haven, CT 06520
   US

   Email: yry@cs.yale.edu
   URI:   http://www.cs.yale.edu