idnits 2.17.1

draft-marocco-alto-problem-statement-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 16.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 392.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 403.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 410.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 416.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 30, 2008) is 5779 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

     No issues found here.

     Summary: 2 errors (**), 0 flaws (~~), 1 warning (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                         E. Marocco
Internet-Draft                                            Telecom Italia
Intended status: Informational                                V. Gurbani
Expires: January 1, 2009               Bell Laboratories, Alcatel-Lucent
                                                           June 30, 2008

    Application-Layer Traffic Optimization (ALTO) Problem Statement
                draft-marocco-alto-problem-statement-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 1, 2009.

Abstract

   A significant part of the Internet traffic today is generated by
   peer-to-peer applications used for file sharing, realtime
   communications and live media streaming.  Such applications often
   deal with large amounts of data in direct peer-to-peer connections,
   but they usually have little knowledge of the underlying topology,
   both at the overlay layer and the network layer.  As a result, they
   may choose their peers based on measurements and statistics which, in
   some specific situations, lead to suboptimal choices.  This
   document describes problems related to optimizing traffic generated
   by peer-to-peer applications through the use of link and network
   layer information.

Table of Contents

   1.  Introduction
     1.1.  Research or Engineering?
   2.  The Problem
     2.1.  Issues
       2.1.1.  Information Distribution
       2.1.2.  Topology Hiding
   3.  Use Cases
     3.1.  File sharing
     3.2.  Live media streaming
     3.3.  Realtime communications
     3.4.  Distributed Hash Tables
     3.5.  Cache/Mirror Selection
   4.  Security Considerations
   5.  Acknowledgments
   6.  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   A significant part of Internet traffic today is generated by
   peer-to-peer (P2P) applications used for file sharing, realtime
   communications and live media streaming [WWW.cachelogic.picture]
   [WWW.wired.fuel].  In contrast to client/server architectures, P2P
   applications access resources (e.g. files or media relays)
   distributed across the Internet and exchange large amounts of data in
   connections that they establish directly with nodes hosting such
   resources.

   One of the advantages of P2P systems comes from the fact that the
   resources they offer are often made available through multiple
   instances.  Yet, applications are generally unaware of the topology
   of the latent overlay network and have to select among available
   instances based on information they deduce from empirical
   measurements which, in some particular situations, could lead to
   suboptimal choices.

   For example, popular metrics based on round trip time estimation
   perform poorly for file sharing applications, as they tend to
   ignore the bandwidth and reliability of the underlying links, which
   have much more influence than delay on file transfers.

   Many of the existing overlay networks are built on top of connections
   between peers that are established regardless of the underlying
   network topology.  In addition to simply achieving suboptimal
   performance, such networks can lead to congestion and cause serious
   inefficiencies.  As shown in [ACM.fear], traffic generated by popular
   P2P applications often crosses network boundaries multiple times,
   overloading links which are frequently subject to congestion
   [ACM.bottleneck].

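   As a rough illustration of the round trip time example given earlier
   in this section, the sketch below ranks three hypothetical sources of
   the same file, first by round trip time and then by an estimated
   transfer time.  The peer names, delays and bandwidth figures are
   invented for illustration, and the estimate (one round trip plus
   size divided by bandwidth) is a deliberately simplified assumption,
   not a model taken from the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A hypothetical peer offering an instance of the desired file."""
    name: str
    rtt_ms: float          # measured round trip time, in milliseconds
    bandwidth_mbps: float  # assumed available bandwidth, in megabits/s


def estimated_transfer_seconds(src: Source, size_megabytes: float) -> float:
    """Simplified estimate: one round trip plus size / bandwidth."""
    size_megabits = size_megabytes * 8
    return src.rtt_ms / 1000.0 + size_megabits / src.bandwidth_mbps


def rank_by_rtt(sources):
    """Order sources by delay only, as an RTT-based metric would."""
    return sorted(sources, key=lambda s: s.rtt_ms)


def rank_by_transfer_time(sources, size_megabytes):
    """Order sources by the estimated time to fetch the whole file."""
    return sorted(
        sources,
        key=lambda s: estimated_transfer_seconds(s, size_megabytes),
    )


# Invented example data: the closest peer has the slowest link.
SOURCES = [
    Source("peer-a", rtt_ms=20, bandwidth_mbps=1),
    Source("peer-b", rtt_ms=80, bandwidth_mbps=50),
    Source("peer-c", rtt_ms=150, bandwidth_mbps=10),
]

if __name__ == "__main__":
    print([s.name for s in rank_by_rtt(SOURCES)])
    print([s.name for s in rank_by_transfer_time(SOURCES, 700)])
```

   With these invented figures, the peer with the lowest round trip
   time ranks last once the size of a 700-megabyte file is taken into
   account, which is the kind of mismatch the paragraph describes.
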
   Recent studies [ACM.ispp2p] [WWW.p4p.overview] [ACM.ono] have shown
   that if Internet Service Providers (ISPs), network operators or third
   parties in general provided reliable topology and/or bandwidth
   information to P2P applications, it would be possible to greatly
   increase application performance, reduce congestion and optimize the
   overall traffic across different networks.

   This document describes the problem of optimizing traffic generated
   by P2P applications using information provided by third parties.
   Section 2 introduces the problem and the main issues to keep in mind
   when designing a solution, while Section 3 describes some use cases
   where both P2P users and network operators would benefit from such a
   solution.

1.1.  Research or Engineering?

   At the time of writing, several solutions have been proposed to
   address the problem described in this document, both inside and
   outside the IETF [I-D.bonaventure-informed-path-selection]
   [ACM.ispp2p] [WWW.p4p.overview], all accompanied by encouraging
   simulation and field test results.  Such solutions have been proposed
   independently, but all consist of two essential parts:
   o  a discovery mechanism which can be used by a P2P application to
      find a reliable information source;
   o  a protocol used by P2P applications to query such sources in order
      to retrieve the information needed to choose the best endpoint
      among those which host a desired resource.

   It is not easy to foresee how such solutions would perform in the
   Internet; a more accurate evaluation would require representative
   data collected from real systems by a critical mass of users.

   However, wide adoption will probably never happen without an
   agreement on a common solution based on open standards; whether such
   a solution should still be studied as a research problem, or
   published as a "Proposed Standard" or an "Experimental" RFC
   [RFC2026], is an open issue.

2.  The Problem

   Network engineers have been facing the problem of traffic
   optimization for a long time now and have already designed mechanisms
   like MPLS [RFC3031] and DiffServ [RFC3260] to deal with it.  The
   problem they address consists of finding (or setting) an optimal
   route for packets travelling between specific source and destination
   addresses, based on requirements such as low latency, high
   reliability, and priority.  Such solutions are usually implemented at
   the link and network layers, and tend to be almost transparent.  At
   best, applications can only "mark" the traffic they generate with the
   corresponding properties.

   However, P2P applications, which today pose serious challenges
   to Internet infrastructures, do not benefit much from the above
   techniques; "cooperating" with external services aware of the
   network topology could greatly optimize the traffic they generate.
   In fact, when a P2P application needs to establish a connection, the
   logical target is not a host, but rather a resource (e.g. a file or a
   media relay) generally available in multiple instances on different
   hosts; selection of the closest one -- or, in general, the best in
   terms of overlay topological proximity -- has much more impact on the
   overall traffic than the route followed by its packets to reach the
   endpoint.

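   The idea that the logical target is a resource rather than a host
   can be sketched as follows.  This is only an illustration: the
   resource name, the host addresses (taken from the documentation
   address ranges) and the cost table standing in for an external,
   topology-aware service are all hypothetical.

```python
from typing import Callable, Dict, List

# A cost function maps a host address to a cost; lower is better.
CostFunction = Callable[[str], float]


def select_endpoint(instances: List[str], cost_of: CostFunction) -> str:
    """Return the host with the lowest cost among those holding a resource."""
    if not instances:
        raise ValueError("resource has no known instances")
    return min(instances, key=cost_of)


# Hypothetical overlay state: one resource, available on three hosts.
INSTANCES: Dict[str, List[str]] = {
    "file:example.iso": ["192.0.2.10", "198.51.100.7", "203.0.113.42"],
}

# Hypothetical answers from an external service aware of the topology.
EXTERNAL_COSTS: Dict[str, float] = {
    "192.0.2.10": 5.0,
    "198.51.100.7": 1.0,
    "203.0.113.42": 3.0,
}


def external_cost(host: str) -> float:
    """Stand-in for a query to the external service; unknown hosts rank last."""
    return EXTERNAL_COSTS.get(host, float("inf"))


if __name__ == "__main__":
    print(select_endpoint(INSTANCES["file:example.iso"], external_cost))
```

   The application only decides which instance to contact; the route
   its packets then follow is still left to the network.
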
   Addressing the Application-Layer Traffic Optimization (ALTO) problem
   means, on the one hand, providing topology information regarding the
   underlying network and, on the other hand, enhancing P2P applications
   so that they use such information to select the best endpoints among
   those that are available for the connections they are going to
   establish.

2.1.  Issues

2.1.1.  Information Distribution

   As a direct consequence of the fully distributed nature of the
   Internet, it seems almost impossible to centralize all the
   information P2P applications may need to optimize the traffic they
   generate.  It is quite likely that such information would be highly
   distributed, for example, at an ISP or domain level.  It is also
   reasonable to expect that, in some cases, the network administrators
   themselves will control the provision of such information.

   However, as applications usually have no knowledge of the
   administrative entities running the network they are using, any
   solution will need to define a discovery mechanism (e.g. based on or
   similar to reverse DNS [RFC2317]) and perhaps an infrastructure to
   certify information sources.

2.1.2.  Topology Hiding

   Operators can play an important role in addressing the ALTO problem,
   but they generally consider the topology of the networks they control
   to be confidential information; therefore, in order to succeed and
   achieve wide adoption, any solution should provide a method to help
   P2P applications in peer selection without explicitly disclosing
   the topology of the underlying network.

3.  Use Cases

3.1.  File sharing

   File sharing applications allow users to search for content shared by
   other users and download it.  Typically, search results consist of
   many instances of the same file available from multiple sources; the
   goal of an ALTO solution would be to help peers find the best ones
   according to the underlying networks.

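   A minimal sketch of how such help might look while respecting the
   topology hiding concern of Section 2.1.2: the information source
   receives a list of candidate sources and returns the same list in
   order of preference, without exposing the costs behind the ordering.
   The class name, the prefix-to-cost table and the addresses are
   assumptions made for this example and are not part of any proposed
   solution.

```python
import ipaddress
from typing import Dict, List


class RankingService:
    """Hypothetical information source that reveals only an ordering."""

    def __init__(self, prefix_costs: Dict[str, float]) -> None:
        # Internal, confidential view: network prefix -> operator cost.
        self._prefix_costs = [
            (ipaddress.ip_network(prefix), cost)
            for prefix, cost in prefix_costs.items()
        ]

    def _cost(self, address: str) -> float:
        """Lowest cost of any matching prefix; unknown addresses rank last."""
        ip = ipaddress.ip_address(address)
        matches = [cost for net, cost in self._prefix_costs if ip in net]
        return min(matches) if matches else float("inf")

    def rank(self, candidates: List[str]) -> List[str]:
        """Return the candidates, best first; no costs are disclosed."""
        return sorted(candidates, key=self._cost)


if __name__ == "__main__":
    # Invented operator data using documentation address ranges.
    service = RankingService({
        "192.0.2.0/24": 1.0,      # e.g. inside the operator's own network
        "198.51.100.0/24": 10.0,  # e.g. reachable through a costly link
    })
    search_results = ["198.51.100.7", "203.0.113.42", "192.0.2.10"]
    print(service.rank(search_results))
```

   Because only the order is returned, a file sharing client (or a
   tracker acting on its behalf) learns which sources to prefer but not
   why the operator prefers them.
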
   On the application side, integration of ALTO functionality may
   happen at different levels.  For example, while in the completely
   decentralized Gnutella network selection of the best sources is
   totally up to the user, in systems like BitTorrent and eDonkey,
   central elements (i.e. trackers or servers) act as mediators.
   Therefore, in the former case, optimization would require
   modification in the applications, while in the latter it could just
   be implemented in some central elements.

3.2.  Live media streaming

   P2P applications for live streaming allow users to receive multimedia
   content produced by one source and targeted to multiple destinations,
   in a realtime or near-realtime way without resorting to multicast.
   Such applications typically participate in the distribution of the
   content, acting as both receivers and senders; the goal of an ALTO
   solution would be to help peers find the best sources and the best
   destinations for media flows they receive and relay.

3.3.  Realtime communications

   P2P realtime communications allow users to establish direct media
   flows, usually to place audio and video calls, or to have text chats.
   In the basic case, media would flow directly between the two
   endpoints; however, in the general case, a significant portion of
   communications between users with limited access to the Internet
   (e.g. users behind NATs, firewalls or HTTP proxies) needs to be
   relayed by other elements.  Such media relays are distributed over
   the Internet -- in some cases co-located with applications with a
   public address; the goal of an ALTO solution would be to help peers
   find the best relays.

3.4.  Distributed Hash Tables

   Distributed hash tables (DHT) are a class of overlay algorithms used
   to implement lookup functionality in popular P2P systems, without
   resorting to centralized elements.  In such systems, peers maintain
   addresses of other peers participating in the same DHT in a routing
   table, sorted according to specific criteria.  An ALTO solution would
   provide valuable information for DHT algorithms which, in order to
   reduce path latency of distributed queries, include round trip time
   estimations among such criteria [SIGCOMM.resprox].

3.5.  Cache/Mirror Selection

   Providers of popular content like media and software repositories
   usually resort to geographically distributed caches and mirrors for
   load balancing.  Selection of the proper mirror for a given user is
   today based on inaccurate geolocation data or on proprietary network
   location systems, or is often delegated to the users themselves; an
   ALTO solution could easily be adopted to automate such a selection.

4.  Security Considerations

   The approach proposed in this document requires P2P applications to
   delegate a portion of their routing capability to third parties,
   giving those parties a significant role in systems where they would
   otherwise be excluded.

   In the case where an ALTO solution is deployed by the network
   operator, it is conceivable that the P2P community would consider it
   hostile because the operator could, for example:
   o  redirect applications to corrupted mediators providing malicious
      content;
   o  track connections to perform content inspection;
   o  apply policies based on criteria other than network efficiency
      (for example, to avoid peering points regulated by inconvenient
      economic agreements).

   However, ALTO is completely optional for P2P applications and its
   purpose is to help improve the performance of such applications.  If,
   for some reason, it fails to achieve this purpose, it would simply
   fail to gain popularity and would not be used.

   Even in cases where the ALTO service provider would decide to
   maliciously alter results returned by queries only after the solution
   has gained popularity (i.e. it behaves well for a while to become
   popular and then starts misbehaving), it would be fairly easy for P2P
   application maintainers and users to revert to solutions that are not
   using it.  After all, it would all come down to changing some
   application settings in cases where the protocol is implemented
   inside the client, and to upgrading centralized elements for
   architectures like BitTorrent and eDonkey.

5.  Acknowledgments

   We have to acknowledge many people.  For the record: Vinay Aggarwal
   and the P4P working group for the research work done outside the
   IETF, and Emil Ivov and Rohan Mahy for comments and corrections.

6.  Informative References

   [ACM.bottleneck]
              Akella, A., Seshan, S., and A. Shaikh, "An Empirical
              Evaluation of Wide-Area Internet Bottlenecks", Proceedings
              of ACM SIGCOMM, October 2003.

   [ACM.fear]
              Karagiannis, T., Rodriguez, P., and K. Papagiannaki,
              "Should ISPs fear Peer-Assisted Content Distribution?",
              In ACM USENIX IMC, Berkeley 2005.

   [ACM.ispp2p]
              Aggarwal, V., Feldmann, A., and C. Scheideler, "Can ISPs
              and P2P systems co-operate for improved performance?", In
              ACM SIGCOMM Computer Communications Review (CCR), 37:3,
              pp. 29-40.

   [ACM.ono]  Choffnes, D. and F. Bustamante, "Taming the Torrent: A
              practical approach to reducing cross-ISP traffic in P2P
              systems", Proceedings of ACM SIGCOMM, August 2008.

   [I-D.bonaventure-informed-path-selection]
              Saucez, D. and B. Donnet, "The case for an informed path
              selection service",
              draft-bonaventure-informed-path-selection-00 (work in
              progress), February 2008.

   [RFC2026]  Bradner, S., "The Internet Standards Process -- Revision
              3", BCP 9, RFC 2026, October 1996.

   [RFC2317]  Eidnes, H., de Groot, G., and P. Vixie, "Classless IN-
              ADDR.ARPA delegation", BCP 20, RFC 2317, March 1998.

   [RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
              Label Switching Architecture", RFC 3031, January 2001.

   [RFC3260]  Grossman, D., "New Terminology and Clarifications for
              Diffserv", RFC 3260, April 2002.

   [SIGCOMM.resprox]
              Gummadi, K., Gummadi, R., Ratnasamy, S., Gribble, S.,
              Shenker, S., and I. Stoica, "The impact of DHT routing
              geometry on resilience and proximity", Proceedings of ACM
              SIGCOMM, August 2003.

   [WWW.cachelogic.picture]
              Parker, A., "The true picture of peer-to-peer
              filesharing", .

   [WWW.p4p.overview]
              Xie, H., Krishnamurthy, A., Silberschatz, A., and R. Yang,
              "P4P: Explicit Communications for Cooperative Control
              Between P2P and Network Providers",
              .

   [WWW.wired.fuel]
              Glasner, J., "P2P fuels global bandwidth binge",
              .

Authors' Addresses

   Enrico Marocco
   Telecom Italia
   Via G. Reiss Romoli, 274
   Turin  10148
   Italy

   Email: enrico.marocco@telecomitalia.it

   Vijay K. Gurbani
   Bell Laboratories, Alcatel-Lucent
   2701 Lucent Lane
   Lisle, IL  60532
   USA

   Email: vkg@alcatel-lucent.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.