Network Working Group                                            B. Cain
Internet-Draft                                     Mirror Image Internet
Expires: May 15, 2001                                         F. Douglis
                                                               AT&T Labs
                                                                M. Green
                                                                  Entera
                                                              M. Hofmann
                                                                  Lucent
                                                                 R. Nair
                                                               D. Potter
                                                                   Cisco
                                                           O. Spatscheck
                                                               AT&T Labs
                                                       November 14, 2000

                  Known CDN Request Mapping Mechanisms
                  draft-cain-cdnp-known-req-map-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 15, 2001.

Copyright Notice

   Copyright (C) The Internet Society (2000). All Rights Reserved.

Discussion List & Archives

   This document and related documents are discussed on the cdn mailing
   list. To join the list, send mail to cdn-request@ops.ietf.org. To
   contribute to the discussion, send mail to cdn@ops.ietf.org. The
   archives are at ftp://ops.ietf.org/pub/lists/cdn.*.

Abstract

   This memo presents a number of known mechanisms used to direct
   client application requests to surrogate servers based on various
   policies. In this memo we group mechanisms commonly called request
   routing, content routing or content redirection under the term
   request mapping. There exist multiple request mapping mechanisms. At
   a high level, these may be classified as: DNS Request Mapping,
   Transport-layer Mapping, and Application-layer Mapping.

Table of Contents

   1.       Introduction
   2.       DNS Request Mapping
   2.1      Basic DNS Mapping Mechanisms
   2.2      Multiple Replies
   2.3      Multi-level Resolution
   2.4      NS Redirection
   2.5      CNAME Redirection
   2.6      Anycast
   2.7      Object Encoding
   2.8      DNS Request Mapping Problems
   3.       Transport-layer Mapping
   4.       Application-layer Mapping
   4.1      Header Inspection
   4.1.1    URL-based Mapping
   4.1.1.1  302 Redirection
   4.1.1.2  In-Path Element
   4.1.2    MIME Header-based Mapping
   4.1.3    Site-specific Identifiers
   4.2      Content Modification
   4.2.1    Content Modification Overview
   4.2.2    Basic Content Modification Mechanism
   4.2.2.1  A-priori URL Rewriting
   4.2.2.2  On-Demand URL Rewriting
   4.2.2.3  Content Modification Problems
   5.       Combination of multiple mechanisms
   6.       Measurements
   6.1      Proximity Measurements
   6.1.1    Probing
   6.1.2    Passive Measurement
   6.1.3    Metric Types
   6.2      Surrogate Feedback
   6.2.1    Probing
   6.2.2    Monitoring
   6.2.3    Metrics
   7.       Security Considerations
   8.       Acknowledgements
            References
            Authors' Addresses
            Full Copyright Statement

1. Introduction

   The term "mapping" is used to convey a more general sense than the
   word "direction". For example, one type of mapping is based on what
   is commonly called the HTTP "redirect" mechanism. However, there are
   methods that direct requests to SURROGATES without relying on a
   particular protocol's redirection methods. Hence, the term "mapping"
   is used to cover a wider variety of techniques than what might be
   implied by the term "direction".

   There exist multiple request mapping mechanisms. At a high level,
   these may be classified as: DNS Request Mapping, Transport-layer
   Mapping, and Application-layer Mapping.

2. DNS Request Mapping

   DNS Request Mapping is used in many CDNs because of the ubiquity of
   DNS as a directory service. The basic concept of DNS-based request
   mapping is to insert a DNS server in the DNS resolution process. The
   server returns a different set of IP addresses, or a different
   ordering of entries in the returned set, depending on various
   metrics (see Section 6). The overall goal is to improve the
   performance and scalability of the objects represented by the domain
   name resolved.

2.1 Basic DNS Mapping Mechanisms

   In its simplest form the mapping DNS server is authoritative for an
   entire DNS domain or a subdomain. If a DNS resolution for the domain
   is requested, the mapping DNS server will determine the IP address
   of the best surrogate, in terms of a metric as defined in Section 6.
   This IP address is returned in an A record to the client site DNS
   server, and it may actually be a virtual IP (VIP) address of the
   best set of surrogates for the client site DNS server.

2.2 Multiple Replies

   To increase the reliability of the solution, the mapping DNS server
   can return multiple replies. Common implementations of client site
   DNS servers use those multiple replies in order while rotating them.
   Therefore, the order in which the records are returned, as well as
   the number of times a particular entry is repeated, can be used to
   map multiple clients using a single client site DNS server.

2.3 Multi-level Resolution

   To allow for multiple mapping decisions, multiple mapping DNS
   servers can be involved in a single DNS resolution. The rationale
   for utilizing multiple mapping DNS servers in a single DNS
   resolution is to allow one to distribute more complex decisions from
   a single mapping DNS server to multiple, more specialized, mapping
   DNS servers. The most common mechanisms used to insert multiple
   mapping DNS servers in a single DNS resolution are NS and CNAME
   records.

2.4 NS Redirection

   Using NS records, multiple mapping DNS servers can be included by
   redirecting the authority of the next level domain to another
   mapping DNS server. For example, the client site DNS server
   resolving a.b.c.com would eventually request a resolution of
   a.b.c.com from the name server authoritative for c.com. The
   name server authoritative for this domain might be a mapping DNS
   server. In this case the mapping DNS server can either return a set
   of A records or can redirect the resolution of the request a.b.c.com
   to the DNS server that is authoritative for b.c.com using NS records.

   One drawback in the use of NS records is that the number of mapping
   DNS servers is limited by the number of parts in the DNS name. This
   problem results from the DNS policy that causes a client site DNS
   server to abandon a request if no additional parts of the DNS name
   are resolved in an exchange with an authoritative DNS server.

   A second drawback is that the last DNS server can determine the TTL
   of the entire resolution process. The reason is that the last DNS
   server can return, in the authority section of its response, its
   own NS record.
   The TTL for this record is solely determined by the
   last DNS server. This cached NS record will be used by the client
   site DNS server for further resolutions until it expires.

   Another drawback is that some implementations of BIND deliberately
   cause timeouts (typically 5 seconds) to simplify their
   implementation in cases in which an NS-level redirect points to a
   name server for which no valid A record is returned or cached. This
   is especially a problem if the domain of the name server does not
   match the domain currently resolved, since in this case the A
   records which might be passed in the DNS response are discarded for
   security reasons. Empirical measurements of DNS lookups to sites
   with NS-level redirection using this type of setup show a high
   incidence of DNS timeouts.

2.5 CNAME Redirection

   Multi-level redirection using CNAMEs works similarly to NS records
   in that a mapping DNS server returns a CNAME for a domain to map the
   further request resolution to an entirely new domain and potentially
   a new set of mapping DNS servers. The disadvantage of this approach
   is mainly the additional overhead of resolving the new domain name.
   One advantage is that the number of mapping DNS servers is
   independent of the depth of the domain name. The number of mapping
   DNS servers is only restricted by the resource limits defined on the
   client site DNS server. Another advantage is the avoidance of some
   DNS timeouts.

2.6 Anycast

   To combine measurement and redirection, the mapping DNS server can
   advertise an anycast address as its IP address. The same anycast
   address is used by multiple physical DNS servers. In this scenario,
   the mapping DNS server that is the closest to the client site DNS
   server in terms of OSPF and BGP routing will receive the packet
   containing the DNS resolution request.
   The mapping DNS server at
   this point knows that it is the closest (by this metric) and can use
   this information to make a mapping decision. Drawbacks of this
   solution are:

   *  It is not known if the mapping DNS server is the closest DNS
      server in terms of routing from the mapping server to the
      client.

   *  BGP is not load sensitive, so the closest server in terms of
      routing might not be the server with the least network latency.

   *  The server load is not considered during routing. If server
      load has to be considered while finding the best mapping
      server, it has to be folded into the routing metrics used by
      the routing protocol.

2.7 Object Encoding

   Since only DNS names are visible during the DNS mapping, some
   solutions encode the object type, object hash or similar information
   into the DNS name. This might vary from a simple division of objects
   based on object type (such as images.a.b.c.com and
   streaming.a.b.c.com) to a sophisticated scheme in which the domain
   name contains a unique identifier (such as a hash) of the object.
   The obvious advantage is that object information is available during
   the mapping process. The disadvantage is that the client site DNS
   server has to perform multiple DNS resolutions to retrieve a single
   Web page, which might increase rather than decrease the overall
   latency.

2.8 DNS Request Mapping Problems

   The use of DNS as a request mapping mechanism comes with several
   problems:

   1.  DNS only allows resolution on a per-domain level, not a
       per-object level. An ideal request resolution service would
       service requests with per-object detail. Client-side
       direction services allow this kind of resolution because of
       their direct inspection of client requests.

   2.  DNS systems are typically not designed for very high volumes
       of requests.
       Such volumes arise in CDNs that desire near real-time
       direction of requests to surrogates, because they must return
       DNS entries with a short time-to-live (TTL) in order to offer
       a different response in the face of changing conditions.

   3.  DNS server and client implementations do not always adhere to
       the DNS standards and therefore cause problems with DNS
       request mapping. For example, many implementations do not
       honor the DNS TTL field.

   4.  DNS Request Mapping is based only on knowledge of the client's
       local DNS server, as client addresses are not relayed within
       DNS requests. DNS request mapping inherently makes use of an
       assumption that users select a DNS server that is "close" to
       them. Although this is true in many cases, it is not always
       valid. This causes problems, especially for proximity-based
       measurements that are made using an active probing technique.
       In this case, proximity measurements are made to the user's
       DNS server.

   5.  DNS servers can request and allow recursive resolution of DNS
       names. If recursive resolution is used during the resolution
       of the DNS request, the mapping DNS server does not see the
       IP address of the client site DNS server, but instead sees
       the address of the DNS server that is recursively requesting
       the information, possibly a DNS server operated by the
       site for which the mapping DNS server is resolving content.
       For example, imgs.company.com might be resolved by a CDN, but
       the request for the resolution might come from
       dns1.company.com as a result of recursion.

   6.  When a large number of clients share a single client site DNS
       server, they will all be redirected to the same set of IP
       addresses during the TTL interval.
       This might lead to
       overload of the surrogate or surrogates behind this IP
       address if during a flash crowd the number of clients
       requesting documents from that single IP address exceeds the
       capacity of the surrogate or surrogates.

   7.  Some implementations of BIND cause DNS timeouts to occur
       while exceptional situations are handled. These
       "exceptional" circumstances include NS redirections to
       unknown domains.

3. Transport-layer Mapping

   The first stage of CDN selection is typically accomplished using the
   DNS mechanisms described previously. As described in Section 2, this
   first level decision must be made based on the information available
   at the time, specifically the domain name being resolved and the IP
   address of the client site DNS server. While this level of
   information is adequate in many cases, finer levels of granularity
   can be achieved by inspecting the subsequent request from the client
   browser to the surrogate chosen by DNS. The simplest of the
   approaches used today is 'Transport-layer mapping'.

   Transport-layer mapping makes use of the information available in
   the first packet of the client request to make surrogate selection
   decisions. The metrics used are the same as those used at DNS
   resolution time (see Section 6), but the decision can also take into
   account the client's IP address (rather than that of the client's
   DNS server) and the layer 4 protocol and port information carried in
   that first packet. Handing off the session to a more appropriate
   surrogate is accomplished by a variety of proprietary means beyond
   the scope of this document. Typically the forward-flow traffic
   (client to newly selected surrogate) will flow through the surrogate
   originally chosen by DNS. The reverse-flow (surrogate to client)
   traffic, which normally transfers much more data than the forward
   flow, typically takes the direct path.
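
   The selection step described above can be sketched as follows. This
   is only an illustration under stated assumptions, not a description
   of any deployed product: the rule table, the surrogate names, and
   the function select_surrogate are hypothetical, and the session
   hand-off itself (which this memo leaves to proprietary means) is not
   modelled.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class FirstPacket:
    """Fields of the first client packet visible at the transport layer."""
    client_ip: str
    protocol: str  # layer 4 protocol, e.g. "tcp" or "udp"
    dst_port: int


# Hypothetical rule table: (client prefix, protocol, port, surrogate).
# The prefixes are documentation address ranges, not real deployments.
RULES = [
    (ip_network("192.0.2.0/24"), "tcp", 554, "rtsp-surrogate-east"),
    (ip_network("192.0.2.0/24"), "tcp", 21, "ftp-surrogate-east"),
    (ip_network("198.51.100.0/24"), "tcp", 554, "rtsp-surrogate-west"),
]


def select_surrogate(pkt, default="dns-chosen-surrogate"):
    """Return the first surrogate whose rule matches the packet.

    If no rule matches, the session stays on the surrogate that DNS
    request mapping selected in the first stage.
    """
    addr = ip_address(pkt.client_ip)
    for prefix, protocol, port, surrogate in RULES:
        if addr in prefix and pkt.protocol == protocol and pkt.dst_port == port:
            return surrogate
    return default
```

   A real mapper would replace the static table with the metrics of
   Section 6; the point here is only that the client's own address,
   protocol, and port are all available once the first packet arrives.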

   The overhead associated with transport-layer mapping is most
   justified for longer-lived flows such as FTP [1] or RTSP [3], or
   when directing requests away from overloaded surrogates.

4. Application-layer Mapping

   Application-layer mapping involves deeper examination of packets,
   beyond the transport-layer header. It works together with DNS
   request mapping, provides fine-grained mapping control down to
   the level of individual objects, and can be effected in real time at
   the time of the object request. As in the case of transport-layer
   mapping, the request routing process is more accurate than in the
   DNS request mapping case, because it is based on the client's own IP
   address rather than that of a client site DNS server.

4.1 Header Inspection

   Applications such as HTTP [4], RTSP [3], SSL [2], etc. provide hints
   in the initial portion of the session about how the client request
   must be mapped. These hints may come from the URL of the content or
   other parts of the MIME request header such as Cookies.

4.1.1 URL-based Mapping

   HTTP and RTSP content requests describe the requested content by its
   URL. In many cases, this information is sufficient to disambiguate
   the content and suitably map the request. In practice, it is often
   enough to use a sub-string, such as a prefix or suffix, of the URL
   to make the mapping decision.

4.1.1.1 302 Redirection

   In redirection-based mapping, the client is first resolved to a
   virtual surrogate which in turn returns an application-specific
   return code such as the 302 (in the case of HTTP or RTSP) indicating
   to the client the IP address of the delivery node that is chosen
   based on suitable metrics as described in Section 6.

   The advantage of this type of application-aware mapping is
   simplicity in implementation.
   However, the main drawback of this
   method is the additional latency involved in sending the redirect
   message back to the client.

4.1.1.2 In-Path Element

   An In-Path element is a network element in the forwarding path of
   the client's request. The In-Path element provides transparent
   interception of the transport connection. This is accomplished by
   accepting the connection request and establishing sequence numbers
   via the three-way handshake with the client. This allows the In-Path
   element to examine the content requests and glean the request header
   information such as the URL, match it with a URL template, and make
   the content routing determination. Again, metrics such as those
   described in Section 6 may be employed.

   Finally, the In-Path element splices the client connection to a
   connection with the appropriate delivery node and passes along the
   content request. The return path would pass through the In-Path
   element. However, it is possible to arrange for a direct return by
   passing the address translation information to the surrogate or
   delivery node through some proprietary means.

   The primary disadvantage of this method is the performance cost of
   URL parsing in the path of the network traffic. However, it is
   generally the case that the return traffic is much larger than the
   forward traffic.

   Traffic may be partitioned and load balanced among a set of delivery
   nodes by content objects identified by URLs. This allows
   object-specific control of server loading. For example, requests for
   non-cacheable objects may be directed away from a cache.

4.1.2 MIME Header-based Mapping

   This works just like the URL-based mapping except that other
   MIME headers in the content request are used to make the
   content-rule selection. Some useful MIME headers are: Cookie,
   Language, and User-Agent.
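
   A minimal sketch of such header-based rule selection is shown
   below. Everything specific in it is an assumption made for
   illustration: the cookie name "session", the node names, the
   precedence of the three rules, and the use of the HTTP
   Accept-Language header for what this memo calls the Language
   header. Header names are looked up exactly as spelled.

```python
from http.cookies import SimpleCookie


def select_delivery_node(headers, sessions):
    """Pick a delivery node from request headers (illustrative policy).

    headers:  dict of request header name -> value
    sessions: dict of known session id -> node already serving it
    """
    # 1. Session persistence: reuse the node bound to this session cookie.
    cookie = SimpleCookie()
    cookie.load(headers.get("Cookie", ""))
    if "session" in cookie and cookie["session"].value in sessions:
        return sessions[cookie["session"].value]

    # 2. Device type, guessed from a substring of the User-Agent value.
    agent = headers.get("User-Agent", "").lower()
    if "mobile" in agent:
        return "node-mobile"

    # 3. Language: only the first tag of Accept-Language is considered.
    language = headers.get("Accept-Language", "").split(",")[0].strip().lower()
    if language.startswith("de"):
        return "node-de"

    return "node-default"
```

   The cookie rule is checked first so that a session that has already
   been placed stays on the same node regardless of the other headers.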

   Cookies are used to identify a customer or session by a web site.
   Cookie-based request mapping provides content service
   differentiation based on the client. In addition, it is possible to
   map the connections of a multi-session transaction to the same
   server to achieve session-level persistence. Note that the
   client IP address is by itself not a reliable indicator of a session
   due to the presence of proxies that aggregate multiple clients at a
   single point.

   The language header can be used to map traffic to a
   language-specific delivery node.

   The user-agent header helps identify the type of client device. For
   example, a request from a voice browser, PDA, or cell phone can
   indicate which type of delivery node has content specialized for
   that device.

4.1.3 Site-specific Identifiers

   Site-specific identifiers help authenticate and identify a session
   from a specific user. This information may be used to map a content
   request.

   One example of a site-specific identifier is the SSL Session
   Identifier. This identifier is generated by a web server and used by
   the web client in succeeding sessions to identify itself and avoid
   an entire security authentication exchange. In order to inspect the
   session identifier, an In-Path element would observe the responses
   of the web server and determine the session identifier, which is
   then used to associate the session with a specific server.
   Subsequent sessions are routed based on the stored session
   identifier. Note that SSL Session Identifiers cannot be observed by
   the redirect method.

4.2 Content Modification

4.2.1 Content Modification Overview

   Content modification enables a content provider to take direct
   control over request mapping without the need for specific switching
   devices or directory services sitting in-between the client and the
   origin server.
   By modifying the content according to the client's
   specifics, a content provider can directly communicate to the client
   which surrogate can serve it best. Decisions about the best
   surrogate can be made on a per-object basis and can depend on
   various metrics (see Section 6). The overall goal is to improve
   scalability and the performance for delivering the modified content,
   including all embedded objects.

4.2.2 Basic Content Modification Mechanism

   Typically, content objects are made up of a basic structure that
   includes references to additional, embedded content objects. Most
   web pages, for example, consist of an HTML document that contains
   plain text together with some embedded objects, such as GIF or JPEG
   images. The embedded objects are referenced using embedded HTML
   directives. A similar scheme is used for streaming content, which is
   typically embedded within a SMIL document. Traditionally, embedded
   HTML or SMIL directives tell the client to fetch embedded objects
   from the origin server. A content provider can now modify references
   to embedded objects so that the client is told to fetch an embedded
   object from the best surrogate (instead of from the origin server).
   This type of content modification is also referred to as URL
   Rewriting. It can be done a-priori in a static way, or more
   dynamically on-demand. The following subsections explore both
   alternatives.

4.2.2.1 A-priori URL Rewriting

   A content provider can modify its content and rewrite embedded URLs
   a-priori, i.e., before the content is put on the origin server and
   made available to clients. In this case, rewriting can be done
   either manually or by using a software tool that parses the content
   and replaces embedded URLs. A-priori URL rewriting alone does not
   allow consideration of client specifics for request mapping.
   It can
   be used in combination with DNS request mapping, however, to direct
   related DNS queries into the domain name space of the service
   provider (see Section 6.1). Dynamic request mapping based on client
   specifics is then done using the DNS approach.

4.2.2.2 On-Demand URL Rewriting

   With on-demand URL rewriting, the content is modified when the
   client request reaches the origin server. At this time, the identity
   of the client is known and can be considered when rewriting embedded
   URLs. In particular, an automated process can determine, on-demand,
   which surrogate would serve the requesting client best. (For a
   discussion on which metrics can be used and how to get proximity
   measures, see Section 6.1.) Embedded URLs can then be rewritten so
   that the client is told to fetch the referenced objects from the
   best surrogate rather than from the origin server.

4.2.2.3 Content Modification Problems

   The use of content modification as a request mapping mechanism comes
   with several drawbacks:

   1.  The first request from a client to a specific site always has
       to be served from the origin server.

   2.  Content that has been modified to include references to
       nearby surrogates rather than to the origin server should be
       marked as non-cacheable and should not be cached.
       Alternatively, such pages can be marked to be cacheable only
       for a relatively short period of time. Rewritten URLs on cached
       pages can cause problems, because they can be outdated and
       point to surrogates that are no longer available or no longer
       good choices.

   3.  On-demand URL rewriting (including content parsing,
       information retrieval, and URL rewriting) has to be done in
       real-time, which poses the question of performance and
       processing capabilities.

5. Combination of multiple mechanisms

   There are environments in which a combination of different
   mechanisms can be beneficial and advantageous over using one of the
   proposed mechanisms alone. The following example illustrates how the
   mechanisms can be used in combination.

   A basic problem of DNS request mapping is the resolution granularity
   that allows resolution on a per-domain level only. A per-object
   redirection cannot easily be achieved. However, content modification
   can be used together with DNS request mapping to overcome this
   problem. With content modification, references to different objects
   on the same origin server can be rewritten to point into different
   domain name spaces. Using DNS request mapping, requests for those
   objects can now dynamically be mapped to different surrogates.

6. Measurements

   CDNs' Request Mapping Systems make use of a variety of metrics for
   the decision of which surrogate to select for a user request. These
   metrics are based on both network measurements and feedback from
   surrogates.

   It is common practice to combine multiple metrics using both
   proximity and surrogate feedback for best surrogate selection.
   There are countless possibilities for metrics as well as metric
   combinations; the following sections describe several well-known
   metrics as well as the two major techniques for obtaining metrics.

6.1 Proximity Measurements

   Some CDN Request Mapping Systems make use of "proximity"
   measurements to direct users to the "closest" surrogate. If a DNS
   system is used for request mapping, then these measurements are made
   to the client's local DNS server; this heuristic is not always
   accurate. In a client-side direction model, the IP address of the
   client is directly exposed and therefore more accurate proximity
   measurements can be obtained.
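
   As an illustration of combining a proximity metric with surrogate
   feedback, the sketch below ranks surrogates by a weighted sum of
   normalized round-trip time and reported load. The weighting scheme,
   the default weight of 0.7, the normalization, and the function name
   best_surrogate are all assumptions chosen for the example; this memo
   does not prescribe any particular combination.

```python
def best_surrogate(rtt_ms, cpu_load, weight_proximity=0.7):
    """Return the surrogate with the lowest combined score.

    rtt_ms:   surrogate -> measured round-trip time to the requesting
              entity (or to its DNS server, if DNS mapping is used)
    cpu_load: surrogate -> load in [0, 1] reported by the surrogate
    """
    worst_rtt = max(rtt_ms.values())

    def score(name):
        # Scale RTT to [0, 1] so it is comparable with the load value.
        proximity = rtt_ms[name] / worst_rtt if worst_rtt else 0.0
        return weight_proximity * proximity + (1 - weight_proximity) * cpu_load[name]

    return min(rtt_ms, key=score)


# Example input: "a" is nearest but heavily loaded.
rtt = {"a": 20.0, "b": 40.0, "c": 200.0}
load = {"a": 0.95, "b": 0.10, "c": 0.05}
print(best_surrogate(rtt, load))
```

   With these inputs the nearest surrogate is passed over because of
   its load; setting weight_proximity to 1.0 reduces the choice to pure
   proximity.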

   Proximity measurements are made between the CDN surrogate set and
   the requesting entity. In many cases, proximity measurements are
   "one-way" in that they measure only either the forward or reverse
   path of packets from the surrogate to the requesting entity. This
   is important as many paths in the Internet are asymmetric.

   In order to obtain a set of proximity measurements, a CDN may employ
   active probing techniques and/or passive measurement techniques.
   The following sections describe these two techniques.

6.1.1 Probing

   In order to obtain a set of proximity measurements, a CDN may employ
   an active probing technique. In active probing, past or
   potential requesting entities are probed using one or more
   techniques to determine one or more metrics from each surrogate (or
   set). An example of a probing technique would be an ICMP ECHO
   Request periodically sent from each surrogate (or set) to a
   potential requesting entity.

   The problems with an active probing approach are:

   1.  Measurements can only be taken periodically.

   2.  Firewalls and NATs may block probes.

   3.  Probes often cause security alarms to be triggered on
       intrusion detection systems.

   In any active probing approach, a list of potential requesting
   entities needs to be obtained. This list can be generated
   dynamically: as requests arrive, the requesting entity addresses can
   be cached for later probing. Another potential solution is to use
   an algorithm to divide address space into blocks and to probe those
   blocks.

6.1.2 Passive Measurement

   The other measurement technique makes use of passive measurements
   which are obtained when a client actually transfers data to/from a
   surrogate. In this technique, a bootstrap mechanism is used to
   direct the client to a bootstrap surrogate. Once the client
   connects, the actual performance of the transfer is measured.
   This data is then fed back into the request mapping system.

   An example of passive measurement is to watch the packet loss from a
   client to a surrogate by observing TCP behavior. Latency
   measurements can also be learned by observing TCP behavior (as TCP's
   congestion control is partly based on RTT).

   The problems with a passive measurement approach are mostly related
   to the bootstrapping mechanism. A good mechanism is needed so that
   not every surrogate needs to be "tested" per client to obtain the
   measurement information.

6.1.3 Metric Types

   The following list gives some of the metrics which can be used for
   proximity calculations. This list is not meant to be exhaustive.

   * Latency: Network latency metrics are used to determine the
     surrogate (or set of surrogates) that has the least delay to
     the requesting entity. These measurements can be obtained
     using either an active probing approach or a passive network
     measurement system.

   * Packet Loss: Packet loss measurements can be used as a
     selection metric. A passive measurement approach can obtain
     packet loss information from TCP header information. Active
     probing can periodically measure packet loss from probes.

   * Hop Counts: Router hops from the surrogate to the requesting
     entity can be used as a proximity measurement.

   * BGP Information: BGP AS_PATH and MED attributes can be used to
     determine the "BGP distance" to a given prefix/length pair.
     In order to use BGP information for proximity measurements, it
     must be obtained at each surrogate site/location.

6.2 Surrogate Feedback

   Some CDN request mapping mechanisms make use of surrogate feedback
   information in order to select a "least-loaded" surrogate. Feedback
   can be delivered from each surrogate or can be aggregated by site or
   by location. This feedback information is fed into the Request
   Mapping System.
   CDNs often make use of both proximity and surrogate feedback to
   make decisions.

   Examples of surrogate feedback metrics include CPU load, interface
   load, interface dropped packets, and number of connections.

6.2.1 Probing

   Feedback information may be obtained by periodically probing a
   surrogate, for example by issuing an HTTP request and observing the
   behavior. The problems with probing for surrogate information are:

   1. It is difficult to obtain "real-time" information.

   2. Non-real-time information may be inaccurate.

6.2.2 Monitoring

   Feedback information may also be obtained by agents that reside on
   surrogates. These agents can communicate a variety of metrics about
   the surrogates.

6.2.3 Metrics

   The following summarizes several of the well-known metrics which
   are used for surrogate feedback:

   * Surrogate CPU Load.

   * Interface Load / Dropped packets.

   * Number of connections being served.

   * Storage I/O Load.

7. Security Considerations

   This is a preliminary draft for discussion purposes only, submitted
   prior to the formation of the working group. As such, security
   considerations have been mostly deferred until after the working
   group is constituted. [This document is not expected to be a formal
   submission of the working group in its current form.] This document
   in particular is a summary of mechanisms documented elsewhere.
   Please consult the referenced documents for any mechanism-specific
   security considerations.

8. Acknowledgements

   [Reviewers go here]

References

   [1] Postel, J., "File Transfer Protocol", RFC 765, June 1980.

   [2] Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", RFC
       2246, January 1999.

   [3] Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming
       Protocol", RFC 2326, April 1998.

   [4] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L.,
       Leach, P. and T.
       Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC
       2616, June 1999.

   [5] Day, M., Cain, B. and G. Tomlinson, "A Model for CDN Peering",
       draft-day-cdnp-model-02.txt (work in progress), October 2000.

Authors' Addresses

   Brad Cain
   Mirror Image Internet
   49 Dragon Court
   Woburn, MA 01801
   US

   Phone: +1 781 276 1904
   EMail: brad.cain@mirror-image.com

   Fred Douglis
   AT&T Labs
   Room B137
   180 Park Ave, Bldg 103
   Florham Park, NJ 07932
   US

   Phone: +1 973 360 8775
   EMail: douglis@research.att.com

   Mark Green
   Entera, Inc.
   40971 Encyclopedia Circle
   Fremont, CA 94538
   US

   Phone: +1 510 770 5268
   EMail: markg@entera.com

   Markus Hofmann
   Lucent Technologies
   Room 4F-513
   101 Crawfords Corner Rd.
   Holmdel, NJ 07733
   US

   Phone: +1 732 332 5983
   EMail: hofmann@bell-labs.com

   Raj Nair
   Cisco Systems
   50 Nagog Park
   Acton, MA 01720
   US

   Phone: +1 978 206 3029
   EMail: rnair@cisco.com

   Doug Potter
   Cisco Systems
   50 Nagog Park
   Acton, MA 01720
   US

   Phone: +1 978 206 ????
   EMail: dougpott@cisco.com

   Oliver Spatscheck
   AT&T Labs
   Room B131
   180 Park Ave, Bldg 103
   Florham Park, NJ 07932
   US

   Phone: +1 973 360 ????
   EMail: spatsch@research.att.com

Full Copyright Statement

   Copyright (C) The Internet Society (2000). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works.
   However, this document itself may not be modified in any way, such
   as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the
   procedures for copyrights defined in the Internet Standards process
   must be followed, or as required to translate it into languages
   other than English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC editor function is currently provided by the
   Internet Society.