idnits 2.17.1 

draft-cain-cdnp-known-req-map-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 53 instances of too long lines in the document, the longest
     one being 1 character in excess of 72.

  == There are 2 instances of lines with non-RFC2606-compliant FQDNs in the
     document.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 14, 2000) is 8563 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: '5' is defined on line 716, but no explicit reference
     was found in the text

  ** Obsolete normative reference: RFC  765 (ref. '1') (Obsoleted by RFC 959)

  ** Obsolete normative reference: RFC 2246 (ref. '2') (Obsoleted by RFC 4346)

  ** Obsolete normative reference: RFC 2326 (ref. '3') (Obsoleted by RFC 7826)

  ** Obsolete normative reference: RFC 2616 (ref. '4') (Obsoleted by RFC
     7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235)

  == Outdated reference: A later version (-09) exists of
     draft-day-cdnp-model-02

  -- Possible downref: Normative reference to a draft: ref. '5' 


     Summary: 8 errors (**), 0 flaws (~~), 5 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                            B. Cain
3	Internet-Draft                                     Mirror Image Internet
4	Expires: May 15, 2001                                         F. Douglis
5	                                                                AT&T Labs
6	                                                                 M. Green
7	                                                                   Entera
8	                                                               M. Hofmann
9	                                                                   Lucent
10	                                                                  R. Nair
11	                                                                D. Potter
12	                                                                    Cisco
13	                                                            O. Spatscheck
14	                                                                AT&T Labs
15	                                                        November 14, 2000

17	                   Known CDN Request Mapping Mechanisms
18	                   draft-cain-cdnp-known-req-map-00.txt

20	Status of this Memo

22	    This document is an Internet-Draft and is in full conformance with
23	    all provisions of Section 10 of RFC2026.

25	    Internet-Drafts are working documents of the Internet Engineering
26	    Task Force (IETF), its areas, and its working groups. Note that
27	    other groups may also distribute working documents as
28	    Internet-Drafts.

30	    Internet-Drafts are draft documents valid for a maximum of six
31	    months and may be updated, replaced, or obsoleted by other documents
32	    at any time. It is inappropriate to use Internet-Drafts as reference
33	    material or to cite them other than as "work in progress."

35	    The list of current Internet-Drafts can be accessed at
36	    http://www.ietf.org/ietf/1id-abstracts.txt.

38	    The list of Internet-Draft Shadow Directories can be accessed at
39	    http://www.ietf.org/shadow.html.

41	    This Internet-Draft will expire on May 15, 2001.

43	Copyright Notice

45	    Copyright (C) The Internet Society (2000). All Rights Reserved.

47	Discussion List & Archives

49	    This document and related documents are discussed on the cdn mailing
50	    list. To join the list, send mail to cdn-request@ops.ietf.org. To
51	    contribute to the discussion, send mail to cdn@ops.ietf.org. The
52	    archives are at ftp://ops.ietf.org/pub/lists/cdn.*.

54	Abstract

56	    This memo presents a number of known mechanisms used to direct
57	    client application requests to surrogate servers based on various
58	    policies. In this memo we group mechanisms commonly called request
59	    routing, content routing or content redirection under the term
60	    request mapping. There exist multiple request mapping mechanisms. At
61	    a high level, these may be classified under: DNS Request Mapping,
62	    Transport-layer Mapping, and Application-layer Mapping.

64	Table of Contents

66	    1.      Introduction
67	    2.      DNS Request Mapping
68	    2.1     Basic DNS Mapping Mechanisms
69	    2.2     Multiple Replies
70	    2.3     Multi-level Resolution
71	    2.4     NS Redirection
72	    2.5     CNAME Redirection
73	    2.6     Anycast
74	    2.7     Object Encoding
75	    2.8     DNS Request Mapping Problems
76	    3.      Transport-layer Mapping
77	    4.      Application-layer Mapping
78	    4.1     Header Inspection
79	    4.1.1   URL-based Mapping
80	    4.1.1.1 302 Redirection
81	    4.1.1.2 In-Path Element
82	    4.1.2   Mime Header-based Mapping
83	    4.1.3   Site-specific Identifiers
84	    4.2     Content Modification
85	    4.2.1   Content Modification Overview
86	    4.2.2   Basic Content Modification Mechanism
87	    4.2.2.1 A-priori URL Rewriting
88	    4.2.2.2 On-Demand URL Rewriting
89	    4.2.2.3 Content Modification Problems
90	    5.      Combination of multiple mechanisms
91	    6.      Measurements
92	    6.1     Proximity Measurements
93	    6.1.1   Probing
94	    6.1.2   Passive Measurement
95	    6.1.3   Metric Types
96	    6.2     Surrogate Feedback
97	    6.2.1   Probing
98	    6.2.2   Monitoring
99	    6.2.3   Metrics
100	    7.      Security Considerations
101	    8.      Acknowledgements
102	            References
103	            Authors' Addresses
104	            Full Copyright Statement

106	1. Introduction

108	    The term "mapping" is used to convey a more general sense than the
109	    word "direction". For example, one type of mapping is based on what
110	    is commonly called the HTTP "redirect" mechanism. However, there are
111	    methods that direct requests to SURROGATES without relying on a
112	    particular protocol's redirection methods. Hence, the term "mapping"
113	    is used to cover a wider variety of techniques than what might be
114	    implied by the term "direction".

116	    There exist multiple request mapping mechanisms. At a high level,
117	    these may be classified under: DNS Request Mapping, Transport-layer
118	    Mapping, and Application-layer Mapping.

120	2. DNS Request Mapping

122	    DNS Request Mapping is used in many CDNs because of its ubiquity as
123	    a directory service. The basic concept of DNS-based request mapping
124	    is to insert a DNS server in the DNS resolution process.  The server
125	    returns a different set of IP addresses, or a different ordering of
126	    entries in the returned set, depending on various metrics (see
127	    Section 6). The overall goal is to improve the performance and
128	    scalability of the objects represented by the domain name resolved.

130	2.1 Basic DNS Mapping Mechanisms

132	    In its simplest form the mapping DNS server is authoritative for an
133	    entire DNS domain or a subdomain. If a DNS resolution for the domain
134	    is requested, the mapping DNS server will determine the IP address
135	    of the best surrogate, in terms of a metric as defined in Section 6.
136	    This IP address is returned in an A record to the client site DNS
137	    server, and it may actually be a virtual IP (VIP) address of the
138	    best set of surrogates for the client site DNS server.
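The selection step described above can be sketched as follows. This is
only an illustration under stated assumptions: the cost table, the
address prefixes (taken from the documentation ranges), and the names
COSTS, DEFAULT_SURROGATE, and map_query are hypothetical and do not
come from any particular CDN implementation. The metric itself would
be one of those discussed in Section 6.

```python
import ipaddress

# Hypothetical metric table: for each client site DNS server prefix, the
# measured cost to each surrogate (lower is better).
COSTS = {
    "192.0.2.0/25": {"198.51.100.10": 12.0, "203.0.113.10": 48.0},
    "192.0.2.128/25": {"198.51.100.10": 40.0, "203.0.113.10": 9.0},
}
# Assumed fallback when nothing is known about the querying DNS server.
DEFAULT_SURROGATE = "198.51.100.10"


def map_query(resolver_ip):
    """Return the surrogate address to place in the A record."""
    addr = ipaddress.ip_address(resolver_ip)
    for prefix, costs in COSTS.items():
        if addr in ipaddress.ip_network(prefix):
            # Pick the surrogate with the lowest cost for this prefix.
            return min(costs, key=costs.get)
    return DEFAULT_SURROGATE
```

Note that the only input is the address of the client site DNS server,
not the address of the client itself.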

140	2.2 Multiple Replies

142	    To increase the reliability of the solution, the mapping DNS server
143	    can return multiple replies. Common implementations of client site
144	    DNS servers use those multiple replies in order while rotating them.
145	    Therefore, the order in which the records are returned, as well as
146	    the number of times a particular entry is repeated, can be used to
147	    map multiple clients using a single client site DNS server.
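A minimal sketch of this effect, assuming a client site DNS server
that hands out the first record and then rotates the list. The
function names build_reply and rotate_and_pick and the integer weights
are hypothetical; real resolver behavior varies.

```python
from collections import deque


def build_reply(weights):
    """Return addresses, heaviest first, each repeated `weight` times."""
    reply = []
    for addr, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        reply.extend([addr] * weight)
    return reply


def rotate_and_pick(reply, clients):
    """Simulate a resolver that serves the first record, then rotates."""
    records = deque(reply)
    picks = []
    for _ in range(clients):
        picks.append(records[0])
        records.rotate(-1)  # move the record just used to the end
    return picks
```

With weights of 3 and 1, three out of every four clients behind the
same client site DNS server are mapped to the heavier entry.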

149	2.3 Multi-level Resolution

151	    To allow for multiple mapping decisions, multiple mapping DNS
152	    servers can be involved in a single DNS resolution. The rationale
153	    for utilizing multiple mapping DNS servers in a single DNS
154	    resolution is to distribute more complex decisions from a single
155	    mapping DNS server to multiple, more specialized, mapping DNS
156	    servers. The most common mechanisms used to insert multiple mapping
157	    DNS servers in a single DNS resolution are NS and CNAME records.

160	2.4 NS Redirection

162	    Using NS records, multiple mapping DNS servers can be included by
163	    redirecting the authority of the next level domain to another
164	    mapping DNS server. For example, the client site DNS server
165	    resolving a.b.c.com would eventually request a resolution of
166	    a.b.c.com from the name server authoritative for c.com. The
167	    name server authoritative for this domain might be a mapping DNS
168	    server. In this case the mapping DNS server can either return a set
169	    of A records or can redirect the resolution of the request a.b.c.com
170	    to the DNS server that is authoritative for b.c.com using NS records.

172	    One drawback in the use of NS records is that the number of mapping
173	    DNS servers is limited by the number of parts in the DNS name.  This
174	    problem results from the DNS policy that causes a client site DNS
175	    server to abandon a request if no additional parts of the DNS name
176	    are resolved in an exchange with an authoritative DNS server.

178	    A second drawback is that the last DNS server can determine the TTL
179	    of the entire resolution process. The reason is that the last DNS
180	    server can return in the authoritative section of its response its
181	    own NS record. The TTL for this record is solely determined by the
182	    last DNS server. This cached NS record will be used by the client
183	    site DNS server for further resolutions until it expires.

185	    Another drawback is that some implementations of BIND deliberately
186	    cause timeouts (typically 5 seconds) to simplify their
187	    implementation in cases in which an NS-level redirect points to a
188	    name server for which no valid A record is returned or cached. This
189	    is especially a problem if the domain of the name server does not
190	    match the domain currently resolved, since in this case the A
191	    records which might be passed in the DNS response are discarded for
192	    security reasons. Empirical measurements show that DNS lookups to
193	    sites with NS-level redirection using this type of setup have a
194	    high incidence of DNS timeouts.

196	2.5 CNAME Redirection

198	    Multi-level redirection using CNAMEs works similarly to NS records
199	    in that a mapping DNS server returns a CNAME for a domain to map the
200	    further request resolution to an entirely new domain and potentially
201	    a new set of mapping DNS servers. The disadvantage of this approach
202	    is mainly the additional overhead of resolving the new domain name.
203	    One advantage is that the number of mapping DNS servers is
204	    independent of the depth of the domain name. The number of mapping
205	    DNS servers is only restricted by the resource limits defined on the
206	    client site DNS server. Another advantage is the avoidance of some
207	    DNS timeouts.
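The CNAME chain and the resolver-side limit can be sketched as below.
The record table, the example names, and the limit of 8 hops are
assumptions made for illustration; actual client site DNS servers
define their own resource limits.

```python
# Hypothetical records as seen by a client site DNS server. Each CNAME
# hands the resolution to a new domain and, potentially, to a new
# mapping DNS server.
RECORDS = {
    "www.example.com": ("CNAME", "www.cdn.example.net"),
    "www.cdn.example.net": ("CNAME", "east.cdn.example.net"),
    "east.cdn.example.net": ("A", "192.0.2.10"),
}
MAX_CNAME_HOPS = 8  # assumed resolver resource limit


def resolve(name, records=RECORDS, max_hops=MAX_CNAME_HOPS):
    """Follow CNAMEs until an A record is found; return (address, hops)."""
    hops = 0
    while True:
        rtype, value = records[name]
        if rtype == "A":
            return value, hops
        hops += 1
        if hops > max_hops:
            raise RuntimeError("CNAME chain exceeds resolver limit")
        name = value
```

Each hop costs an additional lookup, which is the overhead noted
above.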

209	2.6 Anycast

211	    To combine measurement and redirection, the mapping DNS server can
212	    advertise an anycast address as its IP address. The same anycast
213	    address is used by multiple physical DNS servers. In this scenario,
214	    the mapping DNS server that is the closest to the client site DNS
215	    server in terms of OSPF and BGP routing will receive the packet
216	    containing the DNS resolution request. The mapping DNS server at
217	    this point knows that it is the closest (by this metric) and can use
218	    this information to make a mapping decision. Drawbacks of this
219	    solution are:

221	       *  It is not known if the mapping DNS server is the closest DNS
222	          server in terms of routing from the mapping server to the
223	          client.

225	       *  BGP is not load sensitive. So the closest server in terms of
226	          routing might not be the server with the least network latency.

228	       *  The server load is not considered during routing. If server
229	          load has to be considered while finding the best mapping
230	          server, it has to be folded into the routing metrics used by
231	          the routing protocol.

233	2.7 Object Encoding

235	    Since only DNS names are visible during the DNS mapping, some
236	    solutions encode the object type, object hash or similar information
237	    into the DNS name. This might vary from a simple division of objects
238	    based on object type (such as images.a.b.c.com and
239	    streaming.a.b.c.com) to a sophisticated scheme in which the domain
240	    name contains a unique identifier (such as a hash) of the object.
241	    The obvious advantage is that object information is available during
242	    the mapping process. The disadvantage is that the client site DNS
243	    server has to perform multiple DNS resolutions to retrieve a single
244	    Web page, which might increase rather than decrease the overall
245	    latency.
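Both ends of this range can be sketched in a few lines. The suffix
table, the "h" label prefix, the 8-digit truncation, and the choice of
SHA-1 are assumptions made for the example; the draft does not
prescribe an encoding.

```python
import hashlib

# Hypothetical mapping from file suffix to a type label.
TYPE_LABELS = {".gif": "images", ".jpg": "images", ".rm": "streaming"}


def name_by_type(path, base_domain):
    """Prefix the domain with a label derived from the object type."""
    for suffix, label in TYPE_LABELS.items():
        if path.lower().endswith(suffix):
            return f"{label}.{base_domain}"
    return base_domain


def name_by_hash(path, base_domain, digits=8):
    """Prefix the domain with a label carrying a hash of the object."""
    digest = hashlib.sha1(path.encode("utf-8")).hexdigest()[:digits]
    return f"h{digest}.{base_domain}"
```

A per-object label means one DNS resolution per distinct object name,
which is the latency cost described above.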

247	2.8 DNS Request Mapping Problems

249	    The use of DNS as a request mapping mechanism comes with several
250	    problems:

252	       1.  DNS only allows resolution on a per-domain level, not a
253	           per-object level.  An ideal request resolution service would
254	           service requests with per-object detail.  Client-side
255	           direction services allow this kind of resolution because of
256	           their direct inspection of client requests.

258	       2.  DNS systems are typically not designed for very high volumes
259	           of requests.  Such volumes occur in CDNs that desire near
260	           real-time direction of requests to surrogates, because they
261	           must return DNS entries with a short time-to-live (TTL) in
262	           order to offer a different response as conditions change.

264	       3.  DNS server and client implementations do not always adhere to
265	           the DNS standards and therefore cause problems with DNS
266	           request mapping.  For example, many implementations do not
267	           honor the DNS TTL field.

269	       4.  DNS Request Mapping is based only on knowledge of the local
270	           DNS server, as client addresses are not relayed within DNS
271	           requests.  DNS request mapping inherently makes use of an
272	           assumption that users select a DNS server that is "close" to
273	           them.  Although this is true in many cases, it is not always
274	           valid.  This causes problems, especially for proximity-based
275	           measurements that are made using an active probing technique.
276	           In this case, proximity measurements are made to the user's
277	           DNS server.

279	       5.  DNS servers can request and allow recursive resolution of DNS
280	           names. If recursive resolution is used during the resolution
281	           of the DNS request, the mapping DNS server does not see the
282	           IP address of the client site DNS server, but instead sees
283	           the address of the DNS server that is recursively requesting
284	           the information --- possibly a DNS server operated by the
285	           site for which the mapping DNS server is resolving content.
286	           For example, imgs.company.com might be resolved by a CDN, but
287	           the request for the resolution might come from
288	           dns1.company.com as a result of recursion.

290	       6.  When a large number of clients share a single client site DNS
291	           server, they will all be redirected to the same set of IP
292	           addresses during the TTL interval.  This might lead to
293	           overload of the surrogate or surrogates behind this IP
294	           address if during a flash crowd the number of clients
295	           requesting documents from that single IP address exceeds the
296	           capacity of the surrogate or surrogates.

298	       7.  Some implementations of BIND cause DNS timeouts to occur
299	           while exceptional situations are handled.  These
300	           "exceptional" circumstances include NS redirections to
301	           unknown domains.

303	3. Transport-layer Mapping

305	    The first stage of CDN selection is typically accomplished using the
306	    DNS mechanisms described previously. As described in Section 2, this
307	    first level decision must be made based on the information available
308	    at the time, specifically the domain name being resolved and the IP
309	    address of the client-side DNS server. While this level of
310	    information is adequate in many cases, finer levels of granularity
311	    can be achieved by inspecting the subsequent request from the client
312	    browser to the surrogate chosen by DNS. The simplest of the
313	    approaches used today is 'Transport-layer mapping'.

315	    Transport-layer mapping makes use of the information available in
316	    the first packet of the client request to make surrogate selection
317	    decisions. The specific metrics used are identical to those used at
318	    DNS time (see Section 6) but include the client's IP address (rather
319	    than the client's DNS server) and the layer 4 protocol and port
320	    information carried in that first packet in the decision making
321	    process. Handing off the session to a more appropriate surrogate is
322	    accomplished in a variety of proprietary means beyond the scope of
323	    this document. Typically the forward-flow traffic (client to newly
324	    selected surrogate) will flow through the surrogate originally
325	    chosen by DNS. The reverse-flow (surrogate to client) traffic, which
326	    normally transfers much more data than the forward flow, typically
327	    takes the direct path.
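The decision made on the first packet might look like the sketch
below. The rule table, the surrogate names, and the FirstPacket type
are hypothetical. The hand-off itself is proprietary and is not
modeled here.

```python
import ipaddress
from typing import NamedTuple


class FirstPacket(NamedTuple):
    client_ip: str
    protocol: str  # layer 4 protocol, e.g. "tcp"
    dst_port: int


# Hypothetical policy: rules are checked in order; the first match wins.
# Each rule is (client prefix, protocol, destination port, surrogate).
RULES = [
    ("192.0.2.0/24", "tcp", 554, "rtsp-east.example.net"),
    ("0.0.0.0/0", "tcp", 554, "rtsp-default.example.net"),
    ("0.0.0.0/0", "tcp", 21, "ftp-default.example.net"),
]


def select_surrogate(pkt, current):
    """Return a better surrogate, or `current` (the one chosen by DNS)."""
    addr = ipaddress.ip_address(pkt.client_ip)
    for prefix, proto, port, surrogate in RULES:
        if (addr in ipaddress.ip_network(prefix)
                and pkt.protocol == proto
                and pkt.dst_port == port):
            return surrogate
    return current
```

Unlike the DNS case, the client's own address is available as an
input.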

329	    The overhead associated with transport-layer mapping makes the most
330	    sense for longer-lived flows such as FTP [1] or RTSP [3] or to
331	    direct away from overloaded surrogates.

333	4. Application-layer Mapping

335	    Application-layer mapping involves examining packets more deeply,
336	    beyond the transport-layer header. It works together with DNS
337	    request mapping, provides fine-grained mapping control down to the
338	    level of individual objects, and can be effected in real time at
339	    the time of the object request. As in the case of transport-layer
340	    mapping, the request routing process is more accurate than in the
341	    DNS request mapping case, because it is based on the client's own IP
342	    address rather than that of a client site DNS server.

344	4.1 Header Inspection

346	    Protocols such as HTTP [4], RTSP [3], and SSL [2] provide hints in
347	    the initial portion of the session about how the client request
348	    must be mapped. These hints may come from the URL of the content or
349	    other parts of the MIME request header, such as cookies.

351	4.1.1 URL-based Mapping

353	    HTTP and RTSP content requests describe the requested content by its
354	    URL. In many cases, this information is sufficient to disambiguate
355	    the content and suitably map the request. In practice, it is often
356	    enough to use a sub-string, such as a prefix or suffix, of the URL
357	    to make the mapping decision.
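Prefix and suffix matching on the URL can be sketched as follows. The
rule tables and node names are hypothetical, and the order in which
prefixes and suffixes are tried is an arbitrary choice for this
example.

```python
from urllib.parse import urlsplit

# Hypothetical mapping rules; the first matching rule wins.
PREFIX_RULES = [("/video/", "streaming-node"), ("/img/", "image-node")]
SUFFIX_RULES = [(".gif", "image-node"), (".jpg", "image-node")]


def map_by_url(url, default="default-node"):
    """Choose a delivery node from a prefix or suffix of the URL path."""
    path = urlsplit(url).path  # ignore scheme, host, and query string
    for prefix, node in PREFIX_RULES:
        if path.startswith(prefix):
            return node
    for suffix, node in SUFFIX_RULES:
        if path.lower().endswith(suffix):
            return node
    return default
```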

359	4.1.1.1 302 Redirection

361	    In redirection-based mapping, the client is first resolved to a
362	    virtual surrogate, which in turn returns an application-specific
363	    return code such as 302 (in the case of HTTP or RTSP) that
364	    indicates to the client the IP address of the delivery node chosen
365	    based on suitable metrics as described in Section 6.

367	    The advantage of this type of application-aware mapping is
368	    simplicity in implementation. However, the main drawback of this
369	    method is the additional latency involved in sending the redirect
370	    message back to the client.
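For HTTP, the response produced by the virtual surrogate could be
built as in the sketch below. It assumes an origin-form request target
(a path such as /a/b.gif) and a delivery node that has already been
selected; the function name redirect_response is hypothetical.

```python
def redirect_response(request_line, delivery_node):
    """Build a minimal HTTP 302 response pointing at the delivery node."""
    # A request line looks like "GET /a/b.gif HTTP/1.1".
    _method, target, version = request_line.split()
    location = f"http://{delivery_node}{target}"
    return (
        f"{version} 302 Found\r\n"
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )
```

The client must then open a second connection to the address in the
Location header, which is the added latency mentioned above.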

372	4.1.1.2 In-Path Element

374	    An In-Path element is a network element in the forwarding path of
375	    the client's request. The In-Path element provides transparent
376	    interception of the transport connection. This is accomplished by
377	    accepting the connection request and establishing sequence numbers
378	    via the three-way handshake with the client. This allows the In-Path
379	    element to examine the content requests and glean the request header
380	    information such as the URL, match it with a URL template, and make
381	    the content routing determination. Again, metrics such as those
382	    described in Section 6 may be employed.

384	    Finally, the In-Path element splices the client connection to a
385	    connection with the appropriate delivery node and passes along the
386	    content request. The return path would pass through the In-Path
387	    element. However, it is possible to arrange for a direct return by
388	    passing the address translation information to the surrogate or
389	    delivery node through some proprietary means.

391	    The primary disadvantage with this method is the performance
392	    implications of URL-parsing in the path of the network traffic.
393	    However, it is generally the case that the return traffic is much
394	    larger than the forward traffic.

396	    Traffic may be partitioned and load balanced among a set of delivery
397	    nodes by content objects identified by URLs. This allows
398	    object-specific control of server loading. For example, requests for
399	    non-cacheable objects may be directed away from a cache.
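Partitioning by URL can be sketched with a hash over the request path.
The cache names, the list of non-cacheable prefixes, and the use of
SHA-256 are assumptions for illustration only.

```python
import hashlib

CACHES = ["cache-1.example.net", "cache-2.example.net",
          "cache-3.example.net"]
ORIGIN = "origin.example.com"
# Assumed site policy: paths that never produce cacheable objects.
NON_CACHEABLE_PREFIXES = ("/cgi-bin/", "/cart/")


def pick_delivery_node(path):
    """Send non-cacheable requests to the origin, others to one cache."""
    if path.startswith(NON_CACHEABLE_PREFIXES):
        return ORIGIN
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    # The same path always hashes to the same cache.
    index = int.from_bytes(digest[:4], "big") % len(CACHES)
    return CACHES[index]
```

Because the mapping is a function of the URL alone, each object is
held by a single delivery node rather than being duplicated on all of
them.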

401	4.1.2 Mime Header-based Mapping

403	    This works just like URL-based mapping, except that other MIME
404	    headers in the content request are used to make the content-rule
405	    selection. Some useful MIME headers are: Cookie, Accept-Language,
406	    and User-Agent.

408	    Cookies are used by a web site to identify a customer or session.
409	    Cookie-based request mapping provides content service
410	    differentiation based on the client. In addition, it is possible to
411	    map the connections of a multi-session transaction to the same
412	    server to achieve session-level persistence. Note that the client
413	    IP address is by itself not a reliable indicator of a session due
414	    to the presence of proxies that aggregate multiple clients at a
415	    single point.

417	    The Accept-Language header can be used to map traffic to a
418	    language-specific delivery node.

420	    The User-Agent header helps identify the type of client device. For
421	    example, a voice-browser, PDA, or cell phone can indicate the type
422	    of delivery node that has content specialized to handle the content
423	    request.
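A sketch of header-based selection is given below. It assumes the
language preference arrives in the HTTP Accept-Language header and
that the session is carried in a cookie named "session"; the cookie
name, the node names, the User-Agent substrings, and the precedence
among the three headers are all hypothetical.

```python
from http.cookies import SimpleCookie

LANGUAGE_NODES = {"de": "de-node", "fr": "fr-node"}  # assumed mapping
MOBILE_HINTS = ("wap", "pda")  # assumed User-Agent substrings


def map_by_headers(headers, sessions, default="default-node"):
    """Pick a delivery node from Cookie, User-Agent, Accept-Language."""
    # 1. Session persistence: a known session stays on its server.
    cookie = SimpleCookie(headers.get("Cookie", ""))
    if "session" in cookie and cookie["session"].value in sessions:
        return sessions[cookie["session"].value]
    # 2. Device type, taken from the User-Agent header.
    agent = headers.get("User-Agent", "").lower()
    if any(hint in agent for hint in MOBILE_HINTS):
        return "mobile-node"
    # 3. First listed language, e.g. "de-DE,de;q=0.9" -> "de".
    first = headers.get("Accept-Language", "").split(",")[0]
    primary = first.split(";")[0].strip().lower()[:2]
    return LANGUAGE_NODES.get(primary, default)
```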

425	4.1.3 Site-specific Identifiers

427	    Site-specific identifiers help authenticate and identify a session
428	    from a specific user. This information may be used to map a content
429	    request.

431	    One example of a site-specific identifier is the SSL Session
432	    Identifier. This identifier is generated by a web server and used by
433	    the web client in succeeding sessions to identify itself and avoid
434	    an entire security authentication exchange. In order to inspect the
435	    session identifier, an In-Path element would observe the responses
436	    of the web server and determine the session identifier, which is
437	    then used to associate the session to a specific server. The
438	    remaining sessions are routed based on the stored session
439	    identifier. Note that SSL Session Identifiers cannot be observed by
440	    the redirect method.
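The bookkeeping performed by the In-Path element can be sketched as a
simple table. The class and method names are hypothetical, and the
parsing of the SSL handshake messages is assumed to happen elsewhere.

```python
class SessionTable:
    """Remember which server issued each SSL session identifier."""

    def __init__(self):
        self._server_by_session = {}

    def observe_server_response(self, session_id, server):
        # Called when a server's handshake response carrying a new
        # session identifier passes through the In-Path element.
        self._server_by_session[session_id] = server

    def route_client_request(self, session_id, pick_new):
        # A client resuming a session presents the stored identifier;
        # otherwise fall back to a fresh mapping decision.
        if session_id and session_id in self._server_by_session:
            return self._server_by_session[session_id]
        return pick_new()
```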

442	4.2 Content Modification

444	4.2.1 Content Modification Overview

446	    Content modification enables a content provider to take direct
447	    control over request mapping without the need for specific switching
448	    devices or directory services sitting in-between the client and the
449	    origin server. By modifying the content according to the client's
450	    specifics, a content provider can directly communicate to the client
451	    which surrogate can serve it best. Decisions about the best
452	    surrogate can be made on a per-object basis and can depend on
453	    various metrics (see Section 6). The overall goal is to improve
454	    scalability and the performance for delivering the modified content,
455	    including all embedded objects.

457	4.2.2 Basic Content Modification Mechanism

459	    Typically, content objects are made up of a basic structure that
460	    includes references to additional, embedded content objects. Most
461	    web pages, for example, consist of an HTML document that contains
462	    plain text together with some embedded objects, such as GIF or JPEG
463	    images. The embedded objects are referenced using embedded HTML
464	    directives. A similar scheme is used for streaming content, which is
465	    typically embedded within a SMIL document. Traditionally, embedded
466	    HTML or SMIL directives tell the client to fetch embedded objects
467	    from the origin server. A content provider can now modify references
468	    to embedded objects so that the client is told to fetch an embedded
469	    object from the best surrogate (instead of from the origin server).
470	    This type of content modification is also referred to as URL
471	    Rewriting. It can be done a-priori in a static way, or more
472	    dynamically on-demand. The following subsections explore both
473	    alternatives.

475	4.2.2.1 A-priori URL Rewriting

477	    A content provider can modify its content and rewrite embedded URLs
478	    a-priori, i.e. before the content is put on the origin server and
479	    made available to clients. In this case, rewriting can be done
480	    either manually or by using a software tool that parses the content
481	    and replaces embedded URLs. A-priori URL rewriting alone does not
482	    allow consideration of client specifics for request mapping. It can
483	    be used in combination with DNS request mapping, however, to direct
484	    related DNS queries into the domain name space of the service
485	    provider (see Section 6.1). Dynamic request mapping based on client
486	    specifics is then done using the DNS approach.
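A software tool of this kind could be as small as the sketch below,
which rewrites absolute URLs in img tags only. The regular expression,
the function name, and the host names are assumptions; a production
tool would use a real HTML parser and handle relative URLs and other
embedding elements.

```python
import re

# Matches the src attribute of an <img> tag with a double-quoted value.
SRC_ATTR = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def rewrite_embedded_urls(html, origin_host, cdn_host):
    """Point embedded image URLs at the service provider's domain."""
    origin_prefix = f"http://{origin_host}/"

    def replace(match):
        url = match.group(2)
        if url.startswith(origin_prefix):
            url = f"http://{cdn_host}/" + url[len(origin_prefix):]
        return match.group(1) + url + match.group(3)

    return SRC_ATTR.sub(replace, html)
```

The rewritten host name is fixed at publication time; the choice of
surrogate is still made later, when that name is resolved.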

488	4.2.2.2 On-Demand URL Rewriting

490	    With dynamic URL rewriting, the content is modified when the client
491	    request reaches the origin server. At this time, the identity of the
492	    client is known and can be considered when rewriting embedded URLs.
493	    In particular, an automated process can determine, on-demand, which
494	    surrogate would serve the requesting client best. (For a discussion
495	    on which metrics can be used and how to get proximity measures, see
496	    Section 6.1.) Embedded URLs can then be rewritten so that the client
497	    is told to fetch referenced objects from the best surrogate rather
498	    than from the origin server.
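The on-demand case can be sketched as below, assuming the page is
stored as a template with an {OBJECT_HOST} placeholder in each
embedded URL. The placeholder, the prefix table, and the function
names are hypothetical; the surrogate choice would in practice be
driven by the metrics of Section 6.

```python
import ipaddress

ORIGIN_HOST = "www.example.com"
# Hypothetical proximity table: client prefix -> preferred surrogate.
SURROGATE_BY_PREFIX = [
    ("192.0.2.0/24", "east.cdn.example.net"),
    ("198.51.100.0/24", "west.cdn.example.net"),
]


def best_surrogate(client_ip):
    """Return the surrogate for this client, or the origin if unknown."""
    addr = ipaddress.ip_address(client_ip)
    for prefix, host in SURROGATE_BY_PREFIX:
        if addr in ipaddress.ip_network(prefix):
            return host
    return ORIGIN_HOST


def render_page(template, client_ip):
    """Fill in embedded URLs at request time, per client."""
    return template.replace("{OBJECT_HOST}", best_surrogate(client_ip))
```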

500	4.2.2.3 Content Modification Problems

502	    The use of content modification as a request mapping mechanism comes
503	    with several drawbacks:

505	       1.  The first request from a client to a specific site always has
506	           to be served from the origin server.

508	       2.  Content that has been modified to include references to
509	           nearby surrogates rather than to the origin server should be
510	           marked as non-cacheable and should not be cached.
511	           Alternatively, such pages can be marked to be cacheable only
512	           for a relatively short period of time. Rewritten URLs on
513	           cached pages can cause problems, because they can be outdated
514	           and point to surrogates that are no longer available or no
515	           longer good choices.

517	       3.  On-demand URL rewriting (including content parsing,
518	           information retrieval, and URL rewriting) has to be done in
519	           real-time, which poses the question of performance and
520	           processing capabilities.
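
    A minimal sketch of the two options in item 2, assuming an HTTP/1.1
    origin server that controls caching through the Cache-Control
    response header [4]. The function name and its parameter are
    hypothetical.

```python
def cache_headers_for_rewritten_page(max_age_seconds=0):
    """Return response headers for a page whose URLs were rewritten.

    With max_age_seconds <= 0 the page is marked as non-cacheable.
    Otherwise it is cacheable only for the given number of seconds,
    which limits how long outdated surrogate references can survive.
    """
    if max_age_seconds <= 0:
        return {"Cache-Control": "no-store"}
    return {"Cache-Control": f"max-age={int(max_age_seconds)}"}
```

    Which of the two variants is appropriate depends on how quickly the
    set of good surrogates changes; this document does not prescribe a
    value.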

522	5. Combination of multiple mechanisms

524	    There are environments in which a combination of different
525	    mechanisms can be beneficial and advantageous over using one of the
526	    proposed mechanisms alone. The following example illustrates how the
527	    mechanisms can be used in combination.

529	    A basic problem of DNS request mapping is its granularity: requests
530	    can be resolved on a per-domain level only, so per-object
531	    redirection cannot easily be achieved. However, content modification
532	    can be used together with DNS request mapping to overcome this
533	    problem. With content modification, references to different objects
534	    on the same origin server can be rewritten to point into different
535	    domain name spaces. Using DNS request mapping, requests for those
536	    objects can now dynamically be mapped to different surrogates.
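
    The combination can be sketched as follows. The mapping from file
    extension to domain name, the domain names themselves, and the
    function name are assumptions chosen for illustration. The only
    point of the sketch is that objects from one origin server end up
    in different domain name spaces, which DNS request mapping can then
    resolve independently.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Hypothetical assignment of object classes to domain names.
CLASS_DOMAINS = {
    ".gif": "images.cdn.example.net",
    ".jpg": "images.cdn.example.net",
    ".rm": "streams.cdn.example.net",
}


def rewrite_for_dns_mapping(url, origin_host="www.example.com"):
    """Move an object reference into a per-class domain name space.

    URLs on other hosts, and objects without a configured class, are
    returned unchanged and continue to point at the origin server.
    """
    parts = urlsplit(url)
    if parts.hostname != origin_host:
        return url
    extension = posixpath.splitext(parts.path)[1].lower()
    domain = CLASS_DOMAINS.get(extension)
    if domain is None:
        return url
    return urlunsplit(parts._replace(netloc=domain))
```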

538	6. Measurements

540	    CDN Request Mapping Systems use a variety of metrics to decide
541	    which surrogate to select for a user request.  These metrics are
542	    based on both network measurements and feedback from
543	    surrogates.

545	    It is common practice to combine multiple metrics using both
546	    proximity and surrogate feedback for best surrogate selection.
547	    Many metrics and metric combinations are possible; the following
548	    sections describe several well-known metrics as well as the two
549	    major techniques for obtaining metrics.
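
    One possible way to combine a proximity metric with a surrogate
    feedback metric is a weighted sum, sketched below. The choice of
    round-trip time and CPU load as inputs, the normalization, the
    weights, and all names are assumptions for illustration; this
    document does not define a particular combination.

```python
from dataclasses import dataclass


@dataclass
class SurrogateMetrics:
    name: str
    rtt_ms: float    # proximity metric: round-trip time to the client
    cpu_load: float  # surrogate feedback metric, in the range 0..1


def select_surrogate(candidates, rtt_weight=0.5, load_weight=0.5):
    """Return the candidate with the lowest weighted score.

    RTT is normalized by the largest RTT among the candidates so that
    both terms fall in the range 0..1. Raises ValueError if the
    candidate list is empty.
    """
    max_rtt = max(c.rtt_ms for c in candidates) or 1.0

    def score(c):
        return rtt_weight * (c.rtt_ms / max_rtt) + load_weight * c.cpu_load

    return min(candidates, key=score)
```

    Setting one of the weights to zero reduces the sketch to pure
    proximity-based or pure load-based selection.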

551	6.1 Proximity Measurements

553	    Some CDN Request Mapping Systems make use of "proximity"
554	    measurements to direct users to the "closest" surrogate.  If a DNS
555	    system is used for request mapping, then these measurements are made
556	    to the client's local DNS server; this heuristic is not always
557	    accurate.  In a client-side direction model, the IP address of the
558	    client is directly exposed and therefore more accurate proximity
559	    measurements can be obtained.

561	    Proximity measurements are taken between the CDN surrogate set and
562	    the requesting entity.  In many cases, proximity measurements are
563	    "one-way" in that they measure only the forward or only the reverse
564	    path between the surrogate and the requesting entity.  This is
565	    important as many paths in the Internet are asymmetric.

567	    In order to obtain a set of proximity measurements, a CDN may employ
568	    active probing techniques and/or passive measurement techniques.
569	    The following sections describe these two techniques.

571	6.1.1 Probing

573	    In order to obtain a set of proximity measurements, a CDN may employ
574	    an active probing technique.  In active probing, past or potential
575	    requesting entities are probed from each surrogate (or set of
576	    surrogates), using one or more techniques, to determine one or more
577	    metrics.  An example of a probing technique is an ICMP ECHO Request
578	    periodically sent from each surrogate (or set) to a potential
579	    requesting entity.

581	    The problems with an active probing approach are:

583	       1.  Measurements can only be taken periodically.

585	       2.  Firewalls and NATs often block probes.

587	       3.  Probes often cause security alarms to be triggered on
588	           intrusion detection systems.

590	    In any active probing approach, a list of potential requesting
591	    entities needs to be obtained.  This list can be generated
592	    dynamically: as requests arrive, the requesting entity addresses can
593	    be cached for later probing.  Another potential solution is to use
594	    an algorithm to divide address space into blocks and to probe those
595	    blocks.
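
    Both ways of building the probe list can be sketched with the
    Python standard library ipaddress module. The class and function
    names, the default block size of /24, and the choice of the first
    host address of each block as the probe target are assumptions for
    this sketch. The sketch only builds the list; it sends no probes.

```python
import ipaddress


def block_of(address, block_prefixlen=24):
    """Return the address block that contains the given address."""
    return ipaddress.ip_network(f"{address}/{block_prefixlen}",
                                strict=False)


class ProbeList:
    """Dynamic list: remember one requesting address per block."""

    def __init__(self, block_prefixlen=24):
        self.block_prefixlen = block_prefixlen
        self._targets = {}  # block -> first requester seen in it

    def record_request(self, address):
        block = block_of(address, self.block_prefixlen)
        self._targets.setdefault(block, address)

    def targets(self):
        return list(self._targets.values())


def static_targets(prefix, block_prefixlen=24):
    """Static list: divide a prefix into blocks, one target per block.

    Assumes block_prefixlen is not shorter than the prefix itself and
    leaves room for at least one host address per block.
    """
    network = ipaddress.ip_network(prefix)
    return [str(block.network_address + 1)
            for block in network.subnets(new_prefix=block_prefixlen)]
```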

597	6.1.2 Passive Measurement

599	    The other measurement technique makes use of passive measurements
600	    which are obtained when a client actually transfers data to/from a
601	    surrogate.  In this technique, a bootstrap mechanism is used to
602	    direct the client to a bootstrap surrogate.  Once the client
603	    connects, the actual performance of the transfer is measured.  This
604	    data is then fed back into the request mapping system.

606	    An example of passive measurement is to watch the packet loss from a
607	    client to a surrogate by observing TCP behavior.  Latency
608	    measurements can also be learned by observing TCP behavior (as TCP's
609	    congestion control is partly based on RTT).

611	    The problems with a passive measurement approach are mostly related
612	    to the bootstrapping mechanism.  A good mechanism is needed so that
613	    every surrogate doesn't need to be "tested" per client to obtain the
614	    measurement information.
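
    A minimal sketch of how passively observed round-trip times could
    be fed back into the request mapping system. It keeps an
    exponentially weighted moving average per client block and
    surrogate, in the style of TCP's smoothed RTT estimate. The class
    name, the keying by client block, and the smoothing factor are
    assumptions for illustration. A client block with no observations
    yields None, which is where a bootstrap mechanism would be needed.

```python
class PassiveRttTable:
    """Smoothed RTT per (client block, surrogate) pair."""

    def __init__(self, alpha=0.125):
        self.alpha = alpha  # weight given to each new sample
        self._srtt = {}

    def observe(self, client_block, surrogate, rtt_ms):
        """Fold one measured RTT sample into the running estimate."""
        key = (client_block, surrogate)
        previous = self._srtt.get(key)
        if previous is None:
            self._srtt[key] = rtt_ms
        else:
            self._srtt[key] = ((1 - self.alpha) * previous
                               + self.alpha * rtt_ms)
        return self._srtt[key]

    def best_surrogate(self, client_block):
        """Return the surrogate with the lowest estimate, or None."""
        known = {s: v for (b, s), v in self._srtt.items()
                 if b == client_block}
        if not known:
            return None
        return min(known, key=known.get)
```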

616	6.1.3 Metric Types

618	    The following list gives some of the metrics which can be used
619	    for proximity calculations.  This list is not meant to be
620	    exhaustive.

622	       *  Latency: Network latency measurements are used to
623	          determine the surrogate (or set of surrogates) that has the
624	          least delay to the requesting entity.  These measurements can
625	          be obtained using either an active probing approach or a
626	          passive network measurement system.

628	       *  Packet Loss: Packet loss measurements can be used as a
629	          selection metric.  A passive measurement approach can easily
630	          obtain packet loss information from TCP header information.
631	          Active probing can periodically measure packet loss from
632	          probes.

634	       *  Hop Counts: Router hops from the surrogate to the requesting
635	          entity can be used as a proximity measurement.

637	       *  BGP Information: BGP AS PATH and MED attributes can be used to
638	          determine the "BGP distance" to a given prefix/length pair.
639	          In order to use BGP information for proximity measurements, it
640	          must be obtained at each surrogate site/location.
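
    As an example of the last metric, a "BGP distance" could be
    approximated by the AS PATH length of the most specific route that
    covers the client address, as sketched below. The route table
    format and the function name are assumptions; MED is ignored in
    this sketch, and the AS numbers in any usage are placeholders.

```python
import ipaddress


def bgp_distance(routes, client_ip):
    """Return the AS PATH length of the longest matching prefix.

    routes is a list of (prefix, as_path) pairs as learned at one
    surrogate site, where as_path is a list of AS numbers. Returns
    None if no route covers the client address.
    """
    address = ipaddress.ip_address(client_ip)
    best = None
    for prefix, as_path in routes:
        network = ipaddress.ip_network(prefix)
        if address in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, as_path)
    return None if best is None else len(best[1])
```

    Computing this value at each surrogate site and choosing the site
    with the smallest result gives a proximity ranking that needs no
    probing of the client.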

642	6.2 Surrogate Feedback

644	    Some CDN request mapping mechanisms make use of surrogate feedback
645	    information in order to select a "least-loaded" surrogate.  Feedback
646	    can be delivered from each surrogate or can be aggregated by site or
647	    by location.  This feedback information is fed into the Request
648	    Mapping System.  CDNs often make use of both proximity and surrogate
649	    feedback to make decisions.

651	    Examples of surrogate feedback metrics include: CPU load, interface
652	    load, interface dropped packets, number of connections, etc.
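
    A minimal sketch of aggregating feedback by site and selecting the
    "least-loaded" site. The report format, the use of CPU load as the
    only metric, and averaging as the aggregation function are
    assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_site(reports):
    """Average the reported load of all surrogates at each site.

    reports is an iterable of (site, surrogate, cpu_load) tuples.
    """
    loads = defaultdict(list)
    for site, _surrogate, cpu_load in reports:
        loads[site].append(cpu_load)
    return {site: mean(values) for site, values in loads.items()}


def least_loaded_site(reports):
    """Return the site with the lowest aggregated load."""
    site_load = aggregate_by_site(reports)
    return min(site_load, key=site_load.get)
```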

654	6.2.1 Probing

656	    Feedback information may be obtained by periodically probing a
657	    surrogate, for example by issuing an HTTP request and observing the
658	    behavior.  The problems with probing for surrogate information are:

660	       1.  It is difficult to obtain "real-time" information.

662	       2.  Non-real-time information may be inaccurate.
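
    The staleness problem can be made explicit by recording when each
    probe was taken and refusing to use results older than a chosen
    limit. In the sketch below, the fetch callable stands in for the
    actual HTTP request and is supplied by the caller; it, the result
    format, and the age limit are assumptions for illustration.

```python
import time


def probe(surrogate, fetch, clock=time.monotonic):
    """Probe one surrogate and timestamp the result.

    fetch(surrogate) is a caller-supplied callable that performs the
    request and returns True on success.
    """
    started = clock()
    ok = fetch(surrogate)
    return {
        "surrogate": surrogate,
        "ok": ok,
        "response_s": clock() - started,
        "taken_at": started,
    }


def is_fresh(result, max_age_s, clock=time.monotonic):
    """Report whether a probe result is still recent enough to use."""
    return clock() - result["taken_at"] <= max_age_s
```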

664	6.2.2 Monitoring

666	    Feedback information may also be obtained by agents that reside on
667	    surrogates.  These agents can communicate a variety of metrics about
668	    the surrogates.

670	6.2.3 Metrics

672	    The following briefly summarizes several well-known metrics that
673	    are used for surrogate feedback:

675	       *  Surrogate CPU Load.

677	       *  Interface Load / Dropped packets.

679	       *  Number of connections being served.

681	       *  Storage I/O Load.

683	7. Security Considerations

685	    This is a preliminary draft for discussion purposes only, submitted
686	    prior to the formation of the working group.  As such, security
687	    considerations have been mostly deferred until after the working
688	    group is constituted. [This document is not expected to be a formal
689	    submission of the working group in its current form.] This document
690	    in particular is a summary of mechanisms documented elsewhere.
691	    Please consult the referenced documents for any mechanism-specific
692	    security considerations.

694	8. Acknowledgements

696	    [Reviewers go here]

698	References

700	    [1]  Postel, J., "File Transfer Protocol", RFC 765, June 1980,
701	         <URL:http://www.rfc-editor.org/rfc/rfc765.txt>.

703	    [2]  Dierks, T. and C. Allen, "The TLS Protocol Version 1", RFC
704	         2246, January 1999,
705	         <URL:http://www.rfc-editor.org/rfc/rfc2246.txt>.

707	    [3]  Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming
708	         Protocol", RFC 2326, April 1998,
709	         <URL:http://www.rfc-editor.org/rfc/rfc2326.txt>.

711	    [4]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L.,
712	         Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol --
713	         HTTP/1.1", RFC 2616, June 1999,
714	         <URL:http://www.rfc-editor.org/rfc/rfc2616.txt>.

716	    [5]  Day, M., Cain, B. and G. Tomlinson, "A Model for CDN Peering",
717	         draft-day-cdnp-model-02.txt (work in progress), October 2000,
718	         <URL:http://www.ietf.org/internet-drafts/draft-day-cdnp-model-02
719	         .txt>.

721	Authors' Addresses

723	    Brad Cain
724	    Mirror Image Internet
725	    49 Dragon Court
726	    Woburn, MA  01801
727	    US

729	    Phone: +1 781 276 1904
730	    EMail: brad.cain@mirror-image.com

732	    Fred Douglis
733	    AT&T Labs
734	    Room B137
735	    180 Park Ave, Bldg 103
736	    Florham Park, NJ  07932
737	    US

739	    Phone: +1 973 360 8775
740	    EMail: douglis@research.att.com

741	    Mark Green
742	    Entera, Inc.
743	    40971 Encyclopedia Circle
744	    Fremont, CA  94538
745	    US

747	    Phone: +1 510 770 5268
748	    EMail: markg@entera.com

750	    Markus Hofmann
751	    Lucent Technologies
752	    Room 4F-513
753	    101 Crawfords Corner Rd.
754	    Holmdel, NJ  07733
755	    US

757	    Phone: +1 732 332 5983
758	    EMail: hofmann@bell-labs.com

760	    Raj Nair
761	    Cisco Systems
762	    50 Nagog Park
763	    Acton, MA  01720
764	    US

766	    Phone: +1 978 206 3029
767	    EMail: rnair@cisco.com

769	    Doug Potter
770	    Cisco Systems
771	    50 Nagog Park
772	    Acton, MA  01720
773	    US

775	    Phone: +1 978 206 ????
776	    EMail: dougpott@cisco.com

777	    Oliver Spatscheck
778	    AT&T Labs
779	    Room B131
780	    180 Park Ave, Bldg 103
781	    Florham Park, NJ  07932
782	    US

784	    Phone: +1 973 360 ????
785	    EMail: spatsch@research.att.com

787	Full Copyright Statement

789	    Copyright (C) The Internet Society (2000). All Rights Reserved.

791	    This document and translations of it may be copied and furnished to
792	    others, and derivative works that comment on or otherwise explain it
793	    or assist in its implementation may be prepared, copied, published
794	    and distributed, in whole or in part, without restriction of any
795	    kind, provided that the above copyright notice and this paragraph
796	    are included on all such copies and derivative works. However, this
797	    document itself may not be modified in any way, such as by removing
798	    the copyright notice or references to the Internet Society or other
799	    Internet organizations, except as needed for the purpose of
800	    developing Internet standards in which case the procedures for
801	    copyrights defined in the Internet Standards process must be
802	    followed, or as required to translate it into languages other than
803	    English.

805	    The limited permissions granted above are perpetual and will not be
806	    revoked by the Internet Society or its successors or assigns.

808	    This document and the information contained herein is provided on an
809	    "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
810	    TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
811	    BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
812	    HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
813	    MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

815	Acknowledgement

817	    Funding for the RFC editor function is currently provided by the
818	    Internet Society.