idnits 2.17.1 draft-tomlinson-epsfw-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Introduction section. (A line matching the expected section header was found, but with an unexpected indentation: ' 1. Introduction' ) ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** There are 15 instances of too long lines in the document, the longest one being 15 characters in excess of 72. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 237: '...servers. A proxy MUST implement both t...' RFC 2119 keyword, line 539: '...pported protocol MUST have a correspon...' RFC 2119 keyword, line 540: '...parser. A parser MAY contain subordina...' RFC 2119 keyword, line 545: '... for message parsers SHOULD be defined...' RFC 2119 keyword, line 551: '...nd the rule base MUST be elaborated by...' (67 more instances...) 
Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Line 350 has weird spacing: '... proxy must ...' == Line 1238 has weird spacing: '... of the cachi...' == Line 1682 has weird spacing: '...ses) or execu...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (July 13, 2000) is 8681 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? '16' on line 1563 looks like a reference -- Missing reference section? '7' on line 1509 looks like a reference -- Missing reference section? '10' on line 1530 looks like a reference -- Missing reference section? '9' on line 1523 looks like a reference -- Missing reference section? '1' on line 1482 looks like a reference -- Missing reference section? '5' on line 1501 looks like a reference -- Missing reference section? '8' on line 1516 looks like a reference -- Missing reference section? '12' on line 1539 looks like a reference -- Missing reference section? '13' on line 1547 looks like a reference -- Missing reference section? '14' on line 1555 looks like a reference -- Missing reference section? '15' on line 1559 looks like a reference -- Missing reference section? 
'4' on line 1496 looks like a reference -- Missing reference section? '11' on line 1534 looks like a reference -- Missing reference section? '17' on line 1571 looks like a reference -- Missing reference section? '18' on line 1577 looks like a reference -- Missing reference section? '3' on line 1490 looks like a reference -- Missing reference section? '2' on line 1486 looks like a reference -- Missing reference section? '6' on line 1507 looks like a reference -- Missing reference section? '19' on line 1581 looks like a reference Summary: 6 errors (**), 0 flaws (~~), 5 warnings (==), 21 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group G. Tomlinson 3 Internet-Draft H. Orman 4 Expires: January 11, 2001 Novell 5 M. Condry 6 J. Kempf 7 Sun Microsystems 8 D. Farber 9 Digital Island 10 July 13, 2000 12 Extensible Proxy Services Framework 13 draft-tomlinson-epsfw-00.txt 15 Status of this Memo 17 This document is an Internet-Draft and is in full conformance 18 with all provisions of Section 10 of RFC2026. 20 Internet-Drafts are working documents of the Internet 21 Engineering Task Force (IETF), its areas, and its working 22 groups. Note that other groups may also distribute working 23 documents as Internet-Drafts. 25 Internet-Drafts are draft documents valid for a maximum of six 26 months and may be updated, replaced, or obsoleted by other 27 documents at any time. It is inappropriate to use 28 Internet-Drafts as reference material or to cite them other 29 than as "work in progress." 31 The list of current Internet-Drafts can be accessed at 32 http://www.ietf.org/ietf/1id-abstracts.txt 34 The list of Internet-Draft Shadow Directories can be accessed at 35 http://www.ietf.org/shadow.html. 37 This Internet-Draft will expire on January 11, 2001. 39 Copyright Notice 41 Copyright (C) The Internet Society (2000). 
All Rights Reserved. 43 Abstract 45 In today's Internet, caching proxies that intermediate between 46 HTTP (and increasingly streaming media) clients and servers 47 provide enhanced performance for Web page access. Both clients 48 and servers are increasingly looking to the network for 49 additional services that can't be provided directly on the 50 client or server, and Web proxies are an attractive place for 51 locating these services. In fact, some such services (content 52 assembly for advertising) are already being offered by Web 53 proxies for servers, but in a nonstandard way. This document 54 describes the problem and solution requirements for a 55 standardized, open and extensible service environment on 56 caching proxies which enables them to provide general services 57 that mediate, modify, and monitor object requests and 58 responses. It introduces an architectural framework along with 59 a set of core requirements necessary to design standardized 60 implementations for this application domain, taking into 61 account relevant IETF RFCs and IETF work-in-progress. The 62 architecture and requirements described here are mindful of 63 the success of end-to-end nature of Internet client/server 64 interactions, and the consequences of the proposed 65 architectural changes on end-to-end semantics are discussed. 67 Table of Contents 69 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 4 70 1.1 Requirement Language . . . . . . . . . . . . . . . . . . . 4 71 1.2 Relationship to other IETF Work . . . . . . . . . . . . . 4 72 1.3 Relationship to known work outside IETF . . . . . . . . . 5 73 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 74 2.1 Discussion . . . . . . . . . . . . . . . . . . . . . . . . 9 75 3. Problem Description and Goals . . . . . . . . . . . . . . 10 76 4. Architecture . . . . . . . . . . . . . . . . . . . . . . . 11 77 5. Service Execution Environment Requirements . . . . . . . . 14 78 5.1 Rule Base Requirements . . 
. . . . . . . . . . . . . . . . 14 79 5.1.1 Message Parser . . . . . . . . . . . . . . . . . . . . . . 14 80 5.1.2 Message Property . . . . . . . . . . . . . . . . . . . . . 15 81 5.1.3 Rule Base . . . . . . . . . . . . . . . . . . . . . . . . 15 82 5.1.4 Rule Processor . . . . . . . . . . . . . . . . . . . . . . 16 83 5.2 Execution Environment Requirements . . . . . . . . . . . . 16 84 5.2.1 Service Management . . . . . . . . . . . . . . . . . . . . 16 85 5.2.2 Resources and functions of the caching proxy . . . . . . . 17 86 5.2.2.1 Resources . . . . . . . . . . . . . . . . . . . . . . . . 17 87 5.2.2.2 Functions . . . . . . . . . . . . . . . . . . . . . . . . 17 88 5.3 Proxylet Requirements . . . . . . . . . . . . . . . . . . 18 89 5.4 Remote Callout Requirements . . . . . . . . . . . . . . . 19 90 5.5 Network Access Requirements . . . . . . . . . . . . . . . 19 91 6. Proxy Discovery . . . . . . . . . . . . . . . . . . . . . 21 92 7. Security Requirements . . . . . . . . . . . . . . . . . . 23 93 7.1 Authorization, Authentication, and Accounting 94 Requirements . . . . . . . . . . . . . . . . . . . . . . . 23 95 7.1.1 AAA in the Existing Web System Model . . . . . . . . . . . 24 96 7.1.2 Service Environment Caching Proxy AAA Requirements . . . . 25 97 7.1.3 Remote Callout Server AAA Requirements . . . . . . . . . . 27 98 7.1.4 Administrative Server AAA Requirements . . . . . . . . . . 29 99 7.2 Requirements on the Service Execution Environment . . . . 30 100 8. Impact on the Internet Architecture . . . . . . . . . . . 32 101 9. Intellectual Property . . . . . . . . . . . . . . . . . . 35 102 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . 36 103 References . . . . . . . . . . . . . . . . . . . . . . . . 37 104 Authors' Addresses . . . . . . . . . . . . . . . . . . . . 39 105 A. Examples . . . . . . . . . . . . . . . . . . . . . . . . . 41 106 A.1 Request Identification . . . . . . . . . . . . . . . . . . 41 107 A.2 Content Assembly . . . . . . . . 
. . . . . . . . . . . . . 41 108 A.3 Multimedia Stream Management . . . . . . . . . . . . . . . 42 109 A.4 Virus Detection . . . . . . . . . . . . . . . . . . . . . 42 110 A.5 Transcoding . . . . . . . . . . . . . . . . . . . . . . . 43 111 Full Copyright Statement . . . . . . . . . . . . . . . . . 44 113 1. Introduction 115 Caching proxies are important elements contributing to the 116 scalability and management of content services on the Internet 117 today. They are used to accelerate Web sites, to reduce 118 traffic over expensive transoceanic links, and to reduce 119 latency for groups of users in enterprises or at ISP sites. 120 Increasingly, caching proxies are being used for additional 121 services, for example, content assembly for advertising. Both 122 end users and Web publishers are looking to caching proxies as 123 a potential platform for deploying other services that can't 124 be deployed in clients or servers. 126 There are a variety of existing or proposed protocols that 127 implement particular services or network components for 128 implementing services. NECP [16] handles aspects of control of 129 proxy configuration and topology. ICAP [7] handles transport 130 of Web objects to content modification servers and back. The 131 number of such proposals is likely to grow over time. 132 Consequently, an architecture and core set of requirements to 133 guide standardization of proxy enhancements are desirable. 134 This document describes an architecture and requirements for 135 extending the functionality of a caching proxy to provide 136 general services that mediate, modify, and monitor object 137 requests and responses. The effect of these changes on the 138 end-to-end nature of client/server interactions in the 139 Internet is also discussed. 141 1.1 Requirement Language 143 In this document, the key words "MAY", "MUST", "MUST NOT", 144 "optional", "recommended", "SHOULD", and "SHOULD NOT", are to 145 be interpreted as described in RFC2119 [10]. 
 147 1.2 Relationship to other IETF Work 149 The authors have explicitly framed this framework with respect to the 150 documented taxonomy of web replication and caching [9]. 151 Readers are encouraged to become familiar with this taxonomy 152 as it provides a baseline context for this application domain. 154 The authors have also attempted to frame it with respect to 155 IETF standards (both existing and known work-in-progress) for 156 security [1][5]; accounting, authorization and authentication 157 [8][12]; policy [13]; and end-to-end model architectural 158 principles [14][15]. 160 1.3 Relationship to known work outside IETF 162 The framework has taken into account known work [7] being done 163 outside the IETF that is considered to be relevant to the 164 application domain. 166 2. Terminology 168 This section contains a list of terms from existing IETF 169 documents (identified by reference number), and new terms 170 specific to this document. 172 avatar 173 A caching proxy located at the network access point of the 174 user agent, delegated the authority to operate on behalf 175 of, and typically working in close co-operation with a 176 group of user agents. 178 cache [4] 179 A program's local store of response messages and the 180 subsystem that controls its message storage, retrieval, and 181 deletion. A cache stores cacheable responses in order to 182 reduce the response time and network bandwidth consumption 183 on future, equivalent requests. Any client or server may 184 include a cache, though a cache cannot be used by a server 185 that is acting as a tunnel. 187 caching proxy [9] 188 A proxy with a cache, acting as a server to clients, and a 189 client to servers. Caching proxies are often referred to as 190 "proxy caches" or simply "caches". The term "proxy" is also 191 frequently misused when referring to caching proxies. 193 content server 194 The server from which content is delivered. 
It may be 195 an origin server, replica server, surrogate or parent proxy. 197 inbound/outbound [4] 198 Inbound and outbound refer to the request and response 199 paths for messages: "inbound" means "traveling toward the 200 origin server", and "outbound" means "traveling toward the 201 user agent". 203 interception proxy (a.k.a. "transparent proxy", "transparent 204 cache") [9] 205 The term "transparent proxy" has been used within the 206 caching community to describe proxies used with zero 207 configuration within the user agent. Such use is somewhat 208 transparent to user agents. Due to discrepancies with [4] 209 (see definition of "proxy" below), and objections to the 210 use of the word "transparent", we introduce the term 211 "interception proxy" to describe proxies that receive 212 redirected traffic flows from network elements performing 213 traffic interception. 215 Interception proxies receive inbound traffic flows through 216 the process of traffic redirection. (Such proxies are 217 deployed by network administrators to facilitate or require 218 the use of appropriate services offered by the proxy). 219 Problems associated with the deployment of interception 220 proxies are described in the companion document "Known HTTP 221 Proxy/Caching Problems" [19]. The use of interception 222 proxies requires zero configuration of the user agent, which 223 acts as though communicating directly with an origin server. 225 non-transparent proxy 226 See "proxy". 228 origin server [4] 229 The server on which a given resource resides or is to be 230 created. 232 proxy [4] 233 An intermediary program which acts as both a server and a 234 client for the purpose of making requests on behalf of 235 other clients. Requests are serviced internally or by 236 passing them on, with possible translation, to other 237 servers. A proxy MUST implement both the client and server 238 requirements of this specification. 
A "transparent proxy" 239 is a proxy that does not modify the request or response 240 beyond what is required for proxy authentication and 241 identification. A "non-transparent proxy" is a proxy that 242 modifies the request or response in order to provide some 243 added service to the user agent, such as group annotation 244 services, media type transformation, protocol reduction, or 245 anonymity filtering. Except where either transparent or 246 non-transparent behavior is explicitly stated, the HTTP 247 proxy requirements apply to both types of proxies. 249 proxylet 250 Executable code modules that have a procedural interface to 251 the caching proxy's core services. Proxylets may be either 252 downloaded from content servers or user agents, or they may 253 be preinstalled on the caching proxy. 255 proxylet library 256 A language binding dependent API on the service environment 257 caching proxy platform with which proxylets link. This 258 provides a standardized and strictly controlled interface 259 to the service execution environment on the proxy. 261 remote callout server 262 A cooperating server which runs services as the result of 263 network protocol messaging interactions to/from a service 264 environment caching proxy. 266 rule module 267 A collection of message pattern descriptions and consequent 268 actions that are used to match incoming protocol messages 269 and process their contents if a match occurs. 271 service [11] 272 Work performed (or offered) by a server. This may mean 273 simply serving simple requests for data to be sent or 274 stored (as with file servers, gopher or http servers, 275 e-mail servers, finger servers, SQL servers, etc.); or it 276 may be more complex work, such as that of irc servers, 277 print servers, X Windows servers, or process servers. 279 Note: Unqualified use of the term "service" within this 280 memo is constrained to represent work performed on the 281 network traffic flowing through the caching proxy. 
283 service environment caching proxy 284 A caching proxy which has functionality beyond the basic 285 short-circuit request fulfillment, making it capable of 286 executing extensible (programmable) services, including 287 network transactions with other hosts for purposes of 288 modifying message traffic. 290 service execution environment 291 The environment on the caching proxy that allows new 292 services to be defined and executed. 294 surrogate [9] 295 A gateway co-located with an origin server, or at a 296 different point in the network, delegated the authority to 297 operate on behalf of, and typically working in close 298 co-operation with, one or more origin servers. Responses 299 are typically delivered from an internal cache. 301 Surrogates may derive cache entries from the origin server 302 or from another of the origin server's delegates. In some 303 cases a surrogate may tunnel such requests. 305 Where close co-operation between origin servers and 306 surrogates exists, this enables modifications of some 307 protocol requirements, including the Cache-Control 308 directives in [4]. Such modifications have yet to be fully 309 specified. 311 Devices commonly known as "reverse proxies" and "(origin) 312 server accelerators" are both more properly defined as 313 surrogates. 315 transparent proxy 316 See "proxy". 318 trigger 319 A rule that matches a network protocol message, causing a 320 proxylet to execute or other action to occur on the matched 321 message segment. 323 user agent [4] 324 The client which initiates a request. These are often 325 browsers, editors, spiders (web-traversing robots), or 326 other end user tools. 328 2.1 Discussion 330 The most common use of caching proxies is to short-circuit 331 HTTP requests by fulfilling them from a cache store. When the 332 caching proxy is dedicated to particular source server sites, 333 it is called a surrogate. When it is dedicated primarily to a 334 group of users it is called an avatar. 
Avatars that act as 335 caching proxies regardless of the user browser configuration 336 are also called interception proxies. It is possible for a 337 single instance of a caching proxy to fulfill all these roles. 338 It is also possible for a caching proxy to be a surrogate for 339 many sources or to be an avatar for users that belong to 340 different naming or trust domains. 342 Surrogates today have the most elaborated services 343 environment. Surrogate services are used to implement content 344 assembly for advertising. Such modifications have yet to be 345 fully specified, however. 347 3. Problem Description and Goals 349 The fundamental problem that a service environment caching 350 proxy must solve is how to perform modifications on HTTP and 351 other protocol messages flowing upstream/downstream in a way 352 that both maintains good network performance for request 353 fulfillment and allows flexible definition of new services. 354 The architecture provides the framework for solutions to this 355 problem. Since performance is a key determinant, the 356 architecture cannot simply be restricted to network protocols 357 that allow remote callout servers to perform these services 358 (such as [7]), because some simple services can be executed 359 more quickly on the service environment caching proxy itself. 360 Similarly, the architecture cannot simply be restricted to 361 proxylets executing on the proxy, since there are some 362 services (virus checking is an example) that could require 363 more or a different kind of execution resource than is 364 available on the proxy. The need for flexibility of service 365 definition suggests that the functionality provided by the 366 service execution environment on the proxy cannot be specified 367 in advance; rather, the need is for an extensible platform 368 that can be enhanced by downloading proxylets from user agents 369 or content servers. 
371 If a service environment caching proxy is to allow downloading 372 of rule modules and proxylets from user agents and content 373 servers, the proxy requires some mechanism whereby it can 374 authenticate that a host attempting an upload is authorized to 375 perform the upload, and to verify that the uploaded proxylet 376 is valid. The proxy must also be able to generate accounting 377 records, potentially across administrative domains, that allow 378 the owners of the proxy to collect for services rendered on 379 the proxy. Interactions with remote callout servers also 380 require authentication and accounting. While executing on the 381 proxy, proxylets from separately authorized parties must be 382 protected from interfering with each other. These constitute 383 the security requirements for the service environment caching 384 proxy. 386 4. Architecture 388 Figure 1 contains a diagram of the network elements involved 389 in the service environment caching proxy architecture. 391 +-----------------+ 392 | RC | 393 | | 394 | Remote Callout | 395 | Server | 396 | | 397 | | 398 +-----------------+ 399 A | 400 | | 401 | | 402 | V 403 +----------+ +-------------------+ +---------------+ 404 | UA |<------|4 P 3|<-----------| S | 405 | | | | | | 406 | User |------>|1 Caching Proxy 2|----------->| Content | 407 | Agent | | | | Server | 408 +----------+ |-------------------| +---------------+ 409 | Service | 410 | Execution | 411 | Environment | 412 +-------------------+ 413 A | 414 | | 415 | | 416 | V 417 +-----------------+ 418 | A | 419 | | 420 | Administration | 421 | Server | 422 | | 423 | | 424 +-----------------+ 426 Figure 1 System Architecture Components 428 There are 5 major network components: 430 1. The user agent (UA) represents any client: PC, 431 wireless, etc; the client makes requests for content 432 service with requests going through the proxy server 433 (P) for potential cache hits. 435 2. 
The caching proxy server (P) represents the proxy 436 server where the service execution environment is 437 implemented. Services in the Proxy Service Execution 438 Environment are defined with Proxylets and Rule Modules 439 together with general services provided by the Proxylet 440 Library. The four execution points (1-4) represent 441 locations in the round trip message flow where an event 442 trigger can occur, resulting in a proxylet processing 443 the message. These execution points are: 445 Point 1: Client Request 446 represents the client (UA) request to the caching 447 proxy (P) on the inbound flow. 449 Point 2: Proxy Request 450 represents the caching proxy (P) request to the 451 content server (S) on the inbound flow. 453 Point 3: Content Response 454 represents the response from the content server (S) 455 to the caching proxy (P) on the outbound flow. 457 Point 4: Proxy Response 458 represents the caching proxy (P) response to the 459 client (UA) on the outbound flow. 461 3. The Content Server (S) is the source of content for the 462 caching proxy. 464 4. The Remote Callout Server (RC) offers remote service 465 execution. The Proxy initiates processing on the Remote 466 Callout Server, and receives the results for potential 467 caching and transmission to/from the User Agent or 468 Content Server. 470 5. The Administrative Server (A) performs the downloading 471 of proxylets at a higher trust level, the collection of 472 accounting and log data, and other administrative tasks 473 on the service environment caching proxy. 475 Although only one proxy is shown in the diagram, serving as 476 both a surrogate and an avatar, a service environment caching 477 proxy can serve in either or both of these roles. In addition, 478 it is possible that proxies can be chained, so that a request 479 may traverse a number of proxies before reaching the end of 480 the inbound or outbound connection. 
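The four execution points listed above can be written down as a small data model. The following is only an illustrative sketch in Python: the names ExecutionPoint, Flow, and round_trip are hypothetical and do not come from the draft, which defines no programming interface for the execution points.

```python
from enum import Enum


class Flow(Enum):
    """Direction of travel, following the inbound/outbound terminology."""
    INBOUND = "toward the origin server"
    OUTBOUND = "toward the user agent"


class ExecutionPoint(Enum):
    """The four points of Figure 1 (hypothetical encoding).

    Each value is (point number, sender, receiver, flow), using the
    component labels UA, P, and S from the figure.
    """
    CLIENT_REQUEST = (1, "UA", "P", Flow.INBOUND)
    PROXY_REQUEST = (2, "P", "S", Flow.INBOUND)
    CONTENT_RESPONSE = (3, "S", "P", Flow.OUTBOUND)
    PROXY_RESPONSE = (4, "P", "UA", Flow.OUTBOUND)

    def __init__(self, number, sender, receiver, flow):
        self.number = number
        self.sender = sender
        self.receiver = receiver
        self.flow = flow


def round_trip():
    """Return the execution points in the order a full round trip visits them."""
    return sorted(ExecutionPoint, key=lambda point: point.number)


for point in round_trip():
    print(point.number, point.name, point.sender, "->", point.receiver, point.flow.name)
```

The sketch assumes a request that travels all the way to the content server; it does not model which points are visited when the proxy answers from its cache.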
482 In its unprogrammed state, the service environment caching 483 proxy accepts request messages and generates reply messages in 484 the manner of a traditional caching proxy. By programming the 485 service environment caching proxy, additional value added 486 services may be introduced. Programming occurs by introducing 487 proxylets and rule modules into the service environment 488 caching proxy. Such introduction can occur either directly, 489 through the administrative server, or through callouts in the 490 message stream that cause downloading of proxylets and rule 491 modules from the content server or user agent. 493 The new elements of the service environment caching proxy are 494 message parsers, a rule processor, and a set of actions. The 495 service environment caching proxy may support proxylets 496 authored in one or more programming languages (for example, 497 Java, PERL, etc.). A message parser appropriate to the 498 protocol of the message being examined processes the message 499 stream flowing between the user agent and the content server, 500 extracting message properties relevant to the rule base. The 501 message properties are then fed through the rule processor 502 with the appropriate rule module. A rule is matched if the 503 message properties match those specified in the rule. The 504 matching of a rule results in a trigger that causes an action 505 to occur. The action may be the execution of a proxylet, an 506 action built into the proxylet library, or a callout to a 507 remote callout server for help processing the message. Salient 508 components of the message are passed to the proxylet, built-in 509 action function, or, via a network protocol, to the remote 510 callout server. 512 Proxylets, built-in actions, or remote callout servers may 513 inspect, add, delete, and modify the properties of messages 514 identified by message parsers, within constraints defined by 515 the message parser. 
The results of processing are passed back 516 to the service execution environment for disposal. The actual 517 action taken by the service execution environment may be to 518 cache the result, or to throw it away and send an error 519 message to the user agent or content server. After action 520 execution is completed, the service execution environment 521 performs no other modification on the message. 523 5. Service Execution Environment Requirements 525 Combining the goals in Section 3 with the architecture 526 described in Section 4 results in a set of requirements on the 527 service execution environment. 529 5.1 Rule Base Requirements 531 5.1.1 Message Parser 533 A message parser is responsible for interpreting messages 534 received, isolating salient elements as message properties, 535 and causing actions to be activated when appropriate. When a 536 trigger occurs, an event context is established containing the 537 relevant properties isolated by the message parser. 539 Any supported protocol MUST have a corresponding message 540 parser. A parser MAY contain subordinate parsers that 541 correspond to subordinate protocols. For example, within an 542 HTTP message parser there may be separate parsers for handling 543 different MIME types such as HTML, XML, and XHTML. From the 544 standpoint of the rule processor, all parsers look like a 545 single engine. The API for message parsers SHOULD be defined 546 so that parsers for new protocols and content can be added 547 modularly. 549 For any protocol to be supported by the service environment 550 caching proxy, the interface between the protocol's message 551 parser and the rule base MUST be elaborated by defining: 553 1. The set of properties defined by the message parser, 554 including the property name, its relationship to the 555 message it characterizes, and the ability of an action 556 to modify it. 558 2. The points in Figure 1 at which rules are to be activated. 
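The two items above amount to a declaration that each message parser publishes to the rule base. One possible shape for such a declaration is sketched below in Python. The class names PropertySpec and ParserInterface, the choice of which properties are modifiable, and the choice of activation points are all assumptions made for illustration; the draft requires that these things be defined but does not define them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertySpec:
    """One message property exposed by a parser (hypothetical structure)."""
    name: str         # property name, e.g. "httpRequest.url"
    describes: str    # relationship to the message it characterizes
    modifiable: bool  # whether an action is allowed to modify it


@dataclass(frozen=True)
class ParserInterface:
    """What a parser declares to the rule base (hypothetical structure)."""
    protocol: str
    properties: tuple
    activation_points: frozenset  # execution points 1-4 of Figure 1

    def property(self, name):
        """Look up a declared property by name."""
        for spec in self.properties:
            if spec.name == name:
                return spec
        raise KeyError(name)

    def activates_at(self, point):
        """Report whether rules may be activated at the given point."""
        return point in self.activation_points


# Example declaration; the modifiable flags and points are invented values.
HTTP_REQUEST_PARSER = ParserInterface(
    protocol="HTTP",
    properties=(
        PropertySpec("httpRequest.url", "request-URI in the request line", True),
        PropertySpec("httpRequest.host", 'value of the "host:" header', False),
    ),
    activation_points=frozenset({1, 2}),
)

print(HTTP_REQUEST_PARSER.property("httpRequest.url").modifiable)
print(HTTP_REQUEST_PARSER.activates_at(4))
```

Keeping the declaration immutable is a design choice of the sketch: a rule author can rely on the published set of properties not changing while a rule module is loaded.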
560 A service environment caching proxy MUST provide a means by 561 which an external process can determine whether a given 562 message parser interface is supported. 564 A message parser SHOULD parse a message in a single pass. 565 Note: The authors recommend single pass parsers to meet the 566 performance requirements anticipated for general usage. 568 A message parser MUST always terminate when processing a 569 finite message, independent of message content. 571 5.1.2 Message Property 573 Each message property consists of a message property name and 574 a message property value. The value may be determined by the 575 message. 577 Some example message property names: 579 httpRequest.url 580 the value of the request-URI in the request line of an HTTP 581 request. 583 httpRequest.host 584 the value of the "host:" header in an HTTP request. 586 httpReply.html.embeddedURL 587 the value of an embedded URI within the text of an HTTP 588 document. Other message properties may be defined to 589 provide the context in which the embedded URI occurs. 591 rtspRequest.url 592 the value of the request-URI in the request line of an RTSP 593 request. 595 5.1.3 Rule Base 597 A rule base consists of a collection of rule modules. Each 598 rule module consists of a collection of rules. Each rule 599 consists of a pattern and an action. A pattern is an 600 expression that can be evaluated with respect to the message 601 properties in an event context, and either the rules will 602 match or fail to match the properties in the context. Actions 603 identify proxylets, built-in proxy library functions or remote 604 callouts that can modify the unmediated operation of the 605 service environment caching proxy. 607 Every rule module has an owner, identified by the source of 608 the module. The trust relationship between the proxy and the 609 owner MAY be at varying levels. Some rule modules MAY be 610 allowed to perform actions that others aren't, for 611 administrative or other purposes. 
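The rule base structure described above (rule modules that have an owner and contain rules, each rule pairing a pattern with an action) can be sketched as follows. This is a minimal illustration in Python under stated assumptions: patterns are modeled as Python callables over the event context, whereas the draft leaves the pattern syntax unspecified, and the owners, action names, and property values are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# An event context maps message property names to values (assumed shape).
Context = Dict[str, str]


@dataclass
class Rule:
    pattern: Callable[[Context], bool]  # evaluates against the event context
    action: str  # name of a proxylet, built-in library function, or remote callout


@dataclass
class RuleModule:
    owner: str  # the source of the module
    rules: List[Rule] = field(default_factory=list)


def matching_actions(rule_base: List[RuleModule], context: Context) -> List[Tuple[str, str]]:
    """Return (owner, action) for every rule whose pattern matches the context."""
    return [
        (module.owner, rule.action)
        for module in rule_base
        for rule in module.rules
        if rule.pattern(context)
    ]


# Hypothetical rule base with two owners.
rule_base = [
    RuleModule("example.com", [
        Rule(lambda c: c.get("httpRequest.host") == "example.com", "insert-ad"),
    ]),
    RuleModule("admin", [
        Rule(lambda c: c.get("httpRequest.url", "").endswith(".exe"), "virus-scan"),
    ]),
]

ctx = {"httpRequest.host": "example.com", "httpRequest.url": "/index.html"}
print(matching_actions(rule_base, ctx))
```

Returning the owner alongside each action leaves room for the proxy to apply different trust levels per owner before anything is executed.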
 613 The format of a rule, its constituent patterns and actions 614 MUST be specified. 616 The format of a rule module SHOULD be specified as an XML DTD. 618 The syntax of a pattern MUST include a way to identify and 619 match properties specified by message parsers. 621 The syntax of an action MUST be the name of a proxylet, 622 built-in proxy library function, or remote callout. 624 There MUST be a way to associate a rule module with a single 625 owner. 627 There SHOULD be a way to associate a rule module with the 628 location and version of the proxylet library (or libraries) to 629 which it refers. 631 A service environment caching proxy MAY accept one or more 632 rule modules that are applicable to all requests. Such a rule 633 module may be required for administrative purposes. 635 A service environment caching proxy MUST define a method for 636 installing and updating rule modules. 638 The service environment caching proxy SHOULD define a method 639 to allow any external domain that it may proxy to dynamically 640 load or update a rule module. 642 5.1.4 Rule Processor 644 The rule processor is a form of dispatcher, invoked by a 645 message parser at an event, and responsible for activating 646 applicable rules in a rule base. The rule processor activates 647 a rule by determining whether the rule's pattern matches the 648 event properties in a given event context, and if so, invoking 649 the corresponding action. 651 5.2 Execution Environment Requirements 653 The service environment caching proxy MUST present an 654 interface for managing services and for making use of the 655 resources of the caching proxy. 657 5.2.1 Service Management 659 The administrator of a caching proxy MUST be able to specify 660 the policy for accepting new services and for determining 661 their rights and privileges. The service management interface 662 MUST support manual loading of rules and proxylets by a system 663 administrator. 
The interface MUST support the deletion of 664 rules and proxylets and the listing of all rules and proxylets 665 and their status. The interface MUST support policy regarding 666 extending the service environment. 668 Rules and proxylets MAY be loaded by a service. The proxylet 669 library MUST support such loading, but local policy 670 enforcement can deny the library request. 672 Rules and proxylets MAY be loaded implicitly by content 673 passing through the proxy. The proxylet library MUST support 674 such loading, but local policy enforcement can prevent it. 676 5.2.2 Resources and functions of the caching proxy 678 The execution environment must offer resources and basic 679 functions that rules and proxylets draw on, and it must be 680 able to limit use of those resources and the resources used by 681 the basic functions. 683 5.2.2.1 Resources 685 The fundamental resources of the caching proxy are: 687 1. Cached objects 689 2. Memory for representing the executable services 691 3. CPU cycles for executing the services 693 4. Bandwidth (incoming and outgoing) for network 694 communication 696 5. Secondary storage for cached information (state 697 information, etc.) 699 6. Persistent storage (for configuration information, etc.) 701 7. Network connections 703 8. Lists of object names and metadata 705 9. Users (lists of registered users, current users, etc.) 707 10. Policy relating to control of the caching proxy 708 resources 710 11. Logging and accounting information 712 The services environment MUST be able to limit the use of 713 these resources on a per AAA policy basis to a level specified 714 by the system administrator. 716 5.2.2.2 Functions 718 The basic classes of functions provided by the caching proxy 719 to the extensible services MUST include a minimal set of 720 operations. This set SHOULD include operations from the 721 following list: 723 1.
Operations on network messages 725 The operations will be specified in terms of the 726 semantics of the underlying language and MUST 727 include the ability to replace, modify, delete, or 728 insert information. 730 2. Operations on cached objects 732 3. Operations on object names 734 4. Creation and operations on network connections (either 735 client or server mode) 737 The operations MUST include the ability to determine 738 the response message for any incoming message. 740 5. Operations on user objects 742 6. Accounting and billing operations 744 7. User object accesses 746 8. Allocation and maintenance of local state information 748 5.3 Proxylet Requirements 750 Proxylets MUST be able to run on arbitrary caching proxies in 751 an environment that gives them access to standard semantics 752 and predictable results. It should not be necessary to write 753 different proxylets for different types of caching proxies - 754 the environment should provide a basic set of well-understood 755 functions and a clear way of determining the state of 756 additional functions. A proxylet itself may provide 757 additional functions to other proxylets. 759 The fundamental motivating use of proxylets is for acting as a 760 proxy for some subclass of network requests. This 761 functionality MUST be supported. Beyond that, proxylets are 762 simply programs that execute on caching proxies, and their 763 purpose may be unrelated to the basic caching or proxying 764 services; for example, a proxylet may provide naming or 765 identity mapping services for users. 767 The proxylet library for a given proxylet language MUST 768 include a way to inspect message properties. 770 The proxylet library for a given proxylet language MUST 771 include a way to modify message properties. 773 The proxylet library MUST provide a way for proxies to 774 determine the available service classes and their revision 775 levels. 777 The proxylet namespace MUST be standardized and global.
Local 778 handles MAY be implemented. 780 Service environment caching proxies: 782 o MUST provide a means by which an external server can 783 determine whether a given proxylet language and encoding 784 format is supported. 786 o MUST define a method for installing and updating a library 787 of supported proxylets. 789 o SHOULD define a method to allow any external domain that it 790 may proxy to dynamically load or update a proxylet library. 792 o SHOULD provide means to limit the use of resources by a 793 given proxylet, domain, or owner of the proxylet. 795 o SHOULD provide means by which an operator can monitor the 796 use of resources by proxylets associated with a given 797 domain or owner of the proxylet. 799 5.4 Remote Callout Requirements 801 Some extensible services will require additional systems for 802 extra CPU cycles or for protecting proprietary algorithms or 803 databases. Standard protocols for directing remote operations 804 MUST be part of the architecture. The protocols should be 805 able to handle self-contained or streaming data for both input 806 and output. 808 5.5 Network Access Requirements 810 If service environment proxies become common, it is likely 811 that a single protocol message may traverse multiple proxies 812 and undergo some degree of processing before reaching the 813 ultimate destination. In order to preserve scalability, it is 814 important that proxies beyond the next hop neighbor not be 815 visible along the direction of the inbound/outbound flow. Such 816 proxies may naturally be accessible through a separate 817 connection, but scalability would be compromised if the 818 results of processing on one proxy were dependent on those of 819 another proxy beyond the next hop. This leads to the 820 requirement: 822 o The service execution environment components MUST avoid 823 providing facilities that allow services to construct 824 dependencies on proxies other than the next hop, along the 825 inbound/outbound flow. 
The service execution environment 826 MAY provide facilities that allow a service to communicate 827 with a proxy that is not adjacent through a separate 828 out-of-band connection. Such facilities MUST treat the 829 connection like any other remote callout server. 831 6. Proxy Discovery 833 In the existing Web architecture, the process by which a 834 client or server discovers its proxy is undefined. Although 835 there have been attempts to standardize on how proxy discovery 836 is performed (see [17]), no standard has yet been put in 837 place. Proxy discovery is basically ad hoc. Clients typically 838 obtain their proxies from system administrators, servers by 839 static or dynamic configuration based on the business or 840 administrative relationships with the providers of proxy 841 services. 843 With the addition of service execution environment proxies, 844 proxy discovery potentially becomes more complex. There are a 845 number of issues involved: 847 1. In order to preserve the end-to-end nature of the 848 client/server connection, the provider of a service 849 environment proxy is encouraged to explicitly require the 850 client or server to "log on" to the proxy, through an 851 authentication process that may require human intervention, 852 particularly if the proxy is an avatar. If so, the current 853 interception proxy model will require an explicit service 854 discovery step, possibly through a URL supplied by a 855 person, but also potentially through some automated 856 service discovery mechanism. 858 2. With the addition of a service environment, proxies become 859 unequal in the services they provide. Whereas previously 860 proxies provided caching and nothing more, they now are 861 able to offer an array of services, some of which may be 862 interesting to clients and servers, others of which are 863 uninteresting. 865 These considerations lead to a number of requirements for 866 proxy discovery.
868 Automated proxy discovery within a domain with authentication 869 is provided by Service Location Protocol (SLP) [18]. SLP 870 allows a proxy discovery client to specify a query with 871 attributes describing the services provided by the proxy and a 872 service type describing the service. Responses to the proxy 873 discovery query contain the URLs of proxies that match the 874 attributes. If authentication is enabled, the response 875 contains a digital signature enabling the proxy discovery 876 client to verify trustworthiness of the URL source. Since SLP 877 was explicitly designed for attribute based service discovery, 878 the following requirement is suggested: 880 o Within administrative domains, SLP MAY be used for locating 881 service environment proxies by the services they provide. 882 If authentication of the proxy URL is required, SLP 883 authentication SHOULD be used. 885 Between administrative domains, there is currently no way to 886 discover services based on their characteristics. This 887 suggests the following requirement for inter-domain service 888 discovery. 890 o Between administrative domains, a mechanism for locating a 891 service environment proxy based on the services it provides 892 SHOULD be developed. Authentication of the proxy URLs 893 provided MUST be part of the design. 895 Finally, if discovery based on services offered is not an 896 issue, clients or servers can either be statically or 897 dynamically configured with proxy URLs, and, correspondingly, 898 proxies can be configured with their upstream and downstream 899 peers, if there are intermediate proxies. Any of a variety of 900 existing IP protocols can be used for dynamic configuration. 
901 Authentication on dynamically discovered proxies is, however, 902 essential: 904 o When discovery by service offered is not required, the 905 service discovery mechanism used MUST supply an 906 authentication mechanism whereby the discovering client can 907 verify the trustworthiness of the source supplying the 908 service environment caching proxy information. 910 7. Security Requirements 912 Security requirements for the service environment caching 913 proxy break into two broad areas: 915 1. Authentication, authorization, and accounting requirements 916 above and beyond those currently a part of HTTP, to ensure 917 that the trust relationships between the caching proxy and 918 its upstream and downstream peers, any remote callout 919 servers, and any administration servers are validated and 920 that the proper accounting records are generated so that 921 the owners of the caching proxy can bill for services 922 rendered. 924 2. Requirements on the service execution environment to allow 925 proxylets and rule sets from multiple different parties to 926 run without the possibility of unintentional or malicious 927 interference. 929 The next two subsections discuss these topics in more detail. 931 7.1 Authorization, Authentication, and Accounting Requirements 933 The authorization, authentication, and accounting (AAA) 934 requirements for the service environment caching proxy are 935 driven by a need to ensure authorization of a client, 936 publishing server or administrative server attempting to 937 inject proxylet functionality, to authenticate injected 938 proxylets, and to perform accounting on proxylet functions so 939 the client or publishing server can be billed for the 940 services. In addition, AAA is also required for a host willing 941 to act as a remote callout server. 943 Figure 2 contains a diagram of the trust relationships between 944 the different entities in the service environment caching 945 proxy architecture. 
947                T2
948      +-------------------+
949      |   T7         T5   |
950      |  +------RC ----+  |
951      | /       |       \ |
952      |/  T1  T4|   T3   \|
953      C ------- P ------- S
954              T6|
955                A
957 Figure 2 AAA Trust Relationships
958 These trust relationships govern the communication channels 959 between entities, not necessarily the objects upon which the 960 entities are allowed to operate. 962 7.1.1 AAA in the Existing Web System Model 964 In the traditional client/server Web model, only T2 965 (end-to-end) and T1/T3 (hop by hop) are present. 967 For T2, HTTP 1.1 [4] contains the WWW-Authenticate header for 968 a server to indicate to the client what authentication scheme 969 to use and the Authorization header for the client to present 970 credentials to a server. The client presents these credentials 971 if it receives a 401 (Unauthorized) response. In RFC 2617, 972 HTTP authentication mechanisms that do not involve clear text 973 transmittal of a password are detailed [5]. At the user level, 974 the mechanism used by the server to authorize and authenticate 975 a client is challenge/response (CHAP) with some kind of login 976 box, but there is no requirement for AAA in general. Access 977 control lists (ACLs) have been proposed as a way to fine tune 978 control [3], so the server could deny a client access to a 979 particular object. In addition, if the server uses TLS (SSL) 980 [1], the client is assured of privacy in its transactions and 981 can send a clear text password. 983 In the other direction, there is no support for a client to 984 authenticate a server. Since the client must discover the 985 server's URL somehow, authentication of the source of the URL 986 can provide some assurance that the URL is trusted. Typically, 987 a person obtains the URL through some non-computational means 988 and the client initiates the connection, so the client must 989 know through some non-computational means that the URL is 990 trusted.
Examples of where a client can obtain a URL are 991 through an email message from a friend or co-worker, from a 992 print or TV advertisement, or as a link from another Web page. 993 However, unless the client is running secure DNS [2], the 994 client can't determine whether the server's DNS entry has been 995 hijacked (and such cases have occurred). If TLS [1] is used, 996 then bi-directional authentication is possible. However, TLS 997 primarily performs encryption, which might be unnecessary for 998 a particular application, and, additionally, requires a 999 different URL scheme (HTTPS instead of HTTP). 1001 The addition of a proxy without a service environment (except 1002 perhaps for caching) changes the trust model to split T2 into 1003 T1 and T3 (although this does not mean that T2 is equivalent 1004 to T1 and T3). To the server, the proxy acts as a client, 1005 while to the client, it acts as the server. HTTP 1.1 contains 1006 a header, Proxy-Authenticate, that the proxy sends back to the 1007 client along with a 407 (Proxy Authentication Required) if the 1008 client must authenticate itself with the proxy. The client 1009 then sends back the Proxy-Authorization header with 1010 credentials. This addresses the T1 relationship in the client 1011 to proxy direction. The T3 relationship in the proxy to server 1012 direction is addressed by having the server respond with a 407 1013 (Proxy Authentication Required) and the Proxy-Authenticate 1014 header. Since Proxy-Authenticate is a hop-by-hop header, it 1015 can be used to authenticate the proxy to server connection 1016 just as it is used for the client to proxy connection. But 1017 there is still a lack of authorization and authentication in 1018 the proxy to client and server to proxy direction, just as for 1019 end-to-end security. For a proxy acting as an avatar, the 1020 client is likely to have obtained the URL from a system 1021 administrator or other trusted source. 
Similarly, for a proxy 1022 acting as a surrogate, the publishing server typically has a 1023 business relationship with the surrogate provider, and the 1024 surrogate's URL or address is obtained by the server through 1025 some undefined, but necessarily secure means, because the 1026 surrogate provider wants to charge the publisher and prohibit 1027 unauthorized discovery. 1029 7.1.2 Service Environment Caching Proxy AAA Requirements 1031 The lack of a mechanism whereby a client can authorize a proxy 1032 and a proxy can authorize a server means that the reverse 1033 directions of T1 and T3 are not addressed by HTTP/1.1. 1035 In the service environment caching proxy architecture, servers 1036 provide the caching proxy with computational objects (rule 1037 modules and proxylets) and therefore must be authorized to do 1038 so. This generates the first set of AAA requirements for the 1039 extensible proxy services architecture: 1041 o A mechanism MUST be provided whereby a service environment 1042 caching proxy acting as a surrogate can demand 1043 authentication information from a server and a server can 1044 respond with authentication information appropriate to the 1045 request, to authorize the server to provide computational 1046 objects. 1048 o A mechanism MUST be provided whereby a service environment 1049 caching proxy acting as a surrogate can authenticate 1050 individual proxylets and rule modules provided by an 1051 authorized server, if necessary. 1053 Note that authentication of individual objects may not 1054 necessarily require a protocol exchange between the 1055 proxy and the server; it may be achieved by language 1056 environment-specific mechanisms for performance reasons 1057 [6], though a protocol exchange may be desirable for 1058 generality.
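One way to authenticate an individual rule module or proxylet without a per-object protocol exchange can be sketched as follows. This is an illustration only, not a mechanism the draft specifies: it assumes that the proxy and the authorized server already share a secret key (the value shown is hypothetical), that the server ships a keyed tag alongside each object, and that HMAC-SHA256 is an acceptable algorithm. The function names module_tag and authenticate_module are likewise assumptions.

```python
import hashlib
import hmac


def module_tag(key: bytes, module_bytes: bytes) -> str:
    """Compute the keyed tag an authorized server would attach to an object."""
    return hmac.new(key, module_bytes, hashlib.sha256).hexdigest()


def authenticate_module(key: bytes, module_bytes: bytes, claimed_tag: str) -> bool:
    """Accept the object only if its tag matches; the comparison is constant-time."""
    expected = module_tag(key, module_bytes)
    return hmac.compare_digest(expected, claimed_tag)


# Hypothetical key, assumed to have been agreed when the server was authorized.
shared_key = b"hypothetical-key-agreed-during-server-authorization"
rule_module = b"<rulemodule owner='publisher.example'>...</rulemodule>"

tag = module_tag(shared_key, rule_module)
accepted = authenticate_module(shared_key, rule_module, tag)
rejected = authenticate_module(shared_key, rule_module + b"tampered", tag)
```

A shared-key tag only shows that the object came from some holder of the key. A deployment that needs to attribute an object to one specific party would more likely use public-key signatures, which this sketch does not attempt.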
1060 For T1, the existing HTTP Proxy-Authenticate mechanism allows 1061 the service environment caching proxy acting as an avatar to 1062 authorize the client, but there is no mechanism for 1063 authentication of individual proxylets and rule modules, 1064 generating the requirement: 1066 o A mechanism MUST be provided whereby a service environment 1067 caching proxy acting as an avatar can authenticate 1068 individual proxylets and rule modules provided by an 1069 authorized client, if necessary. 1071 The proxy to client direction of T1 requires authentication, 1072 even though none is supplied in standard HTTP/1.1. Because a 1073 client will be providing computational objects to an avatar, 1074 it is essential that the client knows it can trust a service 1075 environment caching proxy acting as an avatar; otherwise, the 1076 computational objects may be provided to an unauthorized or 1077 hostile proxy, much to the client's detriment. This generates 1078 a requirement on the proxy to client direction of T1: 1080 o A mechanism MUST be provided whereby a client can 1081 authenticate a service environment caching proxy offering 1082 to act as an avatar. 1084 While the discussion above assumes that existing HTTP 1085 authentication can be used to authorize T1 in the client to 1086 proxy direction and T3 in the proxy to server direction, it 1087 may be useful to supplement these methods with additional 1088 authentication procedures that are uniform with new procedures 1089 introduced in the opposite direction, or provide the new 1090 procedures so that they are compatible with the old: 1092 o New authentication mechanisms for relationships T1 in the 1093 client to proxy direction and T3 in the proxy to server 1094 direction SHOULD be uniform with mechanisms in the opposite 1095 direction, either by implementing the new mechanisms in a 1096 manner similar to the old or by supplementing the old 1097 mechanisms with new. 
1099 This ensures a compatible, easier to use framework for 1100 authentication in both directions on the T1 and T3 1101 relationships. 1103 Finally, services run on the service environment caching proxy 1104 need to be paid for. This generates the requirement: 1106 o The service environment caching proxy server MUST be able 1107 to deliver secure, nonrepudiable accounting information to 1108 a billing entity. 1110 7.1.3 Remote Callout Server AAA Requirements 1112 In addition to the injection of proxylet functionality on the 1113 caching proxy, the caching proxy can also make use of a remote 1114 callout engine to modify particular objects. This 1115 architectural piece gives rise to the trust relationship T4, 1116 between the caching proxy and the remote callout engine, T5, 1117 between the remote callout engine and the server, and T7, 1118 between the client and the remote callout engine. 1120 Existing remote callout protocols leverage HTTP 1121 authentication for the remote callout server. The ICAP 1122 specification [7] explicitly states that an ICAP server acts 1123 as a proxy for purposes of authentication so a proxy client 1124 can send any Proxy-Authenticate and Proxy-Authorization 1125 headers, although other hop-by-hop headers are not forwarded. 1126 However, this has little use for purposes of authenticating 1127 trust relationships T7 and T5. The remote callout server may 1128 require that the client or publishing server authenticate 1129 separately from the proxy, if the remote callout server is 1130 owned and administered by a separate entity from the proxy. In 1131 addition, a message from the caching proxy to a server that 1132 generates a 407 (Proxy Authentication Required) may or may not 1133 have been processed by the ICAP server, but in any event, the 1134 server won't know that the message was so processed. The 1135 server responds to the sender of the message, namely the 1136 caching proxy.
The caching proxy must respond with its 1137 credentials; the ICAP server is essentially invisible as far 1138 as the server is concerned. 1140 Trust relationships T7 and T5 could derive transitively from 1141 T1/T4 and T3/T4. In that case, authorization granted by/to the 1142 caching proxy is considered to be authorization granted by/to 1143 the remote callout server. If the remote callout server is in 1144 the same administrative domain as the caching proxy, as is 1145 assumed in the ICAP specification [7], this is likely to be 1146 the case. However, in the general case, where the remote 1147 callout server resides outside the domain of the service 1148 environment caching proxy, authorization by/of the caching 1149 proxy server is insufficient. 1151 This generates the requirement: 1153 o A mechanism MUST be provided whereby, when the remote 1154 callout server is outside the administrative domain of the 1155 caching proxy, the remote callout server can directly 1156 authenticate with the publishing server and/or with the 1157 client, and the client or publishing server can directly 1158 authorize a remote callout server independent of the proxy. 1160 This requirement, if imposed on the HTTP stream between the 1161 client and server, would remove the invisibility of the remote 1162 callout server. However, this requirement could be met by an 1163 out-of-band authentication procedure, for example, using 1164 Diameter [8], in which case the remote callout server would 1165 remain invisible during HTTP transactions. ACLs could be 1166 established on the server allowing or denying access to the 1167 particular data objects for the remote callout server, at the 1168 expense of making the remote callout server visible to HTTP 1169 streams. Note that there is no need to authenticate 1170 computational objects because the remote callout server, by 1171 definition, does not receive computational objects from the 1172 client and/or publishing server.
1174 The trust relationship T4 is on the remote callout to proxy 1175 connection. If the remote callout server is in a separate 1176 domain, authentication is required between the remote callout 1177 server and the caching proxy. Again, proxy authentication can 1178 be used in the remote callout to proxy direction, but there is 1179 no way for the caching proxy to authenticate the remote 1180 callout server. This leads to the requirement: 1182 o When the remote callout server is outside the 1183 administrative domain of the caching proxy, some means of 1184 authenticating the remote callout server with the caching 1185 proxy is required. 1187 We also require uniform mechanisms on both the forward and 1188 reverse directions of T4, and T7 and T5 as well: 1190 o The new authentication mechanism for the relationship T4 in 1191 the proxy to remote callout direction SHOULD be uniform 1192 with the mechanism in the opposite direction, either by 1193 implementing the new mechanisms in a manner similar to the 1194 old or by supplementing the old mechanisms with new. 1196 o Authentication mechanisms for T7 and T5 MAY be uniform with 1197 other authentication mechanisms. 1199 The requirement on T7 and T5 is looser in order to avoid 1200 overly constraining the mechanisms for verifying the other 1201 trust relationships, in which backward compatibility 1202 considerations may play a large role. 1204 Finally, services run on the remote callout server need to be 1205 paid for. This generates the requirement: 1207 o The remote callout server MUST be able to deliver secure, 1208 nonrepudiable accounting information to a billing entity. 1210 Most likely, the billing entity will be the administrative 1211 server, but it may be another.
If the billing entity is the 1212 administrative server, and the remote callout server is 1213 outside the domain of the caching proxy, the method whereby 1214 the accounting information is delivered must be secure and 1215 allow nonrepudiation, so that the owners of the remote callout 1216 server can be assured of proper billing and payment. 1218 7.1.4 Administrative Server AAA Requirements 1220 The administrative server is responsible for injecting 1221 proxylets into the service environment caching proxy, and for 1222 collecting accounting information from the service environment 1223 caching proxy and, transitively, from the remote callout 1224 server. The proxylets injected by the administrative server 1225 may run at an additional level of trust from those introduced 1226 by clients and publishing servers, since they may be involved 1227 in collecting accounting information or in other sensitive 1228 tasks. 1230 From a practical standpoint, the administrative server is 1231 highly likely to be within the same administrative domain as 1232 the caching proxy, but as with the remote callout server, the 1233 case where it is not may also occur. This requires that trust 1234 relationship T6 be verified. Therefore, we have the following 1235 requirement: 1237 o A mechanism MUST be provided whereby, when the 1238 administrative server is outside the domain of the caching 1239 proxy, mutual authentication between the caching proxy and 1240 administrative server is possible. 1242 The administrative server also requires some means of 1243 obtaining accounting information from the caching proxy and 1244 remote callout server: 1246 o The administrative server MUST obtain accounting 1247 information that is secure and nonrepudiable from the 1248 caching proxy and remote callout server. 
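Part of what "secure" accounting information implies, namely that records cannot be silently altered after the fact, can be sketched with a hash-chained log. This is an illustrative assumption rather than anything the draft defines: the record fields, the names record_digest, append_record, and verify_log, and the choice of SHA-256 are all hypothetical. A hash chain alone gives tamper evidence only; nonrepudiation would additionally require the producer to sign the chain with a private key, which is outside this sketch.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumed starting value for the chain


def record_digest(prev_digest: str, record: Dict[str, object]) -> str:
    """Bind a record to everything before it by hashing it with the previous digest."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_digest.encode("ascii") + payload).hexdigest()


def append_record(log: List[dict], record: Dict[str, object]) -> None:
    prev = log[-1]["digest"] if log else GENESIS
    log.append({"record": record, "digest": record_digest(prev, record)})


def verify_log(log: List[dict]) -> bool:
    """Recompute the chain; any edited, removed, or reordered record breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["digest"] != record_digest(prev, entry["record"]):
            return False
        prev = entry["digest"]
    return True


# Hypothetical accounting records for one party's proxylet usage.
log: List[dict] = []
append_record(log, {"party": "publisher.example", "proxylet": "ad-insert", "cpu_ms": 12})
append_record(log, {"party": "publisher.example", "proxylet": "ad-insert", "cpu_ms": 7})

intact = verify_log(log)
log[0]["record"]["cpu_ms"] = 1          # simulate after-the-fact tampering
tampered_detected = not verify_log(log)
```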
1250 Finally, if the administrative server is allowed to inject 1251 proxylets at an additional trust level, an additional 1252 authentication mechanism may be required: 1254 o If the administrative server can inject proxylets at a 1255 higher trust level into the service environment proxy, a 1256 mechanism MUST be provided whereby the additional trust 1257 level can be verified (possibly with human involvement). 1259 7.2 Requirements on the Service Execution Environment 1261 Although only one client and server are illustrated connected 1262 to the caching proxy in Figure 1, the caching proxy may be 1263 offering services to multiple upstream and/or downstream 1264 peers, some of which may be from mutually exclusive 1265 administrative domains. In order to ensure that all parties 1266 involved in using a service environment caching proxy are 1267 protected, the execution environment must enforce exclusion 1268 between separately authenticated and authorized parties. 1269 Otherwise, unintentional or malicious interference could occur 1270 between rule bases and proxylets running on behalf of 1271 separately authorized parties. An analogous situation exists 1272 in operating systems on time share machines where multiple, 1273 separately authorized parties share a single global execution 1274 environment at the operating system level. 1276 This goal imposes several requirements on the service 1277 execution environment: 1279 o A service environment caching proxy MUST ensure that, for a 1280 rulebase module associated with a single authorized party, 1281 rules are only applied to protocol streams to/from that 1282 party. 1284 o A service environment caching proxy MUST ensure that a 1285 proxylet authorized to run on behalf of one party not be 1286 run on behalf of another party that is not authorized to 1287 run the proxylet. 
1289 o A service environment caching proxy MUST prevent any 1290 proxylet running on behalf of an authorized party from 1291 inspecting or modifying data or code from another, 1292 separately authorized party. 1294 As with any software system, proxylet libraries are likely to 1295 undergo evolution with time. Consistent processing results are 1296 likely only if a single version of the library is used to 1297 process a message. This generates a requirement on library 1298 versions: 1300 o A service environment caching proxy MUST ensure that only 1301 one version of a given proxylet library is used to process 1302 any single message. 1304 Finally, the service environment caching proxy and 1305 administrative server may install proxylets for performing 1306 various system services, like collection of accounting data. 1307 These system services may have access to certain API functions 1308 that are not accessible to general proxylets from other 1309 clients. This results in an additional requirement: 1311 o If a service environment caching proxy supports privileged 1312 access for authorized proxylets at a higher level of trust, 1313 the execution environment MUST exclude unprivileged 1314 proxylets from accessing privileged APIs. 1316 The inclusion of an API for privileged proxylets in the 1317 execution environment MAY generate a requirement for servers 1318 downloading privileged proxylets to perform additional 1319 authentication (possibly including human involvement), as 1320 discussed in the previous subsection. 1322 8. Impact on the Internet Architecture 1324 On the face of it, the architecture proposed in this paper for 1325 adding services to caching proxies in the Web looks like a 1326 major change in the end-to-end model that has been so 1327 successful in the Internet.
The best solution in terms of 1328 preserving the highly successful end-to-end model is to 1329 explicitly expose service environment caching proxies by 1330 requiring the client or server to discover and authenticate 1331 themselves with the proxy. Previous sections have discussed 1332 requirements that modify current Web discovery and 1333 authentication practices to support varying degrees of 1334 exposing service environment caching proxies. It is possible, 1335 however, that for business or technical reasons, the provider 1336 of proxy services may want to keep the proxy transparent to 1337 the client or server (via an interception proxy configuration) 1338 during standard day to day operations. The rest of this 1339 section discusses transparent service environment proxies in 1340 the context of the Internet architecture. 1342 The value of the end to end model is that the network is 1343 simple and transparent, so it is easy to add services, and 1344 easy to diagnose problems when they occur. With the end to end 1345 model, there are only two active entities that count, the 1346 client and the server (or requesting peer and replying peer in 1347 peer-to-peer services). Other entities perform a strictly 1348 limited set of functions associated with packet forwarding. 1349 Yet, there are a few other cases in which the end-to-end model 1350 has been modified. Firewalls and spam filtering on SMTP relays 1351 are examples. Both cases are a result of perceived need for 1352 control over the content of packet flow, as opposed to simply 1353 controlling the flow without regard to content. 1355 In the case of firewalls, ISPs and corporate networks require 1356 the ability to restrict entry into their co-operatively 1357 managed networks, so that only paying customers (for ISPs) or 1358 authorized users (for corporations) can gain access to their 1359 networks. 
Firewalls in essence allow the owners of these networks to enforce their property rights with regard to networks that they own. But they also perform another important function: they allow corporations and ISPs to enforce the orderly provisioning of service to users, thereby helping to ensure the orderly functioning of the Internet at large. Despite the continued controversy over the reduction in transparency caused by firewalls, firewalls have by and large been successful in performing their function. Proxy DoS attacks and other large-scale proxy disruptions of the Internet are much harder to mount from outside through a firewall, and a firewall allows a co-operatively managed network to disconnect from the Internet for a time if a major disruption does occur, thereby allowing the network provider to continue providing local service to users.

Spam filtering has been less controversial, perhaps because the connection to direct user need (and the direct daily experience of Internet users) has been more obvious. The nature of email makes it possible for a sender with disruptive, commercial, or other intent that may not correspond with the desire of the recipient to address thousands or millions of recipients, whether or not those recipients want to be addressed by that sender. Spam filtering allows the administrator of an SMTP relay to save his or her users the trouble of having to hand-filter tens or hundreds of annoying messages a day. For anybody whose job involves having to deal with hundreds of legitimate messages a day, such a service is invaluable.

Both of these examples involve security.
Giving ISPs and other network operators or service providers the ability to insert services into HTTP proxies generalizes these existing cases beyond simple security functions, but it is likewise being driven by the needs of ISPs and other network operators, and by the desires of their users.

ISPs and other network service providers want the ability to add services to HTTP proxies because they want to generate additional revenue from added value that they can provide. This added revenue helps assure their viability as commercial concerns, which allows them to continue to provide service to their customers and grow their network offering. Since a viable ISP business is crucial for the growth and continued maintenance of the Internet, value-added services are a way to help augment funding for Internet growth and maintenance.

Users benefit from value-added services because, like spam filtering, some services may prove invaluable to them. While these services could potentially be provided by adhering to the traditional end-to-end model, the barrier to deploying such services on clients is high, and growing higher as the number of Internet users grows. Today, many Internet users are consumers who access the Internet through standardized clients that they would just as soon treat as appliances. They would rather not have to upgrade their client software to access new Internet services, due to the potential for disruption in the stability of their daily Internet access environment.
In the future, with the advent of true Internet appliances (wireless and otherwise), it may become impossible to upgrade such clients without throwing the hardware away, and the amount of processing power available in such devices may be insufficient for performing certain services (such as virus filtering) that proxy value-added services can provide. The ability to deploy services on HTTP proxies allows the user's ISP or other network service provider to quickly deploy services that would take years to deploy on clients, and that may not even be possible on some low-performance clients.

If the caching proxy continues to remain largely transparent as an interception proxy, the possibility for abuse is high. Therefore, setting up a value-added caching proxy without a business (or other social) relationship with the client, the server, or both is a highly unethical act. Clients and servers are encouraged to use authentication to limit their vulnerability to unauthorized intermediate processing on caching proxies. Value-added providers are encouraged to advertise the presence of value-added services, so that clients know that their Web streams are being modified. Value-added caching proxies have the potential to reduce the processing transparency of the Web, but their commercial potential, and thus the value they provide both to publishers and clients, is still higher. In the place of processing transparency, the business and social relationships between clients, caching proxies, and servers should become as non-invasive as possible, so all parties in a Web transaction know the services for which they've contracted and which are being performed by their service providers.

9. Intellectual Property

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this memo or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

10. Acknowledgements

In addition to the authors, valuable discussion instrumental in creating this document has come from Geoff Baehr, of Sun Microsystems; Steven Reynolds, of Enron; Rob Adams, of CMGIon; Steve Holbrook, of Novell; and Ed Haslam, of Inktomi.

References

[1] Dierks, T. and C. Allen, "The TLS Protocol, Version 1.0", RFC 2246, January 1999.

[2] Eastlake, D. and C. Kaufman, "Domain Name System Security Extensions", RFC 2065, January 1997.

[3] Sedlar, E. and G. Clemm, "Access Control Extensions to WebDAV", Internet-Draft draft-ietf-webdav-acl-01.txt, work in progress, May 2000.

[4] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

[5] Franks, J., Hallam-Baker, P., Hostetler, J., Lawrence, S., Leach, P., Luotonen, A. and L. Stewart, "HTTP Authentication: Basic and Digest Access Authentication", RFC 2617, June 1999.

[6] Oaks, S., "Java Security", May 1998.

[7] Elson, J., Martin, J., Sharp, E., Schuster, J., Cerpa, A., Danzig, P., Neerdaels, C. and G. Tomlinson, "ICAP, the Internet Content Adaptation Protocol", External Reference http://www.i-cap.org/icap_v1-25.txt, work in progress, January 2000.

[8] Calhoun, P., Rubens, A., Akhtar, H. and E. Guttman, "DIAMETER Base Protocol", Internet-Draft draft-calhoun-diameter-15.txt, work in progress, June 2000.

[9] Cooper, I., Melve, I. and G. Tomlinson, "Internet Web Replication and Caching Taxonomy", Internet-Draft draft-ietf-wrec-taxonomy-04.txt, work in progress, June 2000.

[10] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119, March 1997.

[11] FOLDOC, "Free Online Dictionary of Computing: Replication Levels", March 1997.

[12] Volbrecht, J., Calhoun, P., Farrell, S., Gommans, L., Gross, G., de Bruijn, B., de Laat, C., Holdrege, M. and D. Spence, "AAA Authorization Framework", Internet-Draft draft-ietf-aaa-authz-arch-00.txt, work in progress, October 1999.

[13] Stevens, M., Weiss, W., Mahon, H., Moore, B., Strassner, J., Waters, G., Westerinen, A. and J. Wheeler, "Policy Frameworks", Internet-Draft draft-ietf-policy-framework-00.txt, work in progress, September 1999.

[14] Carpenter, B., "Architecture Principles of the Internet", RFC 1958, June 1996.

[15] Carpenter, B., "Internet Transparency", RFC 2775, February 2000.

[16] Cerpa, A., Elson, J., Beheshti, H., Chankhunthod, A., Danzig, P., Jalan, R., Neerdaels, C., Schroeder, T. and G. Tomlinson, "NECP the Network Element Control Protocol", Internet-Draft draft-cerpa-necp-02.txt, work in progress, February 2000.

[17] Gauthier, P., Cohen, J., Dunsmuir, M. and C. Perkins, "Web Proxy Auto-Discovery Protocol", Internet-Draft draft-ietf-wrec-wpad-01.txt, expired work in progress, July 1999.

[18] Guttman, E., Perkins, C., Veizades, J. and M. Day, "Service Location Protocol, Version 2", RFC 2608, June 1999.

[19] Cooper, I. and J. Dilley, "Known HTTP Proxy/Caching Problems", Internet-Draft draft-ietf-wrec-known-prob-01.txt, July 2000.

Authors' Addresses

Gary Tomlinson
Novell, Inc.
1800 South Novell Place
Provo, UT 84606-6194
US

Phone: +1 801 861 7021
EMail: garyt@novell.com

Hilarie Orman
Novell, Inc.
1800 South Novell Place
Provo, UT 84606-6194
US

Phone: +1 801 861 5278
EMail: horman@novell.com

Michael Condry
Sun Microsystems, Inc.
901 San Antonio Road
Palo Alto, CA 94303-4900
US

Phone: +1 650 786 5568
EMail: michael.condry@sun.com

James Kempf
Sun Microsystems, Inc.
901 San Antonio Road
Palo Alto, CA 94303-4900
US

Phone: +1 650 786 5890
EMail: james.kempf@sun.com

Dave Farber
Digital Island
225 West Hillcrest Drive, Suite 250
Thousand Oaks, CA 91360
US

Phone: +1 805 370 2190
EMail: dave@digisle.net

Appendix A. Examples

The following applications illustrate particular features of the service environment proxy architecture and how the architecture can be used to implement some services that would be difficult to implement in other ways.
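
The examples in this appendix share one pattern: a rule matches a property of a parsed message at one of the execution points of Figure 1 and triggers a proxylet. The following Python fragment is only a rough sketch of that pattern. This document defines no concrete API, so every name in it (Message, Rule, run_rules, tag_request, and the X-Origin-Address and X-Request-Id header fields) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    """A request or response: header fields plus an opaque body."""
    headers: Dict[str, str] = field(default_factory=dict)
    body: str = ""


@dataclass
class Rule:
    """Binds a match condition to a proxylet at one execution point."""
    execution_point: int                    # 1..4, as numbered in Figure 1
    matches: Callable[[Message], bool]      # condition on the parsed message
    proxylet: Callable[[Message], Message]  # service code run on a match


def run_rules(rules: List[Rule], point: int, msg: Message) -> Message:
    """Apply, in order, every matching rule registered for this point."""
    for rule in rules:
        if rule.execution_point == point and rule.matches(msg):
            msg = rule.proxylet(msg)
    return msg


def tag_request(msg: Message) -> Message:
    """Hypothetical proxylet: attach an identifying header field."""
    headers = dict(msg.headers)  # copy so the original message is untouched
    headers["X-Request-Id"] = "id-" + headers["X-Origin-Address"]
    return Message(headers, msg.body)


# One rule at execution point 2 that fires when an origin address is present.
rules = [Rule(2, lambda m: "X-Origin-Address" in m.headers, tag_request)]
request = Message({"X-Origin-Address": "192.0.2.7"})
tagged = run_rules(rules, 2, request)     # rule fires at point 2
untouched = run_rules(rules, 3, request)  # no rule registered at point 3
```

In this sketch a message that reaches an execution point with no matching rule passes through unchanged, which is one plausible reading of how a rule base and proxylets interact; a real rule engine would also need ordering, authorization, and failure handling that are omitted here.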

A.1 Request Identification

A simple problem is how to attach a custom identification, such as a cookie or serial number, to a particular type of request. The service definition consists of a rule that matches an identifying field in the message header, like the origin address. When the rule triggers, a proxylet generates the identification, perhaps by querying the content server, and adds it by attaching another header field. The request identification rule is triggered at execution point 2 in Figure 1, since the identification is made on an incoming request, and it is only made if the request is not fulfilled from the cache. Request identification could be handled directly by the content server, but the addition of the service environment proxy allows multiple content servers, potentially from multiple administrative domains, to subscribe to the same request identification service without having to implement the service separately, and potentially to correlate identifications.

A.2 Content Assembly

A common use of proxies today is to insert advertisements into Web pages. Surrogates are commonly used in the Internet to perform customized ad generation on Web pages. This application can be generalized to content assembly of any kind from multiple origin servers. Content assembly illustrates the need for message parsers that handle the contents of messages as well as the headers.

In order to determine whether and where content must be inserted, a marker is required within the content. For example, a tag such as "" in the content may indicate where the advertisement should be inserted. The tag may contain information indicating where to obtain the ad, or the location of the ad content may be determined computationally when the ad is inserted.
The rule engine must respond to method properties derived from content as well as headers, and the content must be made available to the proxylet in a way that allows the proxylet to reassemble the Web page with the new content inserted. In order to avoid serious performance penalties, parsing of the message should be performed only once. Content assembly runs at execution point 3 (for cached responses) or execution point 4 (for personalized responses) as defined in Figure 1.

A.3 Multimedia Stream Management

This example illustrates how a service environment caching proxy can provide a caching algorithm other than LRU for content that has different requirements.

Multimedia content is a growing part of Web traffic, but the caching strategy used with multimedia may need to be different because the streams are real time and the amount of data involved is larger than for standard text and graphics Web pages. A proxylet can be used to manage the cache for more effective caching of multimedia data. An example where such a caching policy might be required is showing movies on demand to multiple clients from one incoming stream. The incoming stream is maintained in the cache for some period of time, and new users who request movie content within that time window have their requests fulfilled from the cache rather than directly from the origin server. The cache copy is periodically refreshed as the time window expires, at which point the proxy must go back to the origin server if another request for the movie comes in.

The cache provides multiple client streams and does accounting for clients watching the movie. If the protocol used to transport the movie is RTSP, the packet headers need to be transformed as they are removed from the cache for delivery so each client receives an appropriate header.
This occurs at execution point 4 in Figure 1, since the changes are made after the packets are removed from the cache for delivery back to the client.

A.4 Virus Detection

An example of an application that would benefit from having a remote callout server is virus detection. Because virus detection might be computationally expensive, the service execution environment should have the option of calling on more powerful computational resources. In addition, a customer could outsource virus detection to a computer security firm that specializes in tracking viruses and provides fast detection solutions. By allowing a remote callout server to perform the detection, the customer gains the benefit of specialized help for their security needs. Virus detection would typically run on an avatar, but it might also run on a surrogate that is taking uploaded material from user agents.

In the virus detection example, the message parser and rule engine detect data objects that could potentially contain viruses, and hand them off to the remote callout server for verification. Virus scanning typically runs at execution point 3 in an avatar and execution point 1 in a surrogate, to prevent virus-infected material from being cached.

A.5 Transcoding

This example illustrates an application for service environment proxies that would be difficult to implement in any other way. Wireless devices often have screen, keyboard, and network bandwidth constraints that don't apply to other devices. The current way of accommodating those constraints is to construct a gateway that short-circuits requests and fulfills them with customized content that matches the constraints of the device, potentially even over a different networking protocol (e.g., WAP).

With a service environment proxy, the rule base can identify requests from devices that have some kind of user interface or network bandwidth constraint, and trigger a proxylet that transcodes the content into a format that is more appropriate for the device. The advantage of the service environment proxy solution is that the content need be written only once, while gateways require multiple copies of the content to be available, matching different device characteristics. Transcoding can also be used for other purposes, for example, to translate a Web page into multiple languages.

Full Copyright Statement

Copyright (C) The Internet Society (2000). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC editor function is currently provided by the Internet Society.