Internet Draft                                    A. Dracinschi Sailer
Expires: May, 2002                                 Lucent Technologies
                                                               V. Hilt
Document:                                            Univ. of Mannheim
draft-dracinschi-opes-callout-requirements-00.txt           M. Hofmann
                                                   Lucent Technologies
                                                           R. R. Menon
                                                                 Intel

Category: Informational                              November 14, 2001


               Requirements for OPES Callout Protocols


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.
   It is inappropriate to use Internet-Drafts as reference material
   or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt
   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   In the context of Content Networks, Open Pluggable Edge Services
   (OPES) represent an infrastructure that enables quick and easy
   creation of value-added networking services. This document presents
   requirements for callout protocols that provide communication
   between an in-path OPES intermediary (e.g. a cache) and remote
   callout servers.

Table of Contents

   1 Terminology
   2 Introduction
   3 Design Considerations
     3.1 Basic Requirements
       3.1.1 Service identification
       3.1.2 Message exchange style
       3.1.3 Message context
       3.1.4 Payload transparency
       3.1.5 Pipelining requests
       3.1.6 Message segmentation
     3.2 Increasing Efficiency
       3.2.1 Caching responses
       3.2.2 Channels
       3.2.3 Buffering messages
       3.2.4 Preview
       3.2.5 Partial content
       3.2.6 Multiple services on the same message
   4 Security Considerations
   5 Acknowledgments
   6 References
   7 Authors' Addresses
   Full Copyright Statement

1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

   OPES related terms are to be interpreted as defined and used in [2].

2 Introduction

   Content Networks, also known as Content Distribution Networks or
   Content Delivery Networks (CDNs), are of increasing importance to
   the overall architecture of the web. CDNs help improve the delivery
   of content from an origin server to content consumers. Content
   networks can be seen as an overlay network on top of the
   traditional packet network infrastructure. Similar to the CDN space,
   there exists a need for delivering a variety of services to
   corporate/enterprise Intranets. [2] introduces Open Pluggable Edge
   Services (OPES), an infrastructure for adding valuable content
   services to a CDN or an Intranet. Examples of such services include
   dynamic content assembling at the network edge, URL filtering,
   language translation, location-based services, content adaptation
   for different devices based on device characteristics, privacy
   services, etc.

   This document presents requirements for callout protocols in the
   context of the OPES architecture. A callout protocol supports
   message exchanges between an in-path OPES intermediary and a remote
   callout server. Intermediaries are application gateway devices
   located in the path between a client and an origin server. Caching
   proxies are probably the most commonly known and used intermediaries
   today.
   A remote callout server is a cooperating server that runs OPES
   service modules on behalf of an OPES intermediary. Remote callout
   servers are usually employed in an OPES framework either to offload
   the OPES intermediary for better scalability or to provide
   value-added services not available on either the origin server or
   the OPES intermediary.

   Section 3 summarizes the requirements for such a callout protocol.

3 Design Considerations

3.1 Basic Requirements

   A callout protocol's primary purpose is to efficiently forward,
   from the intermediary to the remote callout server, the
   request/response messages exchanged on the content path (e.g. HTTP,
   RTSP, or RTP messages) together with information about the service
   to be executed on those messages at the remote server. In order to
   fulfill this task, a callout protocol SHOULD consider the following
   design issues: service identification, message exchange style,
   message context, payload transparency, pipelining, and message
   segmentation.

3.1.1 Service identification

   A callout protocol MUST be able to uniquely identify a remote
   callout service that is required to be executed on a message. An
   adequate way to provide such identification may be a URI. Such a
   URI MUST contain the complete hostname and the path identifying the
   service requested. The method of determining the name of an
   appropriate service is outside the scope of a callout protocol.
   An example of such a URI is ucp://my.callout-server.com/service1

3.1.2 Message exchange style

   A callout protocol MUST implement a request/reply communication
   style. Initiating a callout always requires a request containing the
   encapsulated message (or parts of it) to be transferred to a callout
   server.
   In turn, this server MUST always send back a response containing
   either the unmodified message, a modified version of the message, a
   status code (that triggers a certain reaction from the
   intermediary), or an error code.

3.1.3 Message context

   Some remote callout services require additional information to
   perform their service. One example of such information is the HTTP
   request, for a service that operates on an HTTP response. Another
   example is a command line parameter (e.g. the destination language
   for a translation service). In general, a message context could be
   any information available in the local execution environment that is
   needed by a remote callout service.

   There are two methods of transferring the message context to the
   remote server. First, it can be part of the URL (e.g. as user id,
   additional path elements, or a query parameter) with which a service
   is invoked. An example of such a URL is

   ucp://volker@my.callout-server.com:8080/translation-service/
   fast_translation?lang=german

   (shown on two lines for layout only). Second, the message context
   can be transferred within a separate field of the request header. As
   with the payload, no assumptions SHOULD be made on the type or
   structure of the message context field. Instead, the message context
   SHOULD be taken as binary data that is encapsulated in the request.
   An example of information in such a header field is the HTTP request
   that is shipped along with an HTTP response.

   Both methods of transferring the message context have their
   advantages and disadvantages. Transferring the message context
   within the URL is simple and produces very low overhead. However,
   the size and complexity of information contained in a URL is limited
   (e.g. encoding an HTTP request within a URL might not be a good
   alternative). Using a separate header field introduces some overhead
   but is much more flexible than using a URL.

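   As an illustration of the two methods above, the following Python
   sketch reads the context out of the example URL and, alternatively,
   carries it as opaque bytes next to the payload. The function names
   and the newline-delimited framing are assumptions made for this
   example only; this document does not define a wire format.

```python
from urllib.parse import urlsplit, parse_qs


def context_from_uri(uri):
    """Method 1: recover the message context encoded in the service URI."""
    parts = urlsplit(uri)
    return {
        "host": parts.hostname,
        "port": parts.port,
        "service": parts.path,
        "user": parts.username,                # user id element
        "params": parse_qs(parts.query),       # query parameters
    }


def build_request(service_uri, payload, context=b""):
    """Method 2: ship the context as opaque binary data in its own field.

    Hypothetical framing: three text lines (URI, context length,
    payload length) followed by the raw context and payload bytes.
    Neither field is parsed or interpreted here.
    """
    header = f"{service_uri}\n{len(context)}\n{len(payload)}\n"
    return header.encode("ascii") + context + payload
```

   With the first method a translation service would find its target
   language under params["lang"]; with the second, an HTTP request can
   accompany the HTTP response it belongs to without any URL encoding.
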
   Although it would be possible to let the callout server modify parts
   of the message context and return it along with the response, this
   SHOULD NOT be allowed. It would substantially increase the
   complexity of an intermediary, since the intermediary would need to
   assure the consistency of the message context, especially if
   multiple requests are issued in parallel.

3.1.4 Payload transparency

   A callout protocol SHOULD make no assumptions about the protocol
   used on the content path (in particular, it SHOULD NOT assume that
   this protocol is HTTP). Instead, a callout protocol SHOULD take the
   content path protocol messages as binary data and encapsulate these
   messages during the transfer to and from a remote callout server.

   This requirement does not prevent a design where a basic callout
   protocol captures common aspects of the callout process and an
   additional payload specification tailors this basic protocol to the
   needs of a certain content path protocol (similar to the model used
   by RTP). Nevertheless, the basic callout protocol SHOULD be
   independent of the protocol used on the content path.

   If possible, a callout protocol SHOULD NOT assume a certain
   communication pattern (e.g. request/reply) to be used on the content
   path either. The rationale behind payload transparency is that a
   callout protocol should be capable of handling different content
   path protocols, to avoid the re-implementation of similar
   functionality for each of these protocols. Examples of common
   content path protocols are HTTP, RTSP, SMTP, NNTP, and RTP.

3.1.5 Pipelining requests

   It is very likely that a remote callout service is called many times
   in sequence, with very little time between consecutive requests.
   For example, an ad insertion service might be called for every HTTP
   message passing through an intermediary.
   For this reason, a callout protocol MUST be capable of issuing a
   request without having received the response for a previous request.
   In other words, the protocol MUST be capable of pipelining multiple
   requests.

3.1.6 Message segmentation

   The messages exchanged on the content path can be very large.
   Examples are huge web pages, PostScript or PDF documents, audio and
   video clips, and streamed audio and video. Usually, these messages
   are segmented and transferred in a stream of small packets. For
   example, HTTP supports this type of transmission with its chunked
   transfer encoding. A callout protocol SHOULD be able to redirect the
   segments of a message to the callout server as soon as the
   intermediary receives them. The intermediary SHOULD NOT try to
   receive the entire message before it is sent to the callout server.
   This would substantially increase the processing time of one message
   and it would not be possible at all for media streams. An
   implication for the protocol design is that the size of a message is
   not known at the time its first packets are sent to the callout
   server.

3.2 Increasing Efficiency

   Typically, an intermediary has to handle large amounts of network
   traffic. Depending on the rule configuration and the services
   provided, a significant part of this traffic may be sent to a remote
   callout server. For this reason, efficiency SHOULD be one of the
   major design goals for a callout protocol. Performance measurements
   on ICAP indicate that the vast majority of processing time is spent
   copying messages from the content path to the callout server and
   back. Thus, the efficiency of a callout protocol can be increased if
   the amount of data that has to be transmitted is minimized. The
   following concepts may help to achieve this goal.

3.2.1 Caching responses

   A callout protocol SHOULD support the caching of responses.
   To do so, a remote callout server MUST be able to indicate whether
   and for how long a response may be cached by an intermediary. If a
   response is cacheable and still valid, an intermediary MAY satisfy
   identical requests by using the cached response. Determining which
   requests are identical is outside the scope of a callout protocol.
   Once a server has allowed the caching of a response for a certain
   period of time, there is no means for it to revise this decision.

3.2.2 Channels

   Since it can be assumed that an intermediary sends a large number of
   requests to a remote callout server, it is reasonable to open a
   persistent channel to a remote callout server over which all
   messages are transferred. This will substantially reduce the network
   overhead for the transmission of one message. An intermediary may
   decide when it opens or closes a channel. A reasonable policy might
   be to establish a channel at the time the first request for a
   service is received and to close the channel after a timeout. The
   policy of opening and closing a channel SHOULD NOT be part of the
   protocol.

   During the creation of a channel, an intermediary has the chance to
   negotiate service parameters associated with that channel with the
   remote callout service. These parameters apply to all messages
   exchanged over that channel. Examples of such parameters are the
   service URI, the payload type, or the service context. Exchanging
   this information once at channel setup removes some of the
   per-message protocol overhead. Although these savings are modest,
   they come at almost no cost. Furthermore, parameters can be
   negotiated once during channel creation, whereas negotiating them
   for each message might become time-critical.

3.2.3 Buffering messages

   An intermediary MAY keep a local copy of the message it has sent to
   a remote callout server.
   This spares the callout server from always returning the entire
   message. The server could, for example, return a status code
   indicating that it does not want to alter the original message.
   Keeping a copy of the message at the intermediary can significantly
   decrease the amount of data that has to be transferred between
   intermediary and callout server. However, it requires the
   intermediary to store and manage all messages it has sent to the
   callout server. Thus, it introduces complexity in the intermediary
   and increases its memory requirements.

   To alleviate this problem, the intermediary could specify the amount
   of data it is willing to buffer for one request. If this limit is
   reached, the intermediary will stop the transmission of the request
   and will wait for a response. Up to that point, the server is
   allowed to respond at any time and assume that the intermediary has
   kept the entire message. If the server is not able to determine a
   response from the initial part of the request, then it MUST
   explicitly request the transmission of the remaining part of the
   request. In its next response, the server MUST assume that the
   intermediary does not have a copy of the message.

3.2.4 Preview

   In some cases, the remote callout service can complete its operation
   before it has received the entire message. For example, a virus
   checking service can certify a large fraction of all files as
   "clean" just by looking at the file type and the first 2K bytes.
   Another example is a content filtering system that marks a web page
   as containing "illegal content" as soon as certain words appear in
   that page. In these cases, the remote callout server does not need
   to receive the remaining part of the message and can instantly
   respond with a certain status code. A callout protocol SHOULD
   provide the possibility for a server to opt out of a transmission
   early.

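   The content filtering example above can be sketched in Python as a
   service that inspects message segments as they arrive and stops at
   the first match. The function name, the status strings, and the
   assumption of a non-empty list of blocked words are hypothetical;
   they only illustrate the early opt-out and are not part of any
   protocol.

```python
def filter_stream(segments, blocked_words):
    """Scan segments in arrival order and opt out at the first match.

    Returns (status, bytes_consumed). blocked_words must be a
    non-empty list of bytes objects (an assumption of this sketch).
    """
    longest = max(len(word) for word in blocked_words)
    tail = b""      # end of earlier data, kept to catch words split across segments
    consumed = 0
    for segment in segments:
        consumed += len(segment)
        window = tail + segment
        if any(word in window for word in blocked_words):
            # Decision reached: the remaining segments are never read.
            return "blocked", consumed
        tail = window[-(longest - 1):] if longest > 1 else b""
    return "no-changes-required", consumed
```

   Because the function returns as soon as a decision is possible, a
   caller that feeds it from an iterator never has to produce the rest
   of the message, which is the saving a preview mechanism aims for.
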
   There are two design alternatives for the preview functionality. In
   the first approach, the intermediary sends a predefined portion of
   the request to the callout server, then stops and waits for a
   response from the callout server. If the server returns a positive
   response, the intermediary sends the remaining part of the message.
   Otherwise it interrupts the transmission. This approach is used by
   the ICAP protocol. In the second approach, the callout server is
   allowed to respond to a request at any time. It MUST indicate in
   this response whether the current transmission should be completed
   or interrupted.

   A prerequisite for the first approach is that the intermediary knows
   the amount of data required by the server to decide on continuing or
   interrupting a request. In these cases the intermediary can send
   exactly this portion of a request and thus minimize the amount of
   data that is exchanged. A drawback of this approach is that the
   handshake between intermediary and callout server introduces an
   additional delay into the processing of one request. The major
   advantage of the second approach is that it lets the server decide
   at which point the transmission is interrupted. This can be
   exploited, for example, by services that make their decision on
   continuing or interrupting dynamically during the processing of one
   request. In these cases, the second approach is more efficient,
   since it allows the server to opt out of the transmission as soon as
   possible. Summing up, in the ideal case the first approach is used
   if the size of the preview is known in advance and the second
   approach is used otherwise.

   If only one approach is to be supported by a callout protocol, the
   penalty for not using the optimal approach must be considered.
   If the second approach is always used, the intermediary continues
   sending data after the decision point until it receives a response
   from the server. If the response is to continue the transmission, no
   bandwidth has been wasted and, in addition, no delay for the
   handshake has been introduced. If the response is negative, the
   intermediary has sent redundant data for the duration of one message
   round trip. If the first approach is always used, the intermediary
   must guess the size of the preview. If the chosen size is too large
   and the server decides to bail out of a transmission, the penalty is
   the data that is transmitted until the full preview size is reached.
   If the guessed preview size is too small, the intermediary must
   continue and send the entire message. Thus, the penalty is the part
   of the message after the actual decision point. In conclusion, the
   penalty of always using the first approach is typically higher than
   the penalty of always using the second approach.

3.2.5 Partial content

   Some remote callout services only modify small parts of the original
   message. For example, a translation service typically inserts a
   small icon into the original page, from which the translated page
   can be reached. Another example is a service that forces all
   cacheable data to expire at a certain time by modifying the HTTP
   header fields. In these cases, returning the entire message from the
   callout server back to the intermediary would not be very efficient.
   Instead, a remote callout server could return just the modified
   parts of a message and indicate the position at which each part is
   to be inserted into the original message.

   This is much like a partial content response of HTTP. It is
   important to keep the burden on the intermediary as low as possible.
   For this reason, the response SHOULD always indicate the offset of
   the partial response in absolute byte numbers. This approach trades
   an increase of complexity in the callout protocol and the
   intermediary against a decrease in the amount of data that has to be
   transmitted. Although the additional complexity seems to be
   relatively low, the benefits depend heavily on the remote callout
   services that are able to utilize this feature.

3.2.6 Multiple services on the same message

   A remote callout service provider might offer several callout
   services. In this case, it might not be reasonable to make a
   separate call for each remote service to be executed on the same
   content-path message. Instead, it would be more efficient to
   transfer the content-path message to the remote callout server once,
   execute all services, and return the entire response. The callout
   server is responsible for dispatching the message in the correct
   order to the different services and for aggregating the responses
   into a single response message.

   To invoke multiple services, an intermediary MUST be able to specify
   more than one URL. The design alternatives are to set up one channel
   for each combination of remote services or to use one channel to a
   callout server and specify the desired URLs in each message.

   The most challenging task is to dispatch the requests to multiple
   services and to aggregate the responses of individual services. This
   SHOULD be done by a dispatcher on the remote server. The following
   rules can be considered:

   Caching: the response MUST contain the earliest expiration date.

   Keeping copy: the remote callout server SHOULD propose the maximum
   of the prefix sizes of individual services as the prefix size of the
   compound service.

   If a service requests the transmission of the entire message, the
   server MUST return this request to the intermediary and forward the
   remaining message to the service. This request frees the
   intermediary from the burden of keeping a copy of the message. If
   the server itself is not willing to buffer the message, it MUST call
   all subsequent services with preview size zero. In any case, the
   server MUST return an entire message to the intermediary.

   Preview: if the response of a service indicates that no changes are
   required, the service dispatcher SHOULD NOT opt out of the current
   transmission of the request. Instead, it SHOULD forward the current
   message to the next service. The service dispatcher MAY interrupt
   the transmission and return a "no changes required" response only if
   all services indicate that no changes are required and the message
   has not yet been transmitted completely.

   Partial content: the message dispatcher of the callout server MUST
   insert the partial response it receives from each service into the
   full message before sending it to the next service. If all services
   have returned partial responses, it MAY decide to aggregate all
   parts and return them as a partial response to the intermediary.
   Otherwise it returns the response it got from the last service
   called as an entire message.

4 Security Considerations

   This document does not explicitly require a callout protocol to
   encrypt the encapsulated content-path messages for transit by
   default. In the absence of some other form of encryption at the link
   or network layers, eavesdroppers may be able to record the
   unencrypted transactions between the intermediary and the callout
   server.

5 Acknowledgments

   The authors would like to thank all active participants in the OPES
   mailing list for their thought-provoking discussion.
   In particular, we want to acknowledge major contributions from Andre
   Beck, who was heavily involved in shaping this document.

6 References

   [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
       Levels", RFC 2119, March 1997.

   [2] Tomlinson, G., et al., "A Model for Open Pluggable Edge
       Services", Work in Progress, Internet Draft
       draft-tomlinson-opes-model-00.txt, July 2001.

7 Authors' Addresses

   Anca Dracinschi Sailer
   Room 4F-531
   Lucent Technologies
   101 Crawfords Corner Rd.
   Holmdel, NJ 07733
   Phone: (732) 494-2259
   Email: anca@bell-labs.com

   Volker Hilt
   Praktische Informatik IV
   University of Mannheim
   Phone: +49 621 181 2606
   Email: hilt@informatik.uni-mannheim.de

   Markus Hofmann
   Room 4F-513
   Lucent Technologies
   101 Crawfords Corner Rd.
   Holmdel, NJ 07733
   Phone: (732) 332-5983
   Email: hofmann@bell-labs.com

   Rama R. Menon
   Intel Corporation
   M/S JF3-206
   2111 NE 25th Ave.
   Hillsboro, OR 97124
   Phone: +1-503-712-1438
   Email: rama.r.menon@intel.com

Full Copyright Statement

   Copyright (C) The Internet Society (2000). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works.
   However, this document itself may not be modified in any way, such
   as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the
   procedures for copyrights defined in the Internet Standards process
   must be followed, or as required to translate it into languages
   other than English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.