idnits 2.17.1

draft-bertrand-cdni-logging-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 22 instances of too long lines in the document, the longest
     one being 1 character in excess of 72.

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document. If these are example addresses, they should be changed.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords.

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008. The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs. If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer.
     Otherwise, the disclaimer is needed and you can ignore this comment.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (February 13, 2012) is 4427 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'MED' is mentioned on line 797, but not defined

  == Unused Reference: 'RFC2119' is defined on line 909, but no explicit
     reference was found in the text

  == Unused Reference: 'I-D.bertrand-cdni-experiments' is defined on line 922,
     but no explicit reference was found in the text

  == Unused Reference: 'RFC3444' is defined on line 951, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC3466' is defined on line 955, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC3568' is defined on line 959, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-02) exists of
     draft-bertrand-cdni-experiments-01

  == Outdated reference: A later version (-08) exists of
     draft-ietf-cdni-problem-statement-03

  == Outdated reference: A later version (-17) exists of
     draft-ietf-cdni-requirements-02

  == Outdated reference: A later version (-10) exists of
     draft-ietf-cdni-use-cases-03

  -- Obsolete informational reference (is this intentional?): RFC 3466
     (Obsoleted by RFC 7336)

     Summary: 1 error (**), 0 flaws (~~), 14 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Internet Engineering Task Force                         G. Bertrand, Ed.
3    Internet-Draft                                                E. Stephan
4    Intended status: Informational                   France Telecom - Orange
5    Expires: August 16, 2012                               February 13, 2012

7                             CDNI Logging Interface
8                         draft-bertrand-cdni-logging-00

10   Abstract

12      This memo specifies the Logging interface between a downstream CDN
13      (dCDN) and an upstream CDN (uCDN). It introduces a framework, an
14      architecture design and a set of new requirements. Then it drafts an
15      information model.

17   Status of this Memo

19      This Internet-Draft is submitted in full conformance with the
20      provisions of BCP 78 and BCP 79.

22      Internet-Drafts are working documents of the Internet Engineering
23      Task Force (IETF). Note that other groups may also distribute
24      working documents as Internet-Drafts. The list of current Internet-
25      Drafts is at http://datatracker.ietf.org/drafts/current/.

27      Internet-Drafts are draft documents valid for a maximum of six months
28      and may be updated, replaced, or obsoleted by other documents at any
29      time. It is inappropriate to use Internet-Drafts as reference
30      material or to cite them other than as "work in progress."

32      This Internet-Draft will expire on August 16, 2012.

34   Copyright Notice

36      Copyright (c) 2012 IETF Trust and the persons identified as the
37      document authors. All rights reserved.

39      This document is subject to BCP 78 and the IETF Trust's Legal
40      Provisions Relating to IETF Documents
41      (http://trustee.ietf.org/license-info) in effect on the date of
42      publication of this document. Please review these documents
43      carefully, as they describe your rights and restrictions with respect
44      to this document. Code Components extracted from this document must
45      include Simplified BSD License text as described in Section 4.e of
46      the Trust Legal Provisions and are provided without warranty as
47      described in the Simplified BSD License.

49      This document may contain material from IETF Documents or IETF
50      Contributions published or made publicly available before November
51      10, 2008. The person(s) controlling the copyright in some of this
52      material may not have granted the IETF Trust the right to allow
53      modifications of such material outside the IETF Standards Process.
54      Without obtaining an adequate license from the person(s) controlling
55      the copyright in such materials, this document may not be modified
56      outside the IETF Standards Process, and derivative works of it may
57      not be created outside the IETF Standards Process, except to format
58      it for publication as an RFC or to translate it into languages other
59      than English.

61   Table of Contents

63      1. Introduction
64        1.1. Terminology
65        1.2. Abbreviations
66      2. Logging Framework
67      3. Architecture
68      4. Additional Requirements
69      5. Rationale for Logging Interface
70        5.1. Usages of CDNI Logging Information By uCDN
71          5.1.1. Maintenance/Debugging
72          5.1.2. Accounting
73          5.1.3. End-User Experience Management
74          5.1.4. Security
75        5.2. Logging Information Views
76        5.3. Information Extracted From Logging Data
77      6. Log Information Elements
78        6.1. Core Information Elements
79        6.2. Information Elements for Content Delivery
80        6.3. Information Elements for Content Acquisition
81        6.4. Log Extensibility
82      7. Core Logging Records
83        7.1. Content Delivery
84        7.2. Content Acquisition
85        7.3. Content Purging
86        7.4. Extended CoDRs
87      8. Logging Process
88        8.1. Logging Aggregation
89          8.1.1. Logging and Fragmented Objects
90        8.2. Logging Protection
91          8.2.1. Logging Signing
92        8.3. Logging Filtering
93        8.4. Logging Update and Rectification
94      9. Protocols for Logging
95      10. IANA Considerations
96      11. Security Considerations
97        11.1. Privacy
98        11.2. Non Repudiation
99      12. Acknowledgments
100     13. References
101       13.1. Normative References
102       13.2. Informative References
103     Appendix A. Examples Log Format
104       A.1. W3C Common Log File (CLF) Format
105       A.2. W3C Extended Log File (ELF) Format
106       A.3. National Center for Supercomputing Applications (NCSA)
107            Common Log Format
108       A.4. NCSA Combined Log Format
109       A.5. NCSA Separate Log Format
110       A.6. Squid 2.0 Native Log Format for Access Logs
111     Authors' Addresses

113  1. Introduction

115     This memo specifies the Logging interface between a downstream CDN
116     (dCDN) and an upstream CDN (uCDN).
It introduces a framework, an
117     architecture design and a set of new requirements. Then it drafts an
118     information model.

120     The reader should be familiar with the work of the CDNI WG:

122     o  The CDNI problem statement [I-D.ietf-cdni-problem-statement] and
123        framework [I-D.davie-cdni-framework] identify a Logging interface,

125     o  Section 7 of [I-D.ietf-cdni-requirements] specifies a set of
126        requirements for Logging,

128     o  [I-D.ietf-cdni-use-cases] outlines real-world use cases for
129        interconnecting CDNs. These use cases require the exchange of
130        Logging information between the dCDN and the uCDN.

132     The present document describes:

134     o  The Logging framework (Section 2),

136     o  The architecture (Section 3),

138     o  The requirements (Section 4),

140     o  A discussion of monitoring and reporting (Section 5),

142     o  Log information (Section 6 and Section 7).

144  1.1. Terminology

146     We adopt the terminology described in
147     [I-D.ietf-cdni-problem-statement] and [I-D.davie-cdni-framework], and
148     extend it with the additional terms defined below.

150     For clarity, we use the word "Log" only to refer to internal CDN
151     logs, and we use the word "Logging" for any inter-CDN information
152     exchange and processing operations related to the CDNI Logging
        interface.

154     Log: CDN internal information collection and processing operations.

156     Logging: Inter-CDN information exchange and processing operations.

158     Small object: [Ed. Note: TBD]

160     Fragmented object: [Ed. Note: Tentative simple definition that fits
161     the current CDNI charter] Fragmented objects are pieces of
162     content provided by a CSP which are delivered individually through a
163     CDN interconnection. They differ from a simple object because the
164     delivery of the content to one user agent may be provided by more
165     than one Surrogate/CDN.

167     CDN Reporting: the process of providing the relevant information that
168     will be used to create a formatted content delivery report provided
169     to the CSP in deferred time. Such information typically includes
170     aggregated data that can cover a long period of time (e.g., from
171     hours to several months). One of the usages of reporting is the
172     collection of charging data related to CDN services and the
173     computation of Key Performance Indicators (KPIs).

175     CDN Monitoring: the process of providing content delivery information
176     in real time. Monitoring typically includes real-time data that
177     provide a view of the deliveries in progress, for service operation
178     purposes. It presents a view of the global health of the services as
179     well as information on usage and performance, for network services
180     supervision and operation management. In particular, monitoring data
181     can be used to generate alarms.

183     Core log information: minimal information that has to be logged to
184     satisfy the Logging requirements.

186     End-user experience management: study of Logging data using
187     statistical analysis to discover, understand, and predict user
188     behavior patterns.

190     Usage data: all the information related to a specific end-user
191     session.

193     Delivery service: [Ed. Note: to be defined]

195  1.2. Abbreviations

197     [Ed. Note: List of abbreviations to be updated later]

199     o  API: Application Programming Interface

201     o  CDN: Content Delivery Network

203     o  CDNP: Content Delivery Network Provider

205     o  CoDR: Content Delivery Record

207     o  CSP: Content Service Provider

208     o  DASH: Dynamic Adaptive Streaming over HTTP

210     o  dCDN: downstream CDN

212     o  FTP: File Transfer Protocol

214     o  FTPS: FTP Secure

216     o  HAS: HTTP Adaptive Streaming

218     o  KPI: Key Performance Indicator

220     o  PVR: Personal Video Recorder

222     o  SNMP: Simple Network Management Protocol

224     o  uCDN: upstream CDN

226  2.
Logging Framework

228     The framework of the Logging interface is straightforward: the dCDN
229     logs any information related to the completion of any task that it
230     performs on behalf of a uCDN, and any exchange related to the
231     management of the content that it delivers on behalf of that uCDN, as
232     discussed in Section 7.1.

234  3. Architecture

236     Logging is a mandatory feature for a CDN, especially if the CDN is
237     interconnected to other CDNs. Logging provides the raw material for
238     some essential operations of a delivery service, such as monitoring,
239     reporting, billing, etc.

241     As stated in [I-D.ietf-cdni-problem-statement], "the CDNI Logging
242     interface enables details of logs or events to be exchanged between
243     interconnected CDNs".

245     Figure 1 provides an example of Logging information exchanges. uCDN
246     is connected to dCDN-1 and dCDN-2. dCDN-1, dCDN-2, and uCDN all
247     deliver content for the CSP. The Logging interface enables the uCDN
248     to obtain Logging data from dCDN-1 and dCDN-2. In the example, uCDN
249     uses the Logging data:

251     o  to audit the performance of the delivery operated by the dCDNs and
252        to adjust its request routing as appropriate,

254     o  to provide reporting (non real-time) and monitoring (real-time)
255        information to the CSP.

257     For instance, uCDN merges Logging data, extracts relevant KPIs, and
258     presents a formatted report to the CSP, in addition to a bill for the
259     content delivered. uCDN may also provide Logging data as raw logs to
260     the CSP, so that the CSP can use its own Logging analysis tools.

262                          +-----+
263                          | CSP |
264                          +-----+
265                             ^
266                             |
267                             | Reporting and monitoring data
268                             | Billing
269                             |
270                          ,--,--.
271                       ,-'       `-.
272                CoDR  (    uCDN     )  CoDR
273                ....>(               )<....
274                |     (             )     |
275                |      (    RRi    )      |
276                |      `-. Tuning ,-'     |
277                |          -|-|-'         |
278                |           | |           |
279                |           | |           |
280             ,--|--.        | |        ,--|--.
281          ,-'       `-.     | |     ,-'       `-.
282         (   dCDN-1    <----+ + --->    dCDN-2   )
283          `-.       ,-'             `-.       ,-'
284             `--'--'                   `--'--'

286                 Figure 1: Exchange of Logging Information

288     Figure 2 presents the Logging Architecture. More details on the
289     Logging operations are provided in Section 8. A dCDN prepares the
290     CoDRs requested by the uCDN. This preparation involves operations
291     such as filtering, aggregating, anonymizing, and summarizing the
292     logs. The uCDN downloads the corresponding CoDRs and performs its
293     own reporting for the CSP.

295     --------
296    /        \
297    |  CSP   |
298    \        /
299     --^-----
300       ^
301       ^ Reporting, Monitoring, Billing
302       ^
303    ---^---------------------                    -------------------------
304   /   ^      Upstream CDN   \                  /     Downstream CDN      \
305   |+-----+  +-------------+ |                  | +-------------+  +-----+|
306   ||     |**|   Control   | |                  | |   Control   |**|     ||
307   ||     |  +-------------+ |                  | +-------------+  | I   ||
308   || I   |                  |  CoDR selection  |                  | n   ||
309   || n   |  +-------------+ |----------------->| +-------------+  | t   ||
310   || t   |<<|   Logging   | |                  | |   Logging   |<<| e   ||
311   || e   |  +-------------+ |<-----------------| +-------------+  | r   ||
312   || r   |                  |      CoDRs       |                  | c L ||
313   || c L |                  |                  |                  | o o ||
314   || o o |  +-------------+ |                  | +-------------+  | n g ||
315   || n g |<<|Req-Routing  | |                  | |Req-Routing  |>>| n i ||
316   || n i |  +-------------+ |                  | +-------------+  | e c ||
317   || e c |                  |                  |                  | c   ||
318   || c   |  +-------------+ |                  | +-------------+  | t   ||
319   || t   |<<|  Metadata   | |                  | |  Metadata   |>>| i   ||
320   || i   |  +-------------+ |                  | +-------------+  | o   ||
321   || o   |                  |                  |                  | n   ||
322   || n   |  +-------------+ |                  | +-------------+  |     ||
323   ||     |<<| Distribution| |******************| | Distribution|>>|     ||
324   |+-----+  +-------------+ |   Acquisition    | +-------------+  +-----+|
325   \                         /                  \         . *             /
326    -------------------------                    ---------.-*-------------
327                  .                                       . *
328                  . Request                               . * Delivery
329                  .                                       . *
330                  .                                    +--.-*--+
331                  ..................Request............| User  |
332                                                       | Agent |
333                                                       +-------+

335                       Figure 2: Logging Architecture

337     Logging information elements may be captured at various stages during
338     the lifecycle of content distribution. The arrows (">>") in
339     Figure 2 represent the direction of information elements in the
340     Logging process. They illustrate several important aspects:

342     o  An information element may be captured by a uCDN, by a dCDN, or
343        by both;

345     o  An information element can be collected on an interface other than
346        the Logging interface (e.g., the uCDN's Request-Routing interface);

348     o  Information elements can be collected before the exchange of
349        CoDRs.

351     These points are further discussed in Section 9.

353  4. Additional Requirements

355     Section 7 of [I-D.ietf-cdni-requirements] already specifies a set of
356     requirements for Logging (LOG-1 to LOG-16). Some security
357     requirements also affect Logging (e.g., SEC-4).

359     [Ed. Note: uCDN shall be able to select the type of events that a
360     dCDN should include in the Logging that the latter provides to the
361     uCDN.]

363  5. Rationale for Logging Interface

365     [I-D.davie-cdni-framework] and [I-D.ietf-cdni-problem-statement]
366     already introduce the rationale for the Logging interface as a means
367     for a uCDN to gain visibility into the content that the dCDN
368     delivers on its behalf. The dCDN provides the uCDN with information
369     elements and CoDRs for operating the CDN interconnection and
370     reporting to the CSP. This section develops use cases that require
371     the exchange of Logging information.

373  5.1. Usages of CDNI Logging Information By uCDN

375     This section presents the usage of the CoDRs by a uCDN. It does not
376     make any assumption about where the CoDRs are produced. CoDRs may be
377     produced either by the uCDN or by a dCDN.

379  5.1.1. Maintenance/Debugging

381     Logging is useful to permit the detection (and limit the risk) of
382     content delivery failures.
In particular, Logging facilitates the
383     resolution of misconfiguration issues.

385     To detect faults, Logging must enable the reporting of the success or
386     failure of any CDN operation, such as request redirection, content
387     acquisition, etc. Such information can be summarized into KPIs. For
388     instance, the Logging format should allow computing the number of
389     times that content deliveries related to a specific service succeed
390     or fail during a given period.

392     This need is taken into account in the events triggering log entries,
393     which are listed in Section 7.

395     Logging is useful to analyze the performance of content delivery
396     services. This implies computing KPIs from the Logging data for
397     service quality analysis and monitoring (see Section 5.3).

399     Logging enables the CDN providers to evaluate the QoS level related
400     to a specific delivery service. For instance, one aspect of this QoS
401     level could be measured through the average delivery throughput
402     experienced by end-users in a given region for this specific service
403     over a period of time.

405     Logging enables the CDN providers to identify and troubleshoot
406     performance degradations. In particular, Logging enables the
407     communication of traffic data (e.g., the amount of traffic that has
408     been forwarded by a dCDN on behalf of a uCDN over a given period of
409     time), which is particularly useful for CDN and network planning
410     operations.

412  5.1.2. Accounting

414     Logging is essential for accounting, to permit inter-CDN billing and
415     CSP billing by the uCDN. For instance, Logging enables the uCDN to
416     check the total amount of traffic delivered by every dCDN and for
417     every delivery service, as well as the associated bandwidth usage
418     (e.g., peak, 95th percentile), and the maximum number of simultaneous
419     sessions over a given period of time.

421  5.1.3.
End-User Experience Management

423     The goal of end-user experience management is to gather any
424     information relevant to measuring audience, analyzing user behavior,
425     etc. For instance, Logging enables the CDN providers to report on
426     content consumption (e.g., delivered sessions per content) in a
427     specific geographic area.

429  5.1.4. Security

431     The goal of security is to prevent and monitor unauthorized access,
432     misuse, modification, and denial of access to a service. A set of
433     information is logged for security purposes. In particular, access
434     to content is usually logged to permit the CSP to detect
435     infringements of content delivery policies and other abnormal end-
436     user behaviors.

438  5.2. Logging Information Views

440     Logging information is useful to the uCDN and potentially to the CSP.
441     Different views of the Logging information may be provided depending
442     on privacy, business, and scalability constraints. A uCDN MAY
443     support some form of format adaptation in order to present data
444     (e.g., filtered or aggregated) to the CSP in the appropriate
445     format (raw logs, reports). More details on these
446     operations are provided in Section 8.

448     We provide a non-exhaustive list of tools that can be fed with
449     Logging information:

451     o  Tools used by the uCDN's operator: billing tools (information
452        system), customer experience intelligence, reporting tools,
453        security auditing tools, dimensioning tools, strategic planning
454        and investment tools, etc.

456     o  Tools used by CSPs: customer experience management tools,
457        reporting tools, security auditing tools, etc.

459  5.3. Information Extracted From Logging Data

461     This section presents, for explanatory purposes, a non-exhaustive
462     list of information that can be extracted/produced from logs.
463     Depending on the inter-CDN agreement, this information may be
464     computed by the uCDN or by the dCDN.

466     CSPs require specific information, such as KPIs, about the delivery
467     of their content. The Logging data must contain appropriate
468     information to enable CSPs or the uCDN to extract the required KPIs.
469     In the present section, we list important examples of KPIs:

471     o  Number of delivery requests received from end-users in a given
472        region for each piece of content, during a given period of time
473        (e.g., hour/day/week/month),

475     o  Percentage of delivery successes / failures among the
476        aforementioned requests,

478     o  Number of failures listed by failure type (e.g., HTTP error code)
479        for requests received from end-users in a given region and for
480        each piece of content, during a given period of time (e.g., hour/
481        day/week/month),

483     o  Number and causes of premature delivery terminations for end-users
484        in a given region and for each piece of content, during a given
485        period of time (e.g., hour/day/week/month),

487     o  Maximum and mean number of simultaneous sessions established by
488        end-users in a given region, for a given delivery service, and
489        during a given period of time (e.g., hour/day/week/month),

491     o  Volume of traffic delivered for sessions established by end-users
492        in a given region, for a given delivery service, and during a
493        given period of time (e.g., hour/day/week/month),

495     o  Maximum, mean, and minimum delivery throughput for sessions
496        established by end-users in a given region, for a given delivery
497        service, and during a given period of time (e.g., hour/day/week/
498        month),

500     o  Cache-hit and byte-hit ratios for requests received from end-users
501        in a given region for each piece of content, during a given period
502        of time (e.g., hour/day/week/month),

504     o  The 10 most requested pieces of content (broken down by
505        day/week/month),

507     o  Terminal type (mobile, PC, STB), if this information can be
508        acquired, for example, from the browser type header.
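
As an illustration only, a few of the KPIs listed above can be derived from delivery records as in the following Python sketch. The record type `DeliveryRecord`, its field names, the `region` attribute (assumed here to have been derived from the end-user's IP address beforehand), and the rule that 2xx/3xx status codes count as successes are all assumptions of this sketch; none of them is defined by this memo.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class DeliveryRecord:
    """Hypothetical record carrying a few information elements of Section 6."""
    region: str       # assumed to be derived from the end-user's IP address
    status: int       # "Status" element, e.g., an HTTP status code
    body_size: int    # "Body size" element, in bytes
    cache_bytes: int  # "Cache bytes" element, in bytes


def compute_kpis(records):
    """Compute a few KPIs for records of one region and one period of time."""
    total = len(records)
    # Assumption of this sketch: 2xx and 3xx status codes are successes.
    successes = sum(1 for r in records if 200 <= r.status < 400)
    failures = Counter(r.status for r in records if r.status >= 400)
    volume = sum(r.body_size for r in records)
    cached = sum(r.cache_bytes for r in records)
    return {
        "requests": total,
        "success_pct": 100.0 * successes / total if total else 0.0,
        "failures_by_status": dict(failures),
        "volume_bytes": volume,
        "byte_hit_ratio": cached / volume if volume else 0.0,
    }


records = [
    DeliveryRecord("FR", 200, 1000, 1000),
    DeliveryRecord("FR", 200, 3000, 0),
    DeliveryRecord("FR", 404, 0, 0),
    DeliveryRecord("FR", 503, 0, 0),
]
kpis = compute_kpis(records)
print(kpis)
```

Whether such a computation runs at the dCDN or at the uCDN depends on the inter-CDN agreement, as noted at the beginning of this section.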

510     Additional KPIs can be computed from sources of information other
511     than Logging, for instance, data collected by a content portal or
512     by specific client-side APIs.

514  6. Log Information Elements

516     CDNI must specify a set of Logging information elements to avoid
517     having to regenerate logs in a different format, which would affect
518     the performance of the log handling chain. A common set of Logging
519     information elements eases the sharing of logs among the CDNs and the
520     use of log processing tools, for instance, to prepare reporting.

522     The Logging functions of existing CDNs collect and consolidate the
523     logs produced by their Surrogates. Surrogates usually store the logs
524     using a format derived from Web server log formats such as the W3C
525     and NCSA formats [ELF] [CLF]. In practice, these formats are adapted
526     to cope with CDN specifics. Appendix A presents the W3C and NCSA log
527     formats.

529  6.1. Core Information Elements

531     This section describes a set of information elements that structure
532     log information generated by the dCDN. The section does not
533     prescribe a particular encoding (such as SNMP SMI or alternatives).
534     All fields in the log information are optional unless stated
535     otherwise.

537     +--------+----------------------------------------------------------+
538     | Name   | Description                                              |
539     +--------+----------------------------------------------------------+
540     | Time   | A date and time associated with a logged event. For      |
541     |        | instance, the time that the server finished processing   |
542     |        | the request.                                             |
543     | URI_lo | The requested URL path (e.g.,                            |
544     | g      | /cdn.csp.com/movies/potter.avi?param=11&user=toto if the |
545     |        | full request URL was                                     |
546     |        | "http://node1.peer-a.op-b.net/cdn.csp.com/movies/potter. |
547     |        | avi?param=11&user=toto"). The URI without hostname       |
548     |        | typically includes the "CDN domain" (ex.cdn.csp.com) -   |
549     |        | cf. [I-D.davie-cdni-framework]: it enables the           |
550     |        | identification of the CSP service agreed between the     |
551     |        | CSP and the CDNP operating the uCDN.                     |
552     | Protoc | The protocol and protocol version of the message that    |
553     | ol     | triggered the log entry.                                 |
554     | Reques | The protocol method of the request message that          |
555     | t      | triggered the log entry.                                 |
556     | metho  |                                                          |
557     | d      |                                                          |
558     | Status | The status code of the reply message related to the      |
559     |        | log entry.                                               |
560     | Body   | The number of bytes in the body of the reply message     |
561     | size   | related to the log entry. It does not include the size   |
562     |        | of the response headers.                                 |
563     | Bytes  | The number of bytes (headers + body) of the message that |
564     | receiv | triggered the log entry.                                 |
565     | ed     |                                                          |
566     | Header | Multiple header fields, such as User Agent or Referrer,  |
567     | s      | could be reproduced in the log entries.                  |
568     | Durati | The duration of an operation in milliseconds. For        |
569     | on     | instance, this field could be used to provide the time   |
570     |        | it took the Surrogate to send the requested file to      |
571     |        | the end-user, or the time it took the Surrogate to       |
572     |        | acquire the file on a cache-miss event.                  |
573     | Operat | The kind of operation that is logged; for instance,      |
574     | ion    | Acquisition, Delivery, or Purging.                       |
575     +--------+----------------------------------------------------------+

576                   Table 1: Core Information Elements

578     Table 2 illustrates the core information elements. It provides
579     examples using Apache log format strings
580     [apache] when they exist. The table is here for illustration and
581     does not prescribe a specific encoding.
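
For illustration, the sketch below parses one log line into a record holding some of the core information elements of Table 1, reusing the Apache format strings cited above. The line layout (`%t %m %H %>s %b \"%{Referer}i\" \"%{User-agent}i\"`), the `CoreLogRecord` type, and the `parse_line` function are hypothetical choices made for this sketch only; this memo does not prescribe an encoding.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Assumed Apache LogFormat for this sketch (not defined by this memo):
#   %t %m %H %>s %b \"%{Referer}i\" \"%{User-agent}i\"
LINE_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] (?P<method>\S+) (?P<protocol>\S+) '
    r'(?P<status>\d{3}) (?P<body_size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


@dataclass
class CoreLogRecord:
    """Hypothetical container for some core information elements."""
    time: datetime
    protocol: str
    request_method: str
    status: int
    body_size: Optional[int]  # None when Apache logs "-" (no body sent)
    headers: dict


def parse_line(line: str) -> CoreLogRecord:
    """Parse one log line written with the assumed format above."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError("line does not match the assumed format")
    size = m.group("body_size")
    return CoreLogRecord(
        time=datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        protocol=m.group("protocol"),
        request_method=m.group("method"),
        status=int(m.group("status")),
        body_size=None if size == "-" else int(size),
        headers={"Referer": m.group("referer"),
                 "User-Agent": m.group("user_agent")},
    )


sample = ('[10/Oct/2000:13:55:36 -0700] GET HTTP/1.0 200 2326 '
          '"http://www.example.com/start.html" '
          '"Mozilla/4.08 [en] (Win98; I ;Nav)"')
record = parse_line(sample)
print(record.status, record.body_size, record.time.isoformat())
```

Elements without an Apache format string in the examples of this section (URI_log, Bytes received, Duration, Operation) are left out of the sketch on purpose.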

583     +----------+-------------------+------------------------------------+
584     | Name     | String            | Example                            |
585     +----------+-------------------+------------------------------------+
586     | Time     | %t                | [10/Oct/2000:13:55:36 -0700]       |
587     | URI_log  | -                 | -                                  |
588     | Protocol | %H                | HTTP/1.0                           |
589     | Request  | %m                | GET                                |
590     | method   |                   |                                    |
591     | Status   | %>s               | 200                                |
592     | Body     | %b                | 2326                               |
593     | size     |                   |                                    |
594     | Bytes    | -                 | -                                  |
595     | received |                   |                                    |
596     | Headers  | \"%{Referer}i\"   | "http://www.example.com/start.html |
597     |          | \"%{User-agent}i\ | ""Mozilla/4.08 [en] (Win98; I      |
598     |          | "                 | ;Nav)"                             |
599     | Duration | -                 | -                                  |
600     | Operatio | -                 | -                                  |
601     | n        |                   |                                    |
602     +----------+-------------------+------------------------------------+

604                  Table 2: Examples using Apache format

606  6.2. Information Elements for Content Delivery

608     +-------------+-----------------------------------------------------+
609     | Name        | Definition                                          |
610     +-------------+-----------------------------------------------------+
611     | uCDN        | An element authenticating the operator of the uCDN  |
612     | identifier  | as the authority having delegated the request to    |
613     |             | the dCDN                                            |
614     | End-user's  | The IP address of the client making a content       |
615     | IP address  | delivery request (or of its proxy)                  |
616     | Cache bytes | The number of body bytes served from caches. This   |
617     |             | quantity permits the computation of the byte hit    |
618     |             | ratio.                                              |
619     +-------------+-----------------------------------------------------+

621                  Table 3: Delivery Information Elements

623  6.3.
Information Elements for Content Acquisition

625     +------------+------------------------------------------------------+
626     | Name       | Definition                                           |
627     +------------+------------------------------------------------------+
628     | dCDN       | An element authenticating the operator of the dCDN   |
629     | identifier | as the authority requesting content from the uCDN    |
630     +------------+------------------------------------------------------+

632                 Table 4: Acquisition Information Elements

634  6.4. Log Extensibility

636     Future usages might introduce the need for additional Logging data.
637     In addition, some use cases, such as an Inter-Affiliate
638     Interconnection [I-D.ietf-cdni-use-cases], might take advantage of
639     extended Logging exchanges. Therefore, it is important to permit
640     CDNs to use Logging fields in addition to the standard ones, if they
641     wish.

643  7. Core Logging Records

645     This section defines a set of central events that a dCDN should
646     register and publish through the Logging interface. There are two
647     types of events. The first category corresponds to the access and
648     error logs of legacy Web servers. The second is directly tied to the
649     auditing of the CDN interconnection.

651     We classify the logged events depending on the CDN operation to which
652     they relate: content delivery, content acquisition, content purging,
653     etc.

655     Future versions of this memo will associate a CoDR with each event.

657  7.1. Content Delivery

659     Some CSPs pay a lot of attention to the protection of their content
660     (e.g., premium video CSPs). To fulfill the needs of these CSPs, a
661     CDN shall log all the details of the content delivery authorizations.
662     This means that a dCDN must be able to provide logs detailing the
663     content delivery/content acquisition authorizations and denials as
664     well as information on why the request is authorized/denied.

666     The events triggering the generation of a log record include:

668     o  Reception of a content request.

669     The generated log record typically embeds information about:

671     o  Denial of delivery (error or unauthorized request) for a request,

673     o  Beginning of delivery (authorization) of a requested content,

675     o  End of an authorized delivery (success),

677     o  End of an authorized delivery (failure).

679  7.2. Content Acquisition

681     If the uCDN requires the dCDN to log acquisition-related events,
682     the events triggering the generation of a log record include:

684     o  Emission of a content acquisition request (first try or retry) for
685        a cache hit or a cache miss with content revalidation.

687     The generated log record typically embeds information about:

689     o  Reception of a reply indicating denial of delivery (error or
690        unauthorized request) for a content acquisition request,

692     o  End of an authorized acquisition (success),

694     o  End of an authorized acquisition (failure).

696     Note that a dCDN may acquire content only from the uCDN. In this
697     case, the uCDN can log the dCDN's content acquisition operations
698     itself, and thus, the uCDN typically does not require the dCDN to log
699     acquisition-related events.

701  7.3. Content Purging

703     The purging of a piece of content is typically requested by the uCDN,
704     which can, therefore, log events related to purging. If the
705     uCDN nevertheless requests a dCDN to log purging events, the events
706     triggering the generation of a log record include:

708     o  Reception of a content purging request.

710     The generated log record typically embeds information about:

712     o  Denial of the purging request (error or unauthorized request),

714     o  Beginning of purging (authorization) for a given content purging
715        request,

717     o  End of an authorized purging (success),

719     o  End of an authorized purging (failure).

721  7.4. Extended CoDRs

723     The required Logging information may depend on the considered
724     services.
   For instance, static file delivery (e.g., pictures) typically does
   not include any delivery restrictions.  By contrast, video delivery
   typically implies strong content delivery restrictions, as explained
   in [I-D.ietf-cdni-use-cases], and Logging could include information
   about the enforcement of these restrictions.  Therefore, to ease the
   support of different services as well as future services, the Logging
   interface should support optional log information.

8.  Logging Process

   We walk through a "day in the life" of a CDN interconnection to
   present functions that the two CDNs may require to exchange Logging
   information.  This will serve to illustrate many of the functions
   that could be supported through the CDNI Logging interface.  We
   describe capabilities, such as log aggregation, anonymization, and
   filtering, that might be added to CDNI at a later stage to optimize
   Logging operations.

8.1.  Logging Aggregation

   CDNs typically handle millions of records per day.  The processing of
   these records to extract relevant monitoring and reporting
   information is expensive in terms of CPU and time.  Therefore, as
   stated in [I-D.davie-cdni-framework], "a design tradeoff in the
   Logging interface is the degree of aggregation or summarization of
   data."

   In particular, dCDNs aggregate the logs of their elements (e.g., the
   Surrogates) to avoid both the complexity of distributing multiple log
   files to the uCDN and the disclosure of information about the dCDN's
   internal topology.  This aggregation alleviates the Logging
   processing burden for the uCDN.

   [Ed. Note: In a later version, the draft will propose methods to
   optimize the amount of information transmitted (e.g., transmit only
   KPIs, use multiple levels of log granularity such as in Apache
   (debug, notice, etc.)).]

8.1.1.  Logging and Fragmented Objects

   Many situations lead to the delivery of fragments of content (DASH,
   failure of delivery, partial delivery, PVR actions, etc.).  A dCDN
   might choose not to publish a CoDR for each piece of content it
   delivers, because this can lead to unacceptably large logs.  Instead,
   a CoDR could provide aggregated information about the delivery of
   several content pieces.  The uCDN and the dCDN must be able to agree
   on a level of granularity for the CoDRs.  This problem is well
   described for the case of HTTP adaptive streaming in
   [I-D.davie-cdni-framework]:

   "Most schemes to deliver HTTP-based adaptive bit-rate video use a
   large number of relatively small HTTP requests (e.g., one request per
   3-second chunk of video.)  It may be desirable to aggregate Logging
   information so that a single log entry is provided for the entire
   video rather than for each chunk.  Note however that such aggregation
   requires a degree of application awareness in dCDN to recognize that
   the many HTTP requests correspond to a single video."

8.2.  Logging Protection

8.2.1.  Logging Signing

   CDNs need guarantees on the integrity of logs.  They want to know:

   o  who issued the Logging, and

   o  whether the Logging has been modified by a third party.

   This is extremely important, as the logs can provide a basis for
   accounting/billing.

   [Ed. note: propose a mechanism to authenticate the Logging origin]

   [Ed. note: cf. requirements draft: "SEC-4 [MED] The CDNI solution
   should be able to ensure that the Downstream CDN cannot spoof a
   transaction log attempting to appear as if it corresponds to a
   request redirected by a given Upstream CDN when that request has not
   been redirected by this Upstream CDN.  This ensures non-repudiation
   by the Upstream CDN of transaction logs generated by the Downstream
   CDN for deliveries performed by the Downstream CDN on behalf of the
   Upstream CDN."]

8.3.  Logging Filtering

   The dCDN must be able to present only relevant information to the
   uCDN, to avoid unnecessary log processing load for the uCDN.  Hence,
   the downstream CDN filters its logs, and passes the relevant records
   directly to each upstream CDN.  This requires that the downstream CDN
   can recognize the set of log entries that relate to each upstream
   CDN, for instance thanks to the "uCDN identifier" information element
   (see Table 3).

   The dCDN must be able to filter out some internal-scope data, such as
   information related to its internal alarms (security, failures, load,
   etc.).

   In some use cases described in [I-D.ietf-cdni-use-cases], the
   interconnected CDNs do not want to disclose details on their internal
   topology.  The dCDN must be able to filter out confidential data on
   its topology (number of servers, location, etc.).  In particular,
   information about the requests served by every Surrogate is
   confidential.  Therefore, the Logging information must be protected
   so that data such as Surrogates' hostnames is not disclosed to the
   uCDN.  In the "Inter-Affiliates Interconnection" use case, this
   information may be disclosed to the uCDN because both the dCDN and
   the uCDN are operated by entities of the same group.

8.4.  Logging Update and Rectification

   If Logging is generated periodically, it is important that the
   sessions that start in one Logging period and end in another are
   correctly reported.  If they are reported in the starting period,
   then the Logging of this period will be available only after the end
   of the session, which delays the Logging generation.

   A Logging rectification/update mechanism could be useful to reach a
   good tradeoff between the Logging generation delay and the Logging
   accuracy.  Such a mechanism would be particularly valuable for
   real-time Logging, which must be provided rapidly and cannot wait for
   the end of operations in progress.
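   One possible shape for such a rectification mechanism is sketched
   below in Python, purely as an illustration.  It assumes that the dCDN
   publishes a provisional record for a session that is still in
   progress at the end of a Logging period, and a final record in the
   period in which the session ends.  The record layout and the function
   name are hypothetical; this memo does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionRecord:
    # Hypothetical fields, chosen for this illustration only.
    session_id: str
    period: int       # index of the Logging period that carries the record
    bytes_sent: int   # cumulative bytes delivered when the record was made
    final: bool       # False while the session is still in progress


def apply_records(records):
    """Build the receiver's view: one record per session.

    Records are processed in period order.  A provisional record is
    replaced (rectified) by any later record for the same session; once
    a final record has been seen, later records are ignored.
    """
    view = {}
    for record in sorted(records, key=lambda r: r.period):
        previous = view.get(record.session_id)
        if previous is None or not previous.final:
            view[record.session_id] = record
    return view


received = [
    SessionRecord("s1", period=1, bytes_sent=1000, final=False),
    SessionRecord("s2", period=1, bytes_sent=500, final=True),
    SessionRecord("s1", period=2, bytes_sent=4200, final=True),
]
current_view = apply_records(received)
```

   With this approach, the Logging of period 1 can be published without
   waiting for session "s1" to end, and the record carried in period 2
   corrects the provisional figures.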
9.  Protocols for Logging

   This section discusses the encoding and the protocols for
   transporting Logging information.

   CDNs usually store the logs in a format similar to the ones in use by
   web servers, such as the W3C, NCSA, and Squid log formats, which are
   described in Appendix A.

   As presented in [I-D.ietf-cdni-problem-statement], several protocols
   already exist that could potentially be used to exchange CDNI Logging
   between interconnected CDNs.  The dCDN could publish non-real-time
   Logging on a server where the uCDN would retrieve it using FTP, FTPS,
   or syslog.  If the CDNs need to exchange real-time information
   through the Logging interface, they could potentially rely on Web
   APIs, syslog, SNMP, etc.  However, as explained in
   [I-D.ietf-cdni-problem-statement], "SNMP traps pose scalability
   concerns and SNMP does not support guaranteed delivery of Traps and
   therefore could result in log records being lost and the consequent
   CoDRs and billing records for that content delivery not being
   produced as well as that content delivery being invisible to any
   analytics platforms."

   [Ed. Note: in a later version, this memo will include an analysis of
   candidate protocols, based upon a set of (basic) requirements, such
   as reliable transport mode, preservation of the integrity of the
   information conveyed by the protocol, etc.]

10.  IANA Considerations

   This memo includes no request to IANA.

11.  Security Considerations

11.1.  Privacy

   CDNs have the opportunity to collect detailed information about the
   downloads performed by end users.  The provision of this information
   to another CDN introduces end-user privacy protection concerns.

11.2.  Non-Repudiation

   Logging provides the raw material for charging.  It permits the dCDN
   to bill the uCDN for the content deliveries that the dCDN makes on
   behalf of the uCDN.
   It also permits the uCDN to bill the CSP for the content delivery
   service.  Therefore, non-repudiation of Logging data is essential.

12.  Acknowledgments

   The authors would like to thank Anne Marrec, Yannick Le Louedec, and
   Christian Jacquenet for detailed feedback on early versions of this
   document and for their input on existing Log formats.

   The authors would also like to thank Fabio Costa, Yvan Massot, Renaud
   Edel, and Joel Favier for their input and comments.

   Finally, they thank the contributors of the EU FP7 OCEAN project for
   valuable inputs.

13.  References

13.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

13.2.  Informative References

   [CLF]      A. Luotonen, "The Common Logfile Format, W3C (work in
              progress)", 1995, .

   [ELF]      Phillip M. Hallam-Baker and Brian Behlendorf, "Extended
              Log File Format, W3C (work in progress),
              WD-logfile-960323", .

   [I-D.bertrand-cdni-experiments]
              Bertrand, G., Le Faucheur, F., and L. Peterson, "Content
              Distribution Network Interconnection (CDNI) Experiments",
              draft-bertrand-cdni-experiments-01 (work in progress),
              August 2011.

   [I-D.davie-cdni-framework]
              Davie, B. and L. Peterson, "Framework for CDN
              Interconnection", draft-davie-cdni-framework-01 (work in
              progress), October 2011.

   [I-D.ietf-cdni-problem-statement]
              Niven-Jenkins, B., Le Faucheur, F., and N. Bitar, "Content
              Distribution Network Interconnection (CDNI) Problem
              Statement", draft-ietf-cdni-problem-statement-03 (work in
              progress), January 2012.

   [I-D.ietf-cdni-requirements]
              Leung, K. and Y. Lee, "Content Distribution Network
              Interconnection (CDNI) Requirements",
              draft-ietf-cdni-requirements-02 (work in progress),
              December 2011.

   [I-D.ietf-cdni-use-cases]
              Bertrand, G., Stephan, E., Watson, G., Burbridge, T.,
              Eardley, P., and K. Ma, "Use Cases for Content Delivery
              Network Interconnection", draft-ietf-cdni-use-cases-03
              (work in progress), January 2012.

   [RFC3444]  Pras, A. and J. Schoenwaelder, "On the Difference between
              Information Models and Data Models", RFC 3444,
              January 2003.

   [RFC3466]  Day, M., Cain, B., Tomlinson, G., and P. Rzewski, "A Model
              for Content Internetworking (CDI)", RFC 3466,
              February 2003.

   [RFC3568]  Barbir, A., Cain, B., Nair, R., and O. Spatscheck, "Known
              Content Network (CN) Request-Routing Mechanisms",
              RFC 3568, July 2003.

   [apache]   "Apache 2.2 log files documentation", Feb. 2012, .

   [squid]    "Squid LogFormat documentation", Feb. 2012, .

Appendix A.  Examples of Log Formats

   This section provides examples of log formats implemented in existing
   CDNs, web servers, and caching proxies.

   Web servers (e.g., Apache) maintain at least one log file for logging
   accesses to content (the Access Log).  They can typically be
   configured to log errors in a separate log file (the Error Log).  The
   log formats can be specified in the server's configuration files.
   However, webmasters often use standard log formats to ease the log
   processing with available log analysis tools.

A.1.  W3C Common Log File (CLF) Format

   The Common Log File (CLF) format defined by the World Wide Web
   Consortium (W3C) working group is compatible with many log analysis
   tools and is supported for the Access Logs of the main web servers
   (e.g., Apache).

   According to [CLF], the common logfile format is as follows:

      remotehost rfc931 authuser [date] "request" status bytes
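   A line in this format can be split into its fields with a regular
   expression.  The Python fragment below is a minimal sketch rather
   than a reference parser: the pattern and the function name are
   assumptions made for illustration, the sample line is the Apache
   example quoted in this section, and the handling of "-" as "no bytes
   transferred" follows common Apache practice.

```python
import re
from datetime import datetime

# One named group per CLF field:
# remotehost rfc931 authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<remotehost>\S+) (?P<rfc931>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)


def parse_clf(line):
    """Parse one CLF line into a dict; raise ValueError if it is malformed."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        raise ValueError("not a CLF line: %r" % line)
    fields = match.groupdict()
    # Example date: 10/Oct/2000:13:55:36 -0700
    fields["date"] = datetime.strptime(fields["date"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    # "-" means that no byte count was logged.
    fields["bytes"] = None if fields["bytes"] == "-" else int(fields["bytes"])
    return fields


sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
record = parse_clf(sample)
```

   The named groups mirror the field names of the format line, so that
   the resulting dictionary can be compared directly with the field
   definitions given in this section.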
   Example (from [apache]):

      127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700]
      "GET /apache_pb.gif HTTP/1.0" 200 2326

   The fields are defined as follows [CLF]:

   +------------+------------------------------------------------------+
   | Element    | Definition                                           |
   +------------+------------------------------------------------------+
   | remotehost | Remote hostname (or IP number if DNS hostname is not |
   |            | available, or if DNSLookup is Off).                  |
   | rfc931     | The remote logname of the user.                      |
   | authuser   | The username that the user employed to authenticate  |
   |            | himself.                                             |
   | [date]     | Date and time of the request.                        |
   | "request"  | An exact copy of the request line that came from the |
   |            | client.                                              |
   | status     | The status code of the HTTP reply returned to the    |
   |            | client.                                              |
   | bytes      | The content-length of the document transferred.      |
   +------------+------------------------------------------------------+

                Table 5: Information elements in CLF format

A.2.  W3C Extended Log File (ELF) Format

   The Extended Log File (ELF) format defined by W3C extends the CLF
   with new fields.  This format is supported by Microsoft IIS 4.0 and
   5.0.

   The supported fields are listed below [ELF].
   +------------+---------------------------------------------------+
   | Element    | Definition                                        |
   +------------+---------------------------------------------------+
   | date       | Date at which transaction completed               |
   | time       | Time at which transaction completed               |
   | time-taken | Time taken for transaction to complete in seconds |
   | bytes      | Bytes transferred                                 |
   | cached     | Records whether a cache hit occurred              |
   | ip         | IP address and port                               |
   | dns        | DNS name                                          |
   | status     | Status code                                       |
   | comment    | Comment returned with status code                 |
   | method     | Method                                            |
   | uri        | URI                                               |
   | uri-stem   | Stem portion alone of URI (omitting query)        |
   | uri-query  | Query portion alone of URI                        |
   +------------+---------------------------------------------------+

                Table 6: Information elements in ELF format

   Some fields start with a prefix (e.g., "c-", "s-"), which indicates
   which host (client/server/proxy) the field refers to.

   Example:

      date time s-ip cs-method cs-uri-stem cs-uri-query s-port
      cs-username c-ip cs(User-Agent) sc-status sc-substatus
      sc-win32-status time-taken

      2011-11-23 15:22:01 x.x.x.x GET /file 80 y.y.y.y Mozilla/
      5.0+(Windows;+U;+Windows+NT+6.1;+en-US;+rv:1.9.1.6)+Gecko/
      20091201+Firefox/3.5.6+GTB6 200 0 0 2137

A.3.  National Center for Supercomputing Applications (NCSA) Common Log
      Format

   This format for Access Logs offers the following fields:

   o  host rfc931 username date:time "request" statuscode bytes

   o  Example: x.x.x.x - userfoo [10/Jan/2010:21:15:05 +0500] "GET
      /index.html HTTP/1.0" 200 1043

A.4.  NCSA Combined Log Format

   The NCSA Combined log format is an extension of the NCSA Common log
   format with three (optional) additional fields: the referrer field,
   the user_agent field, and the cookie field.
   o  host rfc931 username date:time request statuscode bytes referrer
      user_agent cookie

   o  Example: x.x.x.x - userfoo [21/Jan/2012:12:13:56 +0500]
      "GET /index.html HTTP/1.0" 200 1043 "http://www.example.com/"
      "Mozilla/4.05 [en] (WinNT; I)" "USERID=CustomerA;IMPID=01234"

A.5.  NCSA Separate Log Format

   The NCSA Separate log format refers to a log format in which the
   information gathered is separated into three separate files.  This
   way, every entry in the Access Log (in the NCSA Common log format) is
   complemented with an entry in a Referral log and another one in an
   Agent log.  The format of the Referral log is as follows:

   o  date:time referrer

   o  Example: [21/Jan/2012:12:13:56 +0500]
      "http://www.example.com/index.html"

   The format of the Agent log is as follows:

   o  date:time agent

   o  Example: [21/Jan/2012:12:13:56 +0500]
      "Microsoft Internet Explorer - 5.0"

A.6.  Squid 2.0 Native Log Format for Access Logs

   Squid [squid] is a popular piece of open-source software for
   transforming a Linux host into a caching proxy.  Variations of the
   Squid log format are supported by some CDNs.

   The Squid log format is as follows:

      time elapsed remotehost code/status bytes method URL rfc931
      peerstatus/peerhost type

   According to the Squid 2.0 documentation [squid], these fields are
   defined as follows:

   +-----------+-------------------------------------------------------+
   | Element   | Definition                                            |
   +-----------+-------------------------------------------------------+
   | time      | Unix timestamp as UTC seconds with a millisecond      |
   |           | resolution.                                           |
   | duration  | The elapsed time in milliseconds the transaction      |
   |           | busied the cache.                                     |
   | client    | The client IP address.                                |
   | address   |                                                       |
   | bytes     | The size is the amount of data delivered to the       |
   |           | client, including headers.                            |
   | request   | The request method to obtain an object.               |
   | method    |                                                       |
   | URL       | The requested URL.                                    |
   | rfc931    | May contain the ident lookups for the requesting      |
   |           | client (turned off by default).                       |
   | hierarchy | The hierarchy information provides information on how |
   | code      | the request was handled (forwarding it to another     |
   |           | cache, or requesting the content from the Origin      |
   |           | Server).                                              |
   | type      | The content type of the object as seen in the HTTP    |
   |           | reply header.                                         |
   +-----------+-------------------------------------------------------+

               Table 7: Information elements in Squid format

Authors' Addresses

   Gilles Bertrand (editor)
   France Telecom - Orange
   38-40 rue du General Leclerc
   Issy les Moulineaux, 92130
   FR

   Phone: +33 1 45 29 89 46
   Email: gilles.bertrand@orange.com

   Emile Stephan
   France Telecom - Orange
   2 avenue Pierre Marzin
   Lannion F-22307
   France

   Email: emile.stephan@orange.com