Internet Engineering Task Force                         G. Bertrand, Ed.
Internet-Draft                                                E. Stephan
Intended status: Informational                   France Telecom - Orange
Expires: February 11, 2013                                R. Peterkofsky
                                                           Skytide, Inc.
                                                         August 10, 2012

                         CDNI Logging Interface
                     draft-bertrand-cdni-logging-01

Abstract

   This memo specifies the Logging interface between a downstream CDN
   (dCDN) and an upstream CDN (uCDN). It introduces a framework, an
   architecture design and a set of new requirements. Then it drafts an
   information model.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 11, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
     1.1.  Terminology
     1.2.  Abbreviations
   2.  Logging Framework and Architecture
   3.  Additional Requirements
   4.  Rationale for Logging Interface
     4.1.  Usages of CDNI Logging Information
       4.1.1.  Maintenance/Debugging
       4.1.2.  Accounting
       4.1.3.  End-User Experience Management
       4.1.4.  Security
       4.1.5.  Legal Logging Duties
     4.2.  Logging Information Views
     4.3.  Information Extracted From Logging Data
   5.  Log Information Elements
     5.1.  Information Elements
     5.2.  Logging Record Information Elements for Content Delivery
     5.3.  Logging Record Information Elements for
     5.4.  Logging Record Information Elements for Other Operations
   6.  Core Logging Records
     6.1.  Content Delivery
     6.2.  Content Acquisition
       6.2.1.  Logging Records Provided by dCDN to uCDN
       6.2.2.  Logging Records Provided by uCDN to dCDN
     6.3.  Content Invalidation and Purging
     6.4.  Logging Extensibility
   7.  Default Logging Information Format
     7.1.  Logging Files
     7.2.  File Format
       7.2.1.  Headers
       7.2.2.  Body (Logging Records) Format
       7.2.3.  Footer Format
   8.  Logging Format and Scope Negotiation
   9.  Logging Information Transport
     9.1.  Major Requirements on Logging Protocols
     9.2.  Recommended Logging Protocol for Non Real-Time Logging
     9.3.  Recommended Logging Protocol for Real-Time Logging
   10. Logging Process
     10.1.  Logging Aggregation
     10.2.  Logging Filtering
     10.3.  Logging Update and Rectification
   11. Open Issues
   12. IANA Considerations
   13. Security Considerations
     13.1.  Privacy
     13.2.  Non Repudiation
   14. Acknowledgments
   15. References
     15.1.  Normative References
     15.2.  Informative References
   Appendix A.  Examples Log Format
     A.1.  W3C Common Log File (CLF) Format
     A.2.  W3C Extended Log File (ELF) Format
     A.3.  National Center for Supercomputing Applications (NCSA)
           Common Log Format
     A.4.  NCSA Combined Log Format
     A.5.  NCSA Separate Log Format
     A.6.  Squid 2.0 Native Log Format for Access Logs
   Authors' Addresses

1.  Introduction

   This memo specifies the Logging interface between a downstream CDN
   (dCDN) and an upstream CDN (uCDN). It introduces a framework, an
   architecture design and a set of new requirements. Then it drafts an
   information model.

   The reader should be familiar with the work of the CDNI WG:

   o  The CDNI problem statement [I-D.ietf-cdni-problem-statement] and
      framework [I-D.ietf-cdni-framework] identify a Logging interface.

   o  Section 7 of [I-D.ietf-cdni-requirements] specifies a set of
      requirements for Logging.

   o  [I-D.ietf-cdni-use-cases] outlines real-world use cases for
      interconnecting CDNs. These use cases require the exchange of
      Logging information between the dCDN and the uCDN.

   o  [I-D.lefaucheur-cdni-logging-delivery] complements the present
      memo by proposing CDNI Logging formats for content deliveries
      performed using HTTP or HTTP adaptive streaming.

   The present document describes:

   o  The Logging framework and architecture (Section 2),

   o  The requirements (Section 3),

   o  A discussion of the monitoring and reporting use cases
      (Section 4),

   o  Log information (Section 5 and Section 6).

1.1.  Terminology

   In this document, the first letter of each CDNI-specific term is
   capitalized. We adopt the terminology described in
   [I-D.ietf-cdni-problem-statement] and [I-D.ietf-cdni-framework], and
   extend it with the additional terms defined below.

   For clarity, we use the word "Log" only for referring to internal CDN
   logs and we use the word "Logging" for any inter-CDN information
   exchange and processing operations related to the CDNI Logging
   interface. Log and Logging formats may be different.

   Log: CDN internal information collection and processing operations.

   Logging: Inter-CDN information exchange and processing operations.

   Fragmented object: [Ed. Note: Tentative simple definition that fits
   the current CDNI charter] Fragmented objects are pieces of content
   provided by a CSP which are delivered individually through a CDN
   interconnection. They differ from a simple object because the
   delivery of the content to one user agent may be provided by more
   than one Surrogate/CDN.

   CDN Reporting: the process of providing the relevant information that
   will be used to create a formatted content delivery report provided
   to the CSP in deferred time (i.e., not in real time). Such
   information typically includes aggregated data that can cover a large
   period of time (e.g., from hours to several months). One of the
   usages of reporting is the collection of charging data related to CDN
   services and the computation of Key Performance Indicators (KPIs).

   CDN Monitoring: the process of providing content delivery information
   in real time. Monitoring typically includes real-time data that
   provides a view of the deliveries in progress, for service operation
   purposes. It presents a view of the global health of the services as
   well as information on usage and performance, for network services
   supervision and operation management. In particular, monitoring data
   can be used to generate alarms.

   End-User experience management: study of Logging data using
   statistical analysis to discover, understand, and predict user
   behavior patterns.

   Delivery Service: a specific instantiation of a content delivery
   service configuration.
   For instance, a given uCDN (uCDN1) may request a given dCDN (dCDN1)
   to configure a Delivery Service for handling requests for HTTP
   Adaptive Streaming videos delegated by uCDN1 and related to a
   specific CSP (CSP1), and another one for handling requests for static
   pictures delegated by uCDN1 and related to CSP1. In this simple
   example, uCDN1 may request dCDN1 to include delivery service
   information in its CDNI Logging, to help uCDN1 provide relevant
   reports to CSP1.

1.2.  Abbreviations

   o  API: Application Programming Interface

   o  CCID: Content Collection Identifier

   o  CDN: Content Delivery Network

   o  CDNP: Content Delivery Network Provider

   o  CoDR: Content Delivery Record

   o  CSP: Content Service Provider

   o  DASH: Dynamic Adaptive Streaming over HTTP

   o  dCDN: downstream CDN

   o  FTP: File Transfer Protocol

   o  HAS: HTTP Adaptive Streaming

   o  KPI: Key Performance Indicator

   o  PVR: Personal Video Recorder

   o  SID: Session Identifier

   o  SFTP: SSH File Transfer Protocol

   o  SNMP: Simple Network Management Protocol

   o  uCDN: upstream CDN

2.  Logging Framework and Architecture

   The framework of the Logging interface is straightforward: the dCDN
   logs any information related to the completion of any task that it
   performs on behalf of an uCDN, and any exchange related to the
   management of the content that it delivers on behalf of an uCDN, as
   discussed in Section 6.1.

   Logging is a mandatory feature for a CDN, especially if the CDN is
   interconnected to other CDNs. Logging provides the raw material for
   some essential operations of a delivery service, such as monitoring,
   reporting, billing, etc.

   As stated in [I-D.ietf-cdni-problem-statement], "the CDNI Logging
   interface enables details of logs or events to be exchanged between
   interconnected CDNs".

   Figure 1 provides an example of Logging information exchanges.
   uCDN is connected to dCDN-1 and dCDN-2. dCDN-1, dCDN-2, and uCDN all
   deliver content for CSP. The Logging interface enables the uCDN to
   obtain Logging data from dCDN-1 and dCDN-2. In the example, uCDN
   uses the Logging data:

   o  to analyze the performance of the delivery operated by the dCDNs
      and to adjust its operations (e.g., request routing) as
      appropriate,

   o  to provide reporting (non real-time) and monitoring (real-time)
      information to CSP.

   For instance, uCDN merges Logging data, extracts relevant KPIs, and
   presents a formatted report to CSP, in addition to a bill for the
   content delivered. uCDN may also provide Logging data as raw log
   files to CSP, so that CSP can use its own Logging analysis tools.

                                 +-----+
                                 | CSP |
                                 +-----+
                                    ^ Reporting and monitoring data
                                    | Billing
                                 ,--,--.
                    Logging   ,-'       `-.   Logging
                    Data     (    uCDN     )  Data
                    ....>     `-.       _,-'<....
                      |          `-'-'-'          |
                   ,--v--.       ^     ^       ,--v--.
                ,-'       `-.    |     |    ,-'       `-.
               (   dCDN-1    )<--+     +-->(   dCDN-2    ) Logging
                `-.       ,-'    Logging    `-.       _,-'<...Data
                   `--'--'       Tuning        `--'-'        |
                                                 ^        ,--|--.
                                       Logging   |     ,'        `-.
                                       Tuning    + -->(   dCDN-3    )
                                                       `.        ,-'
                                                          `--'--'

                Figure 1: Exchange of Logging Information

   A dCDN integrates the logging of its downstream CDNs in the Logging
   that it provides to the uCDN, as required by
   [I-D.ietf-cdni-requirements] (LOG-3).

   Figure 1 represents bi-directional arrows between dCDN and uCDN for
   the exchange of Logging data, because even if the common case
   involves the uCDN retrieving Logging data from the dCDN, the reverse
   case where the dCDN retrieves Logging data (e.g., related to dCDN's
   content acquisition requests to the uCDN) from the uCDN is also
   possible.

   Note that the format of Logging data that the dCDN provides might be
   different from the one that the dCDN uses internally. In this case,
   the dCDN needs to reformat the Logging data before it provides this
   data to the uCDN.
   Similarly, an uCDN might reformat the Logging data that it receives
   before providing it to the CSP or to its own uCDN. Such reformatting
   operations are time-consuming (they add delay to the Logging chain)
   and introduce a processing burden. Therefore, it is recommended that
   the CDNI Logging format be as close as possible to the most common
   CDN Log formats.

   Figure 2 presents the Logging Architecture. More details on the
   Logging operations are provided in Section 10. A dCDN prepares the
   Logging data requested by the uCDN. This preparation involves
   operations such as filtering, aggregating, anonymizing, and
   summarizing the logs. The uCDN downloads the corresponding Logging
   Records and performs its own reporting for the CSP.

 +------+
 | CSP  |
 +------+
    ^
    ^ Reporting, Monitoring, Billing
    ^
 ---^---------------------   Logging Record   -------------------------
/   ^    Upstream CDN     \    selection     /     Downstream CDN      \
|+-----+  +-------------+ | and format nego. | +-------------+  +-----+|
||     |**| Control     | |<---------------->| | Control     |**|     ||
||     |  +-------------+ |                  | +-------------+  | I   ||
|| I   |                  |                  |                  | n   ||
|| n   |  +-------------+ |                  | +-------------+  | t   ||
|| t   |<<| Logging     | |                  | | Logging     |<<| e   ||
|| e   |  +-------------+ |<---------------->| +-------------+  | r   ||
|| r   |                  | Logging Records  |                  | c L ||
|| c L |                  |                  |                  | o o ||
|| o o |  +-------------+ |                  | +-------------+  | n g ||
|| n g |<<|Req-Routing  | |                  | |Req-Routing  |>>| n i ||
|| n i |  +-------------+ |                  | +-------------+  | e c ||
|| e c |                  |                  |                  | c   ||
|| c   |  +-------------+ |                  | +-------------+  | t   ||
|| t   |<<| Metadata    | |                  | | Metadata    |>>| i   ||
|| i   |  +-------------+ |                  | +-------------+  | o   ||
|| o   |                  |                  |                  | n   ||
|| n   |  +-------------+ |                  | +-------------+  |     ||
||     |<<| Distribution| |******************| | Distribution|>>|     ||
|+-----+  +-------------+ |   Acquisition    | +-------------+  +-----+|
\                         /                  \         . *             /
 -------------------------                    ---------.-*-------------
               .                                       . *
               .                               Request . * Delivery
               .                                    +--.-*--+
               ..................Request............| User  |
                                                    | Agent |
                                                    +-------+

                     Figure 2: Logging Architecture

   In Figure 2, the Logging Record selection and format negotiation
   occur at the Control interface level, as these operations provide
   static information for initializing the Logging interface.

   Logging data captures information elements that may be available at
   various stages during the life-cycle of content distribution. The
   arrows ("<<" and ">>") in Figure 2 represent the direction of
   information elements in the Logging process.

3.  Additional Requirements

   Section 7 of [I-D.ietf-cdni-requirements] already specifies a set of
   requirements for Logging (LOG-1 to LOG-16). Some security
   requirements also affect Logging (e.g., SEC-4).

4.  Rationale for Logging Interface

   [I-D.ietf-cdni-framework] and [I-D.ietf-cdni-problem-statement]
   introduce the rationale for the Logging interface as a means for an
   uCDN to gain visibility into the content that the dCDN delivers on
   behalf of the uCDN. The dCDN provides the uCDN with elements of
   information and Logging Records for operating the CDN interconnection
   and reporting to the CSP. This section develops use cases that
   require exchange of Logging information.

4.1.  Usages of CDNI Logging Information

   This section presents the usage of the Logging Records by an uCDN.
   It does not make any assumption on where the Logging Records are
   produced. Logging Records may be produced either by the uCDN or by a
   dCDN.

4.1.1.  Maintenance/Debugging

   Logging is useful to permit the detection (and limit the risk) of
   content delivery failures. In particular, Logging facilitates the
   resolution of misconfiguration issues.

   To detect faults, Logging must enable the reporting of the success or
   failure of any CDN operation, such as request redirection, content
   acquisition, etc. The uCDN can summarize such information into KPIs.
   For instance, the Logging format should allow computing the number of
   times, during a given epoch, that content deliveries related to a
   specific service succeed or fail.

   Logging is useful to analyze the performance of content delivery
   services. This implies computing KPIs from the Logging data for
   service quality analysis and monitoring (see Section 4.3).

   Logging enables the CDN providers to evaluate the QoS level related
   to a specific delivery service. For instance, one aspect of this QoS
   level could be measured through the average delivery throughput
   experienced by End-Users in a given region for this specific service
   over a period of time.

   Logging enables the CDN providers to identify and troubleshoot
   performance degradations.
   In particular, Logging enables the communication of traffic data
   (e.g., the amount of traffic that has been forwarded by a dCDN on
   behalf of an uCDN over a given period of time), which is particularly
   useful for CDN and network planning operations.

4.1.2.  Accounting

   Logging is essential for accounting, to permit inter-CDN billing and
   CSP billing by the uCDN. For instance, Logging enables the uCDN to
   check the total amount of traffic delivered by every dCDN and for
   every delivery service, as well as the associated bandwidth usage
   (e.g., peak, 95th percentile) and the maximum number of simultaneous
   sessions over a given period of time.

4.1.3.  End-User Experience Management

   The goal of End-User experience management is to gather any relevant
   information to measure audience, analyze user behavior, etc. For
   instance, Logging enables the CDN providers to report on content
   consumption (e.g., delivered sessions per content) in a specific
   geographic area.

4.1.4.  Security

   The goal of security is to prevent and monitor unauthorized access,
   misuse, modification, and denial of access of a service. A set of
   information is logged for security purposes. In particular,
   information on access to content is usually collected to permit the
   CSP to detect infringements of content delivery policies and other
   abnormal End-User behaviors.

4.1.5.  Legal Logging Duties

   Depending on the country considered, the CDNs may have to retain
   specific Logging information during a legal retention period, to
   comply with judicial requisitions.

4.2.  Logging Information Views

   Logging information is useful to the uCDN and potentially to the CSP.
   Different views of the Logging information may be provided depending
   on privacy, business, and scalability constraints.
   An uCDN may support some form of format adaptation to present some
   (e.g., filtered, aggregated) data in the appropriate format (raw log
   files, reports) to the CSP. More details on these operations are
   provided in Section 10.

   We provide a non-exhaustive list and description of tools that can be
   fed with Logging information.

   o  Tools used by the uCDN's operator: billing tools (information
      system), customer experience intelligence, reporting tools,
      security auditing tools, dimensioning tools, strategic planning
      and investment, etc.

   o  Tools used by CSPs: customer experience management tools,
      reporting tools, security auditing tools, etc.

4.3.  Information Extracted From Logging Data

   This section presents, for explanatory purposes, a non-exhaustive
   list of information that can be extracted/produced from logs.
   Depending on the inter-CDN agreement, this information may be
   computed by the uCDN or by the dCDN. Nevertheless, it is usually the
   uCDN that computes KPIs, because uCDN and dCDN may have different
   definitions of the KPIs and the computation of some KPIs requires a
   view of all the deliveries performed by the uCDN and all its dCDNs.

   CSPs require specific information, such as KPIs, about the delivery
   of their content. The Logging data must contain appropriate
   information to enable CSPs or the uCDN to extract the required KPIs.
   In the present section, we list important examples of KPIs:

   o  Number of delivery requests received from End-Users in a given
      region for each piece of content, during a given period of time
      (e.g., hour/day/week/month),

   o  Percentage of delivery successes / failures among the
      aforementioned requests,

   o  Number of failures listed by failure type (e.g., HTTP error code)
      for requests received from End-Users in a given region and for
      each piece of content, during a given period of time (e.g.,
      hour/day/week/month),

   o  Number and cause of premature delivery terminations for End-Users
      in a given region and for each piece of content, during a given
      period of time (e.g., hour/day/week/month),

   o  Maximum and mean number of simultaneous sessions established by
      End-Users in a given region, for a given delivery service, and
      during a given period of time (e.g., hour/day/week/month),

   o  Volume of traffic delivered for sessions established by End-Users
      in a given region, for a given delivery service, and during a
      given period of time (e.g., hour/day/week/month),

   o  Maximum, mean, and minimum delivery throughput for sessions
      established by End-Users in a given region, for a given delivery
      service, and during a given period of time (e.g.,
      hour/day/week/month),

   o  Cache-hit and byte-hit ratios for requests received from End-Users
      in a given region for each piece of content, during a given period
      of time (e.g., hour/day/week/month),

   o  Top 10 most popular requested content items (broken down by
      day/week/month),

   o  Terminal type (mobile, PC, STB), if this information can be
      acquired, for example, from the browser type header.

   Additional KPIs can be computed from sources of information other
   than the Logging, for instance, data collected by a content portal or
   by specific client-side APIs. Such KPIs are out of scope for the
   present memo.

5.  Log Information Elements

   CDNI must specify a set of Logging information elements to avoid log
   format regeneration, which would affect the performance of the log
   handling chain. A common set of Logging information elements eases
   the sharing of logs among the CDNs and the use of log processing
   tools, for instance, to prepare reporting.

   Existing CDNs' Logging functions collect and consolidate the logs
   produced by their Surrogates. Surrogates usually store the logs
   using a format derived from Web servers' and caching proxies' log
   standards such as W3C, NCSA [ELF] [CLF], or Squid format [squid]. In
   practice, these formats are adapted to cope with CDN specifics.
   Appendix A presents examples of commonly used log formats.

5.1.  Information Elements

   This section describes a set of information elements that structure
   Logging information generated by the dCDN. The section does not
   prescribe a particular encoding (such as SNMP SMI or alternatives).
   All fields in the Logging information are optional unless stated
   otherwise. However, if a given CDN decides to support some of the
   Logging information fields, it must conform to the definition and
   format of these fields specified in the present memo, to guarantee
   that interconnected CDNs share a common understanding of the Logging
   semantics and syntax.

   +-------------+-----------------------------------------------------+
   | Name        | Description                                         |
   +-------------+-----------------------------------------------------+
   | Start-time  | A start date and time associated with a logged      |
   |             | event; for instance, the time at which a Surrogate  |
   |             | received a content delivery request or the time at  |
   |             | which an origin server received a content           |
   |             | acquisition request.                                |
   | End-time    | An end date and time associated with a logged       |
   |             | event. For instance, the time at which a Surrogate  |
   |             | completed the handling of a content delivery        |
   |             | request (e.g., end of delivery or error).           |
   | Duration    | The duration of an operation in milliseconds. For   |
   |             | instance, this field could be used to provide the   |
   |             | time it took the Surrogate to send the requested    |
   |             | file to the End-User, or the time it took the       |
   |             | Surrogate to acquire the file on a cache-miss       |
   |             | event.                                              |
   | Client-IP   | The IP address of the User Agent that issued the    |
   |             | logged request (or of a proxy).                     |
   | Operation   | The kind of operation that is logged; for instance, |
   |             | Acquisition, Delivery, or Purging.                  |
   | URI_full    | The full requested URL (e.g.,                       |
   |             | "http://node1.peer-a.op-b.net/cdn.csp.com/movies/po |
   |             | tter.avi?param=11&user=toto"). When HTTP request    |
   |             | redirection is used, this URI includes the          |
   |             | Surrogate FQDN. If the association of requests to   |
   |             | Surrogates is confidential, the dCDN can present    |
   |             | only URI_part to uCDN.                              |
   | URI_part    | The requested URL path (e.g.,                       |
   |             | /cdn.csp.com/movies/potter.avi?param=11&user=toto   |
   |             | if the full request URL was                         |
   |             | "http://node1.peer-a.op-b.net/cdn.csp.com/movies/po |
   |             | tter.avi?param=11&user=toto"). The URI without      |
   |             | host-name typically includes the "CDN domain"       |
   |             | (e.g., cdn.csp.com), cf. [I-D.ietf-cdni-framework]; |
   |             | it enables the identification of the CSP service    |
   |             | agreed between the CSP and the CDNP operating the   |
   |             | uCDN.                                               |
   | Protocol    | The protocol and protocol version of the message    |
   |             | that triggered the Logging entry.                   |
   | Request-met | The protocol method of the request message that     |
   | hod         | triggered the Logging entry.                        |
   | Status      | The protocol status code of the reply message       |
   |             | related to the Logging entry.                       |
   | Bytes-Trans | The number of bytes at application-layer            |
   | ferred      | protocol-level (e.g., HTTP) of the reply message    |
   |             | related to the Logging entry. It includes the size  |
   |             | of the response headers.                            |
   | Bytes-recei | The number of bytes (headers + body) of the message |
   | ved         | that triggered the Logging entry.                   |
   | Referrer    | The value of the Referer header in an HTTP          |
   |             | request.                                            |
   | User-Agent  | The value of the User-Agent header in an HTTP       |
   |             | request.                                            |
   | Cookie      | The value of the Cookie header in an HTTP request.  |
   | Record-dige | A digest of the Logging Record; it enables          |
   | st          | detecting corrupted Logging Records.                |
   | CCID        | A Content Collection IDentifier (CCID) eases the    |
   |             | correlation of several Logging Records related to a |
   |             | Content Collection (e.g., a movie split in chunks). |
   | SID         | A Session Identifier (SID) eases the correlation    |
   |             | (and aggregation) of several Logging Records        |
   |             | related to a session. The SID is especially         |
   |             | relevant for summarizing HAS Logging information    |
   |             | [I-D.brandenburg-cdni-has].                         |
   +-------------+-----------------------------------------------------+

              Table 1: Logging Record Information Elements

   NB: we define three fields related to the timing of logged
   operations: Start-time, End-time, and Duration. Only two of these
   three fields are required to obtain relevant timing information on
   the operation. Start-time is typically useful for human readers
   (e.g., while debugging); however, most servers log the operation's
   End-time, which corresponds to the time of log record generation.

   Multiple header fields, in addition to User-Agent and Referer, could
   be reproduced in the Logging entries.
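
   As an illustration of the timing note above, the following Python
   sketch derives whichever of Start-time, End-time, and Duration is
   absent from the other two, and computes URI_part from URI_full. It
   is not part of the CDNI specification: the LoggingRecord class and
   the complete_timing() and uri_part() helpers are hypothetical names
   introduced here, and only a few Table 1 elements are modeled.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.parse import urlsplit


@dataclass(frozen=True)
class LoggingRecord:
    """Hypothetical in-memory form of a few Table 1 elements."""
    uri_full: str
    start_time: Optional[datetime] = None
    end_time: Optional[datetime] = None
    duration_ms: Optional[int] = None  # Duration is in milliseconds


def complete_timing(rec: LoggingRecord) -> LoggingRecord:
    """Fill in the one missing timing field from the other two."""
    start, end, dur = rec.start_time, rec.end_time, rec.duration_ms
    if sum(v is not None for v in (start, end, dur)) < 2:
        raise ValueError(
            "need at least two of Start-time, End-time, Duration")
    if dur is None:
        dur = round((end - start) / timedelta(milliseconds=1))
    elif start is None:
        start = end - timedelta(milliseconds=dur)
    elif end is None:
        end = start + timedelta(milliseconds=dur)
    return replace(rec, start_time=start, end_time=end, duration_ms=dur)


def uri_part(uri_full: str) -> str:
    """Drop scheme and host name, keeping the path and query string."""
    parts = urlsplit(uri_full)
    return parts.path + ("?" + parts.query if parts.query else "")


# Example: a server that logs only End-time and Duration.
record = LoggingRecord(
    uri_full=("http://node1.peer-a.op-b.net/cdn.csp.com/movies/"
              "potter.avi?param=11&user=toto"),
    end_time=datetime(2012, 8, 10, 13, 55, 36, tzinfo=timezone.utc),
    duration_ms=1500,
)
completed = complete_timing(record)
print(completed.start_time.isoformat())
print(uri_part(completed.uri_full))
```

   In this sketch, a dCDN that treats the association of requests to
   Surrogates as confidential would export only the result of
   uri_part(), as described for URI_part in Table 1.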
   Note that the uCDN may want to filter Logging data by user (and not
   by IP address) to provide more relevant information to the CSP. In
   such a case, a user may be identified by a combination of several
   pieces of information, such as the Client-IP and User-Agent, or
   through the SID.

   The URI_full provides information on the Surrogate that provided the
   content. This information can be relevant, for instance, for
   Inter-Affiliates scenarios [I-D.ietf-cdni-use-cases]. However, in
   some cases it may be considered confidential, and the dCDN may
   provide URI_part instead.

   Table 2 illustrates the definition of the information elements. It
   provides examples using Apache log format strings [apache] when they
   exist. The table is here for illustration and does not prescribe a
   specific encoding.

   +------------+------------------+-----------------------------------+
   | Name       | String           | Example                           |
   +------------+------------------+-----------------------------------+
   | Time       | %t               | [10/Oct/2000:13:55:36 -0700]      |
   | Duration   | -                | -                                 |
   | Client-IP  | -                | -                                 |
   | Operation  | -                | -                                 |
   | URI_log    | -                | -                                 |
   | Protocol   | %H               | HTTP/1.0                          |
   | Request    | %m               | GET                               |
   | method     |                  |                                   |
   | Status     | %>s              | 200                               |
   | Bytes      | %O               | 2326                              |
   | transferre |                  |                                   |
   | d          |                  |                                   |
   | Bytes      | -                | -                                 |
   | received   |                  |                                   |
   | Header     | \"%{Referer}i\"  | "http://www.example.com/start.htm |
   |            | \"%{User-agent}i | l" "Mozilla/4.08 [en] (Win98; I   |
   |            | \"               | ;Nav)"                            |
   +------------+------------------+-----------------------------------+

                  Table 2: Examples using Apache format

5.2.  Logging Record Information Elements for Content Delivery

   Table 3 details specific Logging fields that the dCDN may provide to
   the uCDN and that are related to content delivery operations.
   +-------------------+-----------------------------------------------+
   | Name              | Definition                                    |
   +-------------------+-----------------------------------------------+
   | uCDN-ID           | An element authenticating the operator of the |
   |                   | uCDN as the authority having delegated the    |
   |                   | request to the dCDN.                          |
   | Delivering-CDN-ID | An identifier (e.g., an aggregation of an IP  |
   |                   | address and a FQDN) of the Delivering CDN.    |
   |                   | The Delivering-CDN-ID might be considered as  |
   |                   | confidential by the dCDN. In such a case, the |
   |                   | dCDN could either not provide this field to   |
   |                   | the uCDN or overwrite the Delivering-CDN-ID   |
   |                   | with its own identifier.                      |
   | End-User-IP       | The IP address of the client making a content |
   |                   | delivery request (or of its proxy).           |
   | Cache-bytes       | The number of body bytes served from caches.  |
   |                   | This quantity permits the computation of the  |
   |                   | byte hit ratio.                               |
   | Action            | The Action describes how a given request was  |
   |                   | treated locally: through which transport      |
   |                   | protocol, with or without content             |
   |                   | revalidation, with a cache hit or cache miss, |
   |                   | with fresh or stale content, and if relevant  |
   |                   | with which error. Example with Squid format   |
   |                   | [squid]: "TCP_REFRESH_FAIL_HIT" means that an |
   |                   | expired copy of an object requested through   |
   |                   | TCP was in the cache. Squid attempted to      |
   |                   | make an If-Modified-Since request, but it     |
   |                   | failed. The old (stale) object was delivered  |
   |                   | to the client.                                |
   +-------------------+-----------------------------------------------+

                  Table 3: Delivery Information Elements

5.3.  Logging Record Information Elements for Content Acquisition

   Table 4 details specific Logging fields that are related to content
   acquisition operations.

   [Ed. Note: split this section in two parts: logs provided by uCDN /
   logs provided by dCDN?]
   +------------+------------------------------------------------------+
   | Name       | Definition                                           |
   +------------+------------------------------------------------------+
   | dCDN       | An element authenticating the operator of the dCDN   |
   | identifier | as the authority requesting the content from the     |
   |            | uCDN.                                                |
   +------------+------------------------------------------------------+

                Table 4: Acquisition Information Elements

   These information elements may be used in Content Acquisition Logging
   provided by the dCDN to the uCDN and potentially in Content
   Acquisition Logging provided by the uCDN to the dCDN.

5.4.  Logging Record Information Elements for Other Operations

   Logging can be used for debugging. Therefore, all kinds of CDN
   operations might be logged, depending on the agreement between the
   dCDN and the uCDN. In particular, operations related to the Request
   Routing, Metadata, and Control interfaces can be logged.

6.  Core Logging Records

   This section defines a set of central events that a dCDN should
   register and publish through the Logging interface.

   We classify the logged events depending on the CDN operation to which
   they relate: Content Delivery, Content Acquisition, Content
   Invalidation/Purging, etc.

6.1.  Content Delivery

   Some CSPs pay a lot of attention to the protection of their content
   (e.g., premium video CSPs). To fulfill the needs of these CSPs, a
   CDN shall log all the details of the content delivery authorizations.
   This means that a dCDN must be able to provide Logging detailing the
   content delivery/content acquisition authorizations and denials as
   well as information on why the request is authorized/denied.

   CSPs and CDSP pay a lot of attention to errors related to content
   delivery. It is therefore of utmost importance that the dCDN
   provides detailed error information in the Logging data.
   This information should typically be available even when Logging is
   aggregated (cf. Section 10.1).

   The content delivery events triggering the generation of a Logging
   Record include:

   o  Reception of a content request.

   The generated Logging Record typically embeds information about:

   o  Denial of delivery (error or unauthorized request) for a request,

   o  Beginning of delivery (authorization) of a requested content,

   o  End of an authorized delivery (success),

   o  End of an authorized delivery (failure).

6.2.  Content Acquisition

6.2.1.  Logging Records Provided by dCDN to uCDN

   When the uCDN requires the dCDN to provide Logging for
   acquisition-related events, the events triggering the generation of
   a Logging Record include:

   o  Emission of a content acquisition request (first try or retry) for
      a cache hit or a cache miss with content revalidation.

   The generated Logging Record typically embeds information about:

   o  Reception of a reply indicating denial of delivery (error or
      unauthorized request) for a content acquisition request,

   o  End of an authorized acquisition (success),

   o  End of an authorized acquisition (failure).

   Note that a dCDN may acquire content only from the uCDN. In this
   case, the uCDN can log the dCDN's content acquisition operations
   itself, and thus, the uCDN may not require the dCDN to log
   acquisition-related events (except for security or debugging
   reasons).

6.2.2.
Logging Records Provided by uCDN to dCDN 827 When the dCDN requires the uCDN to provide Logging for acquisition 828 related events, the events triggering the generation of a Logging 829 Record include: 831 o Reception of a content acquisition request for the considered 832 delivery service for a cache hit or a cache miss with content 833 revalidation 835 The generated Logging Record typically embeds information about: 837 o Emission of a reply indicating denial of delivery (error or 838 unauthorized request) for a content acquisition request, 840 o End of an authorized acquisition (success), 842 o End of an authorized acquisition (failure). 844 6.3. Content Invalidation and Purging 846 When the uCDN requests a dCDN to log invalidation/purging events 847 (e.g., for security), the events triggering the generation of a 848 Logging Record include: 850 o Reception of a content invalidation/purging request 852 The generated Logging Record typically embeds information about: 854 o Denial of the invalidation/purging request (error or unauthorized 855 request), 857 o Beginning of invalidation/purging (authorization) for a given 858 content purging request, 860 o End of an authorized invalidation/purging (success), 862 o End of an authorized invalidation/purging (failure). 864 6.4. Logging Extensibility 866 Future usages might introduce the need for additional Logging fields. 867 In addition, some use-cases such as an Inter-Affiliate 868 Interconnection [I-D.ietf-cdni-use-cases], might take advantage of 869 extended Logging exchanges. Therefore, it is important to permit 870 CDNs to use additional Logging fields besides the standard ones, if 871 they want. For instance, an "Account-name" identifying the contract 872 enforced by the dCDN for a given request could be provided in 873 extended fields. 875 The required Logging Records may depend on the considered services. 
   For instance, static file delivery (e.g., pictures) typically does
   not include any delivery restrictions. By contrast, video delivery
   typically implies strong content delivery restrictions, as explained
   in [I-D.ietf-cdni-use-cases], and Logging could include information
   about the enforcement of these restrictions. Therefore, to ease the
   support of varied services as well as of future services, the Logging
   interface should support optional Logging Records.

7.  Default Logging Information Format

   Interconnected CDNs may support various Logging formats. However,
   they must support at least the default Logging format described here.

7.1.  Logging Files

   [Ed. Note: How many files (one per type of Delivery Service (e.g.,
   HTTP, WMP) and per type of Event (e.g., Errors, Delivery,
   Acquisition, ...)?) and what would be inside... These aspects will
   be detailed in future versions.]

7.2.  File Format

   [Ed. note: The Logging file format is not necessarily independent of
   the selected transport protocol. The definition of the Logging file
   format should be carried out consistently with the candidate protocol
   analysis for Logging transport. The present content of this section
   is therefore not final.]

7.2.1.  Headers

   As initially proposed in [I-D.lefaucheur-cdni-logging-delivery],
   Logging files must include a header with the information described in
   Figure 3.

   +----------------+-------------------+------------------------------+
   | Field          | Description       | Examples                     |
   +----------------+-------------------+------------------------------+
   | Format         | Identification of | standard_cdni_errors_http_v1 |
   |                | CDNI Log format.  |                              |
   | Fields         | A description of  |                              |
   |                | the records format|                              |
   |                | (list of fields).
| | 918 | Log-ID | Identifier | abcdef1234 | 919 | | for the CDNI Log | | 920 | | file (facilitates | | 921 | | detection of | | 922 | | duplicate Logs | | 923 | | and tracking in | | 924 | | case of | | 925 | | aggregation). | | 926 | Log-Timestamp | Time, in | [20/Feb/2012:00:29.510+0200] | 927 | | milliseconds, the | | 928 | | CDNI Log was | | 929 | | generated. | | 930 | Log-Origin | Identifier of the | cdn1.cdni.example.com | 931 | | authority (e.g., | | 932 | | dCDN or uCDN) | | 933 | | providing the Log-| | 934 | | -ging | | 935 +----------------+-------------------+------------------------------+ 937 Figure 3: Logging Headers 939 7.2.2. Body (Logging Records) Format 941 [Ed. note: the W3C extended log format is a good base candidate to 942 look at.] 944 [Ed. Note: The format for Time is still to be agreed on. RFC 5322 945 (Section 3.3) format could be used or ISO 8601 formatted date and 946 time in UTC (same format as proposed in 948 [draft-caulfield-cdni-metadata-core-00]). Also see RFC5424 Section 949 6.2.3.] 951 [Ed. Note: Records used for real time information and non-real time 952 information could use different formats.] 954 7.2.3. Footer Format 956 As initially proposed in [I-D.lefaucheur-cdni-logging-delivery], 957 Logging files must include a footer with the information described in 958 Figure 4. 960 +---------+----------------------------------------------+----------+ 961 | Field | Description | Examples | 962 +---------+----------------------------------------------+----------+ 963 | Log | Digest of the complete Log (facilitates | | 964 | Digest | detection of Log corruption) | | 965 +---------+----------------------------------------------+----------+ 967 Figure 4: Logging footers 969 8. Logging Format and Scope Negotiation 971 [Ed. Note: Format should be negotiated per delivery service] 973 [Ed. Note: uCDN shall be able to select the type of events that a 974 dCDN should include in the Logging that the latter provides to the 975 uCDN.] 977 9. 
Logging Information Transport

   As presented in [I-D.ietf-cdni-problem-statement], several protocols
   already exist that could potentially be used to exchange CDNI Logging
   between interconnected CDNs. The dCDN could publish non real-time
   Logging on a server where the uCDN would retrieve it using, for
   example, the SSH File Transfer Protocol (SFTP). If the CDNs need to
   exchange real-time information through the Logging interface, they
   could potentially rely on Web APIs, Syslog, or SNMP. The main
   criterion for selecting a Logging transport protocol is the time
   constraint for delivering the Logging. Therefore, the present
   section highlights the candidate protocols for real-time and non
   real-time Logging exchanges.

9.1.  Major Requirements on Logging Protocols

   Logging data is sensitive because it provides the raw material for
   producing bills, among other uses. Therefore, the protocol
   delivering the Logging data must be reliable to avoid information
   loss. In addition, the protocol must scale to support the transport
   of large amounts of Logging data. Finally, this protocol must comply
   with the requirements identified in [I-D.ietf-cdni-requirements].

   CDNs need to trust Logging information; thus, they want to know:

   o  who issued the Logging (authentication), and

   o  whether the Logging has been modified by a third party
      (integrity).

   This is extremely important, as the logs can provide a basis for
   accounting/billing.

   Logging also contains confidential data; therefore, it should be
   protected from eavesdropping.

   All these needs translate into security requirements on both the
   Logging data format and the Logging protocol.

   [Ed. note: cf.
   requirements draft: "SEC-4 [MED] The CDNI solution
   should be able to ensure that the Downstream CDN cannot spoof a
   transaction log attempting to appear as if it corresponds to a
   request redirected by a given Upstream CDN when that request has not
   been redirected by this Upstream CDN. This ensures non-repudiation
   by the Upstream CDN of transaction logs generated by the Downstream
   CDN for deliveries performed by the Downstream CDN on behalf of the
   Upstream CDN."]

9.2.  Recommended Logging Protocol for Non Real-Time Logging

   As explained in [I-D.ietf-cdni-problem-statement], "SNMP traps pose
   scalability concerns and SNMP does not support guaranteed delivery of
   Traps and therefore could result in log records being lost and the
   consequent CoDRs and billing records for that content delivery not
   being produced as well as that content delivery being invisible to
   any analytics platforms."

   [Ed. Note: timing constraints... cf LOG-6 offline vs. constrained
   time / on demand access to real-time logging information]

   [Ed. Note: in a later version, this memo will include an analysis of
   candidate protocols, based upon a set of (basic) requirements, such
   as reliable transport mode, preservation of the integrity of the
   information conveyed by the protocol, etc.]

   The offline exchange of non real-time Logging could rely on several
   protocols. In particular, the dCDN could publish the Logging on a
   server where the uCDN would retrieve it using a secure protocol
   (yet to be identified).

   [Ed. note: event-triggered or periodic, why?]

   [Ed. note: Propose protocol and add call flow]

9.3.  Recommended Logging Protocol for Real-Time Logging

   The uCDN must be able to retrieve real-time information via near
   real-time methods such as Syslog, SNMP, or APIs.

   [Ed. note: dCDN does not just forward requests for real time logging.
   It should probably provide other (more complex?) information in real
   time about the ongoing sessions (e.g., for every active session: IP
   of the client, service, CDN name, content consumed (full URL),
   average bit rate, downloaded size, date of session start?)]

10.  Logging Process

   We walk through a "day in the life" of a CDN interconnection to
   present functions the two CDNs may require to exchange Logging
   information. This will serve to illustrate many of the functions
   that could be supported through the CDNI Logging interface. We
   describe capabilities, such as log aggregation, anonymization, and
   filtering, that might be added to CDNI at a later stage to optimize
   Logging operations.

10.1.  Logging Aggregation

   CDNs typically handle millions of records per day. The processing of
   these records to extract relevant monitoring and reporting
   information is expensive in terms of CPU and time. Therefore, as
   stated in [I-D.ietf-cdni-framework], "a design tradeoff in the
   Logging interface is the degree of aggregation or summarization of
   data."

   In particular, dCDNs must aggregate the logs of their elements (e.g.,
   the Surrogates) both to avoid the complexity of distributing multiple
   log files to the uCDN and to avoid disclosing information about the
   dCDN's internal topology. This aggregation alleviates the Logging
   processing burden for the uCDN.

   Many situations also lead to the delivery of fragments of content
   (DASH, failure of delivery, partial delivery, PVR actions, etc.). A
   dCDN may choose not to publish a Logging Record for each piece of
   content it delivers, because this can lead to unacceptably large
   logs. In particular, a Logging Record could provide aggregated
   information about the delivery of several content pieces. The uCDN
   and dCDN must be able to agree on a level of granularity for the
   Logging Records.
   This problem is well described for the case of HTTP adaptive
   streaming in [I-D.ietf-cdni-framework] and
   [I-D.brandenburg-cdni-has].

   In the current version of the draft, we identify the following
   options that may be considered for reducing the amount of Logging
   data.

   o  Transmit only summaries. For instance, a summary may aggregate
      information on all deliveries that occur during a 5-minute time
      slot, or provide only Logging data related to content items that
      have been delivered at least a specific number of times. Note
      that such aggregation leads to information loss. This may be
      problematic for some usages of Logging (e.g., debugging), and some
      information should always be present, for instance, information
      about content delivery errors (403, 404, ...). The use of
      multiple levels of Logging granularity, such as in Apache (debug,
      notice, etc.), may help in providing the most relevant amount of
      information depending on the intended Logging usage, without
      having to renegotiate the Logging format.

   o  For HAS content, a way to compress logs with minimal information
      loss would be to merge all successful (200 OK) Logging Records
      related to the same level of video quality into a single record
      with appropriate Start-time and End-time. The only information
      lost in this process would be the Start-time and End-time of
      every video chunk.

   o  Losslessly compress the Logging data.

   o  Agree on a Logging retention duration and optionally on a maximum
      size of the Logging data that the dCDN must keep. If this size is
      exceeded, the dCDN must alert the uCDN but is not required to keep
      more Logs for the considered time period.

   [Ed. Note: cite Syslog's concepts for aggregation]

10.2.
Logging Filtering

   The dCDN must be able to present only relevant information to the
   uCDN, to avoid unnecessary Logging processing load for the uCDN and
   potentially to protect End-Users' privacy. Hence, the downstream CDN
   filters its logs and passes the relevant records directly to each
   upstream CDN. This requires that the downstream CDN can recognize
   the set of log entries that relate to each upstream CDN, for instance
   thanks to the "uCDN-ID" information element (Table 3).

   The dCDN must be able to filter out internal-scope data such as
   information related to its internal alarms (security, failures, load,
   etc.).

   In some use cases described in [I-D.ietf-cdni-use-cases], the
   interconnected CDNs do not want to disclose details on their internal
   topology. The dCDN must be able to filter confidential data on the
   dCDN's topology (number of servers, location, etc.). In particular,
   information about the requests served by every Surrogate is
   confidential. Therefore, the Logging information must be protected
   so that data such as Surrogates' host-names are not disclosed to the
   uCDN. In the "Inter-Affiliates Interconnection" use case, this
   information may be disclosed to the uCDN because both the dCDN and
   the uCDN are operated by entities of the same group.

10.3.  Logging Update and Rectification

   If Logging is generated periodically, it is important that the
   sessions that start in one Logging period and end in another are
   correctly reported. If they are reported in the starting period,
   then the Logging of this period will be available only after the end
   of the session, which delays the Logging generation.

   A Logging rectification / update mechanism could be useful to reach a
   good trade-off between the Logging generation delay and the Logging
   accuracy.
Depending on the selected Logging protocol(s), such 1167 mechanism may be particularly invaluable for real time Logging, which 1168 must be provided rapidly and cannot wait for the end of operations in 1169 progress. 1171 11. Open Issues 1173 The level of granularity of the date/time information must be 1174 specified (clock accuracy). 1176 When to log the end of a session when the End-User pauses a video 1177 display? 1179 [Ed. Note: check if all requirements are fulfilled by the proposed 1180 solution] 1182 [Ed. note: (comment from Kevin) how are errors handled ? If the 1183 client gets handed a bunch of 403s and 404s, but still gets the 1184 content eventually, without triggering an event, are those still 1185 logged? For Bytes-Transferred, if there were aborted requests, do 1186 those get counted as well? Not all client behavior can be correlated 1187 with the simplified log.] 1189 12. IANA Considerations 1191 This memo includes no request to IANA. 1193 13. Security Considerations 1195 13.1. Privacy 1197 CDNs have the opportunity to collect detailed information about the 1198 downloads performed by End-Users. The provision of this information 1199 to another CDN introduces End-Users privacy protection concerns. 1201 13.2. Non Repudiation 1203 Logging provides the raw material for charging. It permits the dCDN 1204 to bill the uCDN for the content deliveries that the dCDN makes on 1205 behalf of the uCDN. It also permits the uCDN to bill the CSP for the 1206 content delivery service. Therefore, non-repudiation of Logging data 1207 is essential. Some of the security issues and requirements on 1208 Logging are highlighted in Section 9.1. 1210 14. Acknowledgments 1212 The authors would like to thank Anne Marrec, Yannick Le Louedec, and 1213 Christian Jacquenet for detailed feedback on early versions of this 1214 document and for their input on existing Log formats. 
1216 The authors would like also to thank Fabio Costa, Yvan Massot, Renaud 1217 Edel, and Joel Favier for their input and comments. 1219 Finally, they thank the contributors of the EU FP7 OCEAN project for 1220 valuable inputs. 1222 15. References 1223 15.1. Normative References 1225 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1226 Requirement Levels", BCP 14, RFC 2119, March 1997. 1228 15.2. Informative References 1230 [CLF] A. Luotonen, "The Common Log-file Format, W3C (work in 1231 progress)", 1995, . 1234 [ELF] Phillip M. Hallam-Baker and Brian Behlendorf, "Extended 1235 Log File Format, W3C (work in progress), WD-logfile- 1236 960323", . 1238 [I-D.bertrand-cdni-experiments] 1239 Faucheur, F. and L. Peterson, "Content Distribution 1240 Network Interconnection (CDNI) Experiments", 1241 draft-bertrand-cdni-experiments-02 (work in progress), 1242 February 2012. 1244 [I-D.brandenburg-cdni-has] 1245 Brandenburg, R., Deventer, O., Faucheur, F., and K. Leung, 1246 "Models for adaptive-streaming-aware CDN Interconnection", 1247 draft-brandenburg-cdni-has-03 (work in progress), 1248 July 2012. 1250 [I-D.ietf-cdni-framework] 1251 Peterson, L. and B. Davie, "Framework for CDN 1252 Interconnection", draft-ietf-cdni-framework-01 (work in 1253 progress), July 2012. 1255 [I-D.ietf-cdni-problem-statement] 1256 Niven-Jenkins, B., Faucheur, F., and N. Bitar, "Content 1257 Distribution Network Interconnection (CDNI) Problem 1258 Statement", draft-ietf-cdni-problem-statement-08 (work in 1259 progress), June 2012. 1261 [I-D.ietf-cdni-requirements] 1262 Leung, K. and Y. Lee, "Content Distribution Network 1263 Interconnection (CDNI) Requirements", 1264 draft-ietf-cdni-requirements-03 (work in progress), 1265 June 2012. 1267 [I-D.ietf-cdni-use-cases] 1268 Bertrand, G., Emile, S., Burbridge, T., Eardley, P., Ma, 1269 K., and G. Watson, "Use Cases for Content Delivery Network 1270 Interconnection", draft-ietf-cdni-use-cases-10 (work in 1271 progress), August 2012. 
   [I-D.lefaucheur-cdni-logging-delivery]
              Faucheur, F., Viveganandhan, M., and K. Leung, "CDNI
              Logging Formats for HTTP and HTTP Adaptive Streaming
              Deliveries", draft-lefaucheur-cdni-logging-delivery-01
              (work in progress), July 2012.

   [RFC3444]  Pras, A. and J. Schoenwaelder, "On the Difference between
              Information Models and Data Models", RFC 3444,
              January 2003.

   [RFC3466]  Day, M., Cain, B., Tomlinson, G., and P. Rzewski, "A Model
              for Content Internetworking (CDI)", RFC 3466,
              February 2003.

   [RFC3568]  Barbir, A., Cain, B., Nair, R., and O. Spatscheck, "Known
              Content Network (CN) Request-Routing Mechanisms",
              RFC 3568, July 2003.

   [apache]   "Apache 2.2 log files documentation", Feb. 2012.

   [squid]    "Squid Log-Format documentation", Feb. 2012.

Appendix A.  Example Log Formats

   This section provides examples of log formats implemented in existing
   CDNs, web servers, and caching proxies.

   Web servers (e.g., Apache) maintain at least one log file for logging
   accesses to content (the Access Log). They can typically be
   configured to log errors in a separate log file (the Error Log). The
   log formats can be specified in the server's configuration files.
   However, webmasters often use standard log formats to ease the log
   processing with available log analysis tools.

A.1.  W3C Common Log File (CLF) Format

   The Common Log File (CLF) format defined by the World Wide Web
   Consortium (W3C) working group is compatible with many log analysis
   tools and is supported for the Access Logs of the main web servers
   (e.g., Apache).

   According to [CLF], the common log-file format is as follows:

      remotehost rfc931 authuser [date] "request" status bytes
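
   For illustration only, a record in this format can be split into its
   seven elements with a regular expression. The sketch below is in
   Python; the names CLF_PATTERN and parse_clf are assumptions made
   here and are not defined by [CLF] or by this memo. The sample line
   is the example from [apache] that is quoted in this appendix.

```python
import re
from typing import Dict, Optional

# One named group per CLF element; "bytes" may be "-" when no body
# was returned to the client.
CLF_PATTERN = re.compile(
    r'^(?P<remotehost>\S+) (?P<rfc931>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)


def parse_clf(line: str) -> Optional[Dict[str, str]]:
    """Return the CLF elements of one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line.strip())
    return match.groupdict() if match else None


record = parse_clf(
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326'
)
```

   All values are kept as strings in this sketch; converting the date
   or the byte count is left to the log analysis tool.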
   Example (from [apache]):

      127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700]
      "GET /apache_pb.gif HTTP/1.0" 200 2326

   The fields are defined as follows [CLF]:

   +------------+------------------------------------------------------+
   | Element    | Definition                                           |
   +------------+------------------------------------------------------+
   | remotehost | Remote hostname (or IP number if DNS hostname is not |
   |            | available, or if DNSLookup is Off).                  |
   | rfc931     | The remote logname of the user.                      |
   | authuser   | The username that the user employed to authenticate  |
   |            | himself.                                             |
   | [date]     | Date and time of the request.                        |
   | "request"  | An exact copy of the request line that came from the |
   |            | client.                                              |
   | status     | The status code of the HTTP reply returned to the    |
   |            | client.                                              |
   | bytes      | The content-length of the document transferred.      |
   +------------+------------------------------------------------------+

               Table 5: Information elements in CLF format

A.2.  W3C Extended Log File (ELF) Format

   The Extended Log File (ELF) format defined by W3C extends the CLF
   with new fields. This format is supported by Microsoft IIS 4.0 and
   5.0.

   The supported fields are listed below [ELF].
1350 +------------+---------------------------------------------------+ 1351 | Element | Definition | 1352 +------------+---------------------------------------------------+ 1353 | date | Date at which transaction completed | 1354 | time | Time at which transaction completed | 1355 | time-taken | Time taken for transaction to complete in seconds | 1356 | bytes | bytes transferred | 1357 | cached | Records whether a cache hit occurred | 1358 | ip | IP address and port | 1359 | dns | DNS name | 1360 | status | Status code | 1361 | comment | Comment returned with status code | 1362 | method | Method | 1363 | uri | URI | 1364 | uri-stem | Stem portion alone of URI (omitting query) | 1365 | uri-query | Query portion alone of URI | 1366 +------------+---------------------------------------------------+ 1368 Table 6: Information elements in ELF format 1370 Some fields start with a prefix (e.g., "c-", "s-"), which explains 1371 which host (client/server/proxy) the field refers to. 1373 o Prefix Description 1375 o c- Client 1377 o s- Server 1379 o r- Remote 1381 o cs- Client to Server. 1383 o sc- Server to Client. 1385 o sr- Server to Remote Server (used by proxies) 1387 o rs- Remote Server to Server (used by proxies) 1389 Example: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs- 1390 username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status 1391 time-taken 1393 2011-11-23 15:22:01 x.x.x.x GET /file 80 y.y.y.y Mozilla/ 1394 5.0+(Windows;+U;+Windows+NT+6.1;+en-US;+rv:1.9.1.6)+Gecko/ 1395 20091201+Firefox/3.5.6+GTB6 200 0 0 2137 1397 A.3. National Center for Supercomputing Applications (NCSA) Common Log 1398 Format 1400 This format for Access Logs offers the following fields: 1402 o host rfc931 date:time "request" statuscode bytes 1404 o x.x.x.x userfoo [10/Jan/2010:21:15:05 +0500] "GET /index.html 1405 HTTP/1.0" 200 1043 1407 A.4. 
NCSA Combined Log Format

   The NCSA Combined log format is an extension of the NCSA Common log
   format with three (optional) additional fields: the referrer field,
   the user_agent field, and the cookie field.

   o  host rfc931 username date:time request statuscode bytes referrer
      user_agent cookie

   o  Example: x.x.x.x - userfoo [21/Jan/2012:12:13:56 +0500] "GET
      /index.html HTTP/1.0" 200 1043 "http://www.example.com/"
      "Mozilla/4.05 [en] (WinNT; I)" "USERID=CustomerA;IMPID=01234"

A.5.  NCSA Separate Log Format

   The NCSA Separate log format refers to a log format in which the
   information gathered is separated into three separate files. This
   way, every entry in the Access Log (in the NCSA Common log format) is
   complemented with an entry in a Referral log and another one in an
   Agent log. These three entries can be correlated easily thanks to
   the date:time value. The format of the Referral log is as follows:

   o  date:time referrer

   o  Example: [21/Jan/2012:12:13:56 +0500]
      "http://www.example.com/index.html"

   The format of the Agent log is as follows:

   o  date:time agent

   o  Example: [21/Jan/2012:12:13:56 +0500] "Microsoft Internet
      Explorer - 5.0"

A.6.  Squid 2.0 Native Log Format for Access Logs

   Squid [squid] is a popular piece of open-source software for
   transforming a Linux host into a caching proxy. Variations of the
   Squid log format are supported by some CDNs.

   The Squid common access log format is as follows:

      time elapsed remotehost code/status bytes method URL rfc931
      peerstatus/peerhost type
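
   As an illustration, the ten whitespace-separated elements of such a
   line can be mapped to named fields as sketched below in Python. The
   names SQUID_FIELDS and parse_squid_access are assumptions introduced
   here, and the sample line is made up for this sketch (it is not
   taken from the Squid documentation).

```python
from typing import Dict

# Field names in the order given by the format line above.
SQUID_FIELDS = (
    "time", "elapsed", "remotehost", "code/status", "bytes",
    "method", "URL", "rfc931", "peerstatus/peerhost", "type",
)


def parse_squid_access(line: str) -> Dict[str, str]:
    """Split one access-log line into named fields (all kept as strings)."""
    values = line.split()
    if len(values) != len(SQUID_FIELDS):
        raise ValueError("expected %d fields, got %d"
                         % (len(SQUID_FIELDS), len(values)))
    record = dict(zip(SQUID_FIELDS, values))
    # The two composite elements are split at their first "/".
    record["code"], record["status"] = (
        record.pop("code/status").split("/", 1))
    record["peerstatus"], record["peerhost"] = (
        record.pop("peerstatus/peerhost").split("/", 1))
    return record


# Hypothetical sample line, built only to exercise the parser.
sample = ("1329697770.510 120 192.0.2.10 TCP_MISS/200 2326 GET "
          "http://cdn.example.com/movie.avi - DIRECT/198.51.100.7 "
          "video/x-msvideo")
parsed = parse_squid_access(sample)
```

   In this sketch the "code" element carries the kind of result code
   that Table 3 calls the Action (e.g., "TCP_REFRESH_FAIL_HIT").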

   Squid also supports a more detailed native access log format:
   Timestamp Elapsed Client Action/Code Size Method URI Ident
   Hierarchy/From Content

   According to the Squid 2.0 documentation [squid], these fields are
   defined as follows:

   +-----------+-------------------------------------------------------+
   | Element   | Definition                                            |
   +-----------+-------------------------------------------------------+
   | time      | Unix timestamp as UTC seconds with a millisecond      |
   |           | resolution.                                           |
   | duration  | The elapsed time in milliseconds the transaction      |
   |           | busied the cache.                                     |
   | client    | The client IP address.                                |
   | address   |                                                       |
   | bytes     | The size is the amount of data delivered to the       |
   |           | client, including headers.                            |
   | request   | The request method to obtain an object.               |
   | method    |                                                       |
   | URL       | The requested URL.                                    |
   | rfc931    | May contain the ident lookups for the requesting      |
   |           | client (turned off by default)                        |
   | hierarchy | The hierarchy information provides information on how |
   | code      | the request was handled (forwarding it to another     |
   |           | cache, or requesting the content from the Origin      |
   |           | Server).                                              |
   | type      | The content type of the object as seen in the HTTP    |
   |           | reply header.                                         |
   +-----------+-------------------------------------------------------+

             Table 7: Information elements in Squid format

   Squid also uses a "store log", which covers the objects currently
   kept on disk and those that have been removed; it is typically used
   for debugging purposes.

Authors' Addresses

   Gilles Bertrand (editor)
   France Telecom - Orange
   38-40 rue du General Leclerc
   Issy les Moulineaux, 92130
   FR

   Phone: +33 1 45 29 89 46
   Email: gilles.bertrand@orange.com

   Emile Stephan
   France Telecom - Orange
   2 avenue Pierre Marzin
   Lannion F-22307
   France

   Email: emile.stephan@orange.com

   Roy Peterkofsky
   Skytide, Inc.
   One Kaiser Plaza, Suite 785
   Oakland CA 94612
   USA

   Phone: +01 510 250 4284
   Email: roy@skytide.com