idnits 2.17.1 draft-ietf-cdni-logging-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. -- The document has examples using IPv4 documentation addresses according to RFC6890, but does not use any IPv6 documentation addresses. Maybe there should be IPv6 examples, too? Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). == The document seems to contain a disclaimer for pre-RFC5378 work, but was first submitted on or after 10 November 2008. The disclaimer is usually necessary only for documents that revise or obsolete older RFCs, and that take significant amounts of text from those RFCs. If you can contact all authors of the source material and they are willing to grant the BCP78 rights to the IETF Trust, you can and should remove the disclaimer. Otherwise, the disclaimer is needed and you can ignore this comment. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (February 22, 2013) is 4074 days in the past. Is this intentional? 
-- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: 'MED' is mentioned on line 1676, but not defined == Unused Reference: 'RFC2119' is defined on line 1404, but no explicit reference was found in the text == Unused Reference: 'RFC5424' is defined on line 1407, but no explicit reference was found in the text == Outdated reference: A later version (-05) exists of draft-brandenburg-cdni-has-04 == Outdated reference: A later version (-14) exists of draft-ietf-cdni-framework-03 == Outdated reference: A later version (-17) exists of draft-ietf-cdni-requirements-04 Summary: 0 errors (**), 0 flaws (~~), 10 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Engineering Task Force G. Bertrand, Ed. 3 Internet-Draft I. Oprescu, Ed. 4 Intended status: Informational E. Stephan 5 Expires: August 26, 2013 France Telecom - Orange 6 R. Peterkofsky 7 Skytide, Inc. 8 F. Le Faucheur, Ed. 9 Cisco Systems 10 P. Grochocki 11 Orange Polska 12 February 22, 2013 14 CDNI Logging Interface 15 draft-ietf-cdni-logging-01 17 Abstract 19 This memo specifies the Logging interface between a downstream CDN 20 (dCDN) and an upstream CDN (uCDN) that are interconnected as per the 21 CDN Interconnection (CDNI) framework. First, it describes a 22 reference model for CDNI logging. Then, it specifies the actual 23 protocol for CDNI logging information exchange covering the 24 information elements as well as the transport of those elements. 26 Status of this Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 
31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current Internet- 34 Drafts is at http://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on August 26, 2013. 43 Copyright Notice 45 Copyright (c) 2013 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (http://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document. Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 This document may contain material from IETF Documents or IETF 59 Contributions published or made publicly available before November 60 10, 2008. The person(s) controlling the copyright in some of this 61 material may not have granted the IETF Trust the right to allow 62 modifications of such material outside the IETF Standards Process. 63 Without obtaining an adequate license from the person(s) controlling 64 the copyright in such materials, this document may not be modified 65 outside the IETF Standards Process, and derivative works of it may 66 not be created outside the IETF Standards Process, except to format 67 it for publication as an RFC or to translate it into languages other 68 than English. 
70 Table of Contents 72 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 73 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 74 1.2. Abbreviations . . . . . . . . . . . . . . . . . . . . . . 8 75 2. CDNI Logging Reference Model . . . . . . . . . . . . . . . . . 8 76 2.1. CDNI Logging interactions . . . . . . . . . . . . . . . . 8 77 2.2. Overall Logging Chain . . . . . . . . . . . . . . . . . . 12 78 2.2.1. Logging Generation and During-Generation 79 Aggregation . . . . . . . . . . . . . . . . . . . . . 13 80 2.2.2. Logging Collection . . . . . . . . . . . . . . . . . . 14 81 2.2.3. Logging Filtering . . . . . . . . . . . . . . . . . . 14 82 2.2.4. Logging Rectification and Post-Generation 83 Aggregation . . . . . . . . . . . . . . . . . . . . . 15 84 2.2.5. Log-Consuming Applications . . . . . . . . . . . . . . 15 85 2.2.5.1. Maintenance/Debugging . . . . . . . . . . . . . . 15 86 2.2.5.2. Accounting . . . . . . . . . . . . . . . . . . . . 16 87 2.2.5.3. Analytics and Reporting . . . . . . . . . . . . . 16 88 2.2.5.4. Security . . . . . . . . . . . . . . . . . . . . . 16 89 2.2.5.5. Legal Logging Duties . . . . . . . . . . . . . . . 16 90 2.2.5.6. Notions common to multiple Log Consuming 91 Applications . . . . . . . . . . . . . . . . . . . 16 92 3. CDNI Logging Transport Requirements . . . . . . . . . . . . . 18 93 3.1. Timeliness . . . . . . . . . . . . . . . . . . . . . . . . 19 94 3.2. Reliability . . . . . . . . . . . . . . . . . . . . . . . 19 95 3.3. Security . . . . . . . . . . . . . . . . . . . . . . . . . 19 96 3.4. Scalability . . . . . . . . . . . . . . . . . . . . . . . 19 97 3.5. Consistency between CDNI Logging and CDN Logging . . . . . 20 98 3.6. Dispatching/Filtering . . . . . . . . . . . . . . . . . . 20 99 4. CDNI Logging Information Structure and Transport . . . . . . . 20 100 5. CDNI Logging Fields . . . . . . . . . . . . . . . . . . . . . 22 101 5.1. Semantics of CDNI Logging Fields . . . . . . . . . . . . . 
22 102 5.2. Syntax of CDNI Logging Fields . . . . . . . . . . . . . . 26 103 6. CDNI Logging Records . . . . . . . . . . . . . . . . . . . . . 27 104 6.1. Content Delivery . . . . . . . . . . . . . . . . . . . . . 27 105 6.2. Content Invalidation and Purging . . . . . . . . . . . . . 29 106 6.3. Request Routing . . . . . . . . . . . . . . . . . . . . . 29 107 6.4. Logging Extensibility . . . . . . . . . . . . . . . . . . 29 108 7. CDNI Logging File Format . . . . . . . . . . . . . . . . . . . 29 109 7.1. Logging Files . . . . . . . . . . . . . . . . . . . . . . 29 110 7.2. File Format . . . . . . . . . . . . . . . . . . . . . . . 29 111 7.2.1. Headers . . . . . . . . . . . . . . . . . . . . . . . 30 112 7.2.2. Body (Logging Records) Format . . . . . . . . . . . . 31 113 7.2.3. Footer Format . . . . . . . . . . . . . . . . . . . . 31 114 8. CDNI Logging File Transport Protocol . . . . . . . . . . . . . 31 115 9. Open Issues . . . . . . . . . . . . . . . . . . . . . . . . . 32 116 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 32 117 11. Security Considerations . . . . . . . . . . . . . . . . . . . 32 118 11.1. Privacy . . . . . . . . . . . . . . . . . . . . . . . . . 33 119 11.2. Non Repudiation . . . . . . . . . . . . . . . . . . . . . 33 120 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 33 121 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 33 122 13.1. Normative References . . . . . . . . . . . . . . . . . . . 33 123 13.2. Informative References . . . . . . . . . . . . . . . . . . 33 124 Appendix A. Examples Log Format . . . . . . . . . . . . . . . . . 34 125 A.1. W3C Common Log File (CLF) Format . . . . . . . . . . . . . 35 126 A.2. W3C Extended Log File (ELF) Format . . . . . . . . . . . . 35 127 A.3. National Center for Supercomputing Applications (NCSA) 128 Common Log Format . . . . . . . . . . . . . . . . . . . . 37 129 A.4. NCSA Combined Log Format . . . . . . . . . . . . . . . . . 37 130 A.5. 
NCSA Separate Log Format . . . . . . . . . . . . . . . . . 37 131 A.6. Squid 2.0 Native Log Format for Access Logs . . . . . . . 37 132 Appendix B. Requirements . . . . . . . . . . . . . . . . . . . . 38 133 B.1. Additional Requirements . . . . . . . . . . . . . . . . . 38 134 B.2. Compliancy with Requirements draft . . . . . . . . . . . . 39 135 Appendix C. Analysis of candidate protocols for Logging 136 Transport . . . . . . . . . . . . . . . . . . . . . . 39 137 C.1. Syslog . . . . . . . . . . . . . . . . . . . . . . . . . . 40 138 C.2. XMPP . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 139 C.3. SNMP . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 140 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 40 142 1. Introduction 144 This memo specifies the Logging interface between a downstream CDN 145 (dCDN) and an upstream CDN (uCDN). First, it describes a reference 146 model for CDNI logging. Then, it specifies the actual protocol for 147 CDNI logging information exchange covering the information elements 148 as well as the transport of those elements. 150 The reader should be familiar with the work of the CDNI WG: 152 o CDNI problem statement [RFC6707] and framework 153 [I-D.ietf-cdni-framework] identify a Logging interface, 155 o Section 7 of [I-D.ietf-cdni-requirements] specifies a set of 156 requirements for Logging, 158 o [RFC6770] outlines real world use-cases for interconnecting CDNs. 159 These use cases require the exchange of Logging information 160 between the dCDN and the uCDN. 162 As stated in [RFC6707], "the CDNI Logging interface enables details 163 of logs or events to be exchanged between interconnected CDNs". 
165 The present document describes: 167 o The CDNI Logging reference model (Section 2), 169 o The CDNI Logging information structure and Transport (Section 4), 171 o The CDNI Logging Fields (Section 5), 173 o The CDNI Logging Records (Section 6), 175 o The CDNI Logging File format (Section 7), 177 o The CDNI Logging File Transport Protocol (Section 8), 179 In the Appendices, the document provides: 181 o A list of identified requirements (Appendix B.1), which should be 182 considered for inclusion in [I-D.ietf-cdni-requirements], 184 1.1. Terminology 186 In this document, the first letter of each CDNI-specific term is 187 capitalized. We adopt the terminology described in [RFC6707] and 188 [I-D.ietf-cdni-framework], and extend it with the additional terms 189 defined below. 191 For clarity, we use the word "Log" only for referring to internal CDN 192 logs and we use the word "Logging" for any inter-CDN information 193 exchange and processing operations related to CDNI Logging interface. 194 Log and Logging formats may be different. 196 CDN Logging information: logging information generated and collected 197 within a CDN 199 CDNI Logging information: logging information exchanged across CDNs 200 using the CDNI Logging Interface 202 Logging information: logging information generated and collected 203 within a CDN or obtained from another CDN using the CDNI Logging 204 Interface 206 CDNI Logging Field: an atomic element of information that can be 207 included in a CDNI Logging Record. The time an event/task started, 208 the IP address of an End user to whom content was delivered, and the 209 URI of the content delivered are examples of CDNI Logging Fields. 211 CDNI Logging Record: an information record providing information 212 about a specific event. This comprises a collection of CDNI Logging 213 Fields. 215 Separator Character: a specific character used to enable the parsing 216 of Logging Records. 
This character separates the Logging Fields that 217 compose a Logging Record. 219 CDNI Logging File: a file containing CDNI Logging Records, as well as 220 additional information facilitating the processing of the CDNI 221 Logging Records. 223 CDN Reporting: the process of providing the relevant information that 224 will be used to create a formatted content delivery report provided 225 to the CSP in deferred time. Such information typically includes 226 aggregated data that can cover a large period of time (e.g., from 227 hours to several months). Uses of Reporting include the collection 228 of charging data related to CDN services and the computation of Key 229 Performance Indicators (KPIs). 231 CDN Monitoring: the process of providing content delivery information 232 in real-time. Monitoring typically includes data in real time to 233 provide visibility of the deliveries in progress, for service 234 operation purposes. It presents a view of the global health of the 235 services as well as information on usage and performance, for network 236 services supervision and operation management. In particular, 237 monitoring data can be used to generate alarms. 239 End-User experience management: study of Logging data using 240 statistical analysis to discover, understand, and predict user 241 behavior patterns. 243 Class-of-requests: A Class-of-requests identifies a set of content 244 Requests, related to a specific CSP, received from clients in a given 245 footprint and sharing common properties. 
These properties include: 247 o Any header, URL parameter, query parameter of an HTTP (or RTMP) 248 content request 250 o Any header, or sub-domain of the FQDN of a DNS lookup request 252 Examples: 254 o Class-of-Requests = all the requests that include the HTTP header 255 "User-Agent: Mozilla/5.0" related to CSP 256 "http://*.cdn.example.com" from AS3215 258 o Class-of-Requests = all the DNS requests from anywhere and related 259 to CSP "cdn*.example.com" 261 Delivery Service: A Delivery Service is defined by a set of Class-of- 262 Requests and a list of parameters that apply to all these Class-of- 263 Requests (logging format, delivery quality/capabilities 264 requirements...) 266 Service Agreement: A service agreement is defined by a uCDN 267 identifier, a dCDN identifier, a set of Delivery Services and a list 268 of parameters that apply to the Service Agreement. 270 Once a Service Agreement is agreed between the administrative 271 entities managing the CDNs to be interconnected, the upstream CDN and 272 the downstream CDN of the CDNI interconnection must be configured 273 according to this agreed Service Agreement. For instance, a given 274 uCDN (uCDN1) may request a given dCDN (dCDN1) to configure one 275 Delivery Service for handling requests for HTTP Adaptive streaming 276 videos delegated by uCDN1 and related to a specific CSP (CSP1) and 277 another one for handling requests for static pictures delegated by 278 uCDN1 and related to CSP1. These Delivery services would belong to 279 the Service Agreement between uCDN1 and dCDN1 for CSP1. In this 280 simple example, uCDN1 may request dCDN1 to include Delivery Service 281 information in its CDNI Logging, to help uCDN1 to provide relevant 282 reports to CSP1. 284 1.2. 
Abbreviations 286 o API: Application Programming Interface 288 o CCID: Content Collection Identifier 290 o CDN: Content Delivery Network 292 o CDNP: Content Delivery Network Provider 294 o CoDR: Content Delivery Record 296 o CSP: Content Service Provider 298 o DASH: Dynamic Adaptive Streaming over HTTP 300 o dCDN: downstream CDN 302 o FTP: File Transfer Protocol 304 o HAS: HTTP Adaptive Streaming 306 o KPI: Key Performance Indicator 308 o PVR: Personal Video Recorder 310 o SID: Session Identifier 312 o SFTP: SSH File Transfer Protocol 314 o SNMP: Simple Network Management Protocol 316 o uCDN: upstream CDN 318 2. CDNI Logging Reference Model 320 2.1. CDNI Logging interactions 322 The CDNI logging reference model between a given uCDN and a given 323 dCDN involves the following interactions: 325 o customization by the uCDN of the CDNI logging information to be 326 provided by the dCDN to the uCDN (e.g., control of which logging 327 fields are to be communicated to the uCDN for a given task 328 performed by the dCDN, control of which types of events are to be 329 logged). The dCDN takes this CDNI logging 330 customization information into account to determine what logging information to 331 provide to the uCDN, but it may or may not use 332 this CDNI logging customization information to influence what CDN 333 logging information is generated and collected within the 334 dCDN (e.g., even if the uCDN requests a restricted subset of the 335 logging information, the dCDN may elect to generate a broader set 336 of logging information). The mechanism to support the 337 customization by the uCDN of CDNI Logging information is outside 338 the scope of this document and left for further study. We note 339 that the CDNI Control interface or the CDNI Metadata interface 340 appears to be a candidate on which to build such 341 a customization mechanism. 
Before such a mechanism is available, 342 the uCDN and dCDN are expected to agree off-line on what CDNI 343 logging information is to be provided by the dCDN to the uCDN, and to rely on 344 management plane actions to configure the CDNI Logging functions 345 to generate this information in the dCDN and to expect it in the 346 uCDN. 348 o generation and collection by the dCDN of logging information 349 related to the completion of any task performed by the dCDN on 350 behalf of the uCDN (e.g., delivery of the content to an end user) 351 or related to events happening in the dCDN that are relevant to 352 the uCDN (e.g., failures or unavailability in the dCDN). This takes 353 place within the dCDN and does not directly involve CDNI 354 interfaces. 356 o communication by the dCDN to the uCDN of the logging information 357 collected by the dCDN relevant to the uCDN. This is supported by 358 the CDNI Logging interface and is in the scope of the present 359 document. For example, the uCDN may use this logging information 360 to charge the CSP, to perform analytics and monitoring for 361 operational reasons, to provide analytics and monitoring views on 362 its content delivery to the CSP, or to perform troubleshooting. 364 o customization by the dCDN of the logging to be performed by the 365 uCDN on behalf of the dCDN. The mechanism to support the 366 customization by the dCDN of CDNI Logging information is outside 367 the scope of this document and left for further study. 369 o generation and collection by the uCDN of logging information 370 related to the completion of any task performed by the uCDN on 371 behalf of the dCDN (e.g., serving of content by the uCDN to the dCDN for 372 acquisition purposes by the dCDN) or related to events happening in 373 the uCDN that are relevant to the dCDN. This takes place within 374 the uCDN and does not directly involve CDNI interfaces. 376 o communication by the uCDN to the dCDN of the logging information 377 collected by the uCDN relevant to the dCDN. 
For example, the dCDN 378 might benefit from this information for security 379 auditing or content acquisition troubleshooting. This is outside 380 the scope of this document and left for further study. 382 Figure 1 provides an example of CDNI Logging interactions (focusing 383 only on the interactions that are in the scope of this document) in a 384 particular scenario where four CDNs are involved in the delivery of 385 content from a given CSP: the uCDN has a CDNI interconnection with 386 dCDN-1 and dCDN-2. In turn, dCDN-2 has a CDNI interconnection with 387 dCDN-3. In this example, uCDN, dCDN-1, dCDN-2 and dCDN-3 all 388 participate in the delivery of content for the CSP, and 389 the CDNI Logging interface enables the uCDN to obtain logging 390 information from all the dCDNs involved in the delivery. In the 391 example, uCDN uses the Logging data: 393 o to analyze the performance of the delivery operated by the dCDNs 394 and to adjust its operations (e.g., request routing) as 395 appropriate, 397 o to provide reporting (non real-time) and monitoring (real-time) 398 information to CSP. 400 For instance, uCDN merges Logging data, extracts relevant KPIs, and 401 presents a formatted report to the CSP, in addition to a bill for the 402 content delivered by uCDN itself or by its dCDNs on its behalf. uCDN 403 may also provide Logging data as raw log files to the CSP, so that 404 the CSP can use its own logging analysis tools. 406 +-----+ 407 | CSP | 408 +-----+ 409 ^ Reporting and monitoring data 410 * Billing 411 ,--*--. 412 Logging ,-' `-. 413 Data =>( uCDN )<= Logging 414 // `-. _,-' \\ Data 415 || `-'-'-' || 416 ,-----. ,-----. 417 ,-' `-. ,-' `-. 418 ( dCDN-1 ) ( dCDN-2 )<== Logging 419 `-. ,-' `-. _,-' \\ Data 420 `--'--' `--'-' || 421 ,-----. 422 ,' `-. 423 ( dCDN-3 ) 424 `. 
,-' 425 `--'--' 427 ===> CDNI Logging Interface 428 ***> outside the scope of CDNI 430 Figure 1: Interactions in CDNI Logging Reference Model 432 A dCDN (e.g., dCDN-2) integrates the relevant logging information 433 obtained from its dCDNs (e.g., dCDN-3) in the logging information 434 that it provides to the uCDN, so that the uCDN ultimately obtains all 435 logging information relevant to a CSP for which it acts as the 436 authoritative CDN. 438 Note that the format of Logging information that a CDN provides over 439 the CDNI interface might be different from the one that the CDN uses 440 internally. In this case, the CDN needs to reformat the Logging 441 information before it provides this information to the other CDN over 442 the CDNI Logging interface. Similarly, a CDN might reformat the 443 Logging data that it receives over the CDNI Logging interface before 444 injecting it into its log-consuming applications or before providing 445 some of this logging information to the CSP. Such reformatting 446 operations introduce latency in the logging distribution chain and 447 impose a processing burden. Therefore, there are benefits in 448 specifying CDNI Logging formats that are suitable for use inside CDNs 449 and are also close to the CDN Log formats commonly used in CDNs 450 today. 452 2.2. Overall Logging Chain 454 This section discusses the overall logging chain within and across 455 CDNs to clarify how CDN Logging information is expected to fit in 456 this overall chain. Figure 2 illustrates the overall logging chain 457 within the dCDN, across CDNs using the CDNI Logging interface and 458 within the uCDN. Note that the logging chain illustrated in the 459 figure is only indicative and varies depending on the 460 specific environment. For example, there may be more or fewer 461 instances of each entity (e.g., there may be four Log Consuming 462 Applications in a given CDN). 
As another example, there may be one 463 instance of Rectification process per Log Consuming Application 464 instead of a shared one. 466 Log Consuming Log Consuming 467 App App 468 /\ /\ 469 | | 470 Rectification-------- 471 /\ 472 | 473 Filtering 474 /\ 475 | 476 Collection uCDN 477 /\ /\ 478 | | 479 | Generation 480 | 481 CDNI Logging --------------------------------------------- 482 exchange 483 /\ Log Consuming Log Consuming 484 | App App 485 | /\ /\ 486 | | | 487 Rectification Rectification--------- 488 /\ /\ 489 | | 490 Filtering 491 /\ 492 | 493 Collection dCDN 494 /\ /\ 495 | | 496 Generation Generation 497 Figure 2: CDNI Logging in the overall Logging Chain 499 The following subsections describe each of the processes potentially 500 involved in the logging chain of Figure 2. 502 2.2.1. Logging Generation and During-Generation Aggregation 504 CDNs typically generate logging information for all significant task 505 completions, events, and failures. Logs are typically generated by 506 many devices in the CDN including the surrogates, the request routing 507 system, and the control system. 509 The amount of Logging information generated can be huge. Therefore, 510 during contract negotiations, interconnected CDNs often agree on a 511 Logging retention duration, and optionally, on a maximum size of the 512 Logging data that the dCDN must keep. If this size is exceeded, the 513 dCDN must alert the uCDN but may not keep more Logs for the 514 considered time period. In addition, CDNs may aggregate logs and 515 transmit only summaries for some categories of operations instead of 516 the full Logging data. Note that such aggregation leads to an 517 information loss, which may be problematic for some usages of Logging 518 (e.g., debugging). 520 [I-D.brandenburg-cdni-has] discusses logging for HTTP Adaptive 521 Streaming (HAS). 
In accordance with the recommendations articulated 522 there, it is expected that a surrogate will generate separate logging 523 information for delivery of each chunk of HAS content. This ensures 524 that separate logging information can then be provided to 525 interconnected CDNs over the CDNI Logging interface. Still in line 526 with the recommendations of [I-D.brandenburg-cdni-has], the logging 527 information for per-chunk delivery may include some information (a 528 Content Collection IDentifier and a Session IDentifier as discussed 529 in Section 5) intended to facilitate subsequent post-generation 530 aggregation of per-chunk logs into per-session logs. Note that a CDN 531 may also elect to generate aggregate per-session logs when performing 532 HAS delivery, but this needs to be in addition to, and not instead 533 of, the per-chunk delivery logs. We note that this may be revisited 534 in future versions of this document. 536 Note that in the case of non real-time logging, the trigger of the 537 transmission or generation of the logging file appears to be a 538 synchronous process from a protocol standpoint. The implementation 539 algorithm can choose to enforce a maximum size for the logging file 540 beyond which the transmission is automatically triggered (and thus 541 allows for an asynchronous transmission process). 543 2.2.2. Logging Collection 545 This is the process that continuously collects logs generated by the 546 log-generating entities within a CDN. 548 In a CDNI environment, in addition to collecting logging information 549 from log-generating entities within the local CDN, the Collection 550 process also collects logging information provided by another CDN, or 551 other CDNs, through the CDNI Logging interface. 
This is illustrated 552 in Figure 2 where we see that the Collection process of the uCDN 553 collects logging information from log-generating entities within the 554 uCDN as well as logging information coming from the dCDN 555 through the CDNI Logging interface. 557 2.2.3. Logging Filtering 559 A CDN may need to present different subsets of the 560 collected logging information to different log-consuming applications. 561 This is achieved by the Filtering process. 563 In particular, the Filtering process can also select the subset 564 of information that needs to be provided to a given interconnected 565 CDN. For example, the filtering process in the dCDN can be used to 566 ensure that only the logging information related to tasks performed 567 on behalf of a given uCDN is made available to that uCDN (thereby 568 filtering all the logging information related to deliveries by the 569 dCDN of content for its own CSPs). Similarly, the Filtering process 570 may filter or partially mask some fields, for example, to protect End 571 Users' privacy when communicating CDNI Logging information to another 572 CDN. Filtering of logging information prior to communication of this 573 information to other CDNs via the CDNI Logging interface requires 574 that the downstream CDN can recognize the set of log records that 575 relate to each interconnected CDN. 577 The CDN will also filter some internal-scope information such as 578 information related to its internal alarms (security, failures, load, 579 etc.). 581 In some use cases described in [RFC6770], the interconnected CDNs do 582 not want to disclose details on their internal topology. The 583 filtering process can then also filter confidential data on the 584 dCDNs' topology (number of servers, location, etc.). In particular, 585 information about the requests served by every Surrogate may be 586 confidential. 
Therefore, the Logging information must be protected 587 so that data such as Surrogates' hostnames is not disclosed to the 588 uCDN. In the "Inter-Affiliates Interconnection" use case, this 589 information may be disclosed to the uCDN because both the dCDN and 590 the uCDN are operated by entities of the same group. 592 2.2.4. Logging Rectification and Post-Generation Aggregation 594 If Logging is generated periodically, it is important that the 595 sessions that start in one Logging period and end in another are 596 correctly reported. If they are reported in the starting period, 597 then the Logging of this period will be available only after the end 598 of the session, which delays the Logging generation. 600 A Logging rectification/update mechanism could be useful to reach a 601 good trade-off between the Logging generation delay and the Logging 602 accuracy. Depending on the selected Logging protocol(s), such a 603 mechanism may be invaluable for real-time Logging, which must be 604 provided rapidly and cannot wait for the end of operations in 605 progress. 607 In the presence of HAS, some log-consuming applications can benefit 608 from aggregate per-session logs. For example, for analytics, per- 609 session logs allow display of session-related trends, which are much 610 more meaningful for some types of analysis than chunk-related trends. 611 In the case where the log-generating entities have generated during- 612 generation aggregate logs, those can be used by the applications. In 613 the case where aggregate logs have not been generated, the 614 Rectification process can be extended with a Post-Generation 615 Aggregation process that generates per-session logs from the per- 616 chunk logs, possibly leveraging the information included in the per- 617 chunk logs for that purpose (Content Collection IDentifier and 618 Session IDentifier). 
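As a purely illustrative sketch (not part of the interface specified in this memo), the Post-Generation Aggregation described above can be pictured as a grouping of per-chunk records by the pair (Content Collection IDentifier, Session IDentifier). The record layout, the field names (ccid, sid, start_time, duration, bytes_sent) and the function name aggregate_sessions are assumptions made for this example only; the actual CDNI Logging Fields are those discussed in Section 5.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRecord:
    # Hypothetical per-chunk record; field names are assumptions for
    # illustration and are not CDNI Logging Fields defined by this memo.
    ccid: str          # Content Collection IDentifier
    sid: str           # Session IDentifier
    start_time: float  # time at which delivery of the chunk started (seconds)
    duration: float    # time taken to deliver the chunk (seconds)
    bytes_sent: int    # bytes delivered for the chunk


def aggregate_sessions(chunks):
    """Build one per-session summary from per-chunk records.

    Records sharing the same (ccid, sid) pair are treated as one session.
    """
    groups = defaultdict(list)
    for chunk in chunks:
        groups[(chunk.ccid, chunk.sid)].append(chunk)

    sessions = {}
    for key, records in groups.items():
        sessions[key] = {
            "chunks": len(records),
            "bytes_sent": sum(r.bytes_sent for r in records),
            "start_time": min(r.start_time for r in records),
            "end_time": max(r.start_time + r.duration for r in records),
        }
    return sessions
```

The per-chunk records remain the authoritative input in this sketch; the per-session summaries are derived from them and do not replace them.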
However, in accordance with 619 [I-D.brandenburg-cdni-has], this document does not define exchange of 620 such aggregate logs on the CDNI Logging interface. We note that this 621 may be revisited in future versions of this document. 623 2.2.5. Log-Consuming Applications 625 2.2.5.1. Maintenance/Debugging 627 Logging is useful to permit the detection (and limit the risk) of 628 content delivery failures. In particular, Logging facilitates the 629 resolution of configuration issues. 631 To detect faults, Logging must enable the reporting of the success 632 or failure of any CDN operation, such as request redirection or content 633 acquisition. The uCDN can summarize such information into KPIs. 634 For instance, the Logging format should allow the computation of the 635 number of times during a given epoch that content delivery related to 636 a specific service succeeds or fails. 638 Logging enables the CDN providers to identify and troubleshoot 639 performance degradations. In particular, Logging enables the 640 communication of traffic data (e.g., the amount of traffic that has 641 been forwarded by a dCDN on behalf of a uCDN over a given period of 642 time), which is particularly useful for CDN and network planning 643 operations. 645 2.2.5.2. Accounting 647 Logging is essential for accounting, to permit inter-CDN billing and 648 CSP billing by uCDNs. For instance, Logging enables the uCDN to 649 check the total amount of traffic delivered by every dCDN and for 650 every Delivery Service, as well as the associated bandwidth usage 651 (e.g., peak, 95th percentile), and the maximum number of simultaneous 652 sessions over a given period of time. 654 2.2.5.3. Analytics and Reporting 656 The goal of analytics is to gather any relevant information to track 657 audience, analyze user behavior, and monitor the performance and 658 quality of content delivery. 
For instance, Logging enables the CDN 659 providers to report on content consumption (e.g., delivered sessions 660 per content) in a specific geographic area. 662 The goal of reporting is to gather any relevant information to 663 monitor the performance and quality of content delivery and allow 664 detection of delivery issues. For instance, reporting could track 665 the average delivery throughput experienced by End-Users in a given 666 region for a specific CSP or content set over a period of time. 668 2.2.5.4. Security 670 The goal of security is to prevent and monitor unauthorized access, 671 misuse, modification, and denial of access of a service. A set of 672 information is logged for security purposes. In particular, a record 673 of access to content is usually collected to permit the CSP to detect 674 infringements of content delivery policies and other abnormal End 675 User behaviors. 677 2.2.5.5. Legal Logging Duties 679 Depending on the country considered, the CDNs may have to retain 680 specific Logging information during a legal retention period, to 681 comply with judicial requisitions. 683 2.2.5.6. Notions common to multiple Log Consuming Applications 684 2.2.5.6.1. Logging Information Views 686 Within a given log-consuming application, different views may be 687 provided to different users depending on privacy, business, and 688 scalability constraints. 690 For example, an analytics tool run by the uCDN can provide one view 691 to an uCDN operator that exploits all the logging information 692 available to the uCDN, while the tool may provide a different view to 693 each CSP exploiting only the logging information related to the 694 content of the given CSP. 696 As another example, maintenance and debugging tools may provide 697 different views to different CDN operators, based on their 698 operational role. 700 2.2.5.6.2. 
Key Performance Indicators (KPIs) 702 This section presents, for explanatory purposes, a non-exhaustive 703 list of Key Performance Indicators (KPIs) that can be extracted/ 704 produced from logs. 706 Multiple log-consuming applications, such as analytics, monitoring, 707 and maintenance applications, often compute and track such KPIs. 709 In a CDNI environment, depending on the situation, these KPIs may be 710 computed by the uCDN or by the dCDN. But it is usually the uCDN that 711 computes KPIs, because uCDN and dCDN may have different definitions 712 of the KPIs and the computation of some KPIs requires a vision of all 713 the deliveries performed by the uCDN and all its dCDNs. 715 Here is a list of important examples of KPIs: 717 o Number of delivery requests received from End-Users in a given 718 region for each piece of content, during a given period of time 719 (e.g., hour/day/week/month) 721 o Percentage of delivery successes/failures among the aforementioned 722 requests 724 o Number of failures listed by failure type (e.g., HTTP error code) 725 for requests received from End Users in a given region and for 726 each piece of content, during a given period of time (e.g., hour/ 727 day/week/month) 729 o Number and cause of premature delivery termination for End Users 730 in a given region and for each piece of content, during a given 731 period of time (e.g., hour/day/week/month) 733 o Maximum and mean number of simultaneous sessions established by 734 End Users in a given region, for a given Delivery Service, and 735 during a given period of time (e.g., hour/day/week/month) 737 o Volume of traffic delivered for sessions established by End Users 738 in a given region, for a given Delivery Service, and during a 739 given period of time (e.g., hour/day/week/month) 741 o Maximum, mean, and minimum delivery throughput for sessions 742 established by End Users in a given region, for a given Delivery 743 Service, and during a given period of time (e.g., 
hour/day/week/ 744 month) 746 o Cache-hit and byte-hit ratios for requests received from End Users 747 in a given region for each piece of content, during a given period 748 of time (e.g., hour/day/week/month) 750 o Top 10 of the most popularly requested content (during a given 751 day/week/month), 753 o Terminal type (mobile, PC, STB, if this information can be 754 acquired from the browser type header, for example). 756 Additional KPIs can be computed from other sources of information 757 than the Logging, for instance, data collected by a content portal or 758 by specific client-side APIs. Such KPIs are out of scope for the 759 present memo. 761 The KPIs used depend strongly on the considered log-consuming 762 application -- the CDN operator may be interested in different 763 metrics than the CSP is. In particular, CDN operators are often 764 interested in delivery and acquisition performance KPIs, information 765 related to Surrogates' performance, caching information to evaluate 766 the cache-hit ratio, information about the delivered file size to 767 compute the volume of content delivered during peak hour, etc. 769 Some of the KPIs, for instance those providing an instantaneous 770 vision of the active sessions for a given CSP's content, are useful 771 essentially if they are provided in real-time. By contrast, some 772 other KPIs, such as the one averaged on a long period of time, can be 773 provided in non-real time. 775 3. CDNI Logging Transport Requirements 776 3.1. Timeliness 778 Some applications consuming CDNI Logging information, such as 779 accounting or trend analytics, only require logging information to be 780 available with a timeliness of the order of a day or the hour. This 781 document focuses on addressing this requirement. 783 Some applications consuming CDNI Logging information, such as real- 784 time analytics, require logging information to be available in real- 785 time (i.e. of the order of a second after the corresponding event). 
786 This document leaves this requirement out of scope. 788 3.2. Reliability 790 CDNI logging information must be transmitted reliably. The transport 791 protocol should contain an anti-replay mechanism. 793 3.3. Security 795 CDNI logging information exchange must allow authentication, 796 integrity protection, and confidentiality protection. Also, a non- 797 repudiation mechanism is mandatory; the transport protocol should 798 support it. 800 3.4. Scalability 802 CDNI logging information exchange must support large scale 803 information exchange, particularly so in the presence of HTTP 804 Adaptive Streaming. 806 For example, if we consider a client pulling HTTP Progressive 807 Download content with an average duration of 10 minutes, this 808 represents 1/600 CDNI delivery Logging Records per second. If we 809 assume the dCDN is simultaneously serving 100,000 such clients on 810 behalf of the uCDN, the dCDN will be generating 167 Logging Records 811 per second to be communicated to the uCDN over the CDNI Logging 812 interface. Or equivalently, if we assume an average delivery rate of 813 2Mb/s, the dCDN generates 0.83 CDNI Logging Records per second for 814 every Gb/s of streaming on behalf of the uCDN. 816 As a second example, if we consider a client pulling HAS content and 817 receiving a video chunk every 2 seconds, a separate audio chunk 818 every 2 seconds and a refreshed manifest every 10 seconds, this 819 represents 1.1 delivery Logging Records per second. If we assume the 820 dCDN is simultaneously serving 100,000 such clients on behalf of the 821 uCDN, the dCDN will be generating 110,000 Logging Records per second 822 to be communicated to the uCDN over the CDNI Logging interface. Or 823 equivalently, if we assume an average delivery rate of 2Mb/s, the 824 dCDN generates 550 CDNI Logging Records per second for every Gb/s of 825 streaming on behalf of the uCDN. 827 3.5.
Consistency between CDNI Logging and CDN Logging 829 There are benefits in using a CDNI logging format as close as 830 possible to the intra-CDN logging formats commonly used in CDNs today, in 831 order to minimize systematic translation at the CDN/CDNI boundary. 833 3.6. Dispatching/Filtering 835 When a CDN is acting as a dCDN for multiple uCDNs, the dCDN needs to 836 dispatch each CDNI Logging Record to the uCDN that redirected the 837 corresponding request. The CDNI Logging format needs to allow, and 838 possibly facilitate, such dispatching. 840 4. CDNI Logging Information Structure and Transport 842 As defined in Section 1.1, a CDNI Logging Field is an atomic 843 logging information element and a CDNI Logging Record is a collection 844 of CDNI Logging Fields containing all logging information 845 corresponding to a single logging event. 847 This document defines non-real-time transport of CDNI Logging 848 information over the CDNI interface. For such non-real-time 849 transport, this document defines a third level of structure, the 850 CDNI Logging File, which is a collection of CDNI Logging Records. 851 This structure is described in Figure 3. This document then 852 specifies how to transport such CDNI Logging Files across 853 interconnected CDNs. We observe that this approach can be tuned in a 854 real deployment to achieve near-real time exchange of CDNI Logging 855 information, e.g., by increasing the frequency of logging file 856 creation and distribution throughout the Logging chain, but it is not 857 expected that this approach can support real time transport (e.g., 858 sub-second) of CDNI logging information.
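As an illustration only, the three-level structure defined in this section (Logging Field, Logging Record, Logging File) can be sketched as nested data types. The class and attribute names below are assumptions made for this sketch; they are not defined by this document, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LoggingRecord:
    """One logged event: a collection of Logging Fields (name -> value)."""
    fields: Dict[str, str]


@dataclass
class LoggingFile:
    """A collection of Logging Records grouped for non-real-time transport."""
    records: List[LoggingRecord] = field(default_factory=list)

    def add(self, record: LoggingRecord) -> None:
        """Append one Logging Record to the file."""
        self.records.append(record)


# Hypothetical sample values, using documentation addresses.
log_file = LoggingFile()
log_file.add(LoggingRecord({"Client-IP": "203.0.113.1", "Status": "200"}))
log_file.add(LoggingRecord({"Client-IP": "203.0.113.1", "Status": "404"}))
```

In such a model, creating files more frequently simply means closing a LoggingFile after fewer records, which is the tuning knob the paragraph above mentions for near-real-time exchange.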
860 +------------------------------------------------------+ 861 |CDNI Logging File | 862 | | 863 | +--------------------------------------------------+ | 864 | |CDNI Logging Record | | 865 | | +-------------+ +-------------+ +-------------+ | | 866 | | |CDNI Logging | |CDNI Logging | |CDNI Logging | | | 867 | | | Field | | Field | | Field | | | 868 | | +-------------+ +-------------+ +-------------+ | | 869 | +--------------------------------------------------+ | 870 | | 871 | +--------------------------------------------------+ | 872 | |CDNI Logging Record | | 873 | | +-------------+ +-------------+ +-------------+ | | 874 | | |CDNI Logging | |CDNI Logging | |CDNI Logging | | | 875 | | | Field | | Field | | Field | | | 876 | | +-------------+ +-------------+ +-------------+ | | 877 | +--------------------------------------------------+ | 878 | | 879 | +--------------------------------------------------+ | 880 | |CDNI Logging Record | | 881 | | +-------------+ +-------------+ +-------------+ | | 882 | | |CDNI Logging | |CDNI Logging | |CDNI Logging | | | 883 | | | Field | | Field | | Field | | | 884 | | +-------------+ +-------------+ +-------------+ | | 885 | +--------------------------------------------------+ | 886 +------------------------------------------------------+ 888 Figure 3: Structure of Logging Files 890 It is expected that future version of this document will also specify 891 real time transport of CDNI Logging information over the CDNI 892 interface. We note that this might involve direct transport of CDNI 893 Logging Records without prior grouping into a file structure to avoid 894 the latency associated with creating and transporting such a file 895 structure throughout the logging chain. 897 The semantics and encoding of the CDNI Logging fields are specified 898 in Section 5. The semantics and encoding of CDNI Records are 899 specified in Section 6. The CDNI Logging File format is specified in 900 Section 7. 
The protocol for transport of CDNI Logging File is 901 specified in Section 8. 903 5. CDNI Logging Fields 905 Existing CDNs Logging functions collect and consolidate logs 906 performed by their Surrogates. Surrogates usually store the logs 907 using a format derived from Web servers' and caching proxies' log 908 standards such as W3C, NCSA [ELF] [CLF], or Squid format [squid]. In 909 practice, these formats are adapted to cope with CDN specifics. 910 Appendix A presents examples of commonly used log formats. 912 5.1. Semantics of CDNI Logging Fields 914 This section specifies the semantics of the CDNI Logging Fields. The 915 specific subset of CDNI Logging fields that can be found in each type 916 of Logging Record is specified in Section 6. 918 The semantics of the CDNI Logging Fields are specified in Table 1. 920 +--------------+----------------------------------------------------+ 921 | Name | Description | 922 +--------------+----------------------------------------------------+ 923 | Start-time | A start date and time associated with a logged | 924 | | event; for instance, the time at which a Surrogate | 925 | | received a content delivery request or the time at | 926 | | which an origin server received a content | 927 | | acquisition request. | 928 | End-time | An end date and time associated with a logged | 929 | | event. For instance, the time at which a | 930 | | Surrogate completed the handling of a content | 931 | | delivery request (e.g., end of delivery or error). | 932 | Duration | The duration of an operation in milliseconds. For | 933 | | instance, this field could be used to provide the | 934 | | time it took the Surrogate to send the requested | 935 | | file to the End-User or the time it took the | 936 | | Surrogate to acquire the file on a cache-miss | 937 | | event. 
In the case where Start-time, End-time, | 938 | | and Duration appear in a Logging Record, the | 939 | | Duration is to be interpreted as a total activity | 940 | | time related to the logged operation. | 941 | Client-IP | The IP address of the User Agent that issued the | 942 | | logged request or of a proxy, for instance | 943 | | "203.0.113.1". | 944 | Client-port | The source port of the logged request (e.g., 9542) | 945 | Destination- | The IP address of the host that received the | 946 | IP | logged request (e.g., 192.0.2.2). | 947 | Destination- | The hostname of the host that received the logged | 948 | hostname | request (e.g., Surrogate1.cdna.com). | 949 | Destination- | The destination port of the logged request (e.g., | 950 | port | 80). | 951 | Operation | The kind of operation that is logged; for instance | 952 | | Delivery or Purging. | 953 | URI_full | The full requested URL (e.g., | 954 | | "http://node1.peer-a.op-b.net/cdn.csp.com/movies/p | 955 | | otter.avi?param=11&user=toto"). When HTTP request | 956 | | redirection is used, this URI includes the | 957 | | Surrogate FQDN. If the association of requests | 958 | | to Surrogates is confidential, the dCDN can present | 959 | | only URI_part to the uCDN. | 960 | URI_part | The requested URL path (e.g., | 961 | | /cdn.csp.com/movies/potter.avi?param=11&user=toto | 962 | | if the full request URL was | 963 | | "http://node1.peer-a.op-b.net/cdn.csp.com/movies/p | 964 | | otter.avi?param=11&user=toto"). The URI without | 965 | | host-name typically includes the "CDN domain" | 966 | | (e.g., cdn.csp.com) - cf. [I-D.ietf-cdni-framework]: | 967 | | it enables the identification of the CSP service | 968 | | agreed between the CSP and the CDNP operating the | 969 | | uCDN. | 970 | Protocol | The protocol and protocol version of the message | 971 | | that triggered the Logging entry (e.g., HTTP/1.1). | 972 | Request-meth | The protocol method of the request message that | 973 | od | triggered the Logging entry.
| 974 | Status | The protocol status of the reply message related | 975 | | to the Logging entry | 976 | Bytes-Sent | The number of bytes at application-layer | 977 | | protocol-level (e.g., HTTP) of the reply message | 978 | | related to the Logging entry. It includes the | 979 | | size of the response headers. | 980 | Headers-Sent | The number of bytes corresponding to response | 981 | | headers at application-layer protocol-level (e.g., | 982 | | HTTP) of the reply message related to the Logging | 983 | | entry. | 984 | Bytes-receiv | The number of bytes (headers + body) of the | 985 | ed | message that triggered the Logging entry. | 986 | Referrer | The value of the Referrer header in an HTTP | 987 | | request. | 988 | User-Agent | The value of the User Agent header in an HTTP | 989 | | request. | 990 | Cookie | The value of the Cookie header in an HTTP request. | 991 | Byte-Range | [Ed. note: to be defined] | 992 | Cache-contro | The value of the cache-control header in an HTTP | 993 | l | answer. This header is particularly important for | 994 | | content acquisition logs. | 995 | Record-diges | A digest of the Logging Record; it enables | 996 | t | detecting corrupted Logging Records. | 997 | CCID | A Content Collection IDentifier (CCID) eases the | 998 | | correlation of several Logging Records related to | 999 | | a Content Collection (e.g., a movie split in | 1000 | | chunks). | 1001 | SID | A Session Identifier (SID) eases the correlation | 1002 | | (and aggregation) of several Logging Records | 1003 | | related to a session. The SID is especially | 1004 | | relevant for summarizing HAS Logging information | 1005 | | [I-D.brandenburg-cdni-has]. | 1006 | uCDN-ID | An element authenticating the operator of the uCDN | 1007 | | as the authority having delegated the request to | 1008 | | the dCDN. | 1009 | Delivering-C | An identifier (e.g., an aggregation of an IP | 1010 | DN-ID | address and a FQDN) of the Delivering CDN. 
The | 1011 | | Delivering-CDN-ID might be considered as | 1012 | | confidential by the dCDN. In such a case, the dCDN | 1013 | | could either not provide this field to the uCDN or | 1014 | | overwrite the Delivering-CDN-ID with its own | 1015 | | identifier. | 1016 | Cache-bytes | The number of body bytes served from caches. This | 1017 | | quantity permits the computation of the byte hit | 1018 | | ratio. | 1019 | Action | The Action describes how a given request was | 1020 | | treated locally: through which transport protocol, | 1021 | | with or without content revalidation, with a cache | 1022 | | hit or cache miss, with fresh or stale content, | 1023 | | and (if relevant) with which error. Example with | 1024 | | Squid format [squid]: "TCP_REFRESH_FAIL_HIT" means | 1025 | | that an expired copy of an object requested | 1026 | | through TCP was in the cache. Squid attempted to | 1027 | | make an If-Modified-Since request, but it failed. | 1028 | | The old (stale) object was delivered to the | 1029 | | client. | 1030 | MIME-Type | The MIME-Type of the requested content | 1031 | dCDN | An element authenticating the operator of the dCDN | 1032 | identifier | as the authority requesting the content from the | 1033 | | uCDN | 1034 | Caching_date | Date at which the delivered content was stored in | 1035 | | cache | 1036 | Validity_hea | A copy of all headers related to content validity: | 1037 | ders | Pragma or Cache-Control (no-cache), ETag, Vary, | 1038 | | Last-Modified... | 1039 | Lookup_durat | Duration of the DNS resolution for resolving the | 1040 | ion | FQDN of (uCDN's or CSP's) origin server. | 1041 | Delay_to_fir | Duration of the operations from the sending of the | 1042 | st_bit | content acquisition request to the reception of | 1043 | | the first bit of the requested content.
| 1044 | Delay_to_las | Duration of the operations from the sending of the | 1045 | t_bit | content acquisition request to the reception of | 1046 | | the last bit of the requested content. | 1047 +--------------+----------------------------------------------------+ 1049 Table 1: Semantics of CDNI Logging Fields 1051 NB: we define three fields related to the timing of logged 1052 operations: Start-time, End-time, and Duration. Start-time is 1053 typically useful for human readers (e.g., while debugging), however, 1054 some servers log the operation's End-time which corresponds to the 1055 time of log record generation. In absence of Logging summarization, 1056 only two of these three fields are required to obtain relevant timing 1057 information on the operation. However, when some kind of Logging 1058 aggregation/summarization is used, it can be advantageous to keep the 1059 three fields: for instance, in the case of HAS, keeping the three 1060 fields permits computing an average delivery bitrate from a single 1061 Logging Record aggregating information on the delivery of multiple 1062 consecutive video chunks. 1064 Multiple header fields, in addition to the ones explicitly listed in 1065 the table could be reproduced in the Logging records. 1067 Note that uCDN may want to filter Logging data by user (and not by IP 1068 address) to provide more relevant information to the CSP. In such 1069 case, a user may be identified as a combination of several pieces of 1070 information such as the client IP and User Agent or through the SID. 1072 The URI_full provides information on the Surrogate that provided the 1073 content. This information can be relevant, for instance, for the 1074 Inter-Affiliates use case described in [RFC6770]. However, in some 1075 cases it may be considered as confidential and the dCDN may provide 1076 URI_part instead. 
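The relationships between some of the fields discussed above can be illustrated with a short sketch. It assumes, for illustration only, that Duration is expressed in milliseconds (as in Table 1), that Bytes-Sent covers all deliveries summarized in the record, and that URI_part is simply URI_full without its scheme and host. The function names are hypothetical and are not part of this specification.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit


def end_time(start: datetime, duration_ms: int) -> datetime:
    """Derive End-time from Start-time and Duration (no summarization)."""
    return start + timedelta(milliseconds=duration_ms)


def average_bitrate_bps(bytes_sent: int, duration_ms: int) -> float:
    """Average delivery bitrate over the total activity time (Duration)."""
    if duration_ms <= 0:
        raise ValueError("Duration must be positive")
    return bytes_sent * 8 / (duration_ms / 1000)


def uri_part(uri_full: str) -> str:
    """Drop scheme and host (the Surrogate FQDN) from URI_full."""
    parts = urlsplit(uri_full)
    return parts.path + ("?" + parts.query if parts.query else "")


# Hypothetical sample record values.
start = datetime(2013, 2, 22, 12, 0, 0, tzinfo=timezone.utc)
print(end_time(start, 1500).isoformat())
print(average_bitrate_bps(250_000, 1000))
print(uri_part("http://node1.peer-a.op-b.net/cdn.csp.com/movies/"
               "potter.avi?param=11&user=toto"))
```

When a record aggregates several chunks, End-time minus Start-time gives the wall-clock span while Duration gives the activity time, which is why keeping all three fields allows the bitrate computation shown here.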
1078 Other information that could be logged include operations that refer 1079 to the general state of the request, before it gets processed 1080 locally. Such information is related to the authorization of the 1081 requests, URL rewriting rules enforced, the X-FORWARDED-FOR non 1082 standard HTTP header... 1084 [Editor's Note: CDNI Logging information may be used for debugging. 1085 Therefore, various CDN operations might be logged, depending on the 1086 agreement between the dCDN and the uCDN, such as operations related 1087 to Request Routing and Metadata. These may call for a few additional 1088 Fields to be defined]. 1090 5.2. Syntax of CDNI Logging Fields 1092 This section is intended to contain the specification for the syntax 1093 and encoding of the CDNI Logging fields. For now, Table 2 1094 illustrates the definition of some information elements. It provides 1095 examples using Apache log format strings [apache] when they exist. 1097 [Ed. note: specify for all Logging Fields the type (e.g., varchar, 1098 int, float, ...) and the maximum size (e.g., varchar(200))] 1099 +----------+-------------------+------------------------------------+ 1100 | Name | String | Example | 1101 +----------+-------------------+------------------------------------+ 1102 | Time | %t | [10/Oct/2000:13:55:36-0700] | 1103 | Duration | %D | - | 1104 | Client-I | %a | 203.0.113.45 | 1105 | P | | | 1106 | Operatio | - | - | 1107 | n | | | 1108 | URI_full | %U | - | 1109 | Protocol | %H | HTTP/1.0 | 1110 | Request | %m | GET | 1111 | method | | | 1112 | Status | %>s | 200 | 1113 | Bytes | %O | 2326 | 1114 | Sent | | | 1115 | Bytes | %I | 432 | 1116 | received | | | 1117 | Header | \"%{Referrer}i\" | "http://www.example.com/start.html | 1118 | | \"%{User-agent}i\ | ""Mozilla/4.08 [en] (Win98; I | 1119 | | " | ;Nav)" | 1120 +----------+-------------------+------------------------------------+ 1122 Table 2: Examples using Apache format 1124 6. CDNI Logging Records 1126 [Ed. 
note: we need to specify the encoding of the file, the 1127 separation character, etc...] 1129 This section defines the events for which a CDNI Logging Record can 1130 be exchanged over the CDNI Logging interface and, for each type of 1131 Logging Record, indicates the allowed set of CDNI Information 1132 Elements. 1134 We classify the logged events depending on the CDN operation to which 1135 they relate: Content Delivery, Content Acquisition, Content 1136 Invalidation/Purging, etc. 1138 6.1. Content Delivery 1140 The content delivery events triggering the generation of a Logging 1141 Record include: 1143 o Reception by a dCDN Surrogate of a content request 1145 The Logging Record for Content Delivery contains the following set of 1146 CDNI Logging Elements: 1148 +----------------------+--------------------------------------------+ 1149 | Name | Mandatory/Optional | 1150 +----------------------+--------------------------------------------+ 1151 | Start-time | Mandatory | 1152 | Duration | Mandatory | 1153 | Client-IP | Mandatory | 1154 | Client-port | Optional | 1155 | Destination-IP | Mandatory if Destination-Hostname is | 1156 | | absent | 1157 | Destination-Hostname | Mandatory if Destination-IP is absent | 1158 | Destination-port | Optional | 1159 | Operation | Optional | 1160 | URI_full | Mandatory if URI_part is absent | 1161 | URI_part | Mandatory if URI_full is absent | 1162 | Protocol | Mandatory if protocol is different from | 1163 | | HTTP/1.1 | 1164 | Request-method | Mandatory | 1165 | Status | Mandatory | 1166 | Bytes-Sent | Mandatory | 1167 | Headers-Sent | Optional | 1168 | Bytes-received | Optional | 1169 | Referrer | Optional | 1170 | User-Agent | Optional | 1171 | Cookie | Optional | 1172 | Byte-Range | ? | 1173 | Cache-control | Optional | 1174 | Record-digest | ? | 1175 | CCID | Optional. Only applicable to HTTP | 1176 | | Adaptive Streaming delivery. | 1177 | SID | Optional. Only applicable to HTTP | 1178 | | Adaptive Streaming delivery.
| 1179 | Cache-bytes | Optional | 1180 | Action | Mandatory (in particular regarding cache | 1181 | | Hit/Miss) | 1182 | MIME-Type | Mandatory | 1183 +----------------------+--------------------------------------------+ 1185 Table 3: CDNI Logging Fields in Delivery Logging Record 1187 In Table 3, "Mandatory" means that this field MUST be included in 1188 each Delivery Record and "Optional" means that it can be included 1189 based on the agreement between the dCDN and the uCDN as established 1190 via a mechanism outside the scope of this document (e.g., by human 1191 agreement). 1193 6.2. Content Invalidation and Purging 1195 Given that the Purge interface is expected to contain a mechanism to 1196 report on completion of the Invalidation/Purge request, there is no 1197 need to specify separate Log Records for these events. 1199 6.3. Request Routing 1201 [Editor's Note: Is there a requirement for the dCDN to provide logs 1202 for request routing events?] 1204 6.4. Logging Extensibility 1206 Future usages might introduce the need for additional Logging fields. 1207 In addition, some use cases, such as Inter-Affiliate 1208 Interconnection [RFC6770], might take advantage of extended Logging 1209 exchanges. Therefore, it is important to permit CDNs to use 1210 additional Logging fields besides the standard ones, if they want. 1211 For instance, an "Account-name" identifying the contract enforced by 1212 the dCDN for a given request could be provided in extended fields. 1214 The required Logging Records may depend on the considered services. 1215 For instance, static file delivery (e.g., pictures) typically does 1216 not include any delivery restrictions. By contrast, video delivery 1217 typically implies strong content delivery restrictions, as explained 1218 in [RFC6770], and Logging could include information about the 1219 enforcement of these restrictions.
Therefore, to ease the support of 1220 varied services as well as of future services, the Logging interface 1221 should support optional Logging Records. 1223 7. CDNI Logging File Format 1225 Interconnected CDNs may support various Logging formats. However, 1226 they must support at least the default Logging File format described 1227 here. 1229 7.1. Logging Files 1231 [Ed. Note: How many files (one per type of Delivery Service (e.g., 1232 HTTP, WMP) and per type of Event (e.g., Errors, Delivery, 1233 Acquisition, ...)?) and what would be inside... These aspects need to 1234 be detailed...] 1236 7.2. File Format 1238 The Logging file format should be independent from the selected 1239 transport protocol, to guarantee a flexible choice of transport 1240 protocols. [Ed. note: for the real time Logging exchanges, this 1241 might be hard] 1243 All Logging Records in a Logging File must share the same format 1244 (same set of Logging Fields, in the same order, with the same 1245 semantics, separated by the same Separator Character), to ease the 1246 parsing of the Logging data by the CDN that receives the Logging 1247 File. The CDN that provides the Logging data is responsible for 1248 guaranteeing the consistency of the Logging records' formats, 1249 typically via its log filtering and aggregation processes (see 1250 Section 2.2.3). 1252 7.2.1. Headers 1254 Logging files must include a header with the information described in 1255 Figure 4. 1257 +----------------+-------------------+------------------------------+ 1258 | Field | Description | Examples | 1259 +----------------+-------------------+------------------------------+ 1260 | Format | Identification of | standard_cdni_errors_http_v1 | 1261 | | CDNI Log format. | | 1262 | Fields | A description of | | 1263 | | the record format | | 1264 | | (list of fields).
| | 1265 | Log-ID | Identifier | abcdef1234 | 1266 | | for the CDNI Log | | 1267 | | file (facilitates | | 1268 | | detection of | | 1269 | | duplicate Logs | | 1270 | | and tracking in | | 1271 | | case of | | 1272 | | aggregation). | | 1273 | Log-Timestamp | Time, in | [20/Feb/2012:00:29.510+0200] | 1274 | | milliseconds, at | | 1275 | | which the CDNI Log | | 1276 | | was generated. | | 1277 | Log-Origin | Identifier of the | cdn1.cdni.example.com | 1278 | | authority (e.g., | | 1279 | | dCDN or uCDN) | | 1280 | | providing the | | 1281 | | Logging. | | 1282 +----------------+-------------------+------------------------------+ 1284 Figure 4: Logging Headers 1286 All time-related Logging Fields and data in the Logging File headers/ 1287 footers must provide a time zone and have at least millisecond (ms) 1288 accuracy. The accuracy must be consistent to permit the computation 1289 of KPIs involving operations realized on several CDNs. 1291 [Ed. note: would it make sense to add a kind of "example Logging 1292 Record" in the Logging file and associated semantics (e.g., in a 1293 structured data format)?] 1295 7.2.2. Body (Logging Records) Format 1297 [Ed. note: the W3C extended log format is a good base candidate to 1298 look at.] 1300 Since records for real time information and non-real time information 1301 could use different formats, we do not yet solve the problem of real 1302 time logging exchanges in this version. 1304 7.2.3. Footer Format 1306 Logging files must include a footer with the information described in 1307 Figure 5.
   +---------+----------------------------------------------+----------+
   | Field   | Description                                  | Examples |
   +---------+----------------------------------------------+----------+
   | Log     | Digest of the complete Log (facilitates      |          |
   | Digest  | detection of Log corruption)                 |          |
   +---------+----------------------------------------------+----------+

                        Figure 5: Logging Footers

   This digest field permits the detection of corrupted Logging files.
   This can be useful, for instance, if a problem occurs on the
   filesystem of the dCDN Logging system and leads to the truncation of
   a Logging file.  Additional mechanisms to avoid corrupted Logging
   files are expected to be provided by the Logging transport protocol
   (see Section 8).

8.  CDNI Logging File Transport Protocol

   As presented in [RFC6707], several protocols already exist that
   could potentially be used to exchange CDNI Logging between
   interconnected CDNs.

   The offline exchange of non-real-time Logging could rely on several
   protocols.  In particular, the dCDN could publish the Logging on a
   server from which the uCDN would retrieve it using a secure
   protocol.

   For managed file transfer, the recommended protocol is the SSH File
   Transfer Protocol (SFTP) [I-D.ietf-secsh-filexfer].  SFTP is widely
   deployed and meets the criteria expressed by the CDNI Logging
   Transport Requirements: timeliness, reliability, security, and
   scalability.

   [Ed. note: include options for lossless compression]

9.  Open Issues

   The main remaining tasks on this Internet-Draft are the following:

   o  Finalise the list of CDNI Logging Fields.

   o  Finalise the encoding of CDNI Logging Fields, Records, and Files.

   o  Identify what can be done (if anything) to maximise reuse of the
      Logging Fields and Logging Records encoding for future support of
      real-time CDNI Logging exchange.

   [Ed. note: The format for Time is still to be agreed on.  The RFC
   5322 (Section 3.3) format could be used, or an ISO 8601 formatted
   date and time in UTC (same format as proposed in
   [draft-caulfield-cdni-metadata-core-00]).  Also see RFC 5424,
   Section 6.2.3.]

   [Ed. note: (comment from Kevin) how are errors handled?  If the
   client gets handed a bunch of 403s and 404s, but still gets the
   content eventually, without triggering an event, are those still
   logged?  For Bytes-Sent, if there were aborted requests, do those
   get counted as well?  Not all client behavior can be correlated with
   the simplified log.]

10.  IANA Considerations

   TBD

11.  Security Considerations

11.1.  Privacy

   CDNs have the opportunity to collect detailed information about the
   downloads performed by End-Users.  Providing this information to
   another CDN raises End-User privacy protection concerns.

11.2.  Non-Repudiation

   Logging provides the raw material for charging.  It permits the dCDN
   to bill the uCDN for the content deliveries that the dCDN makes on
   behalf of the uCDN.  It also permits the uCDN to bill the CSP for
   the content Delivery Service.  Therefore, non-repudiation of Logging
   data is essential.

12.  Acknowledgments

   The authors would like to thank Sebastien Cubaud, Anne Marrec,
   Yannick Le Louedec, and Christian Jacquenet for detailed feedback on
   early versions of this document and for their input on existing Log
   formats.

   The authors would also like to thank Fabio Costa, Sara Oueslati,
   Yvan Massot, Renaud Edel, and Joel Favier for their input and
   comments.

   Finally, they thank the contributors of the EU FP7 OCEAN project for
   valuable inputs.

13.  References

13.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.
   [RFC5424]  Gerhards, R., "The Syslog Protocol", RFC 5424,
              March 2009.

13.2.  Informative References

   [CLF]      Luotonen, A., "The Common Log-file Format", W3C (work in
              progress), 1995.

   [ELF]      Hallam-Baker, P. and B. Behlendorf, "Extended Log File
              Format", W3C (work in progress), WD-logfile-960323.

   [I-D.brandenburg-cdni-has]
              Brandenburg, R., Deventer, O., Faucheur, F., and K.
              Leung, "Models for adaptive-streaming-aware CDN
              Interconnection", draft-brandenburg-cdni-has-04 (work in
              progress), January 2013.

   [I-D.ietf-cdni-framework]
              Peterson, L. and B. Davie, "Framework for CDN
              Interconnection", draft-ietf-cdni-framework-03 (work in
              progress), February 2013.

   [I-D.ietf-cdni-requirements]
              Leung, K. and Y. Lee, "Content Distribution Network
              Interconnection (CDNI) Requirements",
              draft-ietf-cdni-requirements-04 (work in progress),
              December 2012.

   [I-D.ietf-secsh-filexfer]
              Galbraith, J. and O. Saarenmaa, "SSH File Transfer
              Protocol", draft-ietf-secsh-filexfer-13 (work in
              progress), July 2006.

   [RFC6707]  Niven-Jenkins, B., Le Faucheur, F., and N. Bitar,
              "Content Distribution Network Interconnection (CDNI)
              Problem Statement", RFC 6707, September 2012.

   [RFC6770]  Bertrand, G., Stephan, E., Burbridge, T., Eardley, P.,
              Ma, K., and G. Watson, "Use Cases for Content Delivery
              Network Interconnection", RFC 6770, November 2012.

   [apache]   "Apache 2.2 log files documentation", February 2012.

   [squid]    "Squid Log-Format documentation", February 2012.

Appendix A.  Example Log Formats

   This section provides examples of log formats implemented in
   existing CDNs, web servers, and caching proxies.

   Web servers (e.g., Apache) maintain at least one log file for
   logging accesses to content (the Access Log).  They can typically be
   configured to log errors in a separate log file (the Error Log).
   The log formats can be specified in the server's configuration
   files.  However, webmasters often use standard log formats to ease
   log processing with available log analysis tools.

A.1.  W3C Common Log File (CLF) Format

   The Common Log File (CLF) format defined by the World Wide Web
   Consortium (W3C) working group is compatible with many log analysis
   tools and is supported by the Access Logs of the main web servers
   (e.g., Apache).

   According to [CLF], the common log-file format is as follows:

      remotehost rfc931 authuser [date] "request" status bytes

   Example (from [apache]):

      127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700]
      "GET /apache_pb.gif HTTP/1.0" 200 2326

   The fields are defined as follows [CLF]:

   +------------+------------------------------------------------------+
   | Element    | Definition                                           |
   +------------+------------------------------------------------------+
   | remotehost | Remote hostname (or IP number if DNS hostname is not |
   |            | available, or if DNSLookup is Off).                  |
   | rfc931     | The remote logname of the user.                      |
   | authuser   | The username with which the user authenticated.      |
   | [date]     | Date and time of the request.                        |
   | "request"  | An exact copy of the request line that came from the |
   |            | client.                                              |
   | status     | The status code of the HTTP reply returned to the    |
   |            | client.                                              |
   | bytes      | The content-length of the document transferred.      |
   +------------+------------------------------------------------------+

              Table 4: Information elements in CLF format

A.2.  W3C Extended Log File (ELF) Format

   The Extended Log File (ELF) format defined by W3C extends the CLF
   with new fields.  This format is supported by Microsoft IIS 4.0 and
   5.0.

   The supported fields are listed below [ELF].
   +------------+---------------------------------------------------+
   | Element    | Definition                                        |
   +------------+---------------------------------------------------+
   | date       | Date at which transaction completed               |
   | time       | Time at which transaction completed               |
   | time-taken | Time taken for transaction to complete in seconds |
   | bytes      | Bytes transferred                                 |
   | cached     | Records whether a cache hit occurred              |
   | ip         | IP address and port                               |
   | dns        | DNS name                                          |
   | status     | Status code                                       |
   | comment    | Comment returned with status code                 |
   | method     | Method                                            |
   | uri        | URI                                               |
   | uri-stem   | Stem portion alone of URI (omitting query)        |
   | uri-query  | Query portion alone of URI                        |
   +------------+---------------------------------------------------+

              Table 5: Information elements in ELF format

   Some fields start with a prefix (e.g., "c-", "s-"), which indicates
   which host (client/server/proxy) the field refers to.  The prefixes
   and their meanings are:

   o  c-: Client

   o  s-: Server

   o  r-: Remote

   o  cs-: Client to Server

   o  sc-: Server to Client

   o  sr-: Server to Remote Server (used by proxies)

   o  rs-: Remote Server to Server (used by proxies)

   Example:

      date time s-ip cs-method cs-uri-stem cs-uri-query s-port
      cs-username c-ip cs(User-Agent) sc-status sc-substatus
      sc-win32-status time-taken

      2011-11-23 15:22:01 x.x.x.x GET /file 80 y.y.y.y Mozilla/
      5.0+(Windows;+U;+Windows+NT+6.1;+en-US;+rv:1.9.1.6)+Gecko/
      20091201+Firefox/3.5.6+GTB6 200 0 0 2137

A.3.  National Center for Supercomputing Applications (NCSA) Common Log
      Format

   This format for Access Logs offers the following fields:

   o  host rfc931 username date:time "request" statuscode bytes

   o  Example: x.x.x.x - userfoo [10/Jan/2010:21:15:05 +0500] "GET
      /index.html HTTP/1.0" 200 1043

A.4.  NCSA Combined Log Format

   The NCSA Combined log format is an extension of the NCSA Common log
   format with three (optional) additional fields: the referrer field,
   the user_agent field, and the cookie field.

   o  host rfc931 username date:time request statuscode bytes referrer
      user_agent cookie

   o  Example: x.x.x.x - userfoo [21/Jan/2012:12:13:56 +0500] "GET
      /index.html HTTP/1.0" 200 1043 "http://www.example.com/"
      "Mozilla/4.05 [en] (WinNT; I)" "USERID=CustomerA;IMPID=01234"

A.5.  NCSA Separate Log Format

   The NCSA Separate log format refers to a log format in which the
   information gathered is split across three separate files.  Every
   entry in the Access Log (in the NCSA Common log format) is
   complemented with an entry in a Referral log and another one in an
   Agent log.  These three records can easily be correlated using the
   date:time value.  The format of the Referral log is as follows:

   o  date:time referrer

   o  Example: [21/Jan/2012:12:13:56 +0500]
      "http://www.example.com/index.html"

   The format of the Agent log is as follows:

   o  date:time agent

   o  Example: [21/Jan/2012:12:13:56 +0500] "Microsoft Internet
      Explorer - 5.0"

A.6.  Squid 2.0 Native Log Format for Access Logs

   Squid [squid] is a popular piece of open-source software for
   transforming a Linux host into a caching proxy.  Variations of the
   Squid log format are supported by some CDNs.

   The Squid common access log format is as follows:

      time elapsed remotehost code/status bytes method URL rfc931
      peerstatus/peerhost type
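
   As an informal illustration only, the field order of the Squid
   common access log format can be exercised with a short parsing
   sketch.  The record type, the function name, and the sample line
   below are hypothetical and are not taken from the Squid
   documentation; the sketch also assumes that fields are separated by
   whitespace, that no field contains embedded spaces, and that the
   elapsed time is expressed in milliseconds.

```python
from typing import NamedTuple


class SquidRecord(NamedTuple):
    """One Squid common access log entry (names are illustrative only)."""
    time: float          # Unix timestamp, fractional seconds
    elapsed_ms: int      # elapsed time, assumed to be in milliseconds
    remotehost: str
    code: str            # first half of "code/status"
    status: int          # second half of "code/status"
    size: int            # the "bytes" field
    method: str
    url: str
    rfc931: str
    peerstatus: str      # first half of "peerstatus/peerhost"
    peerhost: str        # second half of "peerstatus/peerhost"
    content_type: str    # the "type" field


def parse_squid_line(line: str) -> SquidRecord:
    """Split one whitespace-separated log line into its ten fields."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}")
    (time, elapsed, remotehost, code_status, size,
     method, url, rfc931, peer, ctype) = fields
    # "code/status" and "peerstatus/peerhost" are slash-separated pairs.
    code, _, status = code_status.partition("/")
    peerstatus, _, peerhost = peer.partition("/")
    return SquidRecord(float(time), int(elapsed), remotehost, code,
                       int(status), int(size), method, url, rfc931,
                       peerstatus, peerhost, ctype)


# Hypothetical sample line built from documentation addresses; it is not
# copied from a real Squid log.
SAMPLE = (
    "1329955321.123 45 192.0.2.10 TCP_MISS/200 2326 GET "
    "http://www.example.com/index.html - DIRECT/198.51.100.7 text/html"
)

if __name__ == "__main__":
    print(parse_squid_line(SAMPLE))
```

   Splitting on whitespace keeps the sketch short; a production parser
   would need to handle values that contain spaces or that are missing.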
   Squid also supports a more detailed native access log format:

      Timestamp Elapsed Client Action/Code Size Method URI Ident
      Hierarchy/From Content

   According to the Squid 2.0 documentation [squid], these fields are
   defined as follows:

   +-----------+-------------------------------------------------------+
   | Element   | Definition                                            |
   +-----------+-------------------------------------------------------+
   | time      | Unix timestamp as UTC seconds with a millisecond      |
   |           | resolution.                                           |
   | duration  | The elapsed time in milliseconds the transaction      |
   |           | busied the cache.                                     |
   | client    | The client IP address.                                |
   | address   |                                                       |
   | bytes     | The size is the amount of data delivered to the       |
   |           | client, including headers.                            |
   | request   | The request method to obtain an object.               |
   | method    |                                                       |
   | URL       | The requested URL.                                    |
   | rfc931    | May contain the ident lookups for the requesting      |
   |           | client (turned off by default).                       |
   | hierarchy | The hierarchy information provides information on how |
   | code      | the request was handled (forwarding it to another     |
   |           | cache, or requesting the content from the Origin      |
   |           | Server).                                              |
   | type      | The content type of the object as seen in the HTTP    |
   |           | reply header.                                         |
   +-----------+-------------------------------------------------------+

             Table 6: Information elements in Squid format

   Squid also maintains a "store log", which records the objects
   currently kept on disk as well as removed ones, typically for
   debugging purposes.

Appendix B.  Requirements

B.1.  Additional Requirements

   Section 7 of [I-D.ietf-cdni-requirements] already specifies a set of
   requirements for Logging (LOG-1 to LOG-16).  Some security
   requirements also affect Logging (e.g., SEC-4).

   This section is a placeholder for requirements identified in the
   work on logging, before they are proposed to the requirements draft
   authors.
   Logging data is sensitive because it provides the raw material for
   producing bills.  Therefore, the protocol delivering the Logging
   data must be reliable to avoid information loss.  In addition, the
   protocol must scale to support the transport of large amounts of
   Logging data.

   CDNs need to trust Logging information; thus, they want to know:

   o  who issued the Logging (authentication), and

   o  whether the Logging has been modified by a third party
      (integrity).

   Logging also contains confidential data and therefore should be
   protected from eavesdropping.

   All these needs translate into security requirements on both the
   Logging data format and the Logging protocol.

   Finally, this protocol must comply with the requirements identified
   in [I-D.ietf-cdni-requirements].

   [Ed. note: cf. requirements draft: "SEC-4 [MED] The CDNI solution
   should be able to ensure that the Downstream CDN cannot spoof a
   transaction log attempting to appear as if it corresponds to a
   request redirected by a given Upstream CDN when that request has not
   been redirected by this Upstream CDN.  This ensures non-repudiation
   by the Upstream CDN of transaction logs generated by the Downstream
   CDN for deliveries performed by the Downstream CDN on behalf of the
   Upstream CDN."]

B.2.  Compliance with the Requirements Draft

   This section checks that all the requirements identified in the
   Requirements draft are fulfilled by this document.

   [Ed. note: to be written later]

Appendix C.  Analysis of Candidate Protocols for Logging Transport

   This section will be expanded later with an analysis of alternative
   candidate protocols for the transport of CDNI Logging, both
   non-real-time and real-time.

C.1.  Syslog

   [Ed. note: to be written later]

C.2.  XMPP

   [Ed. note: to be written later]

C.3.  SNMP

Authors' Addresses

   Gilles Bertrand (editor)
   France Telecom - Orange
   38-40 rue du General Leclerc
   Issy les Moulineaux, 92130
   FR

   Phone: +33 1 45 29 89 46
   Email: gilles.bertrand@orange.com

   Iuniana Oprescu (editor)
   France Telecom - Orange
   38-40 rue du General Leclerc
   Issy les Moulineaux, 92130
   FR

   Phone: +33 6 89 06 92 72
   Email: iuniana.oprescu@orange.com

   Emile Stephan
   France Telecom - Orange
   2 avenue Pierre Marzin
   Lannion F-22307
   France

   Email: emile.stephan@orange.com

   Roy Peterkofsky
   Skytide, Inc.
   One Kaiser Plaza, Suite 785
   Oakland CA 94612
   USA

   Phone: +01 510 250 4284
   Email: roy@skytide.com

   Francois Le Faucheur (editor)
   Cisco Systems
   Greenside, 400 Avenue de Roumanille
   Sophia Antipolis 06410
   FR

   Phone: +33 4 97 23 26 19
   Email: flefauch@cisco.com

   Pawel Grochocki
   Orange Polska
   ul. Obrzezna 7
   Warsaw 02-691
   Poland

   Email: pawel.grochocki@orange.com