SIPPING                                                  V. Gurbani, Ed.
Internet-Draft                         Bell Laboratories, Alcatel-Lucent
Intended status: Informational                            E. Burger, Ed.
Expires: December 10, 2010                           This space for sale
                                                               T. Anjali
                                        Illinois Institute of Technology
                                                             H. Abdelnur
                                                               O. Festor
                                                                   INRIA
                                                            June 8, 2010


  The Common Log Format (CLF) for the Session Initiation Protocol (SIP)
                 draft-ietf-sipclf-problem-statement-02

Abstract

   Well-known web servers such as Apache and web proxies like Squid
   support event logging using a common log format.  The logs produced
   using these de-facto standard formats are invaluable to system
   administrators for troubleshooting a server and to tool writers for
   crafting tools that mine the log files and produce reports and
   trends.  Furthermore, these log files can also be used to train
   anomaly detection systems and feed events into a security event
   management system.  The Session Initiation Protocol does not have a
   common log format, and as a result, each server supports a distinct
   log format that makes it unnecessarily complex to produce tools to
   do trend analysis and security detection.  We propose a common log
   file format for SIP servers that can be used uniformly by proxies,
   registrars, redirect servers as well as back-to-back user agents.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 10, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Table of Contents

   1.  Terminology
   2.  Introduction
   3.  Problem statement
   4.  What SIP CLF is and what it is not
   5.  Alternative approaches to SIP CLF
     5.1.  SIP CLF and CDRs
     5.2.  SIP CLF and Wireshark packet capture
   6.  Motivation and use cases
   7.  Challenges in establishing a SIP CLF
   8.  Data model
     8.1.  SIP CLF data model elements for a UAC
     8.2.  SIP CLF data model elements for a UAS
     8.3.  SIP CLF data model elements for a proxy
   9.  Examples
     9.1.  UAC registering with a proxy
     9.2.  Direct call between Alice and Bob
     9.3.  Single downstream branch call
     9.4.  Forked call
   10. Security Considerations
   11. Operational guidance
   12. IANA Considerations
   13. Acknowledgments
   14. References
     14.1.  Normative References
     14.2.  Informative References
   Authors' Addresses

1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   RFC 3261 [RFC3261] defines additional terms used in this document
   that are specific to the SIP domain such as "proxy"; "registrar";
   "redirect server"; "user agent server" or "UAS"; "user agent client"
   or "UAC"; "back-to-back user agent" or "B2BUA"; "dialog";
   "transaction"; "server transaction".

2.  Introduction

   Servers executing on Internet hosts produce log records as part of
   their normal operations.  A log record is, in essence, a summary of
   an application-layer protocol data unit (PDU) that captures in
   precise terms an event that was processed by the server.  These log
   records serve many purposes, including analysis and troubleshooting.

   Well-known web servers such as Apache and web proxies such as Squid
   support event logging using a Common Log Format (CLF), the common
   structure for logging requests and responses serviced by the server.
   It can be argued that a good part of the success of Apache has been
   its CLF because it allowed third parties to produce tools that
   analyzed the data and generated traffic reports and trends.
   The Apache CLF has been so successful that not only did it become
   the de-facto standard for producing web server logs, but many
   commercial web servers can also be configured to produce logs in
   this format.  An example of Apache CLF is depicted next:

             %h %l %u %t \"%r\" %s %b
      remotehost rfc931 authuser [date] request status bytes

   remotehost:  Remote hostname (or IP number if DNS hostname is not
      available, or if DNSLookup is Off).

   rfc931:  The remote logname of the user.

   authuser:  The username by which the user has authenticated himself.

   [date]:  Date and time of the request.

   request:  The request line exactly as it came from the client.

   status:  The HTTP status code returned to the client.

   bytes:  The content-length of the document transferred.

   The inspiration for the SIP CLF is the Apache CLF.  However, the
   state machinery for an HTTP transaction is much simpler than that of
   the SIP transaction (as evidenced in Section 7).  The SIP CLF needs
   to do considerably more.

3.  Problem statement

   The Session Initiation Protocol (SIP) [RFC3261] is an Internet
   multimedia session signaling protocol that is increasingly used for
   other services besides session establishment.  A typical deployment
   of SIP in an enterprise will consist of SIP entities from multiple
   vendors.  Currently, if these entities are capable of producing a log
   file of the transactions being handled by them, the log files are
   produced in a proprietary format.  The resulting multiplicity of log
   file formats leaves support staff unable to easily trace a call from
   one entity to another, or even to craft common tools that perform
   trend analysis, debugging, and troubleshooting uniformly across the
   SIP entities of multiple vendors.

   SIP does not currently have a CLF.  This document provides the
   rationale for establishing a SIP CLF and identifies the minimal
   information that must appear in any SIP CLF record.

4.  What SIP CLF is and what it is not

   The SIP CLF is a standardized manner of producing a log file.  This
   format can be used by SIP clients, SIP servers, proxies, and B2BUAs.
   The SIP CLF is simply an easily digestible log of currently occurring
   events and past transactions.  It contains enough information to
   allow humans and automata to derive relationships between discrete
   transactions handled at a SIP entity.  For example, a SIP
   administrator should be able to issue a concise command to discover
   relationships between transactions or to search for a certain dialog
   or transaction.

      Note: The exact form of the "concise command" is left unspecified
      until the working group agrees to one or more formats for encoding
      the fields.

   The SIP CLF is amenable to quick parsing (i.e., it has well-delimited
   fields), and it is platform and operating system neutral.  Being easy
   to parse, it lends itself well to creating other innovative tools.

   The SIP CLF is not a billing tool.  It is not expected that
   enterprises will bill customers based on SIP CLF.  The SIP CLF
   records events at the signaling layer only and does not attempt to
   correlate the veracity of these events with the media layer.  Thus,
   it cannot be used to trigger customer billing.

   The SIP CLF is not a quality of service (QoS) measurement tool.  If
   QoS is defined as measuring the mean opinion score (MOS) of the
   received media, then SIP CLF does not aid in this task since it does
   not summarize events at the media layer.

5.  Alternative approaches to SIP CLF

   It is perhaps tempting to consider other approaches (which, though
   not standardized, are in wide enough use in networks today) to
   determine whether or not a SIP CLF would benefit a SIP network
   consisting of multi-vendor products.  The two existing approaches
   that approximate what SIP CLF does are Call Detail Records (CDRs) and
   Wireshark packet sniffing.

5.1.  SIP CLF and CDRs

   CDRs are widely used in operator networks, and with the adoption of
   SIP, standardization bodies such as 3GPP have defined SIP-related
   CDRs as well.  Today, CDRs are used to implement the functionality
   approximated by SIP CLF; however, there are important differences.

   One, SIP CLF operates natively at the transaction layer and maintains
   enough information in the information elements being logged that
   dialog-related data can be subsequently derived from the transaction
   logs.  Thus, esoteric SIP fields and parameters such as the To
   header, including tags; the From header, including tags; and the
   CSeq number are logged in SIP CLF.  By contrast, a CDR is used mostly
   for charging and thus saves information to facilitate that very
   aspect.  A CDR will most certainly log the public user identification
   of a party requesting a service (which may not correspond to the From
   header) and the public user identification of the called party (which
   may not correspond to the To header).  Furthermore, the sequence
   numbers maintained by the CDR may not correspond to the SIP CSeq
   header.  Thus, it will be hard to piece together the state of a
   dialog through a sequence of CDR records.

   Two, a CDR record will, in all probability, be generated at a SIP
   entity performing some form of proxy-like functionality or at a B2BUA
   providing some service.
   By contrast, SIP CLF is lightweight enough that it can be generated
   by a canonical SIP user agent server and user agent client as well,
   including those that execute on resource-constrained devices (mobile
   phones).

   Finally, SIP is also being deployed outside of operator-managed VoIP
   networks.  Universities, research laboratories, and small-to-medium
   size companies are deploying SIP-based VoIP solutions on networks
   owned and managed by them.  Many of these constituencies will have no
   interest in generating CDRs, but they would like to have a concise
   representation of the messages being handled by the SIP entities in
   a common format.

5.2.  SIP CLF and Wireshark packet capture

   Wireshark is a popular raw packet capture tool.  It contains filters
   that can understand SIP at the protocol level and break down a
   captured message into its individual header components.  While
   Wireshark is appropriate to capture and view discrete SIP messages,
   it does not suffice to serve in the same capacity as SIP CLF for two
   reasons.

   First, while the Wireshark format saves the bulk of the information
   needed to create transaction and dialog state, it is a binary format
   that does not lend itself very well to being manipulated by
   text-based tools.  Second and more importantly, if the SIP messages
   are exchanged over a TLS-oriented transport, Wireshark will be unable
   to decrypt them and render them as individual SIP headers.

6.  Motivation and use cases

   As SIP becomes pervasive in multiple business domains and ubiquitous
   in academic and research environments, it is beneficial to establish
   a CLF for the following reasons:

   Common reference for interpreting events:  In a laboratory
      environment or an enterprise service offering there will typically
      be SIP entities from multiple vendors participating in routing
      requests.
      Absent a CLF, each entity will produce output records in a native
      format, making it hard to establish commonality for tools that
      operate on the log file.

   Writing common tools:  A CLF allows independent tool providers to
      craft tools and applications that interpret the CLF data to
      produce insightful trend analysis and detailed traffic reports.
      The format should be such that it retains the ability to be read
      by humans and processed using traditional Unix text processing
      tools.

   Session correlation across diverse processing elements:  In
      operational SIP networks, a request will typically be processed by
      more than one SIP server.  A SIP CLF will allow the network
      operator to trace the progression of the request (or a set of
      requests) as they traverse the different servers to establish a
      concise diagnostic trail of a SIP session.

         Note that tracing the request through a set of servers is
         considerably less challenging if all the servers belong to the
         same administrative domain.

   Message correlation across transactions:  A SIP CLF can enable a
      quick lookup of all messages that comprise a transaction (e.g.,
      "Find all messages corresponding to server transaction X,
      including all forked branches.")

   Message correlation across dialogs:  A SIP CLF can correlate
      transactions that comprise a dialog (e.g., "Find all messages for
      dialog created by Call-ID C, From tag F and To tag T.")

   Trend analysis:  A SIP CLF allows an administrator to collect data
      and spot patterns or trends in the information (e.g., "What is the
      domain where the most sessions are routed to between 9:00 AM and
      12:00 PM?")

   Train anomaly detection systems:  A SIP CLF will allow for the
      training of anomaly detection systems that, once trained, can
      monitor the CLF file and trigger an alarm on subsequent
      deviations from accepted patterns in the data set.
      Currently, anomaly detection systems monitor the network and parse
      raw packets that comprise a SIP message, a process that is
      unsuitable for anomaly detection systems [rieck2008].  With all
      the necessary event data at their disposal, network operations
      managers and information technology operation managers are in a
      much better position to correlate, aggregate, and prioritize log
      data to maintain situational awareness.

   Testing:  A SIP CLF allows for automatic testing of SIP equipment by
      writing tools that can parse a SIP CLF file to verify the behavior
      of a device under test.

   Troubleshooting:  A SIP CLF can enable cursory troubleshooting of a
      SIP entity (e.g., "How long did it take to generate a final
      response for the INVITE associated with Call-ID X?")

   Offline analysis:  A SIP CLF allows for offline analysis of the data
      gathered.  Once a SIP CLF file has been generated, it can be
      transported (subject to the security considerations in Section 10)
      to a host with appropriate computing resources to perform
      subsequent analysis.

   Real-time monitoring:  A SIP CLF allows administrators to visually
      notice the events occurring at a SIP entity in real time,
      providing accurate situational awareness.

7.  Challenges in establishing a SIP CLF

   Establishing a CLF for SIP is a challenging task.  The behavior of a
   SIP entity is more complex when compared to the equivalent HTTP
   entity.

   Base protocol services such as parallel or serial forking elicit
   multiple final responses.  Ensuing delays between sending a request
   and receiving a final response all add complexity when considering
   what fields should comprise a CLF and in what manner.  Furthermore,
   unlike HTTP, SIP groups multiple discrete transactions into a dialog,
   and these transactions may arrive at a varying inter-arrival rate at
   a proxy.
   For example, the BYE transaction usually arrives long after the
   corresponding INVITE transaction was received, serviced, and
   expunged from the transaction list.  Nonetheless, it is advantageous
   to relate these transactions such that automata or a human monitoring
   the log file can construct a set consisting of related transactions.

   ACK requests in SIP need careful consideration as well.  In SIP, an
   ACK is a special method that is associated with an INVITE only.  It
   does not require a response, and furthermore, if it is acknowledging
   a non-2xx response, then the ACK is considered part of the original
   INVITE transaction.  If it is acknowledging a 2xx-class response,
   then the ACK is a separate transaction consisting of a request only
   (i.e., there is no response for an ACK request).  CANCEL is another
   method that is tied to an INVITE transaction, but unlike ACK, the
   CANCEL request elicits a final response.

   While most requests elicit a response immediately, the INVITE request
   in SIP can pend at a proxy as it forks branches downstream or at a
   user agent server while it alerts the user.  RFC 3261 [RFC3261]
   instructs the server transaction to send a 1xx-class provisional
   response if a final response is delayed for more than 200 ms.  A SIP
   CLF log file needs to include such provisional responses because they
   help train automata associated with anomaly detection systems and
   provide some positive feedback for a human observer monitoring the
   log file.

   Finally, beyond supporting native SIP actors such as proxies,
   registrars, redirect servers, and user agent servers (UAS), it is
   beneficial to derive a CLF that supports back-to-back user agent
   (B2BUA) behavior, which may vary considerably depending on the
   specific nature of the B2BUA.

8.  Data model

   The following SIP CLF fields are defined as minimal information that
   must appear in any SIP CLF record:

   Timestamp:  Date and time of the request or response represented as
      the number of seconds and milliseconds since the Unix epoch.

   Source:port:  The DNS name or IP address of the upstream client,
      including the port number.  The port number must be separated from
      the DNS name or IP address by a single ':'.

   Destination:port:  The DNS name or IP address of the downstream
      server, including the port number.  The port number must be
      separated from the DNS name or IP address by a single ':'.

   From:  The From URI, including the tag.  Whilst one may question the
      value of the From URI in light of RFC 4474 [RFC4474], the From
      URI, nonetheless, imparts some information.  For one, the From tag
      is important and, in the case of a REGISTER request, the From URI
      can provide information on whether this was a third-party
      registration or a first-party one.

   To:  The To URI, including the tag.

   Call-ID:  The Call-ID header field value.

   CSeq:  The CSeq header.

   R-URI:  The Request-URI, including any URI parameters.

   Status:  The SIP response status code.

   SIP proxies may fork, creating several client transactions that
   correlate to a single server transaction.  Responses arriving on
   these client transactions, or new requests (CANCEL, ACK) sent on the
   client transaction, need log file entries that correlate with a
   server transaction.  Similarly, a B2BUA may create one or more client
   transactions in response to an incoming request.  These transactions
   will require correlation as well.  The last two data model elements
   provide this correlation.

   Server-Txn:  Server transaction identification code, that is, the
      transaction identifier associated with the server transaction.
      Implementations can reuse the server transaction identifier (the
      topmost branch-id of the incoming request, with or without the
      magic cookie), or they could generate a unique identification
      string for a server transaction (this identifier needs to be
      locally unique to the server only).  This identifier is used to
      correlate ACKs and CANCELs to an INVITE transaction; it is also
      used to aid in forking as explained later in this section.  (See
      Section 9 for usage.)

   Client-Txn:  Client transaction identification code.  This field is
      used to associate client transactions with a server transaction
      for forking proxies or B2BUAs.  Upon forking, implementations can
      reuse the value they inserted into the topmost Via header's branch
      parameter, or they can generate a unique identification string for
      the client transaction.  (See Section 9 for usage.)

   Finally, the SIP CLF should be extensible such that future SIP
   methods, headers and bodies can be represented as well.  Besides the
   mandatory fields listed above, all other SIP headers will appear as
   ordered pairs of header field names and values.

   This data model applies to all SIP entities: UACs, UASs, proxies,
   B2BUAs, registrars, and redirect servers.  Note that a B2BUA is a
   degenerate case of a proxy and as such the SIP CLF field layout
   prescribed for a proxy is equally applicable to the B2BUA.
   Similarly, registrars and redirect servers are a degenerate case of a
   UAS, and as such the SIP CLF field layout prescribed for a UAS is
   equally applicable to registrars and redirect servers.

   The following sections specify the individual SIP CLF data model
   elements that form a log record for a specific instance of a SIP
   entity.  We limit our specification to using the minimum data model
   elements.
   It is understood that a SIP CLF record is extensible using extension
   mechanisms appropriate to the specific representation used to
   generate the SIP CLF record.  This document, however, does not
   prescribe a specific representation format and it limits the
   discussion to the mandatory data elements described above.

8.1.  SIP CLF data model elements for a UAC

   When a UAC generates a request, the following data model elements,
   in the order specified below, are used to create a SIP CLF record
   that is subsequently logged:

      Timestamp CSeq R-URI Destination:port Client-Txn
      To From Call-ID

   Similarly, when a UAC receives a response, the following data model
   elements, in the order specified below, are used to create a SIP CLF
   record that is subsequently logged:

      Timestamp CSeq Source:port Status Client-Txn To

8.2.  SIP CLF data model elements for a UAS

   When a UAS receives a request, the following data model elements, in
   the order specified below, are used to create a SIP CLF record that
   is subsequently logged:

      Timestamp CSeq R-URI Source:port Server-Txn To From
      Call-ID

   Similarly, when a UAS generates a response, the following data model
   elements, in the order specified below, are used to create a SIP CLF
   record that is subsequently logged:

      Timestamp CSeq Destination:port Status Server-Txn

8.3.  SIP CLF data model elements for a proxy

   When the UAS half of a SIP proxy receives a request, the following
   data model elements, in the order specified below, are used to
   create a SIP CLF record that is subsequently logged:

      Timestamp CSeq R-URI Source:port Server-Txn To From
      Call-ID

   Similarly, when the UAS half of a SIP proxy generates a response, the
   following data model elements, in the order specified below, are
   used to create a SIP CLF record that is subsequently logged:

      Timestamp CSeq Destination:port Status Server-Txn Client-Txn
      To

   The Client-Txn may be empty (or null) since a downstream branch may
   not have been created when the response log record is generated.
   Imagine a proxy receiving an INVITE request and generating a "100
   Trying" response.  At the time the provisional response is generated,
   the proxy may not have progressed the INVITE transaction to the point
   of creating a client transaction or a downstream destination.  Thus,
   it is acceptable for this field to be empty (or null).

   When the UAC half of a SIP proxy generates a request, the following
   data model elements, in the order specified below, are used to
   create a SIP CLF record that is subsequently logged:

      Timestamp CSeq R-URI Destination:port Server-Txn
      Client-Txn To From Call-ID

   Similarly, when the UAC half receives a response, the following data
   model elements, in the order specified below, are used to create a
   SIP CLF record that is subsequently logged:

      Timestamp CSeq Source:port Status Server-Txn Client-Txn To

9.  Examples

   In the examples below, we use the horizontal dash ("-") to denote
   empty (or null) elements.  Similarly, the CSeq header field is
   represented by Method-Number (e.g., INVITE-32).
   It is important to note that the syntax for the examples in this
   section is for illustration purposes only, and is not a specific
   representation of a logging format.  It is expected that one or more
   documents will outline specific formats for logging.

   The examples use only the mandatory data elements defined in
   Section 8.  Extension elements are not considered.

   There are five principals in the examples below.  Alice is the
   initiator of requests; her user agent uses IPv4 address 198.51.100.1,
   port 5060.  P1 is a proxy that Alice's requests traverse on their way
   to Bob, the recipient of the requests.  P1 also acts as a registrar
   for Alice and uses an IPv4 address of 198.51.100.10, port 5060.  Bob
   has two instances of his user agent running on different hosts.  The
   first instance uses an IPv4 address of 203.0.113.1, port 5060, and
   the second instance uses an IPv6 address of 2001:db8::9, port 5060.
   P2 is a proxy responsible for Bob's domain.  Table 1 summarizes these
   addresses.

   +-------------------+--------------------+-------------------+
   | Principal         | IP:port            | Host/Domain name  |
   +-------------------+--------------------+-------------------+
   | Alice             | 198.51.100.1:5060  | alice.example.com |
   | P1                | 198.51.100.10:5060 | p1.example.com    |
   | P2                | 203.0.113.200:5060 | p2.example.net    |
   | Bob UA instance 1 | 203.0.113.1:5060   | bob1.example.net  |
   | Bob UA instance 2 | [2001:db8::9]:5060 | bob2.example.net  |
   +-------------------+--------------------+-------------------+

                  Principal to IP address assignment

                                Table 1

   Illustrative examples of SIP CLF follow.  These examples use the
   <allOneLine> tag defined in [RFC4475] to logically denote a single
   line.

9.1.  UAC registering with a proxy

   Alice sends a registration request to registrar P1 and receives a
   2xx-class response.  The REGISTER request causes Alice's UAC to
   produce the log record shown below.
   The mandatory data model elements correspond to those listed in
   Section 8.1.

   <allOneLine>
   1275930743.699 REGISTER-1 sip:example.com 198.51.100.10:5060
   ty7u7 sip:example.com sip:alice@example.com;tag=76yhh
   f81-d4-f6@example.com
   </allOneLine>

   After some time, Alice's UAC will receive a response from the
   registrar.  The response causes Alice's agent to produce the log
   record shown below.  The mandatory data elements correspond to those
   listed in Section 8.1.

   <allOneLine>
   1275930744.100 REGISTER-1 198.51.100.10:5060 200 ty7u7
   sip:example.com;tag=reg-98j
   </allOneLine>

9.2.  Direct call between Alice and Bob

   In this example, Alice sends a session initiation request directly to
   Bob's agent (instance 1).  Bob's agent accepts the session
   invitation.  We first present the SIP CLF logging from Alice's UAC
   point of view.  In line 1, Alice's user agent sends out the INVITE.
   Shortly, it receives a "180 Ringing" (line 2), followed by a "200 OK"
   response (line 3).  Upon the receipt of the 2xx-class response,
   Alice's user agent sends out an ACK request (line 4).

   <allOneLine>
   1275930743.699 INVITE-32 sip:bob@bob1.example.net
   203.0.113.1:5060 c-1-xt6 sip:bob@example.net
   sip:alice@example.com;tag=76yhh f82-d4-f7@example.com
   </allOneLine>

   <allOneLine>
   1275930745.002 INVITE-32 203.0.113.1:5060 180 c-1-xt6
   sip:bob@example.net;tag=b-in6-iu
   </allOneLine>

   <allOneLine>
   1275930746.100 INVITE-32 203.0.113.1:5060 200 c-1-xt6
   sip:bob@example.net;tag=b-in6-iu
   </allOneLine>

   <allOneLine>
   1275930746.120 ACK-32 sip:bob@bob1.example.net
   203.0.113.1:5060 c-1-xt6 sip:bob@example.net;tag=b-in6-iu
   sip:alice@example.com;tag=76yhh f82-d4-f7@example.com
   </allOneLine>

9.3.  Single downstream branch call

   In this example, Alice sends a session invitation request to Bob
   through proxy P1, which inserts a Record-Route header causing
   subsequent requests between Alice and Bob to traverse the proxy.  The
   SIP CLF log records correspond to the viewpoint of P1.
   The log records are presented one per logical line and the line
   numbers refer to Figure 1.

      Alice            P1              Bob
        +---INV--------->|               |      Line 1
        |                |               |
        |<---------100---+               |      Line 2
        |                |               |
        |                +---INV-------->|      Line 3
        |                |               |
        |                |<--------100---+      Line 4
        |                |               |
        |                |<--------180---+      Line 5
        |                |               |
        |<---------180---+               |      Line 6
        |                |               |
        |                |<--------200---+      Line 7
        |                |               |
        |<---------200---+               |      Line 8
        |                |               |
        +---ACK--------->|               |      Line 9
        |                |               |
        |                +---ACK-------->|      Line 10

                 Figure 1: Simple proxy-aided call flow

   <allOneLine>
   1 1275930743.699 INVITE-43 sip:bob@example.net
   198.51.100.1:5060 s-1-tr sip:bob@example.net
   sip:alice@example.com;tag=a1-1 tr-87h@example.com
   </allOneLine>

   <allOneLine>
   2 1275930744.001 INVITE-43 198.51.100.1:5060 100 s-1-tr -
   sip:bob@example.net
   </allOneLine>

   <allOneLine>
   3 1275930744.998 INVITE-43 sip:bob@bob1.example.net
   203.0.113.1:5060 s-1-tr c-1-tr sip:bob@example.net
   sip:alice@example.com;tag=a1-1 tr-87h@example.com
   </allOneLine>

   <allOneLine>
   4 1275930745.200 INVITE-43 203.0.113.1:5060 100 s-1-tr c-1-tr
   sip:bob@example.net;tag=b1-1
   </allOneLine>

   <allOneLine>
   5 1275930745.800 INVITE-43 203.0.113.1:5060 180 s-1-tr c-1-tr
   sip:bob@example.net;tag=b1-1
   </allOneLine>

   <allOneLine>
   6 1275930746.009 INVITE-43 198.51.100.1:5060 180 s-1-tr c-1-tr
   sip:bob@example.net;tag=b1-1
   </allOneLine>

   <allOneLine>
   7 1275930747.120 INVITE-43 203.0.113.1:5060 200 s-1-tr c-1-tr
   sip:bob@example.net;tag=b1-1
   </allOneLine>

   <allOneLine>
   8 1275930747.300 INVITE-43 198.51.100.1:5060 200 s-1-tr c-1-tr
   sip:bob@example.net;tag=b1-1
   </allOneLine>

   <allOneLine>
   9 1275930748.201 ACK-43 sip:bob@bob1.example.net
   198.51.100.1:5060 s-1-tr sip:bob@example.net;tag=b1-1
   sip:alice@example.com;tag=a1-1 tr-87h@example.com
   </allOneLine>

   <allOneLine>
   10 1275930749.100 ACK-43 sip:bob@bob1.example.net
   203.0.113.1:5060 s-1-tr c-1-tr sip:bob@example.net;tag=b1-1
   sip:alice@example.com;tag=a1-1 tr-87h@example.com
   </allOneLine>

9.4.  Forked call

   In this example, Alice sends a session invitation to Bob's proxy, P2.
   P2 forks the session invitation request to two registered endpoints
   corresponding to Bob's address-of-record.  Both endpoints respond
   with provisional responses.  Shortly thereafter, one of Bob's user
   agent instances accepts the call, causing P2 to send a CANCEL request
   to the second user agent.  P2 does not Record-Route; therefore, the
   subsequent ACK request from Alice to Bob's user agent does not
   traverse P2 (and is not shown below).

   Figure 2 depicts the call flow.  The SIP CLF log records correspond
   to the viewpoint of P2.  The log records are presented one per
   logical line and the line numbers refer to Figure 2.

                        Bob          Bob
    Alice       P2  (Instance 1) (Instance 2)
      +---INV--->|          |         |   Line 1
      |          |          |         |
      |<---100---+          |         |   Line 2
      |          |          |         |
      |          +---INV--->|         |   Line 3
      |          |          |         |
      |          +---INV----+-------->|   Line 4
      |          |          |         |
      |          |<---100---+         |   Line 5
      |          |          |         |
      |          |<---------+---100---+   Line 6
      |          |          |         |
      |          |<---180---+---------+   Line 7
      |          |          |         |
      |<---180---+          |         |   Line 8
      |          |          |         |
      |          |<---180---+         |   Line 9
      |          |          |         |
      |<---180---+          |         |   Line 10
      |          |          |         |
      |          |<---200---+         |   Line 11
      |          |          |         |
      |<---200---+          |         |   Line 12
      |          |          |         |
      |          +---CANCEL-+-------->|   Line 13
      |          |          |         |
      |          |<---------+---487---+   Line 14
      |          |          |         |
      |          +---ACK----+-------->|   Line 15
      |          |          |         |
      |          |<---------+---200---+   Line 16

                      Figure 2: Forked call flow

      1 1275930743.699 INVITE-43 sip:bob@example.net
      198.51.100.1:5060 s-1-tr sip:bob@example.net
      sip:alice@example.com;tag=al-1 tr-87h@example.com

      2 1275930744.001 INVITE-43 198.51.100.1:5060 100 s-1-tr -
      sip:bob@example.net

      3 1275930744.998 INVITE-43 sip:bob@bob1.example.net
      203.0.113.1:5060 s-1-tr c-1-tr sip:bob@example.net
      sip:alice@example.com;tag=a1-1 tr-87h@example.com

      4 1275930745.500 INVITE-43 sip:bob@bob2.example.net
      [2001:db8::9]:5060 s-1-tr c-2-tr sip:bob@example.net
      sip:alice@example.com;tag=a1-1 tr-87h@example.com

      5 1275930745.800 INVITE-43 203.0.113.1:5060 100 s-1-tr
      c-1-tr sip:bob@example.net;tag=b1-1

      6 1275930746.100 INVITE-43 [2001:db8::9]:5060 100 s-1-tr
      c-2-tr sip:bob@example.net;tag=b1-2

      7 1275930746.700 INVITE-43 [2001:db8::9]:5060 180 s-1-tr
      c-2-tr sip:bob@example.net;tag=b1-2

      8 1275930746.990 INVITE-43 198.51.100.1:5060 180 s-1-tr
      c-2-tr sip:bob@example.net;tag=b1-2

      9 1275930747.100 INVITE-43 203.0.113.1:5060 180 s-1-tr
      c-1-tr sip:bob@example.net;tag=b1-1

      10 1275930747.300 INVITE-43 198.51.100.1:5060 180 s-1-tr
      c-1-tr sip:bob@example.net;tag=b1-1

      11 1275930747.800 INVITE-43 203.0.113.1:5060 200 s-1-tr
      c-1-tr sip:bob@example.net;tag=b1-1

      12 1275930748.000 INVITE-43 198.51.100.1:5060 200 s-1-tr
      c-1-tr sip:bob@example.net;tag=b1-1

      13 1275930748.201 CANCEL-43 sip:bob@bob2.example.net
      [2001:db8::9]:5060 s-1-tr c-2-tr sip:bob@example.net
      sip:alice@example.com;tag=a1-1 tr-87h@example.com

      14 1275930748.991 INVITE-43 [2001:db8::9]:5060 487 s-1-tr c-2-tr
      sip:bob@example.net;tag=b1-2

      15 1275930749.455 ACK-43 sip:bob@bob2.example.net [2001:db8::9]:5060
      s-1-tr c-2-tr sip:bob@example.net;tag=b1-2
      sip:alice@example.com;tag=a1-1 tr-87h@example.com

      16 1275930750.001 CANCEL-43 [2001:db8::9]:5060 200 s-1-tr c-2-tr
      sip:bob@example.net;tag=b1-2

   The above SIP CLF log makes it easy to search for specific
   transactions or a state of the session.  On a Linux/Unix system, a
   command of "grep c-1-tr" on the above log will readily yield the
   information that an INVITE was sent to sip:bob@bob1.example.net and
   that it elicited a 100 followed by a 180 and then a 200.  The absence
   of the ACK request signifies that the ACK was exchanged end-to-end.
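
   The search described above can also be sketched programmatically.
   The following Python fragment is only an illustration, not part of
   the format: the function names are hypothetical, the sample is an
   excerpt of P2's downstream-side records (with the figure line
   numbers dropped), and it assumes each record occupies one physical
   line with whitespace-separated fields, as in the examples of this
   section.

```python
# Excerpt of P2's log from Section 9.4; one record per physical line.
LOG = [
    ("1275930744.998 INVITE-43 sip:bob@bob1.example.net "
     "203.0.113.1:5060 s-1-tr c-1-tr sip:bob@example.net "
     "sip:alice@example.com;tag=a1-1 tr-87h@example.com"),
    ("1275930745.500 INVITE-43 sip:bob@bob2.example.net "
     "[2001:db8::9]:5060 s-1-tr c-2-tr sip:bob@example.net "
     "sip:alice@example.com;tag=a1-1 tr-87h@example.com"),
    ("1275930745.800 INVITE-43 203.0.113.1:5060 100 s-1-tr "
     "c-1-tr sip:bob@example.net;tag=b1-1"),
    ("1275930747.100 INVITE-43 203.0.113.1:5060 180 s-1-tr "
     "c-1-tr sip:bob@example.net;tag=b1-1"),
    ("1275930747.800 INVITE-43 203.0.113.1:5060 200 s-1-tr "
     "c-1-tr sip:bob@example.net;tag=b1-1"),
]


def records_with_field(lines, value):
    """Return the records that contain `value` as a whole field.

    Unlike a plain substring grep, this does not match "c-1-tr"
    inside a longer token.
    """
    return [line for line in lines if value in line.split()]


def describe(record):
    """Summarize a record as a request or a response.

    Assumption drawn from the examples: in a request the third field
    is a SIP URI; in a response the fourth field is the status code.
    """
    fields = record.split()
    method = fields[1].rsplit("-", 1)[0]  # "INVITE-43" -> "INVITE"
    if fields[2].startswith(("sip:", "sips:")):
        return f"{method} request to {fields[2]}"
    return f"{fields[3]} response to {method}"


if __name__ == "__main__":
    for rec in records_with_field(LOG, "c-1-tr"):
        print(describe(rec))
```

   Matching on whole fields rather than substrings is a design choice
   of this sketch; it avoids false hits when one transaction identifier
   happens to be a prefix of another.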

   A command of "grep c-2-tr" yields a more complex scenario of sending
   an INVITE to sip:bob@bob2.example.net, receiving 100 and 180.
   However, the log makes it apparent that the request to
   sip:bob@bob2.example.net was subsequently CANCEL'ed before a final
   response was generated, and that the pending INVITE returned a 487.
   The ACK to the final non-2xx response and a 200 to the CANCEL request
   complete the exchange on that branch.

10.  Security Considerations

   A log file by its nature reveals both the state of the entity
   producing it and the nature of the information being logged.  To the
   extent that this state should not be publicly accessible and that the
   information is to be considered private, appropriate file and
   directory permissions should be attached to the log file.  The
   following threats may be considered for the log file while it is
   stored:

   o  An attacker may gain access to view the log file, or may
      surreptitiously make a copy of the log file for later viewing;
   o  An attacker may mount a replay attack by modifying existing
      records in the log file or inserting new records;
   o  An attacker may delete parts of the file or, indeed, the whole
      file.

   It is outside the scope of this document to specify how to protect
   the log file while it is stored on disk.  However, operators may
   consider using common administrative features such as disk encryption
   and securing log files [schneier-1].  Operators may also consider
   hardening the machine on which the log files are stored by
   restricting physical access to the host as well as restricting access
   to the files themselves.

   In the worst case, public access to the SIP log file provides the
   same information that an adversary can gain using network sniffing
   tools (assuming that the SIP traffic is in clear text).  If all SIP
   traffic on a network segment is encrypted, then as noted above,
   special attention must be directed to the file and directory
   permissions associated with the log file to preserve privacy such
   that only a privileged user can access the contents of the log file.

   Transporting SIP CLF files across the network poses special
   challenges as well.  The following threats may be considered while
   transferring log files or individual log records:

   o  An attacker may view the records;
   o  An attacker may modify the records in transit or insert previously
      captured records into the stream;
   o  An attacker may remove records in transit, or may stage a
      man-in-the-middle attack to deliver a partially or entirely
      falsified log file.

   It is also outside the scope of this document to specify protection
   methods for log files or log records that are being transferred
   between hosts.  However, operators may consider using common security
   protocols described in [RFC3552] to transfer log files or individual
   records.  Alternatively, the log file may be transferred through bulk
   methods that also guarantee integrity, or at least detect and alert
   to modification attempts.

   The SIP CLF represents the minimum fields that lend themselves to
   trend analysis and serve as information that may be deemed useful.

   Other formats can be defined that include more headers (and the body)
   from Section 8.  However, where to draw a judicious line regarding
   the inclusion of non-mandatory headers can be challenging.  Clearly,
   the more information a SIP entity logs, the longer the logging
   process will take, the more disk space the log entry will consume,
   and the more potentially sensitive information could be breached.
   Therefore, adequate tradeoffs should be taken into account when
   logging more fields than the ones recommended in Section 8.
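
   As an illustration of the integrity concern raised above for stored
   and transferred log files, the following Python fragment sketches
   one way a receiver might detect modification of a log file.  It is
   not a mechanism defined by this document: the choice of HMAC-SHA256,
   the function names, and the assumption that sender and receiver
   already share a secret key out of band are all made here for the
   sake of the example.

```python
import hashlib
import hmac


def log_digest(path, key):
    """Return a hex HMAC-SHA256 computed over the raw bytes of `path`.

    The file is read in fixed-size chunks so that a large log file
    does not have to fit in memory.
    """
    mac = hmac.new(key, digestmod=hashlib.sha256)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            mac.update(chunk)
    return mac.hexdigest()


def verify_log(path, key, expected_hex):
    """Return True if the file at `path` still matches `expected_hex`.

    compare_digest performs a constant-time comparison.
    """
    return hmac.compare_digest(log_digest(path, key), expected_hex)
```

   In such a scheme the sender would compute the digest before the
   transfer and convey it separately; any record that is modified,
   inserted, or removed changes the digest.  The sketch detects
   tampering only and does not provide confidentiality.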

   Implementers need to pay particular attention to buffer handling when
   reading or writing log files.  SIP CLF entries can be unbounded in
   length.  It would be reasonable for a full dump of a SIP message to
   be thousands of octets long.  This is of particular importance to CLF
   log parsers, as SIP CLF log writers may add one or more extension
   fields to the message to be logged.

11.  Operational guidance

   SIP CLF log files will take up a substantial amount of disk space
   depending on traffic volume at a processing entity and the amount of
   information being logged.  As such, any enterprise using SIP CLF
   should establish operational procedures for file rollovers as
   appropriate to the needs of the organization.

   Listing such operational guidelines is out of scope for this
   document.

   NOTE: Preliminary volume analysis was presented to the working group
   mailing list during the Anaheim IETF (please see
   http://www.ietf.org/mail-archive/web/sip-clf/current/msg00123.html
   for the analysis).  An open question is whether the working group
   thinks that this analysis should be put in this document.

12.  IANA Considerations

   This document does not require any considerations from IANA.

13.  Acknowledgments

   Members of the sipping, dispatch, ipfix and syslog working groups
   provided invaluable input to the formulation of the draft.  These
   include Benoit Claise, Spencer Dawkins, John Elwell, David
   Harrington, Christer Holmberg, Hadriel Kaplan, Atsushi Kobayashi,
   Jiri Kuthan, Scott Lawrence, Chris Lonvick, Simon Perreault, Adam
   Roach, Dan Romascanu, Robert Sparks, Brian Trammell, Dale Worley,
   Theo Zourzouvillys and others that we have undoubtedly, but
   inadvertently, missed.

   Rainer Gerhards, David Harrington, Cullen Jennings and Gonzalo
   Salgueiro helped tremendously in discussions related to arriving at
   the beginnings of a data model.

14.  References

14.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

14.2.  Informative References

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              June 2002.

   [RFC3552]  Rescorla, E. and B. Korver, "Guidelines for Writing RFC
              Text on Security Considerations", BCP 72, RFC 3552,
              July 2003.

   [RFC4474]  Peterson, J. and C. Jennings, "Enhancements for
              Authenticated Identity Management in the Session
              Initiation Protocol (SIP)", RFC 4474, August 2006.

   [RFC4475]  Sparks, R., Hawrylyshen, A., Johnston, A., Rosenberg, J.,
              and H. Schulzrinne, "Session Initiation Protocol (SIP)
              Torture Test Messages", RFC 4475, May 2006.

   [rieck2008]
              Rieck, K., Wahl, S., Laskov, P., Domschitz, P., and K-R.
              Muller, "A Self-learning System for Detection of Anomalous
              SIP Messages", Principles, Systems and Applications of IP
              Telecommunications Services and Security for Next
              Generation Networks (IPTComm), LNCS 5310, pp. 90-106,
              2008.

   [schneier-1]
              Schneier, B. and J. Kelsey, "Secure audit logs to support
              computer forensics", ACM Transactions on Information and
              System Security (TISSEC), 2(2), pp. 159-176, May 1999.

Authors' Addresses

   Vijay K. Gurbani (editor)
   Bell Laboratories, Alcatel-Lucent
   1960 Lucent Lane
   Naperville, IL  60566
   USA

   Email: vkg@bell-labs.com

   Eric W. Burger (editor)
   This space for sale
   USA

   Email: eburger@standardstrack.com
   URI:   http://www.standardstrack.com

   Tricha Anjali
   Illinois Institute of Technology
   316 Siegel Hall
   Chicago, IL  60616
   USA

   Email: tricha@ece.iit.edu

   Humberto Abdelnur
   INRIA
   INRIA - Nancy Grant Est
   Campus Scientifique
   54506, Vandoeuvre-les-Nancy Cedex
   France

   Email: Humberto.Abdelnur@loria.fr

   Olivier Festor
   INRIA
   INRIA - Nancy Grant Est
   Campus Scientifique
   54506, Vandoeuvre-les-Nancy Cedex
   France

   Email: Olivier.Festor@loria.fr