idnits 2.17.1 draft-salgueiro-sipclf-indexed-ascii-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 4 instances of lines with private range IPv4 addresses in the document. If these are generic example addresses, they should be changed to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x, 198.51.100.x or 203.0.113.x. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (March 5, 2011) is 4802 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'I-D.gurbani-sipping-clf' is defined on line 489, but no explicit reference was found in the text == Outdated reference: A later version (-13) exists of draft-ietf-sipclf-problem-statement-04 -- Obsolete informational reference (is this intentional?): RFC 4474 (Obsoleted by RFC 8224) Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 SIPCLF G. 
Salgueiro 3 Internet-Draft Cisco Systems 4 Intended status: Standards Track V. Gurbani 5 Expires: September 6, 2011 Bell Labs, Alcatel-Lucent 6 A. B. Roach 7 Tekelec 8 March 5, 2011 10 Format for the Session Initiation Protocol (SIP) Common Log Format (CLF) 11 draft-salgueiro-sipclf-indexed-ascii-03 13 Abstract 15 The SIPCLF Workgroup has defined a common log format framework for 16 Session Initiation Protocol (SIP) servers. This common log format 17 mimics the wildly successful event logging mechanism found in well- 18 known web servers like Apache and web proxies like Squid. This 19 document proposes an indexed text encoding format for the SIP Common 20 Log Format (CLF) that retains the key advantages of a text-based 21 format, while significantly increasing processing performance over a 22 purely text-based implementation. This file format adheres to the 23 SIP CLF data model and provides an effective encoding scheme for all 24 mandatory and optional fields that appear in a SIP CLF record. 26 Status of this Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current Internet- 34 Drafts is at http://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on September 6, 2011. 43 Copyright Notice 45 Copyright (c) 2011 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 
48     This document is subject to BCP 78 and the IETF Trust's Legal
49     Provisions Relating to IETF Documents
50     (http://trustee.ietf.org/license-info) in effect on the date of
51     publication of this document.  Please review these documents
52     carefully, as they describe your rights and restrictions with respect
53     to this document.  Code Components extracted from this document must
54     include Simplified BSD License text as described in Section 4.e of
55     the Trust Legal Provisions and are provided without warranty as
56     described in the Simplified BSD License.

58  Table of Contents

60     1.  Terminology
61     2.  Introduction
62     3.  Format
63     4.  Example Record
64     5.  Text Tool Considerations
65     6.  Security Considerations
66     7.  Operational Guidance
67     8.  IANA Considerations
68     9.  Acknowledgements
69     10. References
70       10.1.  Normative References
71       10.2.  Informative References
72     Authors' Addresses

74  1.  Terminology

76     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
77     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
78     document are to be interpreted as described in [RFC2119].
80     [RFC3261] defines additional terms used in this document that are
81     specific to the SIP domain, such as "proxy"; "registrar"; "redirect
82     server"; "user agent server" or "UAS"; "user agent client" or "UAC";
83     "back-to-back user agent" or "B2BUA"; "dialog"; "transaction"; and
84     "server transaction".

86     This document uses the term "SIP Server", which is defined to
87     include the following SIP entities: user agent server, registrar,
88     redirect server, a SIP proxy in the role of user agent server, and a
89     B2BUA in the role of a user agent server.

91  2.  Introduction

93     The extensive list of benefits and the widespread adoption of the
94     Apache Common Log Format (CLF) have prompted the development of a
95     functionally equivalent event logging mechanism for the Session
96     Initiation Protocol [RFC3261] (SIP).  Implementing a logging scheme
97     for SIP is a considerable challenge.  This is due in part to the fact
98     that the behavior of a SIP entity is more complex than that of an
99     HTTP entity.  Additionally, there are shortcomings to the purely
100    text-based HTTP Common Log Format that need to be addressed in order
101    to allow for real-time inspection of SIP log files.  Experience with
102    the Apache Common Log Format has shown that dealing with large
103    quantities of log data can be very processor intensive, as doing so
104    necessarily requires reading and parsing every byte in the log
105    file(s) of interest.

107    An implementation-independent framework for the SIP CLF has been
108    defined in [I-D.ietf-sipclf-problem-statement].  This memo describes
109    an indexed text file format for logging SIP messages received and
110    sent by SIP clients, servers, and proxies that adheres to the data
111    model presented in Section 8 of [I-D.ietf-sipclf-problem-statement].
112    This document defines a format that is no more difficult for logging
113    entities to generate, while being radically faster to process.
       In
114    particular, the format is optimized both for rapidly scanning through
115    log records and for quickly locating commonly accessed data
116    fields.

118    Further, the format proposed by this document retains the key
119    advantage of being human readable and able to be processed using the
120    various Unix text processing tools, such as sed, awk, perl, cut, and
121    grep.

123 3.  Format

125    The Common Log Format for the Session Initiation Protocol
126    [I-D.ietf-sipclf-problem-statement] defines a data model to which
127    this format adheres.  Each SIP CLF record MUST consist of all the
128    mandatory data model elements outlined in Section 8.1 of
129    [I-D.ietf-sipclf-problem-statement].

131    The following figure depicts the format whereby all SIP CLF records
132    are encoded:

134     0         7 8        15 16       23 24       31
135    +-----------+-----------+-----------+-----------+
136    |  Version  |           Record Length           | 0 - 3
137    +-----------+-----------+-----------+-----------+
138    |       Record Length (cont)        |   0x2C    | 4 - 7
139    +-----------+-----------+-----------+-----------+
140    |            Flags Field            |   0x2C    | 8 - 11
141    +-----------+-----------+-----------+-----------+
142    |              CSeq Pointer (Hex)               | 12 - 15
143    +-----------+-----------+-----------+-----------+
144    |      Response Status-Code Pointer (Hex)       | 16 - 19
145    +-----------+-----------+-----------+-----------+
146    |              R-URI Pointer (Hex)              | 20 - 23
147    +-----------+-----------+-----------+-----------+
148    |   Destination IP address:port Pointer (Hex)   | 24 - 27
149    +-----------+-----------+-----------+-----------+
150    |     Source IP address:port Pointer (Hex)      | 28 - 31
151    +-----------+-----------+-----------+-----------+
152    |             To URI Pointer (Hex)              | 32 - 35
153    +-----------+-----------+-----------+-----------+
154    |             To Tag Pointer (Hex)              | 36 - 39
155    +-----------+-----------+-----------+-----------+
156    |            From URI Pointer (Hex)             | 40 - 43
157    +-----------+-----------+-----------+-----------+
158    |            From Tag Pointer (Hex)             | 44 - 47
159    +-----------+-----------+-----------+-----------+
160    |             Call-Id Pointer (Hex)             | 48 - 51
161    +-----------+-----------+-----------+-----------+
162    |           Server-Txn Pointer (Hex)            | 52 - 55
163    +-----------+-----------+-----------+-----------+
164    |           Client-Txn Pointer (Hex)            | 56 - 59
165    +-----------+-----------+-----------+-----------+
166    |            TLV Start Pointer (Hex)            | 60 - 63
167    +-----------+-----------+-----------+-----------+
168    |   0x0A    |                                   | 64 - 67
169    +-----------+                                   +
170    |                   Timestamp                   | 68 - 71
171    +                                   +-----------+
172    |                                   |   0x2E    | 72 - 75
173    +-----------+-----------+-----------+-----------+
174    |        Fractional Seconds         |   0x09    | 76 - 79
175    +-----------+-----------+-----------+-----------+
176    |                                               |
177    |                                               |
178    |               Mandatory Fields                |
179    |                                               |
180    |                                               |
181    +-----------+-----------+-----------+-----------+ \
182    |   0x09    |             Tag (Hex)             |  \
183    +-----------+-----------+-----------+-----------+   \  Repeated
184    | Tag (cont)|   0x2C    |     Length (Hex)      |    \ as many
185    +-----------+-----------+-----------+-----------+     > times as
186    |     Length (cont)     |   0x2C    |           |    / necessary
187    +-----------+-----------+-----------+           +   /
188    |                     Value                     |  /
189    +-----------+-----------+-----------+-----------+ /
190    |   0x0A    |
191    +-----------+

193                     Figure 1: SIP Common Log Format

195    Note that "hexadecimal encoded" indicates that the value is to be
196    written out in human-readable base-16 numbers using the ASCII
197    characters 0x30 through 0x39 ('0' through '9') and 0x41 through 0x46
198    ('A' through 'F').  Similarly, "decimal encoded" indicates that the
199    value is to be written out in human-readable base-10 numbers using
200    the ASCII characters 0x30 through 0x39 ('0' through '9').  In both
201    encodings, numbers always take up the number of bytes indicated, and
202    are padded on the left with ASCII '0' characters to fill the entire
203    space.

205    First, a 64-byte header indicates meta-data about the record.

207    Version (1 byte):  0x41 for this document; hexadecimal encoded.
209    Record Length (6 bytes):  Hexadecimal encoded total length of this
210       log record, including the "Flags" and "Record Length" fields and
211       the terminating line feed.

213    Flags Field (3 bytes):

215       byte 1 - Request/Response flag

217          R = request
218          r = response

220       byte 2 - Retransmission flag

222          o = original transmission
223          d = duplicate transmission
224          s = server is stateless [i.e., retransmissions are not
225              detected]

227       byte 3 - Sent/Received flag

229          u = received UDP message
230          t = received TCP message
231          l = received TLS message
232          U = sent UDP message
233          T = sent TCP message
234          L = sent TLS message

236    Bytes 12 through 59 contain hexadecimal encoded pointers that point
237    to the values of variable-length mandatory fields.  Together with the
238    TLV Start Pointer in bytes 60 through 63, they are packed, without
239    delimiters, into a single 52-character hexadecimal encoded string.
240    These twelve "Pointer" fields indicate absolute byte positions within
241    the record, and MUST be >=80.  They point to the start of the
242    corresponding value within the "Mandatory Fields" area.

244    When a given mandatory field is not applicable in the SIP CLF record
245    (i.e., a particular entity is not able to log that field because it
246    does not make sense for the role the entity is playing in the SIP
247    ecosystem), it MUST be encoded as a single ASCII dash (0x2D) and
248    consequently has a length of 1.  The final pointer, "TLV Start
249    Pointer," points to the ASCII Tab (0x09) character for the first
250    entry in the Tag/Length/Value area; if no such entries are present,
251    this value is set to zero.

253    CSeq:  The Command Sequence header field, including the CSeq number
254       and method name.

256    Response Status-Code:  Set to the value of the SIP response status
257       code for responses.  Set to a single ASCII dash (0x2D) for
258       requests.

260    R-URI:  The Request-URI in the start line (mandatory in request),
261       including any URI parameters.
263 Destination IP address:port The IP address of the downstream server, 264 including the port number. The port number MUST be separated from 265 the IP address by a single ':'. 267 Source IP address:port The IP address of the upstream client, 268 including the port number over which the SIP message was received. 269 The port number MUST be separated from the IP address by a single 270 ':'. 272 To URI: Value of the URI in the To header field. 274 To Tag: Value of the tag parameter (if present) in the To header 275 field. 277 From URI: Value of the URI in the From header field. 279 From Tag: Value of the tag parameter in the From header field. 281 Whilst one may question the value of the From URI in light of 282 [RFC4474], the From URI, nonetheless, imparts some information. For 283 one, the From tag is important and, in the case of a REGISTER 284 request, the From URI can provide information on whether this was a 285 third-party registration or a first-party one. 287 Call-Id: The value of the Call-ID header field. 289 Server-Txn: Server transaction identification code - the transaction 290 identifier associated with the server transaction. 291 Implementations can reuse the server transaction identifier (the 292 topmost branch-id of the incoming request, with or without the 293 magic cookie), or they could generate a unique identification 294 string for a server transaction (this identifier needs to be 295 locally unique to the server only.) This identifier is used to 296 correlate ACKs and CANCELs to an INVITE transaction; it is also 297 used to aid in forking. (See Section 9 of 298 [I-D.ietf-sipclf-problem-statement] for usage.) 300 Client-Txn: Client transaction identification code - this field is 301 used to associate client transactions with a server transaction 302 for forking proxies or B2BUAs. 
          Upon forking, implementations can
303       reuse the value they inserted into the topmost Via header's branch
304       parameter, or they can generate a unique identification string for
305       the client transaction.  (See Section 9 of
306       [I-D.ietf-sipclf-problem-statement] for usage.)

308    Following the pointers, several fixed-length fields are encoded.  As
309    before, all fields are completely filled, with values left-padded
310    with '0' characters as necessary.

312    Timestamp (10 bytes):  Date and time of the request or response
313       represented as the number of seconds since the Unix epoch (i.e.,
314       seconds since midnight, January 1st, 1970, GMT).  Decimal encoded.

316    Fractional Seconds (3 bytes):  Fractional seconds portion of the
317       Timestamp field to millisecond accuracy.  Decimal encoded.

319    Mandatory Field Data:  Contains actual values for the mandatory
320       fields.  This data MUST appear in the order listed, and each field
321       MUST be present.  Fields are separated by a single ASCII Tab
322       character (0x09).  Any tab characters present in the data to be
323       written are replaced by an ASCII space character (0x20) prior to
324       being logged.  If a given mandatory field is not applicable, it
325       MUST be encoded as a single ASCII dash (0x2D).

327    After the "Mandatory Fields" section, the OPTIONAL Tag/Length/Value
328    groups appear zero or more times.  These TLV groups allow SIP CLF
329    implementers the flexibility to extend the logging capability of the
330    indexed-ASCII representation beyond just the mandatory log elements.
331    Their location within the SIP CLF record is indicated by the "TLV
332    Start Pointer" field.  This "TLV Start Pointer" field MUST be set to
333    0x0000 if the OPTIONAL TLV groups are not implemented.

335    Tag Field (4 bytes):  indicates the type of value coded by this TLV;
336       hexadecimal encoded.
Currently defined tags are: 338 0x0000 - Contact value (can be repeated) Contains entire value of 339 Contact header field 341 0x0001 - Remote Host (mandatory in request) The DNS name of the 342 IP address from which the message was received (if "Sent/ 343 Received flag" is set to "r"). The DNS name of the IP address 344 to which the message is being sent (if "Sent/Received flag" is 345 set to "s") 347 0x0002 - Authenticated User Contains the user name by which the 348 user has been authenticated 350 0x0003 - Complete SIP Message (SHOULD be omitted by default) 351 Contains complete SIP message. Can be repeated multiple times 352 to accommodate SIP messages that exceed 65535 bytes in length. 354 Length Field (2 bytes): indicates the length of the value coded in 355 this TLV, hexadecimal encoded. This length does NOT include the 356 TLV header. 358 Value Field (0 to 65535 bytes): contains the actual value of this 359 TLV. As with the mandatory fields, ASCII Tab characters (0x09) 360 are replaced with ASCII space characters (0x20). 362 4. Example Record 364 The following SIP message is an INVITE request sent by a SIP client: 366 INVITE sip:192.168.217.74 SIP/2.0 367 To: 368 Call-ID: DL70dff590c1-1079051554@petermac.magor.local. 
369 From: "PeterM" ; 370 tag=DL88360fa5fc;epid=0x34619b0 371 CSeq: 1 INVITE 372 Max-Forwards: 70 373 Via: SIP/2.0/TCP 192.168.217.117:5060; 374 branch=z9hG4bK-1f6be070c4-DL 375 Contact: "1001" 376 Allow: INVITE,CANCEL,ACK,OPTIONS,INFO,SUBSCRIBE,NOTIFY,BYE, 377 MESSAGE,UPDATE,REFER 378 Supported: replaces,norefersub 379 User-Agent: Dylogic Mirial 7.0.33 380 Content-Type: application/sdp 381 Content-Length: 418 383 v=0 384 o=1001 1456139204 0 IN IP4 192.168.217.117 385 s=- 386 i=Dylogic Mirial 7.0.33 387 c=IN IP4 192.168.217.117 388 b=AS:2048 389 t=0 0 390 m=audio 13756 RTP/AVP 0 101 391 a=rtpmap:0 PCMU/8000 392 a=rtpmap:101 telephone-event/8000 393 a=fmtp:101 0-16 394 a=x-mpdp:192.168.217.117:13756 395 m=video 13758 RTP/AVP 96 396 a=rtpmap:96 H264/90000 397 a=fmtp:96 profile-level-id=420015; max-mbps=47520; max-fs=1584; 398 max-dpb=7680 399 a=x-mpdp:192.168.217.117:13758 401 Shown below is approximately how this message would appear as a 402 single record in a SIP CLF logging file if encoded according to the 403 syntax described in this document. Due to internet-draft 404 conventions, this log entry has been split into seven lines, instead 405 of the two lines that actually appear in a log file; and the tab 406 characters have been padded out using spaces to simulate their 407 appearance in a text terminal. 409 A000120,Rou,0051005A005C006F0083009900AC00AE00D200DF010D01170000 411 0000000000.010 1 INVITE - sip:192.168.217.74 412 192.168.217.74:5060 192.168.217.117:56485 413 sip:192.168.217.74 - sip:1001@petermac.magor.local.:5060 414 DL88360fa5fc DL70dff590c1-1079051554@petermac.magor.local. 
415    server-tx       client-tx

417    A Base64 encoded version of this log entry (without the changes
418    required to format it for an internet-draft) is shown below:

420    begin-base64 644 clf_record
421    QTAwMDEyMCxSb3UsMDA1MTAwNUEwMDVDMDA2RjAwODMwMDk5MDBBQzAwQUUwMEQyMDBERjAx
422    MEQwMTE3MDAwMAowMDAwMDAwMDAwLjAxMAkxIElOVklURQktCXNpcDoxOTIuMTY4LjIxNy43
423    NAkxOTIuMTY4LjIxNy43NDo1MDYwCTE5Mi4xNjguMjE3LjExNzo1NjQ4NQlzaXA6MTkyLjE2
424    OC4yMTcuNzQJLQlzaXA6MTAwMUBwZXRlcm1hYy5tYWdvci5sb2NhbC46NTA2MAlETDg4MzYw
425    ZmE1ZmMJREw3MGRmZjU5MGMxLTEwNzkwNTE1NTRAcGV0ZXJtYWMubWFnb3IubG9jYWwuCXNl
426    cnZlci10eAljbGllbnQtdHgK
427    ====

429 5.  Text Tool Considerations

431    This format has been designed to allow text tools to easily process
432    logs without needing to understand the indexing format.  Index lines
433    may be rapidly discarded by checking the first character of the line:
434    index lines will always start with an alphabetical character, while
435    field lines will start with a numerical character.

437    Within a field line, script tools can quickly split fields at the tab
438    characters.  The first 13 fields (the timestamp and the 12 mandatory
439    fields) are positional, and the meaning of any subsequent fields can
440    be determined by checking the first four characters of the field.
441    Alternatively, these non-positional fields can be located using a
442    regular expression.  For example, the "Contact value" in a request
443    can be found by searching for the perl regex /\t0000,....,([^\t]*)/.

445    Note also that requests can be distinguished from responses by
446    checking the third positional field -- for requests, it will always
447    be set to "-"; any other value indicates a response.

449 6.  Security Considerations

451    This document does not introduce any new security considerations
452    beyond those in [I-D.ietf-sipclf-problem-statement].

454 7.  Operational Guidance

456    SIP CLF log files will take up a substantial amount of disk space,
457    depending on traffic volume at a processing entity and the amount of
458    information being logged.
As such, any enterprise using SIP CLF 459 should establish operational procedures for file rollovers as 460 appropriate to the needs of the organization. 462 Listing such operational guidelines in this document is out of scope 463 for this work. 465 8. IANA Considerations 467 This document does not require any considerations from IANA. 469 9. Acknowledgements 471 TBD 473 10. References 475 10.1. Normative References 477 [I-D.ietf-sipclf-problem-statement] 478 Gurbani, V., Burger, E., Anjali, T., Abdelnur, H., and O. 479 Festor, "The Common Log Format (CLF) for the Session 480 Initiation Protocol (SIP)", 481 draft-ietf-sipclf-problem-statement-04 (work in progress), 482 October 2010. 484 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 485 Requirement Levels", BCP 14, RFC 2119, March 1997. 487 10.2. Informative References 489 [I-D.gurbani-sipping-clf] 490 Gurbani, V., Burger, E., Anjali, T., Abdelnur, H., and O. 491 Festor, "The Common Log File (CLF) format for the Session 492 Initiation Protocol (SIP)", draft-gurbani-sipping-clf-01 493 (work in progress), March 2009. 495 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 496 A., Peterson, J., Sparks, R., Handley, M., and E. 497 Schooler, "SIP: Session Initiation Protocol", RFC 3261, 498 June 2002. 500 [RFC4474] Peterson, J. and C. Jennings, "Enhancements for 501 Authenticated Identity Management in the Session 502 Initiation Protocol (SIP)", RFC 4474, August 2006. 504 Authors' Addresses 506 Gonzalo Salgueiro 507 Cisco Systems 508 7200-12 Kit Creek Road 509 Research Triangle Park, NC 27709 510 US 512 Email: gsalguei@cisco.com 514 Vijay Gurbani 515 Bell Labs, Alcatel-Lucent 516 1960 Lucent Lane 517 Rm 9C-533 518 Naperville, IL 60563 519 US 521 Email: vkg@bell-labs.com 523 Adam Roach 524 Tekelec 525 17210 Campbell Rd. 526 Suite 250 527 Dallas, TX 75252 528 US 530 Email: adam@nostrum.com
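
As a non-normative illustration of the record layout in Section 3 and the tab-splitting approach in Section 5, the following Python sketch decodes the 64-byte index line and then splits the field line at tab characters. The function name, the dictionary keys, and the short field names are hypothetical and do not come from this document. The sketch decodes the thirteen pointers but does not dereference them, and it assumes that each optional group is laid out as a 4-character tag, a comma, a 4-character length, a comma, and the value, as the regular expression in Section 5 suggests.

```python
# Illustrative sketch only; all names below are hypothetical.
FIELD_NAMES = [
    "cseq", "status", "r_uri", "dst", "src", "to_uri", "to_tag",
    "from_uri", "from_tag", "call_id", "server_txn", "client_txn",
]


def parse_record(record: str) -> dict:
    """Split one record into its index-line header, fields, and TLVs."""
    index_line, field_line = record.rstrip("\n").split("\n", 1)
    if len(index_line) != 64:
        raise ValueError("index line must be exactly 64 bytes")

    # Thirteen 4-character hexadecimal pointers occupy bytes 12 through 63.
    pointers = [int(index_line[i:i + 4], 16) for i in range(12, 64, 4)]

    # The field line is tab separated: timestamp, then 12 mandatory fields.
    columns = field_line.split("\t")
    seconds, _, fraction = columns[0].partition(".")

    tlvs = []
    for column in columns[13:]:
        # Assumed layout of each optional group: "TTTT,LLLL,value".
        tag, length, value = column[0:4], column[5:9], column[10:]
        tlvs.append((int(tag, 16), int(length, 16), value))

    return {
        "version": index_line[0],
        "record_length": int(index_line[1:7], 16),
        "flags": index_line[8:11],
        "pointers": pointers,
        "timestamp": (int(seconds), int(fraction)),
        "fields": dict(zip(FIELD_NAMES, columns[1:13])),
        "tlvs": tlvs,
    }


# The example record from Section 4, with real tab and line-feed characters.
EXAMPLE = (
    "A000120,Rou,0051005A005C006F0083009900AC00AE00D200DF010D01170000\n"
    "0000000000.010\t1 INVITE\t-\tsip:192.168.217.74\t"
    "192.168.217.74:5060\t192.168.217.117:56485\t"
    "sip:192.168.217.74\t-\tsip:1001@petermac.magor.local.:5060\t"
    "DL88360fa5fc\tDL70dff590c1-1079051554@petermac.magor.local.\t"
    "server-tx\tclient-tx\n"
)

if __name__ == "__main__":
    parsed = parse_record(EXAMPLE)
    print(parsed["flags"], parsed["fields"]["cseq"])
```

Splitting at tabs keeps the sketch independent of how the pointer values are counted. A reader that needs random access to a single field could instead slice the record at the decoded pointer positions.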