SIPCLF July 26, 2010 @ 1740

Minutes from: Shida Schubert and Adam Roach (many thanks!)
Edited by: Peter Musgrave co-chair

Vijay Gurbani -- Problem Statement
==================================

- Review of changes since -01 & -02. Summarized on slides; see
  slide deck for details.

- Brian: Should volume analysis be done in this doc? Is it not implementation specific?
- Peter: Agreed. 

- Hadriel: If this is a problem statement draft, why does it have
  a solution?

- Vijay: It's not a solution. It's a list of data elements, with
  an example representation to demonstrate how they can be used.

- Robert: So, is the example format a proposal?

- Vijay: No.

- Robert: Do you expect anyone to bring it forward as a proposal?

- Vijay: Not by me.

- Robert: Before this document is completed, these examples should
  be re-cast in one of the actually proposed formats.

- Vijay: That would delay publication.

- Robert: It doesn't really matter, you can still work with the
  contents of the document while developing format(s).

- Vijay: If the working group agrees, we can take it out.

- Robert: I would still propose using a proposed format.

- Hadriel: Just pull it out. It's a problem statement.

- Brian, ETH: Keeping the original HTTP CLF base as an example
  is useful to have an example of a sensical log message that
  it makes sense to represent. So, leave it in, but leave it in
  in a tabular format showing the name of the field, and what it
  might look like.

- Vijay: I'm okay with turning it into key/value tables.

- Benoit: Agrees that turning it into key/value table would be
  the best way forward.

- Chris Lonvick: Is the draft only US-ASCII, or UTF-8?

- Short clarification about the encoding of field types, answer
  is: it depends, some are UTF-8, some are just bits.


Vijay Gurbani on behalf of Gonzalo Salgueiro -- Indexed ASCII
=============================================================

- Short history of document heritage -- merging of
  draft-gurbani-sipping-clf with draft-roach-sipping-clf-syntax

- Overview of proposed logging format.

- Proposal to rearrange length/pointer block to be after the mandatory fields

- Adam: That breaks the efficiency
- Conclusion: Leave it as it is

- Peter: If I use a text editor, can I see something useful? (The resulting discussion concluded that text headers would be on every second line.)
- Hadriel: Body with a LF will break that
- Adam: If you want to use a text tool then turn off body logging

- Chris Lonvick: We need to take care that the length fields are
  accurate. You run into problems if you start logging and don't
  completely finish the record.

- Hadriel: If we adopt this, we need to make sure that there is
  a well-defined transform to/from the IPFIX solution.

- Slide 6: where do we go as a next step? (No conclusion.)
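
Editor's sketch of the length-accuracy concern raised above: one way to
avoid a record whose length field does not match its contents is to
assemble the whole record in memory, derive the length from the finished
bytes, and only then write it out. The layout below (6-digit hexadecimal
length, newline, tab-separated UTF-8 fields) is hypothetical and is NOT
the format proposed in the draft; the function names are illustrative.

```python
def build_record(fields):
    """Return one length-prefixed record as bytes.

    Hypothetical layout (not the draft's): a 6-digit hexadecimal payload
    length, a newline, then tab-separated UTF-8 fields and a final newline.
    """
    payload = "\t".join(fields).encode("utf-8") + b"\n"
    # The length is computed from the finished payload, so it cannot
    # disagree with what is actually written.
    header = b"%06X\n" % len(payload)
    return header + payload


def read_record(buf):
    """Return (fields, rest), or (None, buf) if the record is incomplete."""
    if len(buf) < 7:                 # header not fully present yet
        return None, buf
    length = int(buf[:6], 16)
    end = 7 + length
    if len(buf) < end:               # writer stopped part-way through
        return None, buf
    fields = buf[7:end - 1].decode("utf-8").split("\t")
    return fields, buf[end:]
```

A reader built this way detects a partially written record instead of
misinterpreting the bytes that follow it. Because it relies on the length
rather than on line breaks, a line feed inside a field does not confuse
it, although a field containing a tab would still need escaping in this
toy layout.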

Brian Trammell -- IPFIX
=======================

- Summary of changes since -00 -- see slides

- Quick intro to IPFIX -- see slides

- Description of IPFIX extensions for SIP -- see slides

- Summary of IPFIX information elements to be incorporated
  in logging -- see slides

- Adam asked a clarifying question about how to find variable-length
  fields.

- Question from Jabber: In the problem statement field, all
  fields are marked mandatory. Will that be true?

- Brian: In IPFIX, you'd have mandatory fields in the template.
  If you wanted to include a different set of information,
  you would need to make use of a second template.

- Vijay: Not all fields are mandatory. Implementations need to
  have the ability to log all the fields, but they can log a
  subset at run-time.

- Presentation of example request record message format and overhead

- Vijay: If I want to add a new header, do I have to create a new
  template?

- Brian: Yes, you would need to export a different template at
  the beginning of the file.

- Hadriel: So, can vendors add proprietary fields?

- Brian: Yes, that's the reason we have proprietary (enterprise
  number based) field numbers.

- ???: Do you close the file before changing a template?

- Brian: No, you can have more than one template active
  within the same file.

- Summary of advantages of approach.

- Discussion around searchability of logs -- answer was
  basically: (1) if it matters that much, put it in a
  relational database, and (2) we have two solutions; if
  the indexed ASCII format is more efficient, you can use
  it instead.

- Robert: The examples in the draft are very simplistic.
  It would be nice to see some more complex messages (cf RFC 4475,
  sections 3.1.1.2 and 3.1.1.11) to make certain the proposals
  can handle more complex messages.

- Brian: For IPFIX, we do plan to do this. It didn't make sense
  for a first example, but we're happy to include it.

- Hadriel: What layer do we expect SIPCLF to operate on? For
  example -- message size: if the message is compressed, do
  we include compressed size, uncompressed size?

- Vijay: There does have to be some parsing. The proposal is that
  the size would be the uncompressed size. Also, Indexed ASCII
  will be doing the torture tests also.

- Hadriel: why are we logging bodies? This isn't wireshark.

- Brian: Problem with log messages > 65535 in length.
  Suggest the body be "SIP body section" that includes only the
  first several bytes of the body. Also, include a 32-bit length
  field that says how large the body was originally?

- Brian: Are these likely to keep getting big?

- Hadriel: Probably. We should take this to the list.
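
Editor's sketch of the truncated "SIP body section" idea raised above:
log only a prefix of the body, and carry a 32-bit field recording how
large the body originally was. The field order, the extra 16-bit
"logged length" field, and the 256-byte cap are assumptions made for
illustration; the minutes only say "the first several bytes", and
nothing here is taken from the IPFIX draft.

```python
import struct

MAX_LOGGED_BODY = 256  # assumed cap; must stay <= 65535 to fit the 16-bit field


def encode_body_section(body, limit=MAX_LOGGED_BODY):
    """Pack original length (32-bit), logged length (16-bit), then the prefix.

    Both integers are in network byte order. Only the first `limit`
    bytes of the body are kept.
    """
    prefix = body[:limit]
    return struct.pack("!IH", len(body), len(prefix)) + prefix


def decode_body_section(data):
    """Return (original_length, logged_prefix) from an encoded section."""
    original_len, logged_len = struct.unpack_from("!IH", data)
    prefix = data[6:6 + logged_len]
    return original_len, prefix
```

With this shape, a 100000-byte body is logged as 6 + 256 bytes, and a
reader can still tell that the original body was 100000 bytes long.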


Chairs -- Open Discussion
=========================

- Chairs: What is the way forward vis a vis the two different
  approaches?

- Vijay: Consensus out of Anaheim was to do 2 distinct formats.

- Robert: What I heard in Anaheim was that there may be a community
  that wanted an ASCII-only (not indexed ASCII) format. That said,
  if it's just between indexed ASCII and IPFIX, and we can convert
  back & forth, then producing both would be okay. It's suboptimal,
  and would probably require rechartering. There are arguments for
  both, but also good arguments for saying we need only one.

- Benoit: If there's a mapping, then this is effectively one
  representation.

- Brian: Is arbitrary-length logging an actual requirement?
  It causes problems for both formats.

- Kurt Jager: Yes, we have a requirement to do exactly that.

- Peter: Is there a way to do a unified format that combines both?

- Chairs: Should we take a poll here or ask on the list?
  Should we continue on the 2 drafts in their current form
  and attempt to adopt them both, or to bring them together
  and have a single document that describes both formats?

- Cullen: Having two formats will harm interoperability. The two
  on the table have very little advantage over each other. Just
  pick one! It's what the charter requires.

- Robert: yes, we should get IESG buy-in if we plan to do two solutions.
  But, yes, there are functional differences between length limits,
  searchability, etc. So it's not exactly a coin flip.

- Hadriel: Really, we're addressing two different customers.
  One for more home/research environments; the other for more
  commercial products.

- Robert: I'm not sure that's true. Indexed ASCII, for example,
  is optimized for searchability. So, the differentiation is
  a bit more subtle than that.

- Cullen: SIP elements today are very expensive, and produce ASCII
  logs. So the argument that a "real" operator won't use ASCII
  is a bit bogus.

- Hadriel: if you need searchability, we can add that. It would
  be crazy. But you could do it.

[ missed a lot here because I was up at the mic, arguing ]

- Eric: Pick one. If there's a huge argument for one, it should be
  clear. If not, time for a deathmatch. But I like the ASCII (UTF8)
  one.

- Chairs: So, do we want two or just one?

*** RESULT OF HUM: STRONG CONSENSUS FOR JUST ONE FORMAT ***

- Chairs: Indexed ASCII or IPFIX?

*** RESULTS: SLIGHTLY MORE SUPPORT FOR IPFIX. WILL TAKE TO THE LIST. ***

- Robert: Really want to see the ugly examples from the SIP torture test.

- Robert: Are reqURIs logged?

- Vijay: Yes.

- Robert: Then, since a body can be embedded in an R-URI, there may be
  cases which break things. [Point being that this is a field which
  could then exceed 64K.]