Working Group chairs:
Spencer Dawkins: spencer@wonderhamster.org
Theo Zourzouvillys: theo@crazygreek.co.uk
Scribe: Ben Campbell: ben@nostrum.com
Spencer is solo chair for this meeting. Theo sends his
regrets (from
Vijay Gurbani
draft-gurbani-sipclf-problem-statement
*** Vijay presents slides. Lots of contributors.
Problem scope as chartered was really narrow. We have
transport, operational and security considerations out of scope, so we’re
mostly writing log entries into a file, and focusing on the format that will be
used for the log entries.
Common Log Format - summary of an application PDU. Our biggest problem is knowing what to write – SIP isn’t request/response, and dialogs can take quite some time to be processed (INVITE to 200 OK may ring for a number of seconds, then CFNA to another SIP UAS…).
We are relying on Hadriel’s Session ID draft for
correlation, but Hadriel didn’t rev his draft for this IETF, and it’s still an
individual contribution.
*** Question to Room - How many familiar with HTTP CLF
Answer: About 8 hands.
Daryl Malas: CLF is a challenging thing in general. Draft says this doesn't replace CDR, but people may adopt it because it is more efficient to get info from than CDR, or because they just want summary info. What’s the differentiator? Is it more efficient? Need to know what CDR is used for, and why this isn’t duplication.
Vijay: Some SIP is run outside carriers right now, so they
may not even have CDR. CDRs just record what was
necessary to charge. We need a standardized format when we go multivendor –
none of the vendor tools are going to read all the vendor formats.
Daryl (Malas): Issue is, do people
care about a common CLF across application servers, vs. proprietary formats?
Vijay: Even though Apache HTTP CLF is not standardized, even
MS supports it.
Adam Roach: CDR is not adequate for what we need. Common
format allows tools that read logs from multiple vendor devices. (Eric Burger
expressed agreement)
Daryl: Operators are doing a lot more than charging with
their CDRs. If Adam's multi-vendor environment is
important, it was not clearly articulated in the draft.
Vijay: Original vision was single-server CLF, but that
changed.
Eric Burger: Reiterates more people using SIP than just
carriers. MS adopted Apache HTTP CLF because they were forced to by the market.
Laura Liess - Deutsche
Telekom: People use Wireshark for tracing, analysis, and troubleshooting. They
need an end-to-end session ID. CLF is useless without a session ID.
Daryl: Goal is a common format that can correlate, so there
has to be some identifier. Session ID is a dependency if this stuff is going to
be real – not going to be able to tell another carrier “show me your logs”, but
need to troubleshoot end-to-end.
Vijay: Important to look at message as it crosses mesh, and
have some sort of global string that identifies messages.
Daryl: Cross service provider tracing is not going to work.
*** Vijay presents challenges in defining SIP CLF
We have to be able to handle serial and parallel forking in SIP – that’s not in HTTP, either.
ACK and CANCEL are also both special cases that need to be accommodated – we need to be able to construct a tree that shows us what happened to a call (Spencer’s opinion, which he didn’t say out loud during the meeting – we need to explicitly describe this as “a tree” in the document – that was much clearer to Spencer, but that’s not what the document says now).
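(ed: The "tree" idea above can be sketched roughly as follows. This is a minimal illustration, not anything from the draft: the record layout, the rec_id/parent_id fields, and the sample values are all hypothetical. It assumes each log record carries some pointer to the record that caused it, which is exactly the kind of information the working group would have to decide to log.)

```python
from collections import defaultdict

# Hypothetical log records: (rec_id, parent_id, method, final_status).
# These field names are assumptions for illustration; the draft does not define them.
RECORDS = [
    ("r1", None, "INVITE", 200),   # request as received from the caller
    ("r2", "r1", "INVITE", 487),   # parallel fork that was later cancelled
    ("r3", "r1", "INVITE", 200),   # parallel fork that was answered
    ("r4", "r2", "CANCEL", 200),   # CANCEL sent on the losing fork
    ("r5", "r3", "ACK", None),     # ACK on the answered fork (no response)
]

def build_tree(records):
    """Index records by id and group child ids under their parent id."""
    children = defaultdict(list)
    by_id = {}
    root = None
    for rec_id, parent_id, method, status in records:
        by_id[rec_id] = (method, status)
        if parent_id is None:
            root = rec_id
        else:
            children[parent_id].append(rec_id)
    return root, by_id, children

def render(root, by_id, children, depth=0):
    """Return the tree as indented text lines, one per record."""
    method, status = by_id[root]
    label = f"{method} -> {status}" if status is not None else method
    lines = ["  " * depth + label]
    for child in children[root]:
        lines.extend(render(child, by_id, children, depth + 1))
    return lines

if __name__ == "__main__":
    print("\n".join(render(*build_tree(RECORDS))))
```

Serial forking would look the same in this sketch, with the forks simply logged at later times; the point is only that ACK and CANCEL hang off the branch they belong to.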
Vijay is showing ASCII in his example – that’s a decision
that the working group will need to make.
*** Vijay presents set of SIP CLF fields
Spencer (channeling Robert): Session ID may be important for
us, but not in scope now. We can talk about it if we find that helpful, but we
can’t put it in a document _yet_.
*** Vijay presents CLF example
Dave Harrington: Concern that this is based on a proprietary
format. Suggest a better labeled format that works with existing tools like XML
– XML is self-labeling. Thinks the format we seem to be leaning toward is a
mistake.
Vijay: XML is visually hard to read.
Dave: So is this format. There are tools to display XML.
Adam: The examples are not a real proposed format, just
examples. But one problem with XML is that this will create vast amounts of
data fast. Need to be able to search very quickly - looking for a needle in a
haystack. XML would take minutes with terabytes of data. Adam's draft has
indexing scheme to allow fast discard of non-interesting records.
Eric Burger: What Adam said.
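(ed: To illustrate the "fast discard" argument, here is a minimal sketch of one way an indexed ASCII record could work. It is not the layout from Adam's draft; the 4-hex-digit header, the field order, and the function names are assumptions made up for this example. The idea shown is only that a fixed-width offset at the start of each line lets a reader test one field without tokenizing the whole record.)

```python
HEADER_LEN = 4  # hypothetical: 4 hex digits giving the offset of the status field

def encode(fields, status):
    """Build one ASCII record: '<offset> <fields...> <status>'.

    fields is a list of strings of any length; status is a 3-digit string.
    The header stores where the status starts, counted from the line start.
    """
    body = " ".join(fields) + " "
    offset = HEADER_LEN + 1 + len(body)  # header, one space, then the body
    return f"{offset:04x} {body}{status}"

def status_matches(line, wanted):
    """Check the status field by jumping straight to its offset."""
    offset = int(line[:HEADER_LEN], 16)
    return line[offset:offset + 3] == wanted

if __name__ == "__main__":
    log = [
        encode(["1275930743.699", "INVITE", "sip:bob@example.net"], "200"),
        encode(["1275930744.100", "INVITE", "sip:a-much-longer-user@example.org"], "486"),
    ]
    # Keep only the records of interest; everything else is discarded cheaply.
    for line in log:
        if status_matches(line, "486"):
            print(line)
```

The record stays plain ASCII, so text tools still work on it; the header only adds a shortcut for programs that know about it.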
Spencer: 3 and a half proposals going into this meeting (ASCII, indexed-ASCII, PCAP, and IPFIX – Hadriel didn’t write a draft about this last one, but he’s been talking about it). Charter allows both binary and text formats as long as they are mappable.
Talking about content and format are separate conversations.
Let’s talk about content.
Dave: Just because you can do binary doesn't mean you
should. (Take it from an SNMP guy.) Look at RFC 3535 to see what operators had
to say. If we are worried about speed,
look at IPFIX.
Eric: IPFIX was discussed in
Sumanth Channabasappa: Will we have any filters built into this?
Vijay: That would be downstream of the format we’re
discussing here. Filtering is up to you.
Sumanth: Talking about speed and
data size, be cognizant that there are tools that help do this in XML.
Vijay: (pointed out problems with several formats) He only
learned of IPFIX recently--does he need to implement entire framework?
Spencer: Need to get through problem statement side before
starting food fight.
Dave: Not knowing IPFIX is not a good reason to ignore it.
Lots of people worked to define a way to get important data off of box quickly.
*** Spencer: Is the problem statement done?
Daryl: Not done. Wants draft to show compelling reason to do
this work. Does not see clearly stated reason why existing carriers and vendors
would choose to use this rather than their existing proprietary tools.
Vijay: Non-operators who do not care about CDRs may want this.
Daryl: Not seeing a compelling statement. Sees lots of stuff
on what we can do, but not enough on why what exists today will not work.
Eric: What exists today often works
for that product, but not across multiple products, or products without CDR.
There's a bunch of almost compatible stuff that exists that may be useful
(example Apache CLF).
Spencer: Are you saying we need info in the draft that says
that?
Eric: Not if he has to write it :-)
… but “yes” …
Daryl: that info should be in the draft.
Eric, restating for minutes: We need language on how log
data is captured by existing products in almost compatible ways. Our experience
with Apache CLF shows value in being able to correlate records across various pieces
of network equipment. Lots of open source tools have sprung up around Apache
CLF.
Daryl: For this to be useful, we need session ID. Most of
these devices are B2BUAs. Without a session ID you can only look at one system
at a time. What value is that over proprietary formats? Is this defining just
CLF, or a correlation methodology?
Vijay: Just the format.
Adam: Lots of SBCs out there,
ability to correlate is absolutely necessary. More than one
way to skin the cat. One is an identifier,
another is to define the format to show before and after IDs. This is not
intractable and should not block starting the work.
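(ed: The "before and after IDs" alternative can be sketched as below. This is only an illustration under stated assumptions: each B2BUA is assumed to log the Call-ID it received and the Call-ID it sent, and the element names, record layout, and ID values are hypothetical. Forking, where one incoming ID maps to several outgoing IDs, is deliberately ignored here.)

```python
# Hypothetical records, one per B2BUA traversal: (element, call_id_in, call_id_out).
# Neither the field names nor the values come from any draft.
RECORDS = [
    ("sbc-a", "aaa@ua1", "bbb@sbc-a"),
    ("as-1", "bbb@sbc-a", "ccc@as-1"),
    ("sbc-b", "ccc@as-1", "ddd@sbc-b"),
    ("sbc-a", "zzz@ua9", "yyy@sbc-a"),  # unrelated call through the same SBC
]

def trace(records, first_id):
    """Follow a call hop by hop, matching each 'out' ID to the next 'in' ID."""
    next_hop = {cid_in: (element, cid_out) for element, cid_in, cid_out in records}
    path = []
    seen = set()          # guards against a loop in malformed logs
    current = first_id
    while current in next_hop and current not in seen:
        seen.add(current)
        element, cid_out = next_hop[current]
        path.append(element)
        current = cid_out
    return path

if __name__ == "__main__":
    print(trace(RECORDS, "aaa@ua1"))
```

This only works when the logs from every element along the path are available to one tool, which is the multi-vendor, single-operator case discussed earlier in the meeting.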
Spencer: Chartering discussion allowed for inserting a
session id later, just not in scope for our work today. My previous gig was
correlating SIP calls monitored at multiple points as they moved through a
network--it's a hard problem and the heuristics sometimes fail. Wants to make this a simple problem.
Daryl: Problem does span beyond SBCs.
Now more application servers are B2BUAs that break signaling info.
Adam: Meant B2BUAs in general, not just SBCs.
*** Spencer (chair): Some presentation-level editing needed. Are there other things needed in the draft before it goes forward?
Captured need for more compelling language on why we need
this. Talked about session ID. Anything
else?
Daryl: Understands we want to close this so we can work on
format selection, but would like to see the updates, read through it again,
before passing judgment.
Chair and Vijay: If you haven't read draft, please do.
Spencer: DISPATCH process does not remove need to review things. DISPATCH process allows expedited WG setup--need to maintain momentum.
*** Spencer: Close of discussion on problem statement, look
at next version and adopt. Moving to solution space... Key decisions to make: ASCII
vs binary, different formats in each category. Are we
recording transactions, messages, dialogs?
Vijay: Recording fields from messages.
*** Vijay presents slides on solution space.
There is a draft called SIPFIX for sending data using IPFIX.
Eric: When we started we said IPFIX wouldn't work. Anyone
remember why?
Vijay: Started looking at it in
Dan Romascanu: One argument was that the SIP session setup
does not match classic flow described by IPFIX. Issue has been answered in
(missed name) draft.
Vijay: Also discussed SYSLOG. Decided that
SIP CLF would be too much data for SYSLOG. Summary info to SYSLOG could
make sense.
Eric: We should pick one and only one format. We could have
other formats, but keep them in separate docs because people are only going to
implement the simple ASCII formats.
Dave: You don't want SYSLOG. (Dave is chair of SYSLOG wg). IPFIX might be okay.
Spencer: Where are we? We've identified fields. Have editing
to do on problem statement draft. Then we have work to determine encoding of
the fields.
Vijay: Agreed. But Vijay does not have enough background in IPFIX
to work on that.
Adam: Someone who understands this and can figure out IPFIX,
_please_ write something up. For due diligence purposes, we need to analyze
this.
*** Spencer: Anyone in the room willing and able to go look
at this? (crickets).
Dave: Suggest posting to IPFIX mailing list. See if anyone
there is interested enough to come get involved here.
Vijay: Wants to preserve ASCII nature so you can use tools
like grep, perl, etc.
Spencer: Can anyone talk about how else this might be used?
(ed: missed name, NTT): (ed: missed comment)
Daryl: Wants to stop capturing every SIP message. Main
application of CLF is to stop doing that. Will still capture
every session but in shorter more readable format. But if you find a
problem in CLF, will the troubleshooters want to go back and get full messages?
... Since CLF would be smaller, maybe it could be captured
by UAs as well.
Vijay: Format the same whether it is collected by proxy or
UA.
Eric: Keep coming back to people using Wireshark, CDRs, etc. Wants to find out from room, if it's worth doing
CLF. Argues that it is. There's a core constituency,
and once it's there others will use it.
Daryl: This is why I would like to see the problem statement
updated, and think about it some more. Hacks or not, there are operational
practices in use today. Would like to see them change, but the solution needs
to be compelling enough to change operational behavior. Hopes updated problem
statement will say this in a compelling way.
Dave: Group has not done sufficient due diligence on
existing possibilities. Problem statement has not sufficiently expressed why
existing solutions won't work.
Vijay: To me, HTTP CLF is compelling. It's widely used. If
we justify that CLF is better than CDR, we tell people CLF can replace CDR.
Draft should clearly state we are not replacing CDR.
Spencer: Maybe we are saying to use CLF instead of CDR for
_some_ things you use CDR for.
Daryl: Stop using a flat-head screwdriver to drive Phillips screws. Hopes this effort
will provide the right tool for what people are trying to do with the data, e.g. troubleshooting, not billing. Draft says it is not to
replace CDR, but doesn't say why you would want to use it instead of CDRs.
Daryl: Let’s agree to some text for the next revision.
Daryl: Needs to see the uses more clearly articulated. Wants
to use draft to explain to people what they need. Not clear from current draft
whether Daryl and Vijay's heads are in the same place.
*** Spencer puts (tries to put) charter on screen. Current
schedule does not allow us to just work on problem statement.
*** Spencer: Who plans to work on log format specification?
Only Vijay raised hand.
Eric: If no one else wants to work on it, it can be an
individual submission.
Daryl: Would love to work on it, but it's not a priority for
his day job. Still cares about how it progresses, and could still review, etc.
Spencer: Reviewing is working. In a perfect
world, we'd have a design team. Is someone other
than Vijay on the design team?
(Hands: Daryl, Eric).
Daryl: Can work on problem statement, but does not have
expertise to work on solution.
Adam: There were some hands in
Spencer: Then we should ask on mail list, to flush out the
previous suspects.
Vijay: Are we talking about refining the problem statement?
Spencer: No. We're talking about solution spec.
***Action Item:
Spencer to send out note to list to ask who would like to join Vijay,
Eric, and Daryl to work on this.
Spencer: Is going down another road now – not using IPFIX - going
to prevent us from using IPFIX in the future?
Adam: Make sure we don't try to force our requirements to
fit IPFIX.
Spencer: Talking about solution, not requirements. In order
to finish based on our chartered milestones, we have to finish the solution
specification before the next meeting.
Adam: Need to find someone who understands IPFIX quickly.
Dave: SYSLOG work group has _never_ met, but has gone on for
years.
Spencer: Recap: revving problem
statement quickly. Moving ahead on format spec itself.
Putting together a team to work together on that. Ask
on IPFIX mailing list to see if anyone there is
interested in joining this effort.
Vijay: We should get an IPFIX expert that we can educate in
SIP, and who can then tell us how to use IPFIX.
Dave: When SNMP was initially adopted, we expected an SNMP
person to learn technology to be managed. This did not scale, so we expected
the managed technology people to learn a little SNMP. It's unrealistic to
expect an IPFIX person to learn SIP. Far fewer IPFIX RFCs than SIP RFCs.
Vijay: Need to address IPFIX in
due diligence. Vijay has time to work with an IPFIX expert, but not to come up
with an IPFIX binding on his own. (Seems to be an implication of multiple
bindings?)
Spencer: If we are going to meet our milestones, we will
need to come up with an ASCII or indexed-ASCII binding. Decisions will happen
on mail lists--please make sure conversations happen there.
Adjourn for cookies.