IETF CLUE WG interim meeting, Sept. 19-20th, 2012
San Jose, CA
Hosted by Cisco

Attendees:
----------
Mary Barnes (Polycom) (chair)
David Benham (Cisco)
Stephen Botzko (Polycom)
Bo Burman (Ericsson)
Roni Even (Huawei)
Rob Hansen (Cisco)
Christer Holmberg (Ericsson)
Paul Kyzivat (Huawei) (chair)
Jonathan Lennox (Vidyo)
Andy Pepperell (Silverflare)
Erick Sasaki (NTT) (Wed only)
Stephan Wenger (Vidyo)
Paul Witty (Silverflare)

Webex:
------
Espen Berger (Cisco)
Gonzalo Camarillo (Ericsson) (AD)
Keith Drage (Alcatel Lucent)
Cullen Jennings (Cisco)
Dan Romascanu (Avaya)

====================================================
Notetakers: Mary Barnes, Rob Hansen, Andy Pepperell
====================================================

Recordings:
===========
Sept. 19th
----------
- Streaming: https://ietf.webex.com/ietf/ldr.php?AT=pb&SP=MC&rID=8311957&rKey=01cb3a68a6ab26ec
- Download: https://ietf.webex.com/ietf/lsr.php?AT=dw&SP=MC&rID=8311957&rKey=f34973ea517e24da

Sept. 20th
----------
- Streaming: https://ietf.webex.com/ietf/ldr.php?AT=pb&SP=MC&rID=8329807&rKey=f23021d9f3acea02
- Download: https://ietf.webex.com/ietf/lsr.php?AT=dw&SP=MC&rID=8329807&rKey=9ee5e7c7279affb3

Conclusions:
============
1) Ticket #16. Agreed to use the term "capture encoding". (Action ii)
2) Do we need additional facilities for the advertisement to indicate limitations on simulcast? Yes. (Action v)
3) Do we need a way to reject a Configure message? Conclusion: Yes.
   - Under what circumstances do you reject? Only when it is malformed or the Configure requests something that wasn’t advertised.
   - What does it mean if you reject? The generator of the reject should send a new advertisement.
   - Do we need an explicit Ack? Assume yes for now. (Action vii)
4) How do we signal support for CLUE? Conclusion: Define a feature tag. (Action viii)
5) Ticket #15. What signaling protocol should be used for the CLUE application specific information?
   Conclusion: General agreement to use XML schema to represent the information to be carried in the protocol. (Ticket remains open as we still need to reconsider reusing existing protocols.)
6) Is there an expectation on establishing a CLUE session that you will immediately (after 200 OK) send/get an advertisement? Conclusion: No.
7) Empty Advertisement: Is this allowed and what does it mean? Conclusion: Yes. It means you have nothing you are willing to send. May also still want media in the other direction.
8) Should it be allowed to honor out-of-date advertisements? Conclusion: Yes. Decision of advertiser. Would make sense to honor in particular if you just sent an advertisement that only adds.
   [Note: renumbering a capture or a missing number indicates that you won’t send that capture anymore.]
9) Are you allowed to send “plain” SIP/legacy media before 1st advertisement (i.e., between 1st O/A with m-line under CLUE control and Configure) when you know it’s CLUE? Conclusion: Define empty advertisement and default config (e.g., one capture). Perhaps just describe within use case and/or state model.

New Issues:
===========
a) Need to decide whether advertisement is complete information "all" or just a "delta" [Note: current framework is "all"]
b) Site Switching: there is an issue when you have multiple captures and do site switching. It needs to be consistent. May need real-time updates for spatial information, time synchronization, etc.
   - e.g., RTCP, XCON notifications. (Action vi)
c) What is the advantage of not having a 3rd O/A exchange in the session setup?
d) Must the configure message be consistent with the current SDP? Needs further consideration, e.g., the order in which things are changed. Overarching objective is to minimize what info might be duplicated.
e) Need a mechanism to know which m-lines are under CLUE control within SDP. Also, need to define how this works within O/A.
f) Can CLUE and BFCP control the same m-line? How do CLUE and BFCP interact?
   What resources do we want to control with a floor in CLUE? To the level of capture (or scene)? What are the semantics? Who drives? Configure or Advertisement? A use case would be helpful. (Action xi)
g) Define a (minimal) state machine. (Note this relates to the CLUE instance and is coupled with the protocol state machine.)

Actions:
========
i) Chairs: add new issues to tracker
ii) Andy: Propose text/definition for "capture encoding" term (Conclusion #1). Work with Mark to get framework updated.
iii) Chairs: close ticket #16.
iv) Christian: Based on concerns raised in draft-groves-clue-scene-clarifications, point out places in the document that are unclear or propose additional text in specific sections to clarify his concerns.
v) Andy: Need to define the mechanism to indicate limitations on simulcast in the advertisement. (Conclusion #2)
vi) Jonathan: Put out a proposal for discussion of Site Switching (Issue b)
vii) Andy: Add text to Framework for rejecting Configure (Conclusion #3)
viii) Roni: Feature tag to signal CLUE support - need to fwd our requirements to the SCTP/UDP work to define a CLUE channel (Conclusion #4)
ix) Keith: Propose text as to how handling of out-of-date advertisements will work. Describe how this optimizes glare as opposed to current model. (Conclusion #8)
x) Christer: Propose a mechanism (e.g., grouping, labels or whatever) to know which m-lines are under CLUE control within SDP. (Issue e). Also, need to define how this works within O/A.
xi) Keith: Propose a use case with CLUE and BFCP controlling the same m-line. (Issue f).

Way Forward:
============
Documents for the following (revised or new):
- Updates to framework for “capture encoding” terminology. (Andy)
- Draft-romanow-clue-call-flows - develop signaling model further (Rob, Simon/Roberta)
- Telemedical call flow doc - updated to include basic case and more detail added. (Paul)
- Draft-even-clue-rtp-mapping - updates based on discussions including “capture encoding” terminology.
  (Roni, Jonathan)
- Working document for CLUE instance and State Machine. (Simon, Roberta)
- Signaling proposal based on a strawman for schema. (Simon, Roberta)

==============
Detailed notes
==============

Mary's notes (Sept. 19):
========================

Data model
----------
- consumer instance would be different than provider instance
- may advertise "all" (complete) information or only "delta"
- endpoint needs to keep track of config
- Stream instantiation is based on advertise, config, SDP info, etc.
- Consumer role is independent of Provider role and not coupled like SDP O/A
- (Roni) only info that is CLUE specific is spatial information
- CONFIGURE defines upper bound, but SDP limit may override
- Cullen: there will be problems if you have different limits

Ticket #16:
-----------
- Conclusion: agreed on the term "capture encoding"
Action: Andy. Propose text/definition on mailing list. Work with Mark to get framework updated. Chairs, close ticket #16.

Going off topic:
----------------
- Issue: Can the FW explicitly restrict the use of simulcast?
- Jonathan: Do we need a model (richer language) for advertisements to define what is valid

CLUE scene clarifications (draft-groves-clue-scene-clarifications):
----------------------------------------------------------------
- discussing optimal versus alternative experiences
- issue with how to select if screen limited
-- Are alternatives only in capture scene?
-- composite versus separate scenes
- Site switching (and a reference to the draft in Stockholm). There is an issue when you have multiple capture scenes and you do site switching. You don't want to send everything. May need to do real-time updates in RTCP (or XCON pkg). Objective is consistency (e.g., someone's arm is continuous frames when it moves beyond its own area of capture)

Reviewed each of Christian's suggestions. a), b) and d) are already part of the framework. The concept of a "capture group" c) was not agreed. As for d) it is not a MUST (it is a should).
Action for Christian to point out places in the document that are unclear or propose additional text in specific sections to clarify his concerns.

RTP
----
- Don't want every capture in SDP. Provider needs to know what consumer can receive. Don't have a way for the provider to know consumer capabilities.

Roni discussed the various drafts. With regards to mapping option documents, Christer noted that draft-westerlund-avtcore-max-ssrc will be submitted to MMUSIC. Roni clarified that these documents are there so they need to be considered (not necessarily endorsing).
On the SSRC document (draft-westerlund-avtcore-max-ssrc), Jonathan L asked whether we negotiate the use of SSRC multiplex. Do you already need to say what m-line you need? Answers: Yes.
Paul W: should be able to do CLUE w/o multiplex - just need multiple m-lines for each encoding group. Can use labels ala BFCP.
[Note: Andy's notes are quite thorough here, so I'm not going to type mine out]

Call Flows (Rob)
----------------
Must advertisement messages be within what the recipient has negotiated in SDP?
Must configure messages be within what the sender has negotiated in SDP?
- No.
If yes, what to do about middle boxes changing SDP parameters?
If no, what happens when someone asks for more than can be sent to them?
- Sender always needs to know BW. Receiver does as well due to configure.
What happens when a new advertisement message is sent mid-call?
Can this invalidate the previous configure?
Must it invalidate the previous configure?
Should configure messages referring to previous advertisements be rejected?
- Yes - there should be a way to reject a configure message.

Telemedical Use Case (Paul)
---------------------------
- configure tied to SDP (e.g., m-line)
- An O/A should be consistent with the most recent advertisement in each direction. New O/A can come before the configure
Christer: must send Advertisement before O/A
Roni: advertise is peslgptng. Config is different.
Should the O/A have info/linkage with the advertisement?
Do we need a way to reject a config? Want an explicit rejection
Slide 5 (revised): Do you have to send a config? Before config s/b 0 captures.
- Must interwork with legacy.
- Need to consider pre-conditions
Conclusion: need a way to signal CLUE and not assume based on m-lines.
What is the advantage of not having a 3rd O/A exchange in the session setup?
Must the configuration be consistent with the current SDP?

Signaling (Rob)
---------------
- Consensus: Agree to use SCTP/UDP/DTLS as the transport. Noted that we could share the channel with RTCWEB.
- Action: take to list for confirmation (and close Ticket #14)
- Action: include motivation for this approach in the documents
What CLUE-specific information should be in an initial SDP?
What additional CLUE-specific information should be in subsequent SDPs?
Must there always be a second round of SDP offer/answer? Conclusion: No.

Andy's notes (Sept. 19th):
===========================
Day 1 (Wednesday)

Topic: Data model.
Objective is to try to agree on the data necessary to be exchanged; CLUE instance model vs messaging model.
Christer: need to understand what is meant by CLUE instance - shouldn't think of what we should transport where, but what data is needed.
Jonathan: provider needs to know what it has sent to the far end
General agreement that we shouldn't define actual implementations in too much detail
Roni: do we assume that an advertisement is valid until updated by the provider? 2 parts to the state - what's advertised and what's chosen. Is a 2nd advertisement a complete replacement, or does it add / remove sections? What does a new advertisement mean - and if it's a delta model then we need to be able to *remove* as well as add.
PaulK: whether full state or difference is up for discussion
General agreement that this isn't something we need to decide on right now - whatever we decide needs to be in the data model we can come up with a "deltas" method for

Topic: Ticket #16
Mary: We had an e-mail thread about what to call "capture encoding"s - last contribution was from Espen, and there's some discussion about perhaps using "srcname" for this.
Bo: srcname is hierarchical, so all captures generated from the same device (e.g. a camera) would have most of their srcname in common - e.g. "..///".
Jonathan: is it a problem with the model that simulcast is effectively always enabled ("3 out of 9" example - is it possible right now as the framework document stands for a provider to be able to say that it can provide any 3 captures out of 9 available but no single capture more than once). This is believed not to be possible right now - any encoding group allowing multiple encodings effectively signals that simulcast is possible for any captures associated with that encoding group.
Andy / PaulW: perhaps we should add a restriction on the number of simulcasts of each capture that's possible
General agreement that "capture encoding" is the best term we've come up with for ticket #16
Andy action item: Need to make sure "capture encoding" is defined and agreed on mailing list and then added to framework - stated intent to consult with Mark Duckworth on this
Roni: think we lack the ability to express whether different capture scenes can be encoded simultaneously or not.
Andy: this should be possible today, given that encoding groups are outside of capture scenes. Need to confirm that this is the current stated definition / structure of framework / data model.
PaulK: if you receive an advertisement with multiple scenes, you should get an entry from each scene, e.g.
a video and audio capture scene entry for each capture scene, where there might be one capture scene for "main" media and one for "presentation"
The issue was raised that perhaps we need the concept of capture scene alternatives, for instance a capture scene being able to be defined as an alternative to one or more other capture scenes - this would allow, say, a provider to advertise a pre-composed combined main / presentation video stream. At this point, introducing prioritisation of capture scenes was also proposed.

Topic: "Discussion of scene and capture scene entry concepts draft-groves-clue-scene-clarifications-00"
This discussion focused on points "a)" - "d)" in the "2. Proposals" section of the document.
a) and b) already a given - we believe the framework already says this (modulo "capture devices" / "spatially related" tweaks)
c) "capture group" - not seen as a solution to this issue - no general support for adding the concept of capture groups
Stephen: do need to solve the problem with real-time co-ordinate updates
d) no changes should result from this - perhaps a recommendation that automatically chosen captures should always be complete capture scene entries. Christian to clarify the meaning of this, or which pieces of text are wrong
My takeaway from this session was that we really need clarification from Christian on whether points a) - d) were all intended to describe modifications to the existing (framework) draft, and if so what deficiencies they were intended to address.

Afternoon session
RTP - Roni and Jonathan
Introduction slide - trying to avoid duplication between SDP and CLUE.
Assumption - CLUE systems support different topologies: Point to point, Media mixers, Media switching mixers, Source projection mixers
Two SSRC behaviors: static SSRCs (assigned by MCU mixer), dynamic SSRCs (original sources' SSRCs relayed to participants, e.g. via CSRCs)
Most of the work so far has assumed session multiplexing rather than SSRC multiplexing.
Several relevant drafts for this sort of thing: source attribute (RFC 5576) to describe attributes of RTP sources based on their SSRCs, RFC 6236 for generic image attributes, draft-westerlund-avtcore-max-ssrc for multiple SSRCs within an RTP session, draft-westerlund-avtcore-rtp-simulcast
Cullen: number of SSRCs is different to the number of streams - so not necessarily as relevant for CLUE.
Cullen: seems to be a general need for more linkage between SDP and application layers such as XCON, CLUE, webrtc - maybe we want to solve this in a general way
PaulK: support for SSRC multiplexing so far has been geared towards decision being made by sender (at the static vs dynamic level, anyway).
Do we need to be able to specify an m-line to use when requesting a capture encoding? General feeling is yes - might also need provider to signal which m-lines are valid for each capture (or perhaps this could be done at the encoding group level?)

Mary's notes (Sept. 20th):
===========================

Issue summary from Wednesday
--------------
- Reviewed and updated the list of issues from Wednesday (in the chairs charts)
1. Need to decide whether advertisement is complete information "all" or just a "delta" [Note: current framework is "all"]
2. Can the FW explicitly restrict the use of Simulcast?
3. Site Switching: there is an issue when you have multiple captures and do site switching. It needs to be consistent. May need real-time updates (e.g., RTCP, XCON notifications).
4. Call Flow: Do we need a way to reject a Configure message?
5. Call Flow: How do we signal support for CLUE?
6. Call Flow: What is the advantage of not having a 3rd O/A exchange in the session setup?
7. Call Flow: Must the configuration be consistent with the current SDP?
8. Ticket #12: What CLUE information is carried in SDP?
   - What CLUE-specific information should be in an initial SDP?
   - What additional CLUE-specific information should be in subsequent SDPs?
9. Ticket #13: What CLUE information is carried in CLUE specific signaling?
10. Ticket #15: What signaling protocol should be used for the CLUE information? General agreement to use XML schema for the information.
11. What information is required to be supported (i.e., must specify whether information is optional or mandatory)?

Telemedical Call Flow - simple version (Paul)
------------------------------------
- 1st Invite is before CLUE
- Christer: concerned about 491s
- Paul: could introduce a mechanism if we think there will be a glare problem - can see CLUE glare on configure.
- Rob: whether you need a re-Invite depends upon how much the SDP will change based on the advertisement. Must re-invite if advertisement impacts SDP.
- is there an order in which the messages occur or must the CLUE entities be able to handle them in any order
- Andy: basic audio and basic video is optional. If you do that, need to define when that ceases.
- Andy: do both advertisements need to occur before re-Invite?
- Discussion: NO. They are not synchronized
- Roni: we're assuming a system in which you have a SIP call for multiple cameras, etc. (versus multiple SIP calls)

More complex flow - MCU case (Paul K)
-------------------------
- Discussion of empty configure and empty advertisement.
- Christer and Roni: No.
Issues:
1) Is there an expectation on establishing a CLUE session that you will have an advertisement?
2) Is it possible to send an empty advertisement (or should you)? If so what are the semantics and when is it allowed/valid to do so?
Christer: need semantic for advertisement
Andy: Agree. Maybe we need to consider a media sync. Don't need synchronous advertisements
Rob: empty advertisement means something because you could have been doing something, thus empty advert stops that
Rob: you can still send media w/o CLUE messaging
Roni: concerned about state mismatch
Paul K: Depends upon what state you are in. Startup or default?
Roni: is this mode only if there wasn't a previous advertisement?
Rob: empty advert is "I'm not doing CLUE in this direction".
Roni: if you do that, can only do before a configure
Jonathan: this is like BFCP - if it has token and is controlling m-line. Need to know which m-lines are under CLUE control.
Paul K: Suggest an advert that is default (not empty). Want a non-configure state for before 1st advert
Issue: Do you send media before 1st advertisement when you know it's CLUE?
Issue: Can CLUE and BFCP control the same object/m-line? Note: need to associate a floor with a capture.

Review of issues (as summarized in chair charts):
-------------------------------------------------
- Note that conclusions and actions for the issues are captured in the chair charts. A couple of new issues did arise during this review that need to be added to the list.
- Issue: What happens (i.e., what's the behavior) if a CLUE channel goes down? What does media do? Should you send an advertisement or configure when the CLUE channel comes back up?
Action: We need to feed our Reqs into the SCTP/UDP work in order to define a CLUE channel.

Rob's notes (Sept. 20th):
===========================
9am start, 2012-09-20

Mary presented some initial issues for discussion that came up yesterday:
* Need to decide if information is complete or a delta - framework currently assumes all
* The framework defaults to simulcast being available unless specified otherwise. It's not clear if it's possible to communicate all possible restrictions a sender may have given the current encoding framework.
* Site switching: In the switched case we may want to send real-time updates of the originating spatial information from the originating source.
* Do we need a way to reject a configure message (and what does it mean if we do so)
* How do we signal support for CLUE
* What is the advantage of not having a third O/A exchange at the start of the call
* Must the configuration be consistent with the most recent SDP
* Ticket 12: What CLUE information is carried in SDP
* Ticket 13: What CLUE information is carried in the CLUE-specific signalling
* Ticket 15: What signalling protocol should be used for the CLUE information (general agreement on XML messages in SCTP over UDP)
* What information is mandatory and what is optional

9:20
Paul had updated his call-flow, and stripped it down to now be a point-to-point call. He now went through this. There was concern over the likelihood of glare - alternatives such as mandating an INVITE from one side seemed less good, though. Andy pointed out that this was the first time we'd started talking heavily about new O/As to add m-lines; we had always stated that these were needed but this is the first time we'd taken to discussing it. There was agreement that while in most cases implementations will probably use one m-line per media type we need to provision for the more complicated case.

Paul then went on to the draft itself, which contained the more complicated version of the call-flow. Christer wondered why the MCU was sending an initial empty advertisement rather than holding off until a second party called in.

Issue: On establishing a CLUE channel, should there be an expectation that both sides immediately send an advertisement message?
Issue: Is it possible to construct an 'empty' advertisement that advertises that nothing is available?

There was debate about what the state is when the call starts but before an advertisement/configure is done. Paul K suggested that the default advertisement could match the RTP state (e.g., as it would be if it wasn't there).
The alternative was that the default state should be that nothing can be sent until the far end has sent a configure in response to an advertisement. Rob also asked if m-lines should be explicitly bound (or unbound) to CLUE. There was general agreement that this was probably the case. Paul K asked if it would be necessary to associate an m-line with both CLUE and BFCP. Jonathan pointed out that we would presumably need to associate captures (or something else) in CLUE with the floors.

Issue: What is the 'default' state of the media before an advertisement/configure is sent?
Issue: Do m-lines need to be explicitly bound or not bound to CLUE control?
Issue: How will CLUE and BFCP interact?

Break
Restart 10:47

Andy volunteered to take a look at whether limitations were needed for simulcast and propose a change.

There was discussion of forwarding the original spatial information in the switched conferencing case. While RTCP SDES was seen as a potential place, it was not seen as a particularly good one. The roster list was seen as a better candidate, though Andy pointed out that it could involve a lot of information. Jonathan was volunteered to take charge of this issue and see if he could make a proposal.

There was discussion of whether it was necessary to reject a configure message. There was debate over whether a configure should refer specifically to a previous advertisement and be rejected if it is not the most recent, or whether advertisements should be constructed so that changes would be unambiguous when a configure was received (and configurations that couldn't be fully met would only send the portions that *could* be sent). There was also debate over whether a malformed configure should be rejected; Keith was concerned that adding rejection messages made CLUE into a negotiation protocol, which is SDP's job. Finally, there was debate about potentially unsynchronised state (for instance, because the transport dropped and was reestablished).
Most people felt we would need an explicit ACK/NACK, though this was not unanimous.

There was discussion of feature tags. People generally felt that feature tags would be useful (and would signal that CLUE is *possible*, not that it's being done), while option tags were not appropriate. This meant we wouldn't have 'require' functionality, but no one felt very strongly about that. People also felt it was important that implementations should be able to tell that CLUE was being used from the initial offer/answer (e.g., without it being negotiated in SCTP or something).

There was a significant disagreement over the need for consistency at all times between configuration and current SDP - some people felt that configurations should always be consistent with the current SDP, while others felt that this requirement was unnecessary.

When talking about what information required by CLUE should be in SDP, Rob wondered if SDP was the best place to relate CLUE encoding ID to RTP SSRC, as it adds CLUE-specific information to SDP and couples the two together.

There was general agreement that while there are many decisions to be made, having an XML schema as a strawman starting point would be useful.

Break
Restart 12:50

People explored whether at the start of a CLUE establishment a sender must send an advertisement, and what the default state was before an advertisement was sent.

There was debate about whether implementations should be able to honour out-of-date advertisements. There was discussion of whether it was better to have a sequence number per advertisement (and each advertisement invalidates the last - configurations referring to this are rejected) or whether capture ids should be unique (so a configuration referring to a previous advertisement may still be valid). Keith was tapped to propose text for the latter to show its advantages.

Christer agreed to look at a mechanism for specifying in SDP which m-lines are under CLUE control.
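
The two options debated above for out-of-date advertisements can be contrasted with a small sketch. This is only an illustration under stated assumptions: no CLUE message syntax had been agreed at this meeting, so the class names, field names and function names below are all hypothetical, and the checks cover only the "requests something that wasn't advertised" rejection case, not malformed messages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    # Hypothetical fields: a per-advertisement sequence number and the
    # set of capture ids the provider currently offers.
    seq: int
    capture_ids: frozenset


@dataclass(frozen=True)
class Configure:
    # Hypothetical fields: the advertisement the consumer based its
    # request on, and the capture ids it is asking for.
    advert_seq: int
    capture_ids: frozenset


def accept_with_sequence_numbers(current: Advertisement, cfg: Configure) -> bool:
    """Option 1: each new advertisement invalidates the last one, so a
    configure that refers to an older advertisement is rejected."""
    return (cfg.advert_seq == current.seq
            and cfg.capture_ids <= current.capture_ids)


def accept_with_unique_capture_ids(current: Advertisement, cfg: Configure) -> bool:
    """Option 2: capture ids are never reused, so a configure built from
    an older advertisement is still valid while every id it names is
    still offered."""
    return cfg.capture_ids <= current.capture_ids


# The provider first advertised captures 1 and 2 (seq 1), then sent an
# "adds only" advertisement (seq 2). The consumer's configure, based on
# seq 1, arrives after the second advertisement was sent.
current = Advertisement(seq=2, capture_ids=frozenset({1, 2, 3}))
late_cfg = Configure(advert_seq=1, capture_ids=frozenset({1, 2}))

print(accept_with_sequence_numbers(current, late_cfg))    # False
print(accept_with_unique_capture_ids(current, late_cfg))  # True
```

In this sketch the second option accepts the late configure because the newer advertisement only added a capture, which is the case the conclusions single out as one where honouring an out-of-date advertisement makes sense.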
There was discussion of what the default state is when the CLUE O/A is first established. While the 'empty', no captures available state was the simplest, it was pointed out that in escalating from CLUEless to CLUEful this would cause a glitch that isn't strictly necessary. No one was sure how important a problem this was to work around.

It was acknowledged that more investigation is required for how BFCP and CLUE will interact.

Finally, action items were assigned - see the issue tracker.