CLUE, 80th IETF, Prague, Czech Republic

Date:                  Wednesday, March 30, 2011
Location:              Prague, Czech Republic
Chairs:                Mary Barnes, Paul Kyzivat
Note Takers:           Stephan Wenger, Christer Holmberg
Minutes Editor:        Paul Kyzivat
Jabber Scribe:         Dan Burnett
Recorded playback (with Jabber transcript):

 http://meetecho-ietf.comics.unina.it/recordings/download/CLUE.jar

Agenda bash, Status and items of interest

Presenter:     Mary Barnes
Slides:        http://www.ietf.org/proceedings/80/slides/clue-0.ppt

Agenda bash: No discussion, ok

Status: First WG session, nothing else to report

Use Cases

Presenter:     Peter Musgrave
Slides:        http://www.ietf.org/proceedings/80/slides/clue-1.ppt
Document:      draft-romanow-clue-telepresence-use-cases

Summary of action items:

1)       Terminology for describing use cases needs to be clearly specified – i.e., left, right, etc.

2)       Need to more clearly define the difference between a heterogeneous use case and an asymmetric use case, noting that each type of use case could also involve a legacy device that might not understand the new features, protocol elements, etc. defined in CLUE. (Action: Peter)

3)       It was suggested to add additional variations in the use cases from the user’s perspective (Action: Mark Duckworth)

4)       Provide additional use cases as the current ones only represent currently deployed functionality. (Action: Stephan Wenger to provide input)

5)       Audio should be included in the cases (Action: Peter)

Conclusion: 

Unanimous support for the document as a WG document.  Editor to submit the next version of the document as a WG -00 document.

Detailed Discussions:

Slide deck presentation

(slide 3x3 example)

Christer Holmberg: need indication/specification of direction of viewpoint for cameras (agreed by speaker)

Slide before Multi-party TP use cases, first slide:

Christer Holmberg: this is two-party point-to-point, without an MCU; the draft needs to be clear on that.

Sohel Khan: Multi-point use case: what does 4/3 mean: four cameras, three screens, no mixing in the MCU, or both?

Roni: The use cases describe common scenarios; the doc can generally be activity based, and you can decide whether you switch or mix ???

Slide “additional use cases”:

Christer: “traditional devices” is a special form of asymmetric use case.  What is the difference?

???: Difference is: whatever is done in CLUE is not supported by traditional devices, to be clarified in the draft

Roni: there is no definition of the semantics of m-lines today; that is what we are going to do.

Christer: fair enough
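Roni's point about m-line semantics can be illustrated with a minimal, hypothetical SDP offer (the host name, addresses, ports, and payload types below are invented for illustration): an offer can carry one m-line per camera, but baseline SDP has no standard way to express that one video stream is the left capture and the other the right; that is the kind of semantics being discussed.

```text
v=0
o=alice 2890844526 2890844526 IN IP4 host.example.com
s=Telepresence call
c=IN IP4 198.51.100.1
t=0 0
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVP 96
a=rtpmap:96 H264/90000
m=video 51374 RTP/AVP 96
a=rtpmap:96 H264/90000
```

Nothing in this offer relates the two video m-lines to camera positions, so a receiver cannot tell which stream belongs on which screen.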

Allyn: three things being discussed: 1) “asymmetric”: different numbers of cameras/screens.  2) Heterogeneous: means two things, referring to devices not all of the same type (immersive telepresence vs. single-person VC … includes phone, software, …), and 3) legacy: a device supports/does not support CLUE.  Permutations are possible.

Mark Duckworth: Small number of broad use cases.  Need to flesh out more details.   Speaker: give us those. 

Mark: Some are mentioned in the doc already.  Education case.

Allyn: two different types of use cases.  The use case draft addresses use cases from the POV of the user, not implementation or anything else.  Was Mark talking about implementation stuff?  That should not be in the same draft.

Mark: agreed.  But even from user’s viewpoint, there are more.

Chair: Mark to provide those.

???: Up to the group

Stephan: need more use cases, not only from market leaders. 

Chair: propose. 

Stephan: will do

Magnus: ??? (missed - audio stuff?)

Speaker: agreed.

Sohel Khan: MCU related.  Do we do floor controllish stuff?

Speaker: not in our charter.

Roni: XCON is doing this stuff

General:

Chair: How many people read the draft?  How many are willing to contribute?
About 20 for each.

Chair: Hmm - who supports making this a WG document?

Requirements

Presenter:     Allyn Romanow
Slides:        http://www.ietf.org/proceedings/80/slides/clue-2.pdf
Document:      draft-romanow-clue-telepresence-requirements

Summary of action items:

1.       Ensure consistency in terminology (within this document and between other documents):

a.       Clarify session vs. endpoint vs. room.

b.       Audio should be described in terms of “rendering” rather than “layout”.

c.       Define “media stream”.

d.       Define “device”.

2.       General:

a.       Split requirements into senders, receivers, and middleboxes (intermediaries), and add a definition for the latter.

b.       Remove notes in the requirements.

c.       Requirements should be written from the perspective of “including a mechanism to support…”.

3.       Remove ASMP-2 and ASMP-7.

4.       REQMT-3: The receiver could also be an endpoint or a middlebox.

5.       REQMT-4: Clarify what is meant by synchronization. Also use the term “media” as opposed to audio, video, etc.

6.       REQMT-5: It was suggested that the maximum number of devices not be restricted by CLUE.

7.       REQMT-6: Negotiation needs to be clearly described.

8.       REQMT-7: Remove the example.

9.       REQMT-8: Make it specific, i.e., simulcast, SVC, etc.

10.     REQMT-10: Split into multiple requirements.

Conclusion: 

Document needs to be updated before being considered as a WG document.

Detailed Discussions:

Assumptions:

  

ASMP-1:

Christer: what do you mean by “disjoint”?  What do you mean by “source description” in ASMP-2?  Needs to be clarified.

???: Disjoint: not in the same stream.

Michael Richardson: this is a requirement for an end system; do you want to write requirements for the protocol?

Alynn: ???

Stephan: precision needed for stuff like Stream

Chair: propose ???

Kashnawish: what is a virtual endpoint.  Embedded agent in device, ???

Allyn: ???

Christer: More imprecision: “session is the same as RTP session”. 

Allyn: agreed

???: What does endpoint mean?  Is a room an endpoint, or is one camera an endpoint.

Allyn: endpoint is defined.

Steve Botzko: An endpoint is an end system/room that sources and sinks media streams (plural).  An endpoint is a room.

Jim Cole: how do you distinguish endpoint/room from user agent? 

Allyn: no SIP terminology. 

ASMP-2:

Jim Cole: What is the motivation for ASMP-2?

Allyn: Endpoint should be “receiving endpoint”.

Jonathan Lennox: there is no requirement that endpoints render in a certain way, but it’s a recommendation

Steve Botzko: we are enabling endpoints to take hints; if there is no info that enables it, we are not doing our job.

Steve Botzko: something about gaze/spatial relationship with the source, even for a single screen.

Allyn: is this assumption?

Magnus: Jonathan's assumption is good.  In general, most devices ???

Stephan: no standards policy: cannot enforce left/right implementation in endpoint

Christer: ASMP-2 is not an assumption, it is a requirement for certain use cases.

Allyn: assumption is “you want to do your best”

Magnus: let’s call this requirement

Roni: ???

Allyn: ASMP-2 is gone

ASMP-3:

???: ‘systems’ is not clear. Does it mean endpoints, or does it also include MCUs?

Stephan: think about “either”

Christer: Layout definition.  Layout should cover audio.  How is a stream rendered on a display?  Have to worry about multiple displays.

Allyn: definition is fine

Steve Botzko: different definition for audio sound field

Jonathan: “rendering” is a generic term.

Botzko: “rendering” is not good in this context

John Elwell: what does “remotely” mean here

Allyn: ??? 

John: The “system” needs to refer to anything in the chain.

Mark: Example for how layout can be done remotely.

ASMP-4:

Mark: assumptions like this help us get to the requirements.

Jonathan: useful to define the problem space more formally(?) than the use case doc does

Stephan: use “all” instead of both

ASMP-6:

???: lots of subtext here.  Not overly concise.

Christer: this verbose stuff should go into the intro

Allyn: agreed.

Roni: doc structure.  Do we keep this as a single doc for problem statement, assumptions, and requirements?

Chair: yes, one doc.

Jonathan: call the doc “problem statement and requirements”.

Chair: we could, but requirements would be fine as well

ASMP-7:

Stephan/Allyn: discussion about mandatory; mandatory probably has to go.

Magnus: it’s necessary to separate sender/receiver/middlebox

James Polk: ???

Chair: Decision: ASMP-7 goes

Jonathan: there is some use of putting this stuff somewhere.

ASMP-8:

???: Remove this, as it is part of the charter

Requirements:

 

REQMT-1

???: The requirement shall be split into separate requirements for stream relationship and sender/receiver capabilities

???: The note shall either be removed, or moved into a separate requirement

Cullen: wants clarification on whether spatial relationships are about left/center/right, or more detail about physical size/placement that allows life-size rendering.

Christer: asks that capabilities and relationships be split into two requirements. He also wants “middlebox” replaced with something like MCU – more specific to CLUE.

Allyn: wants to take the note out of this. Others think it should perhaps be converted into separate requirement(s) on an MCU. Jonathan L said more about MCU requirements, and was volunteered to contribute text.

Keith: general comments: The solution must support?

???: Yes, it’s a MUST.  All have the same strength.

Cullen: what is a spatial relationship? Just screen positions or full 3-D?  Start adding this level of detail in the requirements.

Kashnavish: what about audio stuff in second bullet

Chair: audio needs to be included

Jonathan: definition section: middlebox, MCU.  The two bullets are at least two requirements, perhaps more. (editorial)

Christer: Capabilities and stream relationships are different things; split into two requirements.

Christer: Regarding the note about the middlebox: if you want to keep it, move it into a separate requirement and describe what a middlebox is – it's probably an MCU, right?

Sohel Khan: take the note out?

Chair: take note out, move useful stuff to definitions.

Christer: requirements for MCUs should be written as requirements for MCU

James Polk: Is the second bullet really greenfield?  Maximum resolution/frame rate?

???: Will be coming into protocol

Jonathan Lennox: need section on middlebox requirements

REQMT-2

Keith: the first and second bullets miss the requirement.  SIP/SDP already do what's in the first bullet.  What needs to be said is telepresence characteristics.

Christer: do changes require re-negotiation?

Allyn: that's another requirement.  This one is about sending characteristics.

Steve Botzko: dynamic behavior includes spatial behavior during conference?  If yes, add bullet for it.  (Layout feels more like rendering)

Allyn: ok

Magnus: some of this requires renegotiation, others can just happen. 

Peter Musgrave: would it be more precise to say that when renegotiation happens, we need to restate the semantics of the CLUE information?

Christer: renegotiation is the wrong word.  “Signaling”?    Perhaps there is stuff where floor control is appropriate. 

REQMT-3:

Christer: wants clear distinction between requirements that the protocol must support, what implementations must support, and what implementations must do.

John: disagrees – he thinks the requirements here apply to the mechanism document, and that doc will spec what to implement/use.

Christer: (editorial): receiver can be either endpoint or middlebox

Chair: fix in definition

Jonathan: ???

Allyn: ???

Jonathan: use “virtual stream”?  

???

Keith: which streams to receive?  Do you suggest modifying SDP?

Chair: We don’t know yet

Magnus: describes a use case that cannot be done with SDP.

Christer: there are two requirements here.  The first sentence is one requirement, and the rest is the other.

Allyn: yes, second part is covered.

REQMT-4:

???: Sync is about lip sync or something else? 

Allyn: Something else

Steve Botzko: “synchronize” is wrong word, take it to the list.

Allyn: lip sync it is

Steve Botzko: in this case everything needs to be synchronized

Khan: how about text (call it media)

Khan: ???

Tom Wang: ???

Allyn: something has to be considered as part of video

Christer: ???

Jonathan: ???

Steve Botzko: Distinguishing text messages from closed captioning.  General text message (timed text) is in charter, other text may be not.

???

Christer:  there is “what you must support” and “what you must do”. Synchronization has to be ???

Roni: does not like “media stream”; it needs to be defined

Stephan: define media stream as “audio, video, timed text” and restrict REQMT-4 to a single endpoint.

???

John Elwell: per Christer's point (must support vs. must use), all of these should be rephrased as “include a mechanism to support…”.

Steve Botzko: agreed; it's what the protocol needs to support, not what systems need to support.  Also, the RTP stream is out of scope; that would be far-end camera control.

REQMT-5:

Keith: “arbitrary” is undefined

Stephan, Jonathan: “left/center/right” is too narrow, but arbitrary is too fine. 

Christer: number of media capture devices may be different. 

???

Steve: in addition to arbitrary, we need to add “asymmetric”

Mark: number of capture devices in conference.  Should there be a separate requirement?

Christer: ???  no number here

Mark: required for the protocol.  The protocol needs to support so many endpoints/streams, or whatnot; no, that's not good.

Chair: are 32 bits good enough?

Brian Rosen: the wording makes the unfortunate association that the number of capture devices equals the number of streams. There is no one-to-one requirement.

???: Should consider the number of ???

Jonathan: 2000 x 500 endpoints; the protocol has to support that much.  Scaling to a large number of ???  No practical constraints on the number.

???: The "arbitrary" terminology is unclear, and the WG should not try to come up with maximum number of capture devices etc.

REQMT-6:

Christer: what are negotiation and re-negotiation: media and/or session parameters?

Peter, Stephan: what is negotiated here?

Allyn to clarify.

REQMT-7

Andrew Allen: suggests to replace “heterogeneous” with: “devices with different abilities”

Christer: Only first part of requirement is needed. The rest should be covered in the definitions section.

REQMT-8

James Polk: Wants “default”, MUST support. This needs to be clarified.

Keith: REQMT-8 is a subset of REQMT-7.

Roni: requirement came from simulcast and/or SVC

Paul Cliverdale: audio quality would refer to different whatevers?  Nothing ruled out?

Doug: ???

REQMT-9

Allyn: must be rewritten as a requirement.

Charles Eckel: idea is sender provides ???

REQMT-10

Christer: two requirements: one for source selection, the other for layout.  This also applies to audio.

Roni: distinction here p2p vs. multipoint.  Specific view of the room

Christer: for me, source is a source from a specific endpoint, not an endpoint itself.  Need definition for source.

REQMT-11

???: not just VAD – “segment” selection.

Steve Botzko: is this about rendering or segment switching? 

Allyn: Segment switching; needs to be adjusted in doc

REQMT-12

Paul Cliverdale: need numerical values.

???: Must not add latency; an increase in SDP size is OK.

Chair: way forward: Requirements doc needs one more pass

It was indicated that some of the assumptions probably belong to the WG charter, or to the problem statement description of the WG. Some of the assumptions will be removed, and the text associated with those assumptions might be moved elsewhere.

There was a general comment that every requirement should be written in a consistent way, indicating what the solution/system needs to support.