idnits 2.17.1 

draft-roach-mmusic-mlines-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The document has examples using IPv4 documentation addresses according
     to RFC6890, but does not use any IPv6 documentation addresses.  Maybe
     there should be IPv6 examples, too?


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (January 31, 2013) is 4096 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-54) exists of
     draft-ietf-mmusic-sdp-bundle-negotiation-01


     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	MMUSIC                                                       A. B. Roach
3	Internet-Draft                                                   Mozilla
4	Intended status: Informational                          January 31, 2013
5	Expires: August 4, 2013

7	       Thoughts on syntax for representing multiple media streams
8	                      draft-roach-mmusic-mlines-00

10	Abstract

12	   This document briefly explores the ramifications of combining
13	   multiple media streams into one SDP m= section versus expressing each
14	   in its own m= section.

16	Status of this Memo

18	   This Internet-Draft is submitted in full conformance with the
19	   provisions of BCP 78 and BCP 79.

21	   Internet-Drafts are working documents of the Internet Engineering
22	   Task Force (IETF).  Note that other groups may also distribute
23	   working documents as Internet-Drafts.  The list of current Internet-
24	   Drafts is at http://datatracker.ietf.org/drafts/current/.

26	   Internet-Drafts are draft documents valid for a maximum of six months
27	   and may be updated, replaced, or obsoleted by other documents at any
28	   time.  It is inappropriate to use Internet-Drafts as reference
29	   material or to cite them other than as "work in progress."

31	   This Internet-Draft will expire on August 4, 2013.

33	Copyright Notice

35	   Copyright (c) 2013 IETF Trust and the persons identified as the
36	   document authors.  All rights reserved.

38	   This document is subject to BCP 78 and the IETF Trust's Legal
39	   Provisions Relating to IETF Documents
40	   (http://trustee.ietf.org/license-info) in effect on the date of
41	   publication of this document.  Please review these documents
42	   carefully, as they describe your rights and restrictions with respect
43	   to this document.  Code Components extracted from this document must
44	   include Simplified BSD License text as described in Section 4.e of
45	   the Trust Legal Provisions and are provided without warranty as
46	   described in the Simplified BSD License.

48	1.  Introduction

50	   As part of the ongoing RTCWEB and CLUE work, it has become clear that
51	   the current mechanisms in SDP are insufficient for describing complex
52	   sessions with multiple streams.  Two competing schools of thought
53	   have emerged.  One holds that the m= lines should apply to RTP
54	   sessions, regardless of how many media streams they contain.  Another
55	   holds that m= lines should apply to media streams exclusively, and
56	   that an additional mechanism should be applied to combine multiple
57	   streams into a single RTP session, if necessary.

59	2.  Alternatives

61	2.1.  Alternative 1: Multiple streams per m= section

63	   One approach to specifying multiple streams in a single RTP session
64	   is to put information for several streams into a single m= section
65	   and, by doing so, implicitly combine them into a single session.

67	   To maintain some level of backwards compatibility with SDP, this
68	   approach might choose to have one m= section for audio and a second
69	   for video (with additional m= sections for other media types if they
70	   are used in the future), combining those sections with a=group:BUNDLE
71	   [I-D.ietf-mmusic-sdp-bundle-negotiation]; we will call this
72	   "Alternative 1a".  An alternate approach would be the definition of a
73	   new media type which effectively allows transmission of any kind of
74	   media, thereby avoiding the need to bundle multiple sections together
75	   at all.  A syntax for such an approach is proposed by
76	   [I-D.holmberg-mmusic-sdp-mmt-negotiation].  We will call this
77	   "Alternative 1b".

79	   In both of the cases described above, certain SDP attributes might be
80	   targeted at only one of the streams in an RTP session.  These
81	   attributes can be matched up with individual streams using the
82	   "a=ssrc" extension defined in [RFC5576].
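
   As a rough illustration of that matching, the following Python sketch
   collects the source-level attributes carried on "a=ssrc" lines, keyed
   by SSRC.  It is not a full SDP parser: the function name
   ssrc_attributes is hypothetical, and the sketch assumes the
   "a=ssrc:<ssrc-id> <attribute>:<value>" form of [RFC5576] with each
   attribute on a single, unwrapped line.

```python
from collections import defaultdict


def ssrc_attributes(sdp: str) -> dict:
    """Map each SSRC to the attributes attached to it with a=ssrc lines."""
    result = defaultdict(dict)
    for raw in sdp.splitlines():
        line = raw.strip()
        if not line.startswith("a=ssrc:"):
            continue
        # Assumed form: a=ssrc:<ssrc-id> <attribute>[:<value>]
        ssrc_id, _, attribute = line[len("a=ssrc:"):].partition(" ")
        name, _, value = attribute.partition(":")
        result[int(ssrc_id)][name] = value
    return dict(result)


# Abbreviated from the Alternative 1a style of description.
EXAMPLE = """\
m=audio 10000 RTP/AVP 0 8 97
a=ssrc:11111 label:speaker-audio
a=ssrc:22222 label:floor-mic
m=video 10000 RTP/AVP 31 32
a=ssrc:33333 label:speaker-video
a=ssrc:44444 label:slides
"""

for ssrc, attrs in ssrc_attributes(EXAMPLE).items():
    print(ssrc, attrs)
```

   Note that the sketch deliberately ignores which m= section an
   "a=ssrc" line appears in; under Alternatives 1a and 1b the SSRC, not
   the section, is what identifies the stream.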

84	   For "Alternative 1a", we have the additional challenge of specifying
85	   attributes that apply to the entire RTP session, such as a=rtcp-fb
86	   and ICE candidate parameters.  One approach would be inclusion of
87	   such parameters only in the first m= section within a bundle, with
88	   the implication that they apply to the entire session.

90	2.1.1.  Alternative 1a: One section per RTP session per type

92	   v=0
93	   o=- 2890844526 2890844526 IN IP4 host.example.com
94	   s=
95	   c=IN IP4 host.example.com
96	   t=0 0
97	   a=group:BUNDLE c1 c2
98	   m=audio 10000 RTP/AVP 0 8 97
99	   a=mid:c1
100	   a=candidate:0 1 UDP 2113601791 192.0.2.240 51091 typ host
101	   a=candidate:1 1 UDP 1694194431 198.51.100.32 51091 typ srflx raddr
102	      192.0.2.240 rport 51091
103	   a=rtpmap:0 PCMU/8000
104	   a=rtpmap:8 PCMA/8000
105	   a=rtpmap:97 iLBC/8000
106	   a=ssrc:11111 label:speaker-audio
107	   a=ssrc:22222 label:floor-mic
108	   m=video 10000 RTP/AVP 31 32
109	   a=mid:c2
110	   a=rtpmap:31 H261/90000
111	   a=rtpmap:32 MPV/90000
112	   a=ssrc:33333 label:speaker-video
113	   a=ssrc:44444 label:slides

115	2.1.2.  Alternative 1b: One section per RTP session

117	   v=0
118	   o=- 2890844526 2890844526 IN IP4 host.example.com
119	   s=
120	   c=IN IP4 host.example.com
121	   t=0 0
122	   a=group:MMT foo bar zoe
123	   m=anymedia 10000 RTP/AVP 0 8 97 31 32
124	   a=candidate:0 1 UDP 2113601791 192.0.2.240 51091 typ host
125	   a=candidate:1 1 UDP 1694194431 198.51.100.32 51091 typ srflx raddr
126	      192.0.2.240 rport 51091
127	   a=rtpmap:0 PCMU/8000
128	   a=rtpmap:8 PCMA/8000
129	   a=rtpmap:97 iLBC/8000
130	   a=rtpmap:31 H261/90000
131	   a=rtpmap:32 MPV/90000
132	   a=mmtype:0 audio
133	   a=mmtype:8 audio
134	   a=mmtype:97 audio
135	   a=mmtype:31 video
136	   a=mmtype:32 video
137	   a=ssrc:11111 label:speaker-audio
138	   a=ssrc:22222 label:floor-mic
139	   a=ssrc:33333 label:speaker-video
140	   a=ssrc:44444 label:slides

142	2.2.  Alternative 2: Single stream per m= section

144	   An alternate proposal is constraining one m= section to talk about a
145	   single media stream.  Like alternative 1a, above, the BUNDLE
146	   extension is used to combine several m= sections into a single RTP
147	   session.  Any attributes that are applicable to a single media stream
148	   can be correlated by putting them in the corresponding m= section.
149	   Any attributes that apply to the transport parameters (e.g.,
150	   rtcp-fb, ICE parameters) are conveyed in the first m= section within
151	   the bundle (alternate schemes are possible, but this seems the
152	   simplest and most straightforward).

154	   v=0
155	   o=- 2890844526 2890844526 IN IP4 host.example.com
156	   s=
157	   c=IN IP4 host.example.com
158	   t=0 0
159	   a=group:BUNDLE c1 c2 c3 c4
160	   m=audio 10000 RTP/AVP 0 8 97
161	   a=mid:c1
162	   a=label:speaker-audio
163	   a=rtpmap:0 PCMU/8000
164	   a=rtpmap:8 PCMA/8000
165	   a=rtpmap:97 iLBC/8000
166	   a=candidate:0 1 UDP 2113601791 192.0.2.240 51091 typ host
167	   a=candidate:1 1 UDP 1694194431 198.51.100.32 51091 typ srflx raddr
168	      192.0.2.240 rport 51091
169	   m=audio 10000 RTP/AVP 0 8 97
170	   a=mid:c2
171	   a=label:floor-mic
172	   a=rtpmap:0 PCMU/8000
173	   a=rtpmap:8 PCMA/8000
174	   a=rtpmap:97 iLBC/8000
175	   m=video 10000 RTP/AVP 31 32
176	   a=mid:c3
177	   a=label:speaker-video
178	   a=rtpmap:31 H261/90000
179	   a=rtpmap:32 MPV/90000
180	   m=video 10000 RTP/AVP 31 32
181	   a=mid:c4
182	   a=label:slides
183	   a=rtpmap:31 H261/90000
184	   a=rtpmap:32 MPV/90000
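
   The "first m= section" convention used in this example can be
   sketched in Python as follows.  This is a minimal sketch under stated
   assumptions, not a definitive implementation: the helper names
   split_sections and bundle_candidates are hypothetical, "first" is
   taken to mean the first mid listed on the a=group:BUNDLE line, and
   candidate lines are assumed to be unwrapped.

```python
def split_sections(sdp):
    """Return (session_level_lines, list_of_m_sections)."""
    session, sections = [], []
    for raw in sdp.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("m="):
            sections.append([line])      # a new m= section begins
        elif sections:
            sections[-1].append(line)    # belongs to the current section
        else:
            session.append(line)         # before any m= line: session level
    return session, sections


def bundle_candidates(sdp):
    """Map every mid in a BUNDLE group to the ICE candidates it shares."""
    session, sections = split_sections(sdp)
    by_mid = {}
    for section in sections:
        for line in section:
            if line.startswith("a=mid:"):
                by_mid[line[len("a=mid:"):]] = section
    shared = {}
    for line in session:
        if not line.startswith("a=group:BUNDLE "):
            continue
        mids = line.split()[1:]
        # Assumption: transport attributes live in the section of the
        # first mid named in the group line.
        first = by_mid[mids[0]]
        candidates = [l for l in first if l.startswith("a=candidate:")]
        for mid in mids:
            shared[mid] = candidates
    return shared


EXAMPLE = """\
v=0
a=group:BUNDLE c1 c2
m=audio 10000 RTP/AVP 0
a=mid:c1
a=candidate:0 1 UDP 2113601791 192.0.2.240 51091 typ host
m=video 10000 RTP/AVP 31
a=mid:c2
"""

print(bundle_candidates(EXAMPLE))
```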

186	2.3.  Pros and Cons

188	2.3.1.  Codec Selection

190	   Currently, in SDP and the various documents that rely on it (such as
191	   [RFC3264]), there are certain assumptions made about the ordinality
192	   of streams to m= sections.  Consider, for example, wanting to convey
193	   two audio streams with a low-bandwidth voice codec preferred for one,
194	   but a high-quality codec preferred for the other.  RFC 3264 has rules
195	   indicating that codecs are conveyed in the order of their preference.
196	   With alternative 2, it is trivial to provide different ordering (or
197	   even a different set) of codecs to achieve such a goal.  Alternatives
198	   1a and 1b lack the ability to do so without additional extensions.

200	   This set of facts supports alternative 2 in preference to
201	   alternatives 1a and 1b.
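
   To make the point concrete, the following Python sketch reads the
   per-section codec preference order out of an Alternative 2 style
   description.  It is illustrative only: the function name
   codec_preferences is hypothetical, and the two audio sections below
   are a made-up variation on the earlier example in which iLBC stands
   in for the low-bandwidth choice and PCMU for the other.

```python
def codec_preferences(sdp: str) -> dict:
    """Return {label: [encoding names in preference order]} per m= section."""
    sections = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...; the fmt list is ordered
            # by preference under the offer/answer rules.
            sections.append({"order": line.split()[3:], "names": {}, "label": None})
        elif not sections:
            continue                     # session-level line, not needed here
        elif line.startswith("a=rtpmap:"):
            pt, _, encoding = line[len("a=rtpmap:"):].partition(" ")
            sections[-1]["names"][pt] = encoding.split("/")[0]
        elif line.startswith("a=label:"):
            sections[-1]["label"] = line[len("a=label:"):]
    return {
        s["label"]: [s["names"].get(pt, pt) for pt in s["order"]]
        for s in sections
    }


EXAMPLE = """\
m=audio 10000 RTP/AVP 97 0
a=label:speaker-audio
a=rtpmap:0 PCMU/8000
a=rtpmap:97 iLBC/8000
m=audio 10000 RTP/AVP 0 97
a=label:floor-mic
a=rtpmap:0 PCMU/8000
a=rtpmap:97 iLBC/8000
"""

print(codec_preferences(EXAMPLE))
```

   Because each stream has its own m= line, each gets its own ordered
   format list; a single shared m= line has only one such list.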

203	2.3.2.  Port Number Handling

205	   When multiple sections are used to represent a single session, we
206	   need to make a decision regarding the port number conveyed in the m=
207	   line itself.  One option is to use the same port number in all
208	   related m= sections.  According to Cullen Jennings, this interacts
209	   very poorly with existing implementations that use SDP.  The other
210	   alternative is to indicate bogus port numbers in all (or all but one)
211	   of the m= lines.  According to Hadriel Kaplan, this usage will lead
212	   to certain media intermediaries destroying the session when they
213	   determine that a signaled port is going unused.
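
   The two options can be told apart mechanically.  The Python sketch
   below groups m= lines by advertised port and reports any port that
   appears on more than one m= line.  The function names are
   hypothetical, and the port values in the second description are
   arbitrary placeholders chosen for illustration; they do not come from
   any of the proposals.

```python
from collections import defaultdict


def sections_by_port(sdp: str) -> dict:
    """Group the media types of all m= lines by the port they advertise."""
    ports = defaultdict(list)
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port>[/<count>] <proto> <fmt> ...
            media, port = line[2:].split()[:2]
            ports[int(port.split("/")[0])].append(media)
    return dict(ports)


def shared_ports(sdp: str) -> dict:
    """Return only the ports that more than one m= line advertises."""
    return {
        port: media
        for port, media in sections_by_port(sdp).items()
        if len(media) > 1
    }


SAME_PORT = """\
m=audio 10000 RTP/AVP 0
m=video 10000 RTP/AVP 31
"""

# Placeholder ports, for illustration only.
DISTINCT_PORTS = """\
m=audio 10000 RTP/AVP 0
m=video 10002 RTP/AVP 31
"""

print(shared_ports(SAME_PORT))
print(shared_ports(DISTINCT_PORTS))
```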

215	   Alternative 1b avoids this problem altogether by having only one m=
216	   per IP/port combination, thereby completely sidestepping the question
217	   of what to put in subsequent m= lines.

219	   This set of facts supports alternative 1b in preference to
220	   alternatives 1a and 2.

222	2.3.3.  Attribute handling

224	   Attributes that appear inside m= sections can be generally broken
225	   down into three categories: those intended to apply to a single media
226	   stream (e.g., framerate); those intended to apply to an RTP session
227	   (e.g., rtcp-fb); and those that are explicitly bound to the m= line
228	   itself (e.g., rtpmap).  By and large, these attributes have been
229	   defined with an assumption that each RTP session had one stream and
230	   vice-versa.

232	   By specifying a model that breaks this one-to-one correspondence, we
233	   have created the need to be able to designate a specific media
234	   stream within an RTP session (for alternatives 1a and 1b), or the
235	   need to be able to talk about session-level attributes (for
236	   alternatives 1a and 2).

238	   Alternatives 1a and 1b can perform stream-level designation through
239	   the use of the ssrc attribute specified in [RFC5576].  Alternatives
240	   1a and 2 can apply a convention that any RTP-session-level attributes
241	   are placed in the first m= section in a bundle (although other, more
242	   complicated approaches may also be possible).

244	   Note, in particular, that alternative 1a inherits both problems: it
245	   must be able to designate attributes as applying to a single stream,
246	   and it must be able to talk about session-level attributes when
247	   multiple m= lines are bundled together.

249	   This set of facts supports alternatives 1b and 2 in preference to
250	   alternative 1a.

252	2.3.4.  What We're Unaware of Not Knowing

254	   It is worth noting that the problem described in Section 2.3.1 was
255	   not discovered for quite a long time after the discussion of multiple
256	   media streams had begun.  In the characterization of "known knowns,"
257	   "known unknowns," and "unknown unknowns," this issue remained an
258	   unknown unknown for more than a little time.

260	   Generally, addressing these unknown unknowns is likely to be easiest
261	   if we have the highest granularity of control.  Alternative 2, by
262	   breaking each stream apart into its own instance of the control
263	   structure that has historically been used to work with media (the m=
264	   section), provides this high granularity where alternatives 1a and 1b
265	   do not.

267	   It is the author's opinion that the probable existence of such
268	   unknown unknowns favors alternative 2 over 1a or 1b.

270	2.4.  Red Herrings

272	   During the course of discussing this topic, several points have been
273	   raised that, while relevant, do not bias the selection of one
274	   solution over another.

276	   One issue that has been brought up is that SDP offer/answer requires
277	   signaling of the number of m= sections in the offer, to allow clear
278	   semantics for negotiation.  Some proponents of solutions 1a and 1b
279	   have indicated a belief that allowing multiple streams per m= section
280	   avoids this restriction.  This assertion has a number of problems.
281	   First, it assumes that implementations can perform reasonable
282	   operations on dynamically created media streams that begin and end
283	   without any signaling.  It further assumes that the problems that
284	   led the offer/answer model to impose the m-line restrictions are no
285	   longer applicable (at least, not on a stream level).  Finally, this
286	   assertion assumes that no control surfaces are necessary to talk
287	   about and/or manipulate the individual streams (alternately, if such
288	   control surfaces are introduced, then additional SDP round-trips to
289	   exchange information about those controls are necessary, making them
290	   semantically equivalent to a new offer/answer exchange -- which
291	   eliminates any purported advantage).

293	   It has also been observed that, in addition to being sometimes
294	   applicable to streams and sometimes applicable to sessions, attributes
295	   are also sometimes unidirectional, and sometimes bidirectional.
296	   While an astute observation, this does not appear to have any bearing
297	   on the ultimate solution selected, as all three alternatives face
298	   exactly the same challenges in dealing with issues of directionality.

300	   Finally, it should be noted that any decision to include multiple
301	   streams within a single m= section does little to simplify
302	   implementation.  Even if native RTCWEB implementations generate the
303	   fewest m= sections necessary to convey their desired session state,
304	   the selection of alternatives 1a and 1b does not obviate the
305	   requirement that implementations must be able to receive SDP with
306	   several m=audio sections (for example).  Interoperation with legacy
307	   implementations, even through a gateway, will require that proper
308	   handling of such session descriptions is present in every RTCWEB
309	   implementation.

311	2.5.  Summary

313	   The following table summarizes the pros and cons conveyed in the
314	   preceding sections on a per-solution basis.

316	                      +---------------+----+----+---+
317	                      | Issue         | 1a | 1b | 2 |
318	                      +---------------+----+----+---+
319	                      | Section 2.3.1 | -  | -  | + |
320	                      | Section 2.3.2 | -  | +  | - |
321	                      | Section 2.3.3 | -  | +  | + |
322	                      | Section 2.3.4 | -  | -  | + |
323	                      +---------------+----+----+---+

325	   Based on these criteria, it is the author's belief that Alternative 2
326	   provides the most benefit, with Alternative 1b providing a close
327	   second place.
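
   For readers who want the tally spelled out, the short Python sketch
   below counts the "+" marks per alternative in the table above.  It
   assumes every issue carries equal weight, which is a simplification
   made for this sketch and not a weighting stated in this document; the
   names TABLE and tally are hypothetical.

```python
# The summary table, transcribed row by row.
TABLE = {
    "Section 2.3.1": {"1a": "-", "1b": "-", "2": "+"},
    "Section 2.3.2": {"1a": "-", "1b": "+", "2": "-"},
    "Section 2.3.3": {"1a": "-", "1b": "+", "2": "+"},
    "Section 2.3.4": {"1a": "-", "1b": "-", "2": "+"},
}


def tally(table: dict) -> dict:
    """Count the '+' marks each alternative receives (equal weights assumed)."""
    counts = {}
    for row in table.values():
        for alternative, mark in row.items():
            counts[alternative] = counts.get(alternative, 0) + (mark == "+")
    return counts


print(tally(TABLE))
```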

329	   Alternative 1a has the remarkable property of combining all of the
330	   drawbacks of solutions 1b and 2, forming a kind of "sweet-spot" of
331	   ill-advisement, and thereby maximizing the amount of work required of
332	   the MMUSIC, RTCWEB, and CLUE working groups.

334	3.  IANA Considerations

336	   This document makes no requests of IANA.

338	4.  Security Considerations

340	   The author does not believe that the syntax under discussion has an
341	   impact on the security properties of those protocols that make use of
342	   SDP.

344	5.  Normative References

346	   [I-D.holmberg-mmusic-sdp-mmt-negotiation]
347	              Holmberg, C., Alvestrand, H., and J. Lennox, "Multiplexed
348	              Media Types (MMT) Using Session Description Protocol (SDP)
349	              Port Numbers",
350	              draft-holmberg-mmusic-sdp-mmt-negotiation-00 (work in
351	              progress), October 2012.

353	   [I-D.ietf-mmusic-sdp-bundle-negotiation]
354	              Holmberg, C. and H. Alvestrand, "Multiplexing Negotiation
355	              Using Session Description Protocol (SDP) Port Numbers",
356	              draft-ietf-mmusic-sdp-bundle-negotiation-01 (work in
357	              progress), August 2012.

359	   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
360	              with Session Description Protocol (SDP)", RFC 3264,
361	              June 2002.

363	   [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
364	              Media Attributes in the Session Description Protocol
365	              (SDP)", RFC 5576, June 2009.

367	Author's Address

369	   Adam Roach
370	   Mozilla
371	   Dallas, TX
372	   US

374	   Email: adam@nostrum.com