idnits 2.17.1 

draft-ietf-avtext-rtp-grouping-taxonomy-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 20, 2015) is 3202 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-11) exists of
     draft-ietf-avtcore-rtp-multi-stream-08

  == Outdated reference: A later version (-25) exists of
     draft-ietf-clue-framework-22

  == Outdated reference: A later version (-54) exists of
     draft-ietf-mmusic-sdp-bundle-negotiation-23

  == Outdated reference: A later version (-14) exists of
     draft-ietf-mmusic-sdp-simulcast-00

  == Outdated reference: A later version (-19) exists of
     draft-ietf-rtcweb-overview-14

  -- Obsolete informational reference (is this intentional?): RFC 4566
     (Obsoleted by RFC 8866)


     Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Network Working Group                                          J. Lennox
3	Internet-Draft                                                     Vidyo
4	Intended status: Informational                                  K. Gross
5	Expires: January 21, 2016                                            AVA
6	                                                           S. Nandakumar
7	                                                            G. Salgueiro
8	                                                           Cisco Systems
9	                                                          B. Burman, Ed.
10	                                                                Ericsson
11	                                                           July 20, 2015

13	A Taxonomy of Semantics and Mechanisms for Real-Time Transport Protocol
14	                             (RTP) Sources
15	               draft-ietf-avtext-rtp-grouping-taxonomy-08

17	Abstract

19	   The terminology about, and associations among, Real-Time Transport
20	   Protocol (RTP) sources can be complex and somewhat opaque.  This
21	   document describes a number of existing and proposed properties and
22	   relationships among RTP sources, and defines common terminology for
23	   discussing protocol entities and their relationships.

25	Status of This Memo

27	   This Internet-Draft is submitted in full conformance with the
28	   provisions of BCP 78 and BCP 79.

30	   Internet-Drafts are working documents of the Internet Engineering
31	   Task Force (IETF).  Note that other groups may also distribute
32	   working documents as Internet-Drafts.  The list of current Internet-
33	   Drafts is at http://datatracker.ietf.org/drafts/current/.

35	   Internet-Drafts are draft documents valid for a maximum of six months
36	   and may be updated, replaced, or obsoleted by other documents at any
37	   time.  It is inappropriate to use Internet-Drafts as reference
38	   material or to cite them other than as "work in progress."

40	   This Internet-Draft will expire on January 21, 2016.

42	Copyright Notice

44	   Copyright (c) 2015 IETF Trust and the persons identified as the
45	   document authors.  All rights reserved.

47	   This document is subject to BCP 78 and the IETF Trust's Legal
48	   Provisions Relating to IETF Documents
49	   (http://trustee.ietf.org/license-info) in effect on the date of
50	   publication of this document.  Please review these documents
51	   carefully, as they describe your rights and restrictions with respect
52	   to this document.  Code Components extracted from this document must
53	   include Simplified BSD License text as described in Section 4.e of
54	   the Trust Legal Provisions and are provided without warranty as
55	   described in the Simplified BSD License.

57	Table of Contents

59	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   4
60	   2.  Concepts  . . . . . . . . . . . . . . . . . . . . . . . . . .   4
61	     2.1.  Media Chain . . . . . . . . . . . . . . . . . . . . . . .   5
62	       2.1.1.  Physical Stimulus . . . . . . . . . . . . . . . . . .   9
63	       2.1.2.  Media Capture . . . . . . . . . . . . . . . . . . . .   9
64	       2.1.3.  Raw Stream  . . . . . . . . . . . . . . . . . . . . .   9
65	       2.1.4.  Media Source  . . . . . . . . . . . . . . . . . . . .  10
66	       2.1.5.  Source Stream . . . . . . . . . . . . . . . . . . . .  10
67	       2.1.6.  Media Encoder . . . . . . . . . . . . . . . . . . . .  11
68	       2.1.7.  Encoded Stream  . . . . . . . . . . . . . . . . . . .  12
69	       2.1.8.  Dependent Stream  . . . . . . . . . . . . . . . . . .  12
70	       2.1.9.  Media Packetizer  . . . . . . . . . . . . . . . . . .  12
71	       2.1.10. RTP Stream  . . . . . . . . . . . . . . . . . . . . .  13
72	       2.1.11. RTP-based Redundancy  . . . . . . . . . . . . . . . .  13
73	       2.1.12. Redundancy RTP Stream . . . . . . . . . . . . . . . .  14
74	       2.1.13. RTP-based Security  . . . . . . . . . . . . . . . . .  14
75	       2.1.14. Secured RTP Stream  . . . . . . . . . . . . . . . . .  15
76	       2.1.15. Media Transport . . . . . . . . . . . . . . . . . . .  15
77	       2.1.16. Media Transport Sender  . . . . . . . . . . . . . . .  16
78	       2.1.17. Sent RTP Stream . . . . . . . . . . . . . . . . . . .  17
79	       2.1.18. Network Transport . . . . . . . . . . . . . . . . . .  17
80	       2.1.19. Transported RTP Stream  . . . . . . . . . . . . . . .  17
81	       2.1.20. Media Transport Receiver  . . . . . . . . . . . . . .  17
82	       2.1.21. Received Secured RTP Stream . . . . . . . . . . . . .  18
83	       2.1.22. RTP-based Validation  . . . . . . . . . . . . . . . .  18
84	       2.1.23. Received RTP Stream . . . . . . . . . . . . . . . . .  18
85	       2.1.24. Received Redundancy RTP Stream  . . . . . . . . . . .  18
86	       2.1.25. RTP-based Repair  . . . . . . . . . . . . . . . . . .  18
87	       2.1.26. Repaired RTP Stream . . . . . . . . . . . . . . . . .  18
88	       2.1.27. Media Depacketizer  . . . . . . . . . . . . . . . . .  19
89	       2.1.28. Received Encoded Stream . . . . . . . . . . . . . . .  19
90	       2.1.29. Media Decoder . . . . . . . . . . . . . . . . . . . .  19
91	       2.1.30. Received Source Stream  . . . . . . . . . . . . . . .  19
92	       2.1.31. Media Sink  . . . . . . . . . . . . . . . . . . . . .  19
93	       2.1.32. Received Raw Stream . . . . . . . . . . . . . . . . .  20
94	       2.1.33. Media Render  . . . . . . . . . . . . . . . . . . . .  20
95	     2.2.  Communication Entities  . . . . . . . . . . . . . . . . .  20
96	       2.2.1.  Endpoint  . . . . . . . . . . . . . . . . . . . . . .  22
97	       2.2.2.  RTP Session . . . . . . . . . . . . . . . . . . . . .  22
98	       2.2.3.  Participant . . . . . . . . . . . . . . . . . . . . .  23
99	       2.2.4.  Multimedia Session  . . . . . . . . . . . . . . . . .  23
100	       2.2.5.  Communication Session . . . . . . . . . . . . . . . .  24
101	   3.  Concepts of Inter-Relations . . . . . . . . . . . . . . . . .  24
102	     3.1.  Synchronization Context . . . . . . . . . . . . . . . . .  24
103	       3.1.1.  RTCP CNAME  . . . . . . . . . . . . . . . . . . . . .  25
104	       3.1.2.  Clock Source Signaling  . . . . . . . . . . . . . . .  25
105	       3.1.3.  Implicitly via RtcMediaStream . . . . . . . . . . . .  25
106	       3.1.4.  Explicitly via SDP Mechanisms . . . . . . . . . . . .  25
107	     3.2.  Endpoint  . . . . . . . . . . . . . . . . . . . . . . . .  25
108	     3.3.  Participant . . . . . . . . . . . . . . . . . . . . . . .  26
109	     3.4.  RtcMediaStream  . . . . . . . . . . . . . . . . . . . . .  26
110	     3.5.  Multi-Channel Audio . . . . . . . . . . . . . . . . . . .  26
111	     3.6.  Simulcast . . . . . . . . . . . . . . . . . . . . . . . .  27
112	     3.7.  Layered Multi-Stream  . . . . . . . . . . . . . . . . . .  28
113	     3.8.  RTP Stream Duplication  . . . . . . . . . . . . . . . . .  29
114	     3.9.  Redundancy Format . . . . . . . . . . . . . . . . . . . .  30
115	     3.10. RTP Retransmission  . . . . . . . . . . . . . . . . . . .  31
116	     3.11. Forward Error Correction  . . . . . . . . . . . . . . . .  33
117	     3.12. RTP Stream Separation . . . . . . . . . . . . . . . . . .  34
118	     3.13. Multiple RTP Sessions over one Media Transport  . . . . .  35
119	   4.  Mapping from Existing Terms . . . . . . . . . . . . . . . . .  35
120	     4.1.  Telepresence Terms  . . . . . . . . . . . . . . . . . . .  35
121	       4.1.1.  Audio Capture . . . . . . . . . . . . . . . . . . . .  35
122	       4.1.2.  Capture Device  . . . . . . . . . . . . . . . . . . .  35
123	       4.1.3.  Capture Encoding  . . . . . . . . . . . . . . . . . .  36
124	       4.1.4.  Capture Scene . . . . . . . . . . . . . . . . . . . .  36
125	       4.1.5.  Endpoint  . . . . . . . . . . . . . . . . . . . . . .  36
126	       4.1.6.  Individual Encoding . . . . . . . . . . . . . . . . .  36
127	       4.1.7.  Media Capture . . . . . . . . . . . . . . . . . . . .  36
128	       4.1.8.  Media Consumer  . . . . . . . . . . . . . . . . . . .  36
129	       4.1.9.  Media Provider  . . . . . . . . . . . . . . . . . . .  37
130	       4.1.10. Stream  . . . . . . . . . . . . . . . . . . . . . . .  37
131	       4.1.11. Video Capture . . . . . . . . . . . . . . . . . . . .  37
132	     4.2.  Media Description . . . . . . . . . . . . . . . . . . . .  37
133	     4.3.  Media Stream  . . . . . . . . . . . . . . . . . . . . . .  37
134	     4.4.  Multimedia Conference . . . . . . . . . . . . . . . . . .  37
135	     4.5.  Multimedia Session  . . . . . . . . . . . . . . . . . . .  38
136	     4.6.  Multipoint Control Unit (MCU) . . . . . . . . . . . . . .  38
137	     4.7.  Multi-Session Transmission (MST)  . . . . . . . . . . . .  38
138	     4.8.  Recording Device  . . . . . . . . . . . . . . . . . . . .  39
139	     4.9.  RtcMediaStream  . . . . . . . . . . . . . . . . . . . . .  39
140	     4.10. RtcMediaStreamTrack . . . . . . . . . . . . . . . . . . .  39
141	     4.11. RTP Sender  . . . . . . . . . . . . . . . . . . . . . . .  39
142	     4.12. RTP Session . . . . . . . . . . . . . . . . . . . . . . .  39
143	     4.13. Single Session Transmission (SST) . . . . . . . . . . . .  39
144	     4.14. SSRC  . . . . . . . . . . . . . . . . . . . . . . . . . .  39

146	   5.  Security Considerations . . . . . . . . . . . . . . . . . . .  40
147	   6.  Acknowledgement . . . . . . . . . . . . . . . . . . . . . . .  40
148	   7.  Contributors  . . . . . . . . . . . . . . . . . . . . . . . .  40
149	   8.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  41
150	   9.  Informative References  . . . . . . . . . . . . . . . . . . .  41
151	   Appendix A.  Changes From Earlier Versions  . . . . . . . . . . .  44
152	     A.1.  Modifications Between WG Version -07 and -08  . . . . . .  44
153	     A.2.  Modifications Between WG Version -06 and -07  . . . . . .  45
154	     A.3.  Modifications Between WG Version -05 and -06  . . . . . .  45
155	     A.4.  Modifications Between WG Version -04 and -05  . . . . . .  46
156	     A.5.  Modifications Between WG Version -03 and -04  . . . . . .  46
157	     A.6.  Modifications Between WG Version -02 and -03  . . . . . .  47
158	     A.7.  Modifications Between WG Version -01 and -02  . . . . . .  47
159	     A.8.  Modifications Between WG Version -00 and -01  . . . . . .  48
160	     A.9.  Modifications Between Version -02 and -03 . . . . . . . .  48
161	     A.10. Modifications Between Version -01 and -02 . . . . . . . .  48
162	     A.11. Modifications Between Version -00 and -01 . . . . . . . .  48
163	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  49

165	1.  Introduction

167	   The existing taxonomy of sources in the Real-Time Transport Protocol
168	   (RTP) [RFC3550] has previously been regarded as confusing and
169	   inconsistent.  Consequently, a deep understanding of how the
170	   different terms relate to each other becomes a real challenge.
171	   Frequently cited examples of this confusion are (1) how different
172	   protocols that make use of RTP use the same terms to signify
173	   different things and (2) how the complexities addressed at one layer
174	   are often glossed over or ignored at another.

176	   This document improves clarity by reviewing the semantics of various
177	   aspects of sources in RTP.  As an organizing mechanism, it approaches
178	   this by describing various ways that RTP sources are transformed on
179	   their way between sender and receiver, and how they can be grouped
180	   and associated together.

182	   All non-specific references to ControLling mUltiple streams for
183	   tElepresence (CLUE) in this document map to [I-D.ietf-clue-framework]
184	   and all references to Web Real-Time Communications (WebRTC) map to
185	   [I-D.ietf-rtcweb-overview].

187	2.  Concepts

189	   This section defines concepts that serve to identify and name various
190	   transformations and streams in a given RTP usage.  For each concept,
191	   alternate definitions and usages that co-exist today are listed along
192	   with various characteristics that further describe the concept.
193	   These concepts are divided into two categories, one related to the
194	   chain of streams and transformations that media can be subject to,
195	   the other for entities involved in the communication.

197	2.1.  Media Chain

199	   In the context of this document, Media is a sequence of synthetic or
200	   Physical Stimuli (Section 2.1.1) (sound waves, photons, keystrokes),
201	   represented in digital form.  Synthesized Media is typically
202	   generated directly in the digital domain.

204	   This section contains the concepts that can be involved in taking
205	   Media at a sender side and transporting it to a receiver, which may
206	   recover a sequence of physical stimuli.  This chain of concepts is of
207	   two main types, streams and transformations.  Streams are time-based
208	   sequences of samples of the physical stimulus in various
209	   representations, while transformations change the representation of
210	   the streams in some way.

212	   The examples below are basic ones and it is important to keep in mind
213	   that this conceptual model enables more complex usages.  Some will be
214	   further discussed in later sections of this document.  In general the
215	   following applies to this model:

217	   o  A transformation may have zero or more inputs and one or more
218	      outputs.

220	   o  A stream is of some type, such as audio, video, real-time text,
221	      etc.

223	   o  A stream has one source transformation and one or more sink
224	      transformations (with the exception of Physical Stimulus
225	      (Section 2.1.1), which may lack a source or sink transformation).

227	   o  Streams can be forwarded from a transformation output to any
228	      number of inputs on other transformations that support that type.

230	   o  If the output of a transformation is sent to multiple
231	      transformations, those streams will be identical; it takes a
232	      transformation to make them different.

234	   o  There are no formal limitations on how streams are connected to
235	      transformations.
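
   The rules listed above can be sketched as a small data model.  This
   is only an illustration under stated assumptions: the class names
   Transformation and Stream, the "accepts" field, and the "Level
   Meter" sink are hypothetical and are not defined by this document.

```python
from dataclasses import dataclass, field


@dataclass
class Transformation:
    """A processing step such as Media Capture or Media Source."""
    name: str
    accepts: frozenset = frozenset()  # stream types usable as input


@dataclass
class Stream:
    """A typed stream with one source and any number of sinks."""
    name: str
    kind: str                 # e.g. "audio", "video", "text"
    source: Transformation
    sinks: list = field(default_factory=list)

    def connect(self, sink: Transformation) -> None:
        # A stream is forwarded only to inputs that support its type.
        if self.kind not in sink.accepts:
            raise ValueError(f"{sink.name} does not accept {self.kind}")
        self.sinks.append(sink)


capture = Transformation("Media Capture")  # takes no stream as input
source = Transformation("Media Source", frozenset({"audio"}))
monitor = Transformation("Level Meter", frozenset({"audio"}))  # hypothetical

raw = Stream("Raw Stream", "audio", source=capture)
raw.connect(source)
raw.connect(monitor)  # both sinks receive the identical stream
```

   Because both sinks are attached to the same Stream object, they see
   identical content; producing a different stream would take another
   Transformation with its own output.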

237	   It is also important to remember that this is a conceptual model.
238	   Thus real-world implementations may look different and have different
239	   structure.

241	   To provide a basic understanding of the relationships in the chain we
242	   first introduce the concepts for the sender side (Figure 1).  This
243	   covers physical stimuli until media packets are emitted onto the
244	   network.

246	               Physical Stimulus
247	                      |
248	                      V
249	           +----------------------+
250	           |     Media Capture    |
251	           +----------------------+
252	                      |
253	                 Raw Stream
254	                      V
255	           +----------------------+
256	           |     Media Source     |<- Synchronization Timing
257	           +----------------------+
258	                      |
259	                Source Stream
260	                      V
261	           +----------------------+
262	           |    Media Encoder     |
263	           +----------------------+
264	                      |
265	                Encoded Stream      +------------+
266	                      V             |            V
267	           +----------------------+ | +----------------------+
268	           |   Media Packetizer   | | | RTP-based Redundancy |
269	           +----------------------+ | +----------------------+
270	                      |             |            |
271	                      +-------------+  Redundancy RTP Stream
272	               Source RTP Stream                 |
273	                      V                          V
274	           +----------------------+   +----------------------+
275	           |  RTP-based Security  |   |  RTP-based Security  |
276	           +----------------------+   +----------------------+
277	                      |                          |
278	              Secured RTP Stream   Secured Redundancy RTP Stream
279	                      V                          V
280	           +----------------------+   +----------------------+
281	           |   Media Transport    |   |   Media Transport    |
282	           +----------------------+   +----------------------+

284	             Figure 1: Sender Side Concepts in the Media Chain

286	   In Figure 1 we have included a branched chain to cover the concepts
287	   for using redundancy to improve the reliability of the transport.

289	   The Media Transport concept is an aggregate that is decomposed in
290	   Section 2.1.15.

292	   In Figure 2 we review a receiver media chain matching the sender
293	   side, to look at the inverse transformations and their attempts to
294	   recover streams identical to those in the sender chain, subject to
295	   what may be lossy compression and imperfect Media Transport.  Note
296	   that the streams out of a reverse transformation, like the Source
297	   Stream out of the Media Decoder, are in many cases not the same as
298	   the corresponding ones on the sender side; thus they are prefixed
299	   with "Received" to denote a potentially modified version.  They
300	   differ because some transformations are irreversible.  For example,
301	   lossy source coding in the Media Encoder prevents the Source Stream
302	   out of the Media Decoder from being the same as the one fed into the
303	   Media Encoder.  Other reasons include packet loss or late loss in
304	   the Media Transport transformation that even RTP-based Repair, if
305	   used, fails to repair.  Also, some transformations are not always
306	   present, like RTP-based Repair, which cannot operate without
307	   Redundancy RTP Streams.

309	          +----------------------+   +----------------------+
310	          |   Media Transport    |   |   Media Transport    |
311	          +----------------------+   +----------------------+
312	            Received |                 Received | Secured
313	            Secured RTP Stream       Redundancy RTP Stream
314	                     V                          V
315	          +----------------------+   +----------------------+
316	          | RTP-based Validation |   | RTP-based Validation |
317	          +----------------------+   +----------------------+
318	                     |                          |
319	            Received RTP Stream   Received Redundancy RTP Stream
320	                     |                          |
321	                     |     +--------------------+
322	                     V     V
323	          +----------------------+
324	          |   RTP-based Repair   |
325	          +----------------------+
326	                     |
327	            Repaired RTP Stream
328	                     V
329	          +----------------------+
330	          |  Media Depacketizer  |
331	          +----------------------+
332	                     |
333	           Received Encoded Stream
334	                     V
335	          +----------------------+
336	          |    Media Decoder     |
337	          +----------------------+
338	                     |
339	           Received Source Stream
340	                     V
341	          +----------------------+
342	          |      Media Sink      |--> Synchronization Information
343	          +----------------------+
344	                     |
345	            Received Raw Stream
346	                     V
347	          +----------------------+
348	          |    Media Renderer    |
349	          +----------------------+
350	                     |
351	                     V
352	             Physical Stimulus

354	            Figure 2: Receiver Side Concepts of the Media Chain

356	2.1.1.  Physical Stimulus

358	   The Physical Stimulus is a physical event in the analog domain that
359	   can be sampled and converted to digital form by an appropriate sensor
360	   or transducer.  This includes sound waves making up audio, photons in
361	   a light field, or other excitations or interactions with sensors,
362	   like keystrokes on a keyboard.

364	2.1.2.  Media Capture

366	   Media Capture is the process of transforming the analog Physical
367	   Stimulus (Section 2.1.1) into digital Media using an appropriate
368	   sensor or transducer.  The Media Capture performs a digital sampling
369	   of the physical stimulus, usually periodically, and outputs this in
370	   some representation as a Raw Stream (Section 2.1.3).  This data is
371	   considered "Media", because it includes data that is periodically
372	   sampled, or made up of a set of timed asynchronous events.  The Media
373	   Capture is normally instantiated in some type of device, i.e., a
374	   media capture device.  Examples of different types of media
375	   capturing devices are digital cameras, microphones connected to A/D
376	   converters, or keyboards.

378	   Characteristics:

380	   o  A Media Capture is identified either by hardware/manufacturer ID
381	      or via a session-scoped device identifier as mandated by the
382	      application usage.

384	   o  A Media Capture can generate an Encoded Stream (Section 2.1.7) if
385	      the capture device supports such a configuration.

387	   o  The nature of the Media Capture may impose constraints on the
388	      clock handling in some of the subsequent steps.  For example, many
389	      audio or video capture devices are not completely free in
390	      selecting the sample rate.

392	2.1.3.  Raw Stream

394	   A Raw Stream is the time progressing stream of digitally sampled
395	   information, usually periodically sampled and provided by a Media
396	   Capture (Section 2.1.2).  A Raw Stream can also contain synthesized
397	   Media that may not require any explicit Media Capture, since it is
398	   already in an appropriate digital form.

400	2.1.4.  Media Source

402	   A Media Source is the logical source of a time progressing digital
403	   media stream synchronized to a reference clock.  This stream is
404	   called a Source Stream (Section 2.1.5).  This transformation takes
405	   one or more Raw Streams (Section 2.1.3) and provides a Source Stream
406	   as output.  The output is synchronized with a reference clock
407	   (Section 3.1), which can be as simple as a system local wall clock or
408	   as complex as an NTP synchronized clock.

410	   The output can be of different types.  One type is directly
411	   associated with a particular Media Capture's Raw Stream.  Others are
412	   more conceptual sources, like an audio mix of multiple Source Streams
413	   (Figure 3).  Mixing multiple streams typically requires that the
414	   input streams can be related in time, meaning that they have
415	   to be Source Streams (Section 2.1.5) rather than Raw Streams.  In
416	   Figure 3, the generated Source Stream is a mix of the three input
417	   Source Streams.

419	                Source    Source    Source
420	                Stream    Stream    Stream
421	                  |         |         |
422	                  V         V         V
423	              +--------------------------+
424	              |        Media Source      |<-- Reference Clock
425	              |           Mixer          |
426	              +--------------------------+
427	                            |
428	                            V
429	                      Source Stream

431	         Figure 3: Conceptual Media Source in form of Audio Mixer
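
   The mixing case in Figure 3 can be illustrated with a minimal
   sketch.  It assumes that each Source Stream is a list of
   (timestamp, sample) pairs already expressed on one common reference
   clock; the function name mix and the sample values are hypothetical,
   and a real mixer would also handle scaling and clipping.

```python
def mix(streams):
    """Sum samples that share the same reference-clock timestamp."""
    mixed = {}
    for stream in streams:
        for timestamp, sample in stream:
            mixed[timestamp] = mixed.get(timestamp, 0) + sample
    return sorted(mixed.items())


# Three Source Streams as (timestamp, sample) pairs on a common clock.
a = [(0, 1), (1, 2), (2, 3)]
b = [(0, 10), (1, 20), (2, 30)]
c = [(1, 100), (2, 200)]  # this source starts one tick later

source_stream = mix([a, b, c])
# source_stream is [(0, 11), (1, 122), (2, 233)]
```

   The sketch only works because the inputs share a timestamp base,
   which is the reason the inputs have to be Source Streams rather than
   Raw Streams.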

433	   Another possible example of a conceptual Media Source is a video
434	   surveillance switch, where the input is multiple Source Streams from
435	   different cameras, and the output is one of those Source Streams
436	   based on some selection criterion, such as round-robin or some
437	   video activity measure.
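
   A round-robin selection of this kind could look roughly as follows.
   The function name, the camera names, and the notion of a fixed dwell
   time measured in ticks are assumptions made for illustration only.

```python
from itertools import cycle


def round_robin_switch(camera_names, dwell_ticks, total_ticks):
    """Return which camera's Source Stream is output at each tick."""
    order = cycle(camera_names)
    current = next(order)
    selected = []
    for tick in range(total_ticks):
        # Move to the next camera each time the dwell period elapses.
        if tick and tick % dwell_ticks == 0:
            current = next(order)
        selected.append(current)
    return selected


schedule = round_robin_switch(["cam-a", "cam-b", "cam-c"],
                              dwell_ticks=2, total_ticks=7)
# schedule is ["cam-a", "cam-a", "cam-b", "cam-b",
#              "cam-c", "cam-c", "cam-a"]
```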

439	2.1.5.  Source Stream

441	   A Source Stream is a stream of digital samples that has been
442	   synchronized with a reference clock and comes from a particular Media
443	   Source (Section 2.1.4).

445	2.1.6.  Media Encoder

447	   A Media Encoder is a transform that is responsible for encoding the
448	   media data from a Source Stream (Section 2.1.5) into another
449	   representation, usually more compact, that is output as an Encoded
450	   Stream (Section 2.1.7).

452	   The Media Encoder step commonly includes pre-encoding
453	   transformations, such as scaling and resampling.  The Media Encoder
454	   can have a significant number of configuration options that affect
455	   the properties of the Encoded Stream.  These include properties such
456	   as codec, bit-rate, start points for decoding, resolution, bandwidth
457	   or other fidelity-affecting properties.

459	   Scalable Media Encoders need special attention as they produce
460	   multiple outputs that are potentially of different types.  As shown
461	   in Figure 4, a scalable Media Encoder takes one input Source Stream
462	   and encodes it into multiple output streams of two different types;
463	   at least one Encoded Stream that is independently decodable and one
464	   or more Dependent Streams (Section 2.1.8).  Decoding requires at
465	   least one Encoded Stream and zero or more Dependent Streams.  A
466	   Dependent Stream's dependency is one of the grouping relations this
467	   document discusses further in Section 3.7.

469	                              Source Stream
470	                                    |
471	                                    V
472	                       +--------------------------+
473	                       |  Scalable Media Encoder  |
474	                       +--------------------------+
475	                          |         |   ...    |
476	                          V         V          V
477	                       Encoded  Dependent  Dependent
478	                       Stream    Stream     Stream

480	            Figure 4: Scalable Media Encoder Input and Outputs
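
   The decoding rule for scalable encodings (at least one Encoded
   Stream, plus Dependent Streams whose dependencies are all present)
   can be expressed as a small check.  The function name decodable and
   the three-layer example are hypothetical; actual dependency
   structures are codec specific.

```python
def decodable(selected, dependencies):
    """Check that a set of streams can be decoded together.

    `dependencies` maps each stream to the set of streams it needs;
    an independently decodable Encoded Stream maps to an empty set.
    """
    has_encoded = any(not dependencies[s] for s in selected)
    deps_met = all(dependencies[s] <= selected for s in selected)
    return has_encoded and deps_met


# Hypothetical encoding: one base layer and two enhancement layers.
deps = {
    "base": set(),             # Encoded Stream
    "enh1": {"base"},          # Dependent Stream
    "enh2": {"base", "enh1"},  # Dependent Stream
}

decodable({"base"}, deps)          # True
decodable({"base", "enh1"}, deps)  # True
decodable({"base", "enh2"}, deps)  # False, "enh1" is missing
decodable({"enh1"}, deps)          # False, no Encoded Stream
```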

482	   There are also other variants of encoders, like so-called Multiple
483	   Description Coding (MDC).  Such Media Encoders produce multiple
484	   independent and thus individually decodable Encoded Streams.
485	   However, (logically) combining several of these Encoded Streams into
486	   a single Received Source Stream during decoding leads to an
487	   improvement in perceived reproduction quality when compared to
488	   decoding a single Encoded Stream.

490	   Creating multiple Encoded Streams from the same Source Stream, where
491	   the Encoded Streams are neither in a scalable nor in an MDC
492	   relationship is commonly utilized in Simulcast
493	   [I-D.ietf-mmusic-sdp-simulcast] environments.

495	2.1.7.  Encoded Stream

497	   A stream of time synchronized encoded media that can be independently
498	   decoded.

500	   Due to temporal dependencies, an Encoded Stream may have limitations
501	   in where decoding can be started.  These entry points, for example
502	   Intra frames from a video encoder, may require identification and
503	   their generation may be event based or configured to occur
504	   periodically.
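
   As an illustration of such entry points, the sketch below finds the
   first position at which a receiver joining mid-stream could start
   decoding.  It assumes a simplified model in which each frame is
   labelled "I" (Intra) or "P" (predicted); the function name and the
   labels are hypothetical.

```python
def first_entry_point(frame_types, join_index):
    """Index of the first Intra frame at or after join_index, or None."""
    for index in range(join_index, len(frame_types)):
        if frame_types[index] == "I":
            return index
    return None


frames = ["I", "P", "P", "P", "I", "P", "P"]

first_entry_point(frames, 0)  # 0, decoding can start immediately
first_entry_point(frames, 2)  # 4, frames 2 and 3 cannot be decoded
first_entry_point(frames, 5)  # None, wait for or request an Intra frame
```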

506	2.1.8.  Dependent Stream

508	   A stream of time synchronized encoded media fragments that depend
509	   on one or more Encoded Streams (Section 2.1.7) and zero or more
510	   Dependent Streams in order to be decodable.

512	   Each Dependent Stream has a set of dependencies.  These dependencies
513	   must be understood by the parties in a Multimedia Session that intend
514	   to use a Dependent Stream.

516	2.1.9.  Media Packetizer

518	   The transformation of taking one or more Encoded (Section 2.1.7) or
519	   Dependent Streams (Section 2.1.8) and putting their content into one
520	   or more sequences of packets, normally RTP packets, and outputting
521	   Source RTP Streams (Section 2.1.10).  This step includes generating
522	   both RTP payloads and RTP packets.  The Media Packetizer then
523	   selects which Synchronization source(s) (SSRC) [RFC3550] and RTP
524	   Sessions to use.

526	   The Media Packetizer can combine multiple Encoded or Dependent
527	   Streams into one or more RTP Streams:

529	   o  The Media Packetizer can use multiple inputs when producing a
530	      single RTP Stream.  One such example is SRST packetization when
531	      using Scalable Video Coding (SVC) (Section 3.7).

533	   o  The Media Packetizer can also produce multiple RTP Streams, for
534	      example when Encoded and/or Dependent Streams are distributed over
535	      multiple RTP Streams.  One example of this is MRMT packetization
536	      when using SVC (Section 3.7).
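
   The two cases above can be contrasted in a simplified sketch.  It
   models packets as plain dictionaries rather than real RTP packets,
   starts sequence numbers at zero for readability, and does not model
   RTP Sessions or Media Transports; the function name packetize and
   the SSRC values are hypothetical.

```python
from itertools import count


def packetize(layers, ssrc_for_layer):
    """Turn per-layer payload lists into RTP-like packet records.

    Each SSRC gets its own sequence number space.
    """
    counters = {}
    packets = []
    for layer, payloads in layers.items():
        ssrc = ssrc_for_layer[layer]
        counter = counters.setdefault(ssrc, count(0))
        for payload in payloads:
            packets.append({"ssrc": ssrc, "seq": next(counter),
                            "layer": layer, "payload": payload})
    return packets


layers = {"base": [b"b0", b"b1"], "enh": [b"e0", b"e1"]}

# Multiple inputs, single RTP Stream: every layer shares one SSRC.
single = packetize(layers, {"base": 0x1111, "enh": 0x1111})

# Inputs distributed over multiple RTP Streams: one SSRC per layer.
multi = packetize(layers, {"base": 0x1111, "enh": 0x2222})
```

   In the first call all four packets share one sequence number space;
   in the second, each RTP Stream numbers its packets independently.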

538	2.1.10.  RTP Stream

540	   An RTP Stream is a stream of RTP packets containing media data,
541	   source or redundant.  The RTP Stream is identified by an SSRC
542	   belonging to a particular RTP Session.  The RTP Session is identified
543	   as discussed in Section 2.2.2.

545	   A Source RTP Stream is an RTP Stream directly related to an Encoded
546	   Stream (Section 2.1.7), targeted for transport over RTP without any
547	   additional RTP-based Redundancy (Section 2.1.11) applied.

549	   Characteristics:

551	   o  Each RTP Stream is identified by a Synchronization source (SSRC)
552	      [RFC3550] that is carried in every RTP and RTP Control Protocol
553	      (RTCP) packet header.  The SSRC is unique in a specific RTP
554	      Session context.

556	   o  At any given point in time, an RTP Stream can have one and only
557	      one SSRC, but SSRCs for a given RTP Stream can change over time.
558	      SSRC collision and clock rate change [RFC7160] are examples of
559	      valid reasons to change SSRC for an RTP Stream.  In those cases,
560	      the RTP Stream itself is not changed in any significant way, only
561	      the identifying SSRC number.

563	   o  Each SSRC defines a unique RTP sequence numbering and timing
564	      space.

566	   o  Several RTP Streams, each with their own SSRC, may represent a
567	      single Media Source.

569	   o  Several RTP Streams, each with their own SSRC, can be carried in a
570	      single RTP Session.
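
As an illustration only, the characteristics above could be modelled as in the following Python sketch. The names RtpStream, RtpSessionModel, add, and change_ssrc are hypothetical; they are not defined by this document or by RFC 3550. The sketch assumes that an SSRC must be unique within one RTP Session and that changing the SSRC leaves the rest of the RTP Stream untouched.

```python
from dataclasses import dataclass, field


@dataclass
class RtpStream:
    """One RTP Stream; it has exactly one SSRC at any point in time."""
    media_source: str   # label of the Media Source it represents
    ssrc: int           # current 32-bit SSRC


@dataclass
class RtpSessionModel:
    """Hypothetical model of the SSRC space of a single RTP Session."""
    streams: dict = field(default_factory=dict)   # ssrc -> RtpStream

    def add(self, stream: RtpStream) -> None:
        if stream.ssrc in self.streams:
            raise ValueError("SSRC collision within this RTP Session")
        self.streams[stream.ssrc] = stream

    def change_ssrc(self, old: int, new: int) -> RtpStream:
        """Re-key a stream, e.g. after an SSRC collision; the stream
        itself is unchanged apart from its identifying SSRC."""
        if new in self.streams:
            raise ValueError("SSRC collision within this RTP Session")
        stream = self.streams.pop(old)
        stream.ssrc = new
        self.streams[new] = stream
        return stream
```

Several RtpStream objects may carry the same media_source label, which mirrors the statement that several RTP Streams may represent a single Media Source.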

572	2.1.11.  RTP-based Redundancy

574	   RTP-based Redundancy is defined here as a transformation that
575	   generates redundant or repair packets sent out as a Redundancy RTP
576	   Stream (Section 2.1.12) to mitigate network transport impairments,
577	   like packet loss and delay.  Note that this excludes the type of
578	   redundancy that most suitable Media Encoders (Section 2.1.6) may add
579	   to the media format of the Encoded Stream (Section 2.1.7) that makes
580	   it cope better with inevitable RTP packet losses.

582	   RTP-based Redundancy exists in many flavors.  It may generate
583	   independent Repair Streams that are used in addition to the Source
584	   Stream (like RTP Retransmission (Section 3.10) and some special
585	   types of Forward Error Correction, like RTP stream duplication
586	   (Section 3.8)), it may generate a new Source Stream by combining
587	   redundancy information with source information (using XOR FEC
588	   (Section 3.11) as a redundancy payload (Section 3.9)), or it may
589	   completely replace the source information with only redundancy
590	   packets.
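
To make the idea of repair packets concrete, the following Python sketch shows the XOR principle that underlies simple parity-based FEC. It is a deliberate simplification: it operates on bare, equal-length payloads and ignores RTP headers and the FEC payload formats of the referenced specifications. The function names xor_parity and recover_missing are hypothetical.

```python
def xor_parity(payloads: list[bytes]) -> bytes:
    """XOR equal-length payloads byte by byte into one parity payload."""
    parity = bytearray(len(payloads[0]))
    for payload in payloads:
        if len(payload) != len(parity):
            raise ValueError("this sketch assumes equal-length payloads")
        for i, byte in enumerate(payload):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(received: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing payload from the others plus parity."""
    return xor_parity(received + [parity])
```

With one parity payload protecting a group of source payloads, any single loss within the group can be rebuilt; two or more losses in the same group cannot.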

592	2.1.12.  Redundancy RTP Stream

594	   A Redundancy RTP Stream is an RTP Stream (Section 2.1.10) that
595	   contains no original source data, only redundant data, which may
596	   either be used standalone or be combined with one or more Received
597	   RTP Streams (Section 2.1.23) to produce Repaired RTP Streams
598	   (Section 2.1.26).

600	2.1.13.  RTP-based Security

602	   The optional RTP-based Security transformation applies security
603	   services such as authentication, integrity protection and
604	   confidentiality to an input RTP Stream, like what is specified in The
605	   Secure Real-time Transport Protocol (SRTP) [RFC3711], producing a
606	   Secured RTP Stream (Section 2.1.14).  Either an RTP Stream
607	   (Section 2.1.10) or a Redundancy RTP Stream (Section 2.1.12) can be
608	   used as input to this transformation.

610	   In SRTP and the related Secure RTCP (SRTCP), all of the above
611	   mentioned security services are optional, except for integrity
612	   protection of SRTCP, which is mandatory.  Also confidentiality
613	   (encryption) is effectively optional in SRTP, since it is possible to
614	   use a NULL encryption algorithm.  As described in [RFC7201], the
615	   strength of SRTP data origin authentication depends on the
616	   cryptographic transform and key management used, for example in group
617	   communication where it is sometimes possible to authenticate group
618	   membership but not the actual RTP Stream sender.

620	   RTP-based Security and RTP-based Redundancy can be combined in a few
621	   different ways.  One way is depicted in Figure 1, where an RTP Stream
622	   and its corresponding Redundancy RTP Stream are protected by separate
623	   RTP-based Security transforms.  In other cases, like when a Media
624	   Translator is adding FEC in Section 3.2.1.3 of
625	   [I-D.ietf-avtcore-rtp-topologies-update], a middlebox can apply RTP-
626	   based Redundancy to an already Secured RTP Stream instead of a Source
627	   RTP Stream.  One example of that is depicted in Figure 5 below.

629	               Source RTP Stream    +------------+
630	                      V             |            V
631	           +----------------------+ | +----------------------+
632	           |  RTP-based Security  | | | RTP-based Redundancy |
633	           +----------------------+ | +----------------------+
634	                      |             |            |
635	                      |             |  Redundancy RTP Stream
636	                      +-------------+            |
637	                      |                          V
638	                      |               +----------------------+
639	              Secured RTP Stream      |  RTP-based Security  |
640	                      |               +----------------------+
641	                      |                          |
642	                      |            Secured Redundancy RTP Stream
643	                      V                          V
644	           +----------------------+   +----------------------+
645	           |   Media Transport    |   |   Media Transport    |
646	           +----------------------+   +----------------------+

648	            Figure 5: Adding Redundancy to a Secured RTP Stream

650	   In this case, the Redundancy RTP Stream may already have been secured
651	   for confidentiality (encrypted) by the first RTP-based Security, and
652	   it may therefore not be necessary to apply additional confidentiality
653	   protection in the second RTP-based Security.  To avoid attacks and
654	   negative impact on RTP-based Repair (Section 2.1.25) and the
655	   resulting Repaired RTP Stream (Section 2.1.26), it is however still
656	   necessary to have this second RTP-based Security apply both
657	   authentication and integrity protection to the Redundancy RTP Stream.

659	2.1.14.  Secured RTP Stream

661	   A Secured RTP Stream is a Source or Redundancy RTP Stream that is
662	   protected through RTP-based Security (Section 2.1.13) by one or more
663	   of the confidentiality, integrity, or authentication security
664	   services.

666	2.1.15.  Media Transport

668	   A Media Transport defines the transformation that the RTP Streams
669	   (Section 2.1.10) are subjected to by the end-to-end transport from
670	   one RTP sender to one specific RTP receiver (an RTP Session
671	   (Section 2.2.2) may contain multiple RTP receivers per sender).  Each
672	   Media Transport is defined by a transport association that is
673	   normally identified by a 5-tuple (source address, source port,
674	   destination address, destination port, transport protocol), but a
675	   proposal exists for sending multiple transport associations on a
676	   single 5-tuple [I-D.westerlund-avtcore-transport-multiplexing].

678	   Characteristics:

680	   o  Media Transport transmits RTP Streams of RTP Packets from a source
681	      transport address to a destination transport address.

683	   o  Each Media Transport contains only a single RTP Session.

685	   o  A single RTP Session can span multiple Media Transports.
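
The relationship between Media Transports and RTP Sessions stated above can be sketched in Python as follows. The sketch assumes the common case in which a Media Transport is identified by its 5-tuple, and it does not model the proposal for several transport associations on one 5-tuple. FiveTuple, TransportTable, bind, and transports_of are hypothetical names.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int
    protocol: str


class TransportTable:
    """Hypothetical registry mapping each Media Transport to its RTP Session."""

    def __init__(self) -> None:
        self._session_of: dict[FiveTuple, str] = {}

    def bind(self, transport: FiveTuple, session: str) -> None:
        # A Media Transport contains only a single RTP Session.
        bound = self._session_of.setdefault(transport, session)
        if bound != session:
            raise ValueError("a Media Transport carries a single RTP Session")

    def transports_of(self, session: str) -> list[FiveTuple]:
        # A single RTP Session can span multiple Media Transports.
        return [t for t, s in self._session_of.items() if s == session]
```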

687	   The Media Transport concept sometimes needs to be decomposed into
688	   more steps to enable discussion of what a sender emits that gets
689	   transformed by the network before it is received by the receiver.
690	   Thus, we also provide this Media Transport decomposition (Figure 6).

692	                               RTP Stream
693	                                    |
694	                                    V
695	                       +--------------------------+
696	                       |  Media Transport Sender  |
697	                       +--------------------------+
698	                                    |
699	                             Sent RTP Stream
700	                                    V
701	                       +--------------------------+
702	                       |    Network Transport     |
703	                       +--------------------------+
704	                                    |
705	                         Transported RTP Stream
706	                                    V
707	                       +--------------------------+
708	                       | Media Transport Receiver |
709	                       +--------------------------+
710	                                    |
711	                                    V
712	                           Received RTP Stream

714	                Figure 6: Decomposition of Media Transport

716	2.1.16.  Media Transport Sender

718	   The first transformation within the Media Transport (Section 2.1.15)
719	   is the Media Transport Sender.  The sending Endpoint (Section 2.2.1)
720	   takes an RTP Stream and emits the packets onto the network using the
721	   transport association established for this Media Transport, thereby
722	   creating a Sent RTP Stream (Section 2.1.17).  In the process, it
723	   transforms the RTP Stream in several ways.  First, it generates the
724	   necessary protocol headers for the transport association, for example
725	   IP and UDP headers, thus forming IP/UDP/RTP packets.  In addition,
726	   the Media Transport Sender may queue, intentionally pace or otherwise
727	   affect how the packets are emitted onto the network, thereby
728	   potentially introducing delay and delay variations [RFC5481] that
729	   characterize the Sent RTP Stream.

731	2.1.17.  Sent RTP Stream

733	   The Sent RTP Stream is the RTP Stream as it enters the first hop of
734	   the network path to its destination.  The Sent RTP Stream is
735	   identified using network transport addresses, like for IP/UDP the
736	   5-tuple (source IP address, source port, destination IP address,
737	   destination port, and protocol (UDP)).

739	2.1.18.  Network Transport

741	   Network Transport is the transformation that subjects the Sent RTP
742	   Stream (Section 2.1.17) to traveling from the source to the
743	   destination through the network.  This transformation can result in
744	   loss of some packets, delay and delay variation on a per packet
745	   basis, packet duplication, and packet header or data corruption.
746	   This transformation produces a Transported RTP Stream
747	   (Section 2.1.19) at the exit of the network path.

749	2.1.19.  Transported RTP Stream

751	   The Transported RTP Stream is the RTP Stream that is emitted out of
752	   the network path at the destination, subjected to the Network
753	   Transport's transformation (Section 2.1.18).

755	2.1.20.  Media Transport Receiver

757	   The Media Transport Receiver is the receiver Endpoint's
758	   (Section 2.2.1) transformation of the Transported RTP Stream
759	   (Section 2.1.19) by its reception process, which results in the
760	   Received RTP Stream (Section 2.1.23).  This transformation includes
761	   transport checksums being verified.  Sensible system designs
762	   typically either discard packets with mis-matching checksums, or pass
763	   them on while somehow marking them in the resulting Received RTP
764	   Stream so as to alert subsequent transformations about the possible
765	   corrupt state.  In this context it is worth noting that there is
766	   typically some probability for corrupt packets to pass through
767	   undetected (with a seemingly correct checksum).  Other
768	   transformations can compensate for delay variations in receiving a
769	   packet on the network interface and providing it to the application
770	   (de-jitter buffer).
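
The two checksum-handling strategies described above (discard, or pass on while marking) can be sketched in Python as follows. This is only an illustration: ReceivedPacket, receive, and the pass_corrupt flag are hypothetical names, and the checksum verification itself is supplied by the caller as a function because it depends on the transport protocol in use.

```python
from dataclasses import dataclass


@dataclass
class ReceivedPacket:
    data: bytes
    suspect: bool = False   # True if the transport checksum did not match


def receive(transported, checksum_ok, pass_corrupt=False):
    """Yield the Received RTP Stream from a Transported RTP Stream.

    transported:  iterable of raw packets (bytes)
    checksum_ok:  callable returning True when a packet's checksum verifies
    pass_corrupt: pass mismatching packets on, marked, instead of dropping
    """
    for data in transported:
        if checksum_ok(data):
            yield ReceivedPacket(data)
        elif pass_corrupt:
            yield ReceivedPacket(data, suspect=True)
        # otherwise the packet is discarded
```

Note that a packet yielded with suspect set to False may still be corrupt, since a corrupted packet can carry a seemingly correct checksum.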

772	2.1.21.  Received Secured RTP Stream

774	   This is the Secured RTP Stream (Section 2.1.14) resulting from the
775	   Media Transport (Section 2.1.15) aggregate transformation.

777	2.1.22.  RTP-based Validation

779	   RTP-based Validation is the reverse transformation of RTP-based
780	   Security (Section 2.1.13).  If this transformation fails, the result
781	   is either not usable and must be discarded, or may be usable but
782	   cannot be trusted.  If the transformation succeeds, the result can be
783	   a Received RTP Stream (Section 2.1.23) or a Received Redundancy RTP
784	   Stream (Section 2.1.24), depending on what was input to the
785	   corresponding RTP-based Security transformation, but can also be a
786	   Received Secured RTP Stream (Section 2.1.21) in case several RTP-
787	   based Security transformations were applied.

789	2.1.23.  Received RTP Stream

791	   The Received RTP Stream is the RTP Stream (Section 2.1.10) resulting
792	   from the Media Transport's aggregate transformation (Section 2.1.15),
793	   i.e. subjected to packet loss, packet corruption, packet duplication,
794	   delay, and delay variation from sender to receiver.

796	2.1.24.  Received Redundancy RTP Stream

798	   The Received Redundancy RTP Stream is the Redundancy RTP Stream
799	   (Section 2.1.12) resulting from the Media Transport transformation,
800	   i.e. subjected to packet loss, packet corruption, delay, and delay
801	   variation from sender to receiver.

803	2.1.25.  RTP-based Repair

805	   RTP-based Repair is a Transformation that takes as input zero or more
806	   Received RTP Streams (Section 2.1.23) and one or more Received
807	   Redundancy RTP Streams (Section 2.1.24), and produces one or more
808	   Repaired RTP Streams (Section 2.1.26) that are as close to the
809	   corresponding sent Source RTP Streams (Section 2.1.10) as possible,
810	   using different RTP-based repair methods, for example the ones
811	   referred to in RTP-based Redundancy (Section 2.1.11).

813	2.1.26.  Repaired RTP Stream

815	   A Repaired RTP Stream is a Received RTP Stream (Section 2.1.23) for
816	   which Received Redundancy RTP Stream (Section 2.1.24) information has
817	   been used to try to recover the Source RTP Stream (Section 2.1.10) as
818	   it was before Media Transport (Section 2.1.15).

820	2.1.27.  Media Depacketizer

822	   A Media Depacketizer takes one or more RTP Streams (Section 2.1.10),
823	   depacketizes them, and attempts to reconstitute the Encoded Streams
824	   (Section 2.1.7) or Dependent Streams (Section 2.1.8) present in those
825	   RTP Streams.

827	   In practical implementations, the Media Depacketizer and the Media
828	   Decoder may be tightly coupled and share information to improve or
829	   optimize the overall decoding and error concealment process.  It is,
830	   however, not expected that there would be any benefit in defining a
831	   taxonomy for those detailed (and likely very implementation-
832	   dependent) steps.

834	2.1.28.  Received Encoded Stream

836	   The Received Encoded Stream is the received version of an Encoded
837	   Stream (Section 2.1.7).

839	2.1.29.  Media Decoder

841	   A Media Decoder is a transformation that is responsible for decoding
842	   Encoded Streams (Section 2.1.7) and any Dependent Streams
843	   (Section 2.1.8) into a Source Stream (Section 2.1.5).

845	   In practical implementations, the Media Decoder and the Media
846	   Depacketizer may be tightly coupled and share information to improve
847	   or optimize the overall decoding process in various ways.  It is,
848	   however, not expected that there would be any benefit in defining a
849	   taxonomy for those detailed (and likely very implementation-
850	   dependent) steps.

852	   A Media Decoder has to deal with any errors in the Encoded Streams
853	   that resulted from corruption or failure to repair packet losses.
854	   Therefore, it commonly is robust to error and losses, and includes
855	   concealment methods.

857	2.1.30.  Received Source Stream

859	   The Received Source Stream is the received version of a Source Stream
860	   (Section 2.1.5).

862	2.1.31.  Media Sink

864	   The Media Sink receives a Source Stream (Section 2.1.5) that
865	   contains, usually periodically, sampled media data together with
866	   associated synchronization information.  Depending on application,
867	   this Source Stream then needs to be transformed into a Raw Stream
868	   (Section 2.1.3) that is conveyed to the Media Render
869	   (Section 2.1.33), synchronized with the output from other Media
870	   Sinks.  The Media Sink may also be connected with a Media Source
871	   (Section 2.1.4) and be used as part of a conceptual Media Source.

873	   The Media Sink can further transform the Source Stream into a
874	   representation that is suitable for rendering on the Media Render as
875	   defined by the application or system-wide configuration.  This
876	   includes sample scaling, level adjustments, etc.

878	2.1.32.  Received Raw Stream

880	   The Received Raw Stream is the received version of a Raw Stream
881	   (Section 2.1.3).

883	2.1.33.  Media Render

885	   A Media Render takes a Raw Stream (Section 2.1.3) and converts it
886	   into Physical Stimulus (Section 2.1.1) that a human user can
887	   perceive.  Examples of such devices are screens, and D/A converters
888	   connected to amplifiers and loudspeakers.

890	   An Endpoint can potentially have multiple Media Renders for each
891	   media type.

893	2.2.  Communication Entities

895	   This section contains concepts for entities involved in the
896	   communication.

898	      +------------------------------------------------------------+
899	      | Communication Session                                      |
900	      |                                                            |
901	      | +----------------+                      +----------------+ |
902	      | | Participant A  |    +------------+    | Participant B  | |
903	      | |                |    | Multimedia |    |                | |
904	      | | +------------+ |<==>| Session    |<==>| +------------+ | |
905	      | | | Endpoint A | |    |            |    | | Endpoint B | | |
906	      | | |            | |    +------------+    | |            | | |
907	      | | | +----------+-+----------------------+-+----------+ | | |
908	      | | | | RTP      | |                      | |          | | | |
909	      | | | | Session  |-+---Media Transport----+>|          | | | |
910	      | | | | Audio    |<+---Media Transport----+-|          | | | |
911	      | | | |          | |          ^           | |          | | | |
912	      | | | +----------+-+----------|-----------+-+----------+ | | |
913	      | | |            | |          v           | |            | | |
914	      | | |            | | +-----------------+  | |            | | |
915	      | | |            | | | Synchronization |  | |            | | |
916	      | | |            | | |     Context     |  | |            | | |
917	      | | |            | | +-----------------+  | |            | | |
918	      | | |            | |          ^           | |            | | |
919	      | | | +----------+-+----------|-----------+-+----------+ | | |
920	      | | | | RTP      | |          v           | |          | | | |
921	      | | | | Session  |<+---Media Transport----+-|          | | | |
922	      | | | | Video    |-+---Media Transport----+>|          | | | |
923	      | | | |          | |                      | |          | | | |
924	      | | | +----------+-+----------------------+-+----------+ | | |
925	      | | +------------+ |                      | +------------+ | |
926	      | +----------------+                      +----------------+ |
927	      +------------------------------------------------------------+

929	    Figure 7: Example Point to Point Communication Session with two RTP
930	                                 Sessions

932	   Figure 7 shows a high-level example representation of a very basic
933	   point-to-point Communication Session between Participants A and B.
934	   It uses two different audio and video RTP Sessions between A's and
935	   B's Endpoints, where each RTP Session is a group communications
936	   channel that can potentially carry a number of RTP Streams.  It is
937	   using separate Media Transports for those RTP Sessions.  The
938	   Multimedia Session shared by the Participants can, for example, be
939	   established using SIP (i.e., there is a SIP Dialog between A and B).
940	   The terms used in Figure 7 are further elaborated in the sub-sections
941	   below.

943	2.2.1.  Endpoint

945	   An Endpoint is a single addressable entity sending or receiving RTP
946	   packets.  It may be decomposed into several functional blocks, but as
947	   long as it behaves as a single RTP stack entity it is classified as a
948	   single "Endpoint".

950	   Characteristics:

952	   o  Endpoints can be identified in several different ways.  While RTCP
953	      Canonical Names (CNAMEs) [RFC3550] provide a globally unique and
954	      stable identification mechanism for the duration of the
955	      Communication Session (see Section 2.2.5), their validity applies
956	      exclusively within a Synchronization Context (Section 3.1).  Thus
957	      one Endpoint can handle multiple CNAMEs, each of which can be
958	      shared among a set of Endpoints belonging to the same Participant
959	      (Section 2.2.3).  Therefore, mechanisms outside the scope of RTP,
960	      such as application defined mechanisms, must be used to provide
961	      Endpoint identification when outside this Synchronization Context.

963	   o  An Endpoint can be associated with at most one Participant
964	      (Section 2.2.3) at any single point in time.

966	   o  In some contexts, an Endpoint would typically correspond to a
967	      single "host", for example a computer using a single network
968	      interface and being used by a single human user.  In other
969	      contexts, a single "host" can serve multiple Participants, in
970	      which case each Participant's Endpoint may share properties, for
971	      example the IP address part of a transport address.

973	2.2.2.  RTP Session

975	   An RTP Session is an association among a group of Participants
976	   communicating with RTP.  It is a group communications channel which
977	   can potentially carry a number of RTP Streams.  Within an RTP
978	   Session, every Participant can find meta-data and control information
979	   (over RTCP) about all the RTP Streams in the RTP Session.  The
980	   bandwidth of the RTCP control channel is shared between all
981	   Participants within an RTP Session.

983	   Characteristics:

985	   o  An RTP Session can carry one or more RTP Streams.

987	   o  An RTP Session shares a single SSRC space as defined in RFC3550
988	      [RFC3550].  That is, the Endpoints participating in an RTP Session
989	      can see an SSRC identifier transmitted by any of the other
990	      Endpoints.  An Endpoint can receive an SSRC either as SSRC or as a
991	      Contributing source (CSRC) in RTP and RTCP packets, as defined by
992	      the Endpoints' network interconnection topology.

994	   o  An RTP Session uses at least two Media Transports
995	      (Section 2.1.15), one for sending and one for receiving.
996	      Commonly, the receiving Media Transport is the reverse direction
997	      of the Media Transport used for sending.  An RTP Session may use
998	      many Media Transports and these define the session's network
999	      interconnection topology.

1001	   o  A single Media Transport always carries a single RTP Session.

1003	   o  Multiple RTP Sessions can be conceptually related, for example
1004	      originating from or targeted for the same Participant
1005	      (Section 2.2.3) or Endpoint (Section 2.2.1), or by containing RTP
1006	      Streams that are somehow related (Section 3).

1008	2.2.3.  Participant

1010	   A Participant is an entity reachable by a single signaling address,
1011	   and is thus related more to the signaling context than to the media
1012	   context.

1014	   Characteristics:

1016	   o  A single signaling-addressable entity, using an application-
1017	      specific signaling address space, for example a SIP URI.

1019	   o  A Participant can participate in several Multimedia Sessions
1020	      (Section 2.2.4).

1022	   o  A Participant can be comprised of several associated Endpoints
1023	      (Section 2.2.1).

1025	2.2.4.  Multimedia Session

1027	   A Multimedia Session is an association among a group of Participants
1028	   (Section 2.2.3) engaged in the communication via one or more RTP
1029	   Sessions (Section 2.2.2).  It defines logical relationships among
1030	   Media Sources (Section 2.1.4) that appear in multiple RTP Sessions.

1032	   Characteristics:

1034	   o  A Multimedia Session can be composed of several RTP Sessions with
1035	      potentially multiple RTP Streams per RTP Session.

1037	   o  Each Participant in a Multimedia Session can have a multitude of
1038	      Media Captures and Media Rendering devices.

1040	   o  A single Multimedia Session can contain media from one or more
1041	      Synchronization Contexts (Section 3.1).  An example of that is a
1042	      Multimedia Session containing one set of audio and video for
1043	      communication purposes belonging to one Synchronization Context,
1044	      and another set of audio and video for presentation purposes (like
1045	      playing a video file) with a separate Synchronization Context that
1046	      has no strong timing relationship and need not be strictly
1047	      synchronized with the audio and video used for communication.

1049	2.2.5.  Communication Session

1051	   A Communication Session is an association among two or more
1052	   Participants (Section 2.2.3) communicating with each other via one or
1053	   more Multimedia Sessions (Section 2.2.4).

1055	   Characteristics:

1057	   o  Each Participant in a Communication Session is identified via an
1058	      application-specific signaling address.

1060	   o  A Communication Session is composed of Participants that share at
1061	      least one Multimedia Session, involving one or more parallel RTP
1062	      Sessions with potentially multiple RTP Streams per RTP Session.

1064	   For example, in a full mesh communication, the Communication Session
1065	   consists of a set of separate Multimedia Sessions between each pair
1066	   of Participants.  Another example is a centralized conference, where
1067	   the Communication Session consists of a set of Multimedia Sessions
1068	   between each Participant and the conference handler.

1070	3.  Concepts of Inter-Relations

1072	   This section uses the concepts from previous sections, and looks at
1073	   different types of relationships among them.  These relationships
1074	   occur at different abstraction levels and for different purposes, but
1075	   the reason for the needed relationship at a certain step in the media
1076	   handling chain may exist at another step.  For example, the use of
1077	   Simulcast (Section 3.6) implies a need to determine relations at RTP
1078	   Stream level, but the underlying reason is that multiple Media
1079	   Encoders use the same Media Source, i.e. to be able to identify a
1080	   common Media Source.

1082	3.1.  Synchronization Context

1084	   A Synchronization Context defines a requirement on a strong timing
1085	   relationship between the Media Sources, typically requiring alignment
1086	   of clock sources.  Such a relationship can be identified in multiple
1087	   ways as listed below.  A single Media Source can only belong to a
1088	   single Synchronization Context, since it is assumed that a single
1089	   Media Source can only have a single media clock and requiring
1090	   alignment to several Synchronization Contexts (and thus reference
1091	   clocks) will effectively merge those into a single Synchronization
1092	   Context.

1094	3.1.1.  RTCP CNAME

1096	   RFC3550 [RFC3550] describes Inter-media synchronization between RTP
1097	   Sessions based on RTCP CNAME, RTP and Network Time Protocol (NTP)
1098	   [RFC5905] formatted timestamps of a reference clock.  As indicated in
1099	   [RFC7273], despite using NTP format timestamps, it is not required
1100	   that the clock be synchronized to an NTP source.

1102	3.1.2.  Clock Source Signaling

1104	   [RFC7273] provides a mechanism to signal the clock source in Session
1105	   Description Protocol (SDP) [RFC4566] both for the reference clock as
1106	   well as the media clock, thus allowing a Synchronization Context to
1107	   be defined beyond the one defined by the usage of CNAME source
1108	   descriptions.

1110	3.1.3.  Implicitly via RtcMediaStream

1112	   WebRTC defines "RtcMediaStream" with one or more
1113	   "RtcMediaStreamTracks".  All tracks in a "RtcMediaStream" are
1114	   intended to be synchronized when rendered, implying that they must be
1115	   generated such that synchronization is possible.

1117	3.1.4.  Explicitly via SDP Mechanisms

1119	   The SDP Grouping Framework [RFC5888] defines an m= line (Section 4.2)
1120	   grouping mechanism called "Lip Synchronization" (with LS
1121	   identification-tag) for establishing the synchronization requirement
1122	   across m= lines when they map to individual sources.

1124	   Source-Specific Media Attributes in SDP [RFC5576] extends the above
1125	   mechanism when multiple Media Sources are described by a single m=
1126	   line.
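
As a small illustration of the grouping mechanism, the following Python sketch extracts the identification-tags of "LS" groups from an SDP description. The function name lip_sync_groups and the sample session description are hypothetical; the sample only follows the "a=group:" and "a=mid:" attribute syntax of the SDP Grouping Framework and is not taken from any specification.

```python
def lip_sync_groups(sdp: str) -> list[list[str]]:
    """Return the identification-tags listed in each "a=group:LS" line."""
    groups = []
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith("a=group:"):
            continue
        tokens = line[len("a=group:"):].split()
        if tokens and tokens[0] == "LS":
            groups.append(tokens[1:])   # drop the "LS" semantics token
    return groups


# Hypothetical session description: the audio and video m= lines are
# tagged with mid values 1 and 2 and placed in one LS group.
EXAMPLE_SDP = """\
v=0
o=- 0 0 IN IP4 192.0.2.1
s=-
c=IN IP4 192.0.2.1
t=0 0
a=group:LS 1 2
m=audio 30000 RTP/AVP 0
a=mid:1
m=video 30002 RTP/AVP 31
a=mid:2
"""
```

Calling lip_sync_groups(EXAMPLE_SDP) yields one group containing the tags "1" and "2", meaning that the two m= lines share a synchronization requirement.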

1128	3.2.  Endpoint

1130	   Some applications require knowledge of what Media Sources originate
1131	   from a particular Endpoint (Section 2.2.1).  This can include such
1132	   decisions as packet routing between parts of the topology, knowing
1133	   the Endpoint origin of the RTP Streams.

1135	   In RTP, this identification has been overloaded with the
1136	   Synchronization Context (Section 3.1) through the usage of the RTCP
1137	   source description CNAME (Section 3.1.1).  This works for some
1138	   usages, but in others it breaks down.  For example, if an Endpoint
1139	   has two sets of Media Sources that have different Synchronization
1140	   Contexts, like the audio and video of the human Participant as well
1141	   as a set of Media Sources of audio and video for a shared movie,
1142	   CNAME would not be an appropriate identification for that Endpoint.
1143	   Therefore, an Endpoint may have multiple CNAMEs.  The CNAMEs or the
1144	   Media Sources themselves can be related to the Endpoint.

1146	3.3.  Participant

1148	   In communication scenarios, it is commonly needed to know which Media
1149	   Sources originate from which Participant (Section 2.2.3).  One reason
1150	   is, for example, to enable the application to display Participant
1151	   Identity information correctly associated with the Media Sources.
1152	   This association is handled through the signaling solution to point
1153	   at a specific Multimedia Session where the Media Sources may be
1154	   explicitly or implicitly tied to a particular Endpoint.

1156	   Participant information becomes more problematic due to Media Sources
1157	   that are generated through mixing or other conceptual processing of
1158	   Raw Streams or Source Streams that originate from different
1159	   Participants.  This type of Media Source can thus have a dynamically
1160	   varying set of origins and Participants.  RTP contains the concept of
1161	   CSRC, which carries information about the previous-step origin of
1162	   the included media content at the RTP level.

1164	3.4.  RtcMediaStream

1166	   An RtcMediaStream in WebRTC is an explicit grouping of a set of Media
1167	   Sources (RtcMediaStreamTracks) that share a common identifier and a
1168	   single Synchronization Context (Section 3.1).

1170	3.5.  Multi-Channel Audio

1172	   There exist a number of RTP payload formats that can carry multi-
1173	   channel audio, despite the codec being a single-channel (mono)
1174	   encoder.  Multi-channel audio can be viewed as multiple Media Sources
1175	   sharing a common Synchronization Context.  These are independently
1176	   encoded by a Media Encoder and the different Encoded Streams are
1177	   packetized together in a time synchronized way into a single Source
1178	   RTP Stream, using the codec's RTP Payload format.  Examples of
1179	   codecs that support multi-channel audio are PCMA and PCMU [RFC3551],
1180	   AMR [RFC4867], and G.719 [RFC5404].

1182	3.6.  Simulcast

1184	   A Media Source represented as multiple independent Encoded Streams
1185	   constitutes a Simulcast [I-D.ietf-mmusic-sdp-simulcast] or MDC of
1186	   that Media Source.  Figure 8 shows an example of a Media Source that
1187	   is encoded into three separate Simulcast streams, that are in turn
1188	   sent on the same Media Transport flow.  When using Simulcast, the RTP
1189	   Streams may be sharing RTP Session and Media Transport, or be
1190	   separated on different RTP Sessions and Media Transports, or any
1191	   combination of these two.  One major reason to use separate Media
1192	   Transports is to make use of different Quality of Service for the
1193	   different Source RTP Streams.  Some considerations on separating
1194	   related RTP Streams are discussed in Section 3.12.

1196	                            +----------------+
1197	                            |  Media Source  |
1198	                            +----------------+
1199	                     Source Stream  |
1200	             +----------------------+----------------------+
1201	             |                      |                      |
1202	             V                      V                      V
1203	    +------------------+   +------------------+   +------------------+
1204	    |  Media Encoder   |   |  Media Encoder   |   |  Media Encoder   |
1205	    +------------------+   +------------------+   +------------------+
1206	             | Encoded              | Encoded              | Encoded
1207	             | Stream               | Stream               | Stream
1208	             V                      V                      V
1209	    +------------------+   +------------------+   +------------------+
1210	    | Media Packetizer |   | Media Packetizer |   | Media Packetizer |
1211	    +------------------+   +------------------+   +------------------+
1212	             | Source               | Source               | Source
1213	             | RTP                  | RTP                  | RTP
1214	             | Stream               | Stream               | Stream
1215	             +-----------------+    |    +-----------------+
1216	                               |    |    |
1217	                               V    V    V
1218	                          +-------------------+
1219	                          |  Media Transport  |
1220	                          +-------------------+

1222	                Figure 8: Example of Media Source Simulcast

1224	   The Simulcast relation between the RTP Streams is the common Media
1225	   Source.  In addition, to be able to identify the common Media Source,
1226	   a receiver of the RTP Stream may need to know which configuration or
1227	   encoding goals lay behind the produced Encoded Stream and its
1228	   properties.  This enables selection of the stream that is most useful
1229	   in the application at that moment.
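
   The selection step can be sketched as follows.  This is only an
   illustration: the SimulcastStream fields and the bitrate-budget
   policy are assumptions made for the example and are not defined by
   this document or by any Simulcast signaling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulcastStream:
    ssrc: int          # identifies the RTP Stream
    height: int        # assumed encoding property (video lines)
    bitrate_kbps: int  # assumed encoding goal


def select_stream(streams, budget_kbps):
    """Pick the highest-bitrate Simulcast stream that fits the budget.

    Falls back to the lowest-bitrate stream when none fits.
    """
    if not streams:
        raise ValueError("no Simulcast streams to choose from")
    affordable = [s for s in streams if s.bitrate_kbps <= budget_kbps]
    if affordable:
        return max(affordable, key=lambda s: s.bitrate_kbps)
    return min(streams, key=lambda s: s.bitrate_kbps)
```

   All candidates passed to select_stream are assumed to share one
   Media Source; the function only compares their encoding properties.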

1231	3.7.  Layered Multi-Stream

1233	   Layered Multi-Stream (LMS) is a mechanism by which different portions
1234	   of a layered or scalable encoding of a Source Stream are sent using
1235	   separate RTP Streams (sometimes in separate RTP Sessions).  LMSs are
1236	   useful for receiver control of layered media.

1238	   A Media Source represented as an Encoded Stream and multiple
1239	   Dependent Streams constitutes a Media Source that has layered
1240	   dependencies.  Figure 9 represents an example of a Media Source that
1241	   is encoded into three dependent layers, where two layers are sent on
1242	   the same Media Transport using different RTP Streams, i.e. SSRCs, and
1243	   the third layer is sent on a separate Media Transport.

1245	                            +----------------+
1246	                            |  Media Source  |
1247	                            +----------------+
1248	                                    |
1249	                                    |
1250	                                    V
1251	       +---------------------------------------------------------+
1252	       |                      Media Encoder                      |
1253	       +---------------------------------------------------------+
1254	               |                    |                     |
1255	        Encoded Stream       Dependent Stream     Dependent Stream
1256	               |                    |                     |
1257	               V                    V                     V
1258	       +----------------+   +----------------+   +----------------+
1259	       |Media Packetizer|   |Media Packetizer|   |Media Packetizer|
1260	       +----------------+   +----------------+   +----------------+
1261	               |                    |                     |
1262	          RTP Stream           RTP Stream            RTP Stream
1263	               |                    |                     |
1264	               +------+      +------+                     |
1265	                      |      |                            |
1266	                      V      V                            V
1267	                +-----------------+              +-----------------+
1268	                | Media Transport |              | Media Transport |
1269	                +-----------------+              +-----------------+

1271	           Figure 9: Example of Media Source Layered Dependency

1273	   It is sometimes useful to make a distinction between using a single
1274	   Media Transport or multiple separate Media Transports when (in both
1275	   cases) using multiple RTP Streams to carry Encoded Streams and
1276	   Dependent Streams for a Media Source.  Therefore, the following new
1277	   terminology is defined here:

1279	   SRST:  Single RTP Stream on a Single Media Transport

1281	   MRST:  Multiple RTP Streams on a Single Media Transport

1283	   MRMT:  Multiple RTP Streams on Multiple Media Transports

1285	   MRST and MRMT relations need to identify the common Media Encoder
1286	   origin for the Encoded and Dependent Streams.  When using different
1287	   RTP Sessions (MRMT), a single RTP Stream per Media Encoder, and a
1288	   single Media Source in each RTP Session, common SSRCs and CNAMEs can
1289	   be used to identify the common Media Source.  When multiple RTP
1290	   Streams are sent from one Media Encoder in the same RTP Session
1291	   (MRST), then CNAME is the only currently specified RTP identifier
1292	   that can be used.  In cases where multiple Media Encoders use
1293	   multiple Media Sources sharing Synchronization Context, and thus
1294	   having a common CNAME, additional heuristics or identification need
1295	   to be applied to create the MRST or MRMT relationships between the
1296	   RTP Streams.
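
   The three terms can be illustrated with a small classifier.  This is
   a sketch under stated assumptions: the RtpStream type and its
   "transport" label are hypothetical, and all streams passed in are
   assumed to originate from the same Media Encoder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RtpStream:
    ssrc: int
    transport: str  # hypothetical label naming a Media Transport


def classify(streams):
    """Return 'SRST', 'MRST' or 'MRMT' for one Media Encoder's streams."""
    if not streams:
        raise ValueError("at least one RTP Stream is required")
    if len(streams) == 1:
        return "SRST"  # Single RTP Stream on a Single Media Transport
    transports = {s.transport for s in streams}
    if len(transports) == 1:
        return "MRST"  # Multiple RTP Streams on a Single Media Transport
    return "MRMT"      # Multiple RTP Streams on Multiple Media Transports
```

   Applied to the streams of Figure 9 as a whole, the result would be
   MRMT, since the three RTP Streams are spread over two Media
   Transports.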

1298	3.8.  RTP Stream Duplication

1300	   RTP Stream Duplication [RFC7198], using the same or different Media
1301	   Transports, and optionally also delaying the duplicate [RFC7197],
1302	   offers a simple way to protect media flows from packet loss in some
1303	   cases (see Figure 10).  This is a specific type of redundancy.  All
1304	   but one Source RTP Stream (Section 2.1.10) are effectively Redundancy
1305	   RTP Streams (Section 2.1.12), but since both Source and Redundancy
1306	   RTP Streams are the same, it does not matter which one is which.  This
1307	   can also be seen as a specific type of Simulcast (Section 3.6) that
1308	   transmits the same Encoded Stream (Section 2.1.7) multiple times.

1310	                            +----------------+
1311	                            |  Media Source  |
1312	                            +----------------+
1313	                     Source Stream  |
1314	                                    V
1315	                            +----------------+
1316	                            | Media Encoder  |
1317	                            +----------------+
1318	                    Encoded Stream  |
1319	                        +-----------+-----------+
1320	                        |                       |
1321	                        V                       V
1322	               +------------------+    +------------------+
1323	               | Media Packetizer |    | Media Packetizer |
1324	               +------------------+    +------------------+
1325	                 Source | RTP Stream     Source | RTP Stream
1326	                        |                       V
1327	                        |                +-------------+
1328	                        |                | Delay (opt) |
1329	                        |                +-------------+
1330	                        |                       |
1331	                        +-----------+-----------+
1332	                                    |
1333	                                    V
1334	                          +-------------------+
1335	                          |  Media Transport  |
1336	                          +-------------------+

1338	               Figure 10: Example of RTP Stream Duplication

1340	3.9.  Redundancy Format

1342	   The RTP Payload for Redundant Audio Data [RFC2198] defines a
1343	   transport for redundant audio data together with primary data in the
1344	   same RTP payload.  The redundant data can be a time delayed version
1345	   of the primary or another time delayed Encoded Stream using a
1346	   different Media Encoder to encode the same Media Source as the
1347	   primary, as depicted in Figure 11.

1349	              +--------------------+
1350	              |    Media Source    |
1351	              +--------------------+
1352	                        |
1353	                   Source Stream
1354	                        |
1355	                        +------------------------+
1356	                        |                        |
1357	                        V                        V
1358	              +--------------------+   +--------------------+
1359	              |   Media Encoder    |   |   Media Encoder    |
1360	              +--------------------+   +--------------------+
1361	                        |                        |
1362	                        |                 +------------+
1363	                  Encoded Stream          | Time Delay |
1364	                        |                 +------------+
1365	                        |                        |
1366	                        |     +------------------+
1367	                        V     V
1368	              +--------------------+
1369	              |  Media Packetizer  |
1370	              +--------------------+
1371	                        |
1372	                        V
1373	                   RTP Stream

1375	   Figure 11: Concept for usage of Audio Redundancy with different Media
1376	                                 Encoders

1378	   The Redundancy format is thus providing the necessary meta
1379	   information to correctly relate different parts of the same Encoded
1380	   Stream.  The case depicted above (Figure 11) relates the Received
1381	   Source Stream fragments coming out of different Media Decoders, to be
1382	   able to combine them together into a less erroneous Source Stream.

1384	3.10.  RTP Retransmission

1386	   Figure 12 shows an example where a Media Source's Source RTP Stream
1387	   is protected by a retransmission (RTX) flow [RFC4588].  In this
1388	   example the Source RTP Stream and the Redundancy RTP Stream share the
1389	   same Media Transport.

1391	          +--------------------+
1392	          |    Media Source    |
1393	          +--------------------+
1394	                    |
1395	                    V
1396	          +--------------------+
1397	          |   Media Encoder    |
1398	          +--------------------+
1399	                    |                              Retransmission
1400	              Encoded Stream     +--------+     +---- Request
1401	                    V            |        V     V
1402	          +--------------------+ | +--------------------+
1403	          |  Media Packetizer  | | | RTP Retransmission |
1404	          +--------------------+ | +--------------------+
1405	                    |            |           |
1406	                    +------------+  Redundancy RTP Stream
1407	             Source RTP Stream               |
1408	                    |                        |
1409	                    +---------+    +---------+
1410	                              |    |
1411	                              V    V
1412	                       +-----------------+
1413	                       | Media Transport |
1414	                       +-----------------+

1416	          Figure 12: Example of Media Source Retransmission Flows

1418	   The RTP Retransmission example (Figure 12) illustrates that this
1419	   mechanism works purely on the Source RTP Stream.  The RTP
1420	   Retransmission transform buffers the sent Source RTP Stream and, upon
1421	   request, emits a retransmitted packet with an extra payload header as
1422	   a Redundancy RTP Stream.  The RTP Retransmission mechanism [RFC4588]
1423	   is specified such that there is a one to one relation between the
1424	   Source RTP Stream and the Redundancy RTP Stream.  Therefore, a
1425	   Redundancy RTP Stream needs to be associated with its Source RTP
1426	   Stream.  This is done based on CNAME selectors and heuristics to
1427	   match requested packets for a given Source RTP Stream with the
1428	   original sequence number in the payload of any new Redundancy RTP
1429	   Stream using the RTX payload format.  In cases where the Redundancy
1430	   RTP Stream is sent in a different RTP Session than the Source RTP
1431	   Stream, the RTP Session relation is signaled by using the SDP Media
1432	   Grouping's [RFC5888] Flow Identification (FID identification-tag)
1433	   semantics.
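
   The matching heuristic can be sketched as below.  The RTX payload
   format [RFC4588] places a 2-octet original sequence number (OSN) in
   front of the original payload; the helper names and the
   "pending_requests" bookkeeping are assumptions made for this example
   only.

```python
import struct


def make_rtx_payload(original_seq: int, original_payload: bytes) -> bytes:
    """Prefix the original payload with the 2-octet OSN field."""
    return struct.pack("!H", original_seq & 0xFFFF) + original_payload


def parse_rtx_payload(rtx_payload: bytes):
    """Split an RTX payload into (osn, original_payload)."""
    if len(rtx_payload) < 2:
        raise ValueError("RTX payload too short to carry an OSN")
    (osn,) = struct.unpack("!H", rtx_payload[:2])
    return osn, rtx_payload[2:]


def match_source(pending_requests, osn):
    """Return the Source RTP Stream SSRCs that have requested this OSN.

    pending_requests maps a source SSRC to the set of sequence numbers
    for which a retransmission request is outstanding.
    """
    return [ssrc for ssrc, seqs in pending_requests.items() if osn in seqs]
```

   A receiver could bind a new Redundancy RTP Stream to a Source RTP
   Stream only when match_source returns exactly one candidate; an
   ambiguous result means the association has to wait for a later
   packet.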

1435	3.11.  Forward Error Correction

1437	   Figure 13 shows an example where two Media Sources' Source RTP
1438	   Streams are protected by Forward Error Correction (FEC).  Source RTP
1439	   Stream A has an RTP-based Redundancy transformation in FEC Encoder 1.
1440	   This produces a Redundancy RTP Stream 1, which is only related to
1441	   Source RTP Stream A.  The FEC Encoder 2, however, takes two Source
1442	   RTP Streams (A and B) and produces a Redundancy RTP Stream 2 that
1443	   protects them jointly, i.e. Redundancy RTP Stream 2 relates to two
1444	   Source RTP Streams (an FEC group).  FEC decoding, when needed due to
1445	   packet loss or packet corruption at the receiver, requires knowledge
1446	   about which Source RTP Streams the FEC encoding was based on.

1448	   In Figure 13 all RTP Streams are sent on the same Media Transport.
1449	   This is however not the only possible choice.  Numerous combinations
1450	   exist for spreading these RTP Streams over different Media Transports
1451	   to achieve the communication application's goal.

1453	       +--------------------+                +--------------------+
1454	       |   Media Source A   |                |   Media Source B   |
1455	       +--------------------+                +--------------------+
1456	                 |                                     |
1457	                 V                                     V
1458	       +--------------------+                +--------------------+
1459	       |   Media Encoder A  |                |   Media Encoder B  |
1460	       +--------------------+                +--------------------+
1461	                 |                                     |
1462	           Encoded Stream                        Encoded Stream
1463	                 V                                     V
1464	       +--------------------+                +--------------------+
1465	       | Media Packetizer A |                | Media Packetizer B |
1466	       +--------------------+                +--------------------+
1467	                 |                                     |
1468	        Source RTP Stream A                   Source RTP Stream B
1469	                 |                                     |
1470	           +-----+---------+-------------+         +---+---+
1471	           |               V             V         V       |
1472	           |       +---------------+  +---------------+    |
1473	           |       | FEC Encoder 1 |  | FEC Encoder 2 |    |
1474	           |       +---------------+  +---------------+    |
1475	           |  Redundancy   |     Redundancy   |            |
1476	           |  RTP Stream 1 |     RTP Stream 2 |            |
1477	           V               V                  V            V
1478	       +----------------------------------------------------------+
1479	       |                    Media Transport                       |
1480	       +----------------------------------------------------------+

1482	             Figure 13: Example of FEC Redundancy RTP Streams

1484	   As FEC Encoding exists in various forms, the methods for relating FEC
1485	   Redundancy RTP Streams with their source information in Source RTP
1486	   Streams are many.  The XOR based RTP FEC Payload format [RFC5109] is
1487	   defined in such a way that a Redundancy RTP Stream has a one to one
1488	   relation with a Source RTP Stream.  In fact, the RFC requires the
1489	   Redundancy RTP Stream to use the same SSRC as the Source RTP Stream.
1490	   This requires the use of either a separate RTP Session, or the
1491	   Redundancy RTP Payload format [RFC2198].  The underlying relation
1492	   requirement for this FEC format and a particular Redundancy RTP
1493	   Stream is to know the related Source RTP Stream, including its SSRC.
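
   The principle behind XOR-based protection can be sketched as follows.
   This is a simplified illustration, not the [RFC5109] packet format:
   it protects payload octets only, and the length of the lost payload
   is passed in as an assumption (the real format also protects the
   length and header fields).

```python
def xor_parity(payloads):
    """XOR all payloads together, zero-padding shorter ones."""
    if not payloads:
        raise ValueError("nothing to protect")
    parity = bytearray(max(len(p) for p in payloads))
    for payload in payloads:
        for i, octet in enumerate(payload):
            parity[i] ^= octet
    return bytes(parity)


def recover(parity, received, lost_length):
    """Rebuild the single missing payload from parity and the others."""
    rebuilt = bytearray(parity)
    for payload in received:
        for i, octet in enumerate(payload):
            rebuilt[i] ^= octet
    return bytes(rebuilt[:lost_length])
```

   Recovery only works when the receiver knows exactly which source
   packets went into the parity, which is why the relation between a
   Redundancy RTP Stream and its Source RTP Stream has to be known.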

1495	3.12.  RTP Stream Separation

1497	   RTP Streams can be separated exclusively based on their SSRCs, at the
1498	   RTP Session level, or at the Multimedia Session level.

1500	   When the RTP Streams that have a relationship are all sent in the
1501	   same RTP Session and are uniquely identified based on their SSRC
1502	   only, it is termed an SSRC-Only Based Separation.  Such streams can
1503	   be related via RTCP CNAME to identify that the streams belong to the
1504	   same Endpoint.  SSRC-based approaches [RFC5576], when used, can
1505	   explicitly relate various such RTP Streams.

1507	   On the other hand, when RTP Streams that are related are sent in the
1508	   context of different RTP Sessions to achieve separation, it is known
1509	   as RTP Session-based separation.  This is commonly used when the
1510	   different RTP Streams are intended for different Media Transports.

1512	   Several mechanisms that use RTP Session-based separation rely on it
1513	   to enable an implicit grouping mechanism expressing the relationship.
1514	   The solutions have been based on using the same SSRC value in the
1515	   different RTP Sessions to implicitly indicate their relation.  That
1516	   way, no explicit RTP level mechanism has been needed, only signaling
1517	   level relations have been established using semantics from Grouping
1518	   of Media lines framework [RFC5888].  Examples of this are RTP
1519	   Retransmission [RFC4588], SVC Multi-Session Transmission [RFC6190]
1520	   and XOR Based FEC [RFC5109].  RTCP CNAME explicitly relates RTP
1521	   Streams across different RTP Sessions, as explained in the previous
1522	   section.  Such a relationship can be used to perform inter-media
1523	   synchronization.

1525	   RTP Streams that are related and need to be associated can be part of
1526	   different Multimedia Sessions, rather than just different RTP
1527	   Sessions within the same Multimedia Session context.  This puts
1528	   further demand on the scope of the mechanism(s) and its handling of
1529	   identifiers used for expressing the relationships.

1531	3.13.  Multiple RTP Sessions over one Media Transport

1533	   [I-D.westerlund-avtcore-transport-multiplexing] describes a mechanism
1534	   that allows several RTP Sessions to be carried over a single
1535	   underlying Media Transport.  The main reasons for doing this are
1536	   related to the impact of using one or more Media Transports (using a
1537	   common network path or potentially having different ones).  The
1538	   fewer Media Transports used, the less need there is for NAT/FW
1539	   traversal resources and flow-based Quality of Service (QoS) handling.

1541	   However, Multiple RTP Sessions over one Media Transport imply that a
1542	   single Media Transport 5-tuple is not sufficient to express in which
1543	   RTP Session context a particular RTP Stream exists.  Complexities in
1544	   the relationship between Media Transports and RTP Sessions already
1545	   exist, as one RTP Session contains multiple Media Transports, e.g.
1546	   even a Peer-to-Peer RTP Session with RTP/RTCP Multiplexing requires
1547	   two Media Transports, one in each direction.  The relationship
1548	   between Media Transports and RTP Sessions as well as additional
1549	   levels of identifiers need to be considered in both signaling design
1550	   and when defining terminology.

1552	4.  Mapping from Existing Terms

1554	   This section describes a selected set of terms from some relevant
1555	   IETF RFCs and Internet-Drafts (at the time of writing), using the
1556	   concepts from previous sections.

1558	4.1.  Telepresence Terms

1560	   The terms in this sub-section are used in the context of CLUE
1561	   [I-D.ietf-clue-framework].  Note that some terms listed in this sub-
1562	   section use the same names as terms defined elsewhere in this
1563	   document.  Unless explicitly marked as "RTP Taxonomy", such terms in
1564	   this sub-section are to be read as references to the CLUE-specific
1565	   term.

1567	4.1.1.  Audio Capture

1569	   Defined in CLUE as a Media Capture (Section 4.1.7) for audio.
1570	   Describes an audio Media Source (Section 2.1.4).

1572	4.1.2.  Capture Device

1574	   Defined in CLUE as a device that converts physical input into an
1575	   electrical signal.  Identifies a physical entity performing an RTP
1576	   Taxonomy Media Capture (Section 2.1.2) transformation.

1578	4.1.3.  Capture Encoding

1580	   Defined in CLUE as a specific encoding (Section 4.1.6) of a Media
1581	   Capture (Section 4.1.7).  Describes an Encoded Stream (Section 2.1.7)
1582	   related to CLUE specific semantic information.

1584	4.1.4.  Capture Scene

1586	   Defined in CLUE as a structure representing a spatial region captured
1587	   by one or more Capture Devices (Section 4.1.2), each capturing media
1588	   representing a portion of the region.  Describes a set of spatially
1589	   related Media Sources (Section 2.1.4).

1591	4.1.5.  Endpoint

1593	   Defined in CLUE as a CLUE-capable device which is the logical point
1594	   of final termination through receiving, decoding and rendering and/or
1595	   initiation through capturing, encoding, and sending of media streams
1596	   (Section 4.1.10).  CLUE further defines it to consist of one or more
1597	   physical devices with source and sink media streams, and exactly one
1598	   [RFC4353] Participant.  Describes exactly one Participant
1599	   (Section 2.2.3) and one or more RTP Taxonomy Endpoints
1600	   (Section 2.2.1).

1602	4.1.6.  Individual Encoding

1604	   Defined in CLUE as a set of parameters representing a way to encode a
1605	   Media Capture (Section 4.1.7) to become a Capture Encoding
1606	   (Section 4.1.3).  Describes the configuration information needed to
1607	   perform a Media Encoder (Section 2.1.6) transformation.

1609	4.1.7.  Media Capture

1611	   Defined in CLUE as a source of media, such as from one or more
1612	   Capture Devices (Section 4.1.2) or constructed from other media
1613	   streams (Section 4.1.10).  Describes either an RTP Taxonomy Media
1614	   Capture (Section 2.1.2) or a Media Source (Section 2.1.4), depending
1615	   on the context in which the term is used.

1617	4.1.8.  Media Consumer

1619	   Defined in CLUE as a CLUE-capable device that intends to receive
1620	   Capture Encodings (Section 4.1.3).  Describes the media receiving
1621	   part of an RTP Taxonomy Endpoint (Section 2.2.1).

1623	4.1.9.  Media Provider

1625	   Defined in CLUE as a CLUE-capable device that intends to send Capture
1626	   Encodings (Section 4.1.3).  Describes the media sending part of an
1627	   RTP Taxonomy Endpoint (Section 2.2.1).

1629	4.1.10.  Stream

1631	   Defined in CLUE as a Capture Encoding (Section 4.1.3) sent from a
1632	   Media Provider (Section 4.1.9) to a Media Consumer (Section 4.1.8)
1633	   via RTP.  Describes an RTP Stream (Section 2.1.10).

1635	4.1.11.  Video Capture

1637	   Defined in CLUE as a Media Capture (Section 4.1.7) for video.
1638	   Describes a video Media Source (Section 2.1.4).

1640	4.2.  Media Description

1642	   A single Session Description Protocol (SDP) [RFC4566] media
1643	   description (or media block; an m-line and all subsequent lines until
1644	   the next m-line or the end of the SDP) describes part of the
1645	   necessary configuration and identification information needed for a
1646	   Media Encoder transformation, as well as the necessary configuration
1647	   and identification information for the Media Decoder to be able to
1648	   correctly interpret a received RTP Stream.

1650	   A Media Description typically relates to a single Media Source.  This
1651	   is for example an explicit restriction in WebRTC.  However, nothing
1652	   prevents the same Media Description (and same RTP Session) from being
1653	   re-used for multiple Media Sources
1654	   [I-D.ietf-avtcore-rtp-multi-stream].  It can thus describe properties
1655	   of one or more RTP Streams, and can also describe properties valid
1656	   for an entire RTP Session (via [RFC5576] mechanisms, for example).
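
   The "m-line and all subsequent lines" rule can be sketched as a
   simple splitter.  The function name is hypothetical, and the sketch
   assumes well-formed SDP text; it performs no validation of the
   individual lines.

```python
def split_media_descriptions(sdp: str):
    """Split SDP text into (session_level_lines, media_descriptions).

    Each media description is the list of lines from one "m=" line up
    to, but not including, the next "m=" line or the end of the SDP.
    """
    session_lines = []
    media_descriptions = []
    current = session_lines
    for raw in sdp.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("m="):
            current = [line]
            media_descriptions.append(current)
        else:
            current.append(line)
    return session_lines, media_descriptions
```

   Each returned media description then carries the configuration that
   applies to one or more RTP Streams, as described above.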

1658	4.3.  Media Stream

1660	   RTP [RFC3550] uses media stream, audio stream, video stream, and
1661	   stream of (RTP) packets interchangeably, which are all RTP Streams.

1663	4.4.  Multimedia Conference

1665	   A Multimedia Conference is a Communication Session (Section 2.2.5)
1666	   between two or more Participants (Section 2.2.3), along with the
1667	   software they are using to communicate.

1669	4.5.  Multimedia Session

1671	   SDP [RFC4566] defines a Multimedia Session as a set of multimedia
1672	   senders and receivers and the data streams flowing from senders to
1673	   receivers, which would correspond to a set of Endpoints and the RTP
1674	   Streams that flow between them.  In this document, Multimedia Session
1675	   (Section 2.2.4) also assumes those Endpoints belong to a set of
1676	   Participants that are engaged in communication via a set of related
1677	   RTP Streams.

1679	   RTP [RFC3550] defines a Multimedia Session as a set of concurrent RTP
1680	   Sessions among a common group of Participants.  For example, a video
1681	   conference may contain an audio RTP Session and a video RTP Session.
1682	   This would correspond to a group of Participants (each using one or
1683	   more Endpoints) sharing a set of concurrent RTP Sessions.  In this
1684	   document, Multimedia Session also defines those RTP Sessions to have
1685	   some relation and be part of a communication among the Participants.

1687	4.6.  Multipoint Control Unit (MCU)

1689	   This term is commonly used to describe the central node in any type
1690	   of star topology [I-D.ietf-avtcore-rtp-topologies-update] conference.
1691	   It describes a device that includes one Participant (Section 2.2.3)
1692	   (usually corresponding to a so-called conference focus) and one or
1693	   more related Endpoints (Section 2.2.1) (sometimes one or more per
1694	   conference Participant).

1696	4.7.  Multi-Session Transmission (MST)

1698	   One of two transmission modes defined in H.264 based SVC [RFC6190],
1699	   the other mode being SST (Section 4.13).  In Multi-Session
1700	   Transmission (MST), the SVC Media Encoder sends Encoded Streams and
1701	   Dependent Streams distributed across two or more RTP Streams in one
1702	   or more RTP Sessions.  The term "MST" is ambiguous in RFC 6190,
1703	   especially since the name indicates the use of multiple "sessions",
1704	   while MST type packetization is in fact required whenever two or more
1705	   RTP Streams are used for the Encoded and Dependent Streams,
1706	   regardless of whether those are sent in one or more RTP Sessions.
1707	   Corresponds either to MRST or MRMT (Section 3.7) stream relations
1708	   defined in this document.  The SVC RTP Payload RFC [RFC6190] is not
1709	   particularly explicit about how the common Media Encoder
1710	   (Section 2.1.6) relation between Encoded Streams (Section 2.1.7) and
1711	   Dependent Streams (Section 2.1.8) is to be implemented.

1713	4.8.  Recording Device

1715	   WebRTC specifications use this term to refer to locally available
1716	   entities performing a Media Capture (Section 2.1.2) transformation.

1718	4.9.  RtcMediaStream

1720	   A WebRTC RtcMediaStream is a set of Media Sources (Section 2.1.4)
1721	   sharing the same Synchronization Context (Section 3.1).

1723	4.10.  RtcMediaStreamTrack

1725	   A WebRTC RtcMediaStreamTrack is a Media Source (Section 2.1.4).

1727	4.11.  RTP Sender

1729	   RTP [RFC3550] uses this term, which can be seen as the RTP protocol
1730	   part of a Media Packetizer (Section 2.1.9).

1732	4.12.  RTP Session

1734	   Within the context of SDP, a single m= line can map to a single RTP
1735	   Session (Section 2.2.2) or multiple m= lines can map to a single RTP
1736	   Session.  The latter is enabled via multiplexing schemes such as
1737	   BUNDLE [I-D.ietf-mmusic-sdp-bundle-negotiation], for example, which
1738	   allows mapping of multiple m= lines to a single RTP Session.

1740	4.13.  Single Session Transmission (SST)

1742	   One of two transmission modes defined in H.264 based SVC [RFC6190],
1743	   the other mode being MST (Section 4.7).  In Single Session
1744	   Transmission (SST), the SVC Media Encoder sends Encoded Streams
1745	   (Section 2.1.7) and Dependent Streams (Section 2.1.8) combined into a
1746	   single RTP Stream (Section 2.1.10) in a single RTP Session
1747	   (Section 2.2.2), using the SVC RTP Payload format.  The term "SST" is
1748	   ambiguous in RFC 6190, in that it sometimes refers to the use of a
1749	   single RTP Stream, like in sections relating to packetization, and
1750	   sometimes appears to refer to use of a single RTP Session, like in
1751	   the context of discussing SDP.  Closely corresponds to SRST
1752	   (Section 3.7) defined in this document.

1754	4.14.  SSRC

1756	   RTP [RFC3550] defines this as "the source of a stream of RTP
1757	   packets", which indicates that an SSRC is not only a unique
1758	   identifier for the Encoded Stream (Section 2.1.7) carried in those
1759	   packets, but is also effectively used as a term to denote a Media
1760	   Packetizer (Section 2.1.9).  In [RFC3550], it is stated that "a
1761	   synchronization source may change its data format, e.g., audio
1762	   encoding, over time".  The related Encoded Stream data format in an
1763	   RTP Stream (Section 2.1.10) is identified by the RTP Payload Type.
1764	   Changing data format for an Encoded Stream effectively also changes
1765	   which Media Encoder (Section 2.1.6) is used for the Encoded
1766	   Stream.  No ambiguity is introduced to SSRC as Encoded Stream
1767	   identifier by allowing RTP Payload Type changes, as long as only a
1768	   single RTP Payload Type is valid for any given RTP Time Stamp.  This
1769	   is aligned with and further described by Section 5.2 of [RFC3550].
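
   The condition "a single RTP Payload Type for any given RTP Time
   Stamp" can be checked with a sketch like the one below.  The input
   shape, an iterable of (ssrc, timestamp, payload_type) tuples, is an
   assumption made for this example.

```python
def payload_type_conflicts(packets):
    """Return the (ssrc, timestamp) pairs seen with more than one PT.

    packets is an iterable of (ssrc, timestamp, payload_type) tuples.
    """
    first_seen = {}
    conflicts = set()
    for ssrc, timestamp, payload_type in packets:
        key = (ssrc, timestamp)
        if first_seen.setdefault(key, payload_type) != payload_type:
            conflicts.add(key)
    return conflicts
```

   An empty result means the SSRC can still serve as an unambiguous
   Encoded Stream identifier even though the Payload Type changed over
   time.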

1771	5.  Security Considerations

1773	   The purpose of this document is to make clarifications and reduce the
1774	   confusion prevalent in RTP taxonomy because of inconsistent usage by
1775	   multiple technologies and protocols making use of the RTP protocol.
1776	   It does not introduce any new security considerations beyond those
1777	   already well documented in the RTP protocol [RFC3550] and each of the
1778	   many respective specifications of the various protocols making use of
1779	   it.

1781	   Having a well-defined common terminology and understanding of the
1782	   complexities of the RTP architecture will help lead us to better
1783	   standards, avoiding security problems.

1785	6.  Acknowledgement

1787	   This document has many concepts borrowed from several documents such
1788	   as WebRTC [I-D.ietf-rtcweb-overview], CLUE [I-D.ietf-clue-framework],
1789	   and Multiplexing Architecture
1790	   [I-D.westerlund-avtcore-transport-multiplexing].  The authors would
1791	   like to thank all the authors of each of those documents.

1793	   The authors would also like to acknowledge the insights, guidance and
1794	   contributions of Magnus Westerlund, Roni Even, Paul Kyzivat, Colin
1795	   Perkins, Keith Drage, Harald Alvestrand, Alex Eleftheriadis, Mo
1796	   Zanaty, Stephan Wenger, and Bernard Aboba.

1798	7.  Contributors

1800	   Magnus Westerlund contributed the concept model for the media chain,
1801	   based on transformations and streams, including rewriting
1802	   pre-existing concepts into this model and adding missing concepts.
1803	   The first proposal for updating the relationships and the topologies
1804	   based on this concept was also made by Magnus.

1806	8.  IANA Considerations

1808	   This document makes no request of IANA.

1810	9.  Informative References

1812	   [I-D.ietf-avtcore-rtp-multi-stream]
1813	              Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
1814	              "Sending Multiple Media Streams in a Single RTP Session",
1815	              draft-ietf-avtcore-rtp-multi-stream-08 (work in progress),
1816	              July 2015.

1818	   [I-D.ietf-avtcore-rtp-topologies-update]
1819	              Westerlund, M. and S. Wenger, "RTP Topologies", draft-
1820	              ietf-avtcore-rtp-topologies-update-10 (work in progress),
1821	              July 2015.

1823	   [I-D.ietf-clue-framework]
1824	              Duckworth, M., Pepperell, A., and S. Wenger, "Framework
1825	              for Telepresence Multi-Streams", draft-ietf-clue-
1826	              framework-22 (work in progress), April 2015.

1828	   [I-D.ietf-mmusic-sdp-bundle-negotiation]
1829	              Holmberg, C., Alvestrand, H., and C. Jennings,
1830	              "Negotiating Media Multiplexing Using the Session
1831	              Description Protocol (SDP)", draft-ietf-mmusic-sdp-bundle-
1832	              negotiation-23 (work in progress), July 2015.

1834	   [I-D.ietf-mmusic-sdp-simulcast]
1835	              Burman, B., Westerlund, M., Nandakumar, S., and M. Zanaty,
1836	              "Using Simulcast in SDP and RTP Sessions", draft-ietf-
1837	              mmusic-sdp-simulcast-00 (work in progress), January 2015.

1839	   [I-D.ietf-rtcweb-overview]
1840	              Alvestrand, H., "Overview: Real Time Protocols for
1841	              Browser-based Applications", draft-ietf-rtcweb-overview-14
1842	              (work in progress), June 2015.

1844	   [I-D.westerlund-avtcore-transport-multiplexing]
1845	              Westerlund, M. and C. Perkins, "Multiplexing Multiple RTP
1846	              Sessions onto a Single Lower-Layer Transport", draft-
1847	              westerlund-avtcore-transport-multiplexing-07 (work in
1848	              progress), October 2013.

1850	   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
1851	              Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
1852	              Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
1853	              DOI 10.17487/RFC2198, September 1997,
1854	              <http://www.rfc-editor.org/info/rfc2198>.

1856	   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
1857	              Jacobson, "RTP: A Transport Protocol for Real-Time
1858	              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
1859	              July 2003, <http://www.rfc-editor.org/info/rfc3550>.

1861	   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
1862	              Video Conferences with Minimal Control", STD 65, RFC 3551,
1863	              DOI 10.17487/RFC3551, July 2003,
1864	              <http://www.rfc-editor.org/info/rfc3551>.

1866	   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
1867	              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
1868	              RFC 3711, DOI 10.17487/RFC3711, March 2004,
1869	              <http://www.rfc-editor.org/info/rfc3711>.

1871	   [RFC4353]  Rosenberg, J., "A Framework for Conferencing with the
1872	              Session Initiation Protocol (SIP)", RFC 4353,
1873	              DOI 10.17487/RFC4353, February 2006,
1874	              <http://www.rfc-editor.org/info/rfc4353>.

1876	   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
1877	              Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
1878	              July 2006, <http://www.rfc-editor.org/info/rfc4566>.

1880	   [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
1881	              Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
1882	              DOI 10.17487/RFC4588, July 2006,
1883	              <http://www.rfc-editor.org/info/rfc4588>.

1885	   [RFC4867]  Sjoberg, J., Westerlund, M., Lakaniemi, A., and Q. Xie,
1886	              "RTP Payload Format and File Storage Format for the
1887	              Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband
1888	              (AMR-WB) Audio Codecs", RFC 4867, DOI 10.17487/RFC4867,
1889	              April 2007, <http://www.rfc-editor.org/info/rfc4867>.

1891	   [RFC5109]  Li, A., Ed., "RTP Payload Format for Generic Forward Error
1892	              Correction", RFC 5109, DOI 10.17487/RFC5109, December
1893	              2007, <http://www.rfc-editor.org/info/rfc5109>.

1895	   [RFC5404]  Westerlund, M. and I. Johansson, "RTP Payload Format for
1896	              G.719", RFC 5404, DOI 10.17487/RFC5404, January 2009,
1897	              <http://www.rfc-editor.org/info/rfc5404>.

1899	   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
1900	              Applicability Statement", RFC 5481, DOI 10.17487/RFC5481,
1901	              March 2009, <http://www.rfc-editor.org/info/rfc5481>.

1903	   [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
1904	              Media Attributes in the Session Description Protocol
1905	              (SDP)", RFC 5576, DOI 10.17487/RFC5576, June 2009,
1906	              <http://www.rfc-editor.org/info/rfc5576>.

1908	   [RFC5888]  Camarillo, G. and H. Schulzrinne, "The Session Description
1909	              Protocol (SDP) Grouping Framework", RFC 5888,
1910	              DOI 10.17487/RFC5888, June 2010,
1911	              <http://www.rfc-editor.org/info/rfc5888>.

1913	   [RFC5905]  Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch,
1914	              "Network Time Protocol Version 4: Protocol and Algorithms
1915	              Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010,
1916	              <http://www.rfc-editor.org/info/rfc5905>.

1918	   [RFC6190]  Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis,
1919	              "RTP Payload Format for Scalable Video Coding", RFC 6190,
1920	              DOI 10.17487/RFC6190, May 2011,
1921	              <http://www.rfc-editor.org/info/rfc6190>.

1923	   [RFC7160]  Petit-Huguenin, M. and G. Zorn, Ed., "Support for Multiple
1924	              Clock Rates in an RTP Session", RFC 7160,
1925	              DOI 10.17487/RFC7160, April 2014,
1926	              <http://www.rfc-editor.org/info/rfc7160>.

1928	   [RFC7197]  Begen, A., Cai, Y., and H. Ou, "Duplication Delay
1929	              Attribute in the Session Description Protocol", RFC 7197,
1930	              DOI 10.17487/RFC7197, April 2014,
1931	              <http://www.rfc-editor.org/info/rfc7197>.

1933	   [RFC7198]  Begen, A. and C. Perkins, "Duplicating RTP Streams",
1934	              RFC 7198, DOI 10.17487/RFC7198, April 2014,
1935	              <http://www.rfc-editor.org/info/rfc7198>.

1937	   [RFC7201]  Westerlund, M. and C. Perkins, "Options for Securing RTP
1938	              Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014,
1939	              <http://www.rfc-editor.org/info/rfc7201>.

1941	   [RFC7273]  Williams, A., Gross, K., van Brandenburg, R., and H.
1942	              Stokking, "RTP Clock Source Signalling", RFC 7273,
1943	              DOI 10.17487/RFC7273, June 2014,
1944	              <http://www.rfc-editor.org/info/rfc7273>.

1946	Appendix A.  Changes From Earlier Versions

1948	   NOTE TO RFC EDITOR: Please remove this section prior to publication.

1950	A.1.  Modifications Between WG Version -07 and -08

1952	   Addresses comments from IESG evaluation.

1954	   o  Made text more firm around what improvements this document
1955	      introduces.

1957	   o  Clarified the distinction between analog and digital in sections
1958	      2.1.1 and 2.1.2.

1960	   o  Removed the explicit requirement that a Source RTP Stream must
1961	      send at least some data from an Encoded Stream, replacing it with
1962	      a statement that it is directly related to the Encoded Stream.

1964	   o  Moved the clarification that RTP-based Redundancy excludes Media
1965	      Encoder redundancy data in an Encoded Stream from Section 2.1.10
1966	      (RTP Stream) to 2.1.11 (RTP-based Redundancy), since that
1967	      statement applies to RTP-based Redundancy rather than to RTP
1968	      Stream.

1970	   o  Added clarification that a Media Transport Sender can
1971	      intentionally pace packet transmission.

1973	   o  Aligned text around delay variation to use this term throughout,
1974	      and added a reference to RFC 5481.

1976	   o  Added that RTP Session is a group communications channel that can
1977	      potentially carry a number of RTP Streams, as an additional
1978	      clarification below Figure 7.

1980	   o  Added a clarification in Section 4.1 around Telepresence Terms on
1981	      which references are to CLUE terms and which are to other sections
1982	      of this document, for terms that have the same name in CLUE as in
1983	      this document.

1985	   o  Clarified in Section 4.14 what an SSRC data format change means,
1986	      since the RFC 3550 SSRC definition mentions this possibility.

1988	   o  Editorial improvements.

1990	A.2.  Modifications Between WG Version -06 and -07

1992	   Addresses comments from AD review and GenArt review.

1994	   o  Added RTP-based Security and RTP-based Validation transform
1995	      sections, as well as Secured RTP Stream and Received Secured RTP
1996	      Stream sections.

1998	   o  Improved wording in Abstract and Introduction sections.

2000	   o  Clarified what is considered "media" in section 2.1.2 Media
2001	      Capture.

2003	   o  Changed a number of "Characteristics" lists to more suitable prose
2004	      text.

2006	   o  Re-worded text around use of Encoded and Dependent RTP Streams in
2007	      section 2.1.9 Media Packetizer.

2009	   o  Clarified description of Source RTP Stream in section 2.1.10.

2011	   o  Clarified motivation to use separate Media Transports for
2012	      Simulcast in section 3.6.

2014	   o  Added local descriptions of terms imported from CLUE framework.

2016	   o  Editorial improvements.

2018	A.3.  Modifications Between WG Version -05 and -06

2020	   o  Clarified that a Redundancy RTP Stream can be used standalone to
2021	      generate Repaired RTP Streams.

2023	   o  Clarified that (in accordance with above) RTP-based Repair takes
2024	      zero or more Received RTP Streams and one or more Received
2025	      Redundancy RTP Streams as input.

2027	   o  Changed Figure 6 to more clearly show that Media Transport is
2028	      terminated in the Endpoint, not in the Participant.

2030	   o  Added a sentence to Endpoint section that clarifies there may be
2031	      contexts where a single "host" can serve multiple Participants,
2032	      making those Endpoints share some properties.

2034	   o  Merged previous section 3.5 on SST/MST with previous section 3.8
2035	      on Layered Multi-Stream into a common section discussing the
2036	      scalable/layered stream relation, and moved improved, descriptive
2037	      text on SST and MST to new sub-sections 4.7 and 4.13, describing
2038	      them as existing terms.

2040	   o  Editorial improvements.

2042	A.4.  Modifications Between WG Version -04 and -05

2044	   o  Editorial improvements.

2046	A.5.  Modifications Between WG Version -03 and -04

2048	   o  Changed "Media Redundancy" and "Media Repair" to "RTP-based
2049	      Redundancy" and "RTP-based Repair", since those terms are more
2050	      specific and correct.

2052	   o  Changed "End Point" to "Endpoint" and removed Editor's Note on
2053	      this.

2055	   o  Clarified that a Media Capture may impose constraints on clock
2056	      handling.

2058	   o  Clarified that mixing multiple Raw Streams into a Source Stream is
2059	      not possible, since that requires mixed streams to have a timing
2060	      relation, requiring them to be Source Streams, and added an
2061	      example.

2063	   o  Clarified that RTP-based Redundancy excludes the type of encoding
2064	      redundancy found within the encoded media format in an Encoded
2065	      Stream.

2067	   o  Clarified that a Media Transport contains only a single RTP
2068	      Session, but a single RTP Session can span multiple Media
2069	      Transports.

2071	   o  Clarified that packets with seemingly correct checksum that are
2072	      received by a Media Transport Receiver may still be corrupt.

2074	   o  Clarified that a corrupt packet in a Media Transport Receiver is
2075	      typically either discarded or somehow marked and passed on in the
2076	      Received RTP Stream.

2078	   o  Added Synchronization Context to Figure 6.

2080	   o  Editorial improvements and clarifications.

2082	A.6.  Modifications Between WG Version -02 and -03

2084	   o  Changed section 3.5, removing SST-SS/MS and MST-SS/MS, replacing
2085	      them with SRST, MRST, and MRMT.

2087	   o  Updated section 3.8 to align with terminology changes in section
2088	      3.5.

2090	   o  Added a new section 4.12, describing the term Multimedia
2091	      Conference.

2093	   o  Changed reference from I-D to now published RFC 7273.

2095	   o  Editorial improvements and clarifications.

2097	A.7.  Modifications Between WG Version -01 and -02

2099	   o  Major re-structure

2101	   o  Moved media chain Media Transport detailing up one section level

2103	   o  Collapsed level 2 sub-sections of section 3 and thus moved level 3
2104	      sub-sections up one level, gathering some introductory text into
2105	      the beginning of section 3

2107	   o  Added that not only SSRC collision, but also a clock rate change
2108	      [RFC7160] is a valid reason to change SSRC value for an RTP stream

2110	   o  Added a sub-section on clock source signaling

2112	   o  Added a sub-section on RTP stream duplication

2114	   o  Elaborated a bit in section 2.2.1 on the relation between End
2115	      Points, Participants and CNAMEs

2117	   o  Elaborated a bit in section 2.2.4 on Multimedia Session and
2118	      synchronization contexts

2120	   o  Removed the section on CLUE scenes defining an implicit
2121	      synchronization context, since it was incorrect

2123	   o  Clarified text on SVC SST and MST according to list discussions

2125	   o  Removed the entire topology section to avoid possible
2126	      inconsistencies or duplications with draft-ietf-avtcore-rtp-
2127	      topologies-update, but saved one example overview figure of
2128	      Communication Entities into that section

2130	   o  Added a section 4 on mapping from existing terms with one sub-
2131	      section per term, mainly by moving text from sections 2 and 3

2133	   o  Changed all occurrences of Packet Stream to RTP Stream

2135	   o  Moved all normative references to informative, since this is an
2136	      informative document

2138	   o  Added references to RFC 7160, RFC 7197 and RFC 7198, and removed
2139	      unused references

2141	A.8.  Modifications Between WG Version -00 and -01

2143	   o  WG version -00 text is identical to individual draft -03

2145	   o  Amended description of SVC SST and MST encodings with respect to
2146	      concepts defined in this text

2148	   o  Removed UML as normative reference, since the text no longer uses
2149	      any UML notation

2151	   o  Removed a number of level 4 sections and moved out text to the
2152	      level above

2154	A.9.  Modifications Between Version -02 and -03

2156	   o  Section 4 rewritten (and new communication topologies added) to
2157	      reflect the major updates to Sections 1-3

2159	   o  Section 8 removed (carryover from initial -00 draft)

2161	   o  General clean up of text, grammar and nits

2163	A.10.  Modifications Between Version -01 and -02

2165	   o  Section 2 rewritten to add both streams and transformations in the
2166	      media chain.

2168	   o  Section 3 rewritten to focus on exposing relationships.

2170	A.11.  Modifications Between Version -00 and -01

2172	   o  Too many to list

2174	   o  Added new authors

2176	   o  Updated content organization and presentation

2178	Authors' Addresses

2180	   Jonathan Lennox
2181	   Vidyo, Inc.
2182	   433 Hackensack Avenue
2183	   Seventh Floor
2184	   Hackensack, NJ  07601
2185	   US

2187	   Email: jonathan@vidyo.com

2189	   Kevin Gross
2190	   AVA Networks, LLC
2191	   Boulder, CO
2192	   US

2194	   Email: kevin.gross@avanw.com

2196	   Suhas Nandakumar
2197	   Cisco Systems
2198	   170 West Tasman Drive
2199	   San Jose, CA  95134
2200	   US

2202	   Email: snandaku@cisco.com

2204	   Gonzalo Salgueiro
2205	   Cisco Systems
2206	   7200-12 Kit Creek Road
2207	   Research Triangle Park, NC  27709
2208	   US

2210	   Email: gsalguei@cisco.com

2212	   Bo Burman (editor)
2213	   Ericsson
2214	   Kistavagen 25
2215	   SE-16480 Stockholm
2216	   Sweden

2218	   Email: bo.burman@ericsson.com