< draft-rosenberg-sip-conferencing-models-00.txt   draft-rosenberg-sip-conferencing-models-01.txt >
Internet Engineering Task Force SIP WG Internet Engineering Task Force SIPPING WG
Internet Draft J.Rosenberg,H.Schulzrinne Internet Draft J.Rosenberg,H.Schulzrinne
draft-rosenberg-sip-conferencing-models-00.txt dynamicsoft,Columbia U. draft-rosenberg-sip-conferencing-models-01.txt dynamicsoft,Columbia U.
November 17, 2000 July 20, 2001
Expires: May, 2001 Expires: February 2002
Models for Multi Party Conferencing in SIP Models for Multi Party Conferencing in SIP
STATUS OF THIS MEMO STATUS OF THIS MEMO
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet- Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as work in progress. material or to cite them other than as "work in progress".
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at To view the list Internet-Draft Shadow Directories, see
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
Abstract Abstract
The Session Initiation Protocol (SIP) can support multi-party The Session Initiation Protocol (SIP) can support multi-party
conferencing in many different ways. In this draft, we define the conferencing in many different ways. In this draft, we define the
various multi-party conferencing models, and for each, discuss how various multi-party conferencing models, and for each, discuss how
they are used and then analyze their relative benefits and drawbacks. they are used and then analyze their relative benefits and drawbacks.
1 Introduction 1 Introduction
skipping to change at page 2, line 22 skipping to change at page 2, line 22
o How users can join an existing conference without being o How users can join an existing conference without being
invited invited
o How well the model scales. o How well the model scales.
o Which entities need to be aware of the model. o Which entities need to be aware of the model.
o How participants learn about each other. o How participants learn about each other.
We also identify missing pieces and recommend standard activity to We also identify missing pieces and recommend standard activity to
fill them in. This document itself does not define any new extensions fill them in. This document itself does not define any new extensions
of any kind. of any kind. However, several scenarios discussed in the draft make
use of existing extensions to SIP.
2 End System Mixing 2 End System Mixing
The first model we call "end system mixing". In this model, user A The first model we call "end system mixing". In this model, user A
calls user B, and they have a conversation. At some point later, A calls user B, and they have a conversation. At some point later, A
decides to conference in user C. To do this, A calls C, using a decides to conference in user C. To do this, A calls C, using a
completely separate SIP call. This call uses a different Call-ID, completely separate SIP call. This call uses a different Call-ID,
different tags, etc. There is no call set up directly between B and different tags, etc. There is no call set up directly between B and
C. A receives media streams from both B and C, and mixes them. A C. A receives media streams from both B and C, and mixes them. A
sends a stream containing A's and C's streams to B, and a stream sends a stream containing A's and C's streams to B, and a stream
skipping to change at page 2, line 48 skipping to change at page 2, line 49
Basically, user A handles both signaling and media mixing. B and C Basically, user A handles both signaling and media mixing. B and C
are unaware of the multi-party call, from a SIP perspective at least. are unaware of the multi-party call, from a SIP perspective at least.
From an RTP perspective, A is a mixer, and so the RTCP reports from A From an RTP perspective, A is a mixer, and so the RTCP reports from A
will contain SDES information that indicates the existence of an will contain SDES information that indicates the existence of an
additional party in the media stream. additional party in the media stream.
Note that this model has the serious drawback that the conference Note that this model has the serious drawback that the conference
ends when the mixing UA leaves the call. ends when the mixing UA leaves the call.
2.1 Inviting Users to Join OPEN ISSUE: Another problem with this approach is that
there is no specific way for A to determine when a
Any user in the conference can invite another user to join, so long
as they are capable of performing the required mixing and signaling
+----------+ +----------+
| | | |
-- | | -- | |
--- | B | --- | B |
SIP call --- | | SIP call --- | |
--- .. | | --- .. | |
--- .. +----------+ --- .. +----------+
-- ... -- ...
... ...
+----------+ .. RTP +----------+ .. RTP
skipping to change at page 3, line 32 skipping to change at page 3, line 32
--- . +----------+ --- . +----------+
-- | | -- | |
-- | | -- | |
SIP call -- | | SIP call -- | |
| C | | C |
| | | |
+----------+ +----------+
Figure 1: Three Way Calling using End System Mixing Figure 1: Three Way Calling using End System Mixing
signaling message it receives was meant just for it, or for
the entire conference. For example, if B sends a REFER to
A, pointing to user D, was this REFER meant for A alone, or
for A and C? If it was meant for A and C, presumably A
would act upon the REFER and send it to C as well. C too
would act on the REFER. This would cause two separate
REFER-triggered INVITEs to get routed to D. How would D
know that both INVITEs need to be mixed together as a
conference? What if it cannot support this capability?
Because the three-way calling approach works only for the most basic
case, we do not recommend it as a general solution.
2.1 Inviting Users to Join
Any user in the conference can invite another user to join, so long
as they are capable of performing the required mixing and signaling
functions. To invite a new user to join, a user in the conference functions. To invite a new user to join, a user in the conference
simply calls them using normal SIP procedures. The only difference is simply calls them using normal SIP procedures. The only difference is
that the stream sent to that new user contains the streams received that the stream sent to that new user contains the streams received
from the other parties in the call. from the other parties in the call.
In fact, it is perfectly acceptable for complex connectivity graphs In fact, it is acceptable for complex connectivity graphs to be
to be constructed, as a result of different users inviting other constructed, as a result of different users inviting other users to
users to join. For example, take our case of A calling B, and then join. For example, take our case of A calling B, and then calling C.
calling C. If, later on, C calls D, C will perform the mixing of If, later on, C calls D, C will perform the mixing of the streams
the streams it gets from A (which actually contain media from A and it gets from A (which actually contain media from A and B), along
B), along with its own stream, and send that to D. This results in a with its own stream, and send that to D. This results in a
connectivity graph that looks like: connectivity graph that looks like Figure 2.
A------B A------B
| |
| |
C------D C------D
Figure 2: Connectivity Graph
Note, however, that there is a possibility of loops. From here, if D Note, however, that there is a possibility of loops. From here, if D
calls B, and brings that stream into the conference, a loop is calls B, and brings that stream into the conference, a loop is
created. This loop can be detected using the mechanisms described in created. This loop can be detected using the mechanisms described in
the RTP specification [2]. However, we expect these conditions to be the RTP specification [2]. However, we expect these conditions to be
extremely rare. Presumably, D knows B is in the conference already, extremely rare. Presumably, D knows B is in the conference already,
and so would not likely call B and invite them in. and so would not likely call B and invite them in.
A serious problem with the more complex topologies is that the
departure of a participant might cause a partition of the conference
into several sub-conferences which cannot easily be healed.
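The loop and partition problems described above can be illustrated at the level of the connectivity graph. The following is a minimal sketch in Python, assuming the conference is represented as a list of point-to-point calls between user names; the function names are hypothetical and this is not the RTP loop detection mechanism of [2], only a graph-level illustration.

```python
from collections import defaultdict


def has_loop(calls):
    """Return True if the list of (user, user) calls contains a cycle."""
    parent = {}

    def find(user):
        # Union-find lookup with path halving.
        parent.setdefault(user, user)
        while parent[user] != user:
            parent[user] = parent[parent[user]]
            user = parent[user]
        return user

    for a, b in calls:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # a and b were already connected, so this call closes a loop.
            return True
        parent[root_a] = root_b
    return False


def partitions_after_departure(calls, leaver):
    """Return the sub-conferences that remain once `leaver` hangs up."""
    users = {u for call in calls for u in call} - {leaver}
    graph = defaultdict(set)
    for a, b in calls:
        if leaver not in (a, b):
            graph[a].add(b)
            graph[b].add(a)

    seen, parts = set(), []
    for user in sorted(users):
        if user in seen:
            continue
        # Depth-first walk collects everyone still reachable from `user`.
        stack, part = [user], set()
        while stack:
            node = stack.pop()
            if node in part:
                continue
            part.add(node)
            stack.extend(graph[node] - part)
        seen |= part
        parts.append(part)
    return parts


# The topology of the connectivity graph above: A-B, A-C, C-D.
calls = [("A", "B"), ("A", "C"), ("C", "D")]
```

With this topology, adding a call between D and B closes a loop, and the departure of C leaves D cut off from A and B.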
2.2 Users Joining 2.2 Users Joining
In this model, there is not any explicit conference "identifier" that In this model, there is not any explicit conference "identifier" that
can be used to join. This conference model, by its nature, is built can be used to join. This conference model, by its nature, is built
around ad-hoc conferences. However, it is still possible for a user around ad-hoc conferences. However, it is still possible for a user
to join in the following way. to join in the following way.
Let's say a new user, E, simply calls B, unaware even, that B is in a Let's say a new user, E, simply calls B, unaware even, that B is in a
conference (E might actually be aware, but the SIP messaging is no conference (E might actually be aware, but the SIP messaging is no
different). B's softphone, recognizing that B is already in a different). B's softphone, recognizing that B is already in a
skipping to change at page 4, line 43 skipping to change at page 5, line 22
later. No SIP signaling at all is needed to do this. B simply starts later. No SIP signaling at all is needed to do this. B simply starts
sending the mixed media to E. sending the mixed media to E.
2.3 Scalability 2.3 Scalability
A drawback of this model is its scalability. Viewing the conference A drawback of this model is its scalability. Viewing the conference
from a graph perspective, if the number of edges touching a vertex from a graph perspective, if the number of edges touching a vertex
(its degree) equals N, the user corresponding to that vertex has to (its degree) equals N, the user corresponding to that vertex has to
perform up to N separate media stream encodings. We say "up to", as perform up to N separate media stream encodings. We say "up to", as
it depends on the number of participants who are talking at once. If it depends on the number of participants who are talking at once. If
only one pariticpant is talking, the non-talking "mixer" endpoints only one participant is talking, the non-talking "mixer" endpoints
don't need to do any additional encoding. If everyone is talking, it don't need to do any additional encoding. If everyone is talking, it
is N encodes. Since encoding is generally a complex process, a is N encodes. Since encoding is generally a complex process, a
typical workstation these days can handle two or three simultaneous typical workstation these days can handle two or three simultaneous
encodes using a low rate codec like G.723.1. The problem can be encodes using a low rate codec like G.723.1. The problem can be
mitigated somewhat by distributing the mixing responsibilities mitigated somewhat by distributing the mixing responsibilities
(making the graph deep rather than wide). However, this requires a (making the graph deep rather than wide). However, this requires a
conscious effort of the participants regarding who is to make the conscious effort of the participants regarding who is to make the
call to add a new user. This is unlikely to happen in practice. call to add a new user. This is unlikely to happen in practice.
Another limitation to scalability is bandwidth. If the degree of a Another limitation to scalability is bandwidth. If the degree of a
skipping to change at page 6, line 6 skipping to change at page 6, line 31
Large-scale multicast conferences are usually pre-arranged, with Large-scale multicast conferences are usually pre-arranged, with
specific start and stop times (which is why this information exists specific start and stop times (which is why this information exists
in SDP). Protocols such as the Session Announcement Protocol (SAP) in SDP). Protocols such as the Session Announcement Protocol (SAP)
[4] are used to announce these conferences. However, multicast [4] are used to announce these conferences. However, multicast
conferences do not need to be pre-arranged, so long as a mechanism conferences do not need to be pre-arranged, so long as a mechanism
exists to dynamically obtain a multicast address. SAP itself was exists to dynamically obtain a multicast address. SAP itself was
originally used for this purpose; this has been supplanted by the originally used for this purpose; this has been supplanted by the
malloc architecture [5], still under development. malloc architecture [5], still under development.
So, if there are N participants, there will be point to point SIP So, if there are N participants, there will be point-to-point SIP
relationships with pairs of participants. Each participant sends a relationships with pairs of participants. Each participant sends a
single media stream to the group, and receives up to N-1 streams at single media stream to the group, and receives up to N-1 streams at
any time. Note that the number of streams that a user will receive any time. Note that the number of streams that a user will receive
depends on who is actually sending at any given time. If the stream depends on who is actually sending at any given time. If the stream
is audio, and silence suppression is utilized, the number of streams is audio, and silence suppression is utilized, the number of streams
a user will receive at any given time is equal to the number of users a user will receive at any given time is equal to the number of users
talking at any given time. Even for very large conferences, this is talking at any given time. Even for very large conferences, this is
usually just a small number of users. usually just a small number of users.
3.1 Inviting Users to Join 3.1 Inviting Users to Join
skipping to change at page 6, line 34 skipping to change at page 7, line 12
same and all parties use the same port numbers to receive same and all parties use the same port numbers to receive
media data. If the session description provided by the media data. If the session description provided by the
caller is acceptable to the callee, the callee can choose caller is acceptable to the callee, the callee can choose
not to include a session description or MAY echo the not to include a session description or MAY echo the
description in the response. description in the response.
The called party then joins the multicast groups indicated in the The called party then joins the multicast groups indicated in the
SDP, using multicast protocols such as IGMP [6]. Note that it is not SDP, using multicast protocols such as IGMP [6]. Note that it is not
even necessary for users to send each other BYE messages when the even necessary for users to send each other BYE messages when the
conference is over, especially for large-scale, pre-arranged conference is over, especially for large-scale, pre-arranged
conferences that have explicit end times indicated in SDP. SDP aside, conferences that have explicit end times indicated in SDP.
a participant can simply leave the conference at any time by leaving
the multicast groups. No SIP signaling is needed to accomplish this. OPEN ISSUE: Do we need to specify a SIP mechanism for
indicating that no BYE is needed?
SDP aside, a participant can simply leave the conference at any time
by leaving the multicast groups. No SIP signaling is needed to
accomplish this.
3.2 Users Joining 3.2 Users Joining
Users can join a conference of this type without being invited. All Users can join a conference of this type without being invited. All
they need is the multicast addresses, ports, and codecs being used. they need is the multicast addresses, ports, and codecs being used.
These can be obtained through any number of means, including SAP. SDP These can be obtained through any number of means, including SAP. SDP
conference descriptions can even be obtained from web pages, for conference descriptions can even be obtained from web pages, for
example. example.
Once the addresses are obtained, the user simply joins the Once the addresses are obtained, the user simply joins the
skipping to change at page 8, line 11 skipping to change at page 8, line 44
Dial-In conference servers closely mirror dial-in conference bridges Dial-In conference servers closely mirror dial-in conference bridges
in the traditional PSTN. in the traditional PSTN.
A dial-in conference server acts as a normal SIP UA. Users call it, A dial-in conference server acts as a normal SIP UA. Users call it,
and the server maintains point to point SIP relationships with each and the server maintains point to point SIP relationships with each
user that calls in. The server takes the media from the users who user that calls in. The server takes the media from the users who
dial into the same conference, mixes them, and sends out the dial into the same conference, mixes them, and sends out the
appropriate mixed stream to each participant separately. appropriate mixed stream to each participant separately.
The model is depicted in Figure 3. Note that each UA (A,B,C,D) has a
point to point SIP and RTP relationship with the conference server.
Each call has a different Call-ID. Each user sends their own media to
the server. The media delivered to user A by the server is the media
mixed from users B,C and D. The media delivered to user B by the
server is the media mixed from users A, C and D. The media delivered
to user C by the server is the media mixed from users A, B and D. The
media delivered to user D is the media mixed from users A, B and C
+-----+ +-----+
| | | |
| A | | A |
| | | |
+-----+ +-----+
| . | .
| . | .
| . | .
| . | .
| . | .
skipping to change at page 8, line 38 skipping to change at page 9, line 31
| . | .
| . | .
| . | .
| . | .
+-----+ +-----+
| | | |
| C | | C |
| | | |
+-----+ +-----+
Figure 2: Dial-In Conference Servers Figure 3: Dial-In Conference Servers
The model is depicted in Figure 2. Note that each UA (A,B,C,D) has a (this is also known as a mix-minus configuration).
point to point SIP and RTP relationship with the conference server.
Each call has a different Call-ID. Each user sends their own media to
the server. The media delivered to user A by the server is the media
mixed from users B,C and D. The media delivered to user B by the
server is the media mixed from users A, C and D. The media delivered
to user C by the server is the media mixed from users A, B and D. The
media delivered to user D is the media mixed from users A, B and C.
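The per-participant mixing described above (each user receives everyone's media except their own) can be sketched as follows. This is an illustration only, assuming each participant contributes one frame of 16-bit linear PCM samples of equal length; the function name and the dictionary representation are hypothetical, and a real server would also handle decoding, encoding and timing.

```python
def mix_minus(frames):
    """Map each participant to the mix of all other participants' samples.

    frames: dict of participant name -> list of 16-bit PCM samples,
    all lists having the same length.
    """
    if not frames:
        return {}
    length = len(next(iter(frames.values())))
    # Sum every participant once, then subtract each participant's own
    # contribution, instead of re-summing N-1 streams per participant.
    total = [sum(frame[i] for frame in frames.values()) for i in range(length)]
    mixed = {}
    for name, own in frames.items():
        mixed[name] = [
            max(-32768, min(32767, total[i] - own[i]))  # clip to 16 bits
            for i in range(length)
        ]
    return mixed


frames = {"A": [100, 0], "B": [10, 20], "C": [1, 2], "D": [0, 0]}
to_each = mix_minus(frames)
```

Here `to_each["A"]` holds the mix of B, C and D, and so on for the other users.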
The conference is identified by the request URI of the calls from The conference is identified by the request URI of the calls from
each participant. This provides numerous advantages from a services each participant. This provides numerous advantages from a services
and routing point of view [9]. For example, one conference on the and routing point of view [9]. For example, one conference on the
server might be known as sip:conference34@servers.com. All users who server might be known as sip:conference34@servers.com. All users who
call sip:conference34@servers.com are mixed together. call sip:conference34@servers.com are mixed together.
Dial-In conference servers are usually associated with pre-arranged Dial-In conference servers are usually associated with pre-arranged
conferences. However, the same model applies to ad-hoc conferences. conferences. However, the same model applies to ad-hoc conferences.
An ad-hoc conference server creates the conference state when the An ad-hoc conference server creates the conference state when the
skipping to change at page 10, line 13 skipping to change at page 10, line 42
server: server:
INVITE sip:conference34@servers.com INVITE sip:conference34@servers.com
From: sip:B@example.com From: sip:B@example.com
To: sip:conference34@servers.com To: sip:conference34@servers.com
Referred-By: sip:A@example.com Referred-By: sip:A@example.com
Since the request URI identifies the conference, this will cause B to Since the request URI identifies the conference, this will cause B to
get added to conference 34. get added to conference 34.
An additional mechanism for inviting a user to join is to send REFER
from A to the conference server, with a Refer-To containing the
address of B. This REFER would look like:
REFER sip:conference34@servers.com SIP/2.0
From: sip:A@example.com
To: sip:B@example.com
Refer-To: sip:B@example.com
This approach has the advantage that it doesn't require REFER support
from B, only from the conference server.
OPEN ISSUE: A problem with the mechanisms for adding a user
is that they assume that the UA for user A (the one who
adds another user to the conference) knows that it is
indeed talking to a conference server. If the mechanisms in
this section were applied to a UA which was not a
conference server, the result would be the creation of
additional call legs, but not a conference. This means that
we require some mechanism for identifying that a URL is a
conference URL.
4.2 Users Joining 4.2 Users Joining
Users joining is easily done. The participant that wishes to join It is easy for users to join the conference. The participant that
simply sends an INVITE to the conference server, with the conference wishes to join simply sends an INVITE to the conference server, with
ID in the request URI. The conference ID (which is a SIP URL), can be the conference ID in the request URI. The conference ID (which is a
learned by any number of means, including having it on a web page, SIP URL), can be learned by any number of means, including having it
receiving it in an email, etc. on a web page, receiving it in an email, etc.
For example, if B wishes to join sip:conference34@servers.com, B For example, if B wishes to join sip:conference34@servers.com, B
would send the following request: would send the following request:
INVITE sip:conference34@servers.com INVITE sip:conference34@servers.com
From: sip:B@example.com From: sip:B@example.com
To: sip:conference34@servers.com To: sip:conference34@servers.com
4.3 Scalability 4.3 Scalability
The scalability of this model is limited by the bandwidth and The scalability of this model is limited by the bandwidth and
processing power of the conference server. If there are N processing power of the conference server. If there are N
participants in a conference, M of which are sending media streams, participants in a conference, M of which are sending media streams,
the server will need to manage N signaling relationships, perform N the server will need to manage N signaling relationships, perform M
RTP stream decodes, and N RTP stream encodes (assuming M > 0). The RTP stream decodes, and N RTP stream encodes (assuming M > 0). The
encoding is the primary processing bottleneck, and the sending of the encoding is the primary processing bottleneck, and the sending of the
N media streams is the primary bandwidth bottleneck. However, N media streams is the primary bandwidth bottleneck. However,
conference servers can be built using heavy duty hardware, and have conference servers can be built using heavy duty hardware, and have
high bandwidth access. high bandwidth access.
Furthermore, since we are using the request URI to name the Furthermore, since we are using the request URI to name the
conferences, we can use standard SIP techniques for distributing conferences, we can use standard SIP techniques for distributing
conferences across servers [9]. conferences across servers [9].
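The load figures above can be written down as a small model. This sketch follows the revised wording (M decodes, N encodes when M > 0); the function name is hypothetical, and the default per-stream rate of 6.3 kb/s is an assumption (the higher G.723.1 payload rate, ignoring RTP/UDP/IP overhead).

```python
def server_load(n_participants, m_senders, stream_kbps=6.3):
    """Estimate conference server load for N participants, M of them sending."""
    if not 0 <= m_senders <= n_participants:
        raise ValueError("senders must be between 0 and the participant count")
    # One mixed stream is encoded per participant, but only if someone talks.
    encodes = n_participants if m_senders > 0 else 0
    return {
        "signaling_relationships": n_participants,
        "decodes": m_senders,
        "encodes": encodes,
        "egress_kbps": encodes * stream_kbps,  # payload only (assumption)
    }


load = server_load(10, 2)
```

For ten participants with two talkers, the model gives 10 signaling relationships, 2 decodes and 10 encodes.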
skipping to change at page 11, line 19 skipping to change at page 12, line 30
4.5 Discovering Participant Identities 4.5 Discovering Participant Identities
The identities of other participants in the conference are NOT known The identities of other participants in the conference are NOT known
through SIP. Rather, they are learned through RTP. The conference server through SIP. Rather, they are learned through RTP. The conference server
is an RTP mixer. As such, it takes the RTCP SDES of the streams it is an RTP mixer. As such, it takes the RTCP SDES of the streams it
mixes, and aggregates them into the RTCP stream sent out. This will mixes, and aggregates them into the RTCP stream sent out. This will
allow participants to gradually (over a few seconds), learn the allow participants to gradually (over a few seconds), learn the
identities of the other participants. identities of the other participants.
As an implementation choice, the conference server can generate the
RTCP SDES of its participants, rather than using those provided by
the participants. The reason for this is authenticity. A conference
server can use SIP authentication mechanisms to identify the
participants in the conference. This may allow it to validate the
RTCP SDES provided by the participants. A conference server could
remove any false information, and regenerate the SDES using the
correct user identity as validated through SIP.
5 Ad-hoc Centralized Conferences 5 Ad-hoc Centralized Conferences
In an ad-hoc centralized conference, two users A and B start with a In an ad-hoc centralized conference, two users A and B start with a
normal SIP call. At some point later, they decide to add a third normal SIP call. At some point later, they decide to add a third
party. Instead of using end system mixing, they would prefer to use a party. Instead of using end system mixing, they would prefer to use a
conference server, as defined in Section 4. conference server, as defined in Section 4.
This model corresponds roughly to the centralized multipoint The call flow for starting this kind of conference is shown in Figure
conference model of H.323. 4. Initially, A calls B (1-3). At some point, B decides to add a
user, C, to the call, and begins the transition to a conference
One of the participants takes responsibility for transitioning to a server. The first step in this process is the discovery of a
conference server. The first step in this process is the discovery of conference server that supports ad-hoc conferences. This can be done
a conference server that supports ad-hoc conferences. This can be through static configuration, or through any of a number of standard
done through static configuration, or through any of a number of service discovery protocols, such as the Service Location Protocol
standard service discovery protocols, such as the Service Location [12].
Protocol [12].
Once the server is discovered, a conference ID is chosen. This ID Once the server is discovered, a conference ID is chosen. This ID
must be globally unique. The conference ID is then prepended to the must be globally unique. The conference ID is then prepended to the
server, and a SIP URL for the ad-hoc conference is formed. For server, and a SIP URL for the ad-hoc conference is formed. For
example, if the server "a.servers.com" is used, and the unique ID is example, if the server "a.servers.com" is used, and the unique ID is
"a7hytaskp09878a", the SIP URL for this conference is "a7hytaskp09878a", the SIP URL for this conference is
sip:a7hytaskp09878a@a.servers.com. sip:a7hytaskp09878a@a.servers.com.
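Forming the conference URL from a server name and a unique ID can be sketched as below. The draft only requires that the ID be globally unique; using a random UUID for that purpose is an assumption of this sketch, and both function names are hypothetical.

```python
import uuid


def new_conference_id():
    """Return an ID that is unique with overwhelming probability."""
    return uuid.uuid4().hex


def conference_url(server, conference_id):
    """Prepend the conference ID to the server name to form a SIP URL."""
    return f"sip:{conference_id}@{server}"


# Reproduces the example from the text.
example = conference_url("a.servers.com", "a7hytaskp09878a")
```

A freshly generated ID would be used the same way: `conference_url("a.servers.com", new_conference_id())`.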
The user who is performing the transition (say, user A) then sends an B then sends an INVITE to this URL (4). This creates the initial
INVITE to this URL. This creates the initial conference state in the conference state in the server. The conference server accepts the
server. A then sends a REFER to the other party in the call (say B), call (5) and B sends an ACK (6). B then sends a REFER to A (7),
referring them to sip:a7hytaskp09878a@a.servers.com. B sends an referring them to sip:a7hytaskp09878a@a.servers.com. A accepts the
INVITE to this address, and is added to the conference. Once the 200 referral (8) and this triggers an INVITE to this address (9). This
OK response to the REFER is sent from B to A, A hangs up to B. A and causes A to be added to the conference. The conference server accepts
B are now in a conference using a conference server. From here, the INVITE (10), and an ACK is generated (11). Once the NOTIFY
operation is identical to the system described in Section 4. request (indicating successful completion of the referred call) is
sent from A to B (12), B responds with a 200 OK (13). Since B is now
assured that A is connected through the conference server, B hangs up
to A with a BYE (14).
OPEN ISSUE: It's not clear that this is the best flow. An
alternative flow is for B to REFER the conference server to
A, using a call replacement mechanism. This is probably
more correct, since this is not so much a transfer as a
call leg replacement.
Finally, B can add C to the call. This is identical to the procedures
described in Section 4 for adding users to the conference. First, B
generates a REFER (16) to C. The Refer-To header contains the
conference URL, sip:a7hytaskp09878a@a.servers.com. C responds to the
referral with a 200 OK (17). C then INVITEs itself to the conference
(18-20). C then generates a NOTIFY informing B that the REFER has
completed (21).
It is also possible to transition from an end system mixed conference It is also possible to transition from an end system mixed conference
(even one with a complex connection topology), to a centralized (even one with a complex connection topology), to a centralized
conference server. One user takes responsibility for initiating the conference server. Consider an end-system mixed conference with the
transition. It proceeds as described above. However, the REFER topology of Figure 2. User A wishes to transition to a centralized
request is sent to all SIP peers adjacent to the user. In addition, conference server in order to add another participant. The transition
when a SIP UA receives a REFER, they must not only act on it as is shown in Figures 5 and 6.
described above, but also generate a REFER to any of their adjacent
SIP peers. In essence, the REFER message is propagated along the First, user A discovers a conference server, and creates a new
connection graph, starting at the root (which is the user who
initiates the transition). The transition will work so long as the
graph has no cycles (which is needed anyway, as discussed above), and
so long as only one user attempts to initiate the transition. If
multiple users attempt to initiate the transition at the same time,
the conference will break into two disjoint ad-hoc conferences, with
membership depending on the temporal dynamics of the REFER
propagation.
A               B          Conference           C
                             Server
|(1) INVITE     |               |               |
|-------------->|               |               |
|(2) 200 OK     |               |               |
|<--------------|               |               |
|(3) ACK        |               |               |
|-------------->|               |               |
|               |(4) INVITE     |               |
|               |-------------->|               |
|               |(5) 200 OK     |               |
|               |<--------------|               |
|               |(6) ACK        |               |
|(7) REFER      |-------------->|               |
|<--------------|               |               |
|(8) 200 OK     |               |               |
|-------------->|               |               |
|(9) INVITE     |               |               |
|------------------------------>|               |
|(10) 200 OK    |               |               |
|<------------------------------|               |
|(11) ACK       |               |               |
|------------------------------>|               |
|(12) NOTIFY    |               |               |
|-------------->|               |               |
|(13) 200 OK    |               |               |
|<--------------|               |               |
|(14) BYE       |               |               |
|<--------------|               |               |
|(15) 200 OK    |               |               |
|-------------->|(16) REFER     |               |
|               |------------------------------>|
|               |(17) 200 OK    |               |
|               |<------------------------------|
|               |               |(18) INVITE    |
|               |               |<--------------|
|               |               |(19) 200 OK    |
|               |               |-------------->|
|               |               |(20) ACK       |
|               |               |<--------------|
|               |(21) NOTIFY    |               |
|               |<------------------------------|
|               |(22) 200 OK    |               |
|               |------------------------------>|
|               |               |               |
Figure 4: Transitioning to ad-hoc
|(1) INVITE | | | |
|---------------------------------------------------------->|
|(2) 200 OK | | | |
|<----------------------------------------------------------|
|(3) ACK | | | |
|---------------------------------------------------------->|
|(4) REFER | | | |
|------------->| | | |
|(5) 200 OK | | | |
|<-------------| | | |
|(6) REFER | | | |
|---------------------------->| | |
|(7) 200 OK | | | |
|<----------------------------| | |
| |(8) INVITE | | |
| |------------------------------------------->|
| |(9) 200 OK | | |
| |<-------------------------------------------|
| |(10) ACK | | |
| |------------------------------------------->|
| | |(11) INVITE | |
| | |---------------------------->|
| | |(12) 200 OK | |
| | |<----------------------------|
| | |(13) ACK | |
| | |---------------------------->|
| | |(14) REFER | |
| | |------------->| |
| | |(15) 200 OK | |
| | |<-------------|(16) INVITE |
| | | |------------->|
| | | |(17) 200 OK |
| | | |<-------------|
| | | |(18) ACK |
| | | |------------->|
| | | | |
| | | | |
A B C D Conf.
Server
Figure 5: Adhoc transition from end-system mixed: part I
conference by sending an INVITE to it (1-3). A then REFERs the two
end systems it is connected to (B and C), to the server (4-5 and 6-7
respectively). This causes B to INVITE itself to the conference
server (8-10), and C to do the same (11-13). Since C received its
REFER from A, it "passes it on" to D by sending a REFER to it
(14-15). This causes D to join the conference server by sending it an
INVITE (16-18).
Once the REFER triggered INVITEs complete, notifications start to get
sent. Since B completed first, it will be the first to send a NOTIFY
to A (19) followed by C (21). At this point, A can terminate its legs
to B and C (23-24 and 25-26 respectively). Since D completed its
REFER triggered INVITE next, it generates a NOTIFY to C (27). This
causes C to terminate its leg with D (29). The call has now
transitioned to a centralized server.
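The REFER propagation in this walkthrough can be sketched in a few
lines of Python. This is only an illustration, not part of the draft:
the function name refer_propagation and the TOPOLOGY dictionary are
hypothetical, and the topology (A connected to B and C, C connected
to D) is inferred from the message flow described above. The sketch
assumes the connection graph has no cycles.

```python
from collections import deque


def refer_propagation(adjacency, initiator):
    """Return the (sender, receiver) REFER hops in the order they are issued.

    adjacency maps each UA to the set of UAs it shares a SIP leg with.
    The graph is assumed to be acyclic (hypothetical helper, not from the draft).
    """
    hops = []
    seen = {initiator}
    queue = deque([initiator])
    while queue:
        sender = queue.popleft()
        for peer in sorted(adjacency[sender]):
            if peer in seen:
                continue  # never send the REFER back toward where it came from
            seen.add(peer)
            hops.append((sender, peer))
            queue.append(peer)  # the peer passes the REFER on to its own neighbors
    return hops


# Assumed topology: A is connected to B and C, and C is connected to D.
TOPOLOGY = {"A": {"B", "C"}, "B": {"A"}, "C": {"A", "D"}, "D": {"C"}}

if __name__ == "__main__":
    for sender, receiver in refer_propagation(TOPOLOGY, "A"):
        print(f"{sender} sends REFER to {receiver}")
```

Each UA other than the initiator receives exactly one REFER, which is
why the notifications can later travel back along the same legs
before those legs are torn down.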
OPEN ISSUE: There is no way for A to know that the entire
conference has transitioned. Also, as above, it's not clear
that a REFER from the conference server wouldn't be better.
Once the conference has been formed, further operation is identical
to the dial-in conferencing model of Section 4. The only difference
in the conferences is that the conference identifier is dynamic in
this case, and static in Section 4. A dynamic identifier makes it
nearly impossible for users to join asynchronously.
5.1 Inviting Users to Join 5.1 Inviting Users to Join
Once the ad-hoc conference has been created on the server, inviting Once the ad-hoc conference has been created on the server, inviting
users proceeds as defined in Section 4.1. users proceeds as defined in Section 4.1.
5.2 Users Joining 5.2 Users Joining
Once the ad-hoc conference has been created on the server, joining Once the ad-hoc conference has been created on the server, joining
proceeds as defined in Section 4.2. proceeds as defined in Section 4.2.
skipping to change at page 12, line 44 skipping to change at page 17, line 5
The scalability of this conference model is identical to that of The scalability of this conference model is identical to that of
dial-in conference servers, as described in Section 4.3. dial-in conference servers, as described in Section 4.3.
5.4 Location of Service Logic 5.4 Location of Service Logic
The logic for handling the transition process must be located in at The logic for handling the transition process must be located in at
least one UA in the conference. All UAs that are mixers in an end least one UA in the conference. All UAs that are mixers in an end
system mixed conference must know to propagate the REFER requests system mixed conference must know to propagate the REFER requests
they receive during the transition. they receive during the transition.
|(19) NOTIFY | | | |
|<-------------| | | |
|(20) 200 OK | | | |
|------------->| | | |
|(21) NOTIFY | | | |
|<----------------------------| | |
|(22) 200 OK | | | |
|---------------------------->| | |
|(23) BYE | | | |
|------------->| | | |
|(24) 200 OK | | | |
|<-------------| | | |
|(25) BYE | | | |
|---------------------------->| | |
|(26) 200 OK | | | |
|<----------------------------|(27) NOTIFY | |
| | |<-------------| |
| | |(28) 200 OK | |
| | |------------->| |
| | |(29) BYE | |
| | |------------->| |
| | |(30) 200 OK | |
| | |<-------------| |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
A B C D Conf.
Server
Figure 6: Adhoc transition from end-system mixed: part II
5.5 Discovering Participant Identities 5.5 Discovering Participant Identities
Once the ad-hoc conference is established, conference identities are Once the ad-hoc conference is established, conference identities are
determined through RTCP, as in the dial-in case.
6 Dial-Out Conferences 6 Dial-Out Conferences
Dial-out conferences are a simple variation on dial-in conferences. Dial-out conferences are a simple variation on dial-in conferences.
Instead of the users joining the conference by sending an INVITE to Instead of the users joining the conference by sending an INVITE to
the server, the server chooses the users who are to be members of the the server, the server chooses the users who are to be members of the
conference, and then sends them the INVITE. Typically dial out conference, and then sends them the INVITE. Typically dial out
conferences are pre-arranged, with specific start times and an conferences are pre-arranged, with specific start times and an
initial group membership list. initial group membership list. However, there are other means for the
dial-out server to determine the list of participants, including user
presence [13]. The model in no way limits the means by which the
server determines the set of users.
Once the users accept or reject the call from the dial out server, Once the users accept or reject the call from the dial out server,
the behavior of this system is identical to the dial-in server case the behavior of this system is identical to the dial-in server case
of Section 4. Thus, a dial-out conference server will generally need of Section 4. Thus, a dial-out conference server will generally need
to support dial-in access for the same conference, if it wishes to to support dial-in access for the same conference, if it wishes to
allow joining after the conference begins. allow joining after the conference begins.
Note that, from the participants' perspective, they will learn the Note that, from the participants' perspective, they will learn the
conference identity (the URL) from the From field in the INVITE conference identity (the URL) from the From field in the INVITE
messages received from the server. messages received from the server.
OPEN ISSUE: Or is the Contact more appropriate?
6.1 Inviting Users to Join 6.1 Inviting Users to Join
Once the conference is established, inviting users to join is Once the conference is established, inviting users to join is
identical to the scenario described in Section 4.1. Note that the URL identical to the scenario described in Section 4.1. Note that the URL
to be used in the REFER is obtained from the From field of the INVITE to be used in the REFER is obtained from the From field of the INVITE
received from the dial-out server. received from the dial-out server.
6.2 Users Joining 6.2 Users Joining
Once the conference is established, joining is identical to the Once the conference is established, joining is identical to the
skipping to change at page 14, line 14 skipping to change at page 19, line 24
7 Centralized Signaling, Distributed Media 7 Centralized Signaling, Distributed Media
In this conferencing model, there is a centralized controller, as in In this conferencing model, there is a centralized controller, as in
the dial-in and dial-out cases. However, the centralized server the dial-in and dial-out cases. However, the centralized server
handles signaling only. The media is still sent directly between handles signaling only. The media is still sent directly between
participants, using either multicast or multi-unicast. Multi-unicast participants, using either multicast or multi-unicast. Multi-unicast
is when a user sends multiple packets (one for each recipient, is when a user sends multiple packets (one for each recipient,
addressed to that recipient). This is referred to as a "Decentralized addressed to that recipient). This is referred to as a "Decentralized
Multipoint Conference" in H.323. Interestingly, this conference model Multipoint Conference" in H.323. Interestingly, this conference model
is possible baseline SIP. is possible with baseline SIP.
It works through third party call control [13]. The conference server It works through third party call control [14]. The conference server
uses re-INVITEs to each participant when a new one joins. The re- uses re-INVITEs to each participant when a new one joins. The re-
INVITEs add a media stream that gets sent to the new participant (and INVITEs add a media stream that gets sent to the new participant (and
similarly in the reverse direction). similarly in the reverse direction).
Let us assume for the moment that a conference already exists with Let us assume for the moment that a conference already exists with
three participants. In this state, each participant is sending media three participants. In this state, each participant is sending media
directly to each other. This is because the SDP that the conference directly to each other. This is because the SDP that the conference
server has given to each participant contains three media lines, each server has given to each participant contains three media lines, each
of type audio, with connection addresses and ports corresponding to of type audio, with connection addresses and ports corresponding to
each of the three users. each of the three users.
The call flow from here is shown in Figure 3. A new participant The call flow from here is shown in Figure 7. In the figure, the word
joins the conference. It does so by sending an INVITE (1) to the after the INV or SIP response code refers to the connection
server, with the conference ID in the request URI. The SDP in the address(es) in the SDP in the message. +X means the addition of a
INVITE contains a single media stream, with an IP address and port stream with X as the recipient address.
where it would like to receive media (D). The 200 response from the
conference server (2) contains a single media line with an IP address A new participant joins the conference. It does so by sending an
of 0.0.0.0 and a random port, indicating hold. INVITE (1) to the server, with the conference ID in the request URI.
The SDP in the INVITE contains a single media stream, with an IP
address and port where it would like to receive media (D). The 200
response from the conference server (2) contains a single media line
with an IP address of 0.0.0.0 and a random port, indicating hold.
The next step is for the server to obtain two more addresses where The next step is for the server to obtain two more addresses where
the new participant will be receiving media (it already has one from the new participant will be receiving media (it already has one from
the original INVITE). To do this, it sends a re-INVITE to the new the original INVITE). To do this, it sends a re-INVITE to the new
participant (4). This reINVITE contains two additional media streams participant (4). This re-INVITE contains two additional media streams
(for three total), all three of which are on hold. The 200 response (for three total), all three of which are on hold. The 200 response
to the re-INVITE (5) contains two additional IP addresses and ports to the re-INVITE (5) contains two additional IP addresses and ports
where the user is willing to receive media. where the user is willing to receive media.
Now the server needs to inform the other parties that they should Now the server needs to inform the other parties that they should
begin sending media to the new user. It first sends a re-INVITE to begin sending media to the new user. It first sends a re-INVITE to
user C (7). This re-INVITE adds an additional media stream to the two user C (7). This re-INVITE adds an additional media stream to the two
already that C has been sending. This new media stream uses one of already that C has been sending. This new media stream uses one of
the three connection addresses and ports returned by D in message the three connection addresses and ports returned by D in message
(5). Call this address/port D1. The other two are D2 and D3. The 200 (5). Call this address/port D1. The other two are D2 and D3. The 200
skipping to change at page 15, line 20 skipping to change at page 20, line 35
two already in use by C) using address/port D2. The response (11) two already in use by C) using address/port D2. The response (11)
contains a new address/port to send media to B. Call this port B3. In contains a new address/port to send media to B. Call this port B3. In
the re-INVITE to A (13), the server adds an additional media line the re-INVITE to A (13), the server adds an additional media line
using address/port D3. The response (14) contains a new address/port using address/port D3. The response (14) contains a new address/port
to send media to A. Call this port A3. to send media to A. Call this port A3.
Finally, the server sends a re-INVITE (15) to the new party. This Finally, the server sends a re-INVITE (15) to the new party. This
re-INVITE takes all three streams off hold, and updates their re-INVITE takes all three streams off hold, and updates their
connection addresses and ports with C3, B3, and A3, respectively. The connection addresses and ports with C3, B3, and A3, respectively. The
200 OK response (16) returns the same ports and addresses returned in 200 OK response (16) returns the same ports and addresses returned in
message (5) (as noted in [13], these addresses/ports MUST NOT message (5) (as noted in [14], these addresses/ports MUST NOT
change). Now, D can send media to A, B and C. change). Now, D can send media to A, B and C.
The result of these manipulations is, indeed, a full mesh of unicast The result of these manipulations is, indeed, a full mesh of unicast
RTP streams between all participants. Unlike the case of end system RTP streams between all participants. Unlike the case of end system
mixing, the stream sent by any participant to all of the others is mixing, the stream sent by any participant to all of the others is
identical. Each participant needs to mix, but it mixes the media it identical. Each participant needs to mix, but it mixes the media it
receives, and plays that out the speakers. This is normal behavior receives, and plays that out the speakers. This is normal behavior
for multiple streams of the same type. Note that the SIP relationship for multiple streams of the same type. Note that the SIP relationship
is still point-to-point. There are four calls at the end of Figure 3, is still point-to-point. There are four calls at the end of Figure 7,
one from each participant to the server, each with a different Call- one from each participant to the server, each with a different Call-
ID. ID.
Note that hybrids are easily possible. Certain users can instead be Note that hybrids are easily possible. Certain users can instead be
mixed (sending audio to the conference server), while others are set mixed (sending audio to the conference server), while others are set
to send audio to each other. to send audio to each other.
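The order of transactions the server drives when a participant joins
can be sketched as follows. This is a minimal illustration under
stated assumptions, not the draft's method: the function name
join_plan and the tuple layout are hypothetical, and the labels (D1,
C3 and so on) simply follow the naming used in the walkthrough above,
where each existing participant's new stream is its third.

```python
def join_plan(existing, newcomer):
    """List the SIP transactions the server drives when `newcomer` joins.

    Each entry is (sender, receiver, method, summary). Hypothetical helper:
    it models only the ordering of offers, not real SDP.
    """
    n = len(existing)
    plan = [
        # The newcomer dials in; the server answers with the stream on hold.
        (newcomer, "server", "INVITE", "1 stream, newcomer's receive address"),
        # The server asks for one receive address per existing participant.
        ("server", newcomer, "re-INVITE", f"{n} streams, all on hold"),
    ]
    # Assign one of the newcomer's receive addresses to each existing peer.
    recv = {peer: f"{newcomer}{i}" for i, peer in enumerate(existing, start=1)}
    for peer in existing:
        # "+X" means a stream is added with X as the recipient address.
        plan.append(("server", peer, "re-INVITE", f"+{recv[peer]}"))
    # Each peer's answer carries the new address it will receive on.
    learned = [f"{peer}{n}" for peer in existing]
    plan.append(("server", newcomer, "re-INVITE", "off hold: " + ", ".join(learned)))
    return plan


if __name__ == "__main__":
    for step in join_plan(["C", "B", "A"], "D"):
        print(step)
```

For three existing participants the plan contains six transactions,
so the signaling cost of one join grows linearly with the size of the
conference.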
7.1 Inviting Users to Join
Inviting users to join works identically to the dial-in conference
bridge scenario of Section 4.
7.2 Users Joining
A user joins in the same way described in Section 4.
7.3 Scalability
The scalability of this conferencing model depends on many factors.
From a media perspective, the conference server never even touches a
single media stream. However, for N participants, each participant
needs to be able to receive, decode, and mix N-1 media streams. For
| | | |(1) INV D | | | | |(1) INV D |
| | | |-------------->| | | | |-------------->|
| | | |(2) 200 hold | | | | |(2) 200 hold |
| | | |<--------------| | | | |<--------------|
| | | |(3) ACK | | | | |(3) ACK |
| | | |-------------->| | | | |-------------->|
| | | |(4) INV 3held | | | | |(4) INV 3held |
| | | |<--------------| | | | |<--------------|
| | | |(5) 200 3recv | | | | |(5) 200 3recv |
| | | |-------------->| | | | |-------------->|
skipping to change at page 16, line 49 skipping to change at page 21, line 50
| | | |<--------------| | | | |<--------------|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
A B C D Server A B C D Server
Figure 3: Centralized Signaling, Decentralized Media Figure 7: Centralized Signaling, Decentralized Media
7.1 Inviting Users to Join
Inviting users to join works identically to the dial-in conference
bridge scenario of Section 4.
7.2 Users Joining
A user joins in the same way described in Section 4.
7.3 Scalability
The scalability of this conferencing model depends on many factors.
From a media perspective, the conference server never even touches a
single media stream. However, for N participants, each participant
needs to be able to receive, decode, and mix N-1 media streams. For
users accessing the server through dial-in modems, this will severely users accessing the server through dial-in modems, this will severely
limit the sizes of these conferences. However, the processing burden limit the sizes of these conferences. However, the processing burden
is much less than that of the end system mixing model. This is is much less than that of the end system mixing model. This is
because each end user needs to decode N-1 streams, but only encode 1. because each end user needs to decode N-1 streams, but only encode 1.
Decoding is much, much cheaper than encoding, so supporting many Decoding is much, much cheaper than encoding, so supporting many
decodes is not necessarily a problem. This is especially the case decodes is not necessarily a problem. This is especially the case
when silence suppression is in use. In that case, streams are only when silence suppression is in use. In that case, streams are only
sent by talking users. This means any given user only needs to decode sent by talking users. This means any given user only needs to decode
(and receive) as many streams at a time as there are users talking. (and receive) as many streams at a time as there are users talking.
This can vastly improve scalability of the conference. This can vastly improve scalability of the conference.
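The per-participant load described above can be worked out with a
short calculation. The sketch below is illustrative only: the
function name per_user_load and its return keys are hypothetical, and
it assumes multi-unicast, so each participant sends one copy of its
single encoded stream to every other participant.

```python
def per_user_load(n, talkers=None):
    """Streams one participant handles in an n-party decentralized conference.

    talkers is the number of simultaneously active speakers when silence
    suppression is in use; None means every participant sends continuously.
    Hypothetical helper for illustration.
    """
    if n < 2:
        raise ValueError("a conference needs at least two participants")
    others = n - 1
    # Without silence suppression every other participant must be decoded.
    decode = others if talkers is None else min(talkers, others)
    return {
        "encode": 1,            # the same stream goes to everyone
        "decode": decode,       # streams received and decoded at a time
        "send_copies": others,  # one packet per recipient (multi-unicast)
    }


if __name__ == "__main__":
    print(per_user_load(4))
    print(per_user_load(10, talkers=2))
```

With ten participants and two active talkers, a user decodes two
streams instead of nine, which is the improvement the text attributes
to silence suppression.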
skipping to change at page 17, line 36 skipping to change at page 22, line 52
generally faster. generally faster.
7.4 Location of Service Logic 7.4 Location of Service Logic
Nearly all of the logic for implementing this conferencing service Nearly all of the logic for implementing this conferencing service
lives in the server itself. lives in the server itself.
The only requirement from the end users is that they support The only requirement from the end users is that they support
multiple, parallel media streams of the same type, and that they be multiple, parallel media streams of the same type, and that they be
prepared to mix those streams together. They must also support the prepared to mix those streams together. They must also support the
third party control primitives [13], which don't require anything third party control primitives [14], which don't require anything
beyond baseline SIP, but are not likely supported unless explicit beyond baseline SIP, but are not likely supported unless explicit
actions are taken to do so. actions are taken to do so.
It is this combination - no need for media processing in the server, It is this combination - no need for media processing in the server,
combined with no need for specialized SIP processing in the end combined with no need for specialized SIP processing in the end
systems, that makes this model attractive. systems, that makes this model attractive.
7.5 Discovering Participant Identities 7.5 Discovering Participant Identities
Conference identities are discovered through RTCP. Each user will Conference identities are discovered through RTCP. Each user will
receive N-1 RTP streams, each of which has its own RTCP channel that receive N-1 RTP streams, each of which has its own RTCP channel that
carries the participant identification. carries the participant identification.
8 Summary of Models 8 Summary of Models
Table 1 shows a summary of the differences between the various Table 1 shows a summary of the differences between the various
models. models.
Table 1: Summary of Models Table 1: Summary of Models
Name        signaling  media     inviting         joining            discovering  scale
End-Mixing  tree       tree      normal invite    normal invite      RTCP         small
Multicast   pairs      m-cast    normal invite    multicast join     RTCP         large
Dial-Up     star       star      refer            normal invite      RTCP         medium
Ad-Hoc      star       star      refer            normal invite      RTCP         medium
Dial-Out    star       star      refer            normal invite      RTCP         medium
Decentral   star       fullmesh  refer + server   normal invite and  RTCP         medium
                                 messaging        server msg.
9 What's Missing - Full Mesh 9 Security Considerations
The sections above cover a wide range of conferencing models, but not
all of them. One model, in particular, is not supported by SIP. That
model is the fully distributed multiparty model.
In this conferencing model, each user has a point to point SIP
relationship with every other user. Each user also has a point to
point RTP relationship with every other user, as is done in the
decentralized conference of Section 7.
Two earlier drafts were written on the subject, but they specified
protocols that were overly complex and still had race conditions and
unhandled cases. The primary difficulty is that it requires every
participant to learn the identity of every other participant. As
participants come and go, this requires some kind of state flooding
mechanism that causes this information to propagate, and eventually
converge, across participants. While these kinds of distribution
mechanisms have been done for multiparty conferences [14], fitting
such a distribution mechanism into SIP is not trivial, especially
with the complex requirements that were initially targeted.
Furthermore, the distributed nature of the signaling makes
enforcement of any kind of conference policy pretty much impossible.
Failures can also result in unusual conditions. Specifically, it is
fairly easy for the conference mesh to break in certain places,
resulting in a graph where every user hears most of the other users,
but not all. This can happen, for example, if user A is invited into
a conference, but is rejected by one of the users already in the
conference (because the SIP relationships are point-to-point, a new
user needs to establish a SIP call with all existing participants).
With large conferences, this becomes a very real possibility. Earlier
work tried to avoid such conditions.
We believe a solution can be found by simplifying the requirements.
For example, we will abandon the requirement to only add a user to
the conference if all other users agree to add them. We will also try
to achieve gradual convergence in shared state, rather than the rapid
convergence proposed in previous work. We will not worry about
message efficiency or message frequency. The primary design objective
should be KISS.
As a baseline model, we believe that each INVITE, 200 OK response,
and ACK simply contain a header called Members. This header is a list
of URLs, and for each URL, there is a parameter that indicates
whether they are in the conference right now, and when they joined,
or whether they were previously in the conference, and when they
left. A UA simply performs a re-INVITE as it receives new
information. A periodic re-INVITE (a la session timer [15]) will also
be needed to heal partitions and deal with other conditions that may
arise.
More work is needed to validate the model and to see what other
capabilities are needed.
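One way to picture the gradual convergence of shared state is a merge
of membership views, sketched below. This is an assumption-laden
illustration, not a specification: the draft does not define the
syntax of the Members header, so the dictionary layout, the state
names "joined" and "left", the function name merge_members, and the
rule that the newer timestamp wins are all hypothetical.

```python
def merge_members(local, received):
    """Merge a received membership view into the local one.

    Each view maps a URL to (state, timestamp), where state is "joined"
    or "left". The entry with the newer timestamp wins (an assumed rule).
    Returns (merged, changed); changed indicates that the UA learned
    something new and would issue a re-INVITE to spread it.
    """
    merged = dict(local)
    changed = False
    for url, (state, when) in received.items():
        if url not in merged or when > merged[url][1]:
            merged[url] = (state, when)
            changed = True
    return merged, changed


if __name__ == "__main__":
    mine = {"sip:a@example.com": ("joined", 10)}
    theirs = {
        "sip:a@example.com": ("left", 20),
        "sip:b@example.com": ("joined", 15),
    }
    view, changed = merge_members(mine, theirs)
    print(view, changed)
```

Because merging the same view a second time changes nothing, repeated
or periodic re-INVITEs are harmless under this rule and the views
settle once every UA has seen the latest entry for each URL.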
10 Security Considerations
The use of a server that performs the mixing on behalf of other The use of a server that performs the mixing on behalf of other
users, which is the case for all but one of the conference models users, which is the case for all but one of the conference models
described here, introduces security risks. That entity must be described here, introduces security risks. That entity must be
trusted by the others to properly mix the media - not omitting a trusted by the others to properly mix the media - not omitting a
stream, for example. As such, it is recommended that participants in stream, for example. As such, it is recommended that participants in
a conference authenticate the identity of the server. In the dial-in, a conference authenticate the identity of the server. In the dial-in,
dial-out, and decentralized conferences, this will require dial-out, and decentralized conferences, this will require
authentication of responses by participants. authentication of responses by participants.
Mixing also eliminates the privacy possible with end-to-end media Mixing also eliminates the privacy possible with end-to-end media
transport with mixing in the receivers. Such privacy is still transport with mixing in the receivers. Such privacy is still
possible in the large-scale multicast conferences, but requires possible in the large-scale multicast conferences, but requires
shared keying material for the conference. Doing this for highly shared keying material for the conference. Doing this for highly
dynamic groups is still an open research problem. dynamic groups is still an open research problem.
11 Conclusion 10 Conclusion
In this draft, we have shown how to use baseline SIP (assuming In this draft, we have shown how to use baseline SIP (assuming
endpoints that support the mixing and/or third party call control endpoints that support the mixing and/or third party call control
feature sets) to construct several multiparty conferencing models. feature sets) to construct several multiparty conferencing models.
These include end system mixing, large-scale multicast conferences, These include end system mixing, large-scale multicast conferences,
dial-in conference servers, dial-out conferences, ad-hoc centralized dial-in conference servers, dial-out conferences, ad-hoc centralized
conferences, and centralized signaling, distributed media conferences, and centralized signaling, distributed media
conferences. conferences.
We note that this covers all of the multipoint conferencing models 11 Acknowledgements
described in H.323v1 [16]. Further work is needed to see how (and if)
to support the hierarchical conference bridges defined in H.323v2
[17].
12 Authors Addresses We would like to thank Mary Barnes for her comments and input.
12 Changes since -00
o Added call flow examples.
o Added open issues within text.
o Added additional call flow for adding users to conference, by
sending REFER to conference server with Refer-To of new
participant.
o Discussed conference servers generating RTCP based on
authenticated SIP identities.
13 Authors Addresses
Jonathan Rosenberg Jonathan Rosenberg
dynamicsoft dynamicsoft
200 Executive Drive 72 Eagle Rock Avenue
Suite 120 First Floor
West Orange, NJ 07052 East Hanover, NJ 07936
email: jdrosen@dynamicsoft.com email: jdrosen@dynamicsoft.com
Henning Schulzrinne Henning Schulzrinne
Columbia University Columbia University
M/S 0401 M/S 0401
1214 Amsterdam Ave. 1214 Amsterdam Ave.
New York, NY 10027-7003 New York, NY 10027-7003
email: schulzrinne@cs.columbia.edu email: schulzrinne@cs.columbia.edu
13 Bibliography 14 Bibliography
[1] M. Handley, H. Schulzrinne, E. Schooler, and J. Rosenberg, "SIP: [1] M. Handley, H. Schulzrinne, E. Schooler, and J. Rosenberg, "SIP:
session initiation protocol," Request for Comments 2543, Internet session initiation protocol," Request for Comments 2543, Internet
Engineering Task Force, Mar. 1999. Engineering Task Force, Mar. 1999.
[2] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: a [2] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: a
transport protocol for real-time applications," Request for Comments transport protocol for real-time applications," Request for Comments
1889, Internet Engineering Task Force, Jan. 1996. 1889, Internet Engineering Task Force, Jan. 1996.
[3] M. Handley and V. Jacobson, "SDP: session description protocol," [3] M. Handley and V. Jacobson, "SDP: session description protocol,"
skipping to change at page 21, line 25 skipping to change at page 25, line 42
[7] D. Waitzman, C. Partridge, and S. E. Deering, "Distance vector [7] D. Waitzman, C. Partridge, and S. E. Deering, "Distance vector
multicast routing protocol," Request for Comments 1075, Internet multicast routing protocol," Request for Comments 1075, Internet
Engineering Task Force, Nov. 1988. Engineering Task Force, Nov. 1988.
[8] J. Rosenberg and H. Schulzrinne, "Timer reconsideration for [8] J. Rosenberg and H. Schulzrinne, "Timer reconsideration for
enhanced RTP scalability," in Proceedings of the Conference on enhanced RTP scalability," in Proceedings of the Conference on
Computer Communications (IEEE Infocom) , (San Francisco, California), Computer Communications (IEEE Infocom) , (San Francisco, California),
March/April 1998. March/April 1998.
[9] J. Rosenberg, P. Mataga, and H. Schulzrinne, "An application [9] J. Rosenberg, P. Mataga, and H. Schulzrinne, "An application
server component architecture for sip," Internet Draft, Internet server component architecture for SIP," Internet Draft, Internet
Engineering Task Force, Nov. 2000. Work in progress. Engineering Task Force, Mar. 2001. Work in progress.
[10] J. Franks, P. Hallam-Baker, J. Hostetler, S. Lawrence, P. Leach, [10] J. Franks, P. Hallam-Baker, J. Hostetler, S. Lawrence, P. Leach,
A. Luotonen, and L. Stewart, "HTTP authentication: Basic and digest A. Luotonen, and L. Stewart, "HTTP authentication: Basic and digest
access authentication," Request for Comments 2617, Internet access authentication," Request for Comments 2617, Internet
Engineering Task Force, June 1999. Engineering Task Force, June 1999.
[11] R. Sparks, "SIP call control," Internet Draft, Internet [11] R. Sparks, "SIP call control," Internet Draft, Internet
Engineering Task Force, Sept. 2000. Work in progress. Engineering Task Force, Feb. 2001. Work in progress.
[12] E. Guttman, C. Perkins, J. Veizades, and M. Day, "Service [12] E. Guttman, C. Perkins, J. Veizades, and M. Day, "Service
location protocol, version 2," Request for Comments 2608, Internet location protocol, version 2," Request for Comments 2608, Internet
Engineering Task Force, June 1999. Engineering Task Force, June 1999.
[13] J. Rosenberg, H. Schulzrinne, and J. Peterson, "Third party call [13] J. Rosenberg et al. , "SIP extensions for presence," Internet
control in SIP," Internet Draft, Internet Engineering Task Force, Draft, Internet Engineering Task Force, Apr. 2001. Work in progress.
Mar. 2000. Work in progress.
[14] C. Elliott, "A 'sticky' conference control protocol,"
Internetworking: Research and Experience , Vol. 5, pp. 97--119,
1994.
[15] S. Donovan and J. Rosenberg, "SIP session timer," Internet
Draft, Internet Engineering Task Force, Oct. 2000. Work in progress.
[16] International Telecommunication Union, "Visual telephone systems
and equipment for local area networks which provide a non-guaranteed
quality of service," Recommendation H.323, Telecommunication
Standardization Sector of ITU, Geneva, Switzerland, May 1996.
[17] International Telecommunication Union, "Packet based multimedia [14] J. Rosenberg, J. Peterson, H. Schulzrinne, and G. Camarillo,
communication systems," Recommendation H.323, Telecommunication "Third party call control in SIP," Internet Draft, Internet
Standardization Sector of ITU, Geneva, Switzerland, Feb. 1998. Engineering Task Force, Mar. 2001. Work in progress.
 End of changes. 48 change blocks. 
190 lines changed or deleted 369 lines changed or added
