Monday, 26 March 2012, 13:00-16:10.
Note Takers: Brian Rosen, Charles Eckel
Chairs: Roni Even, Magnus Westerlund
The WG chairs provided a short status update after covering the Note Well and the agenda. The WG has had one RFC published and has three documents in IESG processing. draft-ietf-avt-srtp-ekt-03 has recently had some discussion on the list in relation to RTCWEB. draft-ietf-avt-srtp-aes-gcm-02 will progress after EKT. draft-ietf-avtcore-srtp-encrypted-header-ext-01 has finished WGLC; comments need to be addressed before publication. draft-ietf-avtcore-monarch-11 is in WGLC.
The WG chairs reviewed the milestones. A number of milestone dates had been missed, and an update with new dates was proposed. There have also been proposals for new milestones: an overview of RTP security solutions, RTP clock source signaling, and an SRTP crypto transform and key management based on the ARIA algorithm.
The chairs also informed the WG about the change in the Publication Request write-up (PROTO). This includes asking the authors to confirm IPR disclosures related to the document.
Magnus Westerlund also reminded the WG about the importance of filling in the blue sheets. One reason is so that we get the right sized room.
Harald Alvestrand presented his personal view on media flows, transport pairs and their relation to the RTP session. Since RTP and SDP were published, available bandwidth has gone up significantly, and RSVP and multicast haven't been deployed. The network can provide prioritization, but that is only needed when you actually want prioritization. Harald claims that RFC 3550 makes leaps of faith, one being that media types align with the desired prioritization. On Section 5.2, bullet 4 (Slide 6): blind mixing only works for audio conferencing. With video one would not want to do blind mixing.
Cullen Jennings asked about the leap of logic; he can't quite understand the argument. Harald answered that if you believe the statement, then a mixer with sufficient metadata could mix something nice out of 3 audio and 2 video streams. So if one doesn't do it blindly the statement is false; if one does it blindly the statement is irrelevant. Colin Perkins followed up on this question, saying that of course one can't mix blindly. The argument in the RFC is that, at the time of writing, the appropriate method of providing the needed metadata was using transport streams. Harald responded that he thinks that argument is wrong.
Harald moved on to Section 5.2, bullet 5 (Slide 7). The prioritization between media streams is not a function of top-level media types; it is a function of the desired functionality. Dave Oran asked for a clarification of Harald's assumption: does he take "multiple media" in the sentence to mean multiple media types (audio and video) or multiple media streams? Harald couldn't make heads or tails of it and has assumed multiple media types. Dave Oran then commented that he and likely many others don't buy the crux of the argument because they do not agree with that interpretation. Harald replied that interpreting it as multiple media streams would preclude multiple SSRCs in one RTP session, which no one has argued against. Roni Even commented that this list in RFC 3550 is in the context of a discussion of multiple media types, not streams.
Harald moved on to Corrective Action (Slide 8). He thinks there is no need to change RTP, and that RTCP isn't tied to the media type but rather to media stream asymmetry. What is needed is SDP corrections, and those are underway in draft-ietf-mmusic-sdp-bundle-negotiation-00. The floor was then opened for discussion.
Tholis Eckert (?) asked whether Harald had established any arguments for why life would be better with RTP changed the way he wants. Harald responded that this was started because Google saw a high failure rate when four transport flows needed to be established. This got much better by deploying RTP and RTCP multiplexing, when only two flows were needed. Now we know what the cost is, and therefore we want the possibility to go to one flow when it is appropriate.
Colin Perkins stated that had we had this conversation 15 years ago, he might actually have agreed with everything Harald said. Unfortunately we have a fair amount of history with RTP and SDP. RFC 3550 is not the only thing impacted by this. This can work in some cases, including the one Harald is interested in, but there are serious problems in other use cases, which Harald appears to ignore. We need to document when it works and when it doesn't, not only change the SHOULD in RFC 3550. There are drafts that say where it's a problem. Harald responded that on one level he does agree with what Colin said. The reason he went ahead with this is that there are cases which are argued against using this quote. This change of permission can't render any existing application unusable. Colin asked who is arguing against it; he has only seen recent arguments that it doesn't work in all cases. Harald: then we can finish bashing in this open door and document this.
Cullen Jennings wondered what Harald is asking for. According to the slide we are on track to do things with all the issues that need work. Hadriel Kaplan sarcastically commented that apparently that one use case is sufficient for a solution, while all others need to do things generically. But stepping back, draft-ietf-mmusic-sdp-bundle-negotiation-00 is far from done. Also, this is a proposal to change RTP, which is what Harald wants. Magnus Westerlund has another proposal for how to do multiple RTP sessions over a single transport (Solution B), which would be an alternative, and we can do that. Cullen commented that he doesn't see the problem. Hadriel responded that the advantage of multiple RTP sessions over one transport is that it supports other use cases, which also don't want to create multiple transport flows but do have multiple RTP sessions, for example to simplify transcoding of the audio and not the video. Just add a sixth tuple element so that one can separate them inside the 5-tuple used today. Cullen responded that he wants to send the same RTP packets he sends today on a single transport. Hadriel responded that this is actually not the same packet as was sent yesterday. For example, if using SRTP with a single RTP session there will be a single key for all media streams. Cullen responded that this is not in his slides. Cullen stated that Harald is arguing that all is fine and nothing needs to be done; Cullen doesn't think that is what Harald really wants.
Roni Even (as WG chair) commented that one needs to understand when it works and what the limitations are. In the end what is important is that one understands when to use a single RTP session and when there is a need for multiple sessions. Hadriel asked if Roni thinks both solutions are needed. Roni responded that even if only one is needed, it needs to be documented with its limitations. Colin Perkins commented that his proposal is that we do A and B and write yet another document to explain the trade-offs. Hadriel asked why we would do both if B solves all the cases and A has limitations. Colin agreed with that, but many don't see these limitations as serious. Roni commented that B doesn't use the same RTP as today. Colin concluded that we have been arguing about this for months and we are not agreeing. Thus let's do both and document the trade-offs.
Hadriel asked the proponents of A if they really care whether it is A or B: don't you just want the hammer? Is the one- or two-byte header that big a deal? Harald commented that on an architectural level we should not continue to enforce an unsound architectural choice. He desires that we get the consensus of the AVTCORE WG so that we don't need to have this discussion anymore. As a deployer he wants to put audio and video in the same session without getting grief from AVTCORE. Hadriel asked if Harald really cares about one or multiple RTP sessions. Harald answered one, due to lines of code. Hadriel responded that you need to support multiple RTP sessions anyway and thus already have the code. To add a session multiplexing layer would mean more lines of code. Hadriel responded that he believes that in the end it would be fewer lines of code. Today you should have no code, and supporting a multiplexing layer amounts to: if supported, read two bytes and dispatch accordingly. Harald responded: don't you think we have implemented the single-session solution after such a long argument? Cullen Jennings commented that he doesn't think this is a valid argument. We are arguing over 5 lines of code when we are dumping many thousands of lines of code into the browsers.
Moving on to the next topic.
Was skipped, to allow for continued discussion around single transport multiplexing, i.e. next topic.
Starting time 49 minutes into first audio recording.
Magnus Westerlund (as individual) gave a presentation intended to drive discussion and determine which work should be done. There would therefore be a quick consensus check early on, and unless clear consensus exists the discussion would go into the details. Slide 4 presents the options: 1) Define a mechanism that enables multiple RTP sessions over a single lower-layer transport. 2) Clarify the RTP specification and the operation of a single RTP session with multiple media types in it. 3) Do both 1) and 2). 4) Do nothing.
Cullen Jennings commented that he didn't get slide 4. Does option 1 include adding a mux header, with the second option using something like the payload type to differentiate the media types? Magnus responded yes; his main goal is to have full RTP sessions with full RTP functionality. Cullen responded that he doesn't know what an RTP session is. Magnus clarified that one really important function of option 1) is to have multiple RTP sessions, each with its full 32-bit SSRC space available, including the same SSRC value being used for different media streams in both sessions, which is not possible in option 2). Roni commented that this is part of the discussion in the multiplexing architecture document (draft-westerlund-avtcore-multiplex-architecture-01). Colin Perkins commented that the proposal in draft-westerlund-avtcore-transport-multiplexing-02 is to add a demultiplexing shim to realize option 1. The second option is basically to do what Harald proposes. Cullen asked for clarification of what Harald is proposing. Magnus suggested that we should go through the details of the proposed work on slide 5.
Magnus presented his and Colin Perkins' proposal to do option 3), i.e. both option 1) and 2). The details of what needs to be worked on are described on the slide. Cullen Jennings wondered what happened to the bundle proposal, i.e. take a bunch of stuff that is on different m= lines in the SDP and allow it to run on a single 5-tuple. Magnus answered that this is Harald's proposal, i.e. option 2. Cullen disagreed that there is a difference. Hadriel Kaplan supported Magnus' view. Cullen followed up asking whether two video cameras are two m= lines. Magnus responded that it is possible but not necessarily so, and that this has not been the main discussion point. The main discussion has been around having both audio and video in an RTP session. Cullen wants clarification on an easy thing: what about two video cameras? Magnus responded that this is not part of the discussion at all. Cullen responded that what he had understood from the neutral discussion was different from how bundle currently stands. Magnus responded that he thinks the issue Cullen is after is how one represents in SDP that one endpoint has multiple media sources. This issue hasn't been much discussed at all. Cullen commented that he thought this is what was discussed when we started down the bundle path. Magnus responded that he at least wasn't discussing this issue. Magnus tried to conclude that the question of how to represent multiple media sources in an RTP session in SDP will still be there independent of the choices and proposals in this slide deck. Colin Perkins commented that it is important to separate the signaling issue from the RTP-level issues. People have been putting multiple cameras in the same RTP session for 15 years. We might argue about how to signal this, but people have been doing it for a long time.
Colin Perkins reiterated the options. 1) We have the concept of RTP sessions, find them useful, and would like to keep them when transporting them over a single transport flow (5-tuple). There are also use cases for putting things together in the same RTP session that previously were not, i.e. option 2), which some find useful and can make work, like WebRTC. We also need a document that says what the issues and considerations are.
Harald stated that when he has been discussing bundle he hasn't been discussing whether an m= line should represent one, two or fifty sources of the same type. Magnus commented that it is good that at least Harald and he have been discussing the same thing.
Richard Ejzak asked why we can't have multiple unique sets of payload types and use these to determine which RTP session a packet belongs to when sending multiple RTP sessions over the same 5-tuple. Magnus answered that most implementations start at the port level, and the payload type indicates the encoding, which would force one to use port + PT to determine the RTP session. Colin Perkins commented that Richard should read draft-westerlund-avtcore-multiplex-architecture-01 for an extensive discussion of why it wouldn't work.
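The receiver-side difference between the two options can be illustrated with a minimal sketch. The shim layout used here (a single session-ID byte prepended to each RTP packet) is a hypothetical illustration only; it is not the format defined in draft-westerlund-avtcore-transport-multiplexing-02, and the function names are invented for this example. Only the fixed RTP header layout is taken from RFC 3550.

```python
import struct

def parse_rtp_header(packet: bytes):
    """Return (payload_type, ssrc) from the 12-byte fixed RTP header (RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("too short for an RTP header")
    b0, b1, _seq, _ts, ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    return b1 & 0x7F, ssrc

def demux_with_shim(datagram: bytes):
    """Option 1 (hypothetical shim): the first byte names the RTP session,
    so each session keeps its own full 32-bit SSRC space."""
    session_id = datagram[0]
    _pt, ssrc = parse_rtp_header(datagram[1:])
    return session_id, ssrc

def demux_single_session(datagram: bytes):
    """Option 2: one RTP session on the 5-tuple; only the SSRC
    distinguishes streams, regardless of media type."""
    _pt, ssrc = parse_rtp_header(datagram)
    return 0, ssrc

def make_rtp(pt: int, ssrc: int) -> bytes:
    """Build a bare RTP header for the example (version 2, seq 1, ts 0)."""
    return struct.pack("!BBHII", 0x80, pt, 1, 0, ssrc)

# Audio and video streams that happen to use the same SSRC value.
audio = make_rtp(pt=96, ssrc=0x1234)
video = make_rtp(pt=97, ssrc=0x1234)

# With the shim they stay distinct; in a single session they do not.
shim_keys = {demux_with_shim(bytes([1]) + audio), demux_with_shim(bytes([2]) + video)}
single_keys = {demux_single_session(audio), demux_single_session(video)}
```

With the shim the two streams map to two distinct (session, SSRC) keys, while in the single-session case they map to the same key, which is the SSRC-space point Magnus makes about option 1 versus option 2.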
Magnus proposed that we do a consensus check to see if the discussion so far has accomplished anything. Roni Even is the one who judges the consensus, if any, and he shouldn't call it unless it is very clear. Magnus reiterated the choices. Cullen Jennings commented that he doesn't know what any of these questions mean. That we want to work on something around multiplexing is the only thing that is clear, and it has been for several meetings. Roni Even clarified that all Magnus is attempting to determine is how to go forward. Cullen Jennings stated a proposal for the WG: let's take the requirements people have expressed and analyze the drafts against these. Stop talking about sessions and transport and answer simple questions such as: Can I send multiple video cameras over one UDP port? Can I offer multiple codecs for the same camera and switch between them without additional signaling? Can I run all my stuff on one port, two ports, N ports? Cullen couldn't answer these questions for any of the proposals. Those answers should be clear in order to go forward. Colin Perkins answered that the answer to the first two questions is yes and has been for the last 15 years. The answer to the third one is what we are trying to decide. Roni added that both of the solution options can resolve question three, but we are trying to determine if we should do either or both.
Hadriel Kaplan commented that he hasn't seen the advantages and disadvantages of the options. From his perspective option 1 has advantages; he doesn't know what they are for 2). Are we going to have any slides on the advantages and disadvantages? Are we voting without knowing the pros and cons? Magnus stated that he can present the limitations of the single RTP session, which are part of the slide deck. Hadriel commented that the only downside he knows of for 1) is the overhead (one or two bytes). That is what you are seeing on the wire, commented Roni. From the point of view of the receiver it is two different streams. Which, Hadriel commented, is exactly what he has today. They are just demultiplexing this two bytes in, which is why it is simple, following the KISS principle. Roni responded that today we have both the single-session (multiplex inside) and the multi-session (multiplex on transport) cases; the only limitation that exists is the SHOULD NOT in RFC 3550 about audio and video in the same session. Hadriel responded that this is the big rub, isn't it? There are reasons, and what Harald is saying is that these may not be applicable to his use case. And this is only a SHOULD. So people for whom a single RTP session with multiple media types is applicable can do it. Thus we would not need to develop what Magnus is proposing. However, there are still people thinking that it is needed... Hadriel commented that there are clear advantages to Magnus' proposal. Hadriel is flabbergasted that people don't understand how easy it is to add two bytes. Take the attribute issues we discussed in MMUSIC this morning: with Magnus' proposal these all go away, as each m= line is still a separate RTP session. Roni responded that this is a question for MMUSIC. There is the question Cullen raised of whether one describes each SSRC as an individual m= line or has multiple SSRCs in a given m= line. Hadriel responded that there are aspects that matter in this group.
For example, by having separate RTP sessions there is also separation of the RTCP reports. Thus the reports are orthogonal and independent between the RTP sessions. In the single RTP session case the reports need to cover different media types, and one issue is in fact how the payload type relates to what one reports on. You are mixing oil and water and trying to get mayonnaise.
Jonathan Lennox commented first that a lot of the disconnect between Colin and Cullen comes from two models. The first is the way RTP works in the heads of those who designed RTP, meaning that you can have up to 2^32 SSRCs in an RTP session. The second model is the one being used by SIP, which is one SSRC in each direction, end of story. Those who understand the second get confused by the first. Those who understand the first get confused when people make statements following the second, as the response from the first is that of course they can do that. Jonathan prefers the first model, which has many benefits; unfortunately a lot of SDP is designed according to the second model, which is broken. This is MMUSIC's problem. A second comment was to Hadriel: these two bytes are only two bytes plus having to rewrite anything that needs to look at the packet on the wire, such as Wireshark and deep packet inspection. It is mostly a question of who does the work: the people doing the media plane, or the ones doing the signaling plane. Hadriel clearly does not want it to be him; Jonathan clearly doesn't want it to be him either.
Paul Kyzivat commented that he doesn't know enough to choose between the two options, but is concerned about doing both. If we do both we will have many more interoperability problems. If one side does one and signals that, and the other side does the other, we will have a mess. Magnus responded that a class of applications actually has to agree on which method it is going to use to avoid that issue. An application or class of applications must determine if it can accept the limitations of a single RTP session or actually needs multiple RTP sessions. RTP is used in a wide variety of applications. Unfortunately a lot of them aren't represented in this room. Paul asked whether this assumes that there are applications that are firewalled from each other. Magnus responded that yes, there are such classes, like sensor networks or sensor data transport.
Cullen Jennings commented that he would like to try to up-level the discussion. Colin appears frustrated by the "this is a signaling problem, take it over there" response, and Cullen himself is frustrated that when he takes it to MMUSIC he is told to take it to AVTCORE, and in AVTCORE he is told to take it to MMUSIC. And we are the same people in both. RTP clearly is used in different contexts, and Cullen is interested in the use of RTP within RFC 3264 SDP Offer/Answer. In this discussion we come upon cases that are not compatible. It might be SDP O/A that is wrong, but that is irrelevant. We must consider what is built and deployed and what we need to interoperate with. That is the hard part of SDP and how to make this work together. Cullen wants to see clear descriptions of how we are going to interoperate with existing implementations. Magnus asked if what Cullen is suggesting is to continue working on both tracks to get the real details of both proposals. Roni commented that the choices and what needs to be addressed in MMUSIC will differ depending on whether we do only multiple RTP sessions over a single transport, only multiple media types in a single RTP session, or both. If we can come to a decision it would make things easier in MMUSIC. Magnus commented that if we follow the line Hadriel is arguing, doing only multiple RTP sessions over a single transport, then life in MMUSIC is likely easier. If we do both solutions, the choice between them will be out of our hands; it will be made in other places. When it comes to "Multiple Media Types in a single session", one way of looking at it is that it has really been SDP that enforced the SHOULD NOT, and as bundle is happening the limitation has already been released and it will happen. Personally, Magnus thinks the choice is between doing both or doing only multiple RTP sessions over a single transport.
Doing nothing would in fact mean that we need to go kill bundle in MMUSIC so that there is no way of signaling multiple media types for a single RTP session.
Hadriel Kaplan followed up on Jonathan's comment regarding Wireshark. Wireshark will have to be changed regardless of which we do. Jonathan responded that this is not true, as multiple SSRCs are already supported. Hadriel asked whether that is really true: can it really play back both audio and video in that single transport flow? [Confused, partially off-the-mic comments.] Roni commented that he thinks what we are trying to accomplish is to have everything working that is currently within RFC 3550 and in addition allow multiple RTP sessions to be transported on the same flow. Hadriel responded that this isn't true: you are trying to have multiple media types in a single RTP session. Roni disagreed. Magnus supported Hadriel's view and went on to comment that these are two different things, but they are brought together by the commonality of enabling multiple media types over a single transport flow. Hadriel followed up saying that there are two choices for solving that problem. We either have a single RTP session with multiple media types and get all the issues in MMUSIC, or we take what we have today and add an extra muxing field a few bytes in, a 6-tuple instead of a 5-tuple, with everything else staying the same. Of course it is going to be simpler for me, why do you think I am up at the mic? It will also be useful in other contexts, like SIP. Jonathan Lennox commented that this could be used in SIP. Hadriel responded: not realistically, assuming SBCs. Hadriel went on to say that he found Magnus' previous comments confusing: of course we should always try to do only one thing? Roni commented that the previous decision in RTCWEB was to start with the single RTP session and then develop the multi-session solution. Hadriel responded that this is not their decision to make; they have to follow the decisions made in AVTCORE and MMUSIC. If some vendors want to do it, then it is a proprietary solution. If I am held to the fire and have to come to this group and get things through, then they will have to do the same.
Cullen Jennings commented that it wasn't some random group that decided what Roni said; it was this WG. Magnus stated that the minutes from the Quebec IETF say that we are to do both. It was Magnus who brought up the question of doing only one solution in Taipei; otherwise we would be further along the track of doing both. Hadriel clarified that the fact that both will be in the marketplace anyway doesn't matter: we, the IETF, can decide which are the standardized solutions, and we can decide to do only one.
Roni commented that in terms of RTP nothing prevents you from having multiple media types in a single RTP session. It is an SDP issue. What we can do in this WG is capture what it means to do that. The text in RFC 3550 says that yes, you can do it, but be careful for the following reasons. Roni's interpretation of what Magnus says is that AVTCORE should document this. For the issues of doing it in SDP, we need to discuss that and see if it works. There might be other signaling protocols where this works.
Harald commented, with his process lawyer's hat on, that WGs don't get to override other WGs' decisions. The IESG is the one that gets to override. If RTCWEB makes a decision that is incompatible with AVTCORE's, then we have some ADs with a headache. With his personal architect hat on, Harald commented that doing only multiple RTP sessions over a single transport solves Hadriel's transport multiplexing issue, not Harald's architectural issues. The decision on top-level media type is still stupid and should be fixed no matter the cost. With the implementor hat on: we will get something that works. Colin responded: surely you are not proposing that we fix every stupid IETF decision?
Jonathan Lennox commented that we can either make RTP a bit uglier to work around SDP or make SDP uglier to work with RTP. Following the Alice's Restaurant principle that one big pile of garbage is always better than two smaller piles of garbage, Jonathan thinks we should put the garbage on top of the existing garbage and make SDP uglier.
Jonathan's second comment was that he doesn't get the cases where multiple media types in one single RTP session don't work. Either they haven't been well explained or they don't apply to real use cases. He is happy to do both as long as the use cases are good enough and someone else is happy to do the work. Magnus commented that he can present the two biggest limitations.
Magnus explained slide 10, stressing the case where one wants to establish a communication session with multiple participants, or a point-to-point session that goes through a media gateway. Assume that one client C and the central point support and enable multiple media types in a single RTP session. On the far side of the central point there are one or more clients not capable of using multiple media types in the same RTP session; instead they fall back to using different RTP sessions over independent transports. Thus there are SSRCs coming from different RTP sessions that suddenly need to function in the context of a single RTP session without collisions. If they collide, they force SSRC translation. SSRC translation is costly, as it forces re-encryption of the content and forces the translator to be aware of all RTCP extensions being used in the RTP session to guarantee correct translation. Roni commented that it is a mixer, so it shouldn't be a problem. Magnus answered that this is primarily an issue for nodes that don't do full processing of the media, such as transport translators, media gateways or RTP mixers that switch between sources. Jonathan commented that if these are offer/answer you can't do that anyway (see later presentation). Roni also commented that he agrees with Jonathan: RTP mixers by definition present a different SSRC space to each leg. But Magnus is correct for a gateway. Magnus responded that in RTCWEB it appears that the limitation Jonathan talks about isn't going to exist, due to JSEP. Cullen Jennings asked why one can't rely on the SSRC collision mechanism. Magnus responded that it might not work: if client A uses the same SSRC in both RTP sessions, which is allowed as they are different sessions, then it will not see a collision. It is the merger of the RTP sessions that causes the collision.
Colin Perkins also commented that there exists signaling that overrides the collision detection rules in RTP. Cullen responded that, as Jonathan said, if people are doing offer/answer this is not a problem. Colin responded that it is a problem if one side uses a=ssrc and the other doesn't.
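The gateway problem Magnus describes can be sketched in a few lines. This is an illustration under stated assumptions, not an implementation of any draft: the session names and SSRC values are invented, and `find_merge_collisions` is a hypothetical helper. It shows that SSRC values which are perfectly legal in two separate RTP sessions collide only once a middlebox merges those sessions into one.

```python
def find_merge_collisions(sessions):
    """Given a mapping of session name -> set of SSRCs in use, return a dict
    of SSRC -> list of session names for every SSRC that appears in more than
    one session. These are the values a gateway would have to translate
    before merging the sessions into a single RTP session."""
    first_seen = {}
    collisions = {}
    for name, ssrcs in sessions.items():
        for ssrc in sorted(ssrcs):
            if ssrc in first_seen:
                collisions.setdefault(ssrc, [first_seen[ssrc]]).append(name)
            else:
                first_seen[ssrc] = name
    return collisions

# A legacy client uses separate audio and video RTP sessions and happens to
# pick the same SSRC in both, which is allowed because the sessions are
# independent. (Values are made up for the example.)
legacy_client = {
    "audio": {0xAAAA0001, 0x00000010},
    "video": {0xAAAA0001, 0x00000020},
}

collisions = find_merge_collisions(legacy_client)
# The client itself never observes a collision inside either session;
# the clash only exists at the node that merges them.
```

In this sketch the client would see nothing wrong in either of its own sessions, which is why relying on the normal SSRC collision mechanism, as Cullen asked about, may not be enough at the merging node.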
Richard Ejzak asked whether, if there is negotiation between the client and the central node, it is then clear if it shall be used or not. Magnus responded that it then becomes a deployment problem. If one wants to use a single RTP session, one has to forklift-upgrade the clients. The reason is that the transport is negotiated individually between each client and the central node. The goal is to agree locally whether it can be used or not. Otherwise one needs to know that all clients that may join this particular session will support using a single transport.
Roni Even commented that we are running out of time and should try to give some guidance on how to proceed. Magnus brought up the Options slide and asked if anyone supports the option to do nothing. No one supported that. Cullen commented that his understanding of the previous agreement is to do multiple media types in a single RTP session and then continue working on enabling multiple RTP sessions over a single transport. There is too little clarity to make a decision.
Justin Uberti commented that he thought we had agreed that we have one proposal, "Multiple Media Types in a single session", that works for a set of interesting use cases. We can clearly work on another solution that works for all use cases, but he thought we had agreed on doing "Multiple Media Types in a single session". Magnus commented that he thinks both Justin and Cullen are in fact arguing for the proposal on slide 5. Justin responded that the agreement was for multiple SSRCs and a single port, and potentially multiple media types. Colin Perkins commented that, as he remembers it, we go forward with the proposal for multiple media types in a single RTP session, and we also bring forward a proposal for multiple RTP sessions over a single transport. That is what we are doing, and we are saying that we should go forward with both of them.
Roni Even stated that we can't guarantee that we will in the end be able to support all of them. After the AVTCORE WG makes a decision and has described the RTP aspects, we will have to specify how to use them in SDP, and that will not happen here. We can only describe how to do both using RTP, not the signaling.
Jonathan Lennox commented that, with the exception of Hadriel, we appear to have some consensus for doing Option 3 (both), modulo the understanding of what option three is, including continuing with an individual submission for multiple RTP sessions over a single transport.
Roni Even stated that for the single session the architecture document will need to be expanded to describe when it can be used, without bias toward multiple RTP sessions. Multiple RTP sessions need a separate draft. Jonathan Lennox commented that this clearly needs normative work. The single RTP session needs removal of the SHOULD and some descriptive work. Roni followed up that this means it is an update to RFC 3550 in the end. Magnus stated that his understanding is that people are saying to do both independently. It was proposed that the main proponents of the different views meet and figure out the document structure and who does the work. Roni asked if anyone objected. Documents will not be adopted at this stage; we first need to figure out the right set of documents. We are to progress with the assumption that we will update RFC 3550 with regard to the SHOULD and also continue with draft-westerlund-avtcore-transport-multiplexing-02. That is the direction we are going to take. Brian Rosen asked whether the part that describes a single RTP session with multiple media types needs to discuss its limitations. Roni responded: certainly. Roni concluded that we appear to have a consensus on how to progress. We also need to discuss with MMUSIC how to deal with the SDP aspects.
Starts at 1:45:00 in the first audio recording
Colin Perkins presented. The motivation is that RTCWEB is talking about congestion control; there are some proposals, but it is a difficult problem. This work is about conditions which, if encountered, everyone in this room would agree are problematic and under which the sender should therefore stop. Anyone using congestion control would always be within these limits.
Colin will also be talking about this in ICCRG's second session on Tuesday.
The base assumption is RFC 3550 RTCP information. Future work will look into refining this using RTP extensions. Getting RTT/jitter every 5 seconds is too seldom. Packet loss statistics every 5 seconds are not good enough for congestion control, but are probably good enough for overload detection. Two circuit breakers are proposed. The first one is a timeout: no increase in the RTCP extended sequence number for two reporting intervals. The second breaker trips if the flow gets more than 10 times what the TCP throughput equation says an equivalent TCP connection would get.
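The two proposed breakers can be sketched roughly as below. This is only an illustration under stated assumptions, not the draft's algorithm: the minutes do not say which form of the TCP throughput equation is meant, so the sketch assumes the TFRC form (as in RFC 5348) with b = 1 and t_RTO = 4 * RTT, ignores sequence number wrap-around, and uses hypothetical function and parameter names throughout.

```python
import math


def tcp_throughput(packet_size, rtt, loss_rate, b=1):
    """Assumed TFRC-style TCP throughput equation, in bytes per second.

    packet_size is in bytes, rtt in seconds, loss_rate is a fraction (0..1).
    """
    if loss_rate <= 0:
        return math.inf  # no reported loss: the equation gives no upper bound
    t_rto = 4 * rtt  # assumption: common simplification for the retransmit timeout
    denom = (rtt * math.sqrt(2 * b * loss_rate / 3)
             + t_rto * (3 * math.sqrt(3 * b * loss_rate / 8))
             * loss_rate * (1 + 32 * loss_rate ** 2))
    return packet_size / denom


def timeout_breaker(ext_seq_history):
    """Trip if the extended highest sequence number reported by the receiver
    has not increased over the last two reporting intervals (three reports)."""
    if len(ext_seq_history) < 3:
        return False
    return ext_seq_history[-1] <= ext_seq_history[-3]


def congestion_breaker(send_rate, packet_size, rtt, loss_rate, factor=10):
    """Trip if the sending rate (bytes/s) exceeds `factor` times the rate an
    equivalent TCP connection would get according to the equation above."""
    return send_rate > factor * tcp_throughput(packet_size, rtt, loss_rate)
```

With 1200-byte packets, a 100 ms RTT and 1% loss, the assumed equation gives roughly 135 kB/s, so under these assumptions the second breaker would trip somewhere above about 1.35 MB/s.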
There are open issues that need to be discussed. These include the assumptions behind the circuit breakers and whether the breakers themselves are appropriate. The second major issue is around using any available RTP extensions.
Cullen Jennings commented that both look great. They are easy and implementable, exactly what we need for circuit breakers. He thinks Cisco tried this in their Telepresence equipment and found that a single packet loss could mean that HD rates were no longer possible. Colin responded that this might be correct, as the model is standard TCP, which does not do high rates very well. But it is something we need to look into.
Flemming Andreasen asked if this assumes that RTP packet losses are due to congestion. Colin responded yes, and clearly this is not always the case. But if one can't make that assumption, how do we do a circuit breaker at all? Flemming responded that this might mean that it isn't applicable in all cases. The draft needs to be clear on this.
Harald Alvestrand asked if Colin has looked at some example numbers yet. Colin responded no, not yet. Harald had a second question regarding the packet size: it appears odd that we would get lower throughput if we use smaller packets. Colin responded that there was the reduced packet size work for TFRC and we should consider taking that into account. Harald proposed that we could use the MTU for that. Colin responded that this is a slightly bigger issue and there was discussion for DCCP around this. He will look into it. Harald commented that it probably doesn't matter much if we are aiming at hitting within a factor of 10 for this. Magnus commented that the question comes down to whether one is byte or packet congested. Colin followed up that the things that use small packets are commonly audio, which has a lower bit-rate than video and is thus less likely to cause congestion.
Toerless Eckert asked what the magic is with 10 times the throughput. Colin responded that it is a number picked out of thin air. Toerless asked if it is the fairness argument, that RTP can do 10 times what TCP could over the same path. Colin responded that the proposal is not a congestion control algorithm; it is a "stop, you are breaking things" algorithm. The factor 10 is a number that makes it clear that the flow gets significantly more than what TCP gets and is thus a clear sign that something is wrong. Toerless commented that 10 times is so unfair that you should give up. Colin stated that 10 is not set in stone; it is a starting point for the discussion.
James Polk stated that he had sent an email to the RTCWEB Congestion list (firstname.lastname@example.org) and wondered if that was the right list to provide feedback. Colin thinks that list is the right one. He will reply to the question, which was whether the solution is for best effort only, and what happens if the traffic is not best effort and multiple DSCPs are being used.
Magnus Westerlund (as chair) stated that this work appears to be very relevant for this WG. It will likely end up as part of congestion control discussions in a number of RTP extensions etc. This will be the outer limit on what can be done, and hopefully in the future we will see fairer and better algorithms. Colin responded that what is important is that we agree that this type of draft is what we need to do. Then we can argue about what the right content is. James Polk commented that if the transport area is spinning up an RTP congestion WG then this work will belong there. Magnus commented that until that happens AVTCORE appears to be a good home. The goal with this was to produce something quickly that RTCWEB could reference. We hope that the future WG will produce something that we can reference more generally in these cases. James asked Harald if he is motivated to get the WG spun up by Vancouver. Harald acknowledged that.
Presentation starts 1:59:00 into the first audio recording.
Jonathan Lennox presenting RTP Topology considerations for Offer/Answer initiated sessions. Offer/Answer (O/A) has two modes, one for multicast addresses and another for unicast (the more common). Jonathan has analyzed the topologies in RFC 5117, and all except one can be assembled from Offer/Answer unicast; the exception is Topo-Transport-Translator. The Topo-Transport-Translator is basically a relay of packets from one participant to all the others. It behaves like a multicast session would. For example, all participants need to have a single view of the session. The rest of the topologies do some level of modification to the media, RTP packets or RTCP reporting. In essence the middlebox ensures that a given participant can understand what it is sent from the other participants.
So why doesn't Topo-Transport-Translator work with O/A? The topology requires a common session configuration, whereas unicast O/A allows each end-point to establish its own view of the session. Some examples from the core of SDP O/A are 1) Bandwidth. When you send an offer you express the bandwidth you want to receive. When the answerer sends an answer it expresses what it wants to receive. These values are used to determine the RTCP bandwidth using the 5% rule. If the two end-points express significantly different values you end up with different RTCP reporting intervals and timeout values. In a session where a common view needs to be established, no one can tell what the common value should be, as there is no obvious answer. 2) Media Types. Participants can remove media types (Payload Types) from an offer in the answer or in an updated offer. An answer can even renumber the payload type numbers used for a given configuration. 3) There are additional parameters, such as ptime and many SDP extensions, that will have issues.
The above is a good thing, as you know that you are in fact only talking to one other party. That party may be forwarding or rewriting other participants' media, but you can optimize for this case. Thus you can make optimizations which aren't possible if you are blindly talking to a large group of participants. Colin Perkins commented that this holds for everything except transport translators and multicast. Jonathan agreed, as multicast does have special rules in SDP O/A and an end-point can detect that. It doesn't know about transport translators.
When you have asymmetric bandwidth, the optimization you can do is to have the peer use 5% of my receiver bandwidth towards me, and I will use 5% of your receiver bandwidth towards you. The timeout is calculated accordingly. This also resolves the issue that, if A uses one SSRC and B uses 5 SSRCs and B then adds another SSRC, A's RTCP bandwidth is reduced even though it is not shared with any other SSRC in that direction. They are not rivals for the same bitrate capacity.
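A small sketch of the per-direction calculation described above, as an illustration only. The function name, the dictionary layout and the even split of each direction's budget among the sending end-point's own SSRCs are assumptions made for the example; the 5% fraction is the rule mentioned in the presentation.

```python
def per_direction_rtcp_budgets(a_recv_bw, b_recv_bw, a_ssrcs, b_ssrcs,
                               fraction=0.05):
    """Return the RTCP bandwidth available per SSRC at end-points A and B.

    a_recv_bw / b_recv_bw: bandwidth each end-point asked to receive (bits/s).
    a_ssrcs / b_ssrcs: number of SSRCs each end-point is using.
    """
    # RTCP sent by A travels towards B, so it is sized from B's receive bandwidth.
    a_to_b = fraction * b_recv_bw
    # RTCP sent by B travels towards A, so it is sized from A's receive bandwidth.
    b_to_a = fraction * a_recv_bw
    # Assumption: each direction's budget is split only among the SSRCs that
    # actually send in that direction.
    return {"A": a_to_b / a_ssrcs, "B": b_to_a / b_ssrcs}


# A has one SSRC and B has five; then B adds a sixth.
before = per_direction_rtcp_budgets(2_000_000, 200_000, a_ssrcs=1, b_ssrcs=5)
after = per_direction_rtcp_budgets(2_000_000, 200_000, a_ssrcs=1, b_ssrcs=6)
```

In this sketch A's share is the same in `before` and `after`, because B's additional SSRC only competes with B's other SSRCs in the B-to-A direction.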
RTCP receiver reports. In an RTP session all receivers send reports on all senders. In a many-to-many session the number of reception reports grows quadratically with the number of session members. If you are actually talking to the same person they are basically useless. Colin Perkins asked for clarification that they will be stacked up in the same packet. Jonathan agreed, but when you are on the order of 150 participants they can't be stacked up into the same packet due to the MTU. Colin commented that in that case you round-robin between the SSRCs on which you report. Jonathan responded that this starts to slow things down. The point is that the middlebox is consuming most of the RTCP reports itself. So the current rule uses up RTCP bandwidth that could otherwise be used for more useful and timely things. If you have multiple sources coming from the same end-point, having them report on each other is completely useless.
Normative recommendations for unicast SDP offer/answer
- RTCP bandwidth and timeouts MUST be calculated independently in each direction
- endpoints SHOULD NOT send reports from their own sources
- endpoints SHOULD pick a single reporting source to send reception reports
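To make the scaling argument concrete, the following sketch counts reception report blocks for one end-point talking to one peer. It is a rough illustration with hypothetical function names; it only counts blocks and does not model RTCP timing or packetization.

```python
def blocks_every_source_reports(local_ssrcs, remote_ssrcs, include_local=True):
    """Report blocks when every local SSRC reports on every source it sees.

    With include_local=True the local SSRCs also report on each other.
    """
    per_reporter = remote_ssrcs + (local_ssrcs - 1 if include_local else 0)
    return local_ssrcs * per_reporter


def blocks_single_reporter(remote_ssrcs):
    """Report blocks when one local SSRC reports on the peer's sources only."""
    return remote_ssrcs


# The 150-SSRC case raised in the discussion.
cross_only = blocks_every_source_reports(150, 150, include_local=False)
with_local = blocks_every_source_reports(150, 150, include_local=True)
single = blocks_single_reporter(150)
```

Here `cross_only` is 150 * 150 = 22500 blocks and `with_local` is 150 * 299 = 44850 blocks, against 150 blocks with a single reporting source.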
Colin Perkins agrees that there is an issue with RTCP bandwidth and asymmetric paths. We have known about this for a long time, but no one has been annoyed sufficiently to do something about it. For the other two points, he believes that the end-points must send reports. The whole point of the transport translator is that it behaves as a multicast group, so don't try to change it; if you do not want this, use another topology (e.g. terminate the RTP session on the mixer). Jonathan responded that the point is that I know I am using one of the other models. Colin replied: then explicitly use one of the other topologies so that you don't see all the other SSRCs and don't have to report on them. Don't change the definition of RTP so it breaks. Jonathan responded that even if you are terminating RTP sessions on the mixer but are sending a lot of sources in each terminated session, you still have the same problem. If I have 150 SSRCs in a session and you're the peer with 150 SSRCs, we do have 150^2 reporting blocks. Colin: but then use an RTP mixer so that you don't have as many SSRCs. Jonathan responded that we are not using a mixer; I am exaggerating a bit, but we are seeing this problem. Colin Perkins responded that, having built such a system 15 years ago without any issue, what is the problem? Bandwidth is cheap. Jonathan responded that he doesn't like complexities that are O(N^2), and the bandwidth ends up being wasted or stolen from more important things. Colin responded: then configure your bandwidth fraction sensibly. Jonathan responded that he can't send his feedback in a timely and frequent fashion. Colin responded: you have 150 reporting intervals, send it on any of them. Use AVPF; he doesn't see the problem.
Joerg Ott commented that you might want to engineer this off-line. This looks interesting, but he doesn't know if these are the right suggestions. Joerg has looked at more complex overlay structures using RTCP point to point in segments. Jonathan responded that he wants to know more about Joerg's use case.
Magnus Westerlund (as individual) thinks the problem should be split up. We should at least clarify how entities with multiple sources should behave. The second two bullets (Slide 15) apply to all the topologies. Jonathan responded that there is a difference between transport translator and multicast and the rest. Magnus followed up that one of his main concerns is that we shouldn't have different behaviors depending on topology. If one clarifies how any end-point with multiple SSRCs is to behave, it will apply to all topologies. Magnus also supported looking into the asymmetric bandwidth, but the WG should target generic solutions that aren't specific to offer/answer. Please remember that we have other usages of RTP that aren't established using O/A signaling. Jonathan asked if they apply asymmetric bandwidth. Magnus responded yes. Jonathan commented that he thinks in that case you can't do a transport translator. Magnus thinks dividing the proposals up into independent pieces, which can each be discussed on their own merits, is the way forward.
Harald Alvestrand asked if the last two bullets on slide 15 are clarifications or changes. Jonathan responded that this is probably a question for debate. Jonathan has a way of constructing it so that it is within the spec if you assume a weird internal topology. Harald responded that he has worked with an engineer trying to interpret the RFC 3550 rules, who several times thought "they can't mean that". So please clarify. Jonathan followed up that the bullets can be interpreted so that the local SSRCs can't see each other and therefore can't report on each other. The issue is that if the other side looks at this, it can think the network is broken as 2/3 of the sources can't see me. If they know you are doing this it is not a problem. Colin commented that this is the issue. There are devices that will think this is broken. Jonathan asked whether they are devices that can use offer/answer. Colin responded that he doesn't think O/A is the right thing to think about. Joerg Ott commented that you need to clarify what you think your co-located sources are. There is work on SIP end-point decomposition ongoing, so you don't quite know how well the thing you suppose is sitting next to you really can talk to you or not. Please keep this in mind when you go through this.
Starts 19 min into the second audio recording.
Joerg Ott presenting Multipath RTP proposal. The authors have received quite a lot of comments, mostly about signaling. Recap and summary of updates since -02.
Two mechanisms to convey interface information: an RTCP-based in-band one and an SDP-based out-of-band one. SDP takes precedence over RTP.
Implemented multipath to test a bandwidth distribution algorithm. Results of single path vs. multiple paths: if there is enough capacity, both work well. Similarly, as loss increases, multipath helps.
- Turn into WG item?
- Add security considerations.
- Double-check with mmusic.
Currently the draft is targeted for experimental RFC.
Roni as Chair: Is the AVTCORE WG interested in a WG item working on an experimental multipath solution? Jonathan Lennox commented that this is more forward looking than most care about immediately, but it will be useful in the near future. Inclined to support this work. Joerg Ott added that multicast overlays for streaming are one additional target deployment. Magnus, as Chair: not strong support, but will check on the list as well. Having someone drive the work is necessary, but we also need people from the community to review and comment on it.
Starts 28 minutes into second audio recording.
Aidan Williams presents the new draft on RTP Clock Source signaling. The draft is a result of discussion at IETF 82. Several people had interest in this, so making it a general building block is a good idea. IDMS and AVB sync both had SDP signaling for clock source, which has been combined into this document. A mechanism is needed to provide explicit SDP signaling of the NTP timestamp clock source.
The motivation is that a single reference clock is used to create NTP timestamps in RTP protocols in an end-point, but there is currently no indication of which clock is being used. Use cases include social TV, video walls, networked audio, and sensor arrays.
Using SDP to let end-points describe what clock source they are using: IP addresses, EUI-64, ... A definition of equivalent clocks is also given. The draft can describe various clock types: NTP, the 1588 family, GPS, local. There is also a concept of a traceable clock, i.e. one that you can trace back to a global time source.
Going through the example on Slide 7. Ted Hardie asked: if you have the NTP attribute on one m-line, shouldn't it be on all of them, i.e. at session level? NTP is after all a system level time source. Aidan responded that this is an artificial example on the slide. Commonly you would have one for the whole session. Ted followed up that he wanted to be certain that a receiver of the SDP would not make a decision on the m= lines based on time source, as the same time source would be available to all m= lines. Aidan responded that for a given system, yes. We want to be able to detect systems that aren't using the same clock source or master clock. Ted asked whether, in this emitter case, one is presuming that a video source may have a different time source than the general system time source. Aidan confirmed that this could occur. Jonathan Lennox commented that in the case of disaggregated media you might not actually want different clock sources and should actually raise an alarm. Ted asked if there is a method for interrogating the sources of disaggregated media for their time source. Aidan responded that he was not aware of any standard mechanisms.
Toerless asked if this is expected to become a reference for any upcoming work. This work appears to allow end-points to determine if they are synchronized or not in a binary fashion, possibly even how desynchronized they are. Is it up to these end-points what they do with that? Aidan responded yes, that is what it provides. IDMS is the most immediate use case. Aidan clarified that he himself is primarily interested in the synchronized audio playback case. Toerless asked if the end-points will have well defined behavior depending on whether they are synchronized or not. Aidan responded yes; for synchronized audio playback a device would probably mute, but this will be application dependent. Toerless asked if this actually helps if one only has a local clock. Can I calculate the drift and use that somehow? Aidan responded that what really helps is knowing whether the sender and receiver have a synchronized time base or not.
Magnus as Chair asked the WG if there is interest in working on a solution within IETF in this WG. There was some interest indicated through raised hands. The reason for asking the question in very general terms was that the chairs need to discuss with the ADs and the MMUSIC WG chairs whether it is appropriate to take the work on.
Starts 43 min into second audio recording.
Ray van Brandenburg provided a short status update. Using synchronized clocks with IDMS improves performance but is not necessary. Clock source signaling has been moved out as an optional but normative reference. Risk that this delays IDMS. Other changes are clarified terminology, dealing with leap seconds, and general editorial clean up. Think it is ready for WG last call.
Colin Perkins asked if the leap second issue is IDMS specific or a general clarification to RFC 3550. Ray responded that it is applicable to any use case requiring tight synchronization. It could be a clarification of RFC 3550. Currently there is a general remark in the clock source draft and a more specific one in IDMS. So based on your comment we might need an even more general one for RFC 3550. Colin responded that yes, if this will affect others than IDMS, then it should be a very short draft updating RFC 3550 that IDMS can reference. Ray agreed to look into creating such a draft.
Glen Zorn was curious why the reference to the clock source draft needs to be normative. Ray responded that his interpretation of the normative vs. informative guidance is that an optional but performance-improving part should be a normative reference. Glen commented that the clock synchronization is optional. Magnus Westerlund (as chair) stated that even for optional parts, if you need to understand the reference to implement it, it is still a normative reference. Glen was skeptical: a reference is clearly normative if you need to understand it to implement the document and be compliant, but in this case it is optional and thus you don't need it to be compliant. Magnus responded that it depends on the use case, and an implementer may actually have to include this optional part.
Roni Even concluding as chair: The first thing is to write up the leap second issue as a separate draft applicable to RFC 3550. Secondly, regarding the WG last call, we can go ahead without clock source, but the final publication will be held until the clock source draft is available.
Starts 51 min into the second audio recording.
Ali Begen presented this individual proposal, which was last presented in Taipei. Duplication of RTP streams and RTCP reports is not well specified; this draft addresses that need. Temporal (time shifted) redundancy use case: the SDP uses SSRC grouping to indicate the duplication. When doing spatial redundancy the SDP uses media stream (m=) grouping, plus cname and srcname (draft-westerlund-avtext-rtcp-sdes-srcname-00) to indicate which SSRCs in the different RTP sessions are duplicates of each other. This is necessary if there is more than one SSRC from a given cname.
One remaining issue: if stream merging happens inside a network element, how to deal with it. Propose treating it as a mixer. Roni Even commented (as individual) that it appears to be the same issue as for the AVTEXT WG item draft-ietf-avtext-splicing-for-rtp-07. Colin Perkins commented that both a mixer and a translator would work; there are slightly different trade-offs. The mixer makes it visible and lets you report, and the translator makes it invisible and doesn't let you report. Jonathan Lennox commented that it appears that what is happening is that one takes two streams and mixes them. Unless you deliberately try to hide this, use a mixer. Roni (as individual) asked for clarification that, when Ali said merge, the resulting stream will include packets from both streams. A mixer will result in only one outgoing SSRC.
Ali asked for adoption of this draft in the AVTCORE WG. They are already working on getting the signaling part adopted in MMUSIC. Magnus Westerlund (as chair) asked Ali if he believes that the push back raised previously about the utility of duplication has been resolved. Ali responded that the push back has been around why you wouldn't use FEC instead, and that has been well explained in the draft. Roni also commented that another issue was around congested paths. Ali commented that addressing congested paths is not the intended usage. Magnus called for a hum of who is interested in this as a WG item. A fair number of people supported a WG item, none were against; the chairs will take it to the list.