RTCWeb Minutes IETF 83

Part 1
Wednesday 28th of March 9.00-11.30
Note takers: Roni Even, Serge Lachapelle
Jabber Scribe: Olle E. Johansson

Admin Update (WG Chairs)

The WG chairs went through the agenda. No one wanted to bash the agenda.

Magnus Westerlund reported from the AVTCORE and MMUSIC WGs. In AVTCORE there was significant discussion around transport multiplexing. Roni Even (AVTCORE WG chair) has called a rough consensus on the mailing list. That means that the AVTCORE WG will create a document updating RFC 3550 with regard to multiplexing different media types over one RTP session. In parallel it will continue to discuss whether to create a mechanism enabling multiple RTP sessions over a single transport. There was also strong interest in clarifying RTP behavior when an end-point has multiple sources (SSRCs) within an RTP session.

Magnus continued to report on the MMUSIC WG. The MMUSIC WG item draft-ietf-mmusic-sdp-bundle-negotiation-00 was discussed, and the main topic was how to handle parameter duplication between the different m= lines in a bundle. The individual proposal in draft-alvestrand-rtcweb-msid-01 was also discussed. Any defined parameter should be usable beyond the W3C context if it needs matching semantics. It was also clear that the relationship to the W3C API needs to be documented. An alternative approach was also proposed using grouping and the a=label attribute at SSRC level.

Marc Petit-Huguenin said a few words regarding draft-petithuguenin-behave-turn-uris-01 and draft-nandakumar-rtcweb-stun-uri-01. The drafts have been updated, and the next versions will make additional changes suggested by reviewers, such as removing the TURNS and STUNS schemes and updating the parameters for indicating which protocol is to be used. Ted Hardie asked the room how many had read the drafts. Not particularly many had read them. Ted encouraged people to review them and provide feedback as they are relevant for RTCWeb.

W3C Update (Harald Alvestrand & Stefan Hakansson)

Harald and Stefan presented a report from the W3C WebRTC WG. The WG's task is to define the API. There are two implementations of some subset of the W3C specifications, in Google Chrome and Ericsson's Leif browsers; Firefox and Opera have also shown GetUserMedia functionality. The W3C WG has been partially split: the GetUserMedia part for accessing cameras and microphones is now being worked on by a W3C task force. The WebRTC WG only uses the public mailing list and there is an open archive.

The MediaStream API is relatively stable. The DATA API has been incorporated into the PeerConnection API. There are two proposals for JSEP integration, and a choice between them will be made soon. A Hints & Capabilities API is waiting to be integrated. A Stats API is also proposed but not integrated. There are people who want the GetUserMedia functionality prior to PeerConnection. Hopefully PeerConnection will not be delayed compared to the GetUserMedia part. However, parts of the document might be changed to facilitate progressing GetUserMedia.

Harald also informed the WG about draft-burnett-rtcweb-constraints-registry-00, which defines an IANA registry for constraints. Please review this and provide feedback. It is expected that this will be progressed as an individual submission on the standards track.

Igor Faynberg asked two questions. Is there an API for security, like handling SDES keys? And secondly, how do you see the coupling of the work between IETF and W3C? Harald responded that their implementation has security turned on, and as soon as DTLS is available in the right libraries that will also be turned on. There exist no methods for tuning security so far, and unless there are strong compelling reasons it is not likely to happen. On the second question, there is a large overlap of people between the two groups, so it is mostly a question of finding the most suitable place to put text.
If it is not obvious how to link the API with the protocols then we have failed.

? (Qualcomm) asked about the view on declarative media capture in the API and the PeerConnection work. Harald responded that declarative media capture is not in the WebRTC WG. His personal view is that it is pretty useless. Declarative capture is for taking pictures and has little use with the PeerConnection.

RTP Congestion Control
draft-perkins-avtcore-rtp-circuit-breakers-00
draft-alvestrand-rtcweb-congestion-01

Magnus Westerlund also informed the WG about the discussion in the ICCRG about RTP congestion control. There was no BOF at this meeting, but one is expected to happen at the next meeting (Vancouver) with the goal of forming a WG for a long-term solution. There was also discussion of the RTP circuit breaker draft in the AVTCORE WG. Those interested should join the AVTCORE WG mailing list (avt@ietf.org) for the circuit breakers and rtp-congestion@alvestrand.no for discussion of RTP congestion and BOF preparation.

Identity Proxy (Eric Rescorla)
draft-ietf-rtcweb-security-02
draft-ietf-rtcweb-security-arch-01
draft-rescorla-rtcweb-generic-idp-02

Eric Rescorla (EKR) presented a proposal for an RTCWEB Generic Identity Service. The stated goal of the proposal is to enable a communication session to be secured and authenticated with the callees' identities, where those identities are among the relatively few identities the callees use on the web. The relation between the entities is presented on slide 3, showing that Alice and Bob can have independent identity providers. The three main entities are the Identity Provider (IdP), the Authenticating Party and the Relying Party. The relying party should not need to do any login at the IdP, while the authenticating party does need to log in to prove its identity. There are a number of Web-based IdP systems in use today, such as Facebook Connect, Google Login and OpenID. EKR walked through examples of how Facebook Connect and BrowserID function.

Charles Eckel asked, in the context of Facebook Connect, about a common attack: a page claims to link to one thing, but what you really connect to is some other site. EKR confirmed that this is an issue. A common mitigation is to always open the login at the top of the page to ensure the address bar is verifiable.

EKR continued going through BrowserID. Igor Faynberg wondered (slide 13, BrowserID) whether, as the JavaScript (JS) is generating the keys, he needs to trust the JS with his private key and trust it not to leak the key to the relying party. EKR responded that the key is firewalled off and the RP can only request an assertion of the authenticating party's identity.

EKR proposed that we use existing identity infrastructure to authenticate WebRTC communication sessions (calls). The aim is to enable use of existing accounts, minimal changes to IdPs, and generic support in browsers. He went on to exemplify this using BrowserID with WebRTC (slide 16).

Andrew Hutton asked how long this takes. EKR responded that it depends, but in most cases not more than one extra round-trip time. In some cases it might not take any extra time: when the caller (authenticating party) has already logged in and the relying party (callee) has the verifier's public key, there are no extra round trips.

Oscar Ohlsson followed up, wondering whether the callee (Bob) is required to log in to his IdP in advance to assert his identity when answering. EKR responded that this is preferable, as otherwise Bob will have to log in as it starts ringing on his side. You want to streamline this process so you don't delay media establishment. Another possibility would be to start the call and indicate that an identity assertion is coming.

Huilan Lu asked if there is a per-session expiration of the certificate generated by the IdP. EKR responded that this is up to the IdP and is something that is being discussed. Current expiration is some hours, commonly 12 hours.
It is also possible to give the login cookie a longer expiration than the certificate, as generating a new certificate only takes one round-trip time.

Martin Thomson commented regarding the login at answering time. As you would expect most applications to require login at application load anyway, it is likely to encourage logging in, at startup time, to any identity provider you intend to use. EKR responded that in most cases you would log in using the same identity that you are going to assert when responding to the call.

Tom Lowenthal commented regarding the certificate expiration that in the last version it was 12 hours, but they do want it to be as short as possible. One doesn't want to have to deal with revocation handling for the certificates. Thus, each time the IdP manages to handle this more smartly, it can reduce the time the certificate lives.

Huilan Lu asked if the signing uses a fixed algorithm (slide 18). EKR responded that this depends on the IdP's local implementation. The assertion does not know how it is verified, just that it was verified.

EKR went on to discuss what needs to be defined (slide 20). Partha Ravindran asked whether it is assumed that DTLS-SRTP is mandated, or whether the mechanism also works with RTP. EKR responded that unless you use DTLS-SRTP the identity mechanism is of no use. Cullen Jennings followed up, asking whether one couldn't sign another keying mechanism like Security Descriptions. EKR responded that you get a different assertion in that case. You would trust the signaling server's representation of the identity. The claim is basically that someone logged in to this site in the past using the identity that is now claimed to call you, which is a much weaker statement.

EKR continued to discuss what information needs to be part of the authentication signature. Slide 22 discusses ICE parameters. Jonathan Lennox wondered if the ICE username and password should be included. EKR responded that this is probably not needed.
In Jingle, the response needs the ICE candidates with the username and password anyway.

Igor Faynberg asked what the difference is between providing a black hole server and a malicious server. EKR responded that, as the signaling server controls both which STUN and TURN servers are used and the signaling messages, it can control which ICE candidates are used regardless of whether they are signed. If the candidates are signed, the signaling server's STUN server can lie in the STUN responses so that all ICE candidates fail except the TURN server. If they are not signed, the signaling server simply modifies or removes ICE candidates as it wants. Either way the effect is the same: it controls where the media traffic goes. So EKR thinks there is little value in protecting ICE candidates, and thinks that Dan Wing and Hadriel Kaplan came to the same conclusion in their Media Identity draft.

EKR went on to explain the proposed solution, which is to instantiate an IdP Proxy using JS in its own iframe. Then only a small number of messages are needed to communicate between the proxy and the peer connection. A prototype implementation has been done. The next step is to implement this in Firefox to find issues and rough edges, improve the proposal, and deal with the larger open issues (slide 38).

Q&A

Andrew Allen asked whether the privacy concerns with having multiple identities have been considered. The user might not want to disclose certain identities to some callees. EKR responded that there is no problem with generating signatures with different identities. The real problem is how to indicate and select which identity you use.

Tim Hanson asked if the reason for doing identity is that we want to separate the transport provider from the identity provider, and whether, if we are not interested in this, we don't need to do it. EKR responded yes. Tim followed up with a second question: doesn't the signing limit what we can do with the SDP? In their implementation they had to work over the SDP pretty heavily.
EKR stated that this is why we likely only want to sign the DTLS-SRTP fingerprint and nothing else.

Martin Thomson stated that EKR keeps saying THE IdP, but it seems possible to have multiple IdPs and assertions. Having multiple appears especially relevant if third-party IdPs are allowed. Martin thinks having third-party IdPs is a terrible idea, as it forces the relying party to make some sort of decision on whether to trust them or not. EKR responded that there clearly are no technical reasons why one can't have multiple assertions. EKR's experience from multiple signatures in, for example, e-mail messages is that they are difficult for the user to assimilate. Regarding third-party IdPs, EKR agrees that unless we must, we should not specify for them.

Jan Seedorf asked what is asserted: only the identity associated with the certificate or fingerprint, or any additional property, for example how the user was identified by the IdP? EKR responded that one can envision the IdP providing additional information. However, the only thing you really can trust is the ID the IdP provides, like the username of the Facebook account being asserted. If one trusts the IdP there can be additional information, and we should consider a method for relating such information. Ted Hardie (as WG chair) commented that he proposed draft text around such usage, as there can be significant privacy concerns with this information.

Harald Alvestrand stated that he has been in previous efforts where a global infrastructure was needed before anything could be deployed. He is therefore nervous about this and wants to establish that this is not the case for WebRTC. Can we ship all the other pieces of WebRTC without identity, with the only missing feature being that we can't verify the identity? EKR confirmed that this appears to be true.

Huilan Lu asked how firm the design is with regard to the Identity object being bundled with the peer connection. EKR responded that it is not firm at all.
At this moment EKR is quite certain that it can work, as he has prototyped it; the next step is to try it out in Firefox to gain experience and explore things. However, there needs to be a tight relation between the peer connection and the identity object so that one can verify that it is coming from the peer connection.

Richard Ejzak asked about the case where there is no built-in support for a particular IdP. Is there some method for testing this? EKR responded that this would be the common case to assume.

Jonathan Lennox commented that the SDP can have multiple m-lines with different fingerprints, which appears interesting. Should we allow or disallow asserting each m= line with a different identity, to allow for disaggregated media between different browsers? EKR would prefer not to allow it, and any implementation he does will not support it initially. He doesn't know how to make it clear to the user that they are receiving audio from Martin and video from Justin in the same call. Martin Thomson commented that it is a different story; for example, he might only have a microphone and Justin only a camera.

Ted Hardie (as WG chair) concluded this topic and asked for continued discussion on the list.

What to do about RTP and/or SDES
draft-ietf-avt-srtp-ekt-04
RFC4474, mostly Sections 1-4
RFC5479, Sections 1-5, and the sections pertaining to EKT, DTLS-SRTP, and Security Descriptions
RFC4568, mostly Sections 1-3

Dan Wing led a discussion around media security. The main topics were RTP vs. SRTP and then keying for SRTP, with the aim of getting a good WG discussion of these. Dan led the discussion due to his involvement in the previous IETF discussions around media keying.

First came the plain RTP vs. SRTP discussion, introducing the pros and cons and considerations for interoperability, debugging, protocol complexity, and risk of maturity differences (slides 11-16).

Igor Faynberg asked about slide 16 (SRTP pros and cons): is it missing something about the risk of root certificates used as trust anchors being manipulated? Dan responded that yes, this presentation even includes some slides about how to do recording. But this is mostly a question of how the identity is asserted, going back to Eric Rescorla's presentation. If an attacker can insert a DTLS-SRTP fingerprint for a certificate they have access to, they can inject themselves into the media path; that depends on whether they can modify it or even assert the identity. Bad things can still happen if the trust anchors are compromised.

Dan then went on to talk about recording of media. There is a clear need for recordings, but it must be possible to combine recording with encrypting the media to prevent confidential information from leaking. There is a WG in the IETF for SIP recording (SIPREC WG). The two models defined by SIPREC were presented.

Discussion of RTP vs SRTP (slide 23):

Partha Ravindran commented that he agrees with making SRTP mandatory to implement, but asked whether it must be mandatory to use. The analysis is good, but appears to be missing things. For example, couldn't we use IPsec and run RTP over it? The browser might not be a general-purpose browser; it might be specific to an enterprise deployment.

Ted Hardie (as WG chair) asked if there is a use case where running SRTP with null cipher and integrity protection and then IPsec would not get what you want. Partha didn't answer the question, only repeated: why should it be mandatory to use, and why can't others be allowed to select other solutions?

Hadriel Kaplan commented that the browser doesn't know if one uses IPsec. From the browser, the IPsec virtual interface looks just like any other interface. And when ICE is used, it can pick any interface and thus route the traffic over the public Internet. As for the custom browsers mentioned, they are outside the scope of this WG; they can implement any proprietary feature.
The question is whether one can talk to the other side and interoperate with something that follows what this WG has agreed on and that can't do plain RTP.

Partha clarified that by custom browser he still meant a WebRTC-compliant browser. Partha also commented that there are a number of deployment assumptions in this group and that one can't secure in these. Ted Hardie (as WG chair) reminded everyone that we are talking about deployment in the Internet.

Hadriel Kaplan commented that we can't control what the last hop is. We do not even need to talk about coffee shops. Even at this meeting a person can find a private corner, but when using the unsecured wireless there is no privacy unless there is security for the media, as everyone here knows. We will have to use SRTP. The keying is another discussion. The number one reason Hadriel sees for SRTP is marketing. When WebRTC is written about in the trade press, it needs to say that it is secured. The keying doesn't matter; what is important is that it is encrypted. We are competing with Skype and Flash. Skype could very well have all the keys ever used in Skype, but they get away with it: the media is encrypted, thus "secure".

Eric Rescorla said there are two facts to consider. First, there appears to be no reason to run plain RTP, as backward interoperability requires that the peer runs ICE anyway, which is problematic with most of the existing systems. Secondly, it is very hard to discern whether one is in an environment where one needs to encrypt or not.

Harald Alvestrand stated that, as Chrome implementers, they have considered the matter and have found absolutely no reason to support plain RTP. We will not implement plain RTP; if your browser doesn't interoperate with us, sorry about that.

Jean-Marc Valin asked Dan Wing if SRTP null encryption is susceptible to the same type of downgrade attack as plain RTP. Dan answered yes, but it depends on the key management.

Igor Faynberg asked how the negotiation of null encryption works in practice. Can the far end prevent the use of null encryption? Dan answered that it is like TLS to a web server. Even if the client proposes null encryption, the web server will not want to do null encryption and will instead choose one of the secure alternatives. This is part of the negotiation. To end up in null cipher mode, both ends have to prefer null encryption.

Richard Ejzak commented that there are environments where the packet expansion created by the authentication tag causes an issue, even when using the null cipher for encryption. Dan asked if Richard could be specific about which environments. Richard said 3GPP air interfaces where you do Robust Header Compression (ROHC) for RTP.

Bill Ver Steeg commented that he sees value in supporting the wider set of use cases that unencrypted RTP allows, and that it should be supported. However, if null encryption is supported all his objections go away.

Ted Hardie (as WG chair) called for a hum with the following two questions:
- Will the working group support use of SRTP only
- Will the working group support both RTP and SRTP

A strong consensus on SRTP only was declared by the WG chairs.

Why consider SDES

Dan Wing continued the presentation with the big reason why Security Descriptions would be supported, namely backwards compatibility. He walked through the considerations in more detail. One argument against Security Descriptions is that you can't build identity on top of that mechanism. Bernard Aboba commented that you may not have any identity anyway, for example if the interoperability is with a phone number. Cullen Jennings, as individual contributor, commented that there is a WG in the IETF based on the premise that phone numbers are a valid identity.

Dan Wing continued by discussing the issues with interoperating DTLS-SRTP with Security Descriptions on slide 28.
The presentation continued with Encrypted Key Transport (EKT), an AVTCORE WG item.

Martin Thomson asked if one of the compromises made results in losing the forward security property. Eric Rescorla answered no. You maintain forward security by having both sides forget the key material actually used. It is not EKT that is the problem; it is the Security Descriptions in the interoperability case.

Hadriel Kaplan commented that the performance impact depends on the type of equipment and its architecture. For some equipment the symmetric key processing clearly has an impact, but for some architectures the key exchange is the expensive part. Such equipment may actually do the symmetric operations virtually for free. In fact, in some equipment removing the EKT tag may actually cost more than re-encrypting. For equipment that handles a lot of sessions, especially short ones, the public key crypto operations are what is really expensive.

Oscar Ohlsson asked if the EKT tag can be in both RTP and RTCP packets. Dan responded that currently yes, but they plan to propose that one of those options be removed. Oscar followed up, asking if that doesn't mean more EKT tags due to the greater number of RTP packets. Magnus Westerlund interrupted this discussion and requested that it take place on the AVTCORE WG mailing list.

Colin Perkins asked if the EKT tag needs to be authenticated. Dan answered that EKT has its own authentication.

Dan continued the presentation by showing how an SRTP key exchange would happen in the interoperability case; see slides 34-35. Christer Holmberg commented on this example that one needs to be careful in specifying procedures that require a new SDP offer in the middle of the path. The problem with a re-INVITE in the middle of the path is that the answer may differ in other parameters from the previous answer or offer in this session.
This forces the middle to send a new INVITE towards the other end, resulting in potential ping-pong behavior. Dan answered that this is true. Performing the re-keying in the media path solves this issue and works a whole lot better. Yes, there is a problem for the media gateway that needs to support legacy equipment. This issue will remain as long as we have SDP offer/answer and keying in the signaling path.

Richard Ejzak commented that this looks like discouraging SDES on the legacy side. In the initial setup case you appear to be forced into this middle-initiated case if you want to avoid the burning gateway (use of EKT?). Dan went through the process, and Richard commented that you can't hold the initial INVITE until the gateway has established the DTLS-SRTP keys between the browser and the media gateway. Thus it always forces a re-INVITE to put the key in place, and Dan's proposal for how to deal with keying of legacy equipment has some real issues.

Oscar Ohlsson asked how often calls are re-keyed outside of conferencing. Dan answered: every 10 years.

Dan Wing continued presenting the pros and cons of Security Descriptions and DTLS-SRTP, adding that Hadriel's comment that you don't have to do public key crypto should be listed as a pro on slide 37.

Discussion:

Hadriel Kaplan commented that when interworking DTLS-SRTP through the media gateway, it will do all of the things (slide 37) that Dan claims Security Descriptions do. Dan answered that this is for interworking, so it is reasonable that you get the same set of weaknesses that Security Descriptions have when interworking, but not otherwise. Hadriel's follow-up was: why would one then lie to the WebRTC node about the security properties? This is the emperor's new clothes. There will be no identity in this case either. So why don't we support both Security Descriptions and DTLS-SRTP?

Ted Hardie requested clarification on whether Hadriel meant that there is no way for the WebRTC end-point to determine if it is going through such a gateway or not. The way it works today is that yes, a DTLS-SRTP user can verify that the fingerprints do match up. For Security Descriptions there is no point in bothering. We know that there is no end-to-end security. You do have first-hop security (assuming HTTPS). Ted responded that he understands but does not agree.

Partha Ravindran disagreed with Hadriel. When working in SIPREC, they understood that Security Descriptions would not help. We only need some way for the IVR to indicate that the session is being recorded. He supports DTLS-SRTP.

Bernard Aboba thinks the presentation is a bit confused. No one suggests that DTLS-SRTP not be used. Security Descriptions are for cases where the peer is a phone number reached through the gateway or is a cheap device that actually supports ICE. There is no either-or here; DTLS-SRTP will be used. Security Descriptions are for a specific set of legacy use cases where the end-point is a phone number.

Eric Rescorla stated that he is no expert on the deployment of gateways. There is no direct way of verifying a direct connection between a phone number and a mechanical device. If the peer does have a SIP URI then you can do cryptographic verification. It will depend on the environment.

Dan York stated that it is important to keep the deployment scenario in mind. For a pure WebRTC to WebRTC end-point use case we should do alternative 1 (DTLS-SRTP only). We have an opportunity to do things right. For the legacy and gateway cases this clearly becomes more complicated. For the cases where we can, let's do the right thing.

Oscar Ohlsson commented that even DTLS-SRTP doesn't provide end-to-end security. Real end-to-end security would be from the video source to the video display. Now we only get security from one end of a PeerConnection to the other.
We don't know what happens between the video source and the PeerConnection. This is made evident by the WebRTC demos which manipulate or distort video prior to even displaying it locally. Cullen Jennings commented that in the past we have many times talked about what happens security-wise when media arrives at the device presenting a given identity. This is a difficult problem; let us not rat-hole on an unsolvable problem. Magnus Westerlund asked Oscar if the issue with end-to-end security in the WebRTC context is that the media becomes available to, and may be manipulated by, JavaScript running in the browser and does not stay contained in the browser part. Oscar confirmed that clarification.

Richard Ejzak stated that he doesn't believe anyone is arguing against using DTLS-SRTP in the browser-to-browser case. The question is the gateway case. Richard had in recent discussions stated that he believed DTLS-SRTP to be sufficient. He has now changed his mind. The alternative for interworking as Dan Wing presented it is not viable, and there are good use cases for Security Descriptions. The web server is in a position to determine if your peer is a network server or not. In the cases where you are communicating with a network server you don't get any benefits out of DTLS-SRTP. You don't have end-to-end identity verification and you don't have additional security; with Security Descriptions you get rid of the exchange, you get rid of certificate processing, and it is a hell of a lot easier. If the browser is in browser-to-browser communication it should use DTLS-SRTP, and there should be a clear indication that you know who you are talking to.

Hadriel Kaplan added that another pro for Security Descriptions is faster media: there is no media clipping at the start. That initial delay from when you click accept is an important factor. Mobile phone users have started delaying their response by a second to avoid the issue. In WebRTC we could be talking about seconds. This is a big benefit that shouldn't be ignored.
Unless there is a real flaw with Security Descriptions, which there isn't, as we need an IdP to prove the identity with DTLS-SRTP. A second benefit of Security Descriptions is that we have experience with them and know they work.

Partha Ravindran stated that he doesn't believe DTLS-SRTP is difficult to implement, at least not in his media gateway. Another point concerned end-to-end security: any media gateway will break the designed security, but at least we can ensure first-hop security. He sees DTLS-SRTP as a step forward rather than a step backward, which Security Descriptions are.

Martin Thomson doesn't see any significant difference between an identity at a particular device like a phone and a media gateway where he can't attest what happens beyond it. If a user can see the stated identity of the media gateway then they can take that into context. The main concern with having both solutions is how to avoid the downgrades. Dan commented that we will have downgrades, as it is simpler or faster, or whatever argument an implementer has, to only offer Security Descriptions.

Harald Alvestrand's biggest worry about DTLS-SRTP and EKT is the number of moving parts and the number of drafts recently arriving; it doesn't feel stable. This may not be a problem. An architectural comment about telephone gateways is that he finds them confusing. It appears straightforward to consider a phone number an identity, and the gateway has to be an identity provider for that phone number. Regarding the key exchange and the round-trip time, we will encounter cases, like the trombone, where the signaling server is far away and the media peer is close. We will see which is the more common.

Randell Jesup (channeled from jabber) raised two issues: 1. Extra bandwidth for EKT (noticeable, especially for audio). 2. SDES means trivial downgrade of security by an attacker (bid-down attacks). Users need to be able to be told that their media is secure or that it isn't. The subtlety of first-hop security doesn't play to users.
I say use gateways when going to legacy SDES equipment.

Dan commented that there is a short and a long tag, but he can't remember how many bits they are. EKR commented that he thinks it is possible to design EKT such that it has no overhead in the vast majority of cases. Dan commented that this was the original design, but the interworking gateway case made them change that. EKR responded that he still thinks it is possible.

Eric Rescorla wanted to follow up on what he said in relation to plain RTP. There is a cost associated with doing both: there is the downgrade attack. What is the benefit of doing Security Descriptions, which we know are inferior? In EKR's perception the primary reason has been legacy interworking. This draft (which? EKT) side-stepped one of the primary reasons, namely the run-time performance. This removes the primary motivation to have Security Descriptions.

Ted Hardie (as individual and from the floor) doesn't see how the arguments that have been made for Security Descriptions make any sense. The arguments are functionally saying that the end-to-end isn't real because it has gone through a back-to-back user agent. We know that B2B user agents terminate. Supporting Security Descriptions is an argument for changing what we trust. Giving the media keys to the signaling path is losing way too much. Ted declared that he will not be part of calling any consensus on this question, as he wants to make a strong statement. WebRTC should support DTLS-SRTP only for keying, and we can adopt this now! We can then discuss EKT as an upgrade that doesn't change the properties of the system already in place.

Cullen Jennings (as WG chair): We need more discussion, but we do want to poll this room for where it is at this point. We have heard several comments that EKT is evolving. We will remove that from the choices in the poll and it is for future discussion. We will obviously include DTLS-SRTP.
Taking a hum between the options: 1) Do you want Security Descriptions to be mandatory to implement 2) Do not want to have Security Descriptions as mandatory to implement, i.e. DTLS-SRTP only (with possible extension with EKT) The chairs declared No Consensus between these two options. End of Session RTCWEB IETF83 Second Session Chairs: Cullen Jennings, Magnus Westerlund, Ted Hardie Note Takers: Miguel Garcia, Paul Kyzivat Jabber Scribe: Dan York Admin (Chairs) - 10 min Scheduling the next interim meeting There is a Doodle poll to schedule the next interim meeting. The link has been posted to the mailing list. Possible locations: Stockholm, Boston area, or Silicon Valley. There was a request to collocate with W3C. Signaling (Justin Uberti) - 60 min draft-ietf-rtcweb-jsep-00 Justin goes through his slides. Minor changes made to JSEP to allow ROAP to be implemented on top. Cullen added as an author. EKR asked if we think ROAP is something the group cares about. Jason said no. Cullen said ROAP was a proof point of feasible mapping from JSEP to SIP. He thinks there will be something replacing ROAP, but as an example. Justin: JSEP should give access to any protocol. Cullen (floor): This WG sooner or later should describe how to map to signaling protocols, including SIP. Justin enumerated issues that have been raised, and some provisional answers. He proposed to use the SDP direction attribute to control whether to enable early media. Partha asked if PRANSWER is needed. Justin said he had hoped to get rid of it, but found it is needed. Christer asked about use case 1 – concerned that you get two answers, which isn't sippish. Martin Thomson says there is no such thing as early media – just media. The only difference is that you have consent after 2nd answer, not after first. Justin skips ahead in slides to answer: The meaning of PRANSWER is that there will be another answer. Allocated resources are not released on PRANSWER.
Partha says that the answer is sufficient – there is no extra provisional answer state. Justin says this allows resources to be released. Richard Ejzak asked about RTCP behavior. Is there anything special here? Justin says it should be able to be sent immediately without issues. EKR asked about assumed timing of PRANSWERs prior to trickle ICE. Reply that trickling can go on before or after. EKR worried that the offerer can't generate its own candidates until it gets PRANSWER. Harald says timers make debugging hard. Requests to avoid timers if possible. Cullen said SIP doesn't use timers to free resources. Only the app can do that. Martin Thomson asked what these resources are. He isn't convinced there are any. Reply is that it's decoder hardware. Martin says "and we want to let the JavaScript that we don't trust be in charge of that?" Justin justifies the answer. Partha: SIP already runs timers. The current RFC 3264 does not allow multiple answers. Stefan H: the JS application deals with media streams and tracks; the browser does not know how to map media streams and tracks until an answer is received. Justin elaborates, and acknowledges the potential problem of media clipping. Jonathan Lennox: what would be the consequences of this if you don't ever send an answer? Jim: Is this about releasing resources before we close the peer connection? Justin: the peer connection will have to do it. Magnus (as a chair): Please try to send your comments to the mailing list, indicating what you think might be wrong. Justin: slide "Multiple 2xx Answers". This is a forking scenario. Christer: these are two separate dialogs. Justin: assume for a second this is a forking scenario. The goal is not to mix media if forking happens. Partha: agree on not mixing. Justin: we need to make sure that the final answer will not use resources that have been already freed. Partha: concerns of who releases resources, the library or the application.
Justin: the application does not know the exact details, but has this abstraction. Richard Ejzak: SIP allows multiple successful answers, but typically it is not done. Christer: I agree, this is how this should be done. I have never seen deployments where you will get multiple 200 OKs, because the proxies will always take care of it. And if you get it, it will be ACKed and BYEd. So, I agree with this. Cullen (floor): What Justin proposes here meets everyone's needs. You can do whatever you want, except mixing. This solution works well for everyone. Hadriel: clone. When you clone, can you get the same port numbers, 5-tuples? Justin: No. Justin: "Associating Candidates" slide. Justin: "Release of Candidates" slide. Mark Thompson: This is a browser implementation, right? We shouldn't care about it. Justin: we need to make sure that the API supports it. Mark: is there a choice whether to signal? Justin: here we have a route change indication. Justin: talks of updating the selected ICE candidate in an updated SDP offer. Justin: there is no need to remove the local ports announced by ICE, even if they are not used. Mark: they are resources, we should release them. Justin: some resources are more scary than others. Christer: Can the browser inform you that it is not going to give you more candidates? It would be good for the JS to be informed of failures to contact a TURN server, for example. Justin: the draft says there is a complete callback. The JS does not know how many candidates to expect. You will be notified that you are ready to go. Justin: slide "ICE Restarts". Christer: when you do a new offer/answer, and something has happened (change of IP address), then you can start ICE again. Justin: for the re-ICE case, it will trigger the remote side to do re-ICE. Justin: slide "JSEP states". Based on the RFC 3264 state machine, but here an offer can be updated with another offer. Jonathan Lennox: This model does not allow modeling Reliable Provisional Responses in SIP.
Cullen: the reliable provisional response in SIP is modeled as an answer. Jonathan: and it releases resources? Cullen: Yes. Jonathan: Use case: I am getting media from two people and I want to send them an update. Cullen: The only use case I come up with involves RSVP, which we don't support. Let's see if we can find a use case. Robert Sparks: fork, you get 2 183s, and you want to tell to shut up. Richard: forking to multiple destinations, with reliable provisional responses, and you want to do an update. We need to be able to do cloning. Cullen: I don't see why we need cloning. Let's take it to the mailing list. Justin: Cullen and I need to see if we need to update the state machine. Justin: slide "JSEP attributes". EKR: Is the model to avoid modifying the browser with SDP? Justin: You need to deal with SDP and remote SDP. EKR: there is a cost for the implementor that needs to be SDP conscious. Christer: when the JS app gets the SDP, it can modify whatever it wants. Is this list something that JS can inform the browser of? Justin: it should, and in some cases, it is essential. Harald: I hope we can limit the number of cases where SDP manipulation is needed. Martin: The browser should support all these attributes. Quite a large API. Justin: the browser needs to deal with remote arbitrary SDP. Needs to be able to receive weird SDP. Martin: the browser needs to implement all this long list. I don't know where we are going, what is supported and what is not. Justin: good points, as a WG we need to decide what is the support of SDP and what is not. Stephan: how much do we need to specify here? If I change browsers, the behavior should be the same. Justin: agreed. EKR asked about JSEP attributes – don't you still need to be SDP conscious? Asked if the goal is to be able to do lots of things this way rather than with new APIs. Christer asks if the browser will notify the JavaScript if it has modified attributes.
Harald regards hints as steering wheel and brake pedal, while SDP is plugging into engine control. He hopes we can do what we need via hints. Martin says that browsers must be able to respond to all these hints – quite a complex API. He wants to know why we aren't talking about these hints in terms of the features we want to support rather than in terms of the SDP. Bernard Aboba: advertising the maximum number of things you can do? Justin: yeah, should be possible, as part of this baseline of things that should be supported in the first version. Data Channel (Salvatore Loreto) - 30 min draft-ietf-rtcweb-data-channel-00 This is SCTP over DTLS over UDP. Proposal for SCTP over DTLS was presented in TSV this week. It's a small draft. Ted has a question on the relation with TSV WG. James Polk asks for comments on the TSV WG. Ted encourages people to read the draft and post comments to the TSV WG. Speaker 3: interested in SCTP congestion control. Salvatore: I don't want to spend time on it at the moment; it should not be part of the first version. Michael Tuexen: SCTP does not support the negotiation of congestion control. Multiple congestion control algorithms have been implemented in BSD. Cullen (chair): this looks like an important issue, but let's take it to the list. Salvatore: Multihoming cannot be supported, it is happening in SCTP over DTLS (over UDP). Ted (floor): Can we get in the future to a situation where we have SCTP over DTLS over SCTP? Please make it clear in the draft that you are talking of SCTP/DTLS/UDP. Hadriel Kaplan: asking if there are addresses put into the draft. Salvatore: no IP addresses. EKR: this is more about ICE than any other thing. Cullen (floor): once you set up the initial connection between these two things, the SDP you send and receive, you send it over an SCTP connection. If you are dealing with mobile devices, multihoming will be highly desirable.
I would be interested if the authors could look at the way of doing it, so that the mobile sits on top of multiple SCTP interfaces. Salvatore: I think the IP address is transparent for us. Cullen: there is a difference between "change" and "support of multiple IP addresses". Chair interrupts and recommends following up on the list. Harald Alvestrand: In SDP, there are no "UDP" labels, only "SCTP/DTLS". It is not obvious that these specifications describe the same thing you are describing. Salvatore: In MMUSIC, the draft was there to show how to start an SCTP association. We didn't have a discussion until now. Please send it to the MMUSIC list, we can discuss it there. Randell Jesup (from jabber) says it is important to do congestion control. Justin: Multihoming: if we are going to bundle, do we need to support Multipath TCP? Ted (chair): do we need a separate specification for DTLS over ICE? Randal Stewart: you might have multiple connections, you need to have a connection identifier. Michael Tuexen: if we would go to know if you want to have SCTP over multiple DTLS connections. We need to make sure that we have a single congestion control mechanism. Salvatore: slide "Association Setup". Cullen: updates should be done over SDP or directly over the data channel. I think it should be done over SDP for allowing middleboxes to be updated. Michael: yeah, but this is all encrypted, the middlebox is not going to know anything. Cullen: just because it is encrypted does not mean the middlebox is not aware. Salvatore: slide "Control Messages". Salvatore: we need this to negotiate the number of data channels. Cullen: we should go to a different convention. He makes a proposal. EKR: putting something in SDP is like putting more crap into SDP. Randell Jesup: these channels are very dynamic. Renegotiating this over SDP will be painful and slow. Harald: in a case of a trombone where the signaling gateway is far away, SDP is slower. In case of direct negotiation, it does no harm.
And I find SDP a complicated protocol. Michael Tuexen: you need to know the characteristics of the channels you want to use, but then they come and go dynamically. Hadriel: it would make sense to put it in SDP if interoperability was needed, but this is proprietary. The only thing you need in SDP is a line to say "data channel for my proprietary thing over SCTP". Codec Selection (Adam Roach) - 50 min Adam indicates that he does not have a particular interest, other than the success of the working group. He wants to select a codec. Stefan Wenger: the statement on the slide is not correct. The correct statement is "I am not aware of any patent..." Tim Terriberry: Just because there is a patent pool around H.264 does not mean that other patents may not be applicable as well. Adam: slide "Potential Ways Forward". Adam: slide "Reflection on option 3". Bernard: there is a fourth option: make sure that implementations are extensible, so that codecs can be added, then this discussion is irrelevant. Cullen: the video tag allows you to add new codecs, is this what you call "extensible"? Bernard: yes. Adam does not agree; there should be at least one common codec for two browsers to make the video call. Tim Panton: another option, to specify a cruddy video codec (H.261). Tim Terriberry: some browsers, due to security vulnerabilities, do not allow you to install codecs. EKR: The requirement is to download a browser and make video calls. This means we need to have support for one minimal video codec. Justin: supporting the common codec approach. Stefan Wenger: He heard the same discussion in HTML5, where they don't have a mandatory codec. This has been solved by the industry, not by a standardization group. Adam: the difference is that with HTML5, a server in the network can do transcoding. Here we are talking of point-to-point communication between browsers. Stefan: this does not happen in practice. Content becomes available in one format, not the other.
The other argument: in contrast to a download service with multiple codecs, here we have a negotiation mechanism, and we are in a better situation than a download model. Someone from Mozilla: if browser A does VP8 and browser B supports H.264, then there is no interoperability. Matt Mathis: This conversation can take a lot of time in the IETF. Even if you put one paragraph in the spec for support, for bootstrapping purposes, it will be old in 10 years, and implementations will have to support it for legacy purposes. EKR: we did that in TLS. The default video codec in HTML5 is Flash. Tim Terriberry: HTML5 has negotiation for multiple mechanisms; the video tag can list multiple formats, and the browser picks one that it supports. Speaker 4: to have a reasonably stable system, we need to have a basic codec, such as H.261. Speaker 5: if we don't agree on the codec, we could agree on the process to select a codec (e.g., coin flip). Ted (chair): hum on people who read the Note Well. EKR: suggest to call for consensus on abandoning the call for consensus on selecting a codec. The chairs wanted to get a very rough idea of where the room stood. No consensus call was taken but Cullen asked for a straw poll between H.261, H.264, VP8 to help the chairs know how to proceed. EKR: in favor of not selecting a single codec (2 people). There was a mix of positions in the straw poll. The chairs are not surprised that this is a hard topic. We need more discussion before we can make a consensus call. Stefan Wenger: disagree with the initial summary of both codecs in the initial slide. Stefan clarifies that he does not have a statement to make under the Note Well. Christian: some people raised their hands on two codecs, perhaps one solution is two mandatory codecs to implement. Adam goes through the backup slides. Stefan and Bernard challenge the statement that if the WG does not choose a codec, it will be a failure of the WG. Ted clarified the goals of the Alternative Decision Making.
Bernard clarifies that one group will not implement the other's choice just because the standard says it, and vice versa, so there is no point in trying to select something. EKR: the process depends on who you select for opinion. Speaker: if the industry implementation is 50-50, then there is no technical difference. Ted (floor): you ask technical comments from the community for a broader perspective. When external reviewers are involved, they may be able to see technical and non-technical issues that the WG does not see. Dean Willis: as a developer I will not touch H.264. But I know we need to make a decision. Small developers need something cheap and safe and basic. H.261 will fall into this category. Alistair Woodman: The discussion was the same in H.323, and G.711 was selected as audio codec because it was IPR free. No one believed it would be useful, but this stopped the discussions. (Note - G.711 has been widely used) J. Lennox: questions about external teams. Are the teams restricted to the options that the WG decides or can they suggest a different option? Adam: they can propose something else. Spencer: please go through the slides. Chairs: people have a lot of things to say. We cannot take a hum today. Chairs: Dean's proposal got quite a lot of support. This conversation has to keep going. Harald: I ask people to consider what new information would make them change their positions. Harald: the difference between a review team and a random process... Randell (jabber): H.261 is safe, but we fail to ... Stefan: 1. FRAND defense is going to get teeth now. 2. MPEG has two projects for royalty-free codecs. 3. H.261 would be the safest option. H.263 is more flexible and commonly believed to be safe enough. This might be another option. Bill: supporting the external reviewers option. Justin: explaining what is failure. Guy from Mozilla: in the end what counts is what people will implement, not what gets written in the standard.
Ted said to take this discussion to the list, and Stefan said please not. Meeting finishes at 11:31.