DRAFT - to be checked against audio archive.

IETF 101, TSVWG Meeting Minutes

Chairs: Gorry Fairhurst, David Black, and Wes Eddy.
Thanks to Richard Scheffenegger and Paul Congdon for assistance in taking notes.

1550-1750 Afternoon Session II

1. Chairs Update:

Four RFCs have been published since Singapore: 8260, 8261, 8311 and 8325. Two drafts are expected to be submitted to the IESG: SCTP errata and DSCP IANA process changes. We expect 5 drafts to go to WGLC before the next meeting. There are 7 other WG drafts in progress: SCTP NAT, a set of L4S drafts, UDP options, Tunnel Congestion Feedback and Datagram PLPMTUD. There will be 2 related drafts discussed in this meeting. There are 3 additional WG drafts that are not on the agenda for this meeting. In the INTAREA WG there are 3 drafts that are related to TSVWG activity: tunnel MTU, fragmentation and SOCKSv6.

Milestone updates and review of WG Progress:

SCTP NAT draft should be ready for WGLC at/after the Montreal meeting, so October is a good new milestone date for it. Confirmed by Michael Tuexen.

Tunnel Congestion Feedback - Bob Briscoe reported that Donald Eastlake and Andy Malis have become interested in using this for Service Function Chaining (SFC). They were pointed to this for load balancing across service functions, and are interested in helping.

2. Announcements and Heads-Up

2.1 Liaisons (none)

2.2 Other Drafts Related to TSVWG
draft-olteanu-intarea-socks-6 (For info - Please discuss on INTAREA list)
draft-saldana-tsvwg-simplemux (For info - Please discuss on TSVWG list)
draft-herbert-fast (For info - Please discuss on TSVWG list)
draft-han-tsvwg-cc (For info - Please discuss on TSVWG list)

3. Transport and Network

3.1 Gorry Fairhurst: IANA Action for DSCP Pools
draft-ietf-tsvwg-iana-dscp-registry (In WGLC)

Spencer Dawkins (as AD): (regarding discussion about shortening the name of the draft) The current name is meaningful; it is OK.
Roland Bless: I commented on -01. Replace the last part with "requires IANA action".
Gorry Fairhurst (as author): OK, I plan to revise shortly after the meeting.
Bob Briscoe: When you say there will not be any negative impact, do you know that there is no private use of this pool of DSCPs?
Gorry Fairhurst (as author): The purpose of publishing this is to ensure people know about the change. I do not think existing use is harmful, but there is a change. People using local-use DSCPs should already be aware that they may be used by others. This will be called out to other WGs in IETF LC.
Spencer Dawkins: If those networks use the local DSCP, don't they just shoot themselves in the foot when they use this LE PHB?
Gorry Fairhurst (as author): They should not if they adhere to DiffServ specs.
David Black: DiffServ behaves best with complete configuration at the DiffServ perimeter.
Spencer Dawkins: I look forward to the shepherd writeup, which should call attention to any possible impacts.
Roland Bless: This should not be a big issue, as there was an indication that these DSCPs (xxxx01) were already marked for possible use by future standards action. Local remapping is always possible.

3.2 Roland Bless: Lower Effort PHB
draft-ietf-tsvwg-le-phb

David Black: (Asks for any final feedback on the decision to allocate the DSCP.) I see no objection to use of codepoint 000001 as the recommendation here for the LE PHB (to be confirmed on list).
Bob Briscoe: With respect to LE traffic and different styles of congestion control, the text currently says you don't want to harm people, you shouldn't harm others' traffic. I think this text should describe what happens if there is harm.
Gorry Fairhurst (from the floor): I prefer an approach that says SHOULD use a less than best effort congestion control (e.g., LEDBAT) and explains why this is really desirable. That is, if you don't, there may be unwanted or unexpected interaction with other traffic. That would be better guidance.
I (personally) would be reluctant to use MUST, but really I would like to see people do this.
David Black (Floor Mic): From an operator perspective SHOULD vs MUST is irrelevant; an operator has to defend the service regardless of which words are in RFCs. I think the text proposal by Gorry sounds OK.
Bob Briscoe: The text just doesn't make sense in the current wording. The important thing is to state clearly under what conditions LE traffic can result in harm.
David Black: The operator concern is not for a singular flow, but for the aggregate.
Gorry Fairhurst (as a chair): Bob, please send example text to the list, to help find better wording.

Bob will send text about whether use of a less than best effort transport (e.g. LEDBAT) ought to be a MUST or SHOULD requirement for use of the LE PHB, and David will add the operator perspective on this.

Roland Bless: We will discuss this on the mailing list. Do we need extra text on tunnels?
David Black: A reference to RFC2983 will help. We recommend not updating the document about 802.11 mapping; it is pretty clear how this new DSCP maps to WiFi.
Roland Bless: The ID with guidance on WebRTC (draft-ietf-tsvwg-rtcweb-qos-18, in RFC-Ed queue) needs to be updated. This ID discussed CS1 and that advice will now become obsolete.
Spencer Dawkins (as AD): It is possible to update a draft in the RFC Editor queue so that the RFC reflects the new recommended DSCP. I will drop a note to the RFC Editor, and ask how best to do this.
Gorry Fairhurst: The Chairs should let the WEBRTC people know of the updated consensus here. We will let the IETF and W3C RTCWEB/WEBRTC groups know and recommend the DSCP change. (Please cc tsvwg.)

3.3 Anais Finzi: Priority Switching Scheduler
draft-finzi-priority-switching-scheduler
(After discussion with the authors this is headed to the Independent Submission Editor, with inputs from TSVWG.)

The scheduled talk did not occur in Singapore.
The goal is to make the AF Class more predictable. This scheme changes the priority of the traffic based on credit thresholds.

Roland Bless: Is EF not impacted? I am not quite sure about that conclusion. EF has a definition of an error term.
Anais Finzi: In our simulations we did not see any impact. The maximum impact is at the maximum frame size.
Roland Bless: I will check the paper, thanks.
David Black: Roland volunteered as a reviewer of the draft.
Ruediger Geib: This proposal would remove the required policer on the EF class. Would this conflict with RFC3246?
(unknown): So far, the IETF did not specify any schedulers. Is this Standards Track?
David Black: No. This is heading for an Independent Submission.
Ruediger Geib: I am not sure if I want to have that scheduler on a 100G link. I can review the draft/paper, but I am not an expert on scheduling.
Roland Bless: Easy to configure? To set these 3 parameters... The paper shows how to translate the current config into the corresponding set of settings in PSS. This can be simplified by removing some parameters, but the result is not optimal.

Roland Bless and Ruediger Geib agreed to review the draft, and David will send an announcement to the list requesting other reviews.

3.4 Tom Jones: Datagram Packetization Layer Path MTU Discovery

David Black: Have you looked at the draft in INTAREA on ICMP PTB signals?
Tom Jones: Yes, we have read it and sent a few comments. This does not conflict.
Michael Tuexen: SCTP can do verification on the VTAG, not just the 5-tuple.
Matt Mathis: I will read this draft. The issue of authenticating messages from the network is much broader than just PMTUD. It is good to check if there are new solutions. In the absence of authentication, I agree that messages are just advisory. You have to be able to tolerate bad cases (e.g. byte-swapped lengths).
Gorry Fairhurst (as author): Are you saying there can be an advertised link MTU much smaller than the actual path MTU?
Matt Mathis: There is in principle a DoS attack trying to reduce the PMTU. We did see cases where the two MTU bytes were swapped. The intent was to facilitate Jumbo discovery - but this moved the problem down to the design of NICs and switch buffer carving.
Magnus Westerlund: It is very important to make verifications robust. I will review this draft.
Michael Tuexen: When working with SCTP we found a middlebox (at my home) that just gave some random number, instead of the VTAG!
Eric (Akamai): We have been running into a wide range of MTU issues. This is a nice optimization. How does it interact with load balancers? This is important work; we need to consider how PTB gets mapped back to the source. I also saw NATs and at least one TCP optimiser that did strange things.
Tom Jones: The method only takes PTB as advisory. In the next revision we plan to add a probe method to verify whether the actual PMTU is larger than the advertised link MTU. Please tell us about any strange behaviours - that would be great input.
Gorry Fairhurst (as author): Load balancing (i.e., use of more than one PMTU) was one motivation for making it robust - we want to get this part right. For this, we need some tales of what happens in the wild - please talk to us.
David Black: How does this work relate to the transport work in QUIC?
Spencer Dawkins (as AD): I hope we can make progress on using this in QUIC, although the starting point may be for QUIC to pick a PMTU number and just fall back if the path fails to support it. We know PMTUD using ICMP is broken. Anything that is doing this better seems like a good thing.
Tom Jones: We plan to have the algorithm finished for Montreal, and WGLC by December.
Lars Eggert (as QUIC Chair): Most current QUIC implementations do something very simple, a basic test/fallback. Having a functional PMTUD is not strictly required. If this scheme is not too complex, some implementers might try this.
After we do a v1 of the QUIC spec, we will do v1.1 shortly after. Speaking as an individual, this mechanism might go into QUIC v1.1.
Gorry Fairhurst (as author): Actually the full algorithm has many states, but once we have this, we can profile a simpler algorithm for applications that do not need the full method.
Tom Jones: There are two parts needed. First, we need some small paragraphs in this TSVWG ID to describe requirements and features of each protocol. This should be straightforward for the use of QUIC; there is some text there already. Review by a QUIC subject matter expert would be great. This would make sure nothing inconsistent is in this draft. Second, we can also help with the actual text for how to implement this in a QUIC transport - and that could be in a QUIC WG draft.
David Black: I prefer QUIC to be upwards compatible with this.
Michael Abramson: As someone who tried to run Jumbo frames, my conclusion is we all need this. Network operators should make sure big PMTU works. Should this become a BCP for all protocols? Saying we ought to do this?
David Black: BCP is plausible with much more experience. Qualitatively useful, so BCP is the correct end-goal. An OS should not ship without some PMTU support.
Tom: In BSD, the black-hole detection code is not currently good.
Michael A: I also want some telemetry to tell me this has gone active, to help me with troubleshooting.
Tom: An entry in the syslog? netstat counter?
Michael A: Yes, that sort of thing.
Spencer Dawkins: A BCP implies more running code; more experience is needed.
Eric (Akamai): There is currently a risk that we do not get to a state that is good and so the endpoint remains in a very bad state, just like clamping MSS low enough that it is bound to work, but way smaller than optimal. If we do not add this as a default, then jumbo frames are never going to happen. The Linux black-hole detector also needed improvement when I looked a year ago.
Michael A: I want black-hole detection and logging most.
Michael Tuexen: There are differences when using PLPMTUD with TCP. TCP uses probe packets that carry user data; this changes retransmission and the CC state. TCP PLPMTUD is much more complex. In contrast, the datagram version here assumes that probing is done without using user data, so probe loss does not impact CC or retransmission/repair logic.

3.5 Bob Briscoe: ECN transport
draft-ietf-tsvwg-rfc6040update-shim
(This draft is now ready for WGLC. It will be WGLC'd with the second encaps draft.)

Tom Herbert: Are there any IETF shims that got it right? All have their own issues. If you do it when you first design it, it is easy.
Bob (points to table slide).
Philip Eardley: Let's make this a management problem. My view is that ECN is an inter-operator thing.

3.6 Bob Briscoe: ECN & L4S
draft-ietf-tsvwg-ecn-l4s-id
draft-ietf-tsvwg-l4s-arch
draft-ietf-tsvwg-aqm-dualq

(Bob presented slides on L4S. No comments or questions from the WG.)
(Bob presented a new individual draft on Diffserv and L4S.)
There were no comments or questions from the WG. This draft will be discussed in Montreal, to see if there is interest in this work.

4. Other presentations / New Work

4.1 Paul Congdon: Congestion Isolation in IEEE 802.1

Paul outlined proposed work in IEEE 802 on providing layer 2 congestion support to switches and the possible interactions with other methods. The IEEE will be making a possible decision in July.

Michael A: How does this relate to routers? How often is the xon/xoff transition?
Paul: This is basically propagation time, and a threshold.
Michael A: Can the hardware actually scale to high speed links and support back to back frames on the order of 1 microsecond? Is it feasible to xon/xoff of 10-12 packets?
Paul: We think hardware can provide solutions.
Pat Thaler: When the xoff is received, there might not be a packet to be sent. Most implementations set xoff threshold to maximum and count on xon.
Send xoff at the high threshold, and send xon at the low-threshold watermark.
Pat: That is not the way it's generally deployed.
Bob Briscoe: What is the trust model?
Paul: This targets a single admin domain, probably a data centre. One administrator.
Bob: Have you considered virtual queues, instead of thresholds for triggering? A virtual queue slows you down before you have to buffer. I know Broadcom/Cisco chipsets can utilise this.
Sowmini Varadhan: We do not like PFC; ECN mostly works. If E2E ECN works, this is not that important.
Mirja Kuehlewind: Congestion comes up if multiple flows share the same queue. How does the switch get to the bad flow?
Paul: Same way as an ECN mark for the offending flows. Probabilistically the signal should reach the correct source.
Richard: Is there granularity of flow detection? Can other flows also get stuck in the same congested queue?
Paul: Yes, we are aware of this.

1750-1810 Beverage Break

1810-1910 Afternoon Session III

5. Transport Protocols and Mechanisms

5.1 Vincent Roca: FEC drafts
draft-ietf-tsvwg-fecframe-ext
draft-ietf-tsvwg-rlc-fec-scheme
(These drafts are now ready for WGLC, reviewers are needed.)

There were no comments on the final drafts. The chairs will look for volunteer reviewers, and we will cross-post review requests to the IRTF NWCRG.

5.2 Joe Touch: UDP Options (proxy by Gorry Fairhurst)
draft-ietf-tsvwg-udp-options

Tom Jones: Is there going to be more text to explain the options?
Gorry: This is the January draft.
Chairs: Joe needs to continue the discussion on the list and revise the draft following the presentation.

5.3 Tom Jones: UDP Options Implementation
(Tom Jones presented a view that differs from Joe's view of the way forward, especially a different way to handle the checksum option.)

Tom: We want to complete the implementation for June.
Gorry Fairhurst (based on email from Joe): Joe Touch is willing to look at different sizes of checksums, should the TSVWG think that is useful, and to consider efficiency.
Tom Jones: A 16-bit checksum is one less line of code; less code and standard algorithms are good.
Chairs: Please follow up with Joe on the list.

5.4 Michael Tuexen: RFC4960 Errata
draft-ietf-tsvwg-rfc4960-errata
(This talk presents updates following WGLC.)

Gorry Fairhurst: You note a change to the CRC32c definition. This is often referred to by other groups. What changed?
Michael Tuexen: This change fixes code typos, changing the definition so that it compiles on all platforms. It is not an algorithmic change to the CRC32c itself.

There were no further comments from the room. People are encouraged to check the corrections and report any issues to the list. The current version can now continue to AD review and IESG LC.

5.5 Michael Tuexen: SCTP NAT
draft-ietf-tsvwg-natsupp

There were no comments in the room. The authors have noted the new milestone and expect to work on this ID for the Montreal IETF meeting.

5.6 Gorry Fairhurst/Colin Perkins: Impact of Transport Header Encryption
draft-fairhurst-tsvwg-transport-encrypt

Matt Mathis: Are you interested in speculation about what might happen? Could we imagine a standardised generic transport-agnostic header, that is end-to-end and indicates embedded bytes across multiple transports? This would be useful for debugging and a partial solution to some issues.
Gorry Fairhurst: Who would turn this on?
Matt Mathis: It would be good for debugging applications and not so much use for some other places.
David Black: OAM that is not in the flow may not be representative.
Matt Mathis: There are also issues relating to equipment that the stakeholders do not wish to tell people about - this makes it tricky. It may seem that some tools are duplicated, or other ways can be used to measure the network path, but people do need to have multiple tools to check the answer to the same question. This is needed so they can be sure what is actually happening. In-band OAM might be interesting as one of these approaches.
Gorry Fairhurst: This seems like something that is already possible in some networks, but I agree it is really interesting to find out more about. Please tell us about this as ID editors.
Matt Mathis: OK, I think there could be a generic instrumentation shim. This could be pervasive where it needs to be.
Gorry Fairhurst: Generic tools are much better than version-specific tools that need updates.
Hannes Tschofenig: How much feedback have you had from operators and equipment manufacturers? I worked with operations and insight was often very hard to come by. On a different topic, tools based on machine learning are even more demanding. How can we get good feedback?
Gorry Fairhurst: We received good feedback from many operators off-list. A key problem is knowing who needs what operational information. Very often transport information is used by people to debug equipment/configuration issues - to understand anomalies/health etc. Organisations as a whole do not care about content; only specific people need this. We are still looking for more people to provide feedback.
Hannes Tschofenig: DDoS mitigation techniques are important; some rely on machine learning. Some approaches suffer more from encryption than others.
Gorry Fairhurst: True.
Colin Perkins (author): I think this is an interesting discussion. It would be wonderful to have a measurement shim, but that may not be something that we currently have much experience with. We would be interested in looking at different approaches, but that would be a different draft.
Brian Trammell: The IPPM WG did publish an instrumentation shim, for replacing timestamp and IP ID in IPv4 networks. This use is not applicable in the public Internet. This can be better than passive TCP monitoring. Some discussions in QUIC will drive us to different designs in this space. As a framing document, this ID should be adopted.
Al Morton: This provides a good view of what encryption has changed from the transport side. This is a useful scope.
When you say encryption has costs, adding shims definitely adds very real costs. It would be useful to quantify somehow the complexities that are added here, and what this gives us. I am happy to read the next revision of the draft.
David Black: How many have read this? (A fair number.) I ask people to read this and we will review the adoption next meeting. (+1 read this, via jabber)
Brian Trammell: I have a suggestion that we should aim at consensus on this one. By scoping to the transport layer, we may get around to this and we can look at both sides of the coin. Adoption now, or in Montreal - I would be happy with either.
David Black: These problems are not going away; we will be discussing this in Montreal.
Colin Perkins (author): Things are not going to get better. QUIC is going to be deployed, and is using this kind of header, so there is some time constraint. At this time the bits that are exposed in QUIC are still changing and not yet fully stabilized; we need to understand these tradeoffs.

End of meeting.