PCN working group
TUESDAY, March 11, 2008
1520-1720 Afternoon
====================================
notes by Dave McDysan, additions from Kwok-Ho Chan

AGENDA:

o Administrivia      chairs   10 min
  - Blue sheets
  - Agenda bash - no changes
  - Milestones status - See Blake slides
    Steve Blake: We are currently one to two meetings behind. Overall target is to make significant progress on decision for PCN encoding options.

----------------------------------------------------------------------------
o Discuss open issues in draft-ietf-pcn-architecture-03      Eardley

  * Phil requested response to Threshold and Excess Rate Marking Behavior from Email list - fortnight ago, summarized in slide 4
    - Within the next two days he will make editorial and terminology comments
    - No one objected to his assertion that there was general consensus that these are the two general marking behaviors
    - Anna Charny - Some caveats regarding what is done with previously marked packets, which differs between the proposals, are not yet covered.
    - Phil - Proposed identifying these as "should" statements, which could differ for specific proposals.

  * Response to Blake's call for other proposals/comments:
    - Georgios Karagiannis questions
    - Phil: Clarifications
        Threshold marking drives admission control
        Excess marking drives termination, and admission control (in single marking)
    - Georgios - Described an approach where an interior node could have more memory of past measurements with the aim to increase accuracy of edge node's estimate of excess rate.
    - Phil asserted that this issue should be addressed at the edge node, and not at the interior marking node.
    - Anna Charny - Proposals to address missing pieces
        Excess rate marking - should ignore previously marked packets
        Threshold marking - shouldn't overwrite a previously excess-marked packet
        CL & Single - Drop excess-marked packets first
        3SM, FBT - Don't drop excess-marked packets, but terminate instead
    - Michael Menth - Mechanisms don't depend on what marked packets are dropped.
    - Anna Charny - Performance difference depends on load
    - Bob Briscoe - Excess rate TB needs a threshold within an MTU size of the bottom of the bucket (see Menth comment to list)
    - Phil - At least 4 methods, prefer to leave this to implementation
    - Phil/Anna - Proposed "should" not bias marking for large packets
    - Phil - Detailed drop behavior text to be developed (over a beer)
    - Menth - Proposed option for packet size independent marking
    - Phil requested that Georgios state his optimization in Email on the list

  Admission control, boundary node behavior - slide 6
    - Scott - Question regarding tie-in regarding how decision is made on marks.
    - Phil described proposal for the case when there has been no traffic from the ingress recently: use of marks on signaling packet (e.g., RSVP message) (alternative would be to admit all).
    - Menth - Question on how ingress performs admission control. Explained the use of probes or congestion level estimate as two different methods. Said that this looks like adding signaling. RSVP packets are effectively a form of probing. An alternative would be ingress maintaining an aggregate for (some) egresses, so that it may make a decision before even sending a probe. Not described yet, will be covered on Thursday as an extension to 3SM.
    - Phil stated that this approach is not to be done first until it can be better understood.
    - Anna Charny
        Maintain congestion level estimate at egress
        Admission to admit a new flow occurs at egress
        When no state exists at egress for a particular ingress, then there is some freedom on how this is handled
        Proposed deleting next to last bullet on slide 6, or at least stating this as "may."
    - Lars Eggert - Agrees with Michael that this looks like a probe message, and raises the question of whether the architecture requires the use of something like RSVP. Will need to include the situation when no signalling is used, i.e. the first packet received is the RTP packet (data payload packet). Assumption of RSVP packet not documented, and might not work if RSVP packet goes on the slow path.
    - Steve Blake - per architecture, Ingress-Egress aggregate state: signaled, pre-established or tunneled
    - Scott - Informational document can use may instead of MAY and must instead of MUST.
    - Menth - Under normal conditions. Asserted that algorithm employed by egress node to inform the ingress whether the ingress-egress aggregate is pre-congested (or not), which can be used to block (admit) future flows. Defined in the CL draft. This need not be specified in the architecture.
    - Anna - Difference is whether egress sends block messages, or egress sends CLE message to ingress which then blocks admission of future flows based upon this information.
    - Steve Blake - RSVP Path processed by egress, which must respond with RSVP Resv before ingress will admit the flow.
    - Tim Moran - Last bullet on slide 6. Proposed including emergency service explicitly in this bullet. Briscoe - proposed stating consultation of PDP.
    - Briscoe - Letting ingress make admit/block removes possibility of difference in decision between egress and ingress (e.g., based upon policy decision)
    - Anna Charny - Change "egress" to "ingress" on last bullet of slide 6.
    - Phil proposed talking about this offline to make a decision.
    - Georgios - Egress decision would be better with probing and ECMP.
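  The CLE-based variant Anna describes above (egress reports a congestion level estimate, ingress blocks future flows on it) can be sketched roughly as follows. This is a minimal illustration, not taken from any draft: the names EgressMeter and ingress_admits, the 5% threshold, the definition of CLE as the marked fraction of bytes, and the choice to report 0 (i.e. admit) when no traffic has been seen are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class EgressMeter:
    """Hypothetical per ingress-egress-aggregate byte counters kept at the egress."""
    marked_bytes: int = 0
    total_bytes: int = 0

    def observe(self, size: int, marked: bool) -> None:
        """Account for one received PCN packet of `size` bytes."""
        self.total_bytes += size
        if marked:
            self.marked_bytes += size

    def cle(self) -> float:
        """Congestion level estimate: marked fraction of bytes in this interval."""
        if self.total_bytes == 0:
            # Assumption: with no recent traffic we report 0.0, which
            # corresponds to the "admit all" alternative mentioned above.
            return 0.0
        return self.marked_bytes / self.total_bytes

    def reset(self) -> None:
        """Start a new measurement interval."""
        self.marked_bytes = 0
        self.total_bytes = 0


def ingress_admits(cle: float, threshold: float = 0.05) -> bool:
    """Ingress-side decision: block new flows once the reported CLE exceeds the threshold."""
    return cle <= threshold


meter = EgressMeter()
for size, marked in [(1500, False), (1500, True), (200, False), (1500, False)]:
    meter.observe(size, marked)

report = meter.cle()                # value the egress would send to the ingress
decision = ingress_admits(report)   # False here: 1500 of 4700 bytes were marked
```

  In the other variant discussed (egress sends block/admit), ingress_admits would simply run at the egress and only its boolean result would be signalled.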
  Termination control - boundary node, slide 8
    Recommended mechanism from CL draft
    - Menth - It is better to have the egress make the decision of admit or block and send the admit or block indication (of the ingress-egress pair) instead of the CLE (congestion level estimate) back to the ingress. This allows different methods to be used.
    - Phil - If packets with termination marks are dropped, then termination process takes longer.
    - Bob - Prefers having the ingress make the decision based on CLE, rather than the egress, because policy decision making is easier and decision-making synchronization is minimized.
    - Georgios - Agreed with Menth proposal.
    - Anna - Not all work on performance arguments has been done and hence this is not a good basis for decision at this point. Doesn't believe that Menth's proposed method will work with single marking.
    - Menth - Argument is on simplicity and accuracy, and not on performance. It is not simple to have termination information (sustainable rate) in both ingress and egress. Also limits flexibility for potential future uses of the PCN system.

----------------------------------------------------------------------------
o Discuss open issues in draft-chan-pcn-encoding-comparison-03
  - Presented by Michael Menth
  - New version target April 1, working group document target May 1
  - no comments

----------------------------------------------------------------------------
o Discuss encoding options      Toby Moncaster
  Major issue at tunnel egress regarding ECN marking on slide 9
    Makes all case 2 encodings (1 DSCP using ECN), except possibly single marking
    Could re-define tunnel egress node actions, but this is messy
    Net is that can only use ECN marking "11" for tunneling within a PCN domain
  - Unidentified Speaker - Joe Touch is currently giving a talk on tunneling in the Internet Area; it should be investigated by PCN.
  Encoding option 3 (slide 10)
    Everyone likes to keep DSCP1 with 11 for TM and change to DSCP2 with 11 for AM, as indicated by Michael's suggestion on the list during lunch time on Tues.
  - Briscoe - Clarification on Pros on Slide 11 - Uniform mode tunnel in Diffserv
  - Georgios - If routers are using DSCP, then change hash to result in same path selection to avoid ECMP issues.
  - Viable options are 1) Use 2-3 DSCPs and 3) Use 2 DSCPs and limited ECN usage.
  - Scott Bradner - Not many DSCPs are left in the standards pool. Will be difficult to get more allocated to the standards pool.
  - Briscoe - Tradeoff is end-end ECN or number of DSCPs needed.
  - Toby - Option 3 will require 1*n fewer DSCPs than Option 1 for n priority classes.
  - Steven - there is some view of using PCN by ECN. And there is some use of PCN by 3GPP.
  - Lars - Requesting a larger number of DSCP codepoints is a key consideration and will require strong justification (e.g., markedly better performance). Saying you want 3 codepoints instead of 2 to achieve flexibility is not a strong argument. For example, getting the admitted EF codepoint in tsvwg was difficult and took a long time.
  - Scott - 16 codepoints are reserved for experimental use. After experimentation, this could be used as justification for a standards action.
  - Menth - Single marking is simpler, and could potentially use CE from ECN without issue.
  - Briscoe - PCN DSCP requests may be looked at by IESG in a similar way to EF admit DSCP.
  - Unidentified Speaker - Could EF Admit DSCP be used for PCN? Briscoe has concerns unless definition is changed.
  - Menth - CL without flow termination only requires a two codepoint solution, similar to single marking codepoint requirements.
  - Briscoe - Summary of Fred Baker EF Admit I-D in tsvwg and how it could be applied. Attempts to differentiate use of EF for admission controlled flows from flows that are not admission controlled. PCN adds guarantee that overflow will not occur.
    Operators are more likely to use admission control in conjunction with PCN than without it.
  - Steve Blake - Please read Fred Baker EF Admit I-D and comment to tsvwg list.
  - Lars Eggert - Excess rate marking could come from standards DSCP pool since all approaches need this, while only some approaches need termination marking and this could be allocated out of the experimental space.
  - Tina Tsou (sp?) - Supports option 3, the use of 2 DSCPs, to allow cleaner support of ECN and PCN together.
  - Lars Eggert - Pose questions to list for beginning of Thursday meeting. Questions should cover the following subjects:
      Pursue single marking or other approaches?
      Option 1 or option 3?
      Standards track or experimental codepoint(s)

THURSDAY, March 13, 2008
0900-1130 Morning Session I
Salon I
Marriott Downtown, Philadelphia

Meeting materials:
Audio archive:
====================================
CHAIRs: Scott Bradner (not present)
        Steven Blake

o Administrivia      chairs   5 min
  - Blue sheets
  - Scribe Bob Briscoe
  - Shuffled agenda relative to that uploaded

Initials:
  [SB]:  Steven Blake (when not acting as chair)
  [BB]:  Bob Briscoe
  [ACh]: Anna Charny
  [PE]:  Phil Eardley
  [LE]:  Lars Eggert
  [RJ]:  Raj Jain
  [GK]:  Georgios Karagiannis
  [MM]:  Michael Menth
  [TM]:  Toby Moncaster

Abbreviations used throughout (not incl those coined by individual presenters):
  AC:     Admission control
  FT:     Flow termination
  CLE:    Congestion level estimate
  ERM:    Excess rate marking
  ThM:    Threshold marking
  AfM:    Affected marking
  NM:     PCN-capable but not marked
  IEA:    Ingress-egress aggregate
  CL:     Controlled load (refers to expired draft-briscoe-tsvwg-cl-architecture-04.txt and draft-briscoe-tsvwg-cl-phb-03.txt)
  SM:     Single marking (draft-charny-pcn-single-marking-03)
  LC-PCN: Load controlled PCN (draft-westberg-pcn-load-control-03)
  3SM:    3-state marking (draft-babiarz-pcn-3sm-01)
  ECMP:   Equal cost multi-path
  PDP:    Policy Decision Point

o Performance of admission control methods      Michael Menth   20 min
  draft-menth-pcn-performance-02
====================================
Slides at
Acknowledged Frank Lehrieder

Presentation summary:
Performance evaluation of:
- Congestion level estimate-based AC (CLEBAC) = block if fraction marked/all exceeds a threshold
- Observation-based AC (OBAC) = one marked packet switches IEA to block state, switch back to admit if no marked packets for a timeout

Simulation set-up
- Single bottleneck link with n (~100) independent sources
- Different size IEAs over this link (n or 10)

Metric used: fraction of time (flow blocking prob) when false AC decisions:
- False positive = admit when shouldn't
- False negative = block when shouldn't

Q. Anna Charny: Does 10 flows mean 10 in aggregate, or are there also only 10 on the link?
A. No, 10 in aggregate but 10% over-capacity means 110 flows on the bottleneck link.

Not saying SM (or ERM) is bad, just that it doesn't work well with small IEAs. It's not a property of the link aggregation. It's a property of the ingress-egress aggregation. This is just a property that we would have to live with.

Results Summary
                 false     false     parameter
                 negt'vs   post'vs   sensitivity
  __________________________________________
  CLEBAC ThM     few       few       little
  OBAC   ThM     many      few       little
  CLEBAC ERM     few       many      little
  OBAC   ERM     few       many      significant

With ERM, blocking ratio is still low with 10% more flows than intended, but with OBAC you can configure the detection algo to have a fairly high blocking ratio at 10% overload by requiring a long time without any marked packets in block state before returning to admit state.
This says OBAC can give better AC in the cases we have investigated. Probably haven't done as many simulations as Anna, but there are cases where OBAC gives better results, essentially in small aggregates.

[ACh]: In SM presentation we'll try to quantify the excess capacity you need to avoid false positives. Will give quite precise numbers. But results broadly in agreement with more extensive simulations we did.

Summary
  With ThM everything's fine.
  With ERM many false positives if IEA small (10-30 flows)
  Questions usefulness of AC based on ERM, but on other side can say it is useful but just need to know its limitations.
  OBAC also not good in that case, but have seen it can be better than CLEBAC.
  No figures available on the speed of admission control, but OBAC designed to react faster, which is important when we have a flash crowd, but with CLE-based, we have to wait until the end of a measurement period.

Questions?
[ACh]: Rather than saying ER marking is useless for small IEAs, a more accurate statement: If you used ER marking you would have to account for the relatively larger error that results which would translate into the appropriate capacity guidance. Certainly better than saying it doesn't work, but it is certainly more inaccurate with small aggregates than we would like to see.
[MM]: [Essentially agreeing] Don't say it doesn't work. We need to know: Do we have small IEAs? etc. We need to know whether we have difficulties, and if so how do we deal with them.

o Edge-assisted marked flow termination (EMFT)      Michael Menth   30 min
  draft-menth-pcn-emft-00
====================================
Slides at:

Summary of presentation:
Marked flow termination (MFT): terminates some flows that are marked.
Better for ECMP because only flows marked are terminated.
Problem: too aggressive if terminate every flow with a marked packet.
Solutions: Marking frequency reduction:
- Core-assisted MFT (CMFT - same as 3SM): Modify excess rate marking to proportionately mark
- Edge-assisted MFT (EMFT): Edges only terminate some flows that receive marked packets

EMFT, two options:
- Flow-based EMFT: Per flow credit counter initialised randomly; each marked byte reduces counter, when .le.0, terminate flow
- Aggregate-based EMFT: Per IEA credit counter initialised randomly; each marked packet reduces counter, when .le.0, terminate a flow picked from IEA; increase credit counter proportionately to rate of terminated flow

[BB]: Credit counter should be decremented by no. of marked bytes, not packets, to be fair to small-packet flows.
[MM]: Y [slide says bytes, incorrectly said packets]

Expt set-up: single bottleneck, termination rate at 100 flows. 100% overload. 200ms termination delay. No packet loss is a recognised limitation.

[GK]: Suppose there is severe congestion somewhere, traffic going through the congestion point will get rerouted to another route, then at least one point in the new path will get congested and packets get dropped. At that point the terminate marked packets will never get to the edge. Then this will not work. If have severe disaster, traffic going through the congestion point will get rerouted to another route, then at least one point in the new path will get congested and packets get dropped. If packets that would have been marked are dropped preferentially, may not get enough marked packets to trigger termination. May not get /any/ packets through for some flows. If this problem is not solved then we fail on getting a sufficient solution for flow termination.
[MM]: Will get enough, but yes, it will slow down termination. We solve this problem by ensuring the termination threshold is not too close to the link rate.
[GK]: Repeated point about possibly not getting /any/ marked packets in some flows with severe congestion of >100%.
[MM]: Not true, as long as termination threshold below link rate, then flows get termination marks.
[GK]: /Some/ flows.
[Chair]: Pls continue presenting (? inaudible?)

Flow termination delay is feedback time of system. The time between termination being triggered and no more packets from that flow arriving at the egress. Parameter we have to take into account: set to 200ms and did sensitivity studies.

Knob to control termination speed: Termination aggressiveness, alpha. Controls marking frequency reduction. Same alpha for core or edge solutions. Can control tradeoff between speed of termination and over-termination. If too high, over-terminates. If too low, termination too slow but doesn't over-terminate.

All the approaches have similar mean termination curves with time, but the differences can be seen by plotting the 5th and 95th percentiles. Then flow-based EMFT has wider variance. Termination process still in reasonable, controllable, bounds.

For larger (double) overload, still get fast termination in one more cycle, because of exponential nature of termination curve.

Packet loss, not simulated, would reduce termination speed (shortcoming of simulations).

Differences between variants.
For all, flow termination delay affects termination speed. But not fairness - i.e. if long & short flow termination delay mixed in same network, they have the same flow termination probability.

Comparison:
* CMFT:
  - Termination
  - No termination priorities
  - allows anti-cheating for e2e PCN (no details - not in charter)
* EMFT
  - termination precedence possible
  - flow & aggregate both suitable, but richer policies with aggregates
  - per flow suitable for e2e (not chartered), per aggregate suitable for PCN domain

Summary:
Have added analysis of edge-based marking frequency reduction to previous work in 3SM on core-based marking frequency reduction.
Marked-flow termination exactly same whether marking freq reduction at edge or core - same knob in both cases.
Mechanism self-corrects so invariant to many system parameters (see papers for which parameters are important and have an impact).
Severe overload and packet loss missing in this study, but important and needs quantifying.

[BB]: Important decision for group: Do we put marking freq reduction on core routers or at the edge. Did this study show there was any statistical difference?
[MM]: Managed to design system so termination behaviour is exactly the same. Pointed out differences: Core-based allows for anti-cheating control at the borders. However, edge gives better control on termination process because
  - termination behaviour independent of different packet frequencies (packet sizes) so we get better fairness
  - can apply policy control to flow termination.
[GK/MM]: Repeated exchange above about packet losses.
[GK]: Only 20% of the 110% over link usage, resulting in 10% of the re-routed flows will reach the termination decision point. Hence the termination decision operation will be very slow.
[GK]: Could be a lot slower
[MM]: May be 2 s instead of 100ms
[GK]: 1 min
[MM]: no (chuckling)

o LC-PCN      Georgios Karagiannis   20 min
  draft-westberg-pcn-load-control-03
====================================
Slides:
Acknowledged Lars Westberg, Anurag Bhargava, Attila Bader.

Load Controlled LC-PCN
Presentation summary:
LC-PCN provides solutions to the 3 main issues:
* AC
  - Data marking
  - Probing (solves ECMP during AC)
  - Combination of both
* Flow termination
  - Base mode
  - optimization mode
* ECMP solutions
  - AC - using probing
  - FT - using affected Marking

ECMP problem outlined (see slide)
ECMP AC solution:
- if interior node congested, then all probe packets must be marked
- problem: with ER marking not all packets marked
- solution: use router alert option so interior nodes know it's a probe packet and mark it always if rate is above admission threshold

Raj Jain [RJ], Washington Uni: If probes special they could be small so many more could be handled than data packets.
They need not have single bit, but could contain fields to carry more information from the congested nodes.
[GK]: next slide explains this.
Lars Eggert [LE]: Brought up on list: Because ECMP not standardised, don't know whether router alert will make probe go through slow path and not see ECMP - we don't know. Might take a different path. Could work but we don't know that.
[GK]: That's True
Phil Eardley [PE]: Two points on one slide. Other point is correct (all probes must be marked). But an operator might just implement AC and not FT and hence may not choose SM as the algo for 2 codepoint solution. May possibly do something else for Flow Termination.
[LE]: In that case, should use the SHOULD and not MUST words.
[GK (realising answer to Raj's question is not on next slide, but continued)]: Probes must have same flow ID as user packets so they will be treated same as user packets.
[LE]: But you don't know that (repeating point above) so can't depend on this behaviour.
[GK]: This could be different for RFC3175 where e2e RSVP path is used with a different protocol ID and aggregation level 0. I'm not sure how it works but it might be that the RSVP path is not taken out of the fast path.
[LE]: Problem: no standard that says router alert packets must take same path as normal packets.
[SB]: Isn't it a problem if RSVP PATH messages (and the reserve state they pin in the other direction) aren't on the same path as the data flow?
[BB]: I thought, if PCN being used, RSVP turned off on interior nodes, so RSVP would just be treated as data
[GK]: True
[BB]: (but may have different addresses)
[ACh]: At least two reasons that might make this not work:
  i) Router alert as we've heard.
  ii) If probes have different protocol ID, ECMP will likely treat them different to data
[GK]: Might work if RSVP run on top of UDP, otherwise a problem.
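Point (ii) above can be illustrated with a toy hash-based path selector. ECMP hashing is not standardised (as Lars notes), so this is only a sketch under the assumption that a router hashes the classic 5-tuple; FlowKey, ecmp_path, the use of SHA-256 and the example addresses and ports are all hypothetical. The only real-world constants are the IP protocol numbers 17 (UDP) and 46 (RSVP).

```python
import hashlib
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical 5-tuple a router might hash for ECMP path selection."""
    src: str
    dst: str
    proto: int   # IP protocol number: 17 = UDP, 46 = RSVP
    sport: int
    dport: int


def ecmp_path(key: FlowKey, n_paths: int) -> int:
    """Pick one of n_paths equal-cost next hops from a hash of the flow key."""
    digest = hashlib.sha256(repr(tuple(key)).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


N_PATHS = 4
mismatches = 0
for i in range(1000):
    # A UDP data flow, and a probe for it sent as native RSVP (no ports).
    data = FlowKey("10.0.0.1", "10.0.1.1", 17, 20000 + i, 5004)
    probe = FlowKey(data.src, data.dst, 46, 0, 0)
    if ecmp_path(data, N_PATHS) != ecmp_path(probe, N_PATHS):
        mismatches += 1

# A probe that reuses the data flow's identifiers always follows the data path.
same_id_probe = FlowKey("10.0.0.1", "10.0.1.1", 17, 20000, 5004)
data_flow = FlowKey("10.0.0.1", "10.0.1.1", 17, 20000, 5004)
follows_data = ecmp_path(same_id_probe, N_PATHS) == ecmp_path(data_flow, N_PATHS)

print(f"{mismatches} of 1000 probes hashed to a different path than their data flow")
```

Under these assumptions a probe with a different protocol ID lands on the data path only by chance, which is the concern raised; the router alert question (slow path) is separate and not modelled here.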
ECMP & flow termination
-----------------------
Aim: Only flows passing thru congested nodes marked for termination
When PCN interior node in flow termination state, all packets not marked for termination are marked as AfM.
Need extra encoding state with ECMP: AfM
Egress can only select flows marked for termination or AfM

Parameters globally equal across PCN domain
U,N .ge. 1 (see draft)
N is proportionality between measured and marked excess rate.
N can be used to solve the previously mentioned problem (loss of marked packets): when node congested, can mark all the rate that is congested, even the rate that will be dropped.

When ECMP isn't supported only uses one encoding state. But for optimisation accuracy, some things have to be changed. E.g. marked excess rate = measured excess rate / N. Measured before dropping, marked after dropping. And PCN mark-encoded packets should not be preferentially dropped.

[ACh]: LC-PCN identical to SM, but with one big, important exception. SM looks at unmarked packet at edge, LC looks at marked packets, so dropping preference for marked packets has to be reversed.
[GK]: Yup

Flow termination
- Basic mode as described
- Optimization mode: uses sliding window to allow for delays between measurements on egress & ingress nodes

Optimization mode on interior nodes (see slides or draft for state diagram & state change events, & algos)

[BB]: Are you saying this would need to be standardised, or that this is something people can choose to do?
[GK]: That the algorithm works and accuracy can be improved.

Detection of AC identical to SM draft
AC with probing at egress based deterministically on marking of probes (to solve ECMP)
Flow termination identical to SM
Egress node state diag & state change events (see slide)
Ingress nodes easy (obvious)
Probes can be user or signalling. Repeated other previous facts about probes.
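The role of N described under "ECMP & flow termination" above (marked excess rate = measured excess rate / N) can be illustrated with a small worked example. This is a sketch only: the function names, the rates in Mbit/s, and in particular the idea that the egress recovers the excess by multiplying the marked rate by N are assumptions made for illustration, not text from the LC-PCN draft.

```python
def marked_excess_rate(measured_rate: float, configured_rate: float, n: float) -> float:
    """Rate an interior node marks, given the rate measured BEFORE dropping.

    `configured_rate` is the (hypothetical) rate above which traffic counts
    as excess; `n` is the domain-wide proportionality parameter, N >= 1.
    """
    if n < 1:
        raise ValueError("N must be >= 1")
    excess = max(0.0, measured_rate - configured_rate)
    return excess / n


def egress_excess_estimate(marked_rate: float, n: float) -> float:
    """Assumed egress-side inverse: scale the observed marked rate back up by N."""
    return marked_rate * n


# Offered load 150 Mbit/s against a 100 Mbit/s configured rate, N = 5:
# the node marks 10 Mbit/s, so the full 50 Mbit/s of excess can still be
# signalled even if much of the excess traffic itself is dropped.
marked = marked_excess_rate(measured_rate=150.0, configured_rate=100.0, n=5)
estimate = egress_excess_estimate(marked, n=5)
```

With N = 1 this reduces to marking the full measured excess; a larger N needs fewer marked packets to survive dropping, which is why marked packets should not be preferentially dropped in this scheme.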
Summary
LC-PCN at ingress:
- Different to SM
LC-PCN at interior:
- Admission control with data marking:
  - Same as SM draft, but to increase accuracy small modifications needed
- Admission control with probing (additional option to solve ECMP)
- Flow termination:
  - Base mode, same as features used in admission control with data marking
  - Optimization mode (optional feature that is required in order to increase accuracy of algorithm)
- ECMP solution (additional and optional feature that requires a flow termination state and additional encoding state)
LC-PCN at Egress:
- Admission control with data marking: identical to SM
- Admission control with probing (additional option used to solve ECMP problem)
- Flow termination:
  - Detection feature: identical to SM
  - Selection of the flows to be terminated: different than SM
  - Feedback to ingress: different than SM
- ECMP solution (additional and optional feature)

Why is this presentation done here?
  Insight into solutions provided by LC-PCN
  Show how close LC-PCN is to SM

o Discuss encoding options      Moncaster   40 min
  (Carried forward from Tuesday session)
====================================
Slides not yet available on the meeting materials site, but the attachment to Toby's posting to the PCN list should be useful:
(Windows users will have to force a .bin file to be read by their Acrobat Reader)

Intent: Wrap up encoding discussion from previous mtg (bumped from Tuesday's agenda) trying to reach consensus on encoding questions. Need to have been following list - moving rapidly this week.

Two basic options:
  2-state solution for single marking

Proposal: standardise 2-state solution and simultaneously produce experimental extension scheme providing for a 3-state solution. If expt successful and lots of operators require it, then we'll come back with the aim of standardising a 3-state solution.
Constraints
- need to minimise no's of standards space DSCPs
- avoid interactions with tunnels
- comply with relevant RFCs

Probably got agreement from Fred Baker that it would be appropriate for PCN to use the VOICE-ADMIT DSCP he has in tsvwg last call (draft-ietf-tsvwg-admitted-dscp-03) - may or may not be standardised (with that name).

New scheme on list from Toby & Bob Briscoe - v young
* Basic scheme using one DSCP (Fred's VOICE-ADMIT)
  - Not-ECT = Not-PCN, i.e. treat as for Fred's draft, and not a candidate for PCN-marking
  - ECT = PCN-capable but Not Marked (NM)
  - CE = PCN-capable and Excess-rate marked (ERM)
* Extended scheme to allow experiments with 3 marking states and e2e ECN
  - Any packet with VOICE-ADMIT DSCP would mean the same as the basic scheme, except:
    o ECT = PCN-capable but NM and originally ECT before entering this PCN domain
  - Another DSCP (initially from the LU/EXP range) would give 3 extra states
    o Not-ECT = PCN-capable but NM and originally Not-ECT before entering this PCN domain
    o ECT = PCN-capable but NM and originally CE before entering the PCN domain
    o CE = PCN-capable and Threshold marked

Possible state transitions are given in the slide

Two possible consensus questions
Q. Can you live with only standardising a 2 codepoint solution?
[BB]: I assume you mean initially standardising, not for ever?
Q. Can you live with only initially standardising a 2 encoding state solution?
[Chair]: encoding states not codepoints
Q. Can you live with only initially standardising a 2 encoding state solution?
[LE?]: (inaudible)
[Chair]: As an initial standardisation effort for PCN, will people be not violently unhappy if we only pursue standardising a solution that uses 2 encoding states?
[Chair]: Show of hands
[Chair]: 8 for - 0 against - rough consensus

Q. Do you agree that we should simultaneously aim to produce a 3 encoding state solution as an experimental extension?
[GK]: I don't understand what exactly the experimental extension includes?
[LE]: Motivation: Feedback from some operators says minimising DSCPs is a must. But some of the schemes have a creative interpretation of what a codepoint means. Some do hop-by-hop marking, which some people say is outside what Diffserv would allow you to do, while others say that's not the case. But it appears better to consume fewer codepoints. There is also this range, not only experimental but local use (EXP/LU), that is defined by the Diffserv RFC. So the proposal is that a standards track codepoint is used to provide a minimum set of PCN functionality, especially if we use the codepoint being defined in tsvwg, which you've called VOICE-ADMIT but the title is actually Capacity Admitted traffic. So the experiment will be to see if operators would rather make the tradeoff of consuming an extra codepoint to get an accurate solution, or whether they would be willing to have a less accurate solution so they could use the codepoint for something else. Later on, if we've gotten some deployment experience, if many operators have shown they want the more accurate PCN scheme, we've then got a stronger argument to take to the wider community to get an extra codepoint for PCN.
[GK]: OK
[GK]: Another question that would arise: where detection and handling functionalities could be located (ingress or egress)
[TM]: That is irrelevant to this question (interrupted)...
[Chair]: Not trying to rule out any particular edge arrangement, not deciding which of the 3-state schemes we might use
[GK]: Do you agree that another question would be should we have two 3-state solutions?
[Chair]: True
[GK]: A third question could be whether we optimise ingress-based functions and another question could be whether we optimise egress-based functions
[PE]: Clarifying question: Do you mean you want a question on whether we have 2 experimental DSCPs?
[GK]: That's not how I understand it
[Chair]: I'm trying to avoid the ambiguous term codepoint and use only encoding states

Q2: Do we agree that we should do some work to produce at least one experimental extension that will use at least three encoding states?
[Chair]: Show of hands
(some confusion broke out)
[Chair]: Before we call it again, any questions on that question
[BB]: To clarify what Lars said, rather than getting another DSCP, we might instead have enough weight to go to the Security area and get the ECN tunnelling behaviour fixed to give us more encoding states in the ECN field.
[Chair]: Let's just stick with encoding states and leave the details out.

Q. Do we agree as a w-g, that we should do work on experimental extensions that would utilise at least 3 encoding states?
[Chair]: Show of hands
[Chair]: 9 for - 0 against
[SB]: Think phrased it vaguely enough to not preclude what you, Georgios, are suggesting. Are you happy with that?
[GK]: (interrupted...)
[LE]: Let me summarise consensus in a different way to see if you still agree with it:
[Chair]: OK
[LE]: As I understand it, the consensus is there will be a stds track RFC with an encoding state that PCN compliant implementations MUST use. There'll be a 2nd document that is likely going to be an experimental document (sorry, one or more experimental documents but at least one), that uses another encoding state that is optional to implement. That means that if an operator decides to deploy the experiment, they need to purchase equipment that implements that extension.
[PE]: You've only talked about encoding states, not the marking behaviours they imply. I agree when you're just talking about encoding states, but if you also link that into saying something about marking behaviours, then I'm not sure I would agree.
[Chair]: Anna's next presentation will talk about an emerging consensus amongst proposers on core marking behaviours, so can we defer that discussion until after her presentation?
[LE]: OK
[Chair]: Anyone object to Lars's restatement of the consensus?
(no-one)

o PCN Comparison draft-charny-pcn-comparison-00      Anna Charny   20 min
====================================
Slides:

Presentation summary:
Comparison draft - slides not really going to describe that - describe emerging consensus. Work not just of comparison draft authors but most of proposal authors

Comparison draft created for last IETF.
Functional comparison.
Simulations comparing proposals.
Included pseudo-code monstrosity encompassing all core marking proposals (CL, SM & 3SM). Some now obsolete.

Developments since:
  MM's edge-based marked flow termination
  LC-PCN clarification and evolution - essentially we have another proposal - also described.

Given the drastic reduction in encoding states due to tunnelling, I will try to reflect emerging consensus. Try to describe what will be lost with only SM.
Will not compare proposals, which is what the comparison draft is supposed to be about. See individual drafts and the comparison draft itself.

If we assume we have only 2 encoding states, we have 3 options: only admission, only termination or both.
If we allow option of both, then whatever proposal is put forward for standard marking, we must support SM.
Not strictly correct, because LC-PCN also only uses 2 states but the point is SM is the core of both proposals.
So, given it looks like we will go forward with more sophisticated proposals as an experimental track, as a result, any proposal to go forward must support SM.

We must describe the boundary behaviours for whatever solution comes up, but, although we'll try not to constrain anything too much, whatever we describe must not break SM, unless or until other better schemes can be offered.
However, we want to try to keep the door as open as possible to other proposals, given constraints.

If we only have 2 encoding states, what is lost if we have AC only, or termination only?
Obviously the other.
Also need to add configuration about which one you are using. Need to support SM as the core. Not clear to me that anything else is lost if we did just termination or just admission.
[PE]: In your phrase "If we want both", does 'we' mean an operator, it doesn't mean the w-g?
[ACh]: Whatever the w-g does, it needs to know which of the two options to go.
[LE]: Also, if we do only either one, losing some ability to control each function: e.g. might get into termination state more often if don't do admission
[ACh]: Right. Implicit assumption that there is enough interest in having both AC & FT. If that's not the case, a lot of the things I'm going to say will become irrelevant.
Assuming SM must be supported, must have ER metering & marking
[GK]: I'd like to allow optimisation just presented
[ACh]: Slides assume not, but will talk about that later
Other things core node should do
- if arriving packet already excess-rate marked, then don't meter
- any arriving packets already excess-rate marked should be preferentially dropped if drop required
- if metered packet is dropped, don't excess rate meter it
Other things that must be defined
- egress to ingress (& possibly PDP) messages contain:
  - CLE
  - sustainable rate
  - (optional) rate to terminate
  - (optional) ingress sending rate
- Boundary node behaviours to be specified - implicit assumption SM will be the initial behaviour, but more may be defined
What has been lost?
- Req of pref dropping excess-marked packet is problematic for 3SM, EMFT & PC-PCN
- Limits possibility of defining simpler edge behaviours.
- And very suboptimal for LC-PCN
- This proposal limits possibilities of optimisations at core nodes, effectively ignoring all of those in LC-PCN. That needs to be discussed whether this is alright.
Reasons:
- Requires core complexity
- Insufficient simulation evidence of benefit of these core optimisations
[GK]: Would like option of including that later
[ACh]: Stds track marking behaviour can't have a SHOULD, must have a MUST. So need evidence that optimisation necessary
[GK]: Evidence is there in papers
[ACh]: Evidence you're referring to is for a diff algo
[GK]: But similar, the way it terminates flows is exactly the same
[ACh]: Need stronger evidence for a MUST, not a SHOULD
[GK]: Well I do not agree with that (shrug)
Other sacrifice is performance:
- small no. of flows per IEA - not an infrequent case at all
- some perf degradation for multiple bottlenecks on path - see following presentation
- No ECMP solution (needs a further encoding state), so only way to deal with ECMP is probing. But we don't know how to do probing well. SM suboptimal for support of probing because it needs more probes than other methods
- For termination, SM doesn't have enough hooks to stop some flows on path being terminated that are above their admission but below termination threshold
- doesn't allow simpler edge implementations possibly afforded by 3SM & EMFT solutions
[LE]: Clarification question: Which of these limitations cannot be addressed by some amount of over-provisioning?
[ACh]: None cannot be addressed by some amount of over-provisioning, will be addressed in SM presentation
[MM]: Probing to find edge nodes - hard to do admission control without it - cannot be solved
[ACh]: Are you saying probing is needed for address discovery?
[MM]: N
[ACh]: OK
[MM]: Other methods are required
[ACh]: Probing never been assumed to be that method
[GK]: egress behaviour not good for LC-PCN - extensibility is important, e.g. end-edge or centralised scenarios
[ACh]: Two things that prevent PC-PCN solution: i) absence of core optimisation; ii) dropping of SM not good for LC-PCN
[GK]: Also not good for SM
[ACh]: What?
[GK]: Repeated point about dropped packets from previous presentations
[ACh]: No. See email I sent this morning - two examples you gave are not correct for SM
[Chair]: Take that offline
[Chair]: You're going to say what you don't get with SM, so we won't have consensus calls until after that
[BB]: To answer Lars's question:
- ECMP problems of SM wouldn't be solved by overprovisioning
- We don't have a way to go to multiple non-trusting domains with SM, which wouldn't be solved by overprovisioning either
Previous slide said core must do ER marking. Where does that leave ThM marking? Core MUST do threshold marking if want to expt with 3 states or if we want to allow just admission - the 2 state solution no longer allows doing just AC with ThM. So this proposal eliminates that option.
[PE]: Uncomfortable about operator not being able to do AC with PCN and termination with something else. And to be able to do that with just the 2 encoding states.
[ACh]: If wanted to do only AC, would either have to i) say MUST do both marking behaviours, or ii) allow a SHOULD, which we've previously said was not OK for LC-PCN, or iii) we can say EITHER/OR at which point we have an interoperability (?) problem. So we have to decide.
[LE]: We can't have a SHOULD. Need to have 2 MUSTs, otherwise equipment can't interoperate in one domain. While if you have equipment that implements the two MUSTs you can configure it to do either one.
[GK]: Repeated 2 things Anna stated this proposal ignores.
[ACh]: Agreed. This proposal ignores these two things so there would obviously need to be consensus to do that.
[GK]: If these two are ignored, then an egress-based solution will not be possible.
[ACh]: Correct. It will require another standard to do that. It will not be possible if we go with any of the three proposals we just discussed, whether we do a MUST to both ThM & ERM then (interrupted...)
[GK]: Yes, but 3SM and flow marked solution do not work
[ACh]: That was in the previous slide (interrupted...)
[GK]: Does this mean the egress-based solutions are excluded from now?
[ACh]: No what it means (interrupted...)
[??]: No, that's what it means
[ACh]: So, what it means (interrupted...)
[GK]: It indirectly means egress-based solutions are excluded
[ACh]: No.
[GK]: Why not?
[ACh]: What it means is any egress-based solution that would work with egress-based termination would work
[GK]: But doesn't work well.
[ACh]: I think we're in agreement, that some of the schemes that do egress-rate termination do not work well. So in order to reach consensus we need to make some sacrifices (interrupted...)
[GK]: Yes, but in order to do sacrifices (interrupted...)
[ACh]: I'm trying to simply say what the sacrifices are (interrupted...)
[GK]: Yes, but in order to do sacrifices we have to have the right information, and if we don't have the right information we should get it. If we take a wrong decision now then we'll have to change a lot of things later.
[ACh]: Your point is taken, there are some solutions that are precluded by this proposal.
[Chair]: OK
[ACh]: What do we need to do?
- Which 2 state solution to pursue (3 options)?
- Agree on specific encoding - not the point of this presentation
- Once agree, need to write draft
- Need to specify the signalling requirements for whatever we agree on
- Write boundary behaviour

o Single Marking Anna Charny 20 min
  draft-charny-pcn-single-marking-03
  draft-zhang-pcn-performance-evaluation-02
====================================
Slides:
Acknowledged Joy Zhang
Presentation summary:
Single marking poor compared to CL for small aggregates, wrt admission but particularly termination. Also somewhat worse than CL terminating in the presence of overload at multiple bottlenecks in series.
Single marking generally comparable with CL for large IEAs
Of course, can never check every case no matter how many simulations, but some comfort in that these simulations were run to find how bad SM was relative to CL, not how good it is.
* AC: SM has significant AC error only when IEA very low (< 10 flow/IEA) due to synch effects with CBR. Effect disappears with enough randomization of CBR
[BB]: When BT put a typical distribution of sizes of aggregates for their UK core network on the list, a very large proportion of aggregates were small, because, with 100 edge nodes, there are 10,000 possible aggregates and the distribution is long tailed with most aggregates empty. Is it any different to the 'low' figure you have mentioned here, where your 'low' means <10 flow/IEA?
[ACh]: Answer not from simulations, but from intuition gained from simulations. If all of these little aggregates take a relatively small proportion of your bottleneck link, that actually will not be a big deal. However, if these small aggregates take a large proportion of the aggregate link, then this problem will become (significant). But I don't know which of the cases will be more reasonable.
[BB]: So what if IEA was 2, which it was in many, many cases?
[ACh]: The rightmost side of the plot is where there are as many flows as aggregates, so that's the case you're concerned about.
[BB]: Where the black (CBR) plot goes off the (over-admission) scale?
[ACh]: Y.
* FT: Again, over-termination with SM, again, due to uneven marking across flows which is noticeable when there are few flows per IEA. Degree of IE aggregation SM needs for < 10% over-termination is ~50 to ~150 Flow/IE, whereas CL is generally always within 10% for any size IEA. Emphasise that these are truly the worst of the simulations we could find. Can fix by smoothing FT signal, but trades off reaction time. Simulation did not assume smoothing.
* Multi-bottlenecks in series problem:
- FT: Charts represent average of many thousands of simulations. CL stays within 10% (and most below 4% error), while SM has a good proportion still in the 10-20% error range.
- AC: SM & CL fairly equivalent in this respect. Fairness slightly worse with SM, but not at all significant.
Summary: SM comparable to CL for aggregates of sufficient no. of flows, sufficient meaning:
- ~10 or more flows wrt AC
- 10-100 flows wrt FT.
- At low ingress-egress aggregation, Single-Marking is less accurate (over-admission & over-termination)
- In the presence of multiple bottlenecks, Single-Marking termination performs worse than CL-PHB
[GK]: Before freezing the behaviour of the core node I'd like to have the possibility to prove that the LC-PCN mechanism works.
[ACh]: So what are you asking? To delay the decision? The only decision is what exactly the core node function includes.
[GK]: The decision is single marking, that is already taken.
[ACh]: That's a decision for the chairs.
[GK]: If the LC-PCN optimisation is included, the SM functionality will not be affected by it.
[Chair]: I'm in process of formulating a question for the w-g. Something I omitted to say earlier. There may be IPR on some of the proposals before this w-g, and I encourage you to go look at the IETF IPR pages and review those IPR disclosures. Is this a satisfactory statement for the ADs?
[LE]: Y
[BB]: Coming back on what Georgios said, the longer there's delay the less chance of implementers still being around. The whole thing requires doing in a finite amount of time otherwise it will never happen. We're already 6 months late, so we need to be aware that if we delay it may never happen.
[GK]: The delay will only be 2 months. And the delay will not be a delay, because we already have made the decision to use single marking. The only decision is if the MUST should be a SHOULD or not. I'm not asking for another option.
If I can prove the LC-PCN mechanism works then the egress-based functionality could work, then we can set the SHOULD to a MUST. Then if it doesn't, let it stay as SHOULD.
[ACh]: This proposal has been made in view of the fact that there seems to be no more time for doing the research in this group. We need to look at the existing proposal today to have a chance of anything happening. The reason this proposal came about was so that something was available now.
[PE]: My view would be that we go forward, then in 2 months time if you bring evidence that your idea brings us loads, 2 months is not going to be too late to say we gain all this by doing your proposal. But I think we've got ample evidence now to know what direction we should take now. The problem of waiting 2 months is that we will still be arguing at that point whether we have enough evidence on your proposal to make a decision.
[GK]: I think I've been misunderstood. We have made certain decisions (interrupted...)
[ACh]: We haven't made any decisions (interrupted...)
[Chair]: Reiterate that we've only made 2 decisions on encoding state aspects so far
[LE]: We haven't reached consensus on router behaviour, how exactly it marks packets, which is still open, which is what we're discussing now.
[GK]: We have the single marking behaviour
[PE]: We haven't had a consensus call on the single marking behaviour
[GK]: Yes, I know. But this has nothing to do with gain, it has to do with extensibility.
[ACh]: I would propose to delay the discussion until there is evidence, and when there is evidence we can revisit the decisions. Today there is no evidence and we are running out of time.
[GK]: In the formulation of the question can we have this opening?
[Chair]: Does w-g feel they have enough info to come to some conclusion about a way forward here at this time?
[Chair]: Show of hands
[Chair]: 6 for - 1 against
[Chair]: Proposal by Anna for a 2 encoding state solution that we could move to standardise that's based on a type of excess-rate marking
[ACh]: To be clear, the proposal is that the core does excess rate marking (interrupted...)
[Chair]: In a particular way...
[ACh]: No, just using a standard token bucket, but with the specific additions of how already marked packets are handled
[Chair]: Right
[PE]: Pls clarify exactly what the question is.
[Chair]: There is a proposal on the table for excess rate marking which has specifics about how previously marked packets and dropped packets are handled. And the question is, who in the w-g believes that is a reasonable path to move forward on, so that we can empower a design team to go write that up in the direction of becoming a w-g doc for a standardised solution? ... I probably should re-phrase that a little bit
[PE]: I'm very comfortable with having that in std as a MUST. My question will be then whether there will be an opportunity for anything else to be in the std as a MUST, i.e. threshold marking. Or whether, in that question you're saying that would be in an experimental RFC or if that is also in the stds RFC, and then you have this configuration thing.
[LE]: Let me try to rephrase the options. Earlier we had a call on whether we have a two state standards option with a three state experimental extension. Now we're talking about how these bits get set.
- One algorithm option: Single MUST algo to do excess rate marking and that is the only thing that a router needs to implement to be a PCN-compatible router.
- Other algorithm option: Add another MUST that says, e.g., you can configure router to set those same bits
Keeping in mind that having a long shopping list of options on marking schemes in a router defeats the purpose of standardising PCN, because it allows the space to fragment and everyone gets their own scheme again, which is something we wanted to avoid.
- One scheme we thrashed out this morning was that the router MUST implement ThM as well as ERM, and it can be configured for AC without FT, so for a deployment scenario you use the ThM to set the two encoding states rather than the ERM. That's one option. That seems relatively simple, but it's already twice as complex as having one mandatory to implement scheme.
[ACh]: That would mean that each router that claims compatibility with PCN and is used in the single marking case is actually required to implement something else - just to clarify?
[LE]: Would need to implement two marking algorithms. Yes. That would be one proposal. Rationale: All expt schemes need ThM of some type, so could have 2 MUSTs. However that's up for discussion.
[ACh]: The drawback would be that fewer vendors would be interested in implementing that.
[LE]: Sure. It's a tradeoff
[GK]: Could the optional points related to egress-based termination also be included in an extension draft?
[LE]: Personal (hat off) interpretation. When we chartered PCN, there were multiple solutions that all required different marking schemes. Not good, as a vendor it required you to implement multiple things. 4 schemes instead of one. Or 5 or 6. For interior we wanted one solution that allowed multiple ingress-egress behaviours. With informational documents that might discuss how you implement different ingress-egress schemes using the one interior behaviour. If we talk about many interior schemes, especially many schemes with interior complexity, we would be in effect standardising multiple solutions here. That I would personally like to avoid because then, why do we have the w-g? There would be no need to standardise anything if you could have had all the options long before with proprietary solutions.
[ACh]: Propose a sequence of questions to get us out of this. What if we ask whether: There is consensus that this kind of marking is required for PCN first.
Then we can ask whether, at a later point, if there is evidence that the additional enhancements that Georgios is proposing are beneficial then we'll revisit that. Third question: Must we also require implementation of threshold marking initially.
[LE]: The beginning questions are good. I'd like to caution that the motivation for allowing multiple schemes must be more than that you don't want to see your baby die.
[GK]: But it's about where the flow termination is actually handled
[ACh]: But we're not discussing that.
[GK]: No, but it's related to what Lars just said.
[ACh]: This has nothing to do with the core behaviour.
[GK]: The difference is not about performance, because I think the SM performance (interrupted...)
[LE]: If performance is not the reason, why does it matter where functionality sits?
[GK]: It has to do with deployment scenarios - if we want to deploy FT at the egress, because there are some scenarios e.g. centralised node that is using the information from only one point. And the other scenario where you have end-edge, if we close it now then if we recharter there will be no chance to work on these things.
[ACh]: Georgios, there is no evidence today for what you're saying right now. If this evidence comes up, can we revisit then? If you want the centralised node case, that still doesn't matter if you make the decision at the ingress. I don't know about the end-edge case because we haven't discussed that. This might come up in the future.
[GK]: That's why it's called extensibility, no?
[LE?]: You (Anna) were going to ask a question.
[ACh]: So I'm going to ask a question. Does the w-g agree that the core behaviour described today which is essentially token-bucket-based excess rate marking with additional handling of excess-rate marked packets is a MUST for the PCN core node behaviour?
[PE]: That depends whether by saying that, that's it.
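
Editorial illustration: the core behaviour named in the question above (token-bucket excess-rate marking plus the handling rules for already-marked and dropped packets listed in the comparison presentation) might be sketched roughly as below. This is a minimal sketch under stated assumptions, not the behaviour agreed by the w-g. The names Packet, ExcessRateMarker and pick_drop are hypothetical, and two details are assumptions: a packet that gets marked consumes no tokens, and tail drop is used when no marked packet is queued.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    size: int                    # bytes
    excess_marked: bool = False  # True if already excess-rate marked upstream


class ExcessRateMarker:
    """Hypothetical token-bucket excess-rate meter and marker."""

    def __init__(self, rate_bytes_per_s: float, depth_bytes: float):
        self.rate = rate_bytes_per_s
        self.depth = depth_bytes
        self.tokens = depth_bytes  # bucket starts full
        self.last = 0.0            # time of last update, in seconds

    def meter(self, pkt: Packet, now: float, dropped: bool = False) -> Packet:
        # Refill for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        # Dropped packets and already-marked packets are not metered.
        if dropped or pkt.excess_marked:
            return pkt
        if self.tokens >= pkt.size:
            self.tokens -= pkt.size   # within the configured rate
        else:
            pkt.excess_marked = True  # excess: mark (assumed: no tokens used)
        return pkt


def pick_drop(queue: list) -> int:
    """Index of the packet to drop, preferring an excess-rate-marked one."""
    for i, p in enumerate(queue):
        if p.excess_marked:
            return i
    return len(queue) - 1  # assumed fallback: tail drop
```

For example, with a 1000 byte/s rate and a 1500 byte bucket, two back-to-back 1000 byte packets leave the first unmarked and the second marked, while an already-marked or dropped packet leaves the bucket untouched.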
[ACh]: Other question that could be asked first: Does the w-g believe that the equipment must be forced to implement two types of markings today?
[MM]: I think it is a must for preferential treatment and single marking to work properly. It would be good to have an option, not to require, preferential marking for experimental development.
[ACh]: If a SHOULD were allowed in the standard, I would be happy to have a SHOULD for this preferential treatment, but I don't know whether we're allowed to put a SHOULD in.
[LE]: I would really like the option space to be very, very small here, because we can't have 20 different marking schemes.
[ACh]: Actually the SHOULD in this case (interrupted...)
[LE]: Actually your second question is imprecise, because two doesn't say what the other one is.
[ACh]: Threshold marking, as defined by Phil in his presentation yesterday, which means roughly virtual-queue based marking.
[LE]: Q1 Do people agree whether ERM is a MUST
[ACh]: And the 2nd question is whether ThM is also a MUST. But Phil wants the 2nd question asked first.
[PE]: I don't mind as long as (interrupted...)
[BB]: I want to put my dilemma to the w-g, because I don't know how to answer these questions because neither gives what I'm looking for but I can't think of another way of doing it. The problem is all these schemes eventually will need two sorts of marking, given the evidence shows SM has problems, but I'd like there to be two MUSTs. But I don't want that to mean an implementer has to say, I can't sell a machine that does PCN now because it won't comply with the standard, because I can only do SM now with my current hardware.
[ACh]: So the tradeoff again is if we ask both now, we have immediate capability to do experimentation with any PCN implementation, but it then requires every router to implement this functionality, even though the use of this extra marking is experimental.
[BB]: It doesn't require every router to do it.
It requires every router to do it to be PCN-compatible.
[ACh]: Right, right, right. Every router that is required to do anything for PCN would be required to do something that is experimental and not standard. To me that's a bit of a problem. I would prefer that all the MUSTs for the experimental scheme would go into the experimental standard.
[BB]: But then you've got configuration complexity later.
[ACh]: Right, right, right.
[LE]: When we say MUST, we mean must implement to be minimally PCN-compliant.
[ACh]: Yes, that's the proposal.
[GK]: I didn't see the third question.
[Chair]: I'll let you ask the question
[Chair]: Questions will be about Excess rate, then the 2 MUSTs, then Georgios's question.
[ACh]: Does the w-g agree that a PCN-compatible node must implement excess-rate marking with additional handling of previously marked packets as described in this presentation?
[Chair]: Show of hands 8 for - 0 against
[ACh]: Does the w-g think that a minimal PCN-compatibility statement will say you must also implement threshold marking?
[Chair]: Show of hands 6 for - 1 against
[GK]: Actually it's the first question but having some time to bring some evidence to show whether LC-PCN is able to perform well in an egress-based scenario.
[Chair]: I don't understand - would that preclude doing ERM in the core?
[GK]: I'm not sure if the question included the exact description of how it was done here in the slides.
[Chair]: Of how excess rate marking is performed?
[GK]: How the token bucket is implemented
[LE]: (to GK) assuming you get time, what is the question you will ask then?
(Audio recording terminated)
[GK]: Is it reasonable to include the additional optimisations of excess-rate marking in the standard?
[BB]: Will it interwork with the details we have just agreed to?
[GK]: Not sure.
[LE?]: If evidence is produced before IETF-72 that there is an improved way, would w-g entertain replacing our standards work on the algo?
[Chair]: Show of hands 2 for - 0 against

o A Simple Analytical Model for Pre-Congestion Notification Raj Jain 20 min
====================================
Slides:
Acknowledged Jin Jiang
Summary of presentation:
Simulation set-up using single bottleneck. See slides for further details.
Flow acceptance probability still significant when 10% over threshold
If admission & termination thresholds close enough, termination probability 5% when admission probability is also 5%: leads to thrashing.
Propose thrashing index based on max product of probability of admission and of simultaneous termination.
[BB]: Indices are oft-quoted, so need to get them right. Shouldn't it be area, not just height of the product of probabilities curve?
[RJ]: Could have another index
Using x% termination threshold (over admission threshold), makes significant difference to thrashing index
Log curve of thrashing index shows highly significant drop in thrashing as thresholds moved apart (see slides).
If flows are on more often, then have to set termination threshold higher (see slide)
For larger capacity links, termination threshold can be set closer
Summary
1. A closed form expression for flow rejection probability and flow termination probability for single marker case
2. The model explains the thrashing behaviour when the system reaches rejection/termination threshold region
3. Thrashing Index = Max{P(Acceptance) x P(Termination)}
4. The termination threshold should be set 10-15% above rejection threshold to avoid thrashing
5. The difference can be less if the number of flows is larger (large capacity links) or if the flow on probability is smaller (inactive flows).
[BB]: Suggested 6th summary point: criterion for judging Th marking algos, how tightly they approach a step to reduce thrashing index.

W-g meeting closed (20-odd mins late)
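
Editorial illustration: the thrashing index from summary point 3 of the Jain presentation, and the area-based variant Bob Briscoe asked about, can be sketched as below. This is only a sketch. The closed-form probability expressions from the presentation are not recorded in these minutes, so the functions take sampled probability curves as input. The function names, the trapezoid rule for the area, and the sample values are all hypothetical.

```python
def thrashing_index(p_accept, p_term):
    """Max over load points of P(Acceptance) x P(Termination)."""
    return max(a * t for a, t in zip(p_accept, p_term))


def thrashing_area(loads, p_accept, p_term):
    """Area under the product curve, by the trapezoid rule (assumed variant)."""
    prod = [a * t for a, t in zip(p_accept, p_term)]
    return sum(
        0.5 * (prod[i] + prod[i + 1]) * (loads[i + 1] - loads[i])
        for i in range(len(loads) - 1)
    )


# Hypothetical sample curves: acceptance falls and termination rises with load.
loads = [0.0, 1.0, 2.0]
p_accept = [1.0, 0.5, 0.0]
p_term = [0.0, 0.5, 1.0]
index = thrashing_index(p_accept, p_term)
area = thrashing_area(loads, p_accept, p_term)
```

Moving the termination threshold further above the admission threshold shrinks the overlap of the two curves, which lowers both figures; the area variant also reflects how wide the overlap region is, not just its peak.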