
Re: [PWE3] draft-bryant-filsfils-fat-pw



Tom/Shane,

Let me see if I can attempt to clarify the requirements.
First let's assume that we want to preserve the order of the PW packets; then I see the following situations:


1) PW is of comparable size to some MPLS trunk link in a link bundle (let's say at least 10% of the size of the link).
2) PW is larger than some individual trunk link in some bundle.
3) PW can use network ECMP (no local link bundles are present along the PW path).


We also have the following sub-cases.
a) PW contains IP traffic, and it can be identified at ingress.
b) PW does not contain IP traffic, or the traffic cannot be identified, or the IP traffic is a single huge flow.


Any combination of the above is possible. So I think this draft makes an attempt at solving 3a and 1a.
However, I think that the most common problem is 1b. Applications that generate a single large flow of PW traffic tend to be enterprise applications,
like database syncing, backups, encrypted 10GE links, etc.


It seems to me that the multiple receiving labels per PW solution is simple, does not change the PWE3 architecture, and does not overly complicate the hardware design of the PE. Since this can clearly be used on an exception basis, it should be adequate to mitigate problems 1a/3a.
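
As a rough illustration of the "multiple receiving labels per PW" idea, here is a minimal Python sketch. It is not taken from the draft: the class name `EgressPE`, the function `ingress_pick_label`, the label values, the attachment circuit name "ac-1", and the use of CRC32 as the flow hash are all assumptions made for the example. The egress PE binds several PW labels to one attachment circuit, and the ingress PE picks one of the advertised labels per flow, so packets of one flow always carry the same label and stay in order.

```python
import zlib
from typing import Dict, List


class EgressPE:
    """Hypothetical egress PE: several PW labels demultiplex to one AC."""

    def __init__(self) -> None:
        self.label_to_ac: Dict[int, str] = {}

    def bind(self, ac: str, labels: List[int]) -> None:
        # Every label in the list identifies the same attachment circuit.
        for label in labels:
            self.label_to_ac[label] = ac

    def deliver(self, label: int) -> str:
        # Look up which attachment circuit a received PW label belongs to.
        return self.label_to_ac[label]


def ingress_pick_label(flow_id: bytes, advertised: List[int]) -> int:
    """Pick one advertised label per flow (CRC32 is an assumed stand-in hash)."""
    return advertised[zlib.crc32(flow_id) % len(advertised)]


# Example: one PW with four receiving labels (values are made up).
egress = EgressPE()
pw_labels = [1001, 1002, 1003, 1004]
egress.bind("ac-1", pw_labels)
chosen = ingress_pick_label(b"10.0.0.1->10.0.0.2", pw_labels)
```

Because the label differs per flow, P routers that hash on the label stack can spread the flows over a bundle or ECMP set without looking into the payload. This does nothing for case "b", where there is only one flow to begin with.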

I do not believe that problem 2 is solvable while keeping the packets in order. However maybe that is not a requirement.

In the case of problems in the category "1b", we can solve them by implementation, without requiring a new protocol.
So what is the most pressing problem to solve?


Luca


Thomas Nadeau wrote:



OK, so speaking as yet another operator....

there's a clear need to support fat PWEs, but I'm yet to be convinced that this draft is the correct solution to the problem.

The intro to the draft talks about the application being to interconnect IP routers. If that's the case then why not use an IP pseudowire? If you do that then there will just be one label, but (AFAIK) many routers will spot the 0x4 (or 0x6) in the first nibble of the payload and do a hash on the IP header - giving optimum traffic distribution and also preserving the order of each flow.
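
To make the first-nibble heuristic concrete, here is a minimal Python sketch of what such a router might do. It is an illustration only: the function names (`mpls_entry`, `split_label_stack`, `hash_key`, `pick_link`) are made up for the example, CRC32 stands in for whatever vendor-specific hash is really used, and only the IP addresses are hashed (real implementations may also include ports and protocol).

```python
import struct
import zlib
from typing import Tuple


def mpls_entry(label: int, bottom: bool, ttl: int = 64) -> bytes:
    """Encode one 32-bit label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    return struct.pack("!I", (label << 12) | (int(bottom) << 8) | ttl)


def split_label_stack(packet: bytes) -> Tuple[bytes, bytes]:
    """Return (label_stack_bytes, payload) by walking entries up to the S bit."""
    offset = 0
    while offset + 4 <= len(packet):
        (entry,) = struct.unpack_from("!I", packet, offset)
        offset += 4
        if entry & 0x100:  # S (bottom-of-stack) bit is set
            return packet[:offset], packet[offset:]
    raise ValueError("no bottom-of-stack entry found")


def hash_key(packet: bytes) -> bytes:
    """Choose the bytes to hash: IP addresses if the payload looks like IP."""
    labels, payload = split_label_stack(packet)
    version = payload[0] >> 4 if payload else 0
    if version == 4 and len(payload) >= 20:
        return payload[12:20]  # IPv4 source + destination address
    if version == 6 and len(payload) >= 40:
        return payload[8:40]   # IPv6 source + destination address
    return labels              # not recognisably IP: fall back to the labels


def pick_link(packet: bytes, n_links: int) -> int:
    """Map the packet to one of n_links members of a bundle or ECMP set."""
    return zlib.crc32(hash_key(packet)) % n_links
```

Note that the guess is made purely on the first nibble, so a non-IP payload that happens to start with 4 or 6 would be treated as IP by this sketch.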

If the payload is not IP then I think we have a problem at any rate, as we don't necessarily know how to identify a "flow". Sure, you could do a MAC hash for an Ethernet pseudowire, but in many cases you see precisely one pair of MAC addresses on the PWE.

Giles

On Nov 28, 2007 2:47 PM, Shane Amante <shane at castlepoint.net> wrote:

    Hi Yaakov,

    Yaakov Stein wrote:
    > Stewart and other authors
    >
    > I just finished reading the FAT-PW draft, and have a few
    > comments/questions.
    >
    > 1. The draft says "Operators have requested the ability..."
    >     Since I have never heard this request from any of the operators
    >     with which we work,
    >     can this be changed to "Some operators have requested ..." ?
    >     Since there is one operator on the author list, I guess we can
    >     guess which operator has requested this feature !

    Speaking as /another/ operator, I can say there is an absolutely strong
    need to solve this problem, (and, has been for quite a long time,
    actually).  Consider the fact that 10 GbE has become (is becoming?) a
    pretty common access circuit to Backbones and that within most SP
    networks the dominant Backbone link size is 10G.  As you're likely well
    aware, the IEEE HSSG is working on both 40 GbE and 100 GbE.  Once 40 GbE
    is available, (and assuming it's used for WAN connectivity, perhaps
    similar to 10 GbE LAN PHY), then OC-768c Backbone links will suffer the
    same problem.  100 GbE will, eventually, be used as both core and access
    links.  In short, this problem is not going away.  We need to solve it.


Agreed. Speaking as another operator, I too am concerned that we solve
this problem, but I do not like the approaches described in this draft.


    > 2. The example given is for Ethernet PWs. Is this draft limited to
    >     this case?
    >     There is discussion of whether it is limited to IP over Ethernet,
    >     but this more basic question is not addressed.
    >     For example, could this load balancing be performed for ATM PWs
    >     based on the AAL5 flows?

    From my perspective, Ethernet is far and away the biggest "problem
    child" out there today, due to the size of access to Backbone links,
    (see above).  While it may be admirable to look at making this draft
    "generic" for a variety of PW types, I wouldn't lose any sleep if this
    draft remained focused on just Ethernet.



    > 3. PWs are an emulation of the native service.
    >    Why is this emulation being called upon to deliver a feature NOT
    >    present in the native service ?
    >    Doesn't this break the model a bit?
    >
    > 4. A native service processing function is required for
    >    differentiating between different flows at ingress. If this draft
    >    is indeed limited to Ethernet PWs, such a processing function
    >    already exists in the native service. 802.3 clause 43 (LAG) defines
    >    conversations for exactly this purpose (commonly implemented by
    >    hashing IP addresses and port numbers), and even mentions the use
    >    of load balancing in the distribution of conversations over links.
    >    I think this function should be at least referenced.
    >
    > 5. My greatest problem is with the preferred mode of section 1.1,
    >     which builds a PW label stack under the MPLS label stack.
    >     The proposal is for 2 PW labels (once again, somewhat breaking
    >     RFC3985).
    >     Figure 2 is not completely clear about the label structure.
    >     There are two possibilities:
    >      1) both load balancing label and PW label have stack bit set.
    >         (I hope not !)
    >      2) the load balancing label has S=1, and the PW label has S=0.
    >          So formally, the PW label seems to be an MPLS label.
    >     Both possibilities break the standard model.
    >
    >    I would certainly like to see more justification of the problem
    >    before breaking the model in this way.
    >    Perhaps a short requirements document is in order?

    When I read the draft, this is the part I also had the most concern
    with.  In particular, I like the "simplicity" of the LB Label approach
    (i.e.: savings on FIB space, no need to signal first and last labels for
    each PW, etc.); however, I am concerned about the implications of, or
    potential need to, define a 'generic' MPLS PW label.


In addition to this, I suggest that the requirements first be investigated before we go ahead with this solution. Speaking as someone who needs to make different boxes interoperate in a network, I would prefer a SINGLE solution to this problem.
When we have different protocols, it is generally ok to have different approaches, but having
different approaches in this case seems to make things exponentially harder.


--Tom


    My primary concern is future extensibility.  Specifically, in case there
    are /other/ applications, which may or may not have been brought to the
    surface, yet, that may have similar needs/desire for a 2nd PW label.  If
    that ultimately means we gain consensus to amend the PWE3 Architecture,
    I'm OK with that, but certainly we would need to have more discussion to
    see whether or not it is a good approach and, more importantly, what are
    the other implications that go along with it?



    > 6. The draft recommends generating a load balancing label in such
    >     fashion that the entropy is high. This assumes that the precise
    >     form of the label is used to determine the load balancing path
    >     (possibly a hash of some sort).
    >     Could this mechanism, even if beyond the scope of the document,
    >     be explained a bit more ?

    Load-balancing over LAG and ECMP paths, using some number of MPLS labels
    as input to a load-balancing hash algorithm, is common across all
    vendors.  However, such algorithms are 'proprietary' to each vendor.
    I'm not sure how much more can be said other than the fact that, one
    would strongly prefer that the output of a LAG or ECMP hashing algorithm
    is spread out among the largest number of hash buckets, (as is
    practical), to get the most even distribution of flows across a set of N
    links in a LAG or ECMP path.  And, I think the draft already makes this
    point, in Section 3:
    ---snip---
       It is recommended that the method chosen to
       generate the load balancing labels introduces a high degree of
       entropy in their values, to maximise the entropy presented to the
       ECMP path selection mechanism in the LSRs in the PSN, and hence
       distribute the flows as evenly as possible over the available PSN
       ECMP paths.
    ---snip---

    Is there something else you had in mind?

    -shane
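
The interaction described above (a high-entropy load balancing label at the ingress PE, and an LSR hashing the label stack into buckets) can be sketched in Python as follows. This is only an assumed model: the actual hash algorithms are vendor-specific, and the function names `load_balancing_label` and `lsr_pick_path`, the CRC32 hash, and the example label value 5000 are all invented for the illustration.

```python
import zlib
from typing import Sequence

LABEL_MIN = 16               # label values 0-15 are reserved
LABEL_MAX = (1 << 20) - 1    # MPLS labels are 20 bits wide


def load_balancing_label(flow_id: bytes) -> int:
    """Ingress PE: derive a per-flow label spread over the usable label space."""
    span = LABEL_MAX - LABEL_MIN + 1
    return LABEL_MIN + zlib.crc32(flow_id) % span


def lsr_pick_path(labels: Sequence[int], n_paths: int) -> int:
    """LSR: hash the label stack and reduce it to one of n_paths buckets."""
    key = b"".join(label.to_bytes(3, "big") for label in labels)
    return zlib.crc32(key) % n_paths


# Example: the same PW label (made-up value 5000) with per-flow LB labels.
pw_label = 5000
for flow in (b"flow-1", b"flow-2", b"flow-3"):
    stack = [pw_label, load_balancing_label(flow)]
    path = lsr_pick_path(stack, n_paths=4)
```

The point of the high-entropy recommendation shows up here: if the ingress only ever produced a handful of distinct label values, the LSR could fill at most that many buckets, however good its own hash is.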


    > 7. With the optional mode of section 1.2 several PW labels are mapped to
    >     a single AC.
    >     I have no problem with this approach. In fact, I feel that it is
    >     somewhat similar to the solutions being proposed for PW protection.
    >     For PW protection two labels mapped to the AC or end-user application,
    >     where one label belongs to the active PW, and the other to the
    >     backup PW (not being used).
    >     For load balancing two or more PWs, all in active state, are mapped
    >     to the same AC.
    >     Would it be possible to integrate the two features into one mechanism
    >     for mapping multiple PW labels in either active or backup state to
    >     one AC or end-user identifier?
    >
    > 8. The term VC as opposed to PW is used in various places in the document.
    >     I am not sure what is meant here. Is the intent that a "VC" is one
    >     of the paths of the load-balanced "PW" ?
    >
    > The first paragraph of section 4 seems to imply that the authors are
    > willing to settle on either of the modes rather than both.
    > I would support the PW label mode.
    > If some entropy-rich information needs to be placed in the packet,
    > perhaps the flags in the CW could be used (if 16 paths is sufficient).
    >
    > Y(J)S
    >
    > ------------------------------------------------------------------------

    >
    > _______________________________________________
    > pwe3 mailing list
    > pwe3 at ietf.org <mailto:pwe3 at ietf.org>
    > https://www1.ietf.org/mailman/listinfo/pwe3


