Re: [PWE3] draft-bryant-filsfils-fat-pw
Shane Amante wrote:
> Luca,
>
> On Dec 11, 2007, at 2:23 PM, Luca Martini wrote:
>> Tom/Shane,
>>
>> Let me see if I can attempt to clarify the requirements.
>> First let's assume that we want to preserve the order of the PW packets,
>> then I see the following situations:
>>
>> 1) PW is of comparable size to some MPLS trunk link in a link
>> bundle. (let's say at least 10% of the size of the link).
>> 2) PW is larger than some individual trunk link in some bundle.
>> 3) PW can use network ECMP. ( no local link bundles are present along
>> the PW path )
>>
>> We also have the following sub cases.
>> a) PW contains IP traffic , and can be identified at ingress.
>> b) PW does not contain IP traffic/traffic cannot be ID/IP traffic is
>> a single huge flow.
>>
>> Any combinations of the above are possible. So I think this draft
>> makes an attempt at solving 3a , and 1a.
>
> I agree with you, so far. IMHO, in the context of just 1a vs. 3a, 1a
> is a more pressing problem than 3a.
>
>
>> However I think that the most common problem is 1b. Applications that
>> generate a single large amount of PW traffic tend to be enterprise
>> applications, like database syncing, backups, encrypted 10GE links, etc.
>>
>> It seems to me that the multiple receiving labels per PW solution
>> is simple, does not change the pwe3 architecture, and does not overly
>> complicate the hardware design of the PE. Since this can clearly be
>> used on an exception basis it should be adequate to mitigate problem
>> 1a/3a.
>
> In giving this more thought, and in private discussions with some
> others, the PW Labels Block approach seems more like a 'band-aid',
> than providing a relatively long-term fix for the problem at hand.
> Specifically, consider
Yes, I agree that it is not perfect. However, it does not change the
forwarding plane, which means that cost-optimized hardware has a chance
of supporting it.
> the following use cases where you [will] see fat PW's:
> 1) point-to-point VPWS, specifically in a hub-and-spoke environment,
> e.g.: a Hub site containing a (N x) 10 GbE NNI to another carrier,
> Enterprise DataCenter, etc. Funneling in to that Hub site will be
> GbE, N x GbE and 10 GbE EoMPLS VC's.
> 2) multipoint-to-multipoint VPLS. Although I can't say this is a hard
> requirement at the moment, I would expect that as VPLS adoption
> continues to grow that the solution for "Fat PW's" being discussed
> here will, most likely, get re-used in VPLS at some point down the road.
> 3) perhaps, further out than VPLS, MS-PW ...
>
VPLS and VPWS both use PWs, so I believe that any solution we can design
for a PW would automatically apply there as well.
> Ultimately, the PW Labels Block approach burns LFIB space. This seems
> to me to be particularly problematic in p2mp and mp2mp topologies,
> since the impact on LFIB space is felt by all PE nodes using the PW
> Labels Block approach. In addition, I'm not necessarily convinced the
> impact on LFIB space will be minor, since the more VC
I do not believe that there is a requirement for huge single multicast
flows at this point.
Is this what you mean by p2mp?
> labels that are allocated for a particular PW, the more diverse input
> is provided to load-hashing algorithms within core LSR's and,
> ultimately, the more evenly flows are distributed over component-links.
>
> Finally, the PW Labels Block approach is concerning because operators
> will likely have to play around to find the 'right' size PW Labels
> Block for their network, LAG/ECMP sizes, etc. Thus, as their network
> grows, they'll likely have to go back and re-adjust the PW Label Block
> larger or smaller always trying to optimize LFIB size vs. even
> load-balancing ...
>
This is very tricky. Since there is no standard, we need to guess what
happens here. A small number of labels should be sufficient. We can
always use a programmatic system based on link BW and the number of links
to figure out how many labels we use.
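
A minimal sketch of what such a programmatic sizing rule could look like.
Everything here is an assumption made for illustration: the function name,
the rule itself (allocate enough labels that an even split of the PW puts at
most a chosen share of the smallest component link on each label), the 10%
default, and the cap. None of it comes from the draft.

```python
import math


def labels_for_pw(pw_bw_gbps, min_link_bw_gbps, max_share=0.10, cap=64):
    """Hypothetical heuristic: how many PW labels to allocate for one PW.

    Picks enough labels that, if the PW's traffic were split evenly across
    them, each label would carry at most `max_share` of the smallest
    component link. The result is clamped to the range 1..cap.
    """
    if pw_bw_gbps <= 0 or min_link_bw_gbps <= 0:
        raise ValueError("bandwidths must be positive")
    per_label_budget = max_share * min_link_bw_gbps
    needed = math.ceil(pw_bw_gbps / per_label_budget)
    return max(1, min(needed, cap))


if __name__ == "__main__":
    # A 10G PW over 10G component links, and a 1G PW over the same links.
    print(labels_for_pw(10, 10), labels_for_pw(1, 10))
```

Note that a rule like this only helps when the ingress PE can actually spread
the PW's traffic over the labels; it does nothing for a single indivisible flow.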
> Ultimately, BW is growing and shows no signs of abating, (BW growth is
> good for all of us! :-). I don't believe, as you state above, that
> this solution will only see limited use in "exception cases". Other
> networks will, if they
You make a big assumption that we can identify the flows inside the PW
at the AC.
With 10G Ethernet encryption hardware approaching commodity pricing,
I'm not sure that it is a good assumption.
> don't already, need a solution to this same problem. Therefore, I
> would advocate we think through the design fully and make sure it's:
> a) easy to configure/use/operate, esp. over long time scales; b) it's
> easily extensible to other protocols, (e.g.: VPLS); and, c) of course,
> scalable and interoperable with other network elements.
>
>
>
>> I do not believe that problem 2 is solvable while keeping the packets
>> in order. However maybe that is not a requirement.
>>
>> In the case of problems in the category "1b", we can solve them by
>> implementation without requiring new protocols.
>
> I'm not sure I follow if, or how, you're proposing to solve 1b,
> since your (b) above says that: "PW does not contain IP
> traffic/traffic cannot be ID/IP traffic is a single huge flow" ...
> then, you go on to list applications for which there is no way for the
> ingress PE to identify a microflow in order to assign either a PW
> Label Block or Load-Balance Label. Can you clarify what you mean above?
>
Not on this list. ;-)
The point is that there are solutions that do not require us to change
any protocols.
Luca
> Thanks,
>
> -shane
>
>
>
>> So what is the most pressing problem to solve ?
>>
>> Luca
>>
>>
>> Thomas Nadeau wrote:
>>>
>>>
>>>
>>>> OK, so speaking as yet another operator....
>>>>
>>>> there's a clear need to support fat PWEs, but I'm yet to be
>>>> convinced that this draft is the correct solution to the problem.
>>>>
>>>> The intro to the draft talks about the application being to
>>>> interconnect IP routers. If that's the case then why not use an IP
>>>> pseudowire? If you do that then there will just be one label, but
>>>> (AFAIK) many routers will spot the 0x4 (or 0x6) in the first nibble
>>>> of the payload and do a hash on the IP header - giving optimum
>>>> traffic distribution and also preserving the order of each flow.
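
A rough sketch of the first-nibble behaviour described above, for readers
who have not seen it. The function name and the choice of CRC32 as the hash
are assumptions for illustration only; real routers use vendor-specific
hashes and may include more fields (for example, ports).

```python
import zlib


def ecmp_bucket(payload: bytes, top_label: int, n_paths: int) -> int:
    """Pick one of n_paths for the bytes that follow the MPLS label stack.

    If the first nibble looks like IPv4 (4) or IPv6 (6), hash the source
    and destination addresses so that each IP flow stays on one path.
    Otherwise fall back to hashing the label alone.
    """
    if n_paths <= 0:
        raise ValueError("n_paths must be positive")
    version = payload[0] >> 4 if payload else 0
    if version == 4 and len(payload) >= 20:
        key = payload[12:20]   # IPv4 source + destination address
    elif version == 6 and len(payload) >= 40:
        key = payload[8:40]    # IPv6 source + destination address
    else:
        key = top_label.to_bytes(4, "big")
    return zlib.crc32(key) % n_paths
```

With a non-IP payload every packet of the PW hashes on the same label and
therefore lands on the same path, which is the problem this thread is about.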
>>>>
>>>> If the payload is not IP then I think we have a problem at any
>>>> rate, as we don't necessarily know how to identify a "flow". Sure,
>>>> you could do a MAC hash for an Ethernet pseudowire, but in many
>>>> cases you see precisely one pair of MAC addresses on the PWE.
>>>>
>>>> Giles
>>>>
>>>> On Nov 28, 2007 2:47 PM, Shane Amante <shane at castlepoint.net> wrote:
>>>>
>>>> Hi Yaakov,
>>>>
>>>> Yaakov Stein wrote:
>>>> > Stewart and other authors
>>>> >
>>>> > I just finished reading the FAT-PW draft, and have a few
>>>> > comments/questions.
>>>> >
>>>> > 1. The draft says "Operators have requested the ability..."
>>>> > Since I have never heard this request from any of the operators
>>>> > with which we work,
>>>> > can this be changed to "Some operators have requested ..." ?
>>>> > Since there is one operator on the author list, I guess we can
>>>> > guess which operator has requested this feature !
>>>>
>>>> Speaking as /another/ operator, I can say there is an absolutely
>>>> strong need to solve this problem, (and, has been for quite a long
>>>> time, actually). Consider the fact that 10 GbE has become (is
>>>> becoming?) a pretty common access circuit to Backbones and that
>>>> within most SP networks the dominant Backbone link size is 10G. As
>>>> you're likely well aware, the IEEE HSSG is working on both 40 GbE
>>>> and 100 GbE. Once 40 GbE is available, (and assuming it's used for
>>>> WAN connectivity, perhaps similar to 10 GbE LAN PHY), then OC-768c
>>>> Backbone links will suffer the same problem. 100 GbE will,
>>>> eventually, be used as both core and access links. In short, this
>>>> problem is not going away. We need to solve it.
>>>>
>>>
>>> Agreed. Speaking as another operator, I too am concerned that we solve
>>> this problem, but I do not like the approaches described in this draft.
>>>> > 2. The example given is for Ethernet PWs. Is this draft limited
>>>> > to this case?
>>>> > There is discussion of whether it is limited to IP over Ethernet,
>>>> > but this more basic question is not addressed.
>>>> > For example, could this load balancing be performed for ATM PWs
>>>> > based on the AAL5 flows?
>>>>
>>>> From my perspective, Ethernet is far and away the biggest "problem
>>>> child" out there today, due to the size of access to Backbone
>>>> links, (see above). While it may be admirable to look at making
>>>> this draft "generic" for a variety of PW types, I wouldn't lose any
>>>> sleep if this draft remained focused on just Ethernet.
>>>>
>>>>
>>>>
>>>> > 3. PWs are an emulation of the native service.
>>>> > Why is this emulation being called upon to deliver a feature NOT
>>>> > present in the native service ?
>>>> > Doesn't this break the model a bit?
>>>> >
>>>> > 4. A native service processing function is required for
>>>> > differentiating between different flows at ingress. If this draft
>>>> > is indeed limited to Ethernet PWs, such a processing function
>>>> > already exists in the native service. 802.3 clause 43 (LAG)
>>>> > defines conversations for exactly this purpose (commonly
>>>> > implemented by hashing IP addresses and port numbers),
>>>> > and even mentions the use of load balancing in the distribution
>>>> > of conversations over links.
>>>> > I think this function should be at least referenced.
>>>> >
>>>> > 5. My greatest problem is with the preferred mode of section 1.1,
>>>> > which builds a PW label stack under the MPLS label stack.
>>>> > The proposal is for 2 PW labels (once again, somewhat breaking
>>>> > RFC3985).
>>>> > Figure 2 is not completely clear about the label structure.
>>>> > There are two possibilities:
>>>> > 1) both load balancing label and PW label have stack bit set.
>>>> > (I hope not !)
>>>> > 2) the load balancing label has S=1, and the PW label has S=0.
>>>> > So formally, the PW label seems to be an MPLS label.
>>>> > Both possibilities break the standard model.
>>>> >
>>>> > I would certainly like to see more justification of the problem
>>>> > before breaking the model in this way.
>>>> > Perhaps a short requirements document is in order?
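
To make the second possibility concrete, here is a small sketch that encodes
a three-label stack using the standard MPLS label stack entry layout (20-bit
label, 3-bit traffic class, 1-bit S, 8-bit TTL). The helper names and the
label values are hypothetical; only the bit layout is standard.

```python
import struct


def encode_lse(label, tc=0, s=0, ttl=64):
    """Encode one 32-bit MPLS label stack entry: label | TC | S | TTL."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or s not in (0, 1) or not 0 <= ttl < 256:
        raise ValueError("tc, s or ttl out of range")
    return struct.pack("!I", (label << 12) | (tc << 9) | (s << 8) | ttl)


def encode_stack(labels, ttl=64):
    """Encode labels top to bottom; only the last entry gets S=1."""
    last = len(labels) - 1
    return b"".join(
        encode_lse(label, s=int(i == last), ttl=ttl)
        for i, label in enumerate(labels)
    )


def s_bits(stack):
    """Return the S bit of each 4-byte entry in an encoded stack."""
    return [
        (struct.unpack("!I", stack[i:i + 4])[0] >> 8) & 1
        for i in range(0, len(stack), 4)
    ]


if __name__ == "__main__":
    # Hypothetical values: PSN tunnel label, PW label, load-balancing label.
    stack = encode_stack([1000, 2000, 3000])
    print(s_bits(stack))  # S is set only on the load-balancing label
```

In this layout the PW label is no longer at the bottom of the stack, which is
the departure from the existing model being questioned here.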
>>>>
>>>> When I read the draft, this is the part I also had the most concern
>>>> with. In particular, I like the "simplicity" of the LB Label
>>>> approach (i.e.: savings on FIB space, no need to signal first and
>>>> last labels for each PW, etc.); however, I am concerned about the
>>>> implications of, or potential need to, define a 'generic' MPLS PW
>>>> label.
>>>>
>>>
>>> In addition to this, I suggest that the requirements first be
>>> investigated before we go ahead with this solution. Speaking as
>>> someone who needs to make different boxes interoperate in a network,
>>> I would prefer a SINGLE solution to this problem.
>>> When we have different protocols, it is generally ok to have
>>> different approaches, but having
>>> different approaches in this case seems to make things exponentially
>>> harder.
>>>
>>> --Tom
>>>
>>>
>>>> My primary concern is future extensibility. Specifically, in case
>>>> there are /other/ applications, which may or may not have been
>>>> brought to the surface, yet, that may have similar needs/desire for
>>>> a 2nd PW label. If that ultimately means we gain consensus to amend
>>>> the PWE3 Architecture, I'm OK with that, but certainly we would
>>>> need to have more discussion to see whether or not it is a good
>>>> approach and, more importantly, what are the other implications
>>>> that go along with it?
>>>>
>>>>
>>>>
>>>> > 6. The draft recommends generating a load balancing label in such
>>>> > fashion that the entropy is high. This assumes that the precise
>>>> > form of the label is used to determine the load balancing path
>>>> > (possibly a hash of some sort).
>>>> > Could this mechanism, even if beyond the scope of the document,
>>>> > be explained a bit more ?
>>>>
>>>> Load-balancing over LAG and ECMP paths, using some number of MPLS
>>>> labels as input to a load-balancing hash algorithm, is common
>>>> across all vendors. However, such algorithms are 'proprietary' to
>>>> each vendor. I'm not sure how much more can be said other than the
>>>> fact that, one would strongly prefer that the output of a LAG or
>>>> ECMP hashing algorithm is spread out among the largest number of
>>>> hash buckets, (as is practical), to get the most even distribution
>>>> of flows across a set of N links in a LAG or ECMP path. And, I
>>>> think the draft already makes this point, in Section 3:
>>>> ---snip---
>>>> It is recommended that the method chosen to
>>>> generate the load balancing labels introduces a high degree of
>>>> entropy in their values, to maximise the entropy presented to the
>>>> ECMP path selection mechanism in the LSRs in the PSN, and hence
>>>> distribute the flows as evenly as possible over the available PSN
>>>> ECMP paths.
>>>> ---snip---
>>>>
>>>> Is there something else you had in mind?
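
As a toy illustration of why label entropy matters, the sketch below hashes
the bottom label of each packet onto N component links and counts packets per
link. The use of SHA-256 and the function name are assumptions made only for
this example, since the real hash in each LSR is vendor-specific.

```python
import hashlib


def link_load(bottom_labels, n_links):
    """Count how many packets land on each of n_links component links.

    Each packet is represented only by its bottom-of-stack label, which a
    stand-in hash maps to a link index.
    """
    if n_links <= 0:
        raise ValueError("n_links must be positive")
    counts = [0] * n_links
    for label in bottom_labels:
        digest = hashlib.sha256(label.to_bytes(4, "big")).digest()
        counts[int.from_bytes(digest[:4], "big") % n_links] += 1
    return counts


if __name__ == "__main__":
    # One PW label for every packet: all traffic lands on a single link.
    print(link_load([100] * 1000, 4))
    # 1000 distinct load-balancing labels: traffic spreads over the links.
    print(link_load(range(16, 1016), 4))
```

The first case models a fat PW with a single label; the second models
per-flow load balancing labels with many distinct values.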
>>>>
>>>> -shane
>>>>
>>>>
>>>> > 7. With the optional mode of section 1.2 several PW labels are
>>>> > mapped to a single AC.
>>>> > I have no problem with this approach. In fact, I feel that it is
>>>> > somewhat similar to the solutions being proposed for PW
>>>> > protection.
>>>> > For PW protection two labels are mapped to the AC or end-user
>>>> > application, where one label belongs to the active PW, and the
>>>> > other to the backup PW (not being used).
>>>> > For load balancing two or more PWs, all in active state, are
>>>> > mapped to the same AC.
>>>> > Would it be possible to integrate the two features into one
>>>> > mechanism for mapping multiple PW labels in either active or
>>>> > backup state to one AC or end-user identifier?
>>>> >
>>>> > 8. The term VC as opposed to PW is used in various places in the
>>>> > document.
>>>> > I am not sure what is meant here. Is the intent that a "VC" is one
>>>> > of the paths of the load-balanced "PW" ?
>>>> >
>>>> > The first paragraph of section 4 seems to imply that the authors
>>>> > are willing to settle on either of the modes rather than both.
>>>> > I would support the PW label mode.
>>>> > If some entropy-rich information needs to be placed in the packet,
>>>> > perhaps the flags in the CW could be used (if 16 paths is
>>>> > sufficient).
>>>> >
>>>> > Y(J)S
>>>> >
>>>> > _______________________________________________
>>>> > pwe3 mailing list
>>>> > pwe3 at ietf.org <mailto:pwe3 at ietf.org>
>>>> > https://www1.ietf.org/mailman/listinfo/pwe3