Comments on draft-morin-l3vpn-mvpn-considerations-01.txt
I have carefully reread draft-morin-l3vpn-mvpn-considerations-01.txt, and I
have a number of comments.
In general, I think the authors have done a good job in trying to be
even-handed, but I think there are some errors in the details, and I also
think that some of the risks and overheads involved in the use of BGP have
not been fully considered. I also don't think that the different flavors of
MPLS multicast LSPs have been carefully considered. I would like to see
some of these issues addressed before adopting it as a WG draft.
On the general issue of comparing PIM vs. BGP, while the draft does not make
a recommendation, it is heavy on PIM's disadvantages and BGP's advantages.
So I think it is worth listing some of the risk areas of BGP. I do not
believe that any of these are "showstoppers" for BGP, but they certainly
should be articulated in any document that purports to be providing a
balanced view.
- As BGP is generally used, BGP churn is a function of real or apparent
topology changes. BGP churn is NOT generally caused by enduser events.
This changes when BGP is used to distribute multicast routes. BGP churn
can now be initiated by PIM Joins, and a PIM Join may be the result of
some enduser clicking a button on his PC (e.g., "receive video"). It is
not clear that there is any limit on the rate at which such enduser events
can occur. We do not appear to have adequate data from which we can draw
conclusions as to the impact of enduser actions on BGP churn.
It is easy to offer opinions about this, but facts are scarce.
- In some multicast applications, "join latency" is considered to be very
important. No one has ever claimed BGP will provide lower join latency
than PIM; the only real question is whether latency will be increased
beyond the limits that some applications find acceptable. Again,
different people have different opinions about how much this will matter
in practice, but hard data is hard to come by.
- Churn vs. latency: To control the BGP churn, we have discussed such
features as Join Dampening. However, that can't be expected to have a
salutary effect on the latency!
There really should be some recognition in the draft that the effects on
latency and churn are as yet poorly understood.
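The tension between churn and latency can be illustrated with a toy model.
The sketch below is only an assumption about how a Join Dampening timer
might behave (the drafts do not specify it this way): a PE holds
C-multicast Joins for a fixed interval and then advertises them in a
single BGP update. The function name dampen, the millisecond timestamps,
and the batching rule are all hypothetical.

```python
def dampen(join_times_ms, interval_ms):
    """Toy model of Join Dampening (hypothetical, not from any draft).

    The first Join to arrive while no batch is pending starts a timer of
    interval_ms. Every Join arriving before the timer expires is carried
    in the same BGP update, sent when the timer fires.

    Returns (number of BGP updates sent, added delay per Join in ms).
    """
    updates = 0
    delays = []
    flush_at = None  # time at which the pending batch will be sent
    for t in sorted(join_times_ms):
        if flush_at is None or t > flush_at:
            # No batch pending: this Join opens a new one.
            flush_at = t + interval_ms
            updates += 1
        # The Join is only advertised when its batch is flushed.
        delays.append(flush_at - t)
    return updates, delays


joins = [0, 100, 200, 1500, 1600]  # made-up enduser Join arrival times

for interval in (0, 1000):
    updates, delays = dampen(joins, interval)
    print(f"interval={interval} ms: {updates} updates, "
          f"worst added delay {max(delays)} ms")
```

In this model a 1000 ms interval collapses the five Joins into two
updates, but each Join waits up to a full second; with no dampening there
are five updates and no added delay. Real behavior would depend on
arrival patterns for which, as noted, data is lacking.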
- Any time a new high volume address family is added to BGP, it is important
to consider the impact on the route reflectors, and whether it will be
necessary to add new route reflectors, perhaps ones dedicated to the new
family. One should also consider the impact (especially on the RRs) which
one address family may have on another, e.g., whether an unexpectedly
large amount of churn due to PIM will cause a slowdown in unicast routing
convergence. There should be some discussion of this.
- Transactional nature of PIM
At first glance, it may seem that having PIM distribute multicast routes
to BGP is no different than having OSPF distribute unicast routes to BGP,
i.e., that it is really nothing new. However, this isn't quite accurate.
The issue is that PIM isn't really a "routing protocol" in the sense of
choosing routes based on topology. It is a protocol that uses
router-to-router transactions to construct paths and assign flows to
paths. In some cases, especially when Sparse Mode flows are involved, a
number of transactions involving multiple nodes may need to take place
before the multicast "routing" stabilizes. Each one of these transactions
must pass through the RR, placing more load on the RR and increasing
latency.
In some cases, where PIM-SM has data-driven state changes, we have
attempted to avoid putting the data-driven state changes into BGP, by
having BGP send extra messages, including timer-driven messages. While I
tend to think that this is the right tradeoff, it does modify the dynamics
of path construction, and we don't have any hard data to tell us whether
we've made the right tradeoff, or whether the "right tradeoff" depends on
just how the applications are using multicast.
- Strange uses of PIM
Talk to anyone with a lot of experience in PIM deployment in the
enterprise, and you'll hear lots of strange stories about how PIM is used
in some enterprises. Although PIM is optimized for "many receivers, few
sources", it isn't always used that way. I've also heard anecdotes about
applications that make assumptions about the dynamic nature of PIM and may
break if anything changes. As usual, hard data is hard to find. But
there's always a risk that when you change the infrastructure underneath a
crufty old application you will break the application. There are people I
respect, with much more experience in multicast deployment than I have,
who think it's just foolish to do anything that changes the way in which
the infrastructure behaves for sparse mode. While I am not convinced that
they are right, I think that this does have to be recognized as a risk
element.
Now let's look at some of the topics covered in the draft.
1. From section 3.1, "the operational burden of setting up multicast on a PE
or for a VR/VRF SHOULD be as low as possible".
I certainly agree with this! However, I don't think it combines well
with the later recommendation to support outsourced RPs. Obviously the
use of outsourced RPs increases the operational burden for the provider.
I note that there doesn't seem to be any requirement for making the
operational burden of MVPN on the SP's customer be as low as possible,
though such a requirement is prominent in another SP requirements draft,
viz., draft-mnapierala-mvpn-part-reqt-00.txt. Such a requirement would
seem to militate against the use of the "outsourced RP", at least for
MVPN customers who already use multicast. (Which I imagine would be the
vast majority of them.)
2. Auto-discovery
"BGP-based auto-discovery is the preferred solution for auto-discovery
... while PIM/shared-tree based auto-discovery should be optionally
considered for migration purposes only".
There isn't really a dichotomy here. The facts are that if PIM-based
shared trees are used, BGP-based auto-discovery doesn't really add
anything, but in any other circumstance, BGP-based auto-discovery is
absolutely necessary.
But if the recommendation here is to provide BGP-based auto-discovery
even in the one case where it isn't strictly needed, I certainly support
that recommendation. This is, in fact, already true of the majority of
existing MVPN deployments. (Though I have been told that there is at
least one vendor which does not support it in existing deployments.)
There is a suggestion in the draft that if PIM-based shared trees are in
use, BGP auto-discovery can help detect misconfigurations. I'm not sure
about that, as I can imagine transition scenarios in which more than one
shared tree is in use at some time.
3. Number of protocols
"if any additional protocols are introduced compared with the unicast VPN
service, the balance between their advantage and operational burden
SHOULD be examined thoroughly"
Well, who could disagree with that? The pros/cons of any new protocol
should always be examined thoroughly. In fact, the pros/cons of adding
new functionality to existing protocols should always be examined
thoroughly as well. As should the pros/cons of adding new address
families, new procedures, new dynamics, new events and states, etc., to
existing protocols.
It is better to focus on the overall complexity of the system, and on the
overall behavior of the system, than just to count the number of
protocols.
4. S-PMSI Signaling
The BGP-based S-PMSI signaling mechanism does have a risk factor that I
have mentioned above, but that is not considered in the draft.
In current practice, BGP changes are caused by topology changes, not by
enduser activity. However, it is enduser activity that causes
invocation of the S-PMSI signaling mechanism. We simply do not have the
data at the present time to determine how much this will increase the
amount of churn in BGP. The S-PMSI signaling is also something that is
best done through a low latency mechanism, and we do not at present have
the data to determine whether BGP will significantly increase the latency
when compared with the UDP-based proposal.
In theory, the use of BGP S-PMSI signaling has a number of advantages,
but I don't think I would recommend against the use of the UDP-based
procedure unless and until I had some data about these pragmatic issues.
By the way, when I say "data", I don't mean "strongly held and loudly
voiced opinions", even though the latter are sometimes confused with
data.
"The UDP-based protocol is restricted to use within MVPNs using an
MI-PMSI".
This fact seems to me to be neither here nor there. It is true that the
UDP-based protocol is unlikely to be useful unless PIM is the C-multicast
routing control protocol, in which case you have an MI-PMSI. If you are
using BGP as the C-multicast routing protocol, then you have already
determined that you don't care if BGP churn is caused by enduser events.
The UDP-based S-PMSI signaling is best considered as an optional part of
the PIM C-multicast routing control plane, rather than as a separate
protocol with its own independent set of pros and cons.
"the use of the UDP-based protocol does not preserve AS routing
independence when used in an inter-AS option B context (i.e. the
decision by a PE in an AS to use an S-PMSI for a given customer flow
will impact routing state in other ASes)"
I have no idea what this means, so I hope one of the authors will
explain. (I also don't know what is meant by the claim that the
BGP-based S-PMSI signaling mechanism doesn't have whatever disadvantage
this is.)
5. Off-loading processing onto route reflector
The claim is made that, when using BGP to carry C-multicast routes, "some
of the processing burden associated with client multicast routing [is
offloaded] onto BGP route reflectors"
I would like the authors to specify precisely what this offloaded
processing burden is. Some consideration of its effect on existing L3VPN
route reflector systems would also be worthwhile.
As the text is now, it is very difficult to understand or evaluate the
claim.
6. MI-PMSI issues
"Moreover, mechanisms one and two are restricted to use within MVPNs
using an MI-PMSI, thereby necessitating:
a. The use of a P-multicast tree technique that allows shared trees
(for example PIM-SM in ASM mode or MP2MP LDP).
b. The use of one P-multicast tree per PE per VPN, even for PEs that
do not have sources in their directly attached sites for that
VPN."
First, claim b is not true, as explained in
draft-rosen-l3vpn-mvpn-profiles-00.txt, section 3.3.
Second, the fact that a control protocol requires that it be possible to
instantiate a P-tunnel as a shared tree of some sort does not seem to me
to be a "restriction" or disadvantage. You might as well say that the
use of BGP for C-multicast routing has the disadvantage of being
"restricted" to those networks whose route reflectors support the BGP
C-multicast routing address family.
There are however some other things that do seem to require the presence
of an MI-PMSI: autoRP and BSR. Since many SP customers use autoRP/BSR
to discover RPs, and since it is not clear how to provide support for
these without an MI-PMSI, I am not convinced that one can do without an
MI-PMSI even if BGP is used for distributing C-multicast routes.
7. BGP doesn't eliminate PIM anyway
"using BGP for customer routing distribution within multicast VPNs avoids
the introduction of an additional protocol that would require additional
OAM processes and tools."
It's not as if PIM is eliminated by BGP. As long as PIM is used on the
PE/CE links, the SP is still operating a PIM environment. In fact, there
may be significant scaling problems due to PE-CE PIM. Some people with
years of PIM experience believe that the PE-CE PIM scaling problems are
actually worse than the PE-PE PIM scaling problems, and hence that the
attempt to improve PE-PE scaling by using BGP just attacks the wrong
problem.
In the absence of actual data, the risk is that you end up adding a lot
of NEW stuff and don't even address the scaling bottlenecks.
In any event, I'm not sure I understand how adding a lot of new
functionality to BGP (multicast path creation) is going to be done
without the need for additional OAM processes and tools.
8. Consistency with unicast control mechanisms
"An illustrative example of the benefit brought by consistency with
unicast design is how the "extranet" feature can be implemented :
when BGP-based mechanisms are used, the well defined and well
understood BGP route target import/export semantics are just reused."
This seems like a red herring. Extranet capability is already supported
in the field today by existing MVPN deployments based on PIM. The
primary impact is really on the auto-discovery phase, not the C-multicast
routing phase.
9. Encapsulation techniques
I notice that there is one particular data plane technique that is not
addressed by the document, viz., the use of ingress replication and
transmission through unicast tunnels. It would be interesting to know
whether SPs have interest in this technology; if not, maybe it can be
removed from the drafts. (I think it adds complexity and frankly, I'm
not sure it's fully and properly specified.)
I think section 3.4 can be taken as a requirement for both a "BGP + GRE"
profile and a "PIM + MPLS" profile, since it seems to advocate for
independence between the control plane and the data plane. But I'm
not sure if that is what is intended, and would like clarification.
I would like to see a specific recommendation for the support of MP2MP
LSPs, given their importance in C-bidir support (for either control
plane, see draft-ietf-l3vpn-2547bis-mcast-05.txt, section 12.2) and in
general MPLS multicast support for the PIM control plane (see
draft-rosen-l3vpn-mvpn-profiles-00.txt, section 3.3).
10. Upstream-Assigned Labels, MI-PMSI, and Scalability
The WG drafts frequently discuss aggregation, and the possibility of
aggregation is frequently touted as a major scalability advantage.
What isn't always clear is that all forms of S-PMSI aggregation require
a major new MPLS feature, upstream-assigned labels. Supporting
upstream-assigned labels requires significant changes to the data plane
processing of many different platforms, and is not likely to be
available for some period of time.
So realistically, for the foreseeable future, the use of an MI-PMSI is
going to remain the only method of aggregation. Unless a service
provider does not care if their P-routers have to maintain an amount of
multicast state which is proportional to the sum of the states of all
their customers, the service provider should be requiring support for an
MI-PMSI.
As a further note, it should be pointed out that segmented inter-AS
S-PMSIs are on a per-PE basis, not a per-AS basis. Therefore without the
aggregation that can be obtained from upstream-assigned labels, it is
not clear that segmented inter-AS S-PMSIs are scalable at all. This is
an issue which should be mentioned.
I think the draft should have a clear recommendation that in the absence
of upstream-assigned MPLS labels, MI-PMSIs should be used.
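As a rough illustration of the scaling argument above, the following
sketch compares worst-case P-router state under two assumptions that are
simplifications, not taken from the drafts: without aggregation every
C-flow gets its own S-PMSI P-tunnel, and with an MI-PMSI each VPN uses
one shared P-tree (such as an MP2MP LSP). The VPN names and flow counts
are made up.

```python
def p_router_state(flows_per_vpn):
    """Worst-case multicast state entries on one P-router (toy model).

    flows_per_vpn maps a VPN name to its number of active C-flows.
    Returns (state with one unaggregated S-PMSI per C-flow,
             state with one shared MI-PMSI tree per VPN).
    """
    # No aggregation: state grows with the sum of all customers' flows.
    unaggregated = sum(flows_per_vpn.values())
    # MI-PMSI: state grows only with the number of VPNs.
    mi_pmsi = len(flows_per_vpn)
    return unaggregated, mi_pmsi


example = {"vpn-a": 200, "vpn-b": 50, "vpn-c": 1000}  # hypothetical counts
unaggregated, mi_pmsi = p_router_state(example)
print(f"per-flow S-PMSIs: {unaggregated} entries; MI-PMSI: {mi_pmsi} entries")
```

The first figure tracks customer behavior, while the second tracks only
the number of VPNs provisioned, which is the point of requiring MI-PMSI
support until upstream-assigned labels are available.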
11. Outsourced RPs
I would like to see the draft offer much stronger support for its
recommendation to implement outsourced RPs, or else to eliminate that
recommendation.
Outsourced RPs (i.e., PE as RP) have a number of disadvantages which
need to be given more emphasis:
- More work for the PE (do many SPs really want to give a lot of new
work to their PEs?)
- Not likely to be of much interest to customers with an existing
multicast infrastructure
- Require increased coordination between SP and enterprise customer,
such as agreement on RP (or RPL) addresses.
- Not completely clear how this works in scenarios where an enterprise
gets VPN service from two different providers, particularly if there
is a site multihomed to two different providers.
I presume that the extra operational work for SP and customer is not
going to be justified by the fact that it simplifies things for some of
the SP's vendors.
12. The draft's conclusion
"Consequently, at the present time and until there is experience with
all of the proposed mechanisms it is not clear which of the above
mechanisms should be recommended as the preferred solution to
implementers. However, it would appear prudent for implementations
to consider supporting both the fourth (BGP-based) and first (full
per-MPVN PIM peering) mechanisms. Further experience on both
implementation is likely to be required before some best practice can
be defined."
I certainly agree with this conclusion. Perhaps it should be given more
prominence.
13. My conclusion
While I do think this draft may be a good start on the project of
comparing the different MVPN options, I don't think it is ready to be
adopted as a WG document, and it needs a lot more work to deal with the
issues I have raised in this note.
I also hope it is clearly understood that this document should be
evaluated by the same criteria as any other internet-draft, i.e., by its
technical quality.