Comments on draft-morin-l3vpn-mvpn-considerations-02



I have thoroughly read draft-morin-l3vpn-mvpn-considerations-02, and I
regret that I continue to feel that it is nowhere near ready to be accepted
as a WG document.  There are too many places where the reasoning is
superficial or incomplete, or where the reasoning given just does not
support the putative conclusions.  I will go into this in some detail below.

If the WG really wants to have a document which provides a sound technical
comparison of various options, I would suggest that a different sort of
"design team" is needed.

Anyway, specific comments below.  I will extract text from the draft, and
begin my comments with ****.



1.  Introduction

   The current proposal for multicast in BGP/MPLS
   [I-D.ietf-l3vpn-2547bis-mcast] includes multiple alternative
   mechanisms for some of the required building blocks of the solution.
   However, it does not identify the core set of mechanisms which must
   be implemented in order to ensure interoperability.  This may lead to
   a situation where implementations may support different subsets of
   the available optional mechanisms leading to implementations that do
   not interoperate.

**** Since one size does not fit all, not all service providers will want
**** the same sets of options all the time.  To ensure interoperability, one
**** needs to define a set of profiles, each of which contains a specific
**** set of options that work with each other.  That ensures
**** interoperability within a profile.  Then the question of whether any profile
**** should be mandatory can be addressed.  Just providing a single
**** mandatory set of procedures does nothing to ensure interoperability
**** when non-mandatory options are used.  

3.  Examining alternatives mechanisms for MVPN functions

3.1.  MVPN auto-discovery

   Section 5.2.10 of [RFC4834] states "The operation of a multicast VPN
   solution SHALL be as light as possible and providing automatic
   configuration and discovery SHOULD be a priority when designing a
   multicast VPN solution.  Particularly the operational burden of
   setting up multicast on a PE or for a VR/VRF SHOULD be as low as
   possible".

   The current solution document [I-D.ietf-l3vpn-2547bis-mcast]
   addresses this requirement by proposing two different mechanisms for
   MVPN auto-discovery:

   1.  BGP-based auto-discovery (described in section 4).

   2.  Discovery using PIM running on a MI-PMSI implemented with a
       shared tree using multicast ASM, or MP2MP LDP with the same
       common tree identifier configured in all VRFs of an MVPN.

   It is the recommendation of the authors that BGP-based auto-discovery
   is the preferred solution for auto-discovery and should be supported
   by all implementations while PIM/shared-tree based auto-discovery
   should be optionally considered for migration purpose only.

**** I would agree that the spec should say that BGP-based discovery SHOULD
**** be used.  However, the arguments below seem to be mostly specious to
**** me.

   Part of the rationale for this recommendation is also based on
   section 5.2.10 of [RFC4834] which states "as far as possible, the
   design of a solution SHOULD carefully consider the number of
   protocols within the core network: if any additional protocols are
   introduced compared with the unicast VPN service, the balance between
   their advantage and operational burden SHOULD be examined
   thoroughly".

   BGP is the auto-discovery protocol used in unicast (RFC4364) VPNs and
   therefore the use of BGP-based auto-discovery within multicast VPNs
   avoids the introduction of an additional auto-discovery protocol that
   would require additional OAM processes and tools.  Service providers
   with deployed unicast (RFC4364) VPNs already have extensive
   deployment and operations experience of using BGP as an auto-
   discovery protocol including OAM processes and tools.  Such processes
   and tools will require modifications in order to support multicast
   auto-discovery but those modifications are anticipated to be less
   than those required to develop new processes and tools for a specific
   auto-discovery protocol. 

**** This seems to be directed against a strawman.  In the absence of
**** BGP-based auto-discovery, the only auto-discovery protocol that needs
**** to be supported is PIM on the PEs, and PIM has to be supported on the
**** PEs anyway in order to interact with the CEs.  Thus using PIM for
**** auto-discovery does not require any additional protocols.

**** Furthermore, the idea that by using BGP one avoids introducing any new
**** protocols is just a bit strange.  Please note that it has taken upwards
**** of 50 pages (and three years) to specify the message formats and
**** procedures of this "no new protocol".  The fact is that considerable
**** new protocol has been added to BGP: new address families, new sets of
**** interactions, new procedures.

   Additionally, BGP supports MD5
   authentication of its peers for additional security.  In contrast,
   there are no obvious authentication mechanisms to secure PIM
   communications in any known implementation.

**** On the contrary, in a very widespread implementation, it is possible
**** to protect PIM control packets with IPsec, either with manual keying
**** or with the GDOI dynamic group key management protocol from the MSEC
**** WG.

   Furthermore, PIM based discovery is only applicable to deployments
   using a shared tree on an MI-PMSI, whereas BGP-based auto-discovery
   does not place any restrictions on the type of multicast trees that
   can be used.  BGP-based auto-discovery is independent of the type of
   P-multicast tree used thus satisfying the requirement in section
   5.2.4.1 of [RFC4834] that "a multicast VPN solution SHOULD be
   designed so that control and forwarding planes are not
   interdependent". 

**** I think this paragraph is confused.  MI-PMSI is not a forwarding plane,
**** it is a service.  The different forwarding planes are the different
**** kinds of multipoint LSPs, the PIM/GRE multicast tunnels, the unicast
**** tunnels used by ingress replication, etc.  Any of these could be used
**** to instantiate the MI-PMSI service.  I believe that only the unicast
**** tunnels really require BGP-based auto-discovery.  So I find this
**** argument weak.

...

   Last, the use of the BGP-based autodiscovery is expected to be less
   prone to spoofing attacks (being based on a connection established
   with a three-way handshake), to which the PIM Hello over MI-PMSI
   procedures may be subject to (being datagram-based).

**** I don't believe that TCP's three-way handshake has ever been claimed to
**** be a security feature ;-)  

   ( the authors note that, in order to support the coexistence of both
   protocols (for example during migration scenarios), implementations
   could support both alternatives by providing a per-VRF configuration
   knob that would allow recognizing new PIM neighbors based on the
   reception of PIM hellos on a shared P-multicast tree, even for
   neighbors that did not advertise a BGP auto-discovery route )

**** Are you saying that PEs should refuse to accept PIM hellos in advance
**** of receiving the BGP auto-discovery routes?  I don't think that
**** behavior is currently required by any specification.  What would be the
**** advantage of imposing such a rule?

3.2.  S-PMSI Signaling

**** Editorial comment: this text should probably be in a subsection, "3.2.1
**** BGP vs. UDP" or something like that.

   The current solution document [I-D.ietf-l3vpn-2547bis-mcast] proposes
   two mechanisms for S-PMSI Signaling:

   1.  A new UDP-based TLV protocol specifically for S-PMSI signaling
       (described in section 7.2.1).

   2.  A BGP-based mechanism for S-PMSI signaling (described in section
       7.2.2).


**** Please note that the adjective "new" has been applied to the wrong
**** solution ;-) A typo, no doubt.

   It is the recommendation of the authors that BGP is the preferred
   solution for S-PMSI signaling and should be supported by all
   implementations while the UDP-based S-PMSI signaling protocol should
   be considered optional.

   Part of the rationale for this recommendation is similar to that for
   BGP-based auto-discovery and is based on section 5.2.10 of [RFC4834]
   and the desire to avoid introducing and deploying additional
   protocols unless strictly necessary.

**** See my comment above about referring to the major additions to BGP as
**** "no new protocol".

   Furthermore:

   o  The BGP-based S-PMSI signaling mechanism can be efficiently used
      in an inter-AS option B deployment context while the use of the
      UDP-based protocol does not preserve AS routing independence when
      used in an inter-AS option B context (i.e. the decision by a PE in
      an AS to use an S-PMSI for a given customer flow will impact
      routing state in other ASes).  Co-existence with unicast inter-AS
      VPN options is strongly encouraged by section 5.2.6 of [RFC4834].

**** I don't understand what is being said here.  

   Therefore, it is the opinion of the authors that BGP is the preferred
   solution for performing S-PMSI signaling.

**** Generally, if one lists the advantages of scheme A and the
**** disadvantages of scheme B, one finds that A looks to be better than B.
**** If one takes the trouble to also list some of the advantages of B and
**** disadvantages of A, one may draw a different conclusion.

**** For instance, note that the BGP-based S-PMSI signaling requires all PEs
**** of a particular MVPN to maintain state for each C-flow that has been
**** assigned to an S-PMSI.  The PEs must maintain this state even if they
**** in fact have no receivers for that C-flow.  If one uses a
**** datagram-based mechanism, a PE does not have to retain any state for a
**** flow unless it has receivers for that flow.
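
**** To make the state comparison concrete, here is a minimal Python sketch.
**** It is only an illustration under stated assumptions: the function name
**** and parameters are hypothetical, each C-flow is assumed to have one
**** source PE that is not counted among its receiver PEs, and one state
**** entry is counted per (PE, C-flow) binding.

```python
def spmsi_state_entries(num_pes, flows_on_spmsi, receiver_pes_per_flow, scheme):
    """Total S-PMSI binding entries held across all PEs of one MVPN.

    Hypothetical toy model, not taken from any specification:
      "bgp"      - every PE of the MVPN stores the binding for every flow
      "datagram" - only the source PE and the PEs with receivers store it
    """
    if scheme == "bgp":
        return num_pes * flows_on_spmsi
    if scheme == "datagram":
        # +1 accounts for the source PE, which originates the binding
        return (receiver_pes_per_flow + 1) * flows_on_spmsi
    raise ValueError(f"unknown scheme: {scheme}")
```

**** With 100 PEs, 50 C-flows on S-PMSIs, and 5 receiver PEs per flow, this
**** toy model gives 5000 entries for the BGP-based scheme and 300 for the
**** datagram-based scheme.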

**** It is true, as the authors point out, that one can't use the
**** datagram-based procedure unless one has set up a tunnel to carry the
**** datagrams.  If one doesn't want to set up tunnels that carry only
**** control messages, then an "out of band" signaling mechanism such as
**** that offered by BGP (or, perhaps, unicast PIM) is needed.  On the other
**** hand, if one does have the tunnel available to carry the datagrams, why
**** should one be forced to maintain all the extra state imposed by the
**** BGP-based procedure?

**** Further, in the absence of any well understood criteria for assigning
**** and/or removing particular C-flows to/from particular P-tunnels, we
**** don't know what the rate of change is likely to be or what impact this
**** is likely to have on the BGP route reflectors.

**** I do not believe that the authors have given full consideration to all
**** the pros and cons of each feature, and I therefore believe that their
**** conclusions have not been properly supported.

...

**** Editorial comment: I think the following should be a new section "3.2.2
**** Switching to S-PMSI" 

   Section 7.2.2.3 of [I-D.ietf-l3vpn-2547bis-mcast] proposes two
   approaches for how a source PE can decide when to start transmitting
   customer multicast traffic on a S-PMSI:

   1.  The source PE sends multicast packets for the <C-S, C-G> on both
       the I-PMSI P-multicast tree and the S-PMSI P-multicast tree
       simultaneously for a pre-configured period of time, letting the
       receiver PEs select the new tree for reception, before switching
       to only the S-PMSI.

   2.  The source PE waits for a pre-configured period of time after
       advertising the <C-S, C-G> entry bound to the S-PMSI before fully
       switching the traffic onto the S-PMSI-bound P-multicast tree.

....

   For these reasons, it is the authors' recommendation to mandate the
   implementation of the second alternative for switching to S-PMSI.

**** I believe this recommendation has been incorporated into the next
**** revision of the architecture spec.

3.3.  PE-PE Transmission of C-Multicast Routing

   The current solution document [I-D.ietf-l3vpn-2547bis-mcast] proposes
   multiple mechanisms for PE-PE transmission of customer multicast
   routing information:

   1.  Full per-MVPN PIM peering across an MI-PMSI (described in section
       5.2.1).

   2.  Lightweight PIM peering across an MI-PMSI (described in section
       5.2.2)

   3.  The unicasting of PIM C-Join/Prune messages (described in section
       5.2.3)

   4.  The use of BGP for carrying C-Multicast routing (described in
       section 5.3).

3.3.1.  PE-PE signalling scalability

   Scalability being one of the core requirements for multicast VPN, it
   is useful to compare the proposed C-multicast routing mechanisms from
   this perspective : Section 4.2.4 of [RFC4834] recommends that "a
   multicast VPN solution SHOULD support several hundreds of PEs per
   multicast VPN, and MAY usefully scale up to thousands" and section
   4.2.5 states that "a solution SHOULD scale up to thousands of PEs
   having multicast service enabled".

   At such scales of multicast deployment, the first and third
   mechanisms require the PEs to maintain a large number of PIM
   adjacencies with other PEs of the same multicast VPN (which implies
   the regular exchange PIM Hellos with each other) and to refresh
   C-Join/Prune states, thus limiting the scalability of these
   approaches.

**** This analysis is questionable, and appears to be based on unstated
**** assumptions.

**** If one assumes that the Join/Prune states don't change much, then of
**** course the refresh overhead is useless overhead.  On the other hand, if
**** one assumes that the Join/Prune states change frequently, perhaps more
**** frequently than the refresh rate, the overhead due to the
**** refreshes of things that don't change is just in the noise.
**** Furthermore, BGP has never shown itself to be scalable in the face of
**** such rapid changes.  Without an analysis of rate of change, the
**** conclusions above are unsupported.
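
**** As a rough illustration of why the rate of change matters, the
**** following Python sketch computes what share of the Join/Prune load is
**** due to periodic refreshes.  The function name, the per-state counting
**** (real PIM messages can bundle several states), and the example numbers
**** are assumptions made for illustration; they are not measurements.

```python
def refresh_share(num_states, refresh_period_s, changes_per_s):
    """Fraction of the per-second Join/Prune load caused by refreshes.

    Hypothetical model: every state is refreshed once per period, and
    every state change costs one additional message.
    """
    refresh_rate = num_states / refresh_period_s  # refreshes per second
    total = refresh_rate + changes_per_s
    return refresh_rate / total if total else 0.0
```

**** For 6000 states refreshed every 60 seconds (100 refreshes per second),
**** refreshes are about 99% of the load at 1 change per second, but only
**** about 1% at 10000 changes per second.  Which regime applies is exactly
**** what the draft does not analyze.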

**** There is also no analysis to show that PE-PE PIM overhead is going to
**** be the bottleneck, rather than CE-PE PIM, which the authors appear
**** willing to tolerate.  

**** Further, the assumption that we can eliminate much of the state that
**** constitutes the "PIM adjacencies" is questionable.  The fact is that
**** whether one is doing PE-PE PIM or not, there is considerable state that
**** must be devoted to keeping track of who the upstream PE is for each
**** C-multicast flow.  One has to keep track of unicast routing changes
**** that impact this, one has to make sure that packets one receives are
**** from the right upstream PE, etc.  This is a good piece of what is
**** involved in maintaining the PIM adjacencies.

   The third mechanism would reduce the amount of C-Join/Prune
   processing for a given multicast flow for PEs that are not the
   upstream neighbor for this flow, but would require "explicit
   tracking" state to be maintained by the upstream PE, and would
   require refresh-reduction mechanisms to be used to mitigate the fact
   that PIM "Join suppression" cannot be used (what such a refresh-
   reduction mechanism would be has not been described yet).  For these
   reasons, it seems that this approach is not suitable for higher scale
   scenarios.

**** I'm not sure I understand the relationship between PIM Join Suppression
**** and refresh reduction.

**** By the way, the BGP-based signaling has an entire set of messages and
**** procedures whose purpose is to support explicit tracking: Leaf A-D
**** routes.  These messages and procedures exist for two reasons: (a) to
**** support aggregation schemes that try to aggregate trees that are more
**** or less congruent, and (b) to support the use of RSVP-TE P2MP LSPs to
**** instantiate S-PMSIs.  If the authors believe that explicit tracking is
**** inherently unscalable, they should go on to point out that neither the
**** use of RSVP-TE P2MP LSPs nor a strategy of "aggregation by congruence"
**** is a scalable scheme.

   The second mechanism would operate in a similar manner to full per-
   MVPN PIM peering except that PIM hellos are not transmitted and PIM
   C-Join/Prune refresh-reduction would be used, thereby improving
   scalability, but this approach has been further developed and it is
   unclear if it is applicable.

**** I can't argue with this, but the reason it's been hard to get further
**** development of the lightweight PIM peering is that few of the
**** principals in PIM WG agree that there is a practical scaling problem
**** with it.


   The first and second mechanisms can leverage the "Join suppression"
   behavior and thus improve the processing burden of an upstream PE,
   sparing the processing of one Join message for each remote PE joined
   to a multicast stream, but this improvement comes at the price of
   requiring all PEs of a multicast VPN to process all PIM Joins sent by
   any PE participating in the same multicast VPN whether they are the
   upstream PE or not.

**** What one gains for this "price" though is that a PE won't send a Join
**** at all if it sees that someone else has already sent it.  The price is
**** thus well worth it if there are lots of nodes with receivers for the
**** given group.  The price is only high when there aren't lots of nodes
**** with receivers.  So the analysis seems to assume that the receivers for
**** each group are sparsely distributed among the PEs of a particular VPN.
**** The authors have not provided any grounds to support that assumption.
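
**** The trade-off can be sketched in a few lines of Python.  This is a toy
**** model under loudly stated assumptions: Join suppression is ideal (one
**** Join per flow reaches the MI-PMSI), RT constraint is ideal (only the
**** upstream PE processes a BGP C-multicast route), route reflectors are
**** ignored, and all receiver PEs share one upstream PE.  The function name
**** and dictionary keys are hypothetical.

```python
def join_load(num_pes, receiver_pes):
    """Join messages for one C-flow whose receivers share one upstream PE.

    Returns (pim, bgp): dicts giving the number of Joins sent, the number
    processed by the upstream PE, and the number processed by all PEs.
    """
    if not 0 <= receiver_pes < num_pes:
        raise ValueError("receiver_pes must be in [0, num_pes)")
    sent_pim = 1 if receiver_pes else 0
    pim = {
        "sent": sent_pim,
        "at_upstream": sent_pim,
        # every other PE on the MI-PMSI sees the one Join that is sent
        "total": sent_pim * (num_pes - 1),
    }
    bgp = {
        "sent": receiver_pes,         # no suppression: one route per receiver PE
        "at_upstream": receiver_pes,
        "total": receiver_pes,        # only the upstream PE processes them
    }
    return pim, bgp
```

**** With 100 PEs and 90 receiver PEs, the PIM total is 99 versus 90 for
**** BGP, while the upstream PE handles 1 Join instead of 90.  With only 2
**** receiver PEs, the PIM total is still 99 versus 2 for BGP.  So the
**** outcome turns on how densely the receivers are distributed.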

   The fourth mechanism (the use of BGP for carrying C-Multicast
   routing) would have a comparable drawback of requiring all PEs to
   process a BGP C-multicast route only interesting a specific upstream
   PE.  For this reason the C-multicast routing approach leverages the
   Route-Target constraint mechanisms, which specifically allows only
   the interested upstream PE to receive a BGP C-multicast route.  When
   RT constraints are used the fourth mechanism reduces the processing
   load put on the provider infrastructure for customer multicast
   routing to the minimum 

**** I think this is a really nifty feature of the BGP method, but let's not
**** get carried away.  One cannot say that this method keeps the processing
**** load at a "minimum", because this method causes all nodes with
**** receivers to send Joins, whereas a true Join suppression scheme would
**** allow those joins to be, err, suppressed.  Further, the route
**** reflectors are processing all these joins, even if all the PEs are not.
**** The authors just have not provided any grounds for calling this
**** "reducing the processing load on the infrastructure to the minimum".

**** One also needs to look at related processing loads which are not
**** mentioned here.  If one needs to do explicit tracking, say, because one
**** is using RSVP-TE P2MP LSPs as S-PMSIs, then not only does each PE with
**** receivers have to send a Join, each one has to send a Leaf A-D route as
**** well.

  (by avoiding any processing by "unrelated"
   PEs, that are nor the joining PE nor the upstream PE), and inherits
   BGP features that are expected to improve scalability (through, for
   instance, providing a means to offload some of the processing burden
   associated with client multicast routing onto BGP route-reflectors),
   and being based on TCP has no refresh-related scalability limit.
   (Please refer to Handling the PIM routing processing load load, for a
   detailed explanation of the differences in ways of handling the
   C-multicast routing load, between the PIM-based approaches and the
   BGP-based approach)

**** The savings gained by using route reflectors are not nearly as large in
**** multicast as in unicast.

**** In unicast, if there is no RR, then for a given VPN, each PE would need
**** to unicast a copy of the VPN routing info to every other PE that is
**** attached to the VPN.  So by using an RR, you end up transmitting much
**** less info.  But in multicast, you don't unicast the same info to each
**** PE, you multicast it, so you're only sending it once.  Using the RR
**** provides no off-loading at all of the processing load to do the
**** transmissions.
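
**** A small Python sketch of the transmission counts, for illustration
**** only.  The function name, the mode labels, and the default of two
**** route reflectors are hypothetical; the model counts only the copies
**** the originating PE itself has to send for one update.

```python
def copies_sent_by_origin(num_pes, mode, num_rrs=2):
    """Copies of one routing update that the originating PE must transmit.

    Hypothetical modes:
      "unicast_full_mesh" - no RR; one copy to every other PE of the VPN
      "unicast_via_rr"    - one copy to each route reflector
      "multicast_tunnel"  - one copy sent on a multicast tunnel
    """
    if mode == "unicast_full_mesh":
        return num_pes - 1
    if mode == "unicast_via_rr":
        return num_rrs
    if mode == "multicast_tunnel":
        return 1
    raise ValueError(f"unknown mode: {mode}")
```

**** For 100 PEs this gives 99, 2, and 1 copies respectively: the RR is a
**** large win over a unicast full mesh, but offers nothing over sending
**** the information once on a multicast tunnel.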

**** In unicast, the RR runs the decision process, and only sends the
**** installed routes to the PEs.  This prevents the PEs from getting
**** information about the uninstalled routes, thus reducing the
**** amount of info each PE has to get.  It also off-loads the processing
**** needed to do the decision process.  Whether this is really a good thing
**** or not is not so clear.  It is common to adopt various tricks to force
**** certain routes through the RR (e.g., using different RDs for different
**** VRFs of the same VPN), and the BGP ADD_PATH idea never seems to die
**** either.
**** But in any case, the effect on multicast is nowhere near as strong.  In
**** the BGP-based C-multicast scheme, every PE has to choose its own
**** upstream PE for each C-multicast flow, track changes in the unicast
**** routing that might affect its choice, participate in the signaling to
**** construct the multicast trees, etc.  The only processing load that the
**** RR really saves is the processing needed to do Join Suppression.  And
**** if you have a need to do explicit tracking, it doesn't even do that.

   However, it is to be noted that offloading customer multicast routing
   processing onto BGP route-reflectors will increase the processing
   load placed on the route-reflector infrastructure, which, in the
   higher scale scenarios, is expected to call for adaptations such as:

   o  a separation of resources for unicast and multicast VPN routing :
      using mvpn-dedicated BGP sessions and/or mvpn-dedicated BGP
      instances on route-reflectors, and/or mvpn-dedicated route-
      reflectors ;

   o  the deployment of additional route-reflectors resources :
      increased processing resources on existing route reflectors or
      additional route-reflectors.

**** So at the end of the section, we see a few understated sentences
**** saying, "by the way, the BGP scheme may not work at scale with the
**** existing route reflector infrastructure or even with existing route
**** reflector implementations".  To me, that seems like a rather serious
**** disadvantage, which should not be dismissed or minimized with wishful
**** thinking about future "adaptations".

**** I do agree that C-multicast routing might call for major changes in the
**** RR implementations.  I think one of the requirements we should set is
**** that mandatory features should not require such changes.

**** The fact that the authors make a statement about the need to change all
**** the route reflectors makes me worry that the BGP solution may not be
**** anywhere near ready for PS status.  Perhaps it should be advanced as
**** experimental until we have a better understanding of its properties.

3.3.2.  P-routers scalability

   Mechanisms (1) and (2) are restricted to use within multicast VPNs
   that use an MI-PMSI, thereby necessitating:

      the use of a P-multicast tree technique that allows shared trees
      (for example PIM-SM in ASM mode or MP2MP LDP)

   or   the use of one P-multicast tree per PE per VPN, even for PEs
      that do not have sources in their directly attached sites for that
      VPN.

   By comparison, the fourth mechanism doesn't impose either of these
   restrictions, and when P2MP trees are used only necessitates the use
   of one tree per VPN per PE attached to a site with a multicast source
   or RP (or with a candidate BSR, if BSR is used), thereby improving
   the amount of state maintained by P-routers compared to the amount
   required to build an MI-PMSI with P2MP trees.

**** I have to say I cannot follow the reasoning here. 

**** Perhaps the point the authors are trying to make is the following.  If
**** one uses multicast distribution trees to transmit control packets,
**** there is a chance that one will need to set up multicast distribution
**** trees that only carry control packets (and that are not needed for
**** carrying data).  Each such tree requires state in the P routers.  One
**** uses less state in the P routers if one only sets up the multicast
**** trees that are needed for carrying data packets, and doesn't set up any
**** just for carrying control packets.
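
**** If that is the intended argument, it amounts to the following count,
**** sketched here in Python.  The function name and mode labels are
**** hypothetical, and the model assumes one P-multicast tree per root PE
**** with no aggregation across VPNs.

```python
def p_trees_per_vpn(num_pes, source_pes, mode):
    """P-multicast trees (hence P-router state) needed for one MVPN.

    Hypothetical modes:
      "mi_pmsi_p2mp"   - MI-PMSI built from P2MP trees: one tree per PE
      "mi_pmsi_shared" - MI-PMSI built on a single shared tree
      "data_only_p2mp" - P2MP trees rooted only at PEs that have sources
    """
    if mode == "mi_pmsi_p2mp":
        return num_pes
    if mode == "mi_pmsi_shared":
        return 1
    if mode == "data_only_p2mp":
        return source_pes
    raise ValueError(f"unknown mode: {mode}")
```

**** With 100 PEs of which 10 have sources, the counts are 100, 1, and 10.
**** The gap between the first and third cases depends entirely on how
**** many PEs have sources.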

**** This reasoning depends heavily upon unstated assumptions about how the
**** multicast sources are distributed around the customer sites.  One might
**** also ask whether this is anywhere close to being a bottleneck.
**** Nevertheless, it is shown in draft-rosen-mvpn-profiles how one can use
**** MP2MP LSPs to avoid setting up tunnels that are not needed for data
**** packets, while still using PIM.

**** If MP2MP LSPs are made a mandatory feature, then the issue goes away
**** entirely.  Since MP2MP LSPs are the only reasonable way to support
**** BIDIR-PIM C-flows, they are needed anyway.

**** (Note that the other ways of supporting BIDIR-PIM flows require either
**** upstream-assigned MPLS labels or the "PE as RP" scheme, neither of
**** which is regarded as a mandatory feature.)


3.3.3.  Impact of C-multicast routing on Inter-AS deployments

   Furthermore, co-existence with unicast inter-AS VPN options, and an
   equal level of security for multicast and unicast including in an
   inter-AS context, are specifically mentioned in sections 5.2.6, 5.2.8
   and 5.2.12 of [RFC4834].

   The first three mechanisms impose direct PE to PE communications :
   this does not apply well to an inter-AS option B context, because of
   security and robustness issues that are involved by such a level of
   reachability and interaction between PEs in different ASes.

**** This is not a technical argument, just an unsupported claim.  It would
**** be interesting to see this point developed in proper detail though.

   Their use in an inter-AS context is possible, but not without
   limitations or additional engineering design trade-offs depending
   upon the interconnect types.

**** See previous comment.

   By comparison, the fourth option (the use of BGP for carrying
   C-Multicast routing) does not have any of the above limitations
   related to inter-AS deployments, and also provides an additional
   alternative to facilitate such deployments through the possibility of
   using segmented inter-AS trees.

**** The claim here is just wrong.  The use of segmented inter-AS trees does
**** not presuppose the use of BGP C-multicast routing.

3.3.4.  Security and robustness

   BGP supports MD5 authentication of its peers for additional security,
   thereby possibly benefit directly to multicast VPN customer multicast
   routing, whether for intra-AS or inter-AS communications.  By
   contrast, with a PIM-based approach, no mechanism providing a
   comparable level of security to authenticate communications between
   remote PEs has been yet fully described yet
   [I-D.ietf-pim-sm-linklocal][], and in any case would require
   significant additional operations for the provider to be usable in a
   multicast VPN context.

**** See prior comment on security.  Also, how common is it for SPs to use
**** MD5 on the BGP connections between PEs and RRs?

   The robustness of the infrastructure, especially the existing
   infrastructure providing unicast VPN connectivity, is key.  The
   C-multicast routing function, especially under load, will compete
   with the unicast routing infrastructure.  With the PIM-based
   approaches, unicast and multicast VPN routing are expected to only
   compete in the PE, for routing plane processing resources.  In the
   case of the BGP-based approach, they will compete on the PE for
   processing resources, and in the route-reflector if they are used.
   It is identified that in both cases, mechanisms will be required to
   arbitrate resources (e.g. processing priorities).  In the case of
   PIM-based procedures, between the different control plane routing
   instances in the PE.  And in the case of the BGP-based approach, this
   is likely to require using distinct BGP sessions for multicast and
   unicast, possibly toward distinct route-reflectors.

   Multicast routing is dynamic by nature, and multicast VPN routing has
   to follow the VPN customers multicast routing events.  The different
   approaches can be compared on how they are expected to behave in
   scenarios where multicast routing in the VPNs is subject to an
   intense activity.  Such a load would be comparable to the higher
   scale scenarios described in xx (Section 3.3.1) and the fourth (BGP-
   based) approach - when deployed to handle a significant multicast VPN
   routing load - is expected to be the most efficient approach in a
   such case.  

**** I missed the argument which leads to the conclusion that "the fourth
**** approach ... is expected to be the most efficient approach"

**** Of course, if "expected" really just means "hoped", no additional
**** reasoning is needed ;-)

   On the other hand, while the BGP-based approach is likely
   to suffer a slowdown under a load raising beyond processing resources
   (because of possibly congested TCP sockets), the PIM-based approaches
   would react to such a load by dropping messages, with later failure
   recovery through message refreshes, this being at the expense of some
   predictability.

   In fact both situations are problematic, and what seems important is
   the ability for the VPN backbone operator to (a) limit the amount of
   multicast routing activity that can be triggered by a multicast VPN
   customer, and to (b) provide the best possible independence between
   distinct VPNs.  It seems that both of these can be addressed through
   local implementation improvements, and that both the BGP-based and
   PIM-based approach could be engineered to provide (a) and (b).  It
   can be noted though that the BGP approach proposes ways to dampen
   C-multicast route withdrawals and/or advertisements, and thus already
   describes a way to provide (a), while nothing hasn't been yet
   described for the PIM-based approach, though these type of approaches
   rely on a per VPN dataplane to carry the mvpn control plane, and thus
   might naturally benefit from this first level of separation to solve
   (b).

**** It seems to me that the reasoning here just doesn't provide a basis for
**** the conclusions.  


3.3.5.  C-multicast VPN join latency

   Section 5.1.3 of [RFC4834] states that "the group join delay [...] is
   also considered one important QoS parameter.  It is thus RECOMMENDED
   that a multicast VPN solution be designed appropriately in this
   regard.".  In a multicast VPN context, the "group join delay" of
   interest is the time between a CE sending a PIM Join to its PE and
   the first packet of the corresponding multicast stream being received
   by the CE.

   The different approaches proposed seem to have different
   characteristics in how they are expected to impact join latency:

   o  the PIM-based approaches minimize the number of control plane
      processing hops between the PE of a new receiver and the PE of the
      multicast source, and being datagram based introduce minimal
      delay, thereby possibly having a best-case join latency as good as
      possible depending on implementation efficiency

   o  the BGP-based approach uses TCP exchanges, that may introduce an
      additional delay depending on BGP implementation performances, but
      are expected to control the worst-case join latency under load

   o  the BGP-based approaches is designed to allow the introduction of
      route-reflectors which will introduce an additional processing
      delay between the receiver-PE and the source-PE

   o  in higher scale scenarios, the BGP-based approach is expected to
      provide some control of the worst-case join latency whereas the
      PIM-based approaches may behave less efficiently if PIM messages
      are lost

**** There's that "is expected to" again.  What is the reasoning that would
**** support this conclusion?  As far as I can tell, the BGP-based approach
**** provides no more control over the worst case than any other possible
**** approach. 

   o  in higher scale scenarios, the introduction of route-reflectors in
      the BGP architecture are expected to provide processing efficiency
      which is expected to improve latency compared to the PIM-based
      approaches

**** I don't see why.  Please provide the reasoning that supports this
**** conclusion. 

   This qualitative comparison of approaches tend to highlight that the
   BGP based approach is designed for controlling the "worst-case" join
   latency 

**** This claim has not been supported by any reasoning.  

   whereas for the PIM-based approaches seem to structurally be
   able to reach the shorter "best-case" group join latency (especially
   compared to deployment of the BGP-based approach where route-
   reflectors are used). 

**** In other words, PIM provides lower latency in the typical case, where
**** the environment is neither lossy nor congested.

   Doing a quantitative comparison is not
   possible without referring to specific implementations and
   benchmarking procedures, and would possibly expose different
   conclusions, especially for best-case group join latency for which
   performance is expected vary with implementations.  We can also note
   that improving a BGP implementation for reduced latency of route
   processing would not only benefit multicast VPN group join latency,
   but the whole BGP-based routing.

**** Improving BGP is always useful, but it's hard to see what that has to
**** do with any of the issues here.

   Last, it is to be noted that the C-multicast routing procedures will
   only impact the group join latency of a said multicast stream for the
   first receiver that is located across the provider backbone from the
   multicast source.

**** This is not the case.  For one thing, different receiving PEs may
**** select different upstream PEs for the same C-source.  PE1 may have
**** joined C-(S,G) via PE2, but when PE3 tries to join via PE4, join
**** latency is still an issue.
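
**** To make the counterexample concrete, here is a rough sketch.  It is a
**** toy model of my own, not anything from the specs: it assumes the flow
**** is carried on an I-PMSI, and the function name and PE labels are
**** purely illustrative.  It lists which joins find no C-(S,G) state at
**** their chosen upstream PE and so still wait on PE-PE C-multicast
**** signaling.

```python
# Hypothetical toy model (assumes an I-PMSI carries the flow); not from any spec.

def joins_needing_signaling(joins):
    """joins: ordered (receiver_pe, upstream_pe) pairs for one C-(S,G).

    Returns the joins that find no C-(S,G) state at their upstream PE,
    i.e. the joins whose latency depends on C-multicast routing.
    """
    upstreams_with_state = set()
    waiting = []
    for receiver_pe, upstream_pe in joins:
        if upstream_pe not in upstreams_with_state:
            waiting.append((receiver_pe, upstream_pe))
            upstreams_with_state.add(upstream_pe)
    return waiting


# PE1 joins via PE2, then PE3 joins via PE4: both joins are returned,
# even though PE3 is not the first receiver PE for this C-(S,G).
print(joins_needing_signaling([("PE1", "PE2"), ("PE3", "PE4")]))
```

**** If both receivers pick the same upstream PE, only the first join is
**** returned, which is the only case the draft's claim covers.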

**** We might also take a closer look at the case where PE1 and PE2 have
**** both joined C-(S,G) via PE3.  If an I-PMSI is being used, then
**** presumably PE1 and PE2 are already joined to it, and can immediately
**** begin receiving the traffic.  But suppose an S-PMSI is being used, PE1
**** is already joined to it, but PE2 has not joined it yet.  Since the
**** BGP-based S-PMSI procedures require PE2 to know already that C-(S,G) is
**** bound to a particular S-PMSI, then PE2 just has to join that S-PMSI.
**** So that does reduce latency, but at the expense of state; each PE has
**** to know, for ALL flows in the MVPN, which flows have been bound to
**** which S-PMSIs.  Now suppose that the S-PMSI is instantiated by an
**** RSVP-TE P2MP LSP.  In this case, PE2 cannot just join the S-PMSI.
**** First it must issue a BGP Leaf A-D route specifying that it is
**** interested in C-(S,G).  When PE3 sees this, it initiates the signaling
**** to add PE2 to the LSP.  So although PE3 doesn't have to get a join from
**** PE2, it does have to get a BGP message from PE2.
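
**** The cases above can be summarized in a small sketch.  Again this is
**** only a toy model under my own assumptions: the function name and the
**** tunnel labels ("receiver-joined", "rsvp-te-p2mp") are illustrative
**** strings, not protocol constants.  It lists the control-plane steps
**** that must complete before PE2 receives a C-(S,G) that PE1 already
**** receives via PE3.

```python
# Hypothetical toy model of the cases discussed above; labels are illustrative.

def steps_for_late_joiner(pmsi, tunnel="receiver-joined"):
    """Control-plane steps before PE2 receives the C-(S,G) traffic."""
    if pmsi == "I-PMSI":
        return []  # PE2 is already joined to the I-PMSI
    if pmsi != "S-PMSI":
        raise ValueError(f"unknown PMSI type: {pmsi}")
    if tunnel == "rsvp-te-p2mp":
        # PE2 cannot add itself to the tunnel; the root (PE3) must do it.
        return [
            "PE2 originates a BGP Leaf A-D route for C-(S,G)",
            "PE3 receives the Leaf A-D route",
            "PE3 signals RSVP-TE to add PE2 to the P2MP LSP",
        ]
    # PE2 already knows the C-(S,G)-to-S-PMSI binding and joins the tunnel itself.
    return ["PE2 joins the S-PMSI tunnel bound to C-(S,G)"]


for case in [("I-PMSI",), ("S-PMSI",), ("S-PMSI", "rsvp-te-p2mp")]:
    print(case, len(steps_for_late_joiner(*case)), "step(s)")
```

**** The step counts say nothing about absolute latency; they only show
**** that the RSVP-TE case still puts a BGP exchange on the join path.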

**** Increased prune latency, of course, will keep unwanted traffic coming
**** for a longer time.

3.3.6.  Architectural considerations

   The fourth mechanism (the use of BGP for carrying C-Multicast
   routing) would appear to fit well with the current unicast
   architecture as BGP is the customer routing distribution protocol
   used in unicast VPNs and therefore using BGP for customer routing
   distribution within multicast VPNs avoids the introduction of an
   additional protocol that would require additional OAM processes and
   tools.  

**** I imagine that was one of the arguments in favor of using MOSPF instead
**** of PIM ;-)

**** As long as PIM is still used for CE-PE multicast routing, it is false
**** to say that using PIM PE-PE requires an additional protocol or
**** additional "OAM processes and tools".  I also do not understand why it
**** is simply presupposed that the new BGP address families (and the 50
**** pages of new specification added to BGP) do not impose requirements for
**** any new OAM procedures.

   Service provider's with deployed unicast (RFC4364) VPNs
   already have extensive deployment and operations experience of using
   BGP as a customer routing distribution protocol including OAM
   processes and tools.  Such processes and tools will require
   modification in order to support customer multicast routing but those
   modifications are anticipated to be less than those required to
   develop new processes and tools for a distinct customer routing
   protocol.

**** Is "are anticipated to be" yet another euphemism for "are hoped
**** to be"? Again, this paragraph seems to forget that PIM is needed in the
**** PEs anyway.

   It should be noted that because PIM will be used as the CE-PE
   customer routing distribution protocol, service providers will still
   need OAM processes and tools in order to manage the PIM protocol, so
   this rationale only applies to a subset of the tools and processes
   already in place.

**** This "should have been noted" several paragraphs back where it would be
**** even clearer that it undercuts much of the authors' reasoning.

   An illustrative example of the benefit brought by consistency with
   unicast design is how the "extranet" feature can be implemented :
   when BGP-based mechanisms are used, the already defined and well
   understood BGP route target import/export semantics are just reused
   and applied to BGP mVPN routes.  By contrast, it is not specified how
   implementing the same feature would be done in the context of other
   alternative mechanisms, and unclear if this is possible without
   significant engineering trade-offs given that their control plane is
   tied to a specific MI-PMSI tunnel.  Note that the support for the
   Extranet feature is stated as a MUST in sections 5.1.6 of [RFC4834].

**** Even the existing MVPN deployments, available for years, provide
**** extranet support.  So there's a proof in practice that the BGP-based
**** mechanisms are not needed for this purpose, notwithstanding the
**** authors' theoretical arguments about abstract qualities of
**** "consistency".

   Section 5.2.10 of [RFC4834] states that "as far as possible, the
   design of a solution SHOULD carefully consider the number of
   protocols within the core network: if any additional protocols are
   introduced compared with the unicast VPN service, the balance between
   their advantage and operational burden SHOULD be examined
   thoroughly".  Considering that the recommendation of the authors
   would be BGP for auto-discovery and S-PMSI signaling, the choice of
   BGP for customer multicast routing would be consistent with the
   protocol choice for unicast VPNs and would adequately address this
   requirement.

**** Since PIM is needed in the PEs anyway, the choice of PIM addresses this
**** requirement equally well.

3.3.7.  Conclusion on C-multicast routing

   The fourth approach (BGP-based) for customer multicast routing
   clearly presents some advantages over the PIM-based alternatives.
   However it has yet to be deployed within an operational MVPN, and
   only limited experience exists with its implementations.  By
   contrast, PIM-based mechanisms lack many of these benefits and have
   identified limitations in how they can handle customer multicast
   routing load in higher-scale scenarios.  Despite these, experience
   showed that the "Full PIM peering" approach is operationally viable.

   Consequently, at the present time and until there is experience with
   all of the proposed mechanisms it is not clear which of the above
   mechanisms should be recommended as the preferred solution to
   implementers.  However, it would appear prudent for implementations
   to consider supporting both the fourth (BGP-based) and first (full
   per-MPVN PIM peering) mechanisms.  Further experience on both
   implementations is likely to be required before some best practice
   can be defined.

**** Let's see, we have one mechanism which is known to be operationally
**** viable, and another that has only recently been invented, with which
**** there is zero operational experience, and which has only one partial
**** implementation.  Which one is eligible to be called a "best practice"?
**** Which one should be called an "experiment"?  

   Moreover to improve the clarity of the proposed specifications,
   considering that neither hello suppression nor refresh-reduction
   procedures are currently specified or documented and that it is not
   clear what the impact to the PIM state machine of these additional
   procedures may be, the authors recommend that the proposals for
   lightweight PIM peering across an MI-PMSI (the second mechanism) and
   for the unicasting of PIM C-Join/Prune messages (the third mechanism)
   be removed from the current solution document
   [I-D.ietf-l3vpn-2547bis-mcast] (at least until they have been further
   specified and both their impact and benefit on a multicast VPN
   deployment is spelled out).

**** Those proposals appear only in the "framework" section, which seems
**** perfectly appropriate.  If they do get developed at some later time, we
**** wouldn't want them ruled out without consideration as being
**** inconsistent with the framework, would we?  (BTW, there is a draft for
**** TCP-based support of PIM now, being presented to the PIM WG.)  

**** Serious comments about how to improve the "clarity of the proposed
**** specifications" would of course be welcome. 

3.4.  Encapsulation techniques for P-multicast trees

 ...

   Current unicast VPN deployments use a variety of LDP, RSVP-TE and
   GRE/IP-Multicast for encapsulating customer packets for transport
   across the provider core of VPN services.  

**** There are current unicast VPN deployments that also use MPLS-in-L2TPv3.
**** Why is this not in the list of things that have to be supported for
**** MVPN then?  

   It is recommended that
   implementations support the three corresponding multicast tree
   encapsulations techniques, namely: mLDP, P2MP RSVP-TE and GRE/
   IP-multicast in order to allow the same encapsulations to be used for
   unicast and multicast traffic as well as facilitating migration from
   [I-D.rosen-vpn-mcast] to an MPLS label based encapsulation.

**** Given that there is no specification requiring support for all these
**** encapsulations in unicast, it hardly makes sense to require support for
**** all of them in multicast.  Anyway, I'd say it is unreasonable to
**** REQUIRE support for every possible tunnel encapsulation (except L2TPv3,
**** of course ;-)) just because for each encaps there is someone who wants
**** it.

   All three of the above encapsulation techniques support the building
   of P2MP multicast trees.  In addition mLDP and GRE/IP-ASM-Multicast
   implementations may also support the building of MP2MP multicast
   trees.  The use of MP2MP trees may provide some scaling benefits to
   the service provider as only a single MP2MP tree need be deployed per
   VPN, thus reducing the amount of multicast state that needs to be
   maintained by P routers.  This gain in state is at the expect of
   bandwidth optimization, since sites that do not have multicast
   receivers for multicast streams sourced behind a said PE group will
   still receive packets of such streams, leading to non-optimal
   bandwidth utilization across the VPN core. 

**** The use of MP2MP LSPs is perfectly compatible with the use of S-PMSIs
**** for optimizing the routing of individual C-flows.  The claim that they
**** lead to non-optimal bandwidth utilization thus seems to be completely
**** unsupported. 

   One thing to consider is
   that the use of MP2MP multicast tree will require configuring the
   same tree identifier or multicast ASM group address in all PEs, and
   will not provide the kind of autoconfiguration possible with P2MP
   trees.

**** If you think that it would be better not to have to configure the ASM
**** group identifier at all the PEs, this can easily be avoided.  You
**** really only have to configure it at one PE, and all the other PEs can
**** auto-discover it.  This requires only a minor change to the mvpn-bgp
**** spec.

**** I'm not sure how important that is, though; you're already configuring
**** all the PEs with the RTs for the MVPN, so what's the marginal
**** difficulty in configuring them with a group identifier as well?

**** The text above seems like an example of stretching to try to find
**** something bad to say about a scheme which is really the technically
**** superior scheme.

   MVPN services can also be supported over a unicast VPN core through
   the use of ingress PE replication whereby the ingress PE replicates
   any multicast traffic over the P2P tunnels used to support unicast
   traffic.  While this option does not require the service provider to
   modify their existing P routers (in terms of protocol support) and
   does not require maintaining multicast-specific state on the P
   routers in order for the service provider to be able deploy a
   multicast VPN service, the use of ingress PE replication obviously
   leads to non-optimal bandwidth utilization and it is therefore
   unlikely to be the long term solution chosen by service providers.
   However ingress PE replication may be useful during some migration
   scenarios or where a service provider considers the level of
   multicast traffic on their network to be too low to justify deploying
   multicast specific support within their VPN core.

**** The spec also has a strange combination of ingress replication with
**** multicast trees, whereby packets are unicast-tunneled to the root of a
**** shared tree, then decapsulated and distributed down the tree.  Senders
**** then get their own transmissions back!  I wonder why the
**** authors have not asked for this partially specified and not very useful
**** technique to be removed from the spec.

   All proposed approaches for control plane and dataplane can be used
   to provide aggregation amongst multicast groups within a VPN and
   amongst different multicast VPNs, and potentially reduce the amount
   of state to be maintained by P routers.  However the latter -- the
   aggregation amongst different multicast VPNs will require support for
   upstream-assigned labels on the PEs.  Support for upstream-assigned
   labels may require changes to the data plane processing of the PEs
   and this should be taken into consideration by service providers
   considering the use of aggregate S-PMSI tunnels for the specific
   platforms that the service provider has deployed.

3.5.  Inter-AS deployments options

...

   The segmented inter-AS solution would appear to offer the largest
   degree of deployment flexibility to operators, however the non-
   segmented inter-AS solution can simplify deployment in a restricted
   number of scenarios and [I-D.rosen-vpn-mcast] only supports the non-
   segmented inter-AS solution and therefore the non-segmented inter-AS
   solution is likely to be required by some operators for backward
   compatibility and during migration from [I-D.rosen-vpn-mcast] to
   [I-D.ietf-l3vpn-2547bis-mcast].

   The applicability of segmented or non-segmented inter-AS tunnels to a
   given deployment or inter-provider interconnect will depend on a
   number of factors specific to each service provider.  However, due to
   the additional deployment flexibility offered by segmented inter-AS
   tunnels, it is the recommendation of the authors that all
   implementations should support the segmented inter-AS model.
   Additionally, the authors recommend that implementations should
   consider supporting the non-segmented inter-AS model in order to
   facilitate co-existence with existing deployments, and as a feature
   to provide a lighter engineering in a restricted set of scenarios,
   although it is recognized that initial implementations may only
   support one or the other.

   Additionally, the authors note that the proposed BGP-based approaches
   for S-PMSI signaling and C-multicast routing information distribution
   provide a good fit with both segmented and non-segmented inter-AS
   tunnels.  In contrast the UDP-TLV based approach for S-PMSI signaling
   appears to be incompatible with segmented inter-AS tunnels, and it is
   unclear if the proposed PIM-based approaches for C-multicast routing
   information distribution would be fully applicable to segmented
   inter-AS tunnels.

**** I don't see any reason why one couldn't instantiate an MI-PMSI service
**** with segmented inter-AS trees, so I don't see why there is any
**** incompatibility between the segmented inter-AS trees and the various
**** non-BGP approaches to signaling.

**** While segmented inter-AS trees do have the advantages cited, there are
**** some scalability concerns.  The S-PMSIs are individuated on a
**** per-root-PE basis, not on a per-root-AS basis.  To make efficient use
**** of these you have to have aggregation via upstream-assigned labels at
**** the border routers.  Of course, upstream-assigned labels are not a
**** required feature.  The border routers need to be aware of each MVPN
**** that passes traffic through them.  While unsegmented trees are not a
**** particularly good solution in multi-provider scenarios, it is far from
**** clear that the segmented inter-AS trees provide a good solution either.
**** This is an area where I think more study and more review would be
**** appropriate.

...

6.  Summary of recommendations

   The following list summarizes the authors' recommendations.  These
   recommendations are not intended to prevent the implementation of
   alternative solutions, rather they are the authors' recommendations
   for the mechanisms that should be made mandatory in
   [I-D.ietf-l3vpn-2547bis-mcast] and therefore be supported by all
   implementations.

   It is the authors' recommendation:

   o  that BGP-based auto-discovery be the mandated solution for auto-
      discovery ;

   o  that BGP be the mandated solution for S-PMSI signaling ;

   o  that the mandated solution for S-PMSI switch-over be the mechanism
      based on the source-connected PE switching traffic from the I-PMSI
      tunnel to the S-PMSI tunnel, without transmitting traffic on both
      at the time ;

   o  that implementations support both the BGP-based and the full per-
      MPVN PIM peering solutions for PE-PE transmission of customer
      multicast routing until further operational experience is gained
      with both solutions ;

**** I really don't think this provides a reasonable basis for requiring
**** support for BGP-based C-multicast routing.  If one wants to require
**** support for something, the alternative that is already deployed
**** and known to work is the one that should be mandated, and the
**** alternative that is still experimental should be made optional.

**** As a wise man once said, "the proof of the pudding is in the eating,
**** not in the debate about the pudding." ;-)

   o  that implementations support the following multicast tree
      encapsulations: mLDP, P2MP RSVP-TE and GRE/IP-Multicast ;

   o  that implementations support segmented inter-AS tunnels and
      consider supporting non-segmented inter-AS tunnels (in order to
      maintain backwards compatibility and for migration) ;

**** I really do like the segmented inter-AS tunnel idea, and in theory it
**** is a very good thing, but of course we don't know how it will pan out
**** in practice, and there are scalability concerns.  I don't think this is
**** ready to be a "must implement" yet.

   o  implementations MUST support deployments when activation of a PIM
      RP function (PIM Register processing and RP-specific PIM
      procedures) or VRF MSDP instance is not required on any PE router.