[Softwires] Ops-dir review of draft-ietf-softwire-mesh-framework-05

This is an ops-dir review of draft-ietf-softwire-mesh-framework-05.

My main concern here is with the O&M implications of dynamically
created and used softwires.  They do not run any routing protocol, and
there is no IETF protocol or management framework which could be used
to verify the correct operation of these tunnels.  Essentially this
seems to require that operators deploy about N (where N is the number
of connected sites) data probing hosts, each of which periodically tests
connectivity with the N-1 other probes.  Alternatively, this could be
implemented with proprietary mechanisms at AFBR routers.

operations and management issues
--------------------------------

Softwires do not run BGP keepalives and do not (apparently) run an IGP.
As a result, it seems difficult for the network operator to notice
when/if connectivity through a softwire no longer works.
(Previously, BGP or IGP timeouts were an indicator for this.)

This is discussed in Section 10 (some further comments on this below).  The
key point there seems to be that there are ways an operator could build
monitoring for the softwires, but the IETF protocols do not seem to provide a
protocol solution for this (while they do provide such a solution for
manually configured tunnels, for example).

This seems like a significant drawback of this solution and I'd like to see
O&M aspects addressed better in the softwires framework.

In S 10:

   Examples of techniques applicable to softwire OAM include:

      o BGP/TCP timeouts between AFBRs

      o ICMP or LSP echo request and reply addressed to a particular AFBR

... BGP/TCP timeouts take a very long time, and their usage only verifies the
I-IP signaling path between the endpoints, not the data plane.  And what about
ICMP or LSP echo -- it is not clear whether it's run over the softwire or
using I-IP; I'm assuming they are not run on top of the softwire.

As a result they are not very useful from an OAM perspective.  Given that no
IGP is run on top of softwires, debugging E-IP connectivity issues seems
quite painful.  This is exacerbated by the fact that forwarding decisions to
softwires are made by "policy", not by longest prefix matching.  I.e., your
O&M connectivity test procedures also need to test that this policy is
working OK, and testing needs to be done from locations accepted by the
policy.  What you probably need is to build some kind of N^2 matrix of
data probes (using 1500B packet size or the like) through the core to verify
that all softwires are operating correctly.  As a result, I have great
doubts about the O&M aspects (reliability, debuggability, "is it working or
not?" indications) of this technology.
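
To put a number on the probe matrix, here is a minimal sketch that
enumerates the ordered (source, destination) probe pairs for N sites.  The
site names are hypothetical, and each softwire is assumed to need testing in
both directions because the forwarding policy may differ per ingress.

```python
from itertools import permutations

def probe_pairs(sites):
    """Return every ordered (source, destination) pair of distinct sites."""
    return list(permutations(sites, 2))

# Hypothetical site names, one data probe per connected site.
sites = ["site-a", "site-b", "site-c", "site-d"]
pairs = probe_pairs(sites)

# N probes, each testing N-1 others: N * (N - 1) directed tests in total.
print(len(pairs))
```

The count grows quadratically, so even a few dozen sites means on the order
of a thousand periodic probe flows.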

In S 5 (this also borders on an architectural issue):

                        This leads to the following softwires deployment
    restriction: if a BGP Capability is defined for the case in which an
    E-IP NLRI has an I-IP NH, all the AFBRs in a given transit core MUST
    advertise that capability.

... is there any way an implementation could verify whether this deployment
restriction holds or not?  For example, if one of the routers doesn't
happen to support this capability, how will this be detected by the
network operator?

architecture/int-area issues
----------------------------

The document does not describe MTU and fragmentation/reassembly issues in
the core network at all.  In this kind of service my assumption is that you
need to support 1500B packets at ingress when DF-bit is set or the packet is
IPv6.  Discussion in RFC4459 seems applicable here.  The operational
solution to this problem is the requirement to provision the core network
with larger MTUs so that all 1500B+encapsulation overhead can be supported
throughout the core.  This needs to be discussed in the text.
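
As a rough illustration of the provisioning requirement, the sketch below
adds per-encapsulation overhead to a 1500B payload.  The encapsulation names
are my own labels, and the figures assume minimum-size outer headers (20B
IPv4 without options, 40B IPv6 without extension headers, 4B basic GRE
header); other encapsulations or options would add more.

```python
PAYLOAD_MTU = 1500  # packet size the service should carry unfragmented

# Assumed overhead in bytes for minimum-size outer headers.
ENCAP_OVERHEAD = {
    "ipv4-in-ipv6": 40,       # outer IPv6 header
    "ipv6-in-ipv4": 20,       # outer IPv4 header, no options
    "gre-over-ipv4": 20 + 4,  # outer IPv4 + basic GRE header
    "gre-over-ipv6": 40 + 4,  # outer IPv6 + basic GRE header
}

def required_core_mtu(encap, payload_mtu=PAYLOAD_MTU):
    """Smallest core link MTU that carries payload_mtu without fragmenting."""
    return payload_mtu + ENCAP_OVERHEAD[encap]

for encap in ENCAP_OVERHEAD:
    print(encap, required_core_mtu(encap))
```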

Also, in S 4.1:

    The AFBRs handle both I-IP and E-IP. However, only I-IP is used on
    AFBR's "core facing interfaces", and E-IP is only used on its client-
    facing interfaces.

... At some point, a client might want to upgrade to dual stack.  Then the
client-facing interface may use both E-IP and I-IP.  This solution should be
applicable then as well (just that mesh tunnels won't be used for I-IP
from a client port).  I think it is, but this needs to be made clear.

7. Choosing to Forward Through a Softwire

   In many cases, the policy will be very simple.  Some useful policies
   are:

      - if routing says that an E-IP packet has to be sent out a "core-
        facing interface" to an I-IP core, send the packet through a
        softwire [...]

... This text seems a bit confusing.  It seems you are requiring a
modification in forwarding logic of implementations supporting this
mechanism?  I.e., the longest-prefix matching is no longer sufficient with
softwires (i.e.: when deciding whether a packet to a particular E-IP
destination prefix should be tunneled through a softwire or forwarded
natively).

Basically what you have in AFBRs is a BGP routing table with E-IP prefixes
with I-IP nexthops (and these I-IP nexthops are populated using the IGP
through physical interfaces), and you can't associate I-IP nexthops with
softwire tunnel interface(s) because the next-hops must use BGP over the
physical network.

If I interpret this correctly, this is a substantial difference in the
forwarding paradigm and requirements, and should be more clearly described.
In any case, I wonder how you would go about implementing this.

    In the case where E-IP is IPv4 and I-IP is IPv6, it is possible to do
    this translation algorithmically.  A can translate the IPv4 S and G
    into the corresponding IPv4-mapped IPv6 addresses [RFC4291], and then
    B can translate them back.  The precise circumstances under which
    these translations are done would be a matter of policy.

... But the corresponding IPv4-mapped IPv6 address for G is not a multicast
address because it does not fall under ff00::/8, and I suspect as a result
all implementations will treat such a G address as a unicast address.  I
guess one could fix this by standardizing a group mapping to use some
multicast prefix under ff00::/8 and encode the v4 address in the bottom
bits.
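
The point can be checked with a short sketch.  The ff3e::/96 prefix below is
purely a hypothetical choice for illustration; no such mapping is
standardized, which is exactly the gap.

```python
import ipaddress

MULTICAST_V6 = ipaddress.IPv6Network("ff00::/8")

def v4_mapped(addr):
    """IPv4-mapped IPv6 address (::ffff:a.b.c.d) as defined in RFC 4291."""
    v4 = ipaddress.IPv4Address(addr)
    return ipaddress.IPv6Address((0xFFFF << 32) | int(v4))

def hypothetical_group_map(addr, prefix="ff3e::/96"):
    """Embed an IPv4 group in the low 32 bits of a multicast prefix.

    The prefix is an assumption made for this example only.
    """
    net = ipaddress.IPv6Network(prefix)
    v4 = ipaddress.IPv4Address(addr)
    return ipaddress.IPv6Address(int(net.network_address) | int(v4))

group = "232.1.1.1"
print(v4_mapped(group) in MULTICAST_V6)               # mapped G is outside ff00::/8
print(hypothetical_group_map(group) in MULTICAST_V6)  # prefixed G is inside ff00::/8
```

The low 32 bits still carry the original group, so B could recover G by
masking, just as with the IPv4-mapped form.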

missing standardization, reference normativeness issues, etc.
-------------------------------------------------------------

    The method of encoding the (S,G) into the FEC identifier needs to be
    standardized.  The encoding must be self-identifying, so that a node
    which is the root of a P2MP LSP can determine whether a FEC
    identifier is the result of having encoded a PIM (S,G).

    The appropriate state machinery must be standardized so that PIM
    events at the AFBRs result in the proper mLDP events.  For example,
    if at some point an AFBR determines (via PIM procedures) that it no
    longer has any downstream receivers for (S,G), the AFBR should invoke
    the proper mLDP procedures to prune itself off the corresponding P2MP
    LSP.

.. this would seem to call for a normative reference to this
standardization?

11.2. MVPN-like Schemes

.. especially this section seems to rely very heavily on [L3VPN-MCAST],
yet that is an informative reference; should probably be normative.

    In the latter two cases, BGP recursive next hop resolution needs to
    be done, and encapsulations may need to be stacked.

... And..? Have these two things been specified somewhere?  This calls for a
reference.

security issues
---------------

    However, attacks of this sort can result in policy violations.  The
    authorized transmitting endpoint(s) of a softwire may be following a
    policy according to which only certain payload packets get sent
    through the softwire.  If unauthorized nodes are able to encapsulate
    the payload packets so that they arrive at the receiving endpoint
    looking as if they arrived from authorized nodes, then the properly
    authorized policies have been side-stepped.

.. I believe this could result in two kinds of attacks which could be
emphasized better (more below).  The text above uses "policy violations",
which may be read to refer to a policy described in Section 7.  Also,
the focus of the second quoted sentence is odd -- it makes this seem like
an issue in transmitting endpoints, while to me the major issue seems to be
unauthorized endpoints.

   1) attacks (DoS, exploits, etc.; unwanted or unauthorized traffic)
      on the destination if policy is applied at ingress.  Specifically,
      it is no longer enough to protect "the border" towards your peers at
      your border routers.  You will also need to protect "the border" at all
      softwire decapsulation points.  This is challenging because when you
      protect the border you know which physical interface (your backbone,
      customer, peer, ...) a packet arrived on and can filter on that.  At a
      decapsulation point you no longer know; you will need to rely on your
      border routers' spoofing or destination address protection mechanisms.

   2) source address spoofing.  It's the transmitting endpoint's job to
      perform source address spoofing prevention and possibly some other
      security policy checks (e.g. typical "protect the core" ACLs).  By
      injecting traffic directly to the receiving endpoint, these are
      avoided.  The target of the attack could be the router decapsulating
      traffic, or someone in the destination prefix.

editorial/nits
--------------

The number of front-page authors exceeds 5, which is normally the limit.
I suggest just listing the editor.

                             In many cases, these characteristics can be
    represented by arbitrarily selected communities or extended
    communities, and the policies at the ingress can be expressed in
    terms of these classes (i.e., communities).

.. the first time this is used, probably useful to say "BGP community
attributes (communities)" and provide an RFC reference.

    not generally have a route to the source of the tree, the AFBR must
    create include an "RPF (Reverse Path Forwarding) Vector" [RPF-VECTOR]
    in the PIM message.

.. remove redundant "create"

    The (S', G') trees should be SSM trees.

... Does this and the text in the next paragraph also refer to the
"IPv4 mapped to IPv6" case?  If so, the paragraph where mapping is discussed
should probably mention S' and G' to make the connection clearer.

    Note that this method cannot be used when the G is a Sparse Mode
    group.

.. this statement is ambiguous, because this doesn't support Dense mode
groups either, right?  The point here is that it must be an SSM channel,
right?

    If a tunnel lies entirely within a single administrative domain, then
    to a certain extent, then there are certain non-cryptographic
    techniques one can use to prevent spoofed packets from reaching a
    tunnel's receiving endpoint.  For example, when the tunnel
    encapsulation is IP-based:

      - The tunnel receiving endpoints can be given a distinct set of
        addresses, and those addresses can be made known to the border
        routers.  The border routers can then filter out packets,
        destined to those addresses, which arrive from outside the
        domain.

      - The tunnel transmitting endpoints can be given a distinct set of
        addresses, and those addresses can be made known to the border
        routers and to the tunnel receiving endpoints. The border routers
        can filter out all packets arriving from outside the domain with
        source addresses that are in this set, and the receiving
        endpoints can discard all packets which appear to be part of a
        softwire, but whose source addresses are not in this set.

... there is also a third option, or maybe it's a variation of the second
case.  When point-to-point tunnels are used, each router only decapsulates
packets from valid peers.  As a consequence, it's sufficient that border
routers prevent source address spoofing in general; you no longer need
to enumerate and specify the transmitting endpoint addresses.
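
A minimal sketch of the decapsulation-side check I have in mind follows.
The addresses come from the IPv6 documentation prefix and the function name
is hypothetical; a real router would do this in its tunnel lookup rather
than in a separate filter.

```python
import ipaddress

# Hypothetical addresses from the IPv6 documentation prefix (2001:db8::/32).
LOCAL_ENDPOINT = ipaddress.ip_address("2001:db8::1")
VALID_PEERS = {
    ipaddress.ip_address("2001:db8::2"),
    ipaddress.ip_address("2001:db8::3"),
}

def accept_for_decap(outer_src, outer_dst):
    """Accept an encapsulated packet only if it is addressed to this
    endpoint and its outer source is a configured tunnel peer."""
    src = ipaddress.ip_address(outer_src)
    dst = ipaddress.ip_address(outer_dst)
    return dst == LOCAL_ENDPOINT and src in VALID_PEERS

print(accept_for_decap("2001:db8::2", "2001:db8::1"))   # configured peer
print(accept_for_decap("2001:db8::99", "2001:db8::1"))  # unknown source
```

This only helps if the outer source cannot be forged from outside the
domain, hence the dependency on general anti-spoofing at the border.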