<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
    which is available here: http://xml.resource.org. -->
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
    please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
    (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
    (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="std"
     docName="draft-khare-idr-bgp-flowspec-payload-match-03" ipr="trust200902"
     submissionType="IETF">
  <!-- ***** FRONT MATTER ***** -->

  <front>
    <!-- The abbreviated title is used in the page header - it is only necessary if the 
        full title is longer than 39 characters -->

    <title abbrev="BGP FlowSpec Payload Matching">BGP FlowSpec Payload
    Matching</title>

    <author fullname="Anurag Khare" initials="A." role="editor"
            surname="Khare">
      <organization>Juniper Networks, Inc.</organization>

      <address>
        <postal>
          <street>2251 Corporate Park Drive</street>

          <city>Herndon</city>

          <region>Virginia</region>

          <code>20171</code>

          <country>US</country>
        </postal>

        <email>anuragk@juniper.net</email>
      </address>
    </author>

    <author fullname="John Scudder" initials="J." surname="Scudder">
      <organization>Juniper Networks, Inc.</organization>

      <address>
        <postal>
          <street>1133 Innovation Way</street>

          <city>Sunnyvale</city>

          <region>CA</region>

          <code>94089</code>

          <country>US</country>
        </postal>

        <email>jgs@juniper.net</email>
      </address>
    </author>

    <author fullname="Luay Jalil" initials="L." surname="Jalil">
      <organization>Verizon</organization>

      <address>
        <email>luay.jalil@one.verizon.com</email>
      </address>
    </author>

    <author fullname="Michael Gallagher" initials="M." surname="Gallagher">
      <organization>Verizon</organization>

      <address>
        <email>michael.gallagher@verizon.com</email>
      </address>
    </author>

    <author fullname="Kirill Kasavchenko" initials="K." surname="Kasavchenko">
      <organization>NetScout</organization>

      <address>
        <email>Kirill.Kasavchenko@netscout.com</email>
      </address>
    </author>

    <date day="1" month="May" year="2019"/>

    <area>RTG</area>

    <workgroup>Internet Engineering Task Force</workgroup>

    <!-- WG name at the upperleft corner of the doc,
        IETF is fine for individual submissions.  
   If this element is not present, the default is "Network Working Group",
        which is used by the RFC Editor as a nod to the history of the IETF. -->

    <keyword>BGP</keyword>

    <keyword>Flowspec</keyword>

    <keyword>DDoS</keyword>

    <keyword>filter</keyword>

    <abstract>
      <t>The rise in frequency, volume, and pernicious effects of DDoS attacks
      has elevated them from fare for the specialist to generalist press.
      Numerous reports detail the taxonomy of DDoS types, the varying
      motivations of their attackers, as well as the resulting business and
      reputation loss of their targets.</t>

      <t>BGP FlowSpec (RFC 5575, "Dissemination of Flow Specification Rules")
      can be used to rapidly disseminate filters that thwart attacks, being
      particularly effective against the volumetric type. Operators can use
      existing FlowSpec components to match on pre-defined packet header
      fields. However, recent enhancements to forwarding plane filter
      implementations allow matches at arbitrary locations within the packet
      header and, to some extent, the payload. This capability can be used to
      detect highly amplified attacks whose attack signature remains
      relatively constant while values in the packet header vary, as well as
      the burgeoning variety of tunneled traffic.</t>

      <t>We define a new FlowSpec component, "Flexible Match Conditions", with
      similar matching semantics to those of existing components. This
      component will allow the operator to define bounded match conditions
      using bit offsets and a variety of match types.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>BGP FlowSpec <xref target="RFC5575"/> can be used to rapidly
      disseminate filters that thwart attacks, being particularly effective
      against the volumetric type. Operators can use existing FlowSpec
      components to match on pre-defined packet header fields. However, recent
      enhancements to forwarding plane filter implementations allow matches at
      arbitrary locations within the packet header and, to some extent, the
      payload. This capability can be used to detect highly amplified attacks
      whose attack signature remains relatively constant while values in the
      packet header vary. Such varying header values generally make these
      attacks challenging to mitigate with header-based filters.</t>

      <t>We define a new FlowSpec component, "Flexible Match Conditions", with
      similar matching semantics to those of existing components. This
      component will allow the operator to define bounded match conditions
      using bit offsets and a variety of match types.</t>

      <section title="Requirements Language">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119">RFC 2119</xref>.</t>
      </section>
    </section>

    <section title="Motivation">
      <t>BGP FlowSpec couples the advertisement of NLRI-specific match
      conditions with the forwarding instance to which the filter is
      attached. This makes sense since BGP FlowSpec advertisements are most
      commonly generated, or at least verified, by human operators. The
      operator finds it intuitive to configure match conditions as
      human-readable values, native to each address family.</t>

      <t>It is much friendlier, for instance, to define a filter that matches
      a source address of 192.168.1.1/32, than it is to work with the
      equivalent binary representation of that IPv4 address. Further, it is
      easier to use field names such as 'IPv4 source address' as part of the
      match condition, than it is to demarcate that field using byte and bit
      offsets.</t>

      <t>However, there are a number of use cases that benefit from the
      latter, more machine-readable approach.</t>

      <section title="Machine analysis of DDoS attacks">
        <t>Launching a DDoS is easier and more cost-effective than ever. The
        will to attack matters more than wherewithal. Those with the
        inclination can initiate one from the <eref
        target="https://github.com/649/Memcrashed-DDoS-Exploit">comfort of
        their homes</eref>, or even buy <eref
        target="https://www.facebook.com/PutinStresser/photos/a.1687498801469198/2024483917770683/?type=3">DDoS-as-a-Service</eref>,
        complete with 24x7 support and flexible payment plans.</t>

        <t>Despite their effectiveness, such attacks are easily thwarted -
        once identified. The challenge lies in fishing out a generally
        unvarying attack signature from a data stream. Machine analysis may
        prove superior here, given the size of input involved. The resulting
        pattern may not lie within a well-defined field; even if it happens
        to, it may be a more straightforward workflow to have machine
        analysis result in a machine-readable filter.</t>

        <t>Below we illustrate the need for the suggested approach with two
        use cases.</t>

        <section title="Matching based on payload">
          <t>A vast majority of volumetric DDoS attacks are of
          reflection/amplification nature. They can often be identified by the
          UDP source port of a service that reflects and amplifies the attack
          traffic. However, there exist DDoS attack methodologies, such as
          SSDP Diffraction or BitTorrent amplification, in which the values of
          most layer 3 and layer 4 header fields, including source and
          destination UDP ports, are varied. That makes it challenging, if not
          impossible, to classify and mitigate a DDoS attack based on existing
          Flow Specification components. At the same time, these attacks very
          often have a constant pattern in the payload. Using that pattern as
          a matching criterion would help in mitigating such DDoS attacks.</t>
        </section>

        <section title="Matching based on any protocol header field or across fields">
          <t>BGP FlowSpec <xref target="RFC5575"/> defines 12 Flow
          Specification component types that can be used to match traffic.
          However, a DDoS attack might result in illegitimate traffic of a
          specific pattern in a layer 3 or layer 4 header, and this pattern
          would not have a respective component type. Examples are the Time to
          Live field of the IP header and the Window field of the TCP header.
          In order to
          avoid extending BGP FlowSpec <xref target="RFC5575"/> with all
          theoretically possible component types, this document proposes
          divorcing the search boundary from having to align with header
          fields. This allows flexibly matching patterns regardless of whether
          they have a currently matching component type as well as patterns
          that span fields.</t>
        </section>
      </section>

      <section title="Tunneled traffic">
        <t>Tunnels continue to proliferate due to the benefits they provide.
        They can help reduce state in the underlay network. Tunnels allow
        bypassing routing decisions of the transit network. Traffic is often
        tunneled to obscure or secure it. Common tunnel types
        include IPsec <xref target="RFC4301"/>, Generic Routing Encapsulation
        (GRE) <xref target="RFC2890"/>, Virtual eXtensible Local Area Network
        (VXLAN) <xref target="RFC7348"/>, and GPRS Tunneling Protocol (GTP) <xref
        target="GTPv1-U"/>.</t>

        <t>By definition, transit nodes that are not the endpoints of the
        tunnel hold no attendant control or management plane state. These very
        qualities make it challenging to filter tunneled traffic at
        non-endpoints. Often though, the forwarding hardware at these
        transit-only nodes is capable of reading the byte stream that
        comprises the protocol being tunneled. Despite this capability, it is
        usually infeasible to filter based on the content of this passenger
        protocol's header since BGP FlowSpec does not provide the operator a
        way to address arbitrary locations within a packet.</t>
      </section>

      <section title="Non-IP traffic">
        <t>Not all traffic is forwarded as IP packets. Layer 2 services
        abound, including flavors of BGP-signaled Ethernet VPNs such as
        BGP-EVPN, BGP-VPLS, FEC 129 VPWS (LDP-signaled VPWS with BGP
        Auto-Discovery).</t>

        <t>Ongoing efforts such as <xref
        target="I-D.ietf-idr-flowspec-l2vpn"/> offer one approach, which is to
        add layer 2 fields as additional match conditions. This may suffice if
        a filter needs to be applied only to layer 2, or only to layer 3
        header fields.</t>
      </section>
    </section>

    <section anchor="terminology" title="Terminology">
      <t>
        <list hangIndent="4" style="hanging">
          <t hangText="Header">Subset of datagram or packet that contains
          information that is required for delivery from source to
          destination.</t>

          <t hangText="Payload">Remaining subset of datagram or packet that
          contains the information that is being transported.</t>

          <t hangText="Field">A priori defined subset of the header with
          established semantics, acceptable value type and length.</t>

          <t hangText="Type">How the encoded bits that comprise a field are
          interpreted. A well-defined type can be used to enforce notions of
          ordering, upper and lower bounds, and correctness. For instance,
          using a signed integer type to count the number of packets received
          by a given forwarding element could result in negative values. To
          avoid that, the field should be typed as a zero-based counter. In
          order to avoid premature rollover, the counter should be sized
          appropriately. To prevent retrogression, the values should always be
          accumulated as it is impossible to receive fewer packets in toto.
          Defining this example field as an unsigned 64-bit field with
          monotonically incrementing values ensures that it meets the appropriate
          objectives.</t>

          <t hangText="Maximum Readable Length">The packet length in bits that
          a forwarding implementation can parse and make available for
          filtering. Abbreviated as MRL.</t>
        </list>
      </t>
    </section>

    <section title="Defining the Search Boundary">
      <t> Based on this set of definitions, the flexible match operator
      requires three inputs to demarcate the search extents and the search
      term itself:</t>

      <t>
        <list style="symbols">
          <t>Where the match should begin: Where in the datagram or packet the
          search for matching values is initiated. This allows skipping over
          parts of the packet that are not of interest.</t>

          <t>Where the match should end: Where in the datagram or packet the
          search ends.</t>

          <t>What should be matched: A variety of search types, including
          exact numeric matches, matching a range of numeric values, and
          string-based matches.</t>
        </list>
      </t>

      <section title="Defining the Start of the Boundary">
        <t>While intuitive to grasp, determining the search boundary requires
        explication. A canonical forwarding engine parses an incoming packet
        header and identifies it as belonging to a single Network Layer
        Reachability Information (NLRI), or address family. The contents of
        the header are parsed with address family specificity, in order to
        extract a forwarding lookup key. In the case of IPv4 unicast
        forwarding, this key is the IPv4 destination address. The key is used
        to look up the corresponding action in an address family specific
        forwarding table.</t>

        <t>This does not preclude implementations from exposing additional
        packet headers to the operator, both encapsulating and encapsulated,
        to provide additional forwarding functionality. For instance, common
        stateless load balancing techniques involve reading fields in
        additional headers in order to increase entropy and preserve flow
        ordering. As another example, in the case of Ethernet encapsulated
        IPv4 packets, a forwarding engine could allow filtering using the
        source or destination MAC address even though the forwarding decision
        is ultimately based only on the IPv4 header.</t>

        <t>As yet another example, consider that a Virtual eXtensible Local
        Area Network (VXLAN) <xref target="RFC7348"/> packet has the following
        headers:</t>

        <t>
          <list style="symbols">
            <t>Outer Ethernet Header: Source MAC address of the originating
            VXLAN Tunnel End Point (VTEP).</t>

            <t>Outer IPv4/IPv6 Header: Source IP address of the originating
            VXLAN Tunnel End Point (VTEP).</t>

            <t>Outer UDP Header: Random source port used to generate entropy
            for load balancing, and destined to the IANA-assigned VXLAN port
            4789.</t>

            <t>VXLAN Header: Used to identify a specific VXLAN overlay
            network.</t>

            <t>Inner Ethernet Header and payload: Original MAC frame
            encapsulated.</t>
          </list>
        </t>

        <t>Forwarding at the tunnel midpoints, i.e., not where tunnel
        imposition or disposition occurs, makes use of the outer IPv4 header.
        In order to differentiate itself, a midpoint may provide the ability
        to parse and take the VXLAN header into account. This functionality
        could be used to implement access control or perform traffic
        telemetry.</t>

        <t>In order to normalize behavior across forwarding implementations,
        the beginning of the search space MUST be aligned with the FlowSpec
        AFI/SAFI to which the flexible match rule belongs. For instance, with
        FlowSpec for IPv4 traffic, the match can only start at the first bit
        of the IPv4 header. Even if the forwarding implementation has the
        capability to read outer and inner headers, the start of the search
        extent is anchored at the IPv4 header.</t>
      </section>

      <section title="Defining the End of the Boundary">
        <t> Similarly, the end of the search boundary MUST be the lesser of
        either the last bit in a packet or the <xref
        target="terminology">Maximum Readable Length</xref> that a forwarding
        implementation can parse from a packet and make available for
        filtering. As the MRL will be implementation-dependent, it needs to be
        known to the flexible filtering rules engine. That can be communicated
        out-of-band via configuration or signaled using future BGP or IGP
        extensions. </t>

        <t>Nodes in a flexible filtering domain are not required to have a
        common or minimum MRL. This does not obviate the
        need for a rules engine to take MRL into account when creating
        flexible filters. This is especially important as the rules engine may
        not have direct BGP peering with all FlowSpec enforcers and may not
        receive a BGP Notification if it advertises a flexible match that
        exceeds the MRL of a given node.</t>
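        <t>As a non-normative illustration of the two paragraphs above, the
        following Python sketch computes the end of the search boundary and
        lets a rules engine flag nodes whose MRL is too small for a given
        filter. The function names, the node names, and the idea of keeping
        per-node MRL values in a dictionary are assumptions made for this
        example only; this document does not define how the MRL is
        learned.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import Dict, List


def search_end_bits(packet_len_bits: int, mrl_bits: int) -> int:
    """End of the search boundary: the lesser of the packet length
    and the Maximum Readable Length (MRL), both in bits."""
    return min(packet_len_bits, mrl_bits)


def nodes_with_insufficient_mrl(required_bits: int,
                                node_mrl_bits: Dict[str, int]) -> List[str]:
    """Return (sorted) the nodes that cannot enforce a filter which
    needs to read `required_bits` bits from the start of the search
    boundary. The per-node MRL table is hypothetical input."""
    return sorted(node for node, mrl in node_mrl_bits.items()
                  if required_bits > mrl)
```
]]></artwork>
        </figure>

        <t>A rules engine could run such a check before advertising a
        flexible match, since it may never learn that a remote enforcer was
        unable to install the filter.</t>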
      </section>
    </section>

    <section title="Specification">
      <t> We define a new FlowSpec component, Type TBD, named "Flexible Match
      Conditions". </t>

      <t>Encoding: &lt;type (1 octet), length (1 octet), value&gt;</t>

      <section title="Value">
        <t> The Value field contains the match boundary, match type, and term
        to match. </t>

        <t>Encoding: &lt;match boundary (2 octets), match type (1 octet),
        match term&gt;</t>
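        <t>The nesting of the two encodings above can be sketched in Python
        as follows. This is a non-normative illustration that assumes the
        fields are simply concatenated in the order listed. The function
        names are hypothetical, and the component type is passed in as a
        parameter because its value is still TBD.</t>

        <figure>
          <artwork><![CDATA[
```python
def encode_value(match_boundary: bytes, match_type: int,
                 match_term: bytes) -> bytes:
    """Build the Value field: boundary (2 octets), type (1 octet), term."""
    if len(match_boundary) != 2:
        raise ValueError("match boundary must be exactly 2 octets")
    if not 0 <= match_type <= 0xFF:
        raise ValueError("match type must fit in 1 octet")
    return match_boundary + bytes([match_type]) + match_term


def encode_component(component_type: int, value: bytes) -> bytes:
    """Build the component: type (1 octet), length (1 octet), value."""
    if not 0 <= component_type <= 0xFF:
        raise ValueError("component type must fit in 1 octet")
    if len(value) > 0xFF:
        raise ValueError("value does not fit a 1-octet length field")
    return bytes([component_type, len(value)]) + value
```
]]></artwork>
        </figure>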

        <section title="Match Boundary">
          <t>The match boundary is encoded as:</t>

          <figure align="left">
            <artwork align="center">
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|u|bit o|      byte offset      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            </artwork>
          </figure>

          <t>
            <list hangIndent="4" style="hanging">
              <t hangText="u -">Currently unused. MUST be zero.</t>

              <!-- the lower bit could be used in the future as an offset sign bit -->

              <t hangText="bit offset -">The number of bits to ignore in the
              packet being matched, from the start of the search boundary.</t>

              <t hangText="byte offset -">The number of bytes to ignore.</t>
            </list>
          </t>
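          <t>The following non-normative Python sketch packs and unpacks the
          match boundary. It assumes the field widths read off the diagram
          above (1 bit unused, 3 bits of bit offset, 12 bits of byte offset),
          most significant bit first, in network byte order. The function
          names are hypothetical.</t>

          <figure>
            <artwork><![CDATA[
```python
import struct
from typing import Tuple


def encode_match_boundary(bit_offset: int, byte_offset: int) -> bytes:
    """Pack |u(1)|bit offset(3)|byte offset(12)| into 2 octets."""
    if not 0 <= bit_offset <= 0x7:
        raise ValueError("bit offset must fit in 3 bits")
    if not 0 <= byte_offset <= 0x0FFF:
        raise ValueError("byte offset must fit in 12 bits")
    # The unused 'u' bit (most significant) is left at zero.
    return struct.pack("!H", (bit_offset << 12) | byte_offset)


def decode_match_boundary(data: bytes) -> Tuple[int, int]:
    """Return (bit_offset, byte_offset); reject a non-zero 'u' bit."""
    (word,) = struct.unpack("!H", data[:2])
    if word & 0x8000:
        raise ValueError("unused bit MUST be zero")
    return (word >> 12) & 0x7, word & 0x0FFF
```
]]></artwork>
          </figure>

          <t>Under these assumptions, a bit offset of 3 and a byte offset of
          20 encode as the two octets 0x30 0x14.</t>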
        </section>

        <section title="Match Type">
          <t>Currently the following match types are defined:</t>

          <texttable title="Match Types">
            <ttcol align="center">Value</ttcol>

            <ttcol>Match Type</ttcol>

            <c>0</c>

            <c>Bitmask match</c>

            <c>1</c>

            <c>Numeric range match</c>

            <c>2</c>

            <c>Regular expression (regex) string match</c>
          </texttable>

          <t>Match types 0 and 1 MUST be implemented. All other types are
          OPTIONAL.</t>

          <section title="Bitmask match">
            <t>This is encoded as {prefix, mask}, of equal length.</t>

            <t>
              <list hangIndent="4" style="hanging">
                <t hangText="prefix -">Provides a bit string to be matched.
                The prefix and mask fields are bitwise AND'ed to create a
                resulting pattern.</t>

                <t hangText="mask -">Paired with the prefix field to create a
                bit string match. An unset bit is treated as a 'do not care'
                bit in the corresponding position in the prefix field. When a
                bit is set in the mask, the value of the bit in the
                corresponding location in the prefix field must match
                exactly.</t>
              </list>
            </t>
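            <t>The semantics above can be sketched, non-normatively, in
            Python. The sketch assumes the bits under test have already been
            extracted from the packet at the match boundary and are
            byte-aligned. The function name is hypothetical.</t>

            <figure>
              <artwork><![CDATA[
```python
def bitmask_match(data: bytes, prefix: bytes, mask: bytes) -> bool:
    """True if `data` equals `prefix` in every bit position where
    `mask` is set; unset mask bits are 'do not care'."""
    if len(prefix) != len(mask):
        raise ValueError("prefix and mask must be of equal length")
    if len(data) < len(prefix):
        return False  # not enough packet data to compare
    # zip() stops at the shortest input, so extra data bytes are ignored.
    return all((d & m) == (p & m) for d, p, m in zip(data, prefix, mask))
```
]]></artwork>
            </figure>

            <t>For instance, a prefix of 0x40 with a mask of 0xF0 matches any
            octet whose upper four bits are 0100, regardless of the lower
            four bits.</t>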
          </section>

          <section title="Numeric range match">
            <t>This is encoded as {low value, high value}, treated as an
            inclusive range.</t>

            <t>
              <list hangIndent="4" style="hanging">
                <t hangText="low -">The low value of the desired numeric
                range. This value MUST be numerically lower than the high
                value.</t>

                <t hangText="high -">The high value of the desired numeric
                range. This value MUST be numerically higher than the low
                value.</t>
              </list>
            </t>
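            <t>A non-normative Python sketch of this match type follows. It
            assumes the extracted bits are interpreted as an unsigned integer
            in network byte order, which this document does not state
            explicitly. The function name is hypothetical.</t>

            <figure>
              <artwork><![CDATA[
```python
def numeric_range_match(field: bytes, low: int, high: int) -> bool:
    """True if the unsigned big-endian value of `field` lies in the
    inclusive range [low, high]."""
    if low >= high:
        raise ValueError("low MUST be numerically lower than high")
    value = int.from_bytes(field, "big")
    return low <= value <= high
```
]]></artwork>
            </figure>

            <t>With a one-octet field such as the IPv4 Time to Live, a range
            of {1, 64} matches a value of 64 but not 65.</t>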
          </section>

          <section title="ASCII-only regular expression string match">
            <t>Not every forwarding plane that supports filtering via FlowSpec
            is a hardware-accelerated Network Processor Unit (NPU) or
            Application-Specific Integrated Circuit (ASIC). Software-only
            forwarding planes, while less performant, may be able to filter on
            more complex match types.</t>

            <t>There is a plethora of regular expression engines, each
            supporting its own flavor. The specific flavor this match type
            refers to is
            the extended regular expression (ERE) as defined by <xref
            target="IEEE.1003-2.1992"/>.</t>
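            <t>As a rough, non-normative illustration, a software forwarding
            plane could apply an ASCII-only pattern to the bytes within the
            search boundary as sketched below in Python. Note that Python's
            re module is not a POSIX ERE engine; the sketch is only faithful
            for patterns restricted to constructs that both flavors share.
            The function name and the sample pattern are hypothetical.</t>

            <figure>
              <artwork><![CDATA[
```python
import re


def regex_match(search_space: bytes, pattern: bytes) -> bool:
    """True if the ASCII-only `pattern` matches anywhere within the
    bytes exposed by the search boundary."""
    if not pattern.isascii():
        raise ValueError("pattern must be ASCII-only")
    # Python's `re` stands in for an ERE engine here (an assumption).
    return re.search(pattern, search_space) is not None
```
]]></artwork>
            </figure>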
          </section>
        </section>
      </section>
    </section>

    <section title="Error Handling">
      <t>Malicious, misbehaving, or misunderstanding implementations could
      advertise semantically incorrect values. Care must be taken to minimize
      fallout from attempting to parse such data. Any well-behaved
      implementation SHOULD verify that the length of a packet undergoing a
      match is at least (match start header length + byte offset + bit offset
      + value length).</t>
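      <t>The length check can be sketched, non-normatively, in Python. The
      sketch assumes all quantities are converted to bits before being
      summed, and it interprets "match start header length" as the distance
      from the start of the packet to the header at which the search
      boundary is anchored. Both are assumptions of this example, as are the
      function and parameter names.</t>

      <figure>
        <artwork><![CDATA[
```python
def packet_long_enough(packet_len_bits: int, anchor_offset_bits: int,
                       byte_offset: int, bit_offset: int,
                       value_len_bits: int) -> bool:
    """True if the packet is long enough for the match to be evaluated
    without reading past its end."""
    required_bits = (anchor_offset_bits + byte_offset * 8
                     + bit_offset + value_len_bits)
    return packet_len_bits >= required_bits
```
]]></artwork>
      </figure>

      <t>A packet that fails this check cannot contain the bits being
      matched, so an implementation could treat it as a non-match rather than
      attempting to read beyond the packet.</t>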
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>This document introduces no additional security considerations beyond
      those already covered in <xref target="RFC5575"/>.</t>

      <!-- jgs - I am skeptical this will suffice. I'll think about if and
     how to elaborate -->
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>IANA <!-- --> is requested to assign <!-- when procedure is
    done, update this to "has assigned", and update the various TBD
    accordingly --> a type from the First Come First Served range of the "Flow
      Spec Component Types" registry:</t>

      <texttable>
        <ttcol align="center">Type Value</ttcol>

        <ttcol align="center">Name</ttcol>

        <ttcol align="center">Reference</ttcol>

        <c>TBD</c>

        <c>Flexible Match Conditions</c>

        <c>this document</c>
      </texttable>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>Thanks to Rafal Jan Szarecki, Sudipto Nandi, Ron Bonica, and Jeff
      Haas for their valuable comments and suggestions on this document.</t>
    </section>
  </middle>

  <!--  *****BACK MATTER ***** -->

  <back>
    <references title="Normative References">
      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.5575.xml"?>
    </references>

    <references title="Informative References">
      <?rfc include="http://xml2rfc.tools.ietf.org/public/rfc/bibxml5/reference.3GPP.29.281.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml-ids/reference.I-D.ietf-idr-flowspec-l2vpn.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2890.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.4301.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.7348.xml"?>

      <?rfc include="http://xml2rfc.tools.ietf.org/public/rfc/bibxml6/reference.IEEE.1003-2.1992.xml"?>

      <reference anchor="GTPv1-U">
        <front>
          <title> General Packet Radio System (GPRS) Tunnelling Protocol User
          Plane (GTPv1-U) </title>

          <author>
            <organization>3GPP</organization>
          </author>

          <date day="26" month="September" year="2011"/>
        </front>

        <seriesInfo name="3GPP TS" value="29.281 10.3.0"/>

        <format target="http://www.3gpp.org/ftp/Specs/html-info/29281.htm"
                type="HTML"/>
      </reference>
    </references>
  </back>
</rfc>
