<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
    which is available here: http://xml.resource.org. -->
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
    please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
    (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
    (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="std" docName="draft-khare-idr-bgp-flowspec-payload-match-02"
     ipr="trust200902">
  <!-- ***** FRONT MATTER ***** -->

  <front>
    <!-- The abbreviated title is used in the page header - it is only necessary if the 
        full title is longer than 39 characters -->

    <title abbrev="BGP FlowSpec Payload Matching">BGP FlowSpec Payload
    Matching</title>

    <author fullname="Anurag Khare" initials="A." role="editor"
            surname="Khare">
      <organization>Juniper Networks, Inc.</organization>

      <address>
        <postal>
          <street>2251 Corporate Park Drive</street>

          <city>Herndon</city>

          <region>Virginia</region>

          <code>20171</code>

          <country>US</country>
        </postal>

        <email>anuragk@juniper.net</email>
      </address>
    </author>

    <author fullname="John Scudder" initials="J." surname="Scudder">
      <organization>Juniper Networks, Inc.</organization>

      <address>
        <postal>
          <street>1133 Innovation Way</street>

          <city>Sunnyvale</city>

          <region>CA</region>

          <code>94089</code>

          <country>US</country>
        </postal>

        <email>jgs@juniper.net</email>
      </address>
    </author>

    <author fullname="Luay Jalil" initials="L." surname="Jalil">
      <organization>Verizon</organization>

      <address>
        <email>luay.jalil@one.verizon.com</email>
      </address>
    </author>

    <author fullname="Michael Gallagher" initials="M." surname="Gallagher">
      <organization>Verizon</organization>

      <address>
        <email>michael.gallagher@verizon.com</email>
      </address>
    </author>

    <date/>

    <area>RTG</area>

    <workgroup>Internet Engineering Task Force</workgroup>

    <!-- WG name at the upperleft corner of the doc,
        IETF is fine for individual submissions.  
   If this element is not present, the default is "Network Working Group",
        which is used by the RFC Editor as a nod to the history of the IETF. -->

    <keyword>BGP</keyword>

    <keyword>Flowspec</keyword>

    <abstract>
      <t>The rise in frequency, volume, and pernicious effects of DDoS attacks
      has elevated them from fare for the specialist press to fare for the
      generalist press. Numerous reports detail the taxonomy of DDoS types,
      the varying motivations of attackers, and the resulting business and
      reputation loss of their targets.</t>

      <t>BGP FlowSpec (RFC 5575, "Dissemination of Flow Specification Rules")
      can be used to rapidly disseminate filters that thwart attacks, being
      particularly effective against the volumetric type. Operators can use
      existing FlowSpec components to match on pre-defined packet header
      fields. However, recent enhancements to forwarding plane filter
      implementations allow matches at arbitrary locations within the packet
      header and, to some extent, the payload. This capability can be used to
      detect highly amplified attacks whose signature remains relatively
      constant.</t>

      <t>We define a new FlowSpec component, "Flexible Match Conditions", with
      similar matching semantics to those of existing components. This
      component will allow the operator to define bounded match conditions
      using offsets and bitmasks.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>BGP FlowSpec <xref target="RFC5575"/> can be used to rapidly
      disseminate filters that thwart attacks, being particularly effective
      against the volumetric type. Operators can use existing FlowSpec
      components to match on pre-defined packet header fields. However, recent
      enhancements to forwarding plane filter implementations allow matches at
      arbitrary locations within the packet header and, to some extent, the
      payload. This capability can be used to detect highly amplified attacks
      whose signature remains relatively constant, or the burgeoning
      variety of tunneled traffic.</t>

      <t>We define a new FlowSpec component, "Flexible Match Conditions", with
      similar matching semantics to those of existing components. This
      component will allow the operator to define bounded match conditions
      using offsets and bitmasks.</t>

      <section title="Requirements Language">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119">RFC 2119</xref>.</t>
      </section>
    </section>

    <section title="Motivation">
      <t>BGP FlowSpec couples the advertisement of NLRI-specific match
      conditions with the forwarding instance to which the filter is
      attached. This makes sense since BGP FlowSpec advertisements are most
      commonly generated, or at least verified, by human operators. The
      operator finds it intuitive to configure match conditions as
      human-readable values, native to each address family.</t>

      <t>It is much friendlier, for instance, to define a filter that matches
      a source address of 192.168.1.1/32 than it is to work with the
      equivalent binary representation of that IPv4 address. Further, it is
      easier to use field names such as 'IPv4 source address' as part of the
      match condition than it is to demarcate that field using byte and bit
      offsets.</t>

      <t>However, there are a number of use cases that benefit from the
      latter, more machine-readable approach.</t>

      <section title="Volumetric attacks">
        <t>Launching a DDoS is easier and more cost-effective than ever. The
        will to attack matters more than wherewithal. Those with the
        inclination can initiate one from the <eref
        target="https://github.com/649/Memcrashed-DDoS-Exploit">comfort of
        their homes</eref>, or even buy <eref
        target="https://www.facebook.com/PutinStresser/photos/a.1687498801469198/2024483917770683/?type=3">DDoS-as-a-Service</eref>,
        complete with 24x7 support and flexible payment plans.</t>

        <t>Despite their effectiveness, such attacks are easily thwarted -
        once identified. The challenge lies in fishing out a generally
        unvarying attack signature from a data stream. Machine analysis may
        prove superior here, given the size of input involved. The resulting
        pattern may not lie within a well-defined field; even if it does,
        it may be a more straightforward workflow to have machine
        analysis result in a machine-readable filter.</t>
      </section>

      <section title="Tunneled traffic">
        <t>Tunnels continue to proliferate due to the benefits they provide.
        They can help reduce state in the underlay network. Tunnels allow
        bypassing routing decisions of the transit network. Traffic is often
        tunneled to obscure or secure it. Common tunnel types
        include IPsec <xref target="RFC4301"/>, Generic Routing Encapsulation
        (GRE) <xref target="RFC2890"/>, Virtual eXtensible Local Area Network
        (VXLAN) <xref target="RFC7348"/>, and GPRS Tunneling Protocol (GTP) <xref
        target="3GPP.29.281"/>.</t>

        <t>By definition, transit nodes that are not the endpoints of the
        tunnel hold no attendant control or management plane state. These very
        qualities make it challenging to filter tunneled traffic at
        non-endpoints. Often though, the forwarding hardware at these
        transit-only nodes is capable of reading the byte stream that
        comprises the protocol being tunneled. Despite this capability, it is
        usually infeasible to filter based on the content of this passenger
        protocol's header since BGP FlowSpec does not provide the operator a
        way to address arbitrary locations within a packet.</t>
      </section>

      <section title="Non-IP traffic">
        <t>Not all traffic is forwarded as IP packets. Layer 2 services
        abound, including flavors of BGP-signaled Ethernet VPNs such as
        BGP-EVPN, BGP-VPLS, and FEC 129 VPWS (LDP-signaled VPWS with BGP
        Auto-Discovery).</t>

        <t>Ongoing efforts such as <xref
        target="I-D.ietf-idr-flowspec-l2vpn"/> offer one approach, which is to
        add layer 2 fields as additional match conditions. This may suffice if
        a filter needs to be applied only to layer 2, or only to layer 3
        header fields.</t>
      </section>
    </section>

    <section title="Details">
      <section title="Flexible Match Conditions">
        <t>We define a new FlowSpec component, Type TBD, named "Flexible Match
        Conditions".</t>

        <t>Encoding: &lt;type (1 octet), op, value&gt;</t>

        <!-- note
    I eliminated the "+" since bitmask matching doesn't lend itself well to
    complicated boolean algebra -->

        <t>It contains a single {operator, value} tuple that is used to match
        packets according to the rules given below.</t>

        <section title="Operator">
          <t>The operator field is encoded as:</t>

          <figure align="left">
            <artwork align="center">
              <![CDATA[
 0                   1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|v| a |u  |bit o|  byte offset  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      ]]>
            </artwork>
          </figure>

          <t>
            <list hangIndent="4" style="hanging">
              <t hangText="v -">Type of value being matched, <xref
              target="string_comparison">string comparison</xref> if this bit
              is set, and <xref target="numeric_range">numeric range</xref> if
              unset.</t>

              <t hangText="a -">Anchor. A 2-bit unsigned integer whose value
              indicates where in the packet the match should start. To avoid
              ambiguity with tunneled packets, the match SHOULD be anchored at
              the outermost header. An example is given <xref
              target="example1">below</xref>.</t>
            </list>
          </t>

          <!-- texttable appears to be ignored w/in list context -->

          <texttable title="Anchor Field Values">
            <ttcol align="center">Value</ttcol>

            <ttcol align="center">Symbolic Name</ttcol>

            <ttcol>Match start</ttcol>

            <c>0</c>

            <c>d</c>

            <c>Layer 2 (d)ata-link layer Ethernet header</c>

            <c>1</c>

            <c>i</c>

            <c>Layer 3 (I)Pv4/IPv6 header</c>

            <c>2</c>

            <c>t</c>

            <c>Layer 4 TCP/UDP (t)ransport header</c>

            <c>3</c>

            <c>p</c>

            <c>Layer 4 (p)rotocol-specific payload</c>
          </texttable>

          <t>
            <list hangIndent="4" style="hanging">
              <t hangText="u -">Reserved. MUST be set to 0. MUST be ignored on
              receipt.</t>

              <!-- the lower bit could be
      used in the future as an offset sign bit -->
            </list>
          </t>

          <t>
            <list hangIndent="4" style="hanging">
              <t hangText="bit offset -">A 3-bit unsigned integer indicating
              how many bits to ignore, following the byte offset.</t>

              <t hangText="byte offset -">An 8-bit unsigned integer indicating
              how many bytes to ignore after the match start, as determined by
              the anchor value.</t>
            </list>
          </t>
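
          <t>As a non-normative illustration, the operator layout above can be
          sketched in Python. The function names, the ANCHORS lookup table, and
          the use of network byte order are assumptions made for this sketch;
          they are not part of the encoding defined in this document.</t>

          <figure align="left">
            <artwork align="left"><![CDATA[
```python
import struct

# Symbolic anchor names and values, taken from the anchor table.
ANCHORS = {"d": 0, "i": 1, "t": 2, "p": 3}


def pack_operator(string_match, anchor, bit_offset, byte_offset):
    """Encode the 16-bit operator: v(1) a(2) u(2) bit offset(3) byte offset(8).

    Network byte order is an assumption of this sketch.
    """
    if not 0 <= bit_offset <= 7:
        raise ValueError("bit offset must fit in 3 bits")
    if not 0 <= byte_offset <= 255:
        raise ValueError("byte offset must fit in 8 bits")
    word = (
        (int(bool(string_match)) << 15)  # v: most significant bit
        | (ANCHORS[anchor] << 13)        # a: next two bits
        | (bit_offset << 8)              # bit offset: three bits
        | byte_offset                    # byte offset: low octet
    )
    # The two reserved 'u' bits (mask 0x1800) are left as zero.
    return struct.pack("!H", word)


def unpack_operator(data):
    """Decode the operator field; the reserved 'u' bits are ignored."""
    (word,) = struct.unpack("!H", data[:2])
    return {
        "string_match": bool(word >> 15),
        "anchor": (word >> 13) & 0x3,
        "bit_offset": (word >> 8) & 0x7,
        "byte_offset": word & 0xFF,
    }
```
]]></artwork>
          </figure>

          <t>Under these assumptions, a string comparison anchored at the
          transport header with a bit offset of 3 and a byte offset of 8 would
          be encoded as the two octets 0xC3 0x08.</t>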
        </section>

        <section title="Value">
          <t>The operator field indicates where to start matching; by
          contrast, the value operand indicates what to match and where to
          stop matching. The value operand MUST be of the type indicated by
          the 'v' bit, as signaled in the operator. As a result, it takes
          one of two forms: string comparison or numeric range comparison.</t>

          <t>The length of the numeric range is constant: it uses two 64-bit
          fields. A string comparison uses two 128-bit fields, preceded by a
          length field that indicates how much of the prefix and mask fields
          to use in the AND operation. This is deemed sufficient for stateless
          inspection and practical for efficient hardware forwarding plane
          implementations.</t>

          <section anchor="string_comparison" title="String Comparison">
            <figure align="left">
              <artwork align="center">
                <![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    len    |                      reserved                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                                                               |
+                             prefix                            +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                                                               |
+                              mask                             +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      ]]>
              </artwork>
            </figure>

            <t>
              <list hangIndent="9" style="hanging">
                <t hangText="len -">Indicates the number of corresponding bits
                in the prefix and mask fields to read. The number of bits is
                computed as ((len + 1) &lt;&lt; 1), which yields even
                values from 2 to 128.</t>

                <t hangText="prefix -">Provides a bit string to be matched.
                The prefix and mask fields are bitwise AND'ed to create a
                resulting pattern. The number of bits used in the AND
                operation is indicated by the preceding length field.</t>

                <t hangText="mask -">Paired with the prefix field to create a
                bit string match. An unset bit is treated as a 'do not care'
                bit in the corresponding position in the prefix field. When a
                bit is set in the mask, the value of the bit in the
                corresponding location in the prefix field must match
                exactly.</t>
              </list>
            </t>

            <t>Implementations MUST extract only the number of bits indicated
            by the preceding length field from the prefix and mask
            fields.</t>
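
            <t>The following Python fragment is a non-normative sketch of how
            the length, prefix, and mask fields might be applied. It assumes
            that the bits to be used are the leading (most significant) bits
            of the 128-bit prefix and mask fields, that a packet too short to
            hold the window simply does not match, and that start_bit is the
            absolute bit position already derived from the operator. The
            function names are hypothetical.</t>

            <figure align="left">
              <artwork align="left"><![CDATA[
```python
def effective_length(len_field):
    """Map the 6-bit len field to a bit count: (len + 1) << 1."""
    if not 0 <= len_field <= 63:
        raise ValueError("len must fit in 6 bits")
    return (len_field + 1) << 1


def string_match(packet, start_bit, len_field, prefix, mask):
    """Compare a window of the packet against prefix under mask.

    prefix and mask are the 128-bit fields given as integers. Only their
    leading bits are used (an assumption of this sketch).
    """
    nbits = effective_length(len_field)
    total_bits = len(packet) * 8
    if start_bit < 0 or start_bit + nbits > total_bits:
        return False  # assumed behavior: too-short packets do not match
    as_int = int.from_bytes(packet, "big")
    # Extract nbits starting at start_bit, counting from the first bit.
    shift = total_bits - start_bit - nbits
    window = (as_int >> shift) & ((1 << nbits) - 1)
    want = prefix >> (128 - nbits)
    care = mask >> (128 - nbits)
    # Bits that are unset in the mask are 'do not care' positions.
    return (window & care) == (want & care)
```
]]></artwork>
            </figure>

            <t>For instance, with len set to 7 the window is 16 bits wide, and
            a mask whose leading octet is 0xFF constrains only the first of
            those two octets.</t>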
          </section>

          <section anchor="numeric_range" title="Numeric Range Comparison">
            <figure align="left">
              <artwork align="center">
                <![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                              low                              +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                              high                             +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      ]]>
              </artwork>
            </figure>

            <t>
              <list hangIndent="7" style="hanging">
                <t hangText="low -">The low value of the desired inclusive
                numeric range. This value MUST be numerically lower than the
                high value.</t>

                <t hangText="high -">The high value of the desired inclusive
                numeric range. This value MUST be numerically higher than the
                low value.</t>
              </list>
            </t>
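
            <t>As a non-normative sketch, the two 64-bit fields could be
            encoded and evaluated as shown below. Network byte order and the
            function names are assumptions of this example; how the value is
            extracted from the packet is left outside the sketch.</t>

            <figure align="left">
              <artwork align="left"><![CDATA[
```python
import struct


def pack_range(low, high):
    """Encode low and high as two unsigned 64-bit fields.

    Network byte order is an assumption of this sketch.
    """
    if not 0 <= low < high < 2 ** 64:
        raise ValueError("need 0 <= low < high < 2**64")
    return struct.pack("!QQ", low, high)


def in_range(value, encoded):
    """Return True if value lies within the inclusive encoded range."""
    low, high = struct.unpack("!QQ", encoded[:16])
    return low <= value <= high
```
]]></artwork>
            </figure>

            <t>Because the range is inclusive, both the low and the high
            values themselves are treated as matches in this sketch.</t>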
          </section>
        </section>

        <section anchor="example1" title="Example">
          <t>As an example, consider that the canonical <xref
          target="RFC7348">Virtual eXtensible Local Area Network
          (VXLAN)</xref> packet has the following headers:</t>

          <t>
            <list style="symbols">
              <t>Outer Ethernet Header: Source MAC address of the originating
              VXLAN Tunnel End Point (VTEP).</t>

              <t>Outer IPv4/IPv6 Header: Source IP address of the originating
              VTEP.</t>

              <t>Outer UDP Header: Source port chosen to provide entropy for
              load balancing; the destination port is the IANA-assigned VXLAN
              port 4789.</t>

              <t>VXLAN Header: Used to identify a specific VXLAN overlay
              network.</t>

              <t>Inner Ethernet Header and payload: Original MAC frame being
              encapsulated.</t>
            </list>
          </t>

          <t>The following table outlines where the match would start based on
          the anchor setting:</t>

          <texttable>
            <ttcol align="center">Anchor value</ttcol>

            <ttcol>Match start</ttcol>

            <c>d</c>

            <c>Outer Ethernet Header</c>

            <c>i</c>

            <c>Outer IPv4/IPv6 Header</c>

            <c>t</c>

            <c>Outer UDP Header</c>

            <c>p</c>

            <c>VXLAN Header</c>
          </texttable>
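
          <t>To make the table concrete, the following non-normative Python
          sketch computes where each anchor would fall for one common case.
          The header sizes are assumptions (untagged Ethernet, IPv4 without
          options, UDP); a real implementation would derive them from the
          packet. The names used are hypothetical.</t>

          <figure align="left">
            <artwork align="left"><![CDATA[
```python
# Assumed header sizes in bytes: untagged Ethernet, IPv4 without
# options, and UDP. These are illustrative, not fixed by this document.
HEADER_LENGTHS = [("d", 14), ("i", 20), ("t", 8)]


def anchor_starts():
    """Return the byte position at which each anchor's match starts."""
    starts = {}
    position = 0
    for name, length in HEADER_LENGTHS:
        starts[name] = position
        position += length
    starts["p"] = position  # payload follows the transport header
    return starts


def absolute_bit(anchor, byte_offset, bit_offset):
    """Translate an (anchor, byte offset, bit offset) triple to a bit index."""
    return (anchor_starts()[anchor] + byte_offset) * 8 + bit_offset
```
]]></artwork>
          </figure>

          <t>Under these assumptions the VXLAN header starts at byte 42 of the
          frame. The VXLAN Network Identifier, which begins 4 bytes into that
          header, would be addressed with anchor p, a byte offset of 4, and a
          bit offset of 0.</t>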
        </section>
      </section>

      <section title="Error Handling">
        <t>Malicious, misbehaving, or misunderstanding implementations could
        advertise semantically incorrect values. Care must be taken to
        minimize fallout from attempting to parse such data. Any well-behaved
        implementation SHOULD verify that the packet undergoing a match is
        at least (match start header length + byte offset + bit offset +
        value length) long.</t>
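
        <t>One possible reading of this check is sketched below in Python,
        purely as an illustration. It assumes that "match start header
        length" means the byte position of the anchored header within the
        packet, that the byte offset is counted in bytes, and that the bit
        offset and value length are counted in bits. The function names are
        hypothetical.</t>

        <figure align="left">
          <artwork align="left"><![CDATA[
```python
def min_packet_bytes(anchor_start, byte_offset, bit_offset, value_bits):
    """Smallest packet length, in bytes, that can hold the whole match.

    anchor_start and byte_offset are in bytes; bit_offset and value_bits
    are in bits (unit choices are assumptions of this sketch).
    """
    total_bits = (anchor_start + byte_offset) * 8 + bit_offset + value_bits
    return (total_bits + 7) // 8  # round up to whole bytes


def long_enough(packet, anchor_start, byte_offset, bit_offset, value_bits):
    """Return True if the packet can be matched without reading past its end."""
    needed = min_packet_bytes(anchor_start, byte_offset, bit_offset, value_bits)
    return len(packet) >= needed
```
]]></artwork>
        </figure>

        <t>A filter that fails this check for a given packet would, in this
        sketch, simply be treated as not matching that packet.</t>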
      </section>

      <section anchor="Security" title="Security Considerations">
        <t>This document introduces no additional security considerations
        beyond those already covered in <xref target="RFC5575"/>.</t>

        <!-- jgs - I am skeptical this will suffice. I'll think about if and
     how to elaborate -->
      </section>

      <section anchor="IANA" title="IANA Considerations">
        <t>IANA <!-- --> is requested to assign <!-- when procedure is
    done, update this to "has assigned", and update the various TBD
    accordingly --> a type from the First Come First Served range of the "Flow
        Spec Component Types" registry:</t>

        <texttable>
          <ttcol align="center">Type Value</ttcol>

          <ttcol align="center">Name</ttcol>

          <ttcol align="center">Reference</ttcol>

          <c>TBD</c>

          <c>Flexible Match Conditions</c>

          <c>this document</c>
        </texttable>
      </section>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>Thanks to Rafal Jan Szarecki, Sudipto Nandi, Ron Bonica, and Jeff
      Haas for their valuable comments and suggestions on this document.</t>
    </section>
  </middle>

  <!--  *****BACK MATTER ***** -->

  <back>
    <references title="Normative References">
      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.5575.xml"?>
    </references>

    <references title="Informative References">
      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.7348.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.4301.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2890.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml-3gpp/reference.3GPP.29.281.xml"?>

      <?rfc include="http://xml.resource.org/public/rfc/bibxml-ids/reference.I-D.ietf-idr-flowspec-l2vpn.xml"?>
    </references>
  </back>
</rfc>
