<?xml version="1.0" encoding="US-ASCII"?>
<!-- xml2rfc is available at http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [

  <!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
  <!ENTITY RFC2475 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2475.xml">
  <!ENTITY RFC3031 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3031.xml">
  <!ENTITY RFC3032 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3032.xml">
  <!ENTITY RFC3209 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3209.xml">
  <!ENTITY RFC3270 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3270.xml">
  <!ENTITY RFC3443 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3443.xml">
  <!ENTITY RFC4090 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4090.xml">
  <!ENTITY RFC4124 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4124.xml">
  <!ENTITY RFC4182 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4182.xml">
  <!ENTITY RFC4201 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4201.xml">
  <!ENTITY RFC4206 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4206.xml">
  <!ENTITY RFC4221 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4221.xml">
  <!ENTITY RFC5129 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5129.xml">
  <!ENTITY RFC5462 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5462.xml">
  <!ENTITY RFC5586 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5586.xml">
  <!ENTITY RFC5695 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5695.xml">
  <!ENTITY RFC6391 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6391.xml">
  <!ENTITY RFC6639 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6639.xml">

  <!ENTITY I-D.ietf-mpls-entropy-label SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-ietf-mpls-entropy-label-06">

  ]>

<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc strict="yes" ?>
<?rfc toc="yes"?>
<?rfc tocdepth="4"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<?rfc comments="yes"?>
<?rfc inline="yes" ?>

<rfc category="info" ipr="trust200902"
     docName="draft-villamizar-mpls-forwarding-00">

  <front>
    <title abbrev="MPLS Forwarding">
      MPLS Forwarding Compliance and Performance Requirements</title>

    <author role="editor"
	    fullname="Curtis Villamizar" initials="C." surname="Villamizar">
      <organization>Outer Cape Cod Network Consulting</organization>
      <address>
	<email>curtis@occnc.com</email>
      </address>
    </author>

    <author
	    fullname="Kireeti Kompella" initials="K." surname="Kompella">
      <organization>Contrail Systems</organization>
      <address>
	<email>kireeti.kompella@gmail.com</email>
      </address>
    </author>

    <date year="2012" />

    <area>Routing</area>
    <workgroup>MPLS</workgroup>

    <keyword>MPLS</keyword>
    <keyword>composite link</keyword>
    <keyword>link aggregation</keyword>
    <keyword>ECMP</keyword>
    <keyword>link bundling</keyword>
    <keyword>multipath</keyword>
    <keyword>MPLS-TP</keyword>

    <abstract>
      <t>
	This document provides guidelines for implementors regarding
	MPLS forwarding and a basis for evaluations of forwarding
	implementations.  Guidelines cover basic MPLS forwarding,
	forwarding when a deep MPLS label stack is encountered, MPLS
	UHP operations which require one or more label POP plus a
	PUSH, guidelines for hashing an MPLS stack and payload for
	multipath, and conformance and performance requirements for
	recent pseudowire and MPLS standards.
      </t>
    </abstract>
  </front>

  <middle>

    <section title="Introduction">
      <t>
	The document addresses concerns raised on the MPLS WG mailing
	list about shortcomings in implementations of MPLS forwarding.
      </t>
      <t>
	Although this document is informational, the key words "MUST",
	"MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
	"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are used.
	For those who wish to take the advice of this document, these
	keywords SHOULD be interpreted as described in
	<xref target="RFC2119">RFC 2119</xref>.  Similarly, the
	References section is split into Normative and Informative
	subsections.  In this case references which are normative for
	forwarding are listed as normative.  References which describe
	signaling only, though normative with respect to signaling,
	are listed as informative here, as they are informative with
	respect to MPLS forwarding.
      </t>

      <section title="Apparent Misconceptions">
	<t>
	  In early generations of forwarding silicon (which may now be
	  behind us), there apparently were some misconceptions about
	  MPLS.  The following statements may clear up some of these
	  misconceptions.
	  <list style="numbers">
	    <t>
	      There are practical reasons to have more than one or two
	      labels in an MPLS label stack.  Under some circumstances
	      the label stack can become quite deep.
	      See <xref target="sect.basics" />.
	    </t>
	    <t>
	      The label stack must be considered to be arbitrarily
	      deep.  If a the bottom of the label stack cannot be
	      found, but sufficient number of labels exist to forward,
	      an LSR MUST forward the packet.  An LSR MUST NOT assume
	      the packet is malformed unless the end of packet is
	      found before bottom of stack.
	      See <xref target="sect.basics" />.
	    </t>
	    <t>
	      In networks where deep label stacks are encountered,
	      they are not rare.  Full packet rate performance is
	      required regardless of label stack depth, except where
	      multiple POP operations are required.
	      See <xref target="sect.basics" />.
	    </t>
	    <t>
	      Research has shown that long bursts of short packets
	      with 40 byte or 44 byte common IP payload sizes in these
	      bursts.  This is due to TCP ACK compression
	      <xref target="ACK-compression" />.
	      <list style="letters">
		<t>
		  A forwarding engine SHOULD, if practical, be able to
		  sustain an arbitarily long sequence of small packets
		  arriving at full interface rate.
		</t>
		<t>
		  If indefinite full packet rate for small packets is
		  not practical, a forwarding engine MUST be able to
		  buffer a long sequence of small packets inbound to
		  the decision engine and sustain full interface rate
		  for some reasonable average packet rate.
		</t>
	      </list>
	      See <xref target="sect.pkt-rate" />.
	    </t>
	    <t>
	      For practical reasons, support for pseudowire control
	      word SHOULD be considered mandatory by the implementor
	      and system designer.  Deployment of pseudowire control
	      word MAY be considered optional.  See
	      <xref target="sect.pw-cw" />.
	    </t>
	    <t>
	      For practical reasons, support for adding a pseudowire
	      Flow Label <xref target="RFC6391" /> SHOULD be
	      considered mandatory by the implementor and system
	      designer.  Deployment of this features MAY be considered
	      optional.  See <xref target="sect.fat-pw" />.
	    </t>
	    <t>
	      For practical reasons, support for adding a MPLS Entropy
	      Label <xref target="I-D.ietf-mpls-entropy-label" />
	      SHOULD be considered mandatory by the implementor and
	      system designer.  Deployment of this features MAY be
	      considered optional.  See <xref target="sect.entropy" />.
	    </t>
	  </list>
	</t>
      </section>

      <section title="Target Audience">
	<t>
	  This document is intended for multiple audiences:
	  implementor (implementing MPLS forwarding in silicon or in
	  software); systems designer (putting together a MPLS
	  forwarding systems); deployer (running an MPLS network).
	  These guidelines are intended to serve the following
	  purposes:
	</t>
	<t>
	  <list style="numbers">
	    <t>
	      Explain what to do and what not to do when a deep label
	      stack is encountered. (audience: implementor)
	    </t>
	    <t>
	      Highlight pitfalls to look for when implementing an MPLS
	      forwarding chip. (audience: implementor)
	    </t>
	    <t>
	      Provide a checklist of features and performance
	      specificiations to request.  (audience: systems
	      designer, deployer)
	    </t>
	    <t>
	      Provide a set of tests to perform.  (audience: systems
	      designer, deployer).
	    </t>
	  </list>
	</t>
	<t>
	  The implementor, systems designer, and deployer have a
	  transitive supplier customer relationship.  It is in the best
	  interest of the supplier to review their product against their
	  customer's checklist and customer's customer's checklist if
	  applicable.
	</t>
      </section>

    </section>

    <section anchor="sect.issues" title="Forwarding Issues">

      <t>
	A brief review of forwarding issues is provided in the
	subsections that follow.  This section provides some
	background on why some of these requirements exist.  The
	questions to ask of suppliers and testing is covered in the
	following sections, <xref target="sect.ask" /> and <xref
	target="sect.test" />.
      </t>

      <section anchor="sect.basics" title="Forwarding Basics">
	<t>
	  Basic MPLS architecture and MPLS encapsulation, and
	  thereforepacket forwarding is defined in <xref
	  target="RFC3031" /> and <xref target="RFC3032" />.  RFC3031
	  and RFC3032 are somewhat LDP centric.  RSVP-TE supports
	  traffic engineering (TE) and fast reroute, features that LDP
	  lacks.  The base document for RSVP-TE based MPLSis <xref
	  target="RFC3209" />.
	</t>
	<t>
	  A few RFCs update RFC3032.  Those with impact on forwarding
	  include the following.
	  <list style="numbers">
	    <t>
	      TTL processing is clarified in <xref target="RFC3443" />.
	    </t>
	    <t>
	      The use of MPLS Explicit NULL is modified in <xref
	      target="RFC4182" />.
	    </t>
	    <t>
	      Diffserv is supported by <xref target="RFC3270" /> and
	      <xref target="RFC4124" />.  The "EXP" field is renamed to
	      "Traffic Class" in <xref target="RFC5462" />, removing any
	      misconception that it was available for experimentation or
	      could be ignored.
	    </t>
	    <t>
	      ECN is supported by <xref target="RFC5129" />.
	    </t>
	    <t>
	      The MPLS G-ACh and GAL are defined in <xref
	      target="RFC5586" />.
	    </t>
	  </list>
	</t>
	<t>
	  A few RFCs update RFC3209.  Those that are listed as
	  updating RFC3209 generally impact only RSVP-TE signaling.
	  Forwarding is modified by major extension built upon
	  RFC3209.  Some of these extensions are discussed in
	  following subsections.
	</t>

	<section anchor="sect.early-deep"
		 title="Early Uses of Multiple Label Stack Entries">
	  <t>
	    MPLS deployments in the early part of the prior decade
	    (circa 2000) tended to support either LDP or RSVP-TE.  LDP
	    was favored by some for its ability to scale close to the
	    network edges without adding deployment complexity.  RSVP-TE
	    was favored where traffic engineering or fast reroute were
	    considered important.
	  </t>
	  <t>
	    The use of MPLS FRR <xref target="RFC4090" /> added a second
	    label to MPLS traffic, but only when FRR protection was in
	    use.
	  </t>
	  <t>
	    At least one major service provider made use of LDP over
	    RSVP-TE in their core network in the circa 2000-2005 time
	    frame.  LDP supported VPN services to the provider edges.
	    RSVP-TE provided TE and FRR in the core.  This yields two
	    labels on nearly all packets in the core.  They also used
	    FRR which yields three labels on a large subset of traffic
	    while FRR protection is active.  VPNs added yet another
	    label, bringing the label stack depth (with FRR) to four.
	  </t>
	</section>

	<section anchor="sect.link-bundle" title="MPLS Link Bundling">
	  <t>
	    MPLS Link Bundling was the first RFC to address the need for
	    multiple parallel links between nodes <xref target="RFC4201"
	    />.  MPLS Link Bundling is notable in that it tried not to
	    change MPLS forwarding, except in specifying the "All-Ones"
	    component link.  MPLS Link Bundling is seldom if ever
	    deployed.  Instead multipath techniques described in <xref
	    target="sect.multipath" /> are used.
	  </t>
	</section>

	<section anchor="sect.hierarchy" title="MPLS Hierarchy">
	  <t>
	    MPLS hierarchy is defined in <xref target="RFC4206" />.
	    Although RFC4206 is considered part of GMPLS, the Packet
	    Switching Capable (PSC) portion of the MPLS hierarchy are
	    applicable to MPLS and may be supported in an otherwise
	    GMPLS free implementation.  The MPLS PSC hierarchy remains
	    the most likely means of providing further scaling in an
	    RSVP-TE MPLS network, particularly where the network is
	    designed to provide RSVP-TE connectivity to the edges.  This
	    is the case for envisioned MPLS-TP networks.  The use of the
	    MPLS PSC hierarchy can add as many as four labels to a label
	    stack, though it is likely that only one layer of PSC will
	    be used in the near future.
	  </t>
	</section>
	
      </section>

      <section anchor="sect.pkt-rate" title="Packet Rates">
	<t>
	  While average packet size of Internet traffic may be large,
	  long sequences of small packets have both been predicted in
	  theory and observed in practice.  Traffic compression and TCP
	  ACK compression can conspire to create long sequences of
	  packets of 40-44 bytes in payload length.  If carried over
	  Ethernet, the 64 byte minimum payload applies, yielding a
	  packet rate of approximately 150 Mpps (million packets per
	  second) for the duration of the burst.  The peak rate is
	  higher for other encapsulations, as high as 250 Mpps.
	</t>
	<t>
	  The loss of some TCP ACK packets are not the primary concern
	  when such a burst occurs.  When a burst occurs, any other
	  packets, regardless of packet length are dropped once input
	  buffers are exceeded.  Buffers in front of the packet decision
	  engine are often very small.  <!-- can be less than one packet -->
	</t>
	<t>
	  Internet service providers and content providers generally
	  specify full rate forwarding with 40 byte payload packets as a
	  requirement.  This requirement often can be waived if the
	  provider can be convinced that when long sequence of short
	  packets occur no packets will be dropped.
	</t>
	<t>
	  With adequate buffers before the packet decision engine, an
	  LSR can absorb a long sequence of short packets.  Even if the
	  output is slowed to the point where light congestion occurs,
	  the packets, having cleared the decision process, can make use
	  of larger VOQ or output side buffers and be dealt with
	  according to configured QoS treatment, rather than dropped
	  completely at random.
	</t>
	<t>
	  Packet rate requirements apply regardless of which network
	  tier equipment is deployed in.  Whether deployed in the
	  network core or near the network edges, packets must be
	  processed at full line rate or with sufficient buffering
	  prior to the packet decision engine.
	</t>
      </section>

      <section anchor="sect.multipath" title="MPLS Multipath Techniques">
	<t>
	  In any large provider, service providers and content
	  providers, hash based multipath techniques are used in the
	  core.  In many of these providers hash based multipath is
	  used in the edge as well and in some cases the metro.
	</t>
	<t>
	  The most common multipath techniques are ECMP applied at
	  the IP forwarding level, Ethernet LAG with inspection of the
	  IP payload, and multipath on links carrying both IP and
	  MPLS, where the IP header is inspected below the MPLS label
	  stack.  In most core networks, the vast majority of traffic
	  is MPLS encapsulated.
	</t>
	<t>
	  In order to support an adequately even load distribution
	  across multiple links, IP addresses must be used.  Common
	  practice today is to reinspect the IP addresses at each LSR
	  and use the label stack and IP addresses in a hash performed
	  at each LSR.
	</t>
	<t>
	  The use of this technique is so ubiquitous in large core
	  networks that lack of support for multipath makes any
	  product unsuitable for use in large core networks.  This
	  will continue to be the case in the near future, even as
	  deployment of MPLS Entropy Label begins to relax the core
	  LSR multipath performance requirements given the existing
	  deployed base of edge equipment without the ability to add
	  an Entropy Label.
	</t>
	<t>
	  A generation of edge equipment supporting the ability to add
	  an MPLS Entropy Label is needed before the performance
	  requirements for core LSR can be relaxed.  However, it is
	  likely that two generations of deployment in the future will
	  allow core LSR to support full packet rate only when a
	  relatively small number of MPLS labels need to be inspected
	  before hashing.  For now, don't count on it.
	</t>

	<section anchor="sect.pw-cw" title="Pseudowire Control Word">
	  <t>
	    Within the core of a network some form of multipath is
	    almost certain to be used.  Multipath techniques deployed
	    today are likely to be looking beneath the label stack for
	    an opportunity to hash on IP addresses.
	  </t>
	  <t>
	    A pseudowire encapsulated at a network edge must have a
	    means to prevent reordering within the core if the
	    pseudowire will be crossing a network core, or any part of
	    a network topology where multipath is used.
	  </t>
	  <t>
	    Not supporting the ability to encapsulate a pseudowire
	    with a control word may lock a product out from
	    consideration.  A pseudowire capability without control
	    word support might be sufficient for applications which
	    are strictly both intra-metro and low bandwidth.  However
	    a provider with other applications will very likely not
	    tolerate having equipment which can only support a subset
	    of their pseudowire needs.
	  </t>
	</section>

	<section anchor="sect.fat-pw" title="Pseudowire Flow Label">
	  <t>
	    Unlike a pseudowire control word, a pseudowire flow label
	    <xref target="RFC6391" />, is required only for relatively
	    large capacity pseudowires.  There are many cases where a
	    pseudowire flow label makes sense.  Any service such as a
	    VPN which carries IP traffic within a pseudowire can make
	    use of a pseudowire flow label.
	  </t>
	  <t>
	    Any pseudowire which does not carry a flow label is in
	    effect a single microflow (in <xref target="RFC2475" />
	    terms).  Where multipath makes use of a simple hash (see
	    <xref target="sect.multipath" />) the presense of large
	    microflows that consumes 10% of the capacity of a
	    potentially congested link, can upset the traffic balance
	    and in effect reduce the effective capacity of the entire
	    microflow by far more than 10%.  Therefore is a network
	    where a significant number of parallel 10 Gb/s links exists,
	    even a 1 Gb/s pseudowire should carry a flow label if
	    possible.
	  </t>
	</section>

	<section anchor="sect.entropy" title="MPLS Entropy Label">
	  <t>
	    The MPLS Entropy Label simplifies flow group identification
	    <xref target="I-D.ietf-mpls-entropy-label" /> in the network
	    core.  Prior to the MPLS Entropy Label core LSR needed to
	    inspect the entire label stack and often the IP headers to
	    provide an adequate distribution of traffic when using
	    multipath techniques (see <xref target="sect.multipath"
	    />).  With the use of MPLS Entropy Label, a hash can be
	    performed closer to network edges, placed in the label
	    stack, and used within the network core.
	  </t>
	  <t>
	    The MPLS Entropy Label avoid full label stack and payload
	    inspection within the core where performance levels are most
	    difficult to acheive (see <xref target="sect.pkt-rate" />).
	    The label stack inspection can be terminated as soon as the
	    first Entropy Label is encounted, which is generally after a
	    small number of labels are inspected.
	  </t>
	  <t>
	    In order to provide these benefits in the core, LSR closer
	    to the edge must be capable of adding an entropy label.
	    This support may not be required in the access tier, the
	    tier closest to the customer, but is likely to be required
	    in the edge or the border to the network core.  LSR peering
	    with external networks will also need to be able to add an
	    Entropy Label.
	  </t>
	</section>

      </section>

      <section anchor="sect.tp-uhp" title="MPLS-TP and UHP">
	<t>
	  MPLS-TP introduces forwarding demands that will be extremely
	  difficult to meet in a core network.  Most troublesome is
	  the requirement for Ultimate Hop Popping (UHP, the opposite
	  of Penultimate Hop Popping or PHP).  Using UHP opens the
	  possibility of one or more MPLS POP operation plus an MPLS
	  SWAP operation for each packet.  The potential for multiple
	  lookups and multiple counter instances per packet exists.
	</t>
	<t>
	  As networks grow and tunneling of LDP LSPs into RSVP-TE LSPs
	  is used, and/or RSVP-TE hierarchy is used, the requrement to
	  perform one or two or more MPLS POP operations plus a MPLS
	  SWAP operation (and possibly a PUSH or two) increases.  If
	  MPLS-TP LM (link monitoring) OAM is enabled at each layer,
	  then a packet and byte count must be maintained for each POP
	  and SWAP operation.
	</t>
      </section>

    </section>

    <section anchor="sect.ask"
	     title="Questions for Suppliers">
      <t>
	The following questions should be asked of a supplier.  These
	questions are grouped into broad categories.
      </t>
      <t>
	<list style="hanging" hangIndent="4">
	  <t hangText="Basic Compliance">
	    <vspace blankLines="0" />
            <list counter="q" hangIndent="4" style="format Q#%d">
	      <t>
		Can the implementation forward packets with an
		arbitrarily large stack depth?
	      </t>
	    </list>
	  </t>
	  <t hangText="Basic Performance">
	    <vspace blankLines="0" />
            <list counter="q" hangIndent="4" style="format Q#%d">
	      <t>
		Can very small packets be forwarded at full line rate
		on all interfaces indefinitely?
	      </t>
	      <t>
		Customers must decide whether to relax the prior
		requirement and to what extent.  If the answer to the
		prior question is "no", then:
		<list style="letters">
		  <t>
		    What is the smallest packet size where full line
		    rate forwarding can be supported?
		  </t>
		  <t>
		    What is the longest burst of full rate small
		    packets that can be supported?
		  </t>
		</list>
	      </t>
	      <t>
		How many POP operations can be supported along with a
		SWAP operation at full line rate while maintaining
		per LSP packet and byte counts for each POP and SWAP?
		This requirement is particularly relevant for MPLS-TP.
	      </t>
	      <t>
		For a worst case where all packets arrive on one LSP,
		what is the counter overflow time?  Are any means
		provided to avoid polling all counters at short
		intervals?  This applies to both MPLS and MPLS-TP.
	      </t>
	    </list>
	  </t>
	  <t hangText="Multipath Capabilities and Performance">
	    <vspace blankLines="1" />
	    Multipath capabilities do not apply to MPLS-TP but apply
	    to MPLS and apply if MPLS-TP is carried in MPLS.
            <list counter="q" hangIndent="4" style="format Q#%d">
	      <t>
		How many MPLS labels can be included in a hash based
		on the MPLS label stack?
	      </t>
	      <t>
		Is packet rate performance decreased beyond some
		number of labels?
	      </t>
	      <t>
		Can the IP addresses below the MPLS stack be used in
		the hash?
	      </t>
	      <t>
		At what maximum MPLS label stack depth can Bottom of
		Stack and an IP header appear without impacting packet
		rate performance?
	      </t>
	      <t>
		Are reserved labels included in the label stack hash?
		They MUST NOT be included.
	      </t>
	    </list>
	  </t>
	  <t hangText="Pseudowire Capabilities and Performance">
	    <vspace blankLines="0" />
            <list counter="q" hangIndent="4" style="format Q#%d">
	      <t>
		Is the pseudowire control word supported?
	      </t>
	      <t>
		What is the maximum rate of pseudowire encapsulation
		and decasulation?  Apply the same questions as in
		Based Performance for any packet based pseudowire such
		as IP VPN or Ethernet.
	      </t>
	      <t>
		Does inclusion of a pseudowire control word impact
		performance?
	      </t>
	      <t>
		Are flow labels supported?
	      </t>
	      <t>
		If so, what fields are hashed on for the flow label
		for different types of pseudowires?
	      </t>
	      <t>
		Does inclusion of a flow label impact performance?
	      </t>
	    </list>
	  </t>
	  <t hangText="Entropy Label Support and Performance">
	    <vspace blankLines="0" />
            <list counter="q" hangIndent="4" style="format Q#%d">
	      <t>
		Can an entropy label be added when acting as in
		ingress LER and can it be removed when acting as an
		egress LER?
	      </t>
	      <t>
		If so, what fields are hashed on for the entropy label?
	      </t>
	      <t>
		Does adding or removing an entropy label impact packet
		rate performance?
	      </t>
	      <t>
		Can an entropy label be detected in the label stack,
		used in the hash, and properly terminate the search
		for further information to hash on?
	      </t>
	      <t>
		Does using an entropy label have any negative impact
		on performance?  It should have no impact or a
		positive impact.
	      </t>
	    </list>
	  </t>
	</list>
      </t>
    </section>

    <section anchor="sect.test"
	     title="Forwarding Compliance and Performance Testing">
      <t>
	Packet rate performance of equipment supporting a large number
	of 10 Gb/s or 100 Gb/s links is not possible using desktop
	computers or workstations.  The use of high end workstations
	as a source of test traffic was barely viable 20 years ago,
	but is no longer at all viable.  Though custom microcode has
	been used on specialized router forwarding cards to serve the
	purpose of generating test traffic and measuring it, for the
	most part performance testing will require specialized test
	equipment.  There are multiple sources of suitable equipment.
	<!-- test equipment guys will love this paragraph. -->
      </t>
      <t>
	The set of tests listed here do not correspond one-to-one to
	the set of questions in <xref target="sect.ask" />.  The same
	categorization is used and these tests largely serve to
	validate answers provided the the prior questions, and can
	also provide answers where a supplier is unwilling to disclose
	compliance or performance.
	<!-- all too common -->
      </t>
      <t>
	Performance testing is the domain of the IETF Benchmark
	Methodology Working Group (BMWG).  Below are brief
	descriptions of conformance and performance tests.  Some very
	basic tests are specified in <xref target="RFC5695" /> which
	partially cover only the basic performance test T#2.
      </t>
      <t>
	The following tests should be performed by the systems
	designer, or deployer, or performed by the supplier on their
	behalf if it is not practical for the potential customer to
	perform the tests directly.  These tests are grouped into
	broad categories.
      </t>
      <t>
	<list style="hanging" hangIndent="4">
	  <t hangText="Basic Compliance">
	    <vspace blankLines="0" />
            <list counter="t" hangIndent="4" style="format T#%d">
	      <t>
		Test forwarding at a high rate for packets with
		varying number of label entriess.  While packets with
		more than a dozen label entriess are unlikely to be
		used in any practical scenario today, it is useful to
		know if limitations exists.
	      </t>
	    </list>
	  </t>
	  <t hangText="Basic Performance">
	    <vspace blankLines="0" />
            <list counter="t" hangIndent="4" style="format T#%d">
	      <t>
		Test packet forwarding at full line rate with small
		packets.  See <xref target="RFC5695" />.  The most
		likely case to fail is the smallest packet size.
	      </t>
	      <t>
		If the prior tests did not succeed for all packat
		sizes, then perform the following tests.
		<list style="letters">
		  <t>
		    Increase the packet size by 4 bytes until a size
		    is found that can be forwarded at full rate.
		  </t>
		  <t>
		    Inject bursts of consecutive small packets into a
		    stream of larger packets.  Allow some time for
		    recovery between bursts.  Increase the number of
		    packets in the burst until packets are dropped.
		    One way to accomplish this is to use a router with
		    higher priority set on the interfaces on which
		    small packets are sent to it.  The router should
		    buffer the lower priority large packets.  It is
		    best to inject the small packets to this router on
		    a faster interface (if such a thing exists), or
		    more than one interface.
		  </t>
		</list>
	      </t>
	      <t>
		Send test traffic where a SWAP operation is required.
		Also set up multiple LSP carried over other LSP where
		the device under test (DUT) is the egress of these
		LSP.  Create test packets such that the SWAP operation
		is performed after POP operations, increasing the
		number of POP operations until forwarding of small
		packets at full line rate can no longer be supported.
		Also check to see at what point the full set of
		counters can no longer be maintained.  This
		requirement is particularly relevant for MPLS-TP.
	      </t>
	      <t>
		Send all traffic on one LSP and see if the counters
		become inaccurate.  Often counters on silicon are much
		smaller than the 64 bit counters in IETF MIB.  System
		developers should consider what counter polling rate
		is necessary to maintain accurate counters and whether
		those polling rates are practical.

		Relevant MIBs for MPLS are discussed in 
		<xref target="RFC4221" /> and
		<xref target="RFC6639" />.
	      </t>
	    </list>
	  </t>
	  <t hangText="Multipath Capabilities and Performance">
	    <vspace blankLines="1" />
	    Multipath capabilities do not apply to MPLS-TP but apply
	    to MPLS and apply if MPLS-TP is carried in MPLS.
            <list counter="t" hangIndent="4" style="format T#%d">
	      <t>
		Send traffic at a rate well exceeding the capacity of
		a single multipath component link, and where entropy
		exists only below the top of stack.  If only the top
		label is used this test will fail immediately.
	      </t>
	      <t>
		Move the labels with entropy down in the stack until
		either the full forwarding rate can no longer be
		supported or most or all packets try to use the same
		component link.
	      </t>
	      <t>
		Repeat the two tests above with the entropy contained
		in IP addresses below the label stack rather than in
		the label stack.
	      </t>
	      <t>
		Determine whether traffic that contains a pseudowire
		control word is interpreted as IP traffic.
		Information in the payload MUST NOT be used in the
		load balancing if the first nibble of the packet is
		not 4 or 6 (IPv4 or IPv6).
	      </t>
	      <t>
		Determine whether reserved labels are included in the
		label stack hash.  They MUST NOT be included.
	      </t>
	    </list>
	  </t>
	  <t hangText="Pseudowire Capabilities and Performance">
	    <vspace blankLines="0" />
            <list counter="t" hangIndent="4" style="format T#%d">
	      <t>
		Determine whether pseudowire can be set up with a
		pseudowire label and pseudowire control word added at
		ingress and the pseudowire label and pseudowire
		control word removed at egress.
	      </t>
	      <t>
		For pseudowire that contains variable length payload
		packets, repeat the packet size based performance
		tests for pseudowire ingress and egress functions.
	      </t>
	      <t>
		Repeat pseudowire performance tests with and without 
		a pseudowire control word.
	      </t>
	      <t>
		Determine whether pseudowire can be set up with a
		pseudowire label, flow label, and pseudowire control
		word added at ingress and the pseudowire label, flow
		label, and pseudowire control word removed at egress.
	      </t>
	      <t>
		Determine which payload fields are used to create the
		flow label and whether the set of fields and algorithm
		provide sufficient entropy for load balancing.
	      </t>
	      <t>
		Repeat pseudowire performance tests with flow labels
		included.
	      </t>
	    </list>
	  </t>
	  <t hangText="Entropy Label Support and Performance">
	    <vspace blankLines="0" />
            <list counter="t" hangIndent="4" style="format T#%d">
	      <t>
		Determine whether entropy labels are supported.
	      </t>
	      <t>
		Determine which fields are used to create an entropy
		label.  Labels further down in the stack, including
		entropy labels further down and IP payload where
		applicable should be used.  Determine whether the set
		of fields and algorithm provide sufficient entropy for
		load balancing.
	      </t>
	      <t>
		Repeat performance tests at LSP ingress and egress
		when entropy labels are added or removed.
	      </t>
	      <t>
		Determine whether an ELI is detected when acting as a
		midpoint LSR and whether the search for further
		information on which to base the load balancing is
		used.  Information below the entropy labe MUST NOT be
		used.
	      </t>
	      <t>
		Repeat performance tests for midpoint LSR with entropy
		labels found at various label stack depths.
	      </t>
	    </list>
	  </t>
	</list>
      </t>

    </section>

    <!-- Possibly an Acknowledgements or a 'Contributors' section ... -->

    <section anchor="sect.iana" title="IANA Considerations">
      <t>This memo includes no request to IANA.</t>
    </section>

    <section anchor="sect.security" title="Security Considerations">
      <t>
	This document reviews forwarding behaviour specified elsewhere
	and points out compliance and performance requirements.  As
	such it introduces no new security requirements or concerns.
	Knowledge of potential performance shortcomings may serve to
	help avoid pitfalls, but in very unlikely circumstances such
	knowledge could in principle be the basis of denial of
	service.  In practice such extreme data and packet rate would
	be needed to make this type of denial of service extremely
	unlikely and undetectable denial of service impossible.
      </t>
    </section>

  </middle>

  <back>

    <references title="Normative References">

      &RFC2119;
      &RFC3032;
      &RFC3209;
      &RFC3270;
      &RFC3443;
      &RFC4090;
      &RFC4182;
      &RFC4201;
      &RFC5129;
      &RFC5586;
      &RFC6391;

      &I-D.ietf-mpls-entropy-label;

    </references>

    <references title="Informative References">

      &RFC2475;
      &RFC3031;
      &RFC4124;
      &RFC4221;
      &RFC4206;
      &RFC5695;
      &RFC5462;
      &RFC6639;

      <reference anchor="ACK-compression">
        <front>
          <title>Observations and Dynamics of a Congestion Control
          Algorithm: The Effects of Two-Way Traffic</title>
          <author fullname="Zhang, L." />
          <author fullname="Shenker, S" />
          <author fullname="Clark, D. D." />
          <date year="1991" />
        </front>
	<seriesInfo name="Proc. ACM SIGCOMM, ACM Computer
			  Communications Review (CCR)"
		    value="Vol 21, No 4, 1991, pp.133-147." />
      </reference>

    </references>

  </back>
</rfc>
