draft                  Path Precedence Discovery           December 1996


                       Path Precedence Discovery

                        Wed Dec 11 14:12:57 1996


                              Geoff Huston
                                Telstra
                            gih@telstra.net

                            Marshall T. Rose
                      Dover Beach Consulting, Inc.
                         mrose@dbc.mtview.ca.us


                    <draft-huston-pprec-discov-00.txt>


                          Status of this Memo

This document is an Internet Draft.  Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas, and
its Working Groups.  Note that other groups may also distribute working
documents as Internet Drafts.

Internet Drafts are valid for a maximum of six months and may be
updated, replaced, or obsoleted by other documents at any time.  It is
inappropriate to use Internet Drafts as reference material or to cite
them other than as a "work in progress".


                                Abstract

This memo describes a technique for dynamically discovering the maximum
precedence (MPrec) of an arbitrary internet path.  It specifies a small
change to the way that routers generate one type of ICMP message.  For
a path that passes through a router which does not implement this
change, this technique may not necessarily discover the correct Path
MPrec.  In this case, an application which desires to make use of a
non-zero precedence should degrade gracefully.


Expires June 1997                                               [Page 1]


1.  (A Lengthy) Introduction

Today's Internet is composed of many Internet Access Providers (IAPs),
connected in a disorganized graph.  By and large, each IAP provisions
its own network independently of its neighbor IAPs.  This leads to an
Internet mesh of widely-varying characteristics.

When an Internet consumer (henceforth "consumer") negotiates with an IAP
to provision service, the consumer enters into an agreement which
specifies an expected quality-of-service (eQoS) enjoyed by packets which
travel between the consumer's point of attachment with the IAP
(henceforth "consumer attachment") and anywhere throughout the IAP's
network.  eQoS includes, but is not limited to, characteristics such as
throughput, latency, loss, availability and so on.  Accordingly, if two
consumers subscribe to the same IAP, then they may determine eQoS for
the packets they exchange based on the "intersection" of their service
agreement parameters with that IAP.  The eQoS is, of course, determined
in an out-of-band fashion.  For example, two consumers may choose to
agree to subscribe to the same IAP for an identical level of service.

However, if two consumers subscribe to different IAPs, then it becomes
more difficult to determine the eQoS for the packets they exchange.
Simply put, there is no mechanism for communicating the eQoS exhibited
by the two IAP networks when they exchange traffic.  Further, if the two
IAP networks do not share an interconnection point, but rather rely on
the services of one or more transit networks, then there is no a priori
predictability to the eQoS for the packets exchanged by the two
consumers.

A fully-general solution to this problem is far beyond the scope of this
memo.  However, any problem can be solved...if the problem can be made
small enough!

In this context, observe that what is lacking in today's Internet is a
simple mechanism whereby IAPs may provision service which differentiates
traffic based on precedence (that is, packets with a higher precedence
value receive preferential treatment).  Although RFC 791, the "Internet
Protocol" document, provides for a three-bit precedence field, the
_operational_ Internet lacks a mechanism whereby IAPs can treat traffic
preferentially.  The reason, of course, is that there is no mechanism
available which allows a consumer to expect that asking for a higher
precedence results in preferential behavior.  As a consequence, there is
no incentive for IAPs to provide preferential behavior on a cost-
differential basis.


To solve this much smaller problem, IAPs must take two actions.

First, the IAP's routers must be configured to preferentially handle
packets based on their precedence.  There are three aspects to this:

-    precedence-ordered queue service (cf. Section 5.3.3.1 of RFC
     1812, the "Requirements for IP Version 4 Routers" document), which
     (among other things) causes a router to order the forwarding
     process and output interface queues based on highest precedence;

-    precedence-based congestion control (cf. Section 5.3.6 of RFC
     1812), which causes a router to drop packets based on lowest
     precedence; and,

-    link layer priority features (cf. Section 5.3.3.2 of RFC 1812),
     which cause a router to select service levels of the lower layers
     to provide preferential treatment.
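
The first two aspects can be illustrated with a minimal sketch.  It is
not taken from RFC 1812; the class name PrecedenceQueue and the fixed
capacity are assumptions made purely for illustration.  The queue
serves the highest precedence first and, when full, drops the
lowest-precedence packet.

```python
import heapq
from itertools import count


class PrecedenceQueue:
    """Bounded output queue (hypothetical): highest precedence is served
    first; when the queue is full, the lowest precedence is dropped."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []        # entries are (-precedence, arrival_seq, packet)
        self._seq = count()    # unique tie-breaker; FIFO within a precedence

    def enqueue(self, precedence, packet):
        """Queue a packet; return the packet dropped to make room, or None."""
        entry = (-precedence, next(self._seq), packet)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return None
        # Queue full: the candidate victim is the queued packet with the
        # lowest precedence (most recent arrival among equals).
        victim = max(self._heap)
        if entry > victim:
            # The newcomer has equal or lower precedence: drop the newcomer.
            return packet
        self._heap.remove(victim)
        heapq.heapify(self._heap)
        heapq.heappush(self._heap, entry)
        return victim[2]

    def dequeue(self):
        """Return the queued packet with the highest precedence."""
        return heapq.heappop(self._heap)[2]
```

A production router would use per-precedence queues rather than a
single heap; the heap merely keeps the sketch short.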

Second, each router on the IAP's side of the consumer/IAP attachment
must discard packets higher than the maximum precedence (MPrec) for that
consumer site, and return an ICMP destination unreachable message with a
code indicating the consumer's MPrec.  Naturally, one would expect that
the MPrec for the consumer site would be a new variable added into the
service agreement between the IAP and the consumer.  The default MPrec,
of course, is zero, which is the common practice in today's Internet.
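
As a rough sketch of this second action, the policing decision at the
consumer attachment might look as follows.  The prefix table, the
function names, and the example MPrec values are all hypothetical;
only the default MPrec of zero comes from the text above.

```python
from ipaddress import ip_address, ip_network

# Hypothetical service-agreement table: consumer prefix -> MPrec (0..7).
MPREC_BY_PREFIX = {
    ip_network("192.0.2.0/24"): 3,
    ip_network("198.51.100.0/24"): 5,
}
DEFAULT_MPREC = 0   # the common practice in today's Internet


def mprec_for(source):
    """Look up the MPrec agreed for the consumer site owning `source`."""
    addr = ip_address(source)
    for prefix, mprec in MPREC_BY_PREFIX.items():
        if addr in prefix:
            return mprec
    return DEFAULT_MPREC


def police(source, precedence):
    """Return ("forward", None), or ("discard", mprec) when the packet
    exceeds the MPrec and an ICMP message reporting mprec is due."""
    mprec = mprec_for(source)
    if precedence > mprec:
        return ("discard", mprec)
    return ("forward", None)
```

For example, police("192.0.2.7", 4) yields ("discard", 3) under the
table above, while a packet of precedence 3 or lower from the same
site is forwarded.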

If the consumer opts to negotiate for a non-zero MPrec, then it must
have an expectation that the packets it sends with non-zero precedence
will be honored along the path from its IAP attachment to the
destination.  Further, the consumer probably also wants an assurance
that the return traffic can also enjoy the same level of precedence.
This memo describes the protocol used to probe an internet path
with respect to the precedence it supports.

However, before specifying the protocol, it is necessary to discuss some
aspects of provisioning a precedence-based facility in an IAP.


2.  RSVP Considered Harmful?

In a word, no.  Having said that, we must be careful to describe why the
solution described in this memo is appropriate for today's Internet
environment, and why the RSVP-solution space is premature for today's
market.

An Internet network can be considered as a collection of switches and
interconnecting bandwidth.  When provisioning a network to support a
single grade of service, the IAP must deploy adequate switches and
bandwidth to accommodate the offered load with an "acceptably low level"
of transit loss.

Although one could view an "acceptably low level" as no loss, the TCP
flow control algorithm uses packet loss as one threshold signal when
searching for a dynamic level of peak transmission throughput.  As such,
it is perhaps more appropriate to define the acceptable level as one
which avoids periods of degenerative congestion-induced protocol
collapse.

When the network is placed under load stress the IAP has two options:

-    augment the bandwidth and basic switching resources to a level
     commensurate to the increase in load; or,

-    increase the complexity of the switching algorithm being used.

In the latter case, the increased complexity of switching takes the form
of algorithms which perform precedence-ordered queue service.  By
applying these algorithms, network performance is selectively
downgraded, allowing a category of network traffic greater access to
network resources.


2.1.  Investment Economics

Ultimately, the choice between these options is based on investment
economics.

If the IAP purchases additional bandwidth outright, the IAP is making a
capital investment with a relatively long investment life.  Instead, if
the IAP leases additional bandwidth, then given adequate supply, the IAP
is encountering a recurrent cost, scaling at a rate commensurate with
the growth in traffic volume.  Both of these activities can be
undertaken with reasonable financial certainty given basic soundness in
the provider's business structure.

Alternatively, if the IAP increases the level of switching complexity,
the IAP is encountering a capital cost with a relatively short
investment life cycle.  Further, as the same switching algorithm must be
applied across higher traffic levels within a constant timeframe, this
capital cost increases as a function of transmission speed.

To reduce the impact of the scaling of the capital cost of switching
complexity, network state models, such as RSVP, were developed to
scale the switching complexity in the face of increasing traffic
levels.


2.2.  The Cost of Switching and Bandwidth

Since increasing the complexity of switching does not increase the
absolute level of traffic carried by the network, the IAP must apply
differential charges to high precedence traffic, in order to generate a
financial return on the investment in more complex switching systems.
In contrast, increasing bandwidth allows greater traffic volumes, which
result in an increased revenue stream, which offsets the cost of the
additional bandwidth.

As such, the costs of additional switching complexity and bandwidth must
be measured against the difference in revenue streams.

In general, switching complexity is less expensive than bandwidth only
when bandwidth is _very_ expensive, e.g., in international traffic
circuits, and in the context of an under-developed communications
infrastructure.  Although significant, these environments are not
dominant parts of the Internet infrastructure.  As a result, the
end-to-end environment is heterogeneous, wherein:


-    some providers will continue to offer a single grade of service,
     using augmentation of bandwidth and single service level switching
     complexity to service their traffic load; whilst

-    other providers will adopt service class structures as a means of
     management of the congestion impact of imposed traffic.


2.3.  QoS and Critical Mass

From the consumer's perspective, service quality results from end-to-end
behavior.  Contracting for a particular eQoS is irrelevant unless that
service is supported across the path taken by the traffic.  Indeed,
without a level of transitive bilateral, or multilateral, agreements on
policy for precedence there is no economic motivation for an Internet
Transit Provider (ITP) to honor a precedence request.

This leads to a situation where the investment in switching complexity
as a unilateral decision by one IAP yields no visible enhancement in
end-to-end QoS, even when both end subscribers have subscribed to an
enhanced service.  For precedence to function in a useful fashion across
a multi-provider Internet there is a requirement for a critical number
of IAPs and ITPs to adhere to a common structure for honoring precedence
requests.

We claim, in a heterogeneous environment, that it is not uniformly
economically attractive for network providers to multilaterally
subscribe to the implementation of a definition of an end-to-end network
state, e.g., RSVP, to support defined QoS measures.  Instead, we claim
that it is viable for network providers to:

-    elect to honor a common semantic structure which allows consumers
     to make precedence requests; and,

-    give the consumer the capability to probe for precedence
     capability for the path taken by its traffic.

This approach allows a graduated imposition of QoS across the Internet,
and allows the consumer the option to forgo a precedence request in
those situations where no appreciable benefit is derived.


3.  The Path Precedence (PPrec) Discovery Protocol

In brief:

-    A consumer's host sends a packet with the desired precedence.

-    If any of the routers along the packet's path are configured to
     administratively disallow sending packets with that precedence, the
     router discards the packet and returns an ICMP Destination
     Unreachable message with a (new) code meaning "precedence not
     allowed".

-    Upon receipt of such a "precedence not allowed" message, the host
     reacts based on the requirements of its application.  For
     example, an application might view a particular precedence as a
     mandatory requirement, and opt not to communicate at that time.
     Alternatively, it may direct the host to send packets with a lesser
     precedence.

-    The PPrec discovery process ends when the host's estimate of the
     PPrec is low enough that its packets can be delivered without being
     discarded for administrative reasons.

-    Changes in the routing topology may reduce the PPrec, resulting in
     packets being discarded and "precedence not allowed" messages being
     returned.  A host should react accordingly.  Similarly, changes in
     the routing topology may (silently) increase the PPrec.  To probe
     for this, a host may infrequently generate packets with a higher
     precedence, providing such actions are not destructive to the host
     application (i.e., receiving a subsequent "precedence not allowed"
     message will not, by itself, abort a TCP connection).
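
The downward search described above can be sketched as a small
simulation.  The path model (a list holding the MPrec each router
enforces for this source) and the function name discover_pprec are
assumptions for illustration; a real host would also have to cope
with lost ICMP messages, which this sketch ignores.

```python
def discover_pprec(desired, path_mprecs):
    """Simulate PPrec discovery over a hypothetical path.

    desired      -- precedence the application would like to use (0..7)
    path_mprecs  -- MPrec enforced by each router, in path order
    Returns (pprec, probes_sent).
    """
    estimate = desired
    probes = 0
    while True:
        probes += 1
        # The first router whose limit is exceeded discards the packet
        # and reports its limit in a "precedence not allowed" message.
        reported = next((m for m in path_mprecs if estimate > m), None)
        if reported is None:
            # No router objected: the packet was delivered.
            return estimate, probes
        # Each report is strictly lower than the estimate, so this
        # loop terminates.
        estimate = reported
```

With a desired precedence of 5 and routers enforcing 7, 4, 6 and 2,
the host is told 4 by the second router, then 2 by the fourth, and
its third packet is delivered.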


3.1.  Host specification

When a host receives a "precedence not allowed" message, it MUST reduce
its estimate of the MPrec for the relevant path, based on the value of
the Next-Hop Maximum Precedence field in the message (cf. Section
3.2).  No further specification is placed upon the host's behavior, as
different applications may have different requirements, and since
different implementation architectures may favor different strategies.

It is required that after receiving a "precedence not allowed" message,
a host MUST attempt to avoid eliciting more such messages in the near
future, by reducing the precedence of the packets that it is sending
along the path.

Hosts using PPrec Discovery MUST detect decreases in PPrec as fast as
possible.  Hosts MAY detect increases in PPrec, but because doing so
requires sending packets with a precedence higher than the current
estimated PPrec, and because the likelihood is that the PPrec will not
have increased, this MUST be done at infrequent intervals.  An attempt
to detect an increase (by sending a packet with a precedence higher
than the current estimate) MUST NOT be done
less than 5 minutes after a "precedence not allowed" message has been
received for the given destination, or less than 1 minute after a
previous, successful attempted increase.  It is RECOMMENDED that these
timers be set at twice their minimum values (10 minutes and 2 minutes,
respectively).
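
One possible way to enforce these timers is sketched below.  The
function and constant names are hypothetical, and timestamps are
assumed to be in seconds; the sketch uses the recommended values
rather than the minimums.

```python
# Recommended intervals, in seconds (twice the required minimums).
PROBE_AFTER_REJECT = 10 * 60    # minimum permitted is 5 minutes
PROBE_AFTER_SUCCESS = 2 * 60    # minimum permitted is 1 minute


def may_probe_higher(now, last_reject=None, last_success=None):
    """Return True if the host may try a higher precedence at `now`.

    last_reject  -- time the last "precedence not allowed" message was
                    received for this destination, or None
    last_success -- time of the last successful attempted increase,
                    or None
    """
    if last_reject is not None and now - last_reject < PROBE_AFTER_REJECT:
        return False
    if last_success is not None and now - last_success < PROBE_AFTER_SUCCESS:
        return False
    return True
```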

A host MUST NOT increase its estimate of the PPrec in response to the
contents of a "precedence not allowed" message.  A message purporting to
announce an increase in the PPrec might be a stale packet that has been
floating around in the Internet, a false packet injected as part of a
denial-of-service attack, or the result of having multiple paths to the
destination.
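
A minimal sketch of a per-destination estimate cache that honors this
rule follows.  The class PPrecCache and its method names are
assumptions, loosely modelled on the per-destination state kept for
PMTU Discovery; nothing here is mandated by this memo.

```python
class PPrecCache:
    """Hypothetical per-destination cache of PPrec estimates."""

    def __init__(self, desired=7):
        self._desired = desired      # precedence the host would like to use
        self._estimates = {}         # destination -> current PPrec estimate

    def estimate(self, destination):
        """Current estimate; the desired precedence if nothing is known."""
        return self._estimates.get(destination, self._desired)

    def on_not_allowed(self, destination, reported_mprec):
        """Apply a "precedence not allowed" report.

        The estimate is only ever lowered: a report announcing a higher
        value than the current estimate is ignored, since it may be
        stale, forged, or from another path.
        """
        new = min(self.estimate(destination), reported_mprec)
        self._estimates[destination] = new
        return new
```

Raising an estimate is left to the separate, timer-limited probing
described earlier in this section.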


3.2.  Router specification

When a router is administratively configured to discard a packet because
it exceeds the precedence allowed for the packet's source, the router
MUST return an ICMP Destination Unreachable message to the source of the
packet, with the code indicating "precedence not allowed" [[value TBD]].
The router MUST include the maximum precedence allowed to the packet's
source in the Next-hop Maximum Precedence (NH-MPrec) field:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |   Type = 3    |   Code = TBD  |           Checksum            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |           unused = 0                          |    NH-MPrec   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |      Internet Header + 64 bits of Original Datagram Data      |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The value carried in the 8-bit Next-Hop MPrec field is:
      Bits 0-2:  Maximum precedence allowed over the next-hop for the
                 packet's source.
      Bits 3-7:  Reserved for Future Use.
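
The layout above can be exercised with the following sketch.  Two
points are assumptions rather than part of this memo: the code value
255 is a placeholder only (the actual value is TBD), and bits 0-2 are
taken to be the three high-order bits of the NH-MPrec octet, following
the usual bit numbering of the diagram.  The checksum is the standard
Internet checksum used by ICMP.

```python
import struct

ICMP_DEST_UNREACHABLE = 3
CODE_PREC_NOT_ALLOWED = 255   # PLACEHOLDER: the draft leaves this value TBD


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_message(nh_mprec, original):
    """Build the ICMP message; `original` is the offending datagram's
    Internet header plus its first 64 bits of data."""
    if not 0 <= nh_mprec <= 7:
        raise ValueError("precedence must be in 0..7")
    octet = nh_mprec << 5                    # bits 0-2; bits 3-7 reserved
    rest = struct.pack("!3xB", octet) + original
    head = struct.pack("!BBH", ICMP_DEST_UNREACHABLE, CODE_PREC_NOT_ALLOWED, 0)
    checksum = internet_checksum(head + rest)
    return struct.pack("!BBH", ICMP_DEST_UNREACHABLE,
                       CODE_PREC_NOT_ALLOWED, checksum) + rest


def parse_nh_mprec(message):
    """Validate a received message and return the reported precedence."""
    icmp_type, code, _checksum, octet = struct.unpack("!BBH3xB", message[:8])
    if icmp_type != ICMP_DEST_UNREACHABLE or code != CODE_PREC_NOT_ALLOWED:
        raise ValueError("not a 'precedence not allowed' message")
    if internet_checksum(message) != 0:
        raise ValueError("bad ICMP checksum")
    return octet >> 5                        # ignore the reserved bits
```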


3.3.  Determining if the Discovery Algorithm is Available

RFC 1812 makes no mention of administrative controls of precedence-based
routing, other than to say that there must be a way to disable such
mechanisms.  As such, there is no "standard" administrative behavior
for today's routers when they encounter packets with a non-zero
precedence field.

It is RECOMMENDED that routers be administratively configured to always
generate a "precedence not allowed" message when receiving a packet with
a precedence value of 7 (all-ones).  This allows a sophisticated host to
probe for the existence of the PPrec Discovery algorithm by sending
packets to the destination with this all-ones value.  (Of course, this
doesn't guarantee that the algorithm is available at all routers along
the path, but it does provide a good initial estimate.)
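
A host-side sketch of such a probe is given below.  In RFC 791 the
precedence occupies the three high-order bits of the Type of Service
octet, so a precedence of 7 corresponds to an octet value of 0xE0.
The function names are hypothetical, and the sketch assumes a platform
whose sockets support the IP_TOS option; some systems restrict high
precedence values to privileged processes.  Receiving the resulting
ICMP message is not shown.

```python
import socket

PROBE_PRECEDENCE = 7   # the all-ones value recommended for probing


def tos_for_precedence(precedence):
    """Map a 3-bit precedence to the Type of Service octet (RFC 791)."""
    if not 0 <= precedence <= 7:
        raise ValueError("precedence must be in 0..7")
    return precedence << 5


def open_probe_socket(precedence=PROBE_PRECEDENCE):
    """Return a UDP socket whose outgoing packets carry `precedence`.

    Assumes the platform honors IP_TOS; the call may fail without
    sufficient privilege on some systems.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS,
                    tos_for_precedence(precedence))
    return sock
```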


4.  Implementation Issues

The issues in handling PPrec Discovery are similar to those associated
with PMTU Discovery.  Accordingly, the reader is directed to Section 6
of RFC 1191.


5.  Security considerations

This PPrec Discovery mechanism makes possible a denial-of-service
attack, in which a third party sends a false "precedence not allowed"
message indicating a Next-Hop MPrec much smaller than reality.  This may
cause an application which requires a higher PPrec to cease its efforts
to communicate.

A third party could also cause problems if it could stop a host
from receiving legitimate "precedence not allowed" messages, but in this
case there are simpler denial-of-service attacks available.


6.  Acknowledgements

This proposal is inspired by RFC 1191, the "Path MTU Discovery" for IPv4
document.  All good ideas contained herein have been borrowed freely
from other sources, whilst all bad ideas contained herein are wholly
new.


7.  Authors' Address

     Geoff Huston
     Telstra
     5/490 Northbourne Ave
     Dickson ACT 2609
     AU

     Tel:    +1 61 208 1908
     Fax:    +1 61 248 6165
     E-Mail: gih@telstra.net

     Marshall T. Rose
     Dover Beach Consulting, Inc.
     11975 El Camino Real
     Suite 200
     San Diego, CA  92130
     US

     Tel:    +1 619 793 2700
     Fax:    +1 619 793 2950
     E-Mail: mrose@dbc.mtview.ca.us


Table of Contents


1 (A Lengthy) Introduction
2 RSVP Considered Harmful?
2.1 Investment Economics
2.2 The Cost of Switching and Bandwidth
2.3 QoS and Critical Mass
3 The Path Precedence (PPrec) Discovery Protocol
3.1 Host specification
3.2 Router specification
3.3 Determining if the Discovery Algorithm is Available
4 Implementation Issues
5 Security considerations
6 Acknowledgements
7 Authors' Address

