<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
    which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [

<!ENTITY RFC2119 SYSTEM
  "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC2629 SYSTEM
  "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml">
<!ENTITY RFC3552 SYSTEM
  "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3552.xml">
<!ENTITY RFC4271 SYSTEM
  "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4271.xml">
<!ENTITY RFC5065 SYSTEM
  "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5065.xml">
<!ENTITY RFC4486 SYSTEM
  "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4486.xml">
<!ENTITY RFC4724 SYSTEM
  "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4724.xml">
<!ENTITY RFC8203 SYSTEM
  "http://xml.resource.org/public/rfc/bibxml/reference.RFC.8203.xml">
]>

<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc strict="yes" ?>
<?rfc toc="yes"?>
<?rfc tocdepth="4"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<rfc category="std" docName="draft-ietf-idr-bgp-gr-notification-15.txt"
     ipr="trust200902" updates="4724">

  <front>

    <title abbrev="Notification support for BGP GR">
    Notification Message support for BGP Graceful Restart</title>

    <author fullname="Keyur Patel" initials="K.P."
            surname="Patel">
      <organization>Arrcus</organization>
      <address>
        <email>keyur@arrcus.com</email>
      </address>
    </author>

    <author fullname="Rex Fernando" initials="R.F."
            surname="Fernando">
      <organization>Cisco Systems</organization>
      <address>
        <postal>
          <street>170 W. Tasman Drive</street>
          <city>San Jose</city>
          <region>CA</region>
          <code>95134</code>
          <country>USA</country>
        </postal>
        <email>rex@cisco.com</email>
      </address>
    </author>

    <author fullname="John Scudder" initials="J.S."
            surname="Scudder">
      <organization>Juniper Networks</organization>
      <address>
        <postal>
          <street>1194 N. Mathilda Ave</street>
          <city>Sunnyvale</city>
          <region>CA</region>
          <code>94089</code>
          <country>USA</country>
        </postal>
        <email>jgs@juniper.net</email>
      </address>
    </author>

    <author fullname="Jeff Haas" initials="J.H."
            surname="Haas">
      <organization>Juniper Networks</organization>
      <address>
        <postal>
          <street>1194 N. Mathilda Ave</street>
          <city>Sunnyvale</city>
          <region>CA</region>
          <code>94089</code>
          <country>USA</country>
        </postal>
        <email>jhaas@juniper.net</email>
      </address>
    </author>

    <date/>

    <area>General</area>

   <workgroup>Internet Engineering Task Force</workgroup>

    <keyword>IDR</keyword>
    <keyword>BGP</keyword>

<abstract>
<t>
The BGP Graceful Restart mechanism defined in RFC 4724 limits the usage
of BGP Graceful Restart to BGP protocol messages other than a BGP
NOTIFICATION message. This document updates RFC 4724 by defining an
extension that permits the Graceful Restart procedures to be
performed when the BGP speaker receives a BGP NOTIFICATION Message or
the Hold Time expires. This document also defines a new BGP NOTIFICATION
Cease Error subcode whose effect is to request a full session restart
instead of a Graceful Restart. 
</t>
</abstract>

  </front>

  <middle>

<section anchor="introduction" title="Introduction">
<t>
For many classes of errors, the BGP protocol must send a NOTIFICATION
message and reset the peering session to handle the error condition.  The
BGP Graceful Restart extension defined in <xref target="RFC4724"></xref> 
requires that normal BGP procedures defined in <xref target="RFC4271"></xref> 
be followed when a NOTIFICATION message is sent or received.
This document defines an extension to BGP Graceful Restart that permits the
Graceful Restart procedures to be performed when the BGP speaker receives a
NOTIFICATION message or the Hold Time expires.  This permits the BGP speaker to avoid flapping
reachability and continue forwarding while the BGP speaker restarts the
session to handle errors detected in the BGP protocol.
</t>
<t>
At a high level, this document can be summed up as follows. When a
BGP session is reset, both speakers operate as "Receiving Speakers"
according to <xref target="RFC4724"/>, meaning they retain each
other's routes. This is also true for HOLDTIME expiration. The
functionality can be defeated using a "Hard Reset" subcode for the
BGP NOTIFICATION Cease Error code. If a Hard Reset is used, a full
session reset is performed.
</t>
      <section title="Requirements Language">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119">RFC 2119</xref>.</t>
      </section>
    </section> 

<section anchor="capability"
         title="Modifications to BGP Graceful Restart Capability">
<t>
The BGP Graceful Restart Capability is augmented to signal the Graceful Restart
support for BGP NOTIFICATION messages. The Restart Flags field is augmented as 
follows (following the diagram from section 3 of <xref target="RFC4724"/>):
</t>

<t>
      <figure align="center">
        <artwork align="left"><![CDATA[
 Restart Flags:

         This field contains bit flags relating to restart.
            ]]></artwork>
      </figure>
</t>
<t>
      <figure align="center">
        <artwork align="left"><![CDATA[
             0 1 2 3 
            +-+-+-+-+
            |R|N|   |
            +-+-+-+-+
            ]]></artwork>
      </figure>
</t>
<t>
The most significant ("Restart State", or "R") bit is defined in
<xref target="RFC4724"/>.
</t>
<t>
The second most significant bit ("N") is defined as the BGP Graceful
Notification bit, which is used to indicate Graceful Restart support
for BGP NOTIFICATION messages. A BGP speaker indicates support for
the procedures of this document, by advertising a Graceful Restart
Capability with its Graceful NOTIFICATION bit set (value 1). This
also implies support for the format for a BGP NOTIFICATION Cease
message defined in <xref target= "RFC4486"/>.
</t>
<t>
If a BGP speaker that previously advertised a given set of Graceful Restart
parameters opens a new session with a different set of parameters, these
new parameters apply once the session has transitioned into ESTABLISHED
state.
</t>
</section>

<section title="BGP Hard Reset Subcode">
<t>
We define a new BGP NOTIFICATION Cease message subcode, called the BGP Hard
Reset Subcode. The value of this subcode is discussed in <xref
target="IANA"/>. We refer to a BGP NOTIFICATION Cease message with the Hard
Reset subcode as a Hard Reset message, or just a Hard Reset. 
</t>
<t>
When the "N" bit has been exchanged by two peers, to distinguish them from
Hard Reset we refer to any NOTIFICATION messages other than Hard Reset as
"Graceful", since such messages invoke Graceful Restart semantics.
</t>

<section anchor="sending" title="Sending a Hard Reset">
<t>
  A Hard Reset message is used to indicate to a peer with which the
  Graceful Notification bit has been exchanged, that the session is to
  be fully terminated.
</t>

<t>
  When sending a Hard Reset, the data portion of the NOTIFICATION
  is encoded as follows:
</t>

<figure align="left">
  <artwork align="left"><![CDATA[
    +--------+--------+--------
    | ErrCode| Subcode| Data
    +--------+--------+--------
  ]]></artwork>
</figure>

<t>
  ErrCode is a BGP Error Code (as documented in the IANA BGP Error 
  Codes registry) that indicates the reason for the Hard Reset.
  Subcode is a BGP Error Subcode (as documented in the IANA BGP
  Error Subcodes registry) as appropriate for the ErrCode. 
  Similarly, Data is as appropriate for the ErrCode and Subcode.
  In short, the Hard Reset encapsulates another NOTIFICATION 
  message in its data portion.
</t>
</section>

<section title="Receiving a Hard Reset">
<t>
Whenever a BGP speaker receives a Hard Reset, the speaker MUST
terminate the BGP session following the standard procedures in <xref
target="RFC4271"/>.
</t>
</section>
</section>

<section title="Operation">
<t>
A BGP speaker that is willing to receive and send BGP NOTIFICATION messages
according to the procedures of this document MUST advertise the BGP
Graceful Notification "N" bit using the Graceful Restart Capability as
defined in <xref target ="RFC4724"></xref>.
</t>

<t>
When such a BGP speaker has received the "N" bit from its peer,
and receives from that peer a BGP NOTIFICATION message other than a
Hard Reset, it MUST follow the rules for the Receiving
Speaker mentioned in <xref target ="receiving_speaker"/>. The BGP
speaker generating the BGP NOTIFICATION message MUST also follow the
rules for the Receiving Speaker.
</t>

<t>
When a BGP speaker resets its session due to a HOLDTIME expiry, it
should generate the relevant BGP NOTIFICATION message as mentioned
in <xref target="RFC4271"/>, but subsequently it MUST follow the
rules for the Receiving Speaker mentioned in <xref target
="receiving_speaker"/>.
</t>

<t>
A BGP speaker SHOULD NOT send a Hard Reset to a peer from which it has
not received the "N" bit. We note, however, that if it did so the effect
would be as desired in any case, since according to <xref
target="RFC4271"/> and <xref target="RFC4724"/> any NOTIFICATION
message, whether recognized or not, results in a session reset. Thus the
only negative effect to be expected from sending the Hard Reset to a
peer that hasn't advertised compliance to this specification would be
that the peer would be unable to properly log the associated information.
</t>

<t>
Once the session is re-established, both BGP speakers SHOULD set
their "Forwarding State" bit to 1. If the "Forwarding State" bit is
not set, then according to the procedures of <xref target
="RFC4724"/> section 4.2, the relevant routes will be flushed, defeating
the goals of this specification.
</t>

<section anchor="receiving_speaker" 
         title="Rules for the Receiving Speaker">
<t>
<xref target="RFC4724"/> section 4.2 defines rules for the Receiving
Speaker. These are modified as follows.
</t>
<t>
The sentence "To deal with possible consecutive restarts, a route
(from the peer) previously marked as stale MUST be deleted" only applies 
when the "N" bit has not been exchanged with the peer:
</t>
<t>
<list style="hanging" hangIndent="5">
<t hangText="OLD:">
   When the Receiving Speaker detects termination of the TCP session for
   a BGP session with a peer that has advertised the Graceful Restart
   Capability, it MUST retain the routes received from the peer for all
   the address families that were previously received in the Graceful
   Restart Capability and MUST mark them as stale routing information.
   To deal with possible consecutive restarts, a route (from the peer)
   previously marked as stale MUST be deleted.  The router MUST NOT
   differentiate between stale and other routing information during
   forwarding.
</t>
<t hangText="NEW:">
   When the Receiving Speaker detects termination of the TCP session for a
   BGP session with a peer that has advertised the Graceful Restart
   Capability, it MUST retain the routes received from the peer for all the
   address families that were previously received in the Graceful Restart
   Capability and MUST mark them as stale routing information. The router
   MUST NOT differentiate between stale and other routing information
   during forwarding. If the "N" bit has not been exchanged with the peer,
   then to deal with possible consecutive restarts, a route (from the peer)
   previously marked as stale MUST be deleted. 
</t>
</list>
</t>
<t>
The stale timer is given a formal name and made mandatory:
</t>
<t>
<list style="hanging" hangIndent="5">
<t hangText="OLD:">
   To put an upper bound on the amount of time a router retains the
   stale routes, an implementation MAY support a (configurable) timer
   that imposes this upper bound.
</t>
<t hangText="NEW:">
   To put an upper bound on the amount of time a router retains the
   stale routes, an implementation MUST support a (configurable) timer,
   called the "stale timer", that imposes this upper bound. A suggested
   default value for the stale timer is 180 seconds. An implementation 
   MAY provide the option to disable the timer (i.e., to provide an 
   infinite retention time) but MUST NOT do so by default.
</t>
</list>
</t>
</section>
</section>

<section title="Use of Hard Reset">
<section title="When to Send Hard Reset">
<t>
Although when to send a Hard Reset is an implementation-specific decision,
we offer some advice. Many Cease notification subcodes represent permanent
or long-term rather than transient session termination, and as such it's
appropriate to use Hard Reset with them. At time of publication, 
Cease subcodes 1-9 were defined. 
</t>
<texttable title="Suggestions for Cease Subcode Behavior">
	<ttcol align='center'>Value</ttcol>
	<ttcol align='center'>Name</ttcol>
	<ttcol align='center'>Suggested Behavior</ttcol>

	<c>1</c>
	<c>Maximum Number of Prefixes Reached</c>
	<c>Hard Reset</c>
	
	<c>2</c>
	<c>Administrative Shutdown</c>
	<c>Hard Reset</c>
	
	<c>3</c>
	<c>Peer De-configured</c>
	<c>Hard Reset</c>
	
	<c>4</c>
	<c>Administrative Reset</c>
	<c>Provide user control</c>
	
	<c>5</c>
	<c>Connection Rejected</c>
	<c>Graceful Cease</c>
	
	<c>6</c>
	<c>Other Configuration Change</c>
	<c>Graceful Cease</c>
	
	<c>7</c>
	<c>Connection Collision Resolution</c>
	<c>Graceful Cease</c>
	
	<c>8</c>
	<c>Out of Resources</c>
	<c>Graceful Cease</c>
	
	<c>9</c>
	<c>Hard Reset</c>
	<c>Hard Reset</c>
</texttable>
<t>
These suggestions are only that, suggestions, not requirements. It's the
nature of BGP implementations that the mapping of internal states to BGP
NOTIFICATION codes and subcodes is not always perfect. The guiding
principle for the implementor should be that if there is no realistic hope
that forwarding can continue or that the session will be re-established
within the deadline, Hard Reset should be used.
</t>
<t>
For all other NOTIFICATION codes other than Cease, use of Hard Reset
does not appear to be indicated.
</t>
</section>
<section title="Interaction With Other Specifications">
<t>
<xref target="RFC8203">"BGP Administrative Shutdown Communication"</xref>
specifies use of the data portion of the Administrative Shutdown or Administrative 
Reset Cease to convey a short message. When <xref target="RFC8203"/> is used in
conjunction with Hard Reset, the subcode of the outermost Cease MUST be Hard 
Reset, with the Administrative Shutdown or Reset Cease encapsulated within. The
encapsulated administrative shutdown message MUST subsequently be processed 
according to <xref target="RFC8203"/>.
</t>
</section>
</section>

<section title="Management Considerations">
<t>
When reporting a Hard Reset to network management, the error code and
subcode reported MUST be Cease, Hard Reset. If the network management layer
in use permits it, the information carried in the Data portion SHOULD be
reported as well.
</t>
</section>

<section anchor="Acknowledgements" title="Acknowledgements">
<t>
The authors would like to thank Jim Uttaro for the suggestion, and
Emmanuel Baccelli, Bruno Decraene, Chris Hall, Paul Mattes, Robert
Raszuk, and Alvaro Retana for their review and comments.
</t>
</section>

    <section anchor="IANA" title="IANA Considerations">
<t>
IANA has temporarily assigned subcode 9, named "Hard Reset", in the "BGP
Cease NOTIFICATION message subcodes" registry. Upon publication of this
document as an RFC, IANA is requested to make this allocation permanent.
</t>
<t>
IANA is requested to establish a registry within the "Border Gateway
Protocol (BGP) Parameters" grouping, to be called "BGP Graceful Restart
Flags". The Registration Procedure should be Standards Action, the
reference this document and <xref target="RFC4724"/>, and the initial
values as follows:
</t>
<texttable>
	<ttcol align='center'>Bit Position</ttcol>
	<ttcol align='center'>Name</ttcol>
	<ttcol align='center'>Short Name</ttcol>
	<ttcol align='center'>Reference</ttcol>

	<c>0</c>
	<c>Restart State</c>
	<c>R</c>
	<c><xref target="RFC4724"/></c>
	
	<c>1</c>
	<c>Notification</c>
	<c>N</c>
	<c>this document</c>
	
	<c>2, 3</c>
	<c>unassigned</c>
	<c></c>
	<c></c>
</texttable>
<t>
IANA is requested to establish a registry within the "Border Gateway
Protocol (BGP) Parameters" grouping, to be called "BGP Graceful Restart
Flags for Address Family". The Registration Procedure should be Standards
Action, the reference this document and <xref target="RFC4724"/>, and the
initial values as follows:
</t>
<texttable>
	<ttcol align='center'>Bit Position</ttcol>
	<ttcol align='center'>Name</ttcol>
	<ttcol align='center'>Short Name</ttcol>
	<ttcol align='center'>Reference</ttcol>

	<c>0</c>
	<c>Forwarding State</c>
	<c>F</c>
	<c><xref target="RFC4724"/></c>
	
	<c>1-7</c>
	<c>unassigned</c>
	<c></c>
	<c></c>
</texttable>
    </section>

    <section anchor="Security" title="Security Considerations">
<t>
This specification doesn't change the basic security model inherent in
<xref target="RFC4724"/>, with the exception that the protection against
repeated resets is relaxed. To mitigate the consequent risk that an attacker
could use repeated session resets to prevent stale routes from ever being
deleted, we make the stale routes timer mandatory (in practice it is
already ubiquitous). To the extent <xref target="RFC4724"/> might be said
to help defend against denials of service by making the control plane more
resilient, this extension may modestly increase that resilience; however,
there are enough confounding and deployment-specific factors that no
general claims can be made. 
</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      &RFC2119;
      &RFC4271;
      &RFC4724;
      &RFC4486;
      &RFC8203;
    </references>
  </back>
</rfc>
