idnits 2.17.1 

draft-ietf-grow-ops-reqs-for-bgp-error-handling-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 9, 2014) is 3448 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'I-D.chen-ebgp-error-handling' is defined on line 615,
     but no explicit reference was found in the text

  ** Obsolete normative reference: RFC 2858 (Obsoleted by RFC 4760)

  == Outdated reference: A later version (-13) exists of
     draft-ietf-grow-bgp-gshut-06

  == Outdated reference: A later version (-17) exists of
     draft-ietf-grow-bmp-07


     Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force                                R. Shakir
3	Internet-Draft                                                        BT
4	Intended status: Informational                          November 9, 2014
5	Expires: May 13, 2015

7	Operational Requirements for Enhanced Error Handling Behaviour in BGP-4
8	           draft-ietf-grow-ops-reqs-for-bgp-error-handling-07

10	Abstract

12	   BGP-4 is utilised as a key intra- and inter-Autonomous System routing
13	   protocol in modern IP networks.  The failure modes as defined by the
14	   original protocol standards are based on a number of assumptions
15	   around the impact of session failure.  Numerous incidents both in the
16	   global Internet routing table and within Service Provider networks
17	   have been caused by strict handling of a single invalid UPDATE
18	   message causing large-scale failures in one or more Autonomous
19	   Systems.

21	   This memo describes the current use of BGP-4 within Service Provider
22	   networks, and outlines a set of requirements for further work to
23	   enhance the mechanisms available to a BGP-4 implementation when
24	   erroneous data is detected.  Whilst this document does not provide
25	   specification of any standard, it is intended as an overview of a set
26	   of enhancements to BGP-4 to improve the protocol's robustness to suit
27	   its current deployment.

29	Status of This Memo

31	   This Internet-Draft is submitted in full conformance with the
32	   provisions of BCP 78 and BCP 79.

34	   Internet-Drafts are working documents of the Internet Engineering
35	   Task Force (IETF).  Note that other groups may also distribute
36	   working documents as Internet-Drafts.  The list of current Internet-
37	   Drafts is at http://datatracker.ietf.org/drafts/current/.

39	   Internet-Drafts are draft documents valid for a maximum of six months
40	   and may be updated, replaced, or obsoleted by other documents at any
41	   time.  It is inappropriate to use Internet-Drafts as reference
42	   material or to cite them other than as "work in progress."

44	   This Internet-Draft will expire on May 13, 2015.

46	Copyright Notice

48	   Copyright (c) 2014 IETF Trust and the persons identified as the
49	   document authors.  All rights reserved.

51	   This document is subject to BCP 78 and the IETF Trust's Legal
52	   Provisions Relating to IETF Documents
53	   (http://trustee.ietf.org/license-info) in effect on the date of
54	   publication of this document.  Please review these documents
55	   carefully, as they describe your rights and restrictions with respect
56	   to this document.  Code Components extracted from this document must
57	   include Simplified BSD License text as described in Section 4.e of
58	   the Trust Legal Provisions and are provided without warranty as
59	   described in the Simplified BSD License.

61	Table of Contents

63	   1.  Requirements Language
64	   2.  Problem Statement
65	     2.1.  Role of BGP-4 in Service Provider Networks
66	     2.2.  Service Requirements for Amended BGP Error Handling
67	   3.  Classes of Errors within UPDATE Messages
68	     3.1.  Characteristics of Session Scope Errors
69	     3.2.  Characteristics of Message Scope Errors
70	     3.3.  Characteristics of Attribute Scope Errors
71	     3.4.  Avoiding Session Scope Errors
72	     3.5.  Future Attributes introduced to BGP
73	   4.  Error Handling for Non-Critical Errors
74	     4.1.  NLRI-level Error Handling Requirements
75	       4.1.1.  Notifying the Remote Peer of Non-Critical Errors
76	     4.2.  Recovering RIB Consistency following NLRI-level Error
77	           Handling
78	   5.  Error Handling for Critical Errors
79	     5.1.  Long-Lived Critical Errors
80	   6.  IANA Considerations
81	   7.  Security Considerations
82	   8.  Acknowledgements
83	   9.  References
84	     9.1.  Normative References
85	     9.2.  Informational References
86	   Author's Address

88	1.  Requirements Language

90	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
91	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
92	   document are to be interpreted as described in RFC 2119 [RFC2119].

94	2.  Problem Statement

96	   BGP has developed into a key intra- and inter-domain routing
97	   protocol, deployed within both the Internet and private networks.
98	   The changing deployments of the protocol have resulted in increased
99	   demand for robustness of the routing system - with the error handling
100	   behaviour defined in [RFC4271] having been shown to have caused
101	   numerous incidents within live network deployments.  This document
102	   intends to provide an overview of the current deployment cases for
103	   BGP-4, and define a set of requirements (from the perspective of a
104	   network operator) for enhancing error handling within the protocol.

106	2.1.  Role of BGP-4 in Service Provider Networks

108	   BGP was designed as an inter-autonomous system (AS) routing protocol.
109	   Many of the error handling mechanisms within the protocol are defined
110	   in order to guarantee consistency and correctness of information
111	   between two neighbouring speakers.  The assumption is made that each
112	   AS operates with many adjacencies, each propagating a relatively
113	   small amount of routing information.  Through focusing on information
114	   consistency, the protocol specification prefers failure of an
115	   individual routing adjacency to maintaining reachability to all NLRI
116	   propagated through a particular neighbour, with the expectation that
117	   alternate, less direct, paths can be selected where a failure occurs.
118	   These assumptions resulted in the specification made in [RFC4271]
119	   whereby the receipt of an erroneous UPDATE message is reacted to by
120	   sending a NOTIFICATION message, and tearing down the adjacency with
121	   the remote speaker from whom the error was observed.

123	   BGP's deployments have evolved with the growth of IP-based services.
124	   Historically, a network would deploy an interior gateway protocol
125	   (IGP) to carry infrastructure and customer routes, and utilise an
126	   external gateway protocol (EGP) such as BGP to propagate routes to
127	   other autonomous systems.  However, within modern deployments, to
128	   ensure that route convergence within an AS occurs within acceptable
129	   time bounds, the amount of information within the IGP has been
130	   minimised (typically to only infrastructure routes).  iBGP then
131	   carries internal, customer and external routes within an AS.  As
132	   such, this has resulted in BGP having become an IGP, with traditional
133	   IGPs providing only reachability between nodes within the AS for
134	   packet forwarding, and to establish iBGP sessions.  This change in
135	   role within the overall architecture of an AS has resulted in an
136	   increased robustness requirement for BGP, with the expectation of a
137	   similar level of robustness to that of an IGP being set.  The loss of
138	   an iBGP session can result in significant levels of unreachability
139	   internally to an AS, especially since there are typically limited
140	   (when compared to the Internet) signalling and forwarding paths
141	   available.

143	   The volume and nature of the information carried within BGP has also
144	   changed - it has become the ubiquitous means through which service
145	   information can be propagated between devices.  For instance, it is
146	   utilised to carry IP/MPLS service information such as Layer 3 IP VPN
147	   routes [RFC4364], and Layer 2 Virtual Private LAN Service device
148	   membership [RFC4761].  Since these extensions to the protocol allow
149	   signalling of multiple services (represented by address families
150	   within BGP), and multiple customer topologies (i.e., subsets of
151	   routes within each address family) via the BGP protocol, the impact
152	   of session failure is increased.  The tear down of a single BGP
153	   session can result in a complete outage to all customer services
154	   signalled via the session, even where the triggering event is related
155	   to only one service or topology being carried.

157	   In addition, there has been significant growth in the volume of
158	   routing information carried in BGP.  In numerous networks, the RIB
159	   size of individual BGP speakers can be of the order of millions of
160	   paths.  Particularly large volumes are observed at BGP speakers
161	   performing aggregation and border roles (such as ASBR, or route
162	   reflector hierarchies).  This increased volume of routes results not
163	   only in a significant number of services being impacted during a
164	   protocol failure, but also increases the time to recovery after re-
165	   establishing a BGP session.  The time taken to learn, compute and
166	   distribute new paths increases the impact of failures on services
167	   carried by the network - adding further weight to the requirement to
168	   avoid failures, or limit the extent of their impact.  Particularly,
169	   the impact of individual session failures is increased due to the
170	   existence of a relatively small number of highly-critical BGP
171	   sessions within Internet and multi-service network deployments.
172	   These sessions propagate a high proportion of the reachability
173	   information - for instance, providing an Internet AS with the global
174	   routing table from upstream providers, or providing IP/MPLS Provider
175	   Edge devices with adjacency to a route reflector hierarchy providing
176	   signalling for elements of services connected elsewhere within the
177	   routing domain.  In both cases, the failure of these sessions can
178	   result in a significant outage to customer services.

180	2.2.  Service Requirements for Amended BGP Error Handling

182	   Alongside the infrastructure requirements outlined above, service
183	   provider customer requirements continue to evolve.  In particular,
184	   there are increasing requirements for robustness and fault isolation
185	   based on:

187	   o  The increasing reliance on public IP service instead of private
188	      networks - resulting in requirements for greater availability of
189	      Internet services.  The diversity of autonomous systems has
190	      resulted in individual BGP sessions within the Internet carrying
191	      more routing information (e.g., IP transit, or large peering
192	      interconnections), which is originated from more individual
193	      networks - increasing both the impact of an individual session
194	      failure, and the number of different sources of error which can
195	      lead to its failure.  To meet the requirement of high-availability
196	      Internet services, it is therefore an expectation that the error
197	      handling behaviour MUST affect only those routes, or
198	      autonomous systems, that are impacted by the erroneous
199	      messages, rather than all routes received by a particular session,
200	      such that the maximum service availability is maintained.

202	   o  The requirement to support multiple services.  In multi-service
203	      environments such as those that support L3VPNs, multiple customer
204	      VPNs are isolated from one another, and from other IP environments
205	      (such as the Internet).  There is an expectation from a service
206	      perspective therefore that the customer service is within its own
207	      fault domain (even when carried via a shared set of signalling),
208	      hence an error on routes or BGP messages related to one VPN SHOULD
209	      NOT negatively impact other VPNs.  Further to this, an error
210	      relating to another service (i.e., another address family, such as
211	      Internet or L2VPN services) SHOULD NOT impact the availability of
212	      the VPN service.  Both of these principles of fault separation are
213	      required in order to support multiple services and segregated
214	      customer infrastructures over a common network infrastructure
215	      whilst meeting the availability required of them.

217	   It should be noted that the requirements for fault isolation and
218	   high-availability do not imply that routing information that is
219	   potentially erroneous (through being carried in an UPDATE message
220	   that cannot be parsed for example) is always maintained despite
221	   questions as to its integrity, particularly as such routing
222	   information may result in leakage between services - but merely that
223	   there is a requirement to reconsider the balance between protocol
224	   correctness, and robustness.

226	   In addition to these service requirements, an increasing requirement
227	   to minimise the time taken to recover from incidents exists.  In some
228	   cases, this may require an operator to compromise on correctness in
229	   order to maintain integrity of a subset of routing information or
230	   services.  To meet this requirement, mechanisms allowing an operator
231	   to ignore all errors or maintain "known good" routing information MAY
232	   be required.  The implementation of such mechanisms is a business
233	   consideration of the service provider in question, and MUST consider
234	   the balance between the risk of incorrectness and the overall impact
235	   to a network platform.  Such mechanisms are particularly of use where
236	   lack of routing information violates an operator's policies (e.g.,
237	   filtering rules distributed via BGP FlowSpec are no longer
238	   installed), or fault isolation requires significant external liaison
239	   (such as contacting a third-party autonomous system to amend or
240	   filter route announcement).

242	3.  Classes of Errors within UPDATE Messages

244	   To meet the requirement to provide more targeted error handling,
245	   errors are therefore classified into the following scopes:

247	   o  Attribute Scope - in this case, an error can be localised to a
248	      particular attribute within the message.  For instance, such
249	      errors may occur when invalid flags are set within an individual
250	      attribute within a message, which is otherwise well-formed.

252	   o  Message Scope - errors resulting in the inability to parse a
253	      single UPDATE message, but not affecting the ability of an
254	      implementation to parse subsequent BGP messages.  For instance,
255	      where the overall length of an UPDATE message is correct, but the
256	      length of a single attribute contained within it is erroneously
257	      specified.

259	   o  Session Scope - where errors occur such that an error in an UPDATE
260	      message results in the inability to parse subsequent messages.
261	      In this case, attribute length errors may result in the inability
262	      for a BGP implementation to locate the bounds of an UPDATE, and
263	      hence the subsequent message from a peer.

265	   For session-scope errors, the error handling approach implemented
266	   MUST conform with the requirements described in Section 5 of this
267	   document (generically referred to as "Critical" error handling
268	   mechanisms).  Session-scope errors requiring Critical error handling
269	   MUST be the only case in which error handling mechanisms are
270	   allowed to impact entire BGP sessions between two BGP
271	   speakers.

273	   For message- and attribute-level errors, "Non-Critical" error
274	   handling mechanisms SHOULD be used, which MUST meet the specification
275	   described in Section 4.  In the case of attribute-scope errors, a BGP
276	   speaker MUST limit the impact of error-handling mechanisms to the
277	   NLRI carried within the message, and MAY (where applicable) limit
278	   the scope of error handling to the individual attribute.  Where a
279	   message-scope error occurs, a BGP speaker MUST limit the impact of
280	   error handling to the NLRI contained within the affected UPDATE.
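
   As a non-normative illustration, the mapping between error scope and
   handling class described above could be sketched as follows.  The
   Python names used here (ErrorScope, Handling, handling_for and
   widest_impact) are hypothetical; they are not taken from this memo
   or from any BGP implementation.

```python
from enum import Enum


class ErrorScope(Enum):
    ATTRIBUTE = "attribute"
    MESSAGE = "message"
    SESSION = "session"


class Handling(Enum):
    NON_CRITICAL = "non-critical"  # requirements of Section 4
    CRITICAL = "critical"          # requirements of Section 5


def handling_for(scope: ErrorScope) -> Handling:
    """Only session-scope errors may use Critical error handling."""
    if scope is ErrorScope.SESSION:
        return Handling.CRITICAL
    return Handling.NON_CRITICAL


def widest_impact(scope: ErrorScope) -> str:
    """Widest set of state that error handling is allowed to affect."""
    return {
        ErrorScope.ATTRIBUTE: "NLRI in the UPDATE (optionally only the attribute)",
        ErrorScope.MESSAGE: "NLRI in the UPDATE",
        ErrorScope.SESSION: "entire session",
    }[scope]
```

   In this sketch an attribute-scope error never widens beyond the NLRI
   of the affected UPDATE, which mirrors the MUST stated above.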

282	3.1.  Characteristics of Session Scope Errors

284	   Based on analysis of existing BGP implementations, and incidents
285	   within the Internet and private network routing tables, it is
286	   expected that errors with a session level scope are restricted to:

288	   o  UPDATE Message Length errors - where the specified UPDATE message
289	      length is inconsistent with the sum of the Total Path Attribute
290	      and Withdrawn Routes length.  These errors relate to message
291	      packing or framing, and result in cases whereby the NLRI attribute
292	      cannot be correctly extracted from the message.

294	   o  Errors parsing the NLRI attribute of an UPDATE message - where the
295	      contents of the IPv4 Unicast Advertised or Withdrawn Routes
296	      attributes, or multi-protocol BGP NLRI attributes (MP_REACH_NLRI
297	      and/or MP_UNREACH_NLRI as defined in [RFC2858]), cannot be
298	      successfully parsed.
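
   The UPDATE Message Length check in the first bullet above can be
   illustrated with a minimal sketch.  It assumes the UPDATE layout of
   [RFC4271] (a 19-octet header followed by the Withdrawn Routes Length,
   Withdrawn Routes, Total Path Attribute Length, Path Attributes and
   NLRI fields).  The function name split_update and its use of
   ValueError are assumptions made purely for illustration.

```python
import struct

BGP_HEADER_LEN = 19  # marker (16) + length (2) + type (1), per RFC 4271
BGP_TYPE_UPDATE = 2


def split_update(message: bytes):
    """Split one BGP UPDATE into (withdrawn, path_attrs, nlri) bytes.

    The caller is assumed to pass exactly one message.  ValueError is
    raised when the framing is inconsistent, i.e. when the NLRI cannot
    be reliably located within the message.
    """
    if len(message) < BGP_HEADER_LEN + 4:
        raise ValueError("UPDATE shorter than minimum length")
    (msg_len,) = struct.unpack_from("!H", message, 16)
    if msg_len != len(message):
        raise ValueError("header length does not match message size")
    if message[18] != BGP_TYPE_UPDATE:
        raise ValueError("not an UPDATE message")
    body = message[BGP_HEADER_LEN:]
    (withdrawn_len,) = struct.unpack_from("!H", body, 0)
    if 2 + withdrawn_len + 2 > len(body):
        raise ValueError("Withdrawn Routes Length overruns message")
    (attrs_len,) = struct.unpack_from("!H", body, 2 + withdrawn_len)
    if BGP_HEADER_LEN + 4 + withdrawn_len + attrs_len > msg_len:
        raise ValueError("length fields exceed the UPDATE message length")
    withdrawn = body[2:2 + withdrawn_len]
    attrs_start = 4 + withdrawn_len
    attrs = body[attrs_start:attrs_start + attrs_len]
    nlri = body[attrs_start + attrs_len:]
    return withdrawn, attrs, nlri
```

   A ValueError from this sketch corresponds to the framing failures
   that this section treats as session scope; errors found later, inside
   the returned path attributes, would fall under Sections 3.2 and 3.3.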

300	3.2.  Characteristics of Message Scope Errors

302	   Message scope errors are restricted to those whereby, despite
303	   erroneous encoding, the NLRI carried by the message can be parsed
304	   and determined - but the carried attributes are invalid.
305	   These errors (based on existing attributes) are limited to:

307	   o  Errors where the length of all path attributes contained within
308	      the UPDATE does not correspond to the total path attribute length.

310	   o  UPDATE messages that are missing mandatory attributes, contain
311	      unrecognised non-optional attributes, or contain duplicate or
312	      invalid attributes (be they unsupported, or unexpected).

314	   o  Messages where the NEXT_HOP or MP_REACH_NLRI next-hop values
315	      are missing, zero-length, or invalid for the relevant
316	      address family.

318	3.3.  Characteristics of Attribute Scope Errors

320	   Attribute scope errors are defined to be those that relate to an
321	   individual attribute (not related to the NLRI) carried by an UPDATE
322	   message.  Particularly, where:

324	   o  Zero- or invalid-length errors occur in path attributes,
325	      excluding those containing NLRI.

327	   o  Invalid data or flags are contained in a path attribute that does
328	      not relate to the NLRI.

330	3.4.  Avoiding Session Scope Errors

332	   In order to maximise the number of cases whereby the NLRI attributes
333	   can be reliably extracted from a received message, where a BGP
334	   speaker supports multi-protocol extensions, the MP_REACH_NLRI and
335	   MP_UNREACH_NLRI attributes SHOULD be utilised for all address
336	   families (including IPv4 Unicast) and these attributes should be the
337	   first attribute contained within the UPDATE message.  For these Non-
338	   Critical errors, the NLRI-targeted error handling requirements
339	   described in Section 4 should be followed.

341	3.5.  Future Attributes introduced to BGP

343	   Where attributes are introduced by future extensions to the BGP
344	   protocol, error handling behaviour SHOULD be assumed to be at a
345	   message- or attribute-scope, unless otherwise specified within the
346	   per-extension memo, or the attribute relates directly to carrying
347	   NLRI.  Authors of future BGP extensions SHOULD specify the
348	   error handling behaviour required on a per-attribute error
349	   basis.

351	4.  Error Handling for Non-Critical Errors

353	4.1.  NLRI-level Error Handling Requirements

355	   When a Non-Critical error is detected within an UPDATE message a BGP
356	   speaker MUST NOT send a NOTIFICATION message to the remote neighbour.
357	   Instead, the NLRI contained within the message SHOULD be considered
358	   as being withdrawn by the neighbour (referred to as treat-as-
359	   withdraw), until they are updated by a subsequent UPDATE message.
360	   Where defined as acceptable by the relevant memo, for the specific
361	   case of attribute-scope errors, the erroneous attribute MAY be
362	   discarded by an implementation.  This attribute-discard approach MUST
363	   only be used for attributes that do not impact best-path selection
364	   within an implementation.  An operator SHOULD consider the impact of
365	   implementing policies considering such attributes as part of the
366	   route selection algorithm, such that operator configuration does not
367	   result in unexpected consequences should such an attribute be
368	   discarded.

370	   Network operators SHOULD recognise that where treat-as-withdraw
371	   behaviour is implemented black-holing or looping of traffic may occur
372	   in the period between the NLRI being treated as withdrawn, and
373	   subsequent updates, dependent upon the routing topology.  It SHOULD
374	   be noted that such periods of RIB inconsistency (where one speaker
375	   has advertised a prefix, which has had treat-as-withdraw applied to
376	   it by the receiving speaker) may be relatively long lived, based on
377	   situations such as an erroneous implementation at the receiver, or
378	   the error occurring within an optional-transitive attribute not
379	   examined by the direct neighbour.  In order to allow operators to
380	   select sessions on which this risk of inconsistency is acceptable, an
381	   implementation SHOULD provide means by which Non-Critical error
382	   handling can be disabled on a per-session basis.

384	   Since the Non-Critical error handling required within this section
385	   results in no NOTIFICATION message being transmitted, the fact that
386	   an error has occurred, and there may be inconsistency between the
387	   local and remote BGP speaker MUST be flagged to the network operator
388	   through standard operational interfaces (e.g., SNMP, syslog).  The
389	   information highlighted MUST include the NLRI identified to be
390	   contained within the error message, and SHOULD contain an exact copy
391	   of the received message for further analysis.
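
   The treat-as-withdraw behaviour and the accompanying operator
   notification could be sketched as follows.  This is a minimal
   illustration under stated assumptions: the Adj-RIB-In is modelled as
   a dictionary keyed by peer and prefix, Python's standard logging
   module stands in for SNMP or syslog, and the names treat_as_withdraw
   and "bgp.error-handling" are hypothetical.

```python
import logging

log = logging.getLogger("bgp.error-handling")  # hypothetical logger name


def treat_as_withdraw(adj_rib_in: dict, peer: str, nlri: list,
                      raw_update: bytes) -> list:
    """Remove the NLRI of an erroneous UPDATE from the peer's Adj-RIB-In.

    No NOTIFICATION is sent.  The event is logged instead, including
    the affected NLRI and a hex copy of the received message.
    Returns the prefixes that were actually removed.
    """
    routes = adj_rib_in.setdefault(peer, {})
    removed = []
    for prefix in nlri:
        if prefix in routes:
            del routes[prefix]
            removed.append(prefix)
    log.warning(
        "non-critical UPDATE error from %s: treat-as-withdraw nlri=%s raw=%s",
        peer, ",".join(nlri), raw_update.hex(),
    )
    return removed
```

   Routes learned from the same peer in other UPDATE messages are left
   untouched, so only the NLRI named in the erroneous message lose
   reachability until a subsequent UPDATE arrives.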

393	4.1.1.  Notifying the Remote Peer of Non-Critical Errors

395	   In order that the operator of the BGP speaker from whom an erroneous
396	   UPDATE message has been advertised is aware of the fact that some
397	   NLRI advertised to the remote speaker have been considered invalid, a
398	   BGP speaker SHOULD support mechanisms to report the occurrence of
399	   Non-Critical error handling to the remote speaker.  The receiving
400	   speaker SHOULD transmit the NLRI contained within the erroneous
401	   message to the advertising speaker.  An exact copy of the received
402	   UPDATE message SHOULD also be sent.

404	   The exchange of such information related to events occurring as a
405	   result of BGP messages is not currently supported by any extension to
406	   the protocol.  Clearly, where the two speakers reside within the same
407	   administrative domain, shared logging information can be utilised to
408	   identify the root cause of errors.  However, in many cases these
409	   devices reside within separate administrative domains (e.g., are
410	   ASBRs for Internet or private networks).  In this case, mechanisms
411	   allowing transmission in-band to the BGP session SHOULD be utilised
412	   (e.g., the OPERATIONAL message described in
413	   [I-D.ietf-idr-operational-message]).  Such an in-band channel is
414	   preferred based on the BGP session representing a pre-established
415	   trusted source which is related to a specific BGP-speaking device
416	   within a network.  It is expected that the overall system scalability
417	   of a BGP speaker is improved through utilising the existing channel,
418	   rather than incurring overhead for maintaining many additional
419	   sessions for relatively infrequent messaging events when errors
420	   occur.  However, the extensions providing such a channel MUST
421	   consider their impact to base BGP protocol functions such as the
422	   transmission of UPDATE or KEEPALIVE messages, and SHOULD limit the
423	   volume of messaging to direct reactions to Non-Critical errors
424	   occurring.  These considerations SHOULD be made in order to ensure
425	   that no compromise is made to the security, scalability and
426	   robustness of BGP.  Where additional BGP monitoring information that
427	   is not suitable to be carried in-band is required, out-of-band
428	   mechanisms such as the BMP protocol described in [I-D.ietf-grow-bmp]
429	   could be utilised to provide further information relating to
430	   erroneous messages.

432	4.2.  Recovering RIB Consistency following NLRI-level Error Handling

434	   In order to recover consistency of Adj-RIBs following Non-Critical
435	   error handling, a means by which validation and recovery of
436	   consistency can be achieved SHOULD be provided to an operator.  This
437	   functionality MAY be provided through extension of the ROUTE-REFRESH
438	   [RFC2918] mechanism - providing means to identify the beginning and
439	   end of a replay of the entire Adj-RIB-Out of the advertising speaker
440	   (as per the suggestion in [I-D.ietf-idr-bgp-enhanced-route-refresh]).

442	   As Non-Critical error handling is localised to the NLRI contained
443	   within the erroneous UPDATE message, a targeted recovery mechanism
444	   MAY be provided allowing a speaker to request re-advertisement of a
445	   particular subset of the Adj-RIB-Out. Where such targeted refresh
446	   functions are available, they SHOULD be preferred to mechanisms
447	   requesting re-advertisement of the whole Adj-RIB-Out based on their
448	   more limited use of CPU and network resources.

450	   A BGP speaker may automatically trigger recovery mechanisms such as
451	   those described in this section following the receipt of an erroneous
452	   UPDATE message identified as Non-Critical to expedite recovery.  It
453	   SHOULD be noted that if automatic recovery mechanisms trigger only
454	   re-advertisement of an identical erroneous message, they may be
455	   ineffective.  Additionally, where the best-path to be advertised by
456	   the remote speaker changes, it will be advertised directly, without a
457	   requirement for a request from the receiver.  However, in some cases,
458	   RIB consistency recovery mechanisms may prompt alternate UPDATE
459	   message packing, and hence allow quicker recovery.  Where such
460	   automatic mechanisms are implemented, those focused on smaller sets
461	   of NLRI SHOULD be preferred over those requesting the entire RIB.  In
462	   addition, such mechanisms SHOULD have dampening mechanisms to ensure
463	   that their impact to computational and network resources is limited.
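
   One possible form of the dampening mentioned above is an exponential
   back-off applied per refresh target.  The sketch below is an
   assumption for illustration only: the class name RefreshDampener, the
   default delays and the choice of exponential back-off are not
   specified by this memo.

```python
class RefreshDampener:
    """Exponential back-off for automatically triggered refresh requests."""

    def __init__(self, initial_delay: float = 5.0, max_delay: float = 300.0):
        self.initial_delay = initial_delay
        self.max_delay = max_delay
        self._next_allowed = {}  # key -> earliest time of the next request
        self._delay = {}         # key -> delay to apply after the next request

    def allow(self, key, now: float) -> bool:
        """Return True if a refresh for `key` may be sent at time `now`."""
        if now < self._next_allowed.get(key, 0.0):
            return False
        delay = self._delay.get(key, self.initial_delay)
        self._next_allowed[key] = now + delay
        self._delay[key] = min(delay * 2, self.max_delay)
        return True
```

   The key could be, for example, a (peer, prefix) pair, so that
   requests for small sets of NLRI are dampened independently of one
   another and of any whole-RIB refresh.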

465	5.  Error Handling for Critical Errors

467	   Critical error handling MUST be used where session-scope errors
468	   occur.  In such cases, a NOTIFICATION message MUST be sent to the
469	   remote peer.  In order to limit the impact to network operation,
470	   during such events the mechanisms applied MUST allow for the paths
471	   for NLRI received from the remote speaker to continue to be utilised
472	   during the session reset and re-establishment.  It is envisaged that
473	   this requirement may be met through extension of the BGP Graceful
474	   Restart mechanism ([RFC4724]) to be triggered by NOTIFICATION
475	   messages indicating the occurrence of a Critical error.  Such an
476	   extension allows a restart of the TCP and BGP sessions between two
477	   speakers, in a similar manner to the current session restart
478	   behaviour triggered by a NOTIFICATION message.  In order to maximise
479	   the level of re-initialisation which occurs during such a restart
480	   triggered by a Critical error, BGP speakers MAY re-initialise memory
481	   structures related to the RIB where possible.

483	   Where such a restart event occurs, the continued liveness of the
484	   remote device MAY be verified by BGP KEEPALIVE packets or other OAM
485	   functions such as Bidirectional Forwarding Detection ([RFC5880]).  If
486	   the observed Critical BGP error is indicative of a wider device
487	   failure of the remote speaker, it is expected that the BGP session
488	   will not re-establish correctly.  By default, each BGP speaker SHOULD
489	   maintain a limited time window in which session restart is expected
490	   in order to mitigate this possibility.

492	   When a Critical error occurs, the network operator MUST be made aware
493	   of its occurrence through local logging mechanisms (e.g., SNMP traps
494	   or syslog).  The BGP speaker receiving an UPDATE message identified
495	   as a Critical error MUST log its occurrence and a copy of the UPDATE
496	   message.  Where an inter-device messaging mechanism is implemented
497	   (as discussed in Section 4.1) a copy of the erroneous UPDATE
498	   message SHOULD be transmitted to the remote speaker upon session
499	   re-establishment (or via a separate session if implemented).  Both
500	   BGP speakers MUST indicate to an operator that the cause of a
501	   session restart was a Critical error in an UPDATE message.

503	   Since repeated Critical errors (and session restarts) may have an
504	   impact on overall device scaling if Critical error handling does not
505	   resolve the failure condition, a BGP speaker MAY choose to revert to
506	   the session tear down behaviour described in the base BGP
507	   specification.  This reversion SHOULD only be utilised after a number
508	   of attempts which MUST be controllable by the network operator.
509	   Where a session is shut down, the implementation MAY utilise a back-
510	   off from session restart attempts (as per the IdleHoldTimer described
511	   in the BGP FSM [RFC4271]).  Where reversion to tearing down the BGP
512	   session is performed, a speaker SHOULD limit the impact of
513	   withdrawing prefixes from downstream speakers where possible.  It is
514	   envisaged that this can be achieved by utilising a mechanism such as
515	   the BGP Graceful Shutdown procedure as described in
516	   [I-D.ietf-grow-bgp-gshut].
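
   The reversion logic above can be sketched as a counter bounded by
   an operator-configured limit.  This is an illustration under stated
   assumptions: the names "CriticalErrorTracker" and "max_restarts"
   are hypothetical, and resetting the counter once a session has
   proved stable is a design choice that this document does not
   specify.

```python
from dataclasses import dataclass


@dataclass
class CriticalErrorTracker:
    """Hypothetical per-session counter of Critical error restarts."""

    max_restarts: int   # operator-controlled number of attempts (assumed knob)
    restarts: int = 0   # restarts performed since the counter was last reset

    def on_critical_error(self) -> str:
        """Return the action to take for a newly observed Critical error."""
        if self.restarts < self.max_restarts:
            self.restarts += 1
            return "restart"    # continue with Critical error handling
        return "teardown"       # revert to the base RFC 4271 behaviour

    def on_stable_session(self) -> None:
        """Assumption: a session judged stable clears the counter."""
        self.restarts = 0
```

   With max_restarts set to 2, the first two Critical errors restart
   the session and the third reverts to session tear-down, at which
   point a back-off and a graceful shutdown mechanism could be applied
   as described above.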

518	5.1.  Long-Lived Critical Errors

520	   Where Critical error handling mechanisms are required to be utilised,
521	   significant impact to an operator's network or services may still be
522	   experienced.  In order to allow an operator to avoid such scenarios:

524	   o  An implementation MAY provide functionality whereby all future
525	      Critical errors result in UPDATE messages being discarded.  Such
526	      functionality MUST be disabled by default, and SHOULD be
527	      configurable on a per-address-family basis.  An operator MUST
528	      consider such mechanisms as a tool of last-resort to maintain
529	      service for a subset of NLRI, whilst the root cause of such
530	      errors is investigated and resolved.  This MAY be achieved by
531	      filtering erroneous NLRI at an upstream peer.

533	   o  Provide means by which the restart timer for Graceful Restart
534	      can be configured to be a long period (order of days, or weeks)
535	      such that a critical failure can be resolved whilst maintaining
536	      operation for a subset of NLRI.  This restart period MUST be
537	      configured separately to standard graceful-restart timers and MUST
538	      be configurable per-address-family.  Long-lived restart mechanisms
539	      MAY be configurable to be utilised by default.  An operator MUST
540	      consider the impact to forwarding correctness of such
541	      configuration, based on the expected rate of change of NLRI within
542	      a particular <AFI,SAFI>.
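
   The per-address-family controls listed above might be modelled as
   follows.  This is a sketch only: the field names, the 120 second
   Graceful Restart value and the seven day long-lived value are
   illustrative assumptions, not values taken from this document or
   from any implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

AfiSafi = Tuple[int, int]  # (AFI, SAFI) pair, e.g. (1, 1) for IPv4 unicast


@dataclass
class FamilyPolicy:
    """Hypothetical per-<AFI,SAFI> Critical error settings."""

    discard_on_critical: bool = False               # disabled by default
    gr_restart_time: int = 120                      # standard timer, seconds (illustrative)
    long_lived_restart_time: Optional[int] = None   # separate knob; None means not enabled


def restart_time(policies: Dict[AfiSafi, FamilyPolicy], family: AfiSafi) -> int:
    """Return the restart period, in seconds, that applies to a family."""
    policy = policies.get(family, FamilyPolicy())
    if policy.long_lived_restart_time is not None:
        return policy.long_lived_restart_time
    return policy.gr_restart_time
```

   For example, an operator might enable a seven day long-lived
   restart period for VPN-IPv4 (AFI 1, SAFI 128), where NLRI changes
   slowly, while leaving IPv4 unicast on the standard timer.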

544	6.  IANA Considerations

546	   This memo includes no request to IANA.

548	7.  Security Considerations

550	   The requirements outlined in this document provide mechanisms which
551	   limit the forwarding impact of the response to an error in a BGP
552	   UPDATE message.  This is of benefit to the security of a BGP speaker.
553	   Without these mechanisms, where erroneous UPDATE messages relating to
554	   a single NLRI entry can be propagated to a BGP speaker, all other
555	   NLRI carried via the same session are affected by the resulting
556	   session tear-down.  This may result in a means by which an AS can be
557	   isolated from particular routing domains (such as the Internet)
558	   should an UPDATE message be propagated via targeted specific paths.
559	   It is envisaged that, by reducing the impact of the reaction of the
560	   receiving speaker to these messages, the isolation can be constrained
561	   to specific sets of NLRI, or a specific topology.

563	   A number of the mechanisms meeting the requirements specified within
564	   the document (particularly those relating to operational monitoring)
565	   may raise further security concerns.  Such concerns will be addressed
566	   during the specification of these mechanisms.

568	8.  Acknowledgements

570	   Many thanks are extended to Bruno Decraene and David Freedman for
571	   their numerous detailed reviews, and significant contribution towards
572	   the refinement of the requirements in this document.

574	   In addition, the author would like to thank the following network
575	   operators for their insight, and valuable input into defining the
576	   requirements for a variety of deployments of BGP: Shane Amante, Colin
577	   Bookham, Rob Evans, Wes George, Tom Hodgson, Sven Huster, Jonathan
578	   Newton, Neil McRae, Thomas Mangin, Tom Scholl and Ilya Varlashkin.
579	   Many thanks are extended to Jeff Haas, Wim Hendrickx, Tony Li, Alton
580	   Lo, Keyur Patel, John Scudder, Adam Simpson and Robert Raszuk for
581	   their expertise relating to implementations of the BGP protocol.

583	9.  References

585	9.1.  Normative References

587	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
588	              Requirement Levels", BCP 14, RFC 2119, March 1997.

590	   [RFC2858]  Bates, T., Rekhter, Y., Chandra, R., and D. Katz,
591	              "Multiprotocol Extensions for BGP-4", RFC 2858, June 2000.

593	   [RFC2918]  Chen, E., "Route Refresh Capability for BGP-4", RFC 2918,
594	              September 2000.

596	   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
597	              Protocol 4 (BGP-4)", RFC 4271, January 2006.

599	   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
600	              Networks (VPNs)", RFC 4364, February 2006.

602	   [RFC4724]  Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y.
603	              Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724,
604	              January 2007.

606	   [RFC4761]  Kompella, K. and Y. Rekhter, "Virtual Private LAN Service
607	              (VPLS) Using BGP for Auto-Discovery and Signaling", RFC
608	              4761, January 2007.

610	   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
611	              (BFD)", RFC 5880, June 2010.

613	9.2.  Informative References

615	   [I-D.chen-ebgp-error-handling]
616	              Chen, E., Mohapatra, P., and K. Patel, "Revised Error
617	              Handling for BGP Updates from External Neighbors", draft-
618	              chen-ebgp-error-handling-01 (work in progress), September
619	              2011.

621	   [I-D.ietf-grow-bgp-gshut]
622	              Francois, P., Decraene, B., Pelsser, C., Patel, K., and C.
623	              Filsfils, "Graceful BGP session shutdown", draft-ietf-
624	              grow-bgp-gshut-06 (work in progress), August 2014.

626	   [I-D.ietf-grow-bmp]
627	              Scudder, J., Fernando, R., and S. Stuart, "BGP Monitoring
628	              Protocol", draft-ietf-grow-bmp-07 (work in progress),
629	              October 2012.

631	   [I-D.ietf-idr-bgp-enhanced-route-refresh]
632	              Patel, K., Chen, E., and B. Venkatachalapathy, "Enhanced
633	              Route Refresh Capability for BGP-4", draft-ietf-idr-bgp-
634	              enhanced-route-refresh-10 (work in progress), June 2014.

636	   [I-D.ietf-idr-operational-message]
637	              Freedman, D., Raszuk, R., and R. Shakir, "BGP OPERATIONAL
638	              Message", draft-ietf-idr-operational-message-00 (work in
639	              progress), March 2012.

641	Author's Address

643	   Rob Shakir
644	   BT plc.
645	   pp. C3L,
646	   BT Centre,
647	   81, Newgate Street,
648	   London.  EC1A 7AJ
649	   UK

651	   Email: rob.shakir@bt.com
652	   URI:   http://www.bt.com/