idnits 2.17.1 

draft-ietf-grow-ops-reqs-for-bgp-error-handling-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 30, 2012) is 4286 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC5881' is defined on line 1070, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2858 (Obsoleted by RFC 4760)

  == Outdated reference: A later version (-13) exists of
     draft-ietf-grow-bgp-gshut-03

  == Outdated reference: A later version (-17) exists of
     draft-ietf-grow-bmp-06

  == Outdated reference: A later version (-10) exists of
     draft-ietf-idr-bgp-enhanced-route-refresh-02

  == Outdated reference: A later version (-16) exists of
     draft-ietf-idr-bgp-gr-notification-00

  == Outdated reference: A later version (-06) exists of
     draft-ietf-idr-enhanced-gr-01

  == Outdated reference: A later version (-03) exists of
     draft-zeng-idr-one-time-prefix-orf-02


     Summary: 1 error (**), 0 flaws (~~), 8 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force                                R. Shakir
3	Internet-Draft                                                        BT
4	Intended status: Informational                             July 30, 2012
5	Expires: January 31, 2013

7	Operational Requirements for Enhanced Error Handling Behaviour in BGP-4
8	           draft-ietf-grow-ops-reqs-for-bgp-error-handling-05

10	Abstract

12	   BGP-4 is utilised as a key intra- and inter-Autonomous System routing
13	   protocol in modern IP networks.  The failure modes as defined by the
14	   original protocol standards are based on a number of assumptions
15	   around the impact of session failure.  Numerous incidents both in the
16	   global Internet routing table and within Service Provider networks
17	   have been caused by strict handling of a single invalid UPDATE
18	   message causing large-scale failures in one or more Autonomous
19	   Systems.

21	   This memo describes the current use of BGP-4 within Service Provider
22	   networks, and outlines a set of requirements for further work to
23	   enhance the mechanisms available to a BGP-4 implementation when
24	   erroneous data is detected.  Whilst this document does not provide
25	   specification of any standard, it is intended as an overview of a set
26	   of enhancements to BGP-4 to improve the protocol's robustness to suit
27	   its current deployment.

29	Status of this Memo

31	   This Internet-Draft is submitted in full conformance with the
32	   provisions of BCP 78 and BCP 79.

34	   Internet-Drafts are working documents of the Internet Engineering
35	   Task Force (IETF).  Note that other groups may also distribute
36	   working documents as Internet-Drafts.  The list of current Internet-
37	   Drafts is at http://datatracker.ietf.org/drafts/current/.

39	   Internet-Drafts are draft documents valid for a maximum of six months
40	   and may be updated, replaced, or obsoleted by other documents at any
41	   time.  It is inappropriate to use Internet-Drafts as reference
42	   material or to cite them other than as "work in progress."

44	   This Internet-Draft will expire on January 31, 2013.

46	Copyright Notice

48	   Copyright (c) 2012 IETF Trust and the persons identified as the
49	   document authors.  All rights reserved.

51	   This document is subject to BCP 78 and the IETF Trust's Legal
52	   Provisions Relating to IETF Documents
53	   (http://trustee.ietf.org/license-info) in effect on the date of
54	   publication of this document.  Please review these documents
55	   carefully, as they describe your rights and restrictions with respect
56	   to this document.  Code Components extracted from this document must
57	   include Simplified BSD License text as described in Section 4.e of
58	   the Trust Legal Provisions and are provided without warranty as
59	   described in the Simplified BSD License.

61	Table of Contents

63	   1.  Introduction
64	     1.1.  Role of BGP-4 in Service Provider Networks
65	     1.2.  Overview of Operator Requirements for BGP-4 Error
66	           Handling
67	   2.  Errors within BGP-4 UPDATE Messages
68	     2.1.  Classifying BGP Errors and Expected Error Handling
69	       2.1.1.  Critical BGP Errors
70	       2.1.2.  Semantic BGP Errors
71	   3.  Avoiding use of NOTIFICATION
72	   4.  Recovering RIB Consistency
73	   5.  Reducing the Impact of Session Reset
74	   6.  Operational Toolset for Monitoring BGP
75	   7.  Operational Complexities Introduced by Altering RFC4271
76	     7.1.  Reducing the Network Impact of Session Teardown
77	   8.  IANA Considerations
78	   9.  Security Considerations
79	   10. Acknowledgements
80	   11. References
81	     11.1. Normative References
82	     11.2. Informational References
83	   Author's Address

85	1.  Introduction

87	   Where BGP-4 [RFC4271] is deployed in the Internet and Service
88	   Provider networks, numerous incidents have been recorded due to the
89	   manner in which [RFC4271] specifies errors in routing information
90	   should be handled.  Whilst the behaviour defined in the existing
91	   standards retains utility, the deployments of the protocol have
92	   changed within modern networks, resulting in significantly different
93	   demands for protocol robustness.  Whilst a number of Internet Drafts
94	   have been written to begin to enhance the behaviour of BGP-4 in terms
95	   of the handling of erroneous messages, this memo intends to define a
96	   set of requirements for ongoing work.  These requirements are
97	   considered from the perspective of a Network Operator, and hence this
98	   draft does not intend to define the protocol mechanisms by which such
99	   error handling behaviour is to be implemented.

101	1.1.  Role of BGP-4 in Service Provider Networks

103	   BGP was designed as an inter-Autonomous System (AS) routing protocol
104	   and hence many of the error handling mechanisms within the protocol
105	   specification are designed to be conducive to this role.  In general,
106	   this consideration as an inter-AS routing propagation mechanism
107	   results in the view that a BGP session propagates a relatively small
108	   amount of network-layer reachability information (NLRI) between two
109	   ASes.  In this case, session resilience is expected for those
110	   adjacencies that are key to routing continuity (for example, it is
111	   expected that two networks peering via BGP would connect multiple
112	   times to safeguard against equipment or protocol failure).  In
113	   addition, there is some expectation of multiple paths to a particular
114	   NLRI being available - it would be expected that a network can fall
115	   back to utilising alternate, less direct, paths where a failure of a
116	   more direct path occurs.

118	   Traditional network architectures would deploy an Interior Gateway
119	   Protocol (IGP) to carry infrastructure and customer routes, with an
120	   Exterior Gateway Protocol (EGP) such as BGP being utilised to
121	   propagate these routes to other Autonomous Systems.  However, with
122	   the growth of IP-based services, this is no longer considered best
123	   practice.  In order to ensure that convergence is within acceptable
124	   time bounds, the amount of routing information carried within the IGP
125	   is significantly reduced - and tends to be only infrastructure
126	   routes. iBGP is then utilised to propagate both customer, and
127	   external routes within an AS.  As such, BGP has become an IGP, with
128	   traditional IGPs acting as a means by which to propagate the routing
129	   information which is required to establish a BGP session, and reach
130	   the egress node within the local routing domain.  This change in role
131	   presents different requirements for the robustness of BGP as a
132	   routing protocol - with the expectation of a similar level of
133	   robustness to that of an IGP being set.

135	   Along with this change in role, the nature of the IP routing
136	   information that is carried has changed.  BGP has become a ubiquitous
137	   means by which service information can be propagated between devices.
138	   For instance, BGP is utilised to carry routing information for IP/
139	   MPLS VPN services as described in [RFC4364].  Since there is an
140	   existing deployment of the protocol between PE devices in numerous
141	   networks, it has been adapted to propagate this routing information,
142	   as its use limits the number of routing protocols required on each
143	   device.  This additional information being propagated represents a
144	   large change in requirement for the error handling of the protocol -
145	   where session failure occurs, it is likely that at least a subset of
146	   a network's customers experiences a complete service outage, even
147	   where the erroneous packet occurred within a different sub-topology
148	   or even service (a different address family for example).  For this
149	   reason, there is a significant demand to avoid service-affecting
150	   failures that may be triggered by routing information within a single
151	   sub-topology or service.

153	   The combination of the increased number of deployments of BGP-4 as an
154	   intra-AS routing protocol, its use for the propagation of additional
155	   types of routing and service information, and the growth of IP
156	   services has resulted in a substantial increase in the volume of
157	   information carried within BGP-4.  In numerous networks, RIB sizes of
158	   the order of millions of entries exist within individual BGP
159	   speakers, with particularly high-scale points exhibited at BGP
160	   speakers performing aggregation or functionality designed to improve
161	   utilisation of network resources (e.g., route reflector hierarchies).
162	   An increase in the amount of routing information carried in BGP
163	   results in greater impact to services during failures, which is only
164	   amplified by a corresponding increase in recovery times.  Following a
165	   failure, there is a substantial recovery time to learn, compute and
166	   distribute new paths, which results in a greater observed impact to
167	   services affected, and hence adds further weight to the requirement
168	   to avoid failures altogether or, at least, mitigate their impact to
169	   the narrowest scope possible (e.g., a specific NLRI).  Whilst an
170	   argument could be made that convergence time of BGP-4 could
171	   potentially be reduced through deployment of additional computational
172	   resource, it is notable that this solution is not necessarily
173	   straightforward from an implementation or deployment perspective
174	   (e.g., scaling computation resources within a single address-family
175	   is difficult).  Thus, significant challenges continue to exist for
176	   operators when scaling BGP-4 deployments, and hence mechanisms which
177	   improve the scalability of BGP-4 are very important.

179	   Both within Internet and multi-service routing architectures, a
180	   number of BGP sessions propagate a large proportion of the required
181	   routing information for network operation.  For Internet routing,
182	   these are typically BGP sessions which propagate the global routing
183	   table to an AS - failure of these sessions may have a large impact on
184	   network service, based on a single erroneous update.  In a multi-
185	   service environment, typical deployments utilise a small number of
186	   core-facing BGP sessions, typically towards route reflector devices.
187	   Failure of these sessions may also result in a large impact to
188	   network operation.  Clearly, the avoidance of conditions requiring
189	   these sessions to fail is of great utility to any network operator,
190	   and provides further motivation for the revision of the existing
191	   behaviour.

193	   Whilst the behaviour in [RFC4271] is suited to ensuring that BGP
194	   messages carrying erroneous routing information are limited in scope
195	   (by means of session reset), with the above considerations, it is
196	   clear that this mechanism is not suited to all deployments.  It
197	   should, however, be noted that the change in scope affects the
198	   handling only of errors occurring after BGP session establishment.
199	   There is no current operational requirement to amend the means by
200	   which error handling in session establishment, or liveness
201	   detection, are performed.

203	1.2.  Overview of Operator Requirements for BGP-4 Error Handling

205	   It is the intention of this document to define a set of criteria to
206	   which a revised error handling mechanism in BGP-4 is required to
207	   conform.  The motivation for the definition of these
208	   requirements can be summarised based on certain behaviour currently
209	   present in the protocol that is not deemed acceptable within current
210	   operational deployments, or where there is a short-fall in the tool
211	   set available to an operator.  These key requirements can be
212	   summarised as follows:

214	   o  It is unacceptable within modern deployments of the BGP-4 protocol
215	      that a single erroneous UPDATE packet affects routes that it does
216	      not carry.  This requirement therefore requires some modification
217	      to the means by which erroneous UPDATE packets are handled, and
218	      reacted to - with a particular focus on avoiding the use of the
219	      NOTIFICATION message.

221	   o  It is recognised that some error conditions that occur within the
222	      BGP-4 protocol may not always be handled gracefully, and may
223	      result in conditions whereby an implementation cannot recover.  In
224	      these (and similar) cases, it is undesirable for an operator that
225	      a reset of the BGP-4 session results in interruption to
226	      forwarding packets (by means of withdrawing routes installed by
227	      BGP-4 into a device's RIB, and subsequently FIB).  To this end,
228	      there is a requirement to define a session reset mechanism which
229	      provides session re-initialisation in a non-destructive manner.

231	   o  Further to the requirements to provide a more robust protocol, the
232	      current visibility into error conditions within the BGP-4 protocol
233	      is extremely limited - where further modifications to this
234	      behaviour are to be made, complexity is likely to be added.  Thus,
235	      to ensure that BGP-4 is manageable, there are requirements for
236	      mechanisms by which the protocol can be examined and monitored.

238	   This document describes each of these requirements in further depth,
239	   along with an overview of means by which they are expected to be
240	   achieved.  In addition, the mechanism by which the enhancements
241	   meeting these requirements are to interact is discussed.

243	2.  Errors within BGP-4 UPDATE Messages

245	   Both through analysis of incidents occurring within the Internet DFZ,
246	   and multi-service environments utilising BGP-4 to signal service or
247	   routing information, a number of different classes of errors within
248	   BGP-4 UPDATE messages have been observed.  In order to consider the
249	   applicability of enhanced error handling mechanisms, it is possible
250	   to divide these errors into a number of sub-classes, particularly
251	   focusing around the location of the error within the UPDATE message.

253	   Where an UPDATE message is considered invalid by a BGP speaker due to
254	   an error within a path attribute that is not the NLRI (where the
255	   definition of NLRI includes reachability information encoded in the
256	   MP_REACH_NLRI and MP_UNREACH_NLRI attributes as specified in
257	   [RFC4760]) it is a requirement of any enhanced error handling
258	   mechanism to handle the error in a manner focused on the NLRI
259	   contained within the message found to be erroneous.  Since in this
260	   case, the message received from the remote peer is syntactically
261	   valid, it is considered that such an UPDATE is indicative of
262	   erroneous data within one or more path attributes.  The current
263	   behaviour defined within the protocol carries the implication
264	   that the BGP speaker from whom the message is received is now an
265	   invalid path for all NLRI announced via the session - which results
266	   in a disproportionate impact to overall network operation.  In
267	   particular scenarios (such as networks with centralised BGP route
268	   reflection) such action can result in a loss of all reachability to a
269	   network.  In other contexts (such as the Internet DFZ), it cannot be
270	   assumed that the BGP speaker from whom the UPDATE message is received
271	   is directly responsible for the erroneous information contained
272	   within the message.

274	   Two further error cases exist within UPDATE messages, both of which
275	   are related to the mechanisms that are applicable to messages
276	   received where some difficulty exists in parsing the entire BGP
277	   message.  These are the case where a valid NLRI attribute can be
278	   extracted, and the case where such an attribute cannot be
279	   parsed.  In these cases, errors in the packing of
280	   attributes within a BGP message may have occurred.  Such errors are
281	   likely indicative of an error specifically caused by the remote BGP
282	   speaker.  It is, however, desirable to an operator that such errors
283	   are handled without affecting all NLRI across a BGP session.  As
284	   such, there is a key requirement to maximise the number of cases in
285	   which it is possible to extract NLRI from a BGP UPDATE message.  To
286	   this end, it is required that where possible the MP_REACH_NLRI and
287	   MP_UNREACH_NLRI attributes are utilised for encoding all NLRI
288	   (including IPv4 Unicast), and that these attributes are included
289	   first in a BGP UPDATE message (as originally recommended in
290	   [I-D.chen-ebgp-error-handling]).  Such a change to the order of
291	   inclusion of these attributes maximises the number of cases in which
292	   NLRI can be extracted from an UPDATE.  Where this is possible, it is
293	   again required that the error handling mechanisms utilised should be
294	   directly applied to the NLRI included in the UPDATE.
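
   The attribute ordering described above can be illustrated with a
   minimal Python sketch.  It assumes a hypothetical in-memory
   representation of path attributes as (type code, value) pairs; the
   function name order_attributes is an assumption of this example and
   is not taken from any specification.  The type codes 14 and 15 are
   those assigned to MP_REACH_NLRI and MP_UNREACH_NLRI in [RFC4760].

```python
# Path attribute type codes assigned in RFC 4760.
MP_REACH_NLRI = 14
MP_UNREACH_NLRI = 15


def order_attributes(attributes):
    """Return the attributes with any NLRI-carrying attribute first.

    `attributes` is a list of (type_code, value) pairs; this layout is
    an assumption made for illustration only.  sorted() is stable, so
    the relative order of all other attributes is preserved.
    """
    nlri_types = (MP_REACH_NLRI, MP_UNREACH_NLRI)
    return sorted(attributes, key=lambda attr: attr[0] not in nlri_types)
```

   With the NLRI-carrying attributes packed first, a receiver that
   encounters a malformed attribute later in the message has already
   read the NLRI to which any error handling should be applied.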

296	   For all cases whereby NLRI can be obtained from an UPDATE message, it
297	   is expected that the requirements outlined in Section 3 should be
298	   considered by any enhancement to the BGP-4 protocol.

300	   In the case that it is not possible to completely parse the NLRI
301	   attribute from the UPDATE message received from a peer, it is
302	   extremely likely that this is indicative of a serious error with
303	   either the process of attribute packing, or buffer usage on the
304	   remote BGP speaker.  In this case, clearly, it is not possible to
305	   apply any error handling mechanism that is limited to a specific set
306	   of NLRI, since an implementation has no knowledge of the NLRI
307	   included within the UPDATE message.  In addition, such errors are
308	   considered to be relatively fundamental to the operation of a BGP
309	   implementation, and hence may indicate a case whereby significant
310	   system errors have occurred.  The current BGP-4 standard results in a
311	   BGP speaker restarting a session with the remote BGP speaker.
312	   However where such an error does occur, it is required that a
313	   graceful mechanism is utilised to provide a lower impact to network
314	   operation.  The requirements for enhancements of this nature to BGP-4
315	   are outlined in Section 5, with the requirements outlined therein
316	   focused on providing a means by which system integrity can be
317	   restored whilst allowing for continued network operation.

319	2.1.  Classifying BGP Errors and Expected Error Handling

321	   It is clearly of advantage for BGP-4 implementations to utilise a
322	   consistent set of error handling mechanisms for the different types
323	   of errors that are described in Section 2, and provide consistent
324	   nomenclature to refer to them.  It is therefore suggested that errors
325	   that are indicative of larger scale failures of a BGP speaker, and
326	   hence require some error handling at the session level are referred
327	   to as 'critical' errors, whilst those errors that are identified
328	   based on incorrect content of one or more attributes of a message are
329	   referred to as 'semantic' errors.

331	   The following sections consider only those errors defined within
332	   the specifications at the time of writing.  It is
333	   recommended that in the definition of future extensions to the BGP-4
334	   specification, the error handling behaviour (and the category within
335	   which errors within the extension should be considered by an
336	   implementation) is defined.

338	2.1.1.  Critical BGP Errors

340	   As described in this document, it is of advantage to limit the number
341	   of 'critical' errors that occur within the protocol.  Therefore, based
342	   on analysis of the processing of BGP UPDATE messages, it is required
343	   that 'critical' error handling behaviour is applied to:

345	   o  UPDATE Message Length errors - whereby the specified overall
346	      UPDATE message length is inconsistent with the sum of the Total
347	      Path Attribute and Withdrawn Routes lengths.  This is
348	      indicative of message packing failure, whereby the NLRI may not be
349	      correctly extracted.

351	   o  Errors Parsing the NLRI attributes of an UPDATE message - where
352	      NLRI carried in either the IPv4-Unicast Advertised or Withdrawn
353	      routes, or in the MP_REACH_NLRI or MP_UNREACH_NLRI attributes
354	      [RFC2858], cannot be parsed, it is not possible to target error
355	      handling mechanisms to specific NLRI, and hence session level
356	      mechanisms must be utilised.

358	   It is expected that those requirements outlined in Section 5 are
359	   utilised to provide session-level handling of those errors identified
360	   as 'critical'.

362	2.1.2.  Semantic BGP Errors

364	   Where a BGP message is correctly formed, a number of cases exist
365	   whereby the contents of the UPDATE are not valid - in these cases,
366	   this represents errors that can be identified to affect specific
367	   NLRI.  The following cases are expected to be classified as semantic
368	   errors:

370	   o  Zero or invalid length errors in path attributes excluding those
371	      containing NLRI, or where the length of all path attributes
372	      contained within the UPDATE does not correspond to the total path
373	      attributes length.  In this case, the NLRI can be correctly
374	      extracted, and hence acted upon.

376	   o  Messages where invalid data or flags are contained in a path
377	      attribute that does not relate to the NLRI.

379	   o  UPDATE messages that are missing mandatory attributes, or that
380	      contain unrecognised non-optional attributes, or duplicate or
381	      invalid attributes (be they unsupported or unexpected).

383	   o  Those messages where the NEXT_HOP, or MP_REACH next-hop values are
384	      missing, of zero length, or invalid for the relevant AFI/SAFI.

386	   In these cases, it is expected that these errors can be handled
387	   gracefully, following the requirements detailed in Section 3 and
388	   Section 4 of this memo.
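
   The distinction between 'critical' and 'semantic' errors can be
   sketched in Python as follows.  This is a simplified illustration
   rather than a complete parser: it assumes the UPDATE body is passed
   without the 19-octet BGP message header, handles only the IPv4
   unicast Withdrawn Routes and NLRI fields of [RFC4271] (not
   MP_REACH_NLRI or MP_UNREACH_NLRI), and checks only attribute
   lengths.  The names classify_update and parse_ipv4_prefixes are
   assumptions of this example.

```python
import struct

CRITICAL = "critical"
SEMANTIC = "semantic"
OK = "ok"


def parse_ipv4_prefixes(data):
    """Parse <length-in-bits, prefix> tuples; raise ValueError if malformed."""
    prefixes = []
    i = 0
    while i < len(data):
        bits = data[i]
        nbytes = (bits + 7) // 8
        if bits > 32 or i + 1 + nbytes > len(data):
            raise ValueError("malformed NLRI")
        prefixes.append((bits, bytes(data[i + 1:i + 1 + nbytes])))
        i += 1 + nbytes
    return prefixes


def classify_update(body):
    """Return (classification, nlri) for an IPv4 unicast UPDATE body."""
    # Length fields that overrun the message mean the NLRI cannot be located.
    if len(body) < 4:
        return CRITICAL, []
    (withdrawn_len,) = struct.unpack_from("!H", body, 0)
    if 2 + withdrawn_len + 2 > len(body):
        return CRITICAL, []
    (attrs_len,) = struct.unpack_from("!H", body, 2 + withdrawn_len)
    attrs_start = 4 + withdrawn_len
    if attrs_start + attrs_len > len(body):
        return CRITICAL, []

    # NLRI that cannot be parsed is also handled at the session level.
    try:
        withdrawn = parse_ipv4_prefixes(body[2:2 + withdrawn_len])
        announced = parse_ipv4_prefixes(body[attrs_start + attrs_len:])
    except ValueError:
        return CRITICAL, []
    nlri = withdrawn + announced

    # The NLRI is known, so attribute length errors are 'semantic'.
    attrs = body[attrs_start:attrs_start + attrs_len]
    i = 0
    while i < len(attrs):
        if i + 3 > len(attrs):
            return SEMANTIC, nlri
        flags = attrs[i]
        if flags & 0x10:  # Extended Length bit: two-octet length field
            if i + 4 > len(attrs):
                return SEMANTIC, nlri
            (value_len,) = struct.unpack_from("!H", attrs, i + 2)
            header_len = 4
        else:
            value_len = attrs[i + 2]
            header_len = 3
        if i + header_len + value_len > len(attrs):
            return SEMANTIC, nlri
        i += header_len + value_len
    return OK, nlri
```

   In this sketch a 'semantic' result always carries the NLRI found in
   the message, which is what allows error handling to be targeted at
   those prefixes rather than at the session.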

390	3.  Avoiding use of NOTIFICATION

392	   The error handling behaviour defined in RFC4271 is problematic due to
393	   the limited options that are available to an implementation.  When an
394	   erroneous BGP message is received, at the current time, the
395	   implementation must either ignore the error, or send a NOTIFICATION
396	   message, after which it is mandatory to terminate the BGP session.
397	   It is apparent that this requirement is at odds with that of protocol
398	   robustness.

400	   There is significant complexity to this requirement.  The mechanism
401	   defined in [I-D.chen-ebgp-error-handling] describes a means by which
402	   no NOTIFICATION message is generated for all cases whereby NLRI can
403	   be extracted from an UPDATE.  The NLRI contained within the erroneous
404	   UPDATE message is considered as though the remote BGP speaker has
405	   provided an UPDATE marking it as withdrawn.  This limits the
406	   propagation of the invalid routing information, whilst also
407	   ensuring that no traffic is forwarded via a previously-known path
408	   that may no longer be valid.  This mechanism is referred to as
409	   "treat-as-withdraw".
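
   The "treat-as-withdraw" behaviour described above can be sketched in
   Python.  This is a minimal illustration under stated assumptions:
   the Adj-RIB-In for one neighbour is modelled as a dictionary keyed by
   prefix, and the function name handle_update and its parameters are
   hypothetical.  It is not the procedure specified in
   [I-D.chen-ebgp-error-handling].

```python
def handle_update(adj_rib_in, nlri, attributes, attributes_valid):
    """Apply one UPDATE to a neighbour's Adj-RIB-In.

    Returns the list of prefixes that were treated as withdrawn.  No
    NOTIFICATION is generated in either branch, so the session and all
    other routes learnt over it are left untouched.
    """
    if attributes_valid:
        for prefix in nlri:
            adj_rib_in[prefix] = attributes
        return []

    # Erroneous attributes: behave as though the neighbour had
    # withdrawn exactly the NLRI carried in this message.
    treated_as_withdrawn = []
    for prefix in nlri:
        adj_rib_in.pop(prefix, None)
        treated_as_withdrawn.append(prefix)
    return treated_as_withdrawn
```

   Returning the affected prefixes is a design choice of this sketch:
   it leaves the caller with a record of which NLRI may need to be
   recovered later.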

411	   Whilst this behaviour results in avoiding a NOTIFICATION message,
412	   keeping other routing information advertised by the remote BGP
413	   speaker within the RIB, it may result in unreachability for a sub-set
414	   of the NLRI advertised by the remote speaker.  Two cases should be
415	   considered - that where the entry for a route in the Adj-RIB-In of
416	   the neighbour propagating an erroneous packet is utilised, and that
417	   where the route installed in the device's RIB is learnt from another
418	   BGP speaker.  In the former case, should the identified NLRI not be
419	   treated as withdrawn, the original NLRI is utilised within the global
420	   RIB.  However, this information is potentially now invalid (i.e. it
421	   no longer provides a valid forwarding path), whilst an alternate
422	   (valid) path may exist in another Adj-RIB-In.  By continuing to
423	   utilise the NLRI for which the UPDATE was considered invalid, traffic
424	   may be forwarded via an invalid path, resulting in routing loops, or
425	   black-holing.  In the second case, no impact to the forwarding of
426	   traffic, or global RIB, is incurred, yet where treat-as-withdraw is
427	   implemented, possibly stale routing information is purged from the
428	   Adj-RIB-In of the neighbour propagating errors.

430	   Whilst mechanisms such as "treat-as-withdraw" are currently
431	   documented, the proposals are limited in their scope - particularly
432	   in terms of restrictions to implementation only on eBGP sessions.
433	   This limitation is made based on the view that the BGP RIB must be
434	   consistent across an autonomous system.  By implementing treat-as-
435	   withdraw for an iBGP session, one or more routers within the
436	   Autonomous System may not have reachability to a route, and hence
437	   blackholing of traffic, or routing loops, may occur.  It should,
438	   however, be considered if this view is valid, in light of the manner
439	   in which BGP is utilised within operator networks.  Inconsistency in
440	   a RIB based on a single UPDATE being treated as withdrawn may cause an
441	   inconsistency in a single sub-topology (e.g., a Layer 3 VPN service),
442	   or a service not operating completely (in the case of an UPDATE
443	   carrying service membership information).  Where a NOTIFICATION and
444	   teardown is utilised this is destructive to all sub-topologies in all
445	   address family identifiers (AFIs) carried by the session in question.
446	   Even where mechanisms such as multi-session BGP are utilised, a whole
447	   AFI is affected by such a NOTIFICATION message.  In terms of routing
448	   operation, it is therefore far less costly to endure a situation
449	   where a limited sub-set of routing information within an AS is
450	   invalid, than to consider all routing information as invalid based on
451	   a single trigger.

453	   At the time of writing, error handling mechanisms related to
454	   optional, transitive attributes - such as
455	   [I-D.ietf-idr-optional-transitive] are restricted to handling only a
456	   subset of attribute errors - whereas the operational requirement is
457	   to expand this coverage to the widest set of errors possible (i.e.,
458	   all semantic errors within UPDATE messages).  Additionally, where
459	   approaches applicable to a greater number of attributes are proposed
460	   (e.g., [I-D.chen-ebgp-error-handling]), these are limited to
461	   deployment in eBGP applications only, where requirements also exist
462	   in intra-domain cases.  As such, it is envisaged that if extended to
463	   cover these expanded cases, these mechanisms provide a means to avoid
464	   the transmission of a NOTIFICATION message to a remote BGP speaker,
465	   based on a single erroneous message, where at all possible, and hence
466	   meet this requirement.  Critical errors, including those whereby the
467	   NLRI cannot be extracted from the UPDATE message, represent cases
468	   whereby the receiving system cannot handle the error gracefully based
469	   on this mechanism.

471	4.  Recovering RIB Consistency

473	   The recommendations described in Section 3 may result in the RIB for
474	   a topology within an AS being inconsistent across the AS' internal
475	   routers.  Alternatively, where such mechanisms are deployed at an AS
476	   boundary, interconnects between two ASes may be inconsistent with
477	   each other.  There are therefore risks of traffic blackholing, due to
478	   missing routing information, or forwarding loops.  Whilst this is
479	   deemed an acceptable compromise in the short term, clearly, it is
480	   suboptimal.  Therefore, a requirement exists to provide mechanisms by
481	   which a BGP speaker is able to recover the consistency of the Adj-
482	   RIB-In for a particular neighbour.

484	   In the general case, the consistency of the BGP RIB can be recovered
485	   by requesting that the entire Adj-RIB-Out of a remote BGP speaker is
486	   re-advertised.  A mechanism to achieve this re-advertisement is
487	   defined within the ROUTE-REFRESH specification [RFC2918].  It is
488	   envisaged that by requesting a refresh of all NLRI advertised by a
489	   BGP speaker, any NLRI which has been withdrawn due to being contained
490	   within an invalid UPDATE message is re-learnt.  Where a ROUTE-REFRESH
491	   is used to directly perform a consistency check between the Adj-RIB-
492	   Out of a remote device, and the Adj-RIB-In of the local BGP speaker,
493	   a demarcation between the ROUTE-REFRESH, and normal UPDATE messages
494	   is required (in order that an "end" of the refresh can be used to
495	   identify any 'stale' NLRI) -
496	   [I-D.ietf-idr-bgp-enhanced-route-refresh] provides a means by which
497	   the ROUTE-REFRESH mechanism can be extended to meet this requirement.
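
	   The stale-marking idea above can be sketched as follows.  This is a
	   minimal illustration only: the class and method names are
	   hypothetical and are not taken from [RFC2918] or
	   [I-D.ietf-idr-bgp-enhanced-route-refresh], and the begin/end calls
	   merely stand in for whatever demarcation an extension defines.

```python
from dataclasses import dataclass, field


@dataclass
class AdjRibIn:
    """Toy Adj-RIB-In for one neighbour (hypothetical structure)."""
    routes: dict = field(default_factory=dict)  # prefix -> path attributes
    stale: set = field(default_factory=set)     # prefixes not yet re-learnt

    def begin_refresh(self):
        # Start of refresh: every known prefix is stale until re-advertised.
        self.stale = set(self.routes)

    def update(self, prefix, attributes):
        # A (re-)advertisement installs the route and clears its stale mark.
        self.routes[prefix] = attributes
        self.stale.discard(prefix)

    def end_refresh(self):
        # End of refresh: prefixes still stale were not re-advertised.
        purged = sorted(self.stale)
        for prefix in purged:
            del self.routes[prefix]
        self.stale.clear()
        return purged


rib = AdjRibIn()
rib.update("192.0.2.0/24", {"next_hop": "198.51.100.1"})
rib.update("203.0.113.0/24", {"next_hop": "198.51.100.1"})
rib.begin_refresh()
rib.update("192.0.2.0/24", {"next_hop": "198.51.100.1"})
purged = rib.end_refresh()
```

	   Without an explicit "end" marker the receiver could not know when
	   it is safe to purge the remaining stale entries.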

499	   Whilst re-advertisement of the whole BGP RIB provides a means by
500	   which withdrawn NLRI can be re-advertised, there are some scaling
501	   implications that must be considered.  In the case that a ROUTE-
502	   REFRESH is generated, all NLRI must be re-packed into UPDATE messages
503	   and advertised by one speaker on the BGP session, whilst the other
504	   must receive all UPDATE messages, and validate the RIB's consistency.
505	   In order to avoid the control-plane load, it is therefore a
506	   requirement to utilise targeted mechanisms where possible, rather
507	   than incurring the additional load on both the advertising and
508	   receiving speaker of building and processing UPDATEs for the entire
509	   contents of the RIB.

511	   It is envisaged that during routing inconsistencies caused by
512	   utilising the 'treat-as-withdraw' mechanism, the local BGP speaker is
513	   aware that some routing information was not able to be processed -
514	   due to the fact that an UPDATE message was not parsed correctly.
515	   Since this mechanism (as discussed in Section 3) requires the local
516	   BGP speaker to have determined the set of NLRI for which an erroneous
517	   UPDATE message was received, it is possible to use a targeted
518	   mechanism to re-request the specific NLRI that was contained within
519	   the erroneous UPDATE message.  Re-requesting provides the remote
520	   BGP speaker with an opportunity to re-transmit the NLRI - possibly
521	   providing an opportunity to leverage alternative methods to build the
522	   UPDATE message.  Such a request requires extension to the existing
523	   BGP-4 protocol, in terms of specific UPDATE generation filters with a
524	   transient lifetime.  It is envisaged that the work within
525	   [I-D.zeng-idr-one-time-prefix-orf] provides a mechanism allowing
526	   targeted elements of the Adj-RIB-In for a BGP neighbour to be
527	   recovered.

529	   It is of particular note for both means of recovering RIB consistency
530	   described that these are effective only when considering transient
531	   errors within an implementation - for instance, should an RFC
532	   interpretation error within an implementation be present, regardless
533	   of the number of times a specific UPDATE is generated, it is likely
534	   that this error condition will persist (as it may with the existing
535	   behaviour defined by [RFC4271]).  For this reason, there is a
536	   requirement to consider the means by which such consistency recovery
537	   mechanisms are utilised.  It is not advisable that a dynamic filter
538	   and advertisement mechanism is triggered by all error handling events
539	   due to the load this is likely to place on the neighbour receiving
540	   such a request.  Where this BGP speaker is a relatively centralised
541	   device - a route reflector (as described by [RFC4456]) for example -
542	   the act of generation of UPDATE messages with such frequency is
543	   likely to cause disproportionate load.  It is therefore an
544	   operational requirement that a means of request dampening be
545	   provided by any such extension.
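
	   One possible form of request dampening is a per-neighbour limit on
	   recovery requests within a sliding time window.  The sketch below
	   assumes this approach purely for illustration; the class name,
	   default limits and the caller-supplied clock are all hypothetical
	   and are not drawn from any specification cited here.

```python
class RecoveryDampener:
    """Allow at most max_requests recovery requests per window (seconds)."""

    def __init__(self, max_requests=3, window=60.0):
        self.max_requests = max_requests
        self.window = window
        self._sent = {}  # neighbour -> timestamps of recent requests

    def allow(self, neighbour, now):
        # Keep only the requests that still fall inside the window.
        recent = [t for t in self._sent.get(neighbour, [])
                  if now - t < self.window]
        permitted = len(recent) < self.max_requests
        if permitted:
            recent.append(now)
        self._sent[neighbour] = recent
        return permitted


dampener = RecoveryDampener(max_requests=2, window=10.0)
decisions = [dampener.allow("192.0.2.1", now=t) for t in (0.0, 1.0, 2.0, 11.0)]
```

	   A request that is refused here would be deferred or dropped rather
	   than sent to the (possibly centralised) neighbour.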

547	   In cases whereby the consistency of the Adj-RIB-In is to be restored
548	   (e.g., following the 'treat-as-withdraw' behaviour described in
549	   Section 3), and mechanisms such as those described herein are
550	   triggered, such a condition should be noted to an operator by means
551	   of a specific flag, SNMP trap, or other logging mechanism.  In order
552	   to identify the subset of NLRI that are considered to be
553	   inconsistent, this information is of operational benefit and hence
554	   should be logged.

556	5.  Reducing the Impact of Session Reset

558	   Even where protocol enhancements allow errors in the BGP-4 protocol
559	   to cease to trigger NOTIFICATION messages, and hence reset a BGP
560	   session, it is clear that some error conditions may not be exited.
561	   In particular, errors due to existing state, or memory structures,
562	   associated with a specific BGP session will not be handled.  It is
563	   therefore important to consider how these error conditions are
564	   currently handled by the protocol.  It should be noted that the
565	   following discussion and analysis considers only those NOTIFICATION
566	   messages generated in response to errors in UPDATE messages (as
567	   defined by Section 6.3 in [RFC4271]).

569	   The existing NOTIFICATION behaviour triggers a reset of all elements
570	   of the BGP-4 session, as described in Section 6 of [RFC4271].  It is
571	   expected that session teardown requires an implementation to re-
572	   initialise all structures and state required for session maintenance.
573	   Clearly, there is some utility to this requirement, as error
574	   conditions in BGP are, in general, exited from.  However, this
575	   definition is responsible for the forwarding outages within networks
576	   utilising BGP for propagation of routing or service when each error
577	   is experienced.  The requirement described in Section 3 is intended
578	   to reduce the cases whereby a NOTIFICATION is required, however, any
579	   mechanism implemented as a response to this requirement by definition
580	   cannot provide a session reset to the extent of that achieved by the
581	   current behaviour.

583	   In order to address this, there is a requirement for a means by which
584	   a BGP speaker can signal that an unhandled error condition in an
585	   UPDATE message occurred - requiring a session reset - yet also
586	   continue to utilise the paths advertised by the neighbour that are
587	   currently in use within the RIB.  In this case, the Adj-RIB-In
588	   received from the neighbour is not considered invalid, despite a
589	   NOTIFICATION, and session reset, being required.  This set of
590	   requirements is akin to those answered by the BGP Graceful Restart
591	   mechanism described in [RFC4724].  Since the operational requirement
592	   in this case is to provide a means to achieve a complete session
593	   restart without disrupting the forwarding path of those routes in use
594	   within a BGP speaker's RIB, it is expected that utilising a procedure
595	   similar to the Graceful Restart mechanism meets the error handling
596	   requirement.  Responding to an error condition (repeated or
597	   otherwise) with a message indicating that an error that cannot be
598	   handled has occurred, forcing session reset, whilst retaining
599	   forwarding information within the RIB, allows forwarding to all routes
600	   within a system's RIB to continue during the period in which the
601	   session restarts.  It is envisaged that the additional complexity
602	   introduced by the introduction of such a mechanism can be limited by
603	   extending existing BGP messages - one such approach is proposed in
605	   [I-D.ietf-idr-bgp-gr-notification].  By placing a time bound on the
606	   restart lifetime, should an error condition not be transient - for
607	   example, should an error have occurred with the BGP process, rather
608	   than with a specific BGP session - the remote BGP speaker is still
609	   detected as an invalid device for forwarding.
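
	   The time-bounded retention described above can be sketched as
	   follows.  The class, its method names and the restart_time
	   parameter are hypothetical; the sketch only illustrates retaining
	   routes across a reset and discarding them once the bound expires.

```python
class RetainedAdjRibIn:
    """Routes learnt from one neighbour, retained across a session reset."""

    def __init__(self, routes, restart_time):
        self.routes = set(routes)
        self.restart_time = restart_time  # seconds routes stay usable
        self.reset_at = None              # time of reset, None if session up

    def session_reset(self, now):
        # The session is reset, but the routes are kept for forwarding.
        self.reset_at = now

    def session_established(self):
        # The session came back within the bound; retention ends.
        self.reset_at = None

    def usable_routes(self, now):
        expired = (self.reset_at is not None
                   and now - self.reset_at > self.restart_time)
        if expired:
            # The neighbour did not return in time: stop forwarding to it.
            self.routes.clear()
        return set(self.routes)


rib = RetainedAdjRibIn({"192.0.2.0/24"}, restart_time=120.0)
rib.session_reset(now=0.0)
during_restart = rib.usable_routes(now=60.0)
after_timeout = rib.usable_routes(now=121.0)
```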

611	   In some cases, the erroneous condition may be due to corruption of
612	   the Adj-RIB-Out on the advertising BGP speaker - rather than caused
613	   by the receiving speaker's state.  In these cases, where existing
614	   structures are replayed whilst performing graceful restart
615	   functionality, the error condition is not necessarily resolved.
616	   Therefore, it is recommended that during a session restart event, as
617	   described within this section, the advertising speaker purge and
618	   rebuild RIB structures, in order to resolve any corruption within
619	   these structures.

621	   It should be noted that a protocol enhancement meeting this
622	   requirement is not able to solve all error conditions - however, a
623	   complete restart of the BGP and TCP session between two BGP speakers
624	   implements an identical recovery mechanism to that which is achieved
625	   by the existing behaviour.  Where an error condition such as memory
626	   or configuration corruption has occurred in a BGP implementation, it
627	   is expected that a mechanism meeting this requirement continues to
628	   detect this, by means of a bound on time for session restart to
629	   occur.  Whilst there may be some consideration that packets continue
630	   to be forwarded through a device which can be in a failure mode of
631	   this nature for a longer period due to this requirement, the
632	   architecture of modern IP routers should be considered.  A divided
633	   forwarding and control plane is common in many devices, as well as
634	   process separation for software-based devices - corruption of a
635	   specific protocol daemon does not necessarily imply forwarding is
636	   affected.  Indeed, where forwarding behaviour of a device is
637	   affected, it is envisaged that a failure detection mechanism (be it
638	   Bidirectional Forwarding Detection, or indeed BGP KEEPALIVE packets)
639	   will detect such a failure in almost all cases, with the symptomatic
640	   behaviour of such a failure being an invalid UPDATE message in very
641	   few other cases.

643	6.  Operational Toolset for Monitoring BGP

645	   A significant complexity that is introduced through the requirements
646	   defined in this document is that of monitoring BGP session status for
647	   an operator.  Although the existing error handling behaviour causes a
648	   disproportionate failure, session failure is extremely visible to
649	   most operational personnel within a Network Operator due to both
650	   existing definitions of SNMP trap mechanisms for BGP, along with the
651	   forwarding impact typically caused by such a failure.  By introducing
652	   mechanisms by which errors of this nature are not as visible, this is
653	   no longer the case.  There is a requirement that where subsets of the
654	   RIB on a device are no longer reachable from a BGP speaker, or indeed
655	   an AS, that some visibility of this situation, alongside a mechanism
656	   to determine the cause, is available to an operator.  Whilst, to some
657	   extent, this can be solved by mandating a sub-requirement of each of
658	   the aforementioned requirements that a BGP speaker must log where
659	   such errors occur, and are hence handled, this does not solve all
660	   cases.  In order to clarify this requirement, the example of the
661	   transmission of an erroneous Optional Transitive attribute can be
662	   considered.  Since, by definition, there is no requirement for all
663	   BGP speakers to parse such an attribute, a receiving router may treat
664	   NLRI as withdrawn based on an erroneous attribute not examined by its
665	   neighbour.  In this case, the upstream device or network, propagating
666	   the UPDATE, has no visibility of this error.  Operationally, however,
667	   it is of interest to the upstream router operator that such invalid
668	   information was propagated.

670	   The requirement for logging of error conditions in transmitted BGP
671	   messages, which are visible to only the receiver, cannot be achieved
672	   by any existing BGP message, or capability.  It is envisaged that
673	   each erroneous event should be transmitted to the remote peer -
674	   including the information as to the set of NLRI that were considered
675	   invalid.  Whilst with some mechanisms this is achieved by default
676	   (for example, One-Time Prefix ORF [I-D.zeng-idr-one-time-prefix-orf]
677	   (Outbound Route Filtering) will transmit the set of routes that are
678	   required), the operator requirement is to know which routes may have
679	   been unreachable in all cases.  It is envisaged that an extension to
680	   meet this requirement will allow for such information to be
681	   transmitted between peers, and hence logged.  Such a mechanism may
682	   provide further utility as either a diagnostic or logging toolset.

684	   As such, it is possible to divide the messages that are required in
685	   order to provide further visibility into BGP for an operator.  Such a
686	   division can be made both due to the required means of message
687	   transmission, alongside the criticality of each request.

689	   o  Messages required to replace NOTIFICATION - In cases where the
690	      error handling mechanisms defined by [RFC4271] currently result in
691	      a NOTIFICATION message being generated, a number of the
692	      requirements detailed within this document result in this message
693	      being suppressed.  Despite this change, the error condition's
694	      occurrence is still of interest to an operator in order to provide
695	      both monitoring and troubleshooting capabilities, since some form
696	      of invalid data has been received on a session.  It is therefore
697	      considered that an implementation must generate a message, both
698	      locally and for transmission to the remote peer, based on such a
699	      condition.  Where such a message is transmitted to the remote
700	      peer, it is considered that the BGP session via which the
701	      erroneous UPDATE message was received should be used as transport
702	      to the remote peer.  The information transmitted in such a message
703	      should be minimised to allow identification of the paths which
704	      were considered erroneous (i.e. restricting the information to
705	      that which is directly relevant to a network operator in the case
706	      of an error condition occurring).  Any delay to convergence on the
707	      session in question is considered to be acceptable, given the
708	      suboptimal nature of the reception of invalid routing information
709	      via a BGP session.  Further concerns regarding such a mechanism
710	      relate to the load generated on the BGP speaker in question,
711	      however, it must be considered that in the case of an erroneous
712	      UPDATE being received, and the 'treat-as-withdraw' mechanism being
713	      utilised, where the erroneous path is removed from the Loc-RIB,
714	      there is likely to be a requirement to generate UPDATE messages
715	      withdrawing the route from all further BGP speakers to which the
716	      prefix is advertised.  The load generated by the generation of
717	      such UPDATEs is likely to be much greater than that of
718	      transmitting error information via a logging message type back to
719	      the speaker from which it was received.  It is envisaged that
720	      light-weight BGP message-based signalling mechanisms such as the
721	      ADVISORY message types detailed in
722	      [I-D.ietf-idr-operational-message] provide a suitable means to
723	      satisfy this requirement.

725	   o  Additional Diagnostic Capabilities for BGP - In a number of cases,
726	      there is an operational requirement to further debug erroneous BGP
727	      UPDATE messages, along with the particulars of the state of a BGP
728	      speaker.  For instance, where an invalid BGP UPDATE message is
729	      transmitted between two BGP speakers, the exact format of the
730	      UPDATE message is of interest to an operator, as this information
731	      provides a clear indication of a message considered to be
732	      erroneous by the BGP speaker to which it was transmitted.  In this
733	      case, it is considered of great utility that the entire UPDATE
734	      message is transmitted back to the advertising speaker, in order
735	      to allow for further debugging to occur.  Whilst such information
736	      is particularly useful to an operator, it clearly provides
737	      information that is not key to protocol operation - for this
738	      reason, it is expected that the additional complexity and load
739	      to which a BGP speaker is subjected by such messages may not be
740	      acceptable.  For this reason, it is required that where
741	      mechanisms are developed to support this requirement, messages of
742	      this nature can be supported both within an existing BGP session,
743	      and via a dedicated separate session, be it BGP carrying messages
744	      such as those defined in [I-D.ietf-idr-operational-message] or a
745	      dedicated monitoring protocol akin to BMP described in
746	      [I-D.ietf-grow-bmp].

748	   Whilst the operational requirement for such monitoring tools to allow
749	   for visibility into BGP is clearly agreed upon, the means by which
750	   such messages are transmitted between two BGP speakers is likely to
751	   be dependent upon the positions of the speakers in question (for
752	   instance, the requirements for such a protocol may differ where a
753	   session is between two ASBRs under separate administration).  The
754	   introduction of additional message types to the BGP protocol clearly
755	   introduces further complexity - and leaves room for further
756	   implementation and standardisation errors that may compromise the
757	   robustness of the BGP protocol.  In addition, the queuing and
758	   scheduling of these BGP messages must be interleaved with the
759	   transmission of the key protocol messages - such as KEEPALIVE and
760	   UPDATE packets.  It is therefore a concern that should a large number
761	   of messages specifically for operational visibility be transmitted,
762	   this will delay the transmission of UPDATE packets, and hence
763	   adversely affect the end-to-end convergence time for NLRI carried
764	   within BGP.  The operational reasons why it is advantageous for such
765	   messages to be in-band to the protocol should also be considered.
766	   In particular, it should be noted that where such information is to
767	   be transmitted between administrative boundaries a BGP session
768	   represents an existing channel between the two ASes.  This channel is
769	   considered to be secure insofar as the routing information, and
770	   requests sent via the session are considered to come from a trusted
771	   source.  Since error information both relates to a particular
772	   attachment, and is key to ensuring that such a session is operating
773	   as expected, it is considered of great operational benefit that this
774	   information is transmitted over this channel.  In addition, the
775	   overall system scalability is improved by such in-band transmission.
776	   It is expected that erroneous information resulting in the 'treat-as-
777	   withdraw' mechanism being utilised is relatively infrequently
778	   transmitted between two peers (when compared to the frequency of
779	   UPDATE message transmission).  The impact of including an additional
780	   BGP message type for such operational visibility is relatively small
781	   from a resource utilisation perspective - additional processing
782	   overhead is only experienced when such a message is received.  Where
783	   a separate session is maintained, particular network elements within
784	   a service provider topology may require hundreds, or thousands, of
785	   additional sessions for the transmission of this information.  Such
786	   a resource consumption overhead is likely to be unacceptable to some
787	   network operators.

789	   For the reasons explained above, it is expected that mechanisms
790	   specified to meet the requirements for event visibility consider the
791	   relative impacts of additional monitoring sessions, or message
792	   inclusion in band to BGP in order not to compromise the security,
793	   scalability and robustness of the BGP-4 protocol.

795	7.  Operational Complexities Introduced by Altering RFC4271

797	   The existing NOTIFICATION and subsequent teardown of a BGP session
798	   upon encountering an error has the advantage that a consistent
799	   approach to error handling is required of all implementations of the
800	   BGP-4 protocol.  This is of operational advantage as it provides a
801	   clear expectation of the behaviour of the protocol.  The requirements
802	   defined herein add further complexity to the error-handling within
803	   BGP, and hence are liable to compromise the existing deterministic
804	   protocol behaviour.  It is therefore deemed that there is a further
805	   requirement to define a set of recommended behaviours based on the
806	   reception of a particular class of erroneous UPDATE message,
807	   alongside highlighting some of the implementation complexities that
808	   may need to be handled in the case that particular recommendations
809	   made within this memo are deployed.

811	   Utilising the classes of erroneous UPDATE message described in
812	   Section 2, the recommended behaviour for a BGP-4 implementation can
813	   be divided into two branches.  Primarily, where a semantic error is
814	   identified, an implementation is expected to utilise the reduced-
815	   impact error handling approach, as described in Section 3.  In the
816	   case that such an approach results in known NLRI being withdrawn from
817	   the BGP speaker's RIB, and an implementation provides functionality
818	   such that these errors are recovered from through an automatically
819	   triggered means, such as those described within Section 4, some
820	   consideration of the scalability of these recovery mechanisms is
821	   required.  Clearly, there is a computational and bandwidth overhead
822	   associated with the re-advertisement of NLRI between two BGP speakers
823	   - due to the generation of UPDATE messages, their transmission
824	   between the two speakers, and the parsing and processing into the RIB
825	   required.  This overhead is directly proportional to the number of
826	   UPDATE messages that are required.  Where a semantic error is
827	   experienced, by definition the NLRI contained within the UPDATE can
828	   be extracted.  It is therefore possible to minimise the proportion of
829	   the RIB that is re-advertised by targeting any recovery mechanism on
830	   the NLRI contained within the erroneous UPDATE.  Such a targeted
831	   mechanism can be achieved through a means such as One-Time ORF, or
832	   other means of targeting UPDATE messages not discussed within this
833	   memo.  It is recommended that where available, any automatically (or
834	   manually) triggered recovery mechanism utilises such targeted
835	   means in preference to any whole RIB refresh mechanism (such as
836	   ROUTE-REFRESH).
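
	   The preference for targeted recovery over a whole-RIB refresh can
	   be expressed as a simple selection step.  The function below is a
	   hypothetical sketch: its name, its arguments and the returned
	   labels are illustrative only and do not correspond to any defined
	   BGP capability or message.

```python
def choose_recovery(affected_nlri, targeted_supported):
    """Pick a recovery method for NLRI lost to an erroneous UPDATE.

    affected_nlri: prefixes extracted from the erroneous UPDATE (may be
    empty if the NLRI could not be determined).
    targeted_supported: whether a targeted mechanism (e.g. a one-time
    ORF style request) was negotiated with the neighbour.
    """
    if affected_nlri and targeted_supported:
        # Re-request only the prefixes known to be inconsistent.
        return ("targeted", sorted(affected_nlri))
    # Fall back to re-advertisement of the whole Adj-RIB-Out.
    return ("full-refresh", None)


targeted = choose_recovery({"203.0.113.0/24", "192.0.2.0/24"}, True)
fallback = choose_recovery({"192.0.2.0/24"}, False)
```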

838	   In the case that an erroneous UPDATE has been processed through a
839	   means such as treat-as-withdraw (described within Section 3), a
840	   recovering mechanism may be considered superfluous, if the assumption
841	   is made that the RIB inconsistency will only be recovered from based
842	   on a path re-convergence (or change in BGP attribute) for the
843	   advertising BGP speaker.  However, where this assumption is not
844	   considered to provide adequate recovery behaviour, and a mechanism to
845	   restore RIB consistency automatically is implemented, some
846	   consideration must be made for where repeated erroneous messages
847	   occur.  In this case, in order to limit the impact to the BGP
848	   speaker's network operation, at a pre-defined point it is recommended
849	   that such automatic recovery mechanisms towards the BGP speaker from
850	   which erroneous UPDATEs are repeatedly received are suppressed, and
851	   the fact that such suppression has occurred is highlighted to an
852	   operator.  The point at which such behaviour is suppressed is to be
853	   defined on a per-implementation basis, taking into account feedback
854	   from the Network Operator community based on the deployment of the
855	   recommendations described in this document.  It is expected that such
856	   trigger points are dependent upon the mechanisms implemented for a
857	   particular BGP-4 implementation, and the impact upon the speaker of
858	   these means of RIB recovery.
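
	   A minimal sketch of such suppression is given below, assuming a
	   simple per-neighbour error count.  The threshold value, the class
	   name and the log text are hypothetical; as noted above, the actual
	   trigger point is left to each implementation.

```python
class RecoverySuppressor:
    """Stop automatic recovery towards neighbours that keep sending errors."""

    def __init__(self, threshold=5):
        self.threshold = threshold  # hypothetical per-implementation value
        self.errors = {}            # neighbour -> erroneous UPDATE count
        self.suppressed = set()
        self.log = []               # stands in for a trap or syslog message

    def record_error(self, neighbour):
        """Return True if automatic recovery may still be triggered."""
        if neighbour in self.suppressed:
            return False
        self.errors[neighbour] = self.errors.get(neighbour, 0) + 1
        if self.errors[neighbour] >= self.threshold:
            # Suppress further recovery and highlight this to the operator.
            self.suppressed.add(neighbour)
            self.log.append(f"automatic recovery suppressed for {neighbour}")
            return False
        return True


suppressor = RecoverySuppressor(threshold=3)
results = [suppressor.record_error("192.0.2.1") for _ in range(4)]
```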

860	   Where critical errors are experienced, such that a session reset is
861	   required, the mechanism discussed in Section 5 should be used.
862	   Again, since such a mechanism results in a restart of a BGP session,
863	   it is expected that all NLRI carried over the session is re-advertised
864	   as it is re-established, incurring processing overhead on both the
865	   advertising and receiving BGP speaker.  In order to minimise the
866	   consumption of control-plane computational resource on both speakers,
867	   it is recommended that mechanisms allowing a reduced set of BGP
868	   UPDATE messages to be re-transmitted between two speakers are
869	   employed wherever possible - for instance through employing
870	   mechanisms such as those described in [I-D.ietf-idr-enhanced-gr].

872	   In the case that repeated critical errors occur, the overhead of
873	   performing any mechanism implemented based on the requirements in
874	   Section 5 is incurred following each erroneous UPDATE message.  Since
875	   these mechanisms are, by definition, performed automatically in
876	   response to the erroneous message being received, similar
877	   considerations as to the impact to the BGP speaker must be taken into
878	   account.  As such, it is expected that after a certain trigger level,
879	   the ongoing receipt of critical errors within BGP UPDATE messages is
880	   deemed to be indicative of a long-lasting failure, and a session no
881	   longer considered viable.  Where such a case is experienced, it is
882	   expected that the BGP session reverts to the standard session failure
883	   behaviour, as described in [RFC4271] and documents updating this base
884	   standard.  Where such a reversion is implemented this condition
885	   should be flagged to a network operator.  The number of restart
886	   attempts before the session reverts to being shut down should be
887	   determined based on the overhead of the recovery mechanisms
888	   implemented (for instance, where [I-D.ietf-idr-enhanced-gr] is
889	   implemented, the impact of session restart may be significantly
890	   lower), and operational experience of the deployment of the
891	   recommendations described in this document.

893	   Since repeated erroneous UPDATE messages which experience critical
894	   errors may be indicative of long-lasting failure modes, it is
895	   recommended that a back-off from restarting BGP sessions experiencing
896	   such behaviour is implemented.  As such, this is not applicable to
897	   restart behaviour through means such as those described in Section 5
898	   since such restarts are time-bound based on the period for which the
899	   Adj-RIB-In from a BGP speaker is maintained as valid (e.g., when
900	   considering BGP Graceful Restart, such restarts are time-bound by the
901	   Restart Time described in [RFC4724]).  However, following a session
902	   reverting to being pulled down based on repeated error conditions, it
903	   is recommended that following restart attempts are subject to an
904	   exponentially increasing interval between subsequent attempts.  It is
905	   therefore recommended that in such cases an implementation implements
906	   the increasing values of IdleHoldTimer as described in the BGP-4 FSM
907	   documented in [RFC4271].
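
	   The exponentially increasing interval can be illustrated with the
	   short sketch below.  The initial value, multiplier and ceiling are
	   assumptions chosen for the example and are not values mandated by
	   [RFC4271]; the function name is likewise hypothetical.

```python
def idle_hold_intervals(initial=30.0, factor=2.0, maximum=960.0, attempts=6):
    """Return the wait (seconds) applied before each restart attempt."""
    intervals = []
    interval = initial
    for _ in range(attempts):
        intervals.append(interval)
        # Grow the interval for the next attempt, up to a ceiling.
        interval = min(interval * factor, maximum)
    return intervals


schedule = idle_hold_intervals(attempts=7)
```

	   Capping the interval keeps a long-failed session from waiting
	   indefinitely between attempts once the underlying fault is fixed.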

909	7.1.  Reducing the Network Impact of Session Teardown

911	   As discussed within the preceding section, where repeated critical
912	   UPDATE message errors are received, it is recommended that the impact
913	   to both the advertising and receiving BGP-4 speakers be limited by
914	   reverting to tearing down the BGP-4 session experiencing such errors.
915	   The BGP-4 specification presented in [RFC4271] achieves such a
916	   session shutdown by sending a NOTIFICATION message, however, this has
917	   the net result that all downstream BGP speakers (i.e. those to whom
918	   the routes carried over the now ceased BGP session were readvertised)
919	   must withdraw these routes from their RIB, and perform a best-path
920	   selection if required.  In some cases, there may be no alternate path
921	   available, and hence a period of time for which no valid BGP route
922	   exists.  Particularly, this is very likely to occur where an upstream
923	   BGP speaker performs a best-path selection and advertises only a
924	   single path to its neighbours - there is a requirement for the
925	   upstream speaker to perform a best-path selection, and re-advertise a
926	   new set of NLRI before the downstream system is able to converge to a
927	   new path.  It should be noted that where UPDATE messages withdrawing
928	   NLRI are not subject to the BGP session's configured
929	   MinRouteAdvertisementInterval (MRAI) [RFC4271], but re-advertisements
930	   are, this may result in a BGP speaker being without a path for a
931	   period up to the MRAI.

933	   Clearly, it is advantageous to avoid this period of time for which
934	   there may be no reachability for a set of routes, especially since
935	   the BGP speaker terminating a particular session is doing so due to a
936	   particular error handling policy.  The graceful shutdown mechanism
937	   detailed in [I-D.ietf-grow-bgp-gshut] provides a mechanism by which a
938	   BGP speaker is able to signal that a set of routes are to be
939	   withdrawn, and hence allow downstream systems to pre-emptively
940	   perform a best-path selection, and hence advertise new reachability
941	   information in a make-before-break manner.

943	   It is therefore envisaged that, where a session is to be shut down
944	   based on a trigger relating to erroneous UPDATE messages being
945	   received (be they repeated or not), the graceful shutdown
946	   procedure is utilised, so as to reduce the forwarding impact of
947	   routes received on the session being withdrawn.

949	8.  IANA Considerations

951	   This memo includes no request to IANA.

953	9.  Security Considerations

955	   The requirements outlined in this document provide mechanisms by
956	   which erroneous BGP messages may be responded to with limited impact
957	   to forwarding operation.  This is of benefit to the security of a BGP
958	   speaker in general.  Where UPDATE messages may have been propagated
959	   by a single malicious Autonomous System or router within a network
960	   (or the Internet default free zone - DFZ), which are then propagated
961	   to all devices within the same routing domain, all other NLRI
962	   available over the same session become unreachable.  This mechanism
963	   may provide means by which an Autonomous System can be isolated from
964	   required routing domains (such as the Internet), should the relevant
965	   UPDATE messages be propagated via specific paths.  By reducing the
966	   impact of such failures, it is envisaged that this possibility may be
967	   constrained to a specific set of NLRI, or a specific topology.

969	   Some mechanisms meeting the requirements specified in this document,
970	   particularly those within Section 6 may provide further security
971	   concerns, however, it is envisaged that these are addressed in per-
972	   enhancement memos.

974	10.  Acknowledgements

976	   The author would like to thank the following network operators for
977	   their insight and valuable input in defining the requirements for a
978	   variety of operational deployments of the BGP-4 protocol: Shane
979	   Amante, Bruno Decraene, Rob Evans, David Freedman, Wes George, Tom
980	   Hodgson, Sven Huster, Jonathan Newton, Neil McRae, Thomas Mangin, Tom
981	   Scholl and Ilya Varlashkin.

983	   In addition, many thanks are extended to Jeff Haas, Wim Hendrickx,
984	   Tony Li, Alton Lo, Keyur Patel, John Scudder, Adam Simpson and Robert
985	   Raszuk for their expertise relating to implementations of the BGP-4
986	   protocol.

988	11.  References

990	11.1.  Normative References

992	   [RFC2858]  Bates, T., Rekhter, Y., Chandra, R., and D. Katz,
993	              "Multiprotocol Extensions for BGP-4", RFC 2858, June 2000.

995	   [RFC2918]  Chen, E., "Route Refresh Capability for BGP-4", RFC 2918,
996	              September 2000.

998	   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
999	              Protocol 4 (BGP-4)", RFC 4271, January 2006.

1001	   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
1002	              Networks (VPNs)", RFC 4364, February 2006.

1004	   [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
1005	              Reflection: An Alternative to Full Mesh Internal BGP
1006	              (IBGP)", RFC 4456, April 2006.

1008	   [RFC4724]  Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y.
1009	              Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724,
1010	              January 2007.

1012	   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
1013	              "Multiprotocol Extensions for BGP-4", RFC 4760,
1014	              January 2007.

1016	11.2.  Informational References

1018	   [I-D.chen-ebgp-error-handling]
1019	              Chen, E., Mohapatra, P., and K. Patel, "Revised Error
1020	              Handling for BGP Updates from External Neighbors",
1021	              draft-chen-ebgp-error-handling-01 (work in progress),
1022	              September 2011.

1024	   [I-D.ietf-grow-bgp-gshut]
1025	              Francois, P., Decraene, B., Pelsser, C., Patel, K., and C.
1026	              Filsfils, "Graceful BGP session shutdown",
1027	              draft-ietf-grow-bgp-gshut-03 (work in progress),
1028	              December 2011.

1030	   [I-D.ietf-grow-bmp]
1031	              Scudder, J., Fernando, R., and S. Stuart, "BGP Monitoring
1032	              Protocol", draft-ietf-grow-bmp-06 (work in progress),
1033	              December 2011.

1035	   [I-D.ietf-idr-bgp-enhanced-route-refresh]
1036	              Patel, K., Chen, E., and B. Venkatachalapathy, "Enhanced
1037	              Route Refresh Capability for BGP-4",
1038	              draft-ietf-idr-bgp-enhanced-route-refresh-02 (work in
1039	              progress), June 2012.

1041	   [I-D.ietf-idr-bgp-gr-notification]
1042	              Patel, K., Fernando, R., and J. Scudder, "Notification
1043	              Message support for BGP Graceful Restart",
1044	              draft-ietf-idr-bgp-gr-notification-00 (work in progress),
1045	              December 2011.

1047	   [I-D.ietf-idr-enhanced-gr]
1048	              Patel, K., Chen, E., Fernando, R., and J. Scudder,
1049	              "Accelerated Routing Convergence for BGP Graceful
1050	              Restart", draft-ietf-idr-enhanced-gr-01 (work in
1051	              progress), June 2012.

1053	   [I-D.ietf-idr-operational-message]
1054	              Freedman, D., Raszuk, R., and R. Shakir, "BGP OPERATIONAL
1055	              Message", draft-ietf-idr-operational-message-00 (work in
1056	              progress), March 2012.

1058	   [I-D.ietf-idr-optional-transitive]
1059	              Scudder, J., Chen, E., Mohapatra, P., and K. Patel,
1060	              "Revised Error Handling for BGP UPDATE Messages",
1061	              draft-ietf-idr-optional-transitive-04 (work in progress),
1062	              October 2011.

1064	   [I-D.zeng-idr-one-time-prefix-orf]
1065	              Zeng, Q., Dong, J., Heitz, J., Patel, K., Shakir, R., and
1066	              Z. Huang, "One-time Address-Prefix Based Outbound Route
1067	              Filter for BGP-4", draft-zeng-idr-one-time-prefix-orf-02
1068	              (work in progress), July 2012.

1070	   [RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
1071	              (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
1072	              June 2010.

1074	Author's Address

1076	   Rob Shakir
1077	   BT
1078	   pp C3L
1079	   BT Centre
1080	   81, Newgate Street
1081	   London  EC1A 7AJ
1082	   UK

1084	   Email: rob.shakir@bt.com
1085	   URI:   http://www.bt.com/