idnits 2.17.1

draft-ietf-grow-ops-reqs-for-bgp-error-handling-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 20, 2011) is 4571 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC5881' is defined on line 990, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2858 (Obsoleted by RFC 4760)

  == Outdated reference: A later version (-17) exists of
     draft-ietf-grow-bmp-05

  == Outdated reference: A later version (-04) exists of
     draft-ietf-idr-optional-transitive-03

     Summary: 1 error (**), 0 flaws (~~), 4 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Internet Engineering Task Force                                R. Shakir
Internet-Draft                                                       C&W
Intended status: Informational                          October 20, 2011
Expires: April 22, 2012

 Operational Requirements for Enhanced Error Handling Behaviour in BGP-4
           draft-ietf-grow-ops-reqs-for-bgp-error-handling-02

Abstract

   BGP-4 is utilised as a key intra- and inter-Autonomous System routing
   protocol in modern IP networks. The failure modes as defined by the
   original protocol standards are based on a number of assumptions
   around the impact of session failure. Numerous incidents both in the
   global Internet routing table and within Service Provider networks
   have been caused by strict handling of a single invalid UPDATE
   message causing large-scale failures in one or more Autonomous
   Systems.

   This memo describes the current use of BGP-4 within Service Provider
   networks, and outlines a set of requirements for further work to
   enhance the mechanisms available to a BGP-4 implementation when
   erroneous data is detected. Whilst this document does not provide
   specification of any standard, it is intended as an overview of a set
   of enhancements to BGP-4 to improve the protocol's robustness to suit
   its current deployment.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 22, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Role of BGP-4 in Service Provider Networks
     1.2.  Overview of Operator Requirements for BGP-4 Error Handling
   2.  Errors within BGP-4 UPDATE Messages
     2.1.  Classifying BGP Errors and Expected Error Handling
       2.1.1.  Critical BGP Errors
       2.1.2.  Semantic BGP Errors
   3.  Avoiding use of NOTIFICATION
   4.  Recovering RIB Consistency
   5.  Reducing the Impact of Session Reset
   6.  Operational Toolset for Monitoring BGP
   7.  Operational Complexities Introduced by Altering RFC4271
     7.1.  Reducing the Network Impact of Session Teardown
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1. Normative References
     11.2. Informational References
   Author's Address

1.  Introduction

   Where BGP-4 [RFC4271] is deployed in the Internet and Service
   Provider networks, numerous incidents have been recorded due to the
   manner in which [RFC4271] specifies that errors in routing
   information should be handled. Whilst the behaviour defined in the
   existing standards retains utility, the deployments of the protocol
   have changed within modern networks, resulting in significantly
   different demands for protocol robustness. Whilst a number of
   Internet Drafts have been written to begin to enhance the behaviour
   of BGP-4 in terms of the handling of erroneous messages, this memo
   intends to define a set of requirements for ongoing work. These
   requirements are considered from the perspective of a Network
   Operator, and hence this draft does not intend to define the protocol
   mechanisms by which such error handling behaviour is to be
   implemented.

1.1.  Role of BGP-4 in Service Provider Networks

   BGP was designed as an inter-Autonomous System (AS) routing protocol
   and hence many of the error handling mechanisms within the protocol
   specification are designed to be conducive to this role. In general,
   this consideration as an inter-AS routing propagation mechanism
   results in the view that a BGP session propagates a relatively small
   amount of network-layer reachability information (NLRI) between two
   ASes. In this case, there is an expectation of session resilience
   for those adjacencies that are key to routing continuity (for
   example, it is expected that two networks peering via BGP would
   connect multiple times in order to safeguard against equipment or
   protocol failure).
   In addition, there is some expectation of multiple paths to a
   particular NLRI being available - it would be expected that a
   network can fall back to utilising alternate, less direct, paths
   where a failure of a more direct path occurs.

   Traditional network architectures would deploy an Interior Gateway
   Protocol (IGP) to carry infrastructure and customer prefixes, with an
   Exterior Gateway Protocol (EGP) such as BGP being utilised to
   propagate these prefixes to other Autonomous Systems. However, with
   the growth of IP-based services, this is no longer considered best
   practice. In order to ensure that convergence is within acceptable
   time bounds, the amount of routing information carried within the IGP
   is significantly reduced - and tends to be only infrastructure
   prefixes. iBGP is then utilised to propagate both customer and
   external prefixes within an AS. As such, BGP has become an IGP, with
   traditional IGPs acting as a means by which to propagate the routing
   information which is required to establish a BGP session, and reach
   the egress node within the local routing domain. This change in role
   presents different requirements for the robustness of BGP as a
   routing protocol - with the expectation of a similar level of
   robustness to that of an IGP being set.

   Along with this change in role, the nature of the IP routing
   information that is carried has changed. BGP has become a ubiquitous
   means by which service information can be propagated between devices.
   For instance, BGP is utilised to carry routing information for
   IP/MPLS VPN services as described in [RFC4364]. Since there is an
   existing deployment of the protocol between PE devices in numerous
   networks, it has been adapted to propagate this routing information,
   as its use limits the number of routing protocols required on each
   device.
   This additional information being propagated represents a large
   change in requirement for the error handling of the protocol - where
   session failure occurs, it is likely that a complete service outage
   is experienced by at least a subset of a network's customers, even
   though the erroneous packet may have occurred within a different
   sub-topology or even service (a different address family for
   example). For this reason, there is a significant demand to avoid
   service-affecting failures that may be triggered by routing
   information within a single sub-topology or service.

   Both within Internet and multi-service routing architectures, a
   number of BGP sessions propagate a large proportion of the required
   routing information for network operation. For Internet routing,
   these are typically BGP sessions which propagate the global routing
   table to an AS - failure of these sessions may have a large impact on
   network service, based on a single erroneous update. In a
   multi-service environment, typical deployments utilise a small number
   of core-facing BGP sessions, typically towards route reflector
   devices. Failure of these sessions may also result in a large impact
   to network operation. Clearly, the avoidance of conditions requiring
   these sessions to fail is of great utility to any network operator,
   and provides further motivation for the revision of the existing
   behaviour.

   Whilst the behaviour in [RFC4271] is suited to ensuring that BGP
   messages containing erroneous routing information are limited in
   scope (by means of session reset), with the above considerations, it
   is clear that this mechanism is not suited to all deployments. It
   should, however, be noted that the change in scope affects the
   handling only of errors occurring after BGP session establishment.
   There is no current operational requirement to amend the means by
   which error handling in session establishment, or liveness
   detection, are performed.

1.2.  Overview of Operator Requirements for BGP-4 Error Handling

   It is the intention of this document to define a set of criteria to
   which a revised error handling mechanism in BGP-4 is required to
   conform. The motivation for the definition of these requirements
   can be summarised based on certain behaviour currently present in
   the protocol that is not deemed acceptable within current
   operational deployments, or where there is a shortfall in the tool
   set available to an operator. These key requirements can be
   summarised as follows:

   o  It is unacceptable within modern deployments of the BGP-4 protocol
      that a single erroneous UPDATE packet affects prefixes that it
      does not carry. This requirement therefore requires some
      modification to the means by which erroneous UPDATE packets are
      handled, and reacted to - with a particular focus on avoiding the
      use of the NOTIFICATION message.

   o  It is recognised that some error conditions that may occur within
      the BGP-4 protocol may not always be handled gracefully, and may
      result in conditions whereby an implementation cannot recover. In
      these (and similar) cases, it is unacceptable for an operator that
      the resulting reset of the BGP-4 session results in interruption
      to forwarding packets (by means of withdrawing prefixes installed
      by BGP-4 into a device's RIB, and subsequently FIB). To this end,
      there is a requirement to define a session reset mechanism which
      provides session re-initialisation in a non-destructive manner.

   o  Further to the requirements to provide a more robust protocol, the
      current visibility into error conditions within the BGP-4 protocol
      is extremely limited - where further modifications to this
      behaviour are to be made, complexity is likely to be added. Thus,
      to ensure that BGP-4 is manageable, there are requirements for
      mechanisms by which the protocol can be examined and monitored.

   This document describes each of these requirements in further depth,
   along with an overview of means by which they are expected to be
   achieved. In addition, the mechanism by which the enhancements
   meeting these requirements are to interact is discussed.

2.  Errors within BGP-4 UPDATE Messages

   Both through analysis of incidents occurring within the Internet DFZ,
   and multi-service environments utilising BGP-4 to signal service or
   routing information, a number of different classes of errors within
   BGP-4 UPDATE messages have been observed. In order to consider the
   applicability of enhanced error handling mechanisms, it is possible
   to divide these errors into a number of sub-classes, particularly
   focusing around the location of the error within the UPDATE message.

   Where an UPDATE message is considered invalid by a BGP speaker due to
   an error within a path attribute that is not the NLRI (where the
   definition of NLRI includes reachability information encoded in the
   MP_REACH_NLRI and MP_UNREACH_NLRI attributes as specified in
   [RFC4760]) it is a requirement of any enhanced error handling
   mechanism to handle the error in a manner focused on the NLRI
   contained within the message. Since in this case, the message
   received from the remote peer is syntactically valid, it is
   considered that such an UPDATE is indicative of erroneous data within
   a path attribute.
   The current behaviour defined within the protocol implies that the
   BGP speaker from whom the message is received is now an invalid path
   for all NLRI announced via the session - which results in a
   disproportionate impact to overall network operation. In particular
   scenarios (such as networks with centralised BGP route reflection)
   such action can result in a loss of all reachability to a network.
   In other contexts (such as the Internet DFZ), it cannot be assumed
   that the BGP speaker from whom the UPDATE message is received is
   directly responsible for the erroneous information contained within
   the message.

   Two further error cases exist within UPDATE messages, both of which
   are related to the mechanisms that are applicable to messages
   received where some difficulty exists in parsing the entire BGP
   message. The two cases are those where a valid NLRI attribute can
   be extracted, and those where such an attribute is not able to be
   parsed. In these cases, errors in the packing of attributes within
   a BGP message may have occurred. Such errors are likely indicative
   of an error specifically caused by the remote BGP speaker. It is,
   however, desirable to an operator that such errors are handled
   without affecting all NLRI across a BGP session. As such, there is
   a key requirement to maximise the number of cases in which it is
   possible to extract NLRI from a BGP UPDATE message. To this end, it
   is required that where possible the MP_REACH_NLRI and
   MP_UNREACH_NLRI attributes are utilised for encoding all NLRI
   (including IPv4 Unicast), and that such an attribute is included as
   the first attribute of a BGP UPDATE message (as originally
   recommended in [I-D.chen-ebgp-error-handling]). Such a change to
   the order of inclusion of this attribute maximises the number of
   cases in which NLRI can be extracted from an UPDATE.
   Where this is possible, it is again required that the error handling
   mechanisms utilised should be directly applied to the NLRI included
   in the UPDATE.

   For all cases whereby NLRI can be obtained from an UPDATE message, it
   is expected that the requirements outlined in Section 3 should be
   considered by any enhancement to the BGP-4 protocol.

   In the case that it is not possible to completely parse the NLRI
   attribute from the UPDATE message received from a peer, it is
   extremely likely that this is indicative of a serious error with
   either the process of attribute packing, or buffer usage on the
   remote BGP speaker. In this case, clearly, it is not possible to
   apply any error handling mechanism that is limited to a specific set
   of NLRI, since an implementation has no knowledge of the NLRI
   included within the UPDATE message. In addition, such errors are
   considered to be relatively fundamental to the operation of a BGP
   implementation, and hence may indicate a case whereby significant
   system errors have occurred. The current BGP-4 standard results in a
   BGP speaker restarting a session with the remote BGP speaker.
   However, where such an error does occur, it is required that a
   graceful mechanism is utilised to provide a lower impact to network
   operation. The requirements for enhancements of this nature to BGP-4
   are outlined in Section 5, with the requirements outlined therein
   focused on providing a means by which system integrity can be
   restored whilst allowing for continued network operation.

2.1.  Classifying BGP Errors and Expected Error Handling

   It is clearly of advantage for BGP-4 implementations to utilise a
   consistent set of error handling mechanisms for the different types
   of errors that are described in Section 2, and provide consistent
   nomenclature to refer to them.
   It is therefore suggested that errors that are indicative of larger
   scale failures of a BGP speaker, and hence require some error
   handling at the session level, are referred to as 'critical' errors,
   whilst those errors that are identified based on incorrect content
   of one or more attributes of a message are referred to as 'semantic'
   errors.

2.1.1.  Critical BGP Errors

   As described in this document, it is of advantage to limit the number
   of 'critical' errors that occur within the protocol. Therefore, based
   on analysis of the processing of BGP UPDATE messages, it is required
   that 'critical' error handling behaviour is applied to:

   o  UPDATE Message Length errors - whereby the specified overall
      UPDATE message length is inconsistent with the sum of the Total
      Path Attribute and Withdrawn Routes lengths. This is indicative
      of message packing failure, whereby the NLRI may not be correctly
      extracted.

   o  Errors Parsing the NLRI attributes of an UPDATE message - where
      the NLRI, carried in either the IPv4-Unicast Advertised or
      Withdrawn routes, or in the MP_REACH_NLRI or MP_UNREACH_NLRI
      attributes [RFC2858], cannot be parsed, it is not possible to
      target error handling mechanisms to specific NLRI, and hence
      session level mechanisms must be utilised.

   It is expected that those requirements outlined in Section 5 are
   utilised to provide session-level handling of those errors identified
   as 'critical'.

2.1.2.  Semantic BGP Errors

   Where a BGP message is correctly formed, a number of cases exist
   whereby the contents of the UPDATE are not valid - in these cases,
   this represents errors that can be identified to affect specific
   NLRI.
   The following cases are expected to be classified as semantic
   errors:

   o  Zero or invalid length errors in path attributes excluding those
      containing NLRI, or where the length of all path attributes
      contained within the UPDATE does not correspond to the total path
      attributes length. In this case, the NLRI can be correctly
      extracted, and hence acted upon.

   o  Messages where invalid data or flags are contained in a path
      attribute that does not relate to the NLRI.

   o  UPDATE messages that are missing mandatory attributes, that
      contain unrecognised non-optional attributes, or that contain
      duplicate or invalid attributes (be they unsupported or
      unexpected).

   o  Those messages where the NEXT_HOP, or MP_REACH next-hop values are
      missing, of zero length, or invalid for the relevant AFI/SAFI.

   In these cases, it is expected that these errors can be handled
   gracefully, following the requirements detailed in Section 3 and
   Section 4 of this memo.

3.  Avoiding use of NOTIFICATION

   The error handling behaviour defined in RFC4271 is problematic due to
   the limited options that are available to an implementation. When an
   erroneous BGP message is received, at the current time, the
   implementation must either ignore the error, or send a NOTIFICATION
   message, after which it is mandatory to terminate the BGP session.
   It is apparent that this requirement is at odds with that of protocol
   robustness.

   There is significant complexity to this requirement. The mechanism
   defined in [I-D.chen-ebgp-error-handling] describes a means by which
   no NOTIFICATION message is generated for all cases whereby NLRI can
   be extracted from an UPDATE. The NLRI contained within the erroneous
   UPDATE message is considered as though the remote BGP speaker has
   provided an UPDATE marking it as withdrawn.
   This limits the propagation of the invalid routing information,
   whilst also ensuring that no traffic is forwarded via a
   previously-known path that may no longer be valid. This mechanism
   is referred to as "treat-as-withdraw".

   Whilst this behaviour results in avoiding a NOTIFICATION message,
   keeping other routing information advertised by the remote BGP
   speaker within the RIB, it may result in unreachability for a sub-set
   of the NLRI advertised by the remote speaker. Two cases should be
   considered - that where the entry for a prefix in the Adj-RIB-In of
   the neighbour propagating an erroneous packet is utilised, and that
   where the prefix installed in the device's RIB is learnt from another
   BGP speaker. In the former case, should the identified NLRI not be
   treated as withdrawn, the original NLRI is utilised within the global
   RIB. However, this information is potentially now invalid (i.e. it
   no longer provides a valid forwarding path), whilst an alternate
   (valid) path may exist in another Adj-RIB-In. By continuing to
   utilise the NLRI for which the UPDATE was considered invalid, traffic
   may be forwarded via an invalid path, resulting in routing loops, or
   black-holing. In the second case, no impact to the forwarding of
   traffic, or global RIB, is incurred, yet where treat-as-withdraw is
   implemented, possibly stale routing information is purged from the
   Adj-RIB-In of the neighbour propagating errors.

   Whilst mechanisms such as "treat-as-withdraw" are currently
   documented, the proposals are limited in their scope - particularly
   in terms of restrictions to implementation only on eBGP sessions.
   This limitation is made based on the view that the BGP RIB must be
   consistent across an autonomous system.
   By implementing treat-as-withdraw for an iBGP session, one or more
   routers within the Autonomous System may not have reachability to a
   prefix, and hence blackholing of traffic, or routing loops, may
   occur. It should, however, be considered whether this view is valid,
   in light of the manner in which BGP is utilised within operator
   networks. Inconsistency in a RIB based on a single UPDATE being
   treated as withdrawn may cause an inconsistency in a single
   sub-topology (e.g. Layer 3 VPN service), or a service not operating
   completely (in the case of an UPDATE carrying service membership
   information). Where a NOTIFICATION and teardown is utilised this is
   destructive to all sub-topologies in all address family identifiers
   (AFIs) carried by the session in question. Even where mechanisms
   such as multi-session BGP are utilised, a whole AFI is affected by
   such a NOTIFICATION message. In terms of routing operation, it is
   therefore far less costly to endure a situation where a limited
   sub-set of routing information within an AS is invalid, than to
   consider all routing information as invalid based on a single
   trigger.

   It is considered that, if extended to cover iBGP, the mechanisms
   described in [I-D.chen-ebgp-error-handling] and
   [I-D.ietf-idr-optional-transitive] provide a means to avoid the
   transmission of a NOTIFICATION to a remote BGP speaker based on a
   single erroneous message, where at all possible, and hence meet this
   requirement. The failure cases whereby NLRI cannot be extracted from
   the UPDATE message represent a case whereby the receiving system
   cannot handle the error gracefully based on this mechanism.

4.  Recovering RIB Consistency

   The recommendations described in Section 3 may result in the RIB for
   a topology within an AS being inconsistent across the AS's internal
   routers.
   Alternatively, where such mechanisms are deployed at an AS boundary,
   interconnects between two ASes may be inconsistent with each other.
   There are therefore risks of traffic blackholing, due to missing
   routing information, or forwarding loops. Whilst this is deemed an
   acceptable compromise in the short term, clearly, it is suboptimal.
   Therefore, a requirement exists to provide mechanisms by which a BGP
   speaker is able to recover the consistency of the Adj-RIB-In for a
   particular neighbour.

   In the general case, the consistency of the BGP RIB can be recovered
   by requesting that the entire Adj-RIB-Out of a remote BGP speaker is
   re-advertised. A mechanism to achieve this re-advertisement is
   defined within the ROUTE-REFRESH specification [RFC2918]. It is
   envisaged that by requesting a refresh of all NLRI advertised by a
   BGP speaker, any NLRI which has been withdrawn due to being contained
   within an invalid UPDATE message is re-learnt. Where a ROUTE-REFRESH
   is used to directly perform a consistency check between the
   Adj-RIB-Out of a remote device, and the Adj-RIB-In of the local BGP
   speaker, a demarcation between the ROUTE-REFRESH, and normal UPDATE
   messages is required (in order that an "end" of the refresh can be
   used to identify any 'stale' NLRI) -
   [I-D.keyur-bgp-enhanced-route-refresh] provides a means by which the
   ROUTE-REFRESH mechanism can be extended to meet this requirement.

   Whilst re-advertisement of the whole BGP RIB provides a means by
   which withdrawn NLRI can be re-advertised, there are some scaling
   implications that must be considered. In the case that a
   ROUTE-REFRESH is generated, all NLRI must be re-packed into UPDATE
   messages and advertised by one speaker on the BGP session, whilst the
   other must receive all UPDATE messages, and validate the RIB's
   consistency. Clearly, it is advantageous to avoid this work where
   possible.
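
   The use of an "end" marker to identify stale NLRI after a full
   refresh can be illustrated as a mark-and-sweep pass over the
   Adj-RIB-In. The following Python sketch is illustrative only and is
   not taken from any of the referenced drafts: the AdjRibIn class, its
   method names, and the dictionary representation of path attributes
   are hypothetical, and the begin/end calls stand in for whatever
   demarcation an extended ROUTE-REFRESH mechanism provides.

```python
class AdjRibIn:
    """Hypothetical per-neighbour Adj-RIB-In keyed by prefix."""

    def __init__(self):
        self.routes = {}     # prefix -> path attributes
        self.stale = set()   # prefixes not yet re-learnt in this refresh

    def begin_refresh(self):
        # Start-of-refresh marker: every route currently held is suspect.
        self.stale = set(self.routes)

    def on_update(self, prefix, attrs):
        # A (re-)advertised prefix is installed and is no longer stale.
        self.routes[prefix] = attrs
        self.stale.discard(prefix)

    def on_withdraw(self, prefix):
        # An explicit withdrawal removes the route immediately.
        self.routes.pop(prefix, None)
        self.stale.discard(prefix)

    def end_refresh(self):
        # End-of-refresh marker: anything still stale was not re-advertised.
        purged = sorted(self.stale)
        for prefix in purged:
            del self.routes[prefix]
        self.stale.clear()
        return purged


rib = AdjRibIn()
rib.on_update("192.0.2.0/24", {"next_hop": "198.51.100.1"})
rib.on_update("203.0.113.0/24", {"next_hop": "198.51.100.1"})
rib.begin_refresh()
rib.on_update("192.0.2.0/24", {"next_hop": "198.51.100.1"})
purged = rib.end_refresh()
```

   Only prefixes that are not re-advertised before the end marker are
   purged. The refresh still requires the remote speaker to re-send its
   entire Adj-RIB-Out, which is the scaling cost noted above.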

   It is envisaged that during routing inconsistencies caused by
   utilising the 'treat-as-withdraw' mechanism, the local BGP speaker is
   aware that some routing information was not able to be processed -
   due to the fact that an UPDATE message was not parsed correctly.
   Since this mechanism (as discussed in Section 3) requires the local
   BGP speaker to have determined the set of NLRI for which an erroneous
   UPDATE message was received, it is possible to use a targeted
   mechanism to re-request the specific NLRI that was contained within
   the erroneous UPDATE message. Re-requesting provides the remote BGP
   speaker an opportunity to re-transmit the NLRI - possibly providing
   an opportunity to leverage alternative methods to build the UPDATE
   message. Such a request requires extension to the existing BGP-4
   protocol, in terms of specific UPDATE generation filters with a
   transient lifetime. It is envisaged that the work within
   [I-D.zeng-one-time-prefix-orf] provides a mechanism allowing targeted
   elements of the Adj-RIB-In for a BGP neighbour to be recovered.

   It is of particular note for both means of recovering RIB consistency
   described that these are effective only when considering transient
   errors within an implementation - for instance, should an RFC
   interpretation error within an implementation be present, regardless
   of the number of times a specific UPDATE is generated, it is likely
   that this error condition will persist (as it may with the existing
   behaviour defined by [RFC4271]). For this reason, there is a
   requirement to consider the means by which such consistency recovery
   mechanisms are utilised. It is not advisable that a transient
   filter and advertisement mechanism is triggered by all error handling
   events due to the load this is likely to place on the neighbour
   receiving such a request.
   Where this BGP speaker is a relatively centralised device - a route
   reflector (as described by [RFC4456]) for example - the act of
   generation of UPDATE messages with such frequency is likely to cause
   disproportionate load. It is therefore an operational requirement
   that any such extension provides a means of request dampening.

5.  Reducing the Impact of Session Reset

   Even where protocol enhancements allow errors in the BGP-4 protocol
   to cease to trigger NOTIFICATION messages, and hence reset a BGP
   session, it is clear that some error conditions may not be exited.
   In particular, errors due to existing state, or memory structures,
   associated with a specific BGP session will not be handled. It is
   therefore important to consider how these error conditions are
   currently handled by the protocol. It should be noted that the
   following discussion and analysis considers only those NOTIFICATION
   messages generated in response to errors in UPDATE messages (as
   defined by Section 6.3 in [RFC4271]).

   The existing NOTIFICATION behaviour triggers a reset of all elements
   of the BGP-4 session, as described in Section 6 of [RFC4271]. It is
   expected that session teardown requires an implementation to
   re-initialise all structures and state required for session
   maintenance. Clearly, there is some utility to this requirement, as
   error conditions in BGP are, in general, exited from. However, this
   definition is responsible for the forwarding outages within networks
   utilising BGP for route propagation when each error is experienced.
   The requirement described in Section 3 is intended to reduce the
   cases whereby a NOTIFICATION is required; however, any mechanism
   implemented as a response to this requirement by definition cannot
   provide a session reset to the extent of that achieved by the current
   behaviour.
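
   Which errors remain candidates for a session-level response follows
   from the 'critical' and 'semantic' classes of Section 2.1. A minimal
   Python sketch of that split is given here. The ParseResult fields
   and the classify function are hypothetical simplifications
   introduced for illustration, and an implementation would need to
   distinguish many more conditions.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorClass(Enum):
    NONE = "none"
    SEMANTIC = "semantic"   # NLRI extracted: handle per NLRI (Section 3)
    CRITICAL = "critical"   # NLRI not recoverable: session-level handling


@dataclass
class ParseResult:
    """Hypothetical outcome of parsing a single UPDATE message."""
    length_consistent: bool   # overall length agrees with the sub-lengths
    nlri_parsed: bool         # the NLRI could be extracted
    attribute_errors: int     # errors found in non-NLRI path attributes


def classify(result: ParseResult) -> ErrorClass:
    # Without a trustworthy NLRI, no per-prefix handling is possible.
    if not result.length_consistent or not result.nlri_parsed:
        return ErrorClass.CRITICAL
    # NLRI is known, so the error can be scoped to those prefixes.
    if result.attribute_errors > 0:
        return ErrorClass.SEMANTIC
    return ErrorClass.NONE
```

   In this sketch only the CRITICAL outcome would lead to the session
   reset discussed in this section.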

   In order to address this, there is a requirement for a means by which
   a BGP speaker can signal that an unhandled error condition in an
   UPDATE message occurred - requiring a session reset - yet also
   continue to utilise the paths advertised by the neighbour that are
   currently in use within the RIB. In this case, the Adj-RIB-In
   received from the neighbour is not considered invalid, despite a
   NOTIFICATION, and session reset, being required. This set of
   requirements is akin to those answered by the BGP Graceful Restart
   mechanism described in [RFC4724]. Since the operational requirement
   in this case is to provide a means to achieve a complete session
   restart without disrupting the forwarding path of those prefixes in
   use within a BGP speaker's RIB, it is expected that utilising a
   procedure similar to the Graceful Restart mechanism meets the error
   handling requirement. Responding to an error condition (repeated or
   otherwise) with a message indicating that an error that cannot be
   handled has occurred, thereby forcing a session reset whilst
   retaining forwarding information within the RIB, allows forwarding to
   all prefixes within a system's RIB to continue whilst the session
   restarts. It is envisaged that the additional complexity introduced
   by such a mechanism can be limited by extending existing BGP
   messages - one such approach is proposed in
   [I-D.keyupate-idr-bgp-gr-notification]. By placing a time bound on
   the restart lifetime, should an error condition not be transient -
   for example, should an error have occurred with the BGP process,
   rather than with a specific BGP session - the remote BGP speaker is
   still detected as an invalid device for forwarding.
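
   The time-bounded retention described above can be sketched as
   follows. This is an illustration under stated assumptions rather
   than the behaviour of [RFC4724] or of any proposal: the
   RetainedRoutes class, its method names, and the use of
   caller-supplied timestamps in seconds are all hypothetical.

```python
class RetainedRoutes:
    """Hypothetical holder for a neighbour's routes across a session reset."""

    def __init__(self, routes, restart_time):
        self.routes = dict(routes)        # prefix -> next hop, kept in use
        self.restart_time = restart_time  # seconds allowed for the restart
        self.reset_at = None              # time of the reset, if one is pending

    def session_reset(self, now):
        # Keep forwarding on the existing routes and start the time bound.
        self.reset_at = now

    def session_established(self):
        # The neighbour returned within the bound: stop the timer.
        self.reset_at = None

    def tick(self, now):
        # Purge the retained routes once the restart time bound is exceeded.
        if self.reset_at is not None and now - self.reset_at > self.restart_time:
            self.routes.clear()
            self.reset_at = None
        return len(self.routes)


kept = RetainedRoutes({"192.0.2.0/24": "198.51.100.1"}, restart_time=120)
kept.session_reset(now=0)
during = kept.tick(now=60)    # still within the restart bound
after = kept.tick(now=121)    # bound exceeded, routes purged
```

   Passing the time in explicitly keeps the sketch deterministic; a
   persistent failure of the remote BGP process shows up as the bound
   expiring, at which point the neighbour's routes are removed.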
554 It should, however, be noted that a protocol enhancement meeting this 555 requirement is not able to solve all error conditions - however, a 556 complete restart of the BGP and TCP session between two BGP speakers 557 implements an identical recovery mechanism to that which is achieved 558 by the existing behaviour. Where an error condition such as memory 559 or configuration corruption has occurred in a BGP implementation, it 560 is expected that a mechanism meeting this requirement continues to 561 detect this, by means of a bound on time for session restart to 562 occur. Whilst there may be some consideration that packets continue 563 to be forwarded through a device which can be in a failure mode of 564 this nature for a longer period, due to this requirement, the 565 architecture of modern IP routers should be considered. A divided 566 forwarding and control plane is common in many devices, as well as 567 process separation for software-based devices - corruption of a 568 specific protocol daemon does not necessarily imply forwarding is 569 affected. Indeed, where forwarding behaviour of a device is 570 affected, it is envisaged that a failure detection mechanism (be it 571 Bidirectional Forwarding Detection, or indeed BGP KEEPALIVE packets) 572 will detect such a failure in almost all cases, with the symptomatic 573 behaviour of such a failure being an invalid UPDATE message in very 574 few other cases. 576 6. Operational Toolset for Monitoring BGP 578 A significant complexity that is introduced through the requirements 579 defined in this document is that of monitoring BGP session status for 580 an operator. Although the existing error handling behaviour causes a 581 disproportionate failure, session failure is extremely visible to 582 most operational personnel within a Network Operator due to both 583 existing definitions of SNMP trap mechanisms for BGP, along with the 584 forwarding impact typically caused by such a failure.
By introducing 585 mechanisms by which errors of this nature are not as visible, this is 586 no longer the case. There is a requirement that, where subsets of the 587 RIB on a device are no longer reachable from a BGP speaker, or indeed 588 an AS, some visibility of this situation, alongside a mechanism 589 to determine the cause, is available to an operator. Whilst, to some 590 extent, this can be solved by mandating a sub-requirement of each of 591 the aforementioned requirements that a BGP speaker must log where 592 such errors occur, and are hence handled, this does not solve all 593 cases. In order to clarify this requirement, the example of the 594 transmission of an erroneous Optional Transitive attribute can be 595 considered. Since, by definition, there is no requirement for all 596 BGP speakers to parse such an attribute, a receiving router may treat 597 NLRI as withdrawn based on an erroneous attribute not examined by its 598 neighbour. In this case, the upstream device or network, propagating 599 the UPDATE, has no visibility of this error. Operationally, however, 600 it is of interest to the upstream router operator that such invalid 601 information was propagated. 603 The requirement for logging of error conditions in transmitted BGP 604 messages, which are visible to only the receiver, cannot be achieved 605 by any existing BGP message, or capability. It is envisaged that 606 each erroneous event should be transmitted to the remote peer - 607 including the information as to the set of NLRI that were considered 608 invalid. Whilst with some mechanisms this is achieved by default 609 (for example, One-Time Prefix ORF (Outbound Route Filtering) 610 [I-D.zeng-one-time-prefix-orf] will transmit the set of prefixes that are 611 required), the operator requirement is to know which prefixes may 612 have been unreachable in all cases.
It is envisaged that an 613 extension to meet this requirement will allow for such information to 614 be transmitted between peers, and hence logged. Such a mechanism may 615 provide further utility as either a diagnostic or a logging toolset. 617 As such, it is possible to divide the messages that are required in 618 order to provide further visibility into BGP for an operator into two categories. Such a 619 division can be made both by the required means of message 620 transmission and by the criticality of each request. 622 o Messages required to replace NOTIFICATION - In cases where the 623 error handling mechanisms defined by [RFC4271] currently result in 624 a NOTIFICATION message being generated, a number of the 625 requirements detailed within this document result in this message 626 being suppressed. Despite this change, the occurrence of the error 627 condition is still of interest to an operator, for both monitoring 628 and troubleshooting purposes, since some form of invalid data has 629 been received on a session. It is therefore 630 considered that an implementation must generate a message both 631 locally and to the remote peer when such a 632 condition occurs. Where such a message is transmitted to the remote 633 peer, it is considered that the BGP session via which the 634 erroneous UPDATE message was received is used as transport to the remote 635 peer. The information transmitted in such a message should be 636 the minimum required to allow identification of the paths which were 637 considered erroneous (i.e. restricting the information to that 638 which is directly relevant to a network operator in the case of an 639 error condition occurring). Any delay to convergence on the 640 session in question is considered to be acceptable, given the 641 suboptimal nature of the reception of invalid routing information 642 via a BGP session.
Further concerns regarding such a mechanism 643 relate to the load generated on the BGP speaker in question; 644 however, it must be considered that in the case of an erroneous 645 UPDATE being received, and the 'treat-as-withdraw' mechanism being 646 utilised, where the erroneous path is removed from the Loc-RIB, 647 there is likely to be a requirement to generate UPDATE messages 648 withdrawing the prefix from all further BGP speakers to which the 649 prefix is advertised. The load caused by generating 650 such UPDATEs is likely to be much greater than that of 651 transmitting error information via a logging message type back to 652 the speaker from which it was received. It is envisaged that 653 light-weight BGP message-based signalling mechanisms such as the 654 ADVISORY message types detailed in 655 [I-D.frs-bgp-operational-message] provide a suitable means to 656 satisfy this requirement. 658 o Additional Diagnostic Capabilities for BGP - In a number of cases, 659 there is an operational requirement to further debug erroneous BGP 660 UPDATE messages, along with the particulars of the state of a BGP 661 speaker. For instance, where an invalid BGP UPDATE message is 662 transmitted between two BGP speakers, the exact format of the 663 UPDATE message is of interest to an operator, as this information 664 provides a clear indication of a message considered to be 665 erroneous by the BGP speaker to which it was transmitted. In this 666 case, it is considered of great utility that the entire UPDATE 667 message is transmitted back to the advertising speaker, in order 668 to allow for further debugging to occur. Whilst such information 669 is particularly useful to an operator, it clearly provides 670 information that is not key to protocol operation - for this 671 reason, it is expected that some of the 672 additional complexity and load to which a BGP speaker is subjected 673 will not be acceptable.
For this reason, it is required that where 674 mechanisms are developed to support this requirement, messages of 675 this nature can be supported both within an existing BGP session, 676 and via a dedicated separate session, be it BGP carrying messages 677 such as those defined in [I-D.frs-bgp-operational-message] or a 678 dedicated monitoring protocol akin to BMP described in 679 [I-D.ietf-grow-bmp]. 681 Whilst the operational requirement for such monitoring tools to allow 682 for visibility into BGP is clearly agreed upon, the means by which 683 such messages are transmitted between two BGP speakers is likely to 684 be dependent upon the positions of the speakers in question (for 685 instance, the requirements for such a protocol may differ where a 686 session is between two ASBRs under separate administration). The 687 introduction of additional message types to the BGP protocol clearly 688 introduces further complexity - and leaves room for further 689 implementation and standardisation errors that may compromise the 690 robustness of the BGP protocol. In addition, the queuing and 691 scheduling of these BGP messages must be interleaved with the 692 transmission of the key protocol messages - such as KEEPALIVE and 693 UPDATE packets. It is therefore a concern that should a large number 694 of messages specifically for operational visibility be transmitted, 695 this will delay the transmission of UPDATE packets, and hence 696 adversely affect the end-to-end convergence time for NLRI carried 697 within BGP. The operational reasons why it is advantageous for such 698 messages to be in-band to a protocol should also be considered. 699 In particular, it should be noted that where such information is to 700 be transmitted across administrative boundaries, a BGP session 701 represents an existing channel between the two ASes.
This 702 channel is considered to be secure insofar as the routing 703 information, and requests sent via the session are considered to come 704 from a trusted source. Since error information both relates to a 705 particular attachment and is key to ensuring that such a session is 706 operating as expected, it is considered of great operational benefit 707 that this information is transmitted over this channel. In addition, 708 the overall system scalability is improved by such in-band 709 transmission. It is expected that erroneous information resulting in 710 the 'treat-as-withdraw' mechanism being utilised is relatively 711 infrequently transmitted between two peers (when compared to the 712 frequency of UPDATE message transmission). The impact of including 713 an additional BGP message type for such operational visibility is 714 relatively small from a resource utilisation perspective - additional 715 processing overhead is only experienced when such a message is 716 received. Where a separate session is maintained, particular network 717 elements within a service provider topology may require hundreds, or 718 thousands, of additional sessions for the transmission of this 719 information. Such a resource consumption overhead is likely to be 720 unacceptable to some network operators. 722 For the reasons explained above, it is expected that mechanisms 723 specified to meet the requirements for event visibility consider the 724 relative impacts of additional monitoring sessions, or message 725 inclusion in-band to BGP, in order not to compromise the security, 726 scalability and robustness of the BGP-4 protocol. 728 7. Operational Complexities Introduced by Altering RFC4271 730 The existing NOTIFICATION and subsequent teardown of a BGP session 731 upon encountering an error has the advantage that a consistent 732 approach to error handling is required of all implementations of the 733 BGP-4 protocol.
This is of operational advantage, as it provides a 734 clear expectation of the behaviour of the protocol. The requirements 735 defined herein add further complexity to the error-handling within 736 BGP, and hence are liable to compromise the existing deterministic 737 protocol behaviour. It is therefore deemed that there is a further 738 requirement to provide a clear method by which an implementation 739 should react to an erroneous UPDATE, in order that all protocol implementations 740 provide a consistent means by which recovery is achieved. A further 741 complexity is introduced due to the disparate nature of the work 742 items altering the BGP error handling behaviour - since all items are 743 likely to be implemented as a BGP capability [RFC5492], situations 744 are likely to occur between devices (especially those with different 745 BGP implementations), where some of the mechanisms referenced are 746 unsupported. This adds further barriers to a standard definition of 747 the BGP-4 error handling behaviour. 749 In general, the approach considered ideal upon encountering an 750 erroneous UPDATE message can be divided into two cases - those where 751 the NLRI can be determined from the message, and those where it 752 cannot be. The latter case is the simpler of the two. In this case, 753 there is a requirement for the implementation to reset the BGP 754 session, utilising the reduced-impact approach described in 755 Section 5. In the case where the remote BGP speaker is in a 756 transient error condition related to specific peer data structures, 757 or state, a single instance of this behaviour is likely to exit the 758 error condition. In the case of implementation errors, it is 759 possible that the BGP session in question may enter a continuous loop 760 of being reset, with a partial RIB being held by one or more of the 761 BGP speakers due to a non-deterministic order of UPDATE propagation.
762 It is therefore a requirement that within this reduced-impact 763 procedure any subsequent UPDATE messages that would result in further 764 session resets are ignored. Whilst this results in a condition where 765 an undetermined amount of the RIB is inconsistent, partial 766 reachability is maintained. In this case, the operational toolsets 767 discussed in Section 6 are likely to provide mechanisms by which this 768 condition can be brought to the attention of the relevant operators. 769 This requirement to accept a partial RIB, which results in potentially 770 invalid traffic forwarding, is a direct result of the deployments of 771 BGP-4, as described in Section 1.1. 773 The case where NLRI can be determined from an erroneous UPDATE 774 provides further complexities. In this case, a BGP speaker is aware 775 of the subset of the RIB which has been identified as being 776 contained within invalid UPDATE messages. This allows a local BGP 777 speaker to re-request single prefixes, utilising a mechanism such as 778 "one-time prefix ORF". However, a similar result is achieved by re- 779 requesting the entire RIB - albeit with greater resource 780 requirements. It is therefore expected that the process of recovery 781 utilises a staged set of mechanisms to attempt to restore consistency 782 of the RIB: 784 1. Where available, a mechanism capable of requesting only the NLRI 785 determined to have been contained within an invalid UPDATE should 786 be utilised. However, since it is possible that such an error 787 condition can be transient in nature, it is likely that more than 788 one request is to be transmitted (assuming the first does not 789 return a valid UPDATE message). In order to allow a 790 deterministic process, there is a requirement for a limit on the 791 number of specific requests transmitted to be defined. 793 2. Where a specific refresh mechanism is not available, a peer 794 should re-request the entire RIB.
Again, there is a requirement 795 to limit the number of complete RIB requests that should be sent 796 by an implementation, in order to provide a bound both on the 797 expected level of load a device may experience, and on the time 798 for which the RIB may be inconsistent. 800 3. Finally, a session reset should be performed, as per the reduced- 801 impact NOTIFICATION requirement defined in Section 5. At this 802 point, a similar challenge to that discussed above exists, should 803 the error condition persist. In this case, as defined above, 804 there is a requirement to ignore those UPDATE messages that 805 continue to be erroneous. 807 It is envisaged that where limits are required, these will be defined 808 on a per-memo basis, or within a further revision of the requirements 809 described herein. 811 Whilst the approach described above provides a standard means by 812 which error recovery may be handled on a per-UPDATE basis, further 813 complexities are raised where multiple errors occur. Clearly, 814 following this procedure causes control-plane load on both the BGP 815 speakers - for this reason, consideration must be given to repeated use of the 816 mechanisms discussed in this document. It is notable 817 that errors may not occur with UPDATE messages relating to only a 818 single NLRI; independent errors in multiple NLRIs may be experienced. 819 For this reason, it is required that an implementation rate-limits 820 the number of error handling events sourced towards a particular 821 neighbour. It is expected that such rate limiting, or event 822 suppression, is achieved on a per-session basis, where state 823 information is already held, rather than on a per-prefix basis, as it 824 is envisaged that per-prefix suppression presents significant scaling 825 problems, and introduces further state requirements for an 826 implementation of the protocol.
It is recommended that where a flag 827 indicative of erroneous behaviour is implemented, the state of such a 828 value is maintained independently of session establishment. 830 7.1. Reducing the Network Impact of Session Teardown 832 In some cases, where repeated erroneous UPDATE messages are received 833 on a BGP-4 session, it is desirable that a BGP speaker disconnects 834 completely from the remote peer without performing a restart, in 835 order to avoid the control-plane overhead of repeated session 836 establishment, and subsequent reset events. This behaviour may be 837 required after a per-session flag indicating erroneous behaviour is 838 set, as discussed in Section 7. The BGP-4 specification presented in 839 [RFC4271] achieves such a session shutdown by sending a NOTIFICATION 840 message; however, this has the net result that all downstream BGP 841 speakers (i.e. those to whom the NLRI carried over the now ceased BGP 842 session was readvertised) must withdraw this NLRI from their RIB, and 843 perform a best-path selection if required. In some cases, there may 844 be no alternate path available, and hence a period of time for 845 which no valid BGP route exists. In particular, this is very likely 846 to occur where an upstream BGP speaker performs a best-path selection 847 and advertises only a single path to its neighbours - there is a 848 requirement for the upstream speaker to perform a best-path 849 selection, and re-advertise a new set of NLRI before the downstream 850 system is able to converge to a new path. It should be noted that 851 where UPDATE messages withdrawing NLRI are not subject to the BGP 852 session's configured MinRouteAdvertisementInterval (MRAI) [RFC4271], 853 but re-advertisements are, this may result in a BGP speaker being 854 without a path for a period up to the MRAI.
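The staged recovery procedure of Section 7, together with the per-session flag referred to above, can be illustrated with a minimal sketch. The names and the default limits below are hypothetical, since this document does not define limit values. Falling back to a full RIB request once the targeted requests are exhausted is an assumption of the sketch, as is resetting the request counters on re-establishment.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    TARGETED_REFRESH = "re-request only the affected NLRI"
    FULL_REFRESH = "re-request the entire RIB"
    REDUCED_IMPACT_RESET = "reset the session, retaining forwarding state"
    IGNORE_UPDATE = "ignore the erroneous UPDATE"


@dataclass
class SessionErrorState:
    """Hypothetical per-session (not per-prefix) recovery state."""
    max_targeted: int = 3      # assumed limit on targeted requests
    max_full: int = 2          # assumed limit on complete RIB requests
    targeted_sent: int = 0
    full_sent: int = 0
    error_flag: bool = False   # maintained independently of session establishment

    def next_action(self, nlri_known: bool, targeted_supported: bool) -> Action:
        # Once a reset has already been triggered, further erroneous
        # UPDATEs are ignored rather than causing a reset loop.
        if self.error_flag:
            return Action.IGNORE_UPDATE
        if nlri_known:
            # Stage 1: request only the NLRI from the invalid UPDATE.
            if targeted_supported and self.targeted_sent < self.max_targeted:
                self.targeted_sent += 1
                return Action.TARGETED_REFRESH
            # Stage 2: request the entire RIB, again up to a limit.
            if self.full_sent < self.max_full:
                self.full_sent += 1
                return Action.FULL_REFRESH
        # Stage 3, or NLRI not determinable: reduced-impact session reset.
        self.error_flag = True
        return Action.REDUCED_IMPACT_RESET

    def on_session_established(self) -> None:
        # Request counters restart with the session; the flag does not.
        self.targeted_sent = 0
        self.full_sent = 0


state = SessionErrorState(max_targeted=2, max_full=1)
actions = [state.next_action(nlri_known=True, targeted_supported=True)
           for _ in range(5)]
```

With these assumed limits, five consecutive errors on one session escalate through two targeted requests, one full request and a reset, after which the fifth erroneous UPDATE is ignored. A local policy could instead choose to disconnect from the peer at that point, as discussed in this section.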
856 Clearly, it is advantageous to avoid this period of time for which 857 there may be no reachability for a set of NLRI, especially since the 858 BGP speaker terminating a particular session is doing so due to a 859 particular error handling policy. The graceful shutdown mechanism 860 detailed in [I-D.francois-bgp-gshut] provides a means by which a 861 BGP speaker is able to signal that a set of NLRI is to be withdrawn, 862 allowing downstream systems to pre-emptively perform a best- 863 path selection, and hence advertise new reachability information in a 864 make-before-break manner. 866 It is therefore envisaged that, where a session is to be shut down 867 based on a trigger relating to erroneous UPDATE messages being 868 received (be they repeated or not), the graceful shutdown 869 procedure is utilised, so as to reduce the forwarding impact of NLRI 870 received on the session being withdrawn. 872 8. IANA Considerations 874 This memo includes no request to IANA. 876 9. Security Considerations 878 The requirements outlined in this document provide mechanisms by 879 which erroneous BGP messages may be responded to with limited impact 880 to forwarding operation. This is of benefit to the security of a BGP 881 speaker in general. Where erroneous UPDATE messages originated 882 by a single malicious Autonomous System or router within a network 883 (or the Internet default free zone - DFZ) are propagated 884 to all devices within the same routing domain, all other NLRI 885 available over the same session become unreachable. This behaviour 886 may provide a means by which an Autonomous System can be isolated from 887 required routing domains (such as the Internet), should the relevant 888 UPDATE messages be propagated via specific paths. By reducing the 889 impact of such failures, it is envisaged that this possibility may be 890 constrained to a specific set of NLRI, or a specific topology.
892 Some mechanisms meeting the requirements specified in this document, 893 particularly those within Section 6 may provide further security 894 concerns, however, it is envisaged that these are addressed in per- 895 enhancement memos. 897 10. Acknowledgements 899 The author would like to thank the following network operators for 900 their insight, and valuable input in defining the requirements for a 901 variety of operational deployments of the BGP-4 protocol; Shane 902 Amante, Bruno Decraene, Rob Evans, David Freedman, Tom Hodgson, Sven 903 Huster, Jonathan Newton, Neil McRae, Thomas Mangin, Tom Scholl and 904 Ilya Varlashkin. 906 In addition, many thanks are extended to Jeff Haas, Wim Hendrickx, 907 Alton Lo, Keyur Patel, John Scudder, Adam Simpson and Robert Raszuk 908 for their expertise relating to implementations of the BGP-4 909 protocol. 911 11. References 913 11.1. Normative References 915 [RFC2858] Bates, T., Rekhter, Y., Chandra, R., and D. Katz, 916 "Multiprotocol Extensions for BGP-4", RFC 2858, June 2000. 918 [RFC2918] Chen, E., "Route Refresh Capability for BGP-4", RFC 2918, 919 September 2000. 921 [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway 922 Protocol 4 (BGP-4)", RFC 4271, January 2006. 924 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 925 Networks (VPNs)", RFC 4364, February 2006. 927 [RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route 928 Reflection: An Alternative to Full Mesh Internal BGP 929 (IBGP)", RFC 4456, April 2006. 931 [RFC4724] Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y. 932 Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724, 933 January 2007. 935 [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, 936 "Multiprotocol Extensions for BGP-4", RFC 4760, 937 January 2007. 939 [RFC5492] Scudder, J. and R. Chandra, "Capabilities Advertisement 940 with BGP-4", RFC 5492, February 2009. 942 11.2. 
Informational References 944 [I-D.chen-ebgp-error-handling] 945 Chen, E., Mohapatra, P., and K. Patel, "Revised Error 946 Handling for BGP Updates from External Neighbors", 947 draft-chen-ebgp-error-handling-01 (work in progress), 948 September 2011. 950 [I-D.francois-bgp-gshut] 951 Francois, P., Decraene, B., Pelsser, C., and C. Filsfils, 952 "Graceful BGP session shutdown", 953 draft-francois-bgp-gshut-01 (work in progress), 954 March 2009. 956 [I-D.frs-bgp-operational-message] 957 Raszuk, R., Shakir, R., and D. Freedman, "BGP OPERATIONAL 958 Message", draft-frs-bgp-operational-message-00 (work in 959 progress), July 2011. 961 [I-D.ietf-grow-bmp] 962 Scudder, J., Fernando, R., and S. Stuart, "BGP Monitoring 963 Protocol", draft-ietf-grow-bmp-05 (work in progress), 964 December 2010. 966 [I-D.ietf-idr-optional-transitive] 967 Scudder, J. and E. Chen, "Error Handling for Optional 968 Transitive BGP Attributes", 969 draft-ietf-idr-optional-transitive-03 (work in progress), 970 September 2010. 972 [I-D.keyupate-idr-bgp-gr-notification] 973 Patel, K., Fernando, R., Scudder, J., and J. Haas, 974 "Notification Message support for BGP Graceful Restart", 975 draft-keyupate-idr-bgp-gr-notification-00 (work in 976 progress), July 2011. 978 [I-D.keyur-bgp-enhanced-route-refresh] 979 Patel, K., Chen, E., and B. Venkatachalapathy, "Enhanced 980 Route Refresh Capability for BGP-4", 981 draft-keyur-bgp-enhanced-route-refresh-02 (work in 982 progress), March 2011. 984 [I-D.zeng-one-time-prefix-orf] 985 Zeng, Q. and J. Dong, "One-time Address-Prefix Based 986 Outbound Route Filter for BGP-4", 987 draft-zeng-one-time-prefix-orf-01 (work in progress), 988 October 2010. 990 [RFC5881] Katz, D. and D. Ward, "Bidirectional Forwarding Detection 991 (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881, 992 June 2010. 994 Author's Address 996 Rob Shakir 997 Cable&Wireless Worldwide 998 London 999 UK 1001 Email: rjs@cw.net 1002 URI: http://www.cw.com/