Network Working Group                                           M. Shand
Internet Draft                                              Les Ginsberg
Expiration Date: January 2004                              Cisco Systems
                                                               July 2003

                      Restart signaling for IS-IS
                    draft-ietf-isis-restart-04.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026 [1].

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts. Internet-Drafts are draft documents valid for a
   maximum of six months and may be updated, replaced, or obsoleted by
   other documents at any time. It is inappropriate to use
   Internet-Drafts as reference material or to cite them other than as
   "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

1. Abstract

   The IS-IS routing protocol (RFC 1195 [2], ISO/IEC 10589 [3]) is a
   link state intra-domain routing protocol.
   Normally, when an IS-IS router is restarted, temporary disruption
   of routing occurs due to events in both the restarting router and
   the neighbors of the restarting router.

   The router which has been restarted computes its own routes before
   achieving database synchronization with its neighbors. The results
   of this computation are likely to be non-convergent with the routes
   computed by other routers in the area/domain.

   Neighbors of the restarting router detect the restart event and
   cycle their adjacencies with the restarting router through the down
   state. The cycling of the adjacency state causes the neighbors to
   regenerate their LSPs describing the adjacency concerned. This in
   turn causes temporary disruption of routes passing through the
   restarting router.

   In certain scenarios the temporary disruption of the routes is
   highly undesirable. This draft describes mechanisms to avoid or
   minimize the disruption due to both of these causes.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC-2119 [3].

   If the control and forwarding functions in a router can be
   maintained independently, it is possible for the forwarding function
   state to be maintained across a control function restart. This
   functionality is assumed when the terms "restart/restarting" are
   used in this document.

   The terms "start/starting" are used to refer to a router in which
   the control function has either been started for the first time or
   has been restarted but the forwarding functions have not been
   maintained in a prior state.

   The terms "(re)start/(re)starting" are used when the text is
   applicable to both a "starting" and a "restarting" router.

3. Overview

   When an adjacency is reinitialized as a result of a neighbor
   restarting, a router does three things:

   1. It causes its own LSP(s) to be regenerated, thus triggering
      SPF runs throughout the area (or in the case of Level 2,
      throughout the domain).

   2. It sets SRMflags on its own LSP database on the adjacency
      concerned.

   3. In the case of a Point-to-Point link it transmits a (set of)
      CSNP(s) over the adjacency.

   In the case of a restarting router process, the first of these is
   highly undesirable, but the second is essential in order to ensure
   synchronization of the LSP database.

   The third action above minimizes the number of LSPs which must be
   exchanged and, if made reliable, provides a means of determining
   when the LSP databases of the neighboring routers have been
   synchronized. This is desirable whether the router is being
   restarted or not (so that the overload bit can be cleared in the
   router's own LSP, for example).

   This draft describes a mechanism for a restarting router to signal
   to its neighbors that it is restarting, and to allow them to
   reestablish their adjacencies without cycling through the down
   state, while still correctly initiating database synchronization.

   This draft additionally describes a mechanism for a restarting
   router to determine when it has achieved LSP database
   synchronization with its neighbors.

   This draft additionally describes a mechanism to optimize LSP
   database synchronization and minimize transient routing disruption
   when a router starts.

   It is assumed that the three-way handshake [4] is being used on
   Point-to-Point circuits.

4. Approach

4.1 Timers

   Three additional timers, T1, T2 and T3 are required to support the
   functionality defined in this document.

   An instance of the timer T1 is maintained per interface, and
   indicates the time after which an unacknowledged (re)start attempt
   will be repeated.
   A typical value might be 3 seconds.

   An instance of the timer T2 is maintained for each LSP database
   present in the system, i.e. for a Level 1/2 system, there will be
   an instance of the timer T2 for Level 1 and an instance for Level 2.
   This is the maximum time that the system will wait for LSPDB
   synchronization. A typical value might be 60 seconds.

   A single instance of the timer T3 is maintained for the entire
   system. It indicates the time after which the router will declare
   that it has failed to achieve database synchronization (by setting
   the overload bit in its own LSP). This is initialized to 65535
   seconds, but is set to the minimum of the remaining times of
   received IIHs containing a restart TLV with RA set and an indication
   that the neighbor has an adjacency in the UP state to the restarting
   router.

   NOTE: The timer T3 is only used by a restarting router.

4.2 Restart TLV

   A new TLV is defined to be included in IIH PDUs. The presence of
   this TLV indicates that the sender supports the functionality
   defined in this document and it carries flags that are used to
   convey information during a (re)start. All IIHs transmitted by a
   router that supports this capability MUST include this TLV.

   Type   211
   Length 1 to (3 + ID Length)
   Value
      Flags (1 octet)

          0  1  2  3  4  5  6  7
         +--+--+--+--+--+--+--+--+
         |  Reserved    |SA|RA|RR|
         +--+--+--+--+--+--+--+--+

         RR - Restart Request
         RA - Restart Acknowledgment
         SA - Suppress adjacency advertisement

      (Note: Remaining fields are required when RA bit is set)

      Remaining Time (2 octets)

         Remaining holding time (in seconds)

      Restarting Neighbor System ID (ID Length octets)

         The system ID of the neighbor to which the RA refers.
         Note: Implementations based on earlier versions of this
         document may not include this field in the TLV when RA is
         set.
         In this case a router which is expecting an RA on a LAN
         circuit SHOULD assume that the acknowledgement is directed
         at the local system.

4.2.1 Use of RR and RA bits

   The RR bit is used by a (re)starting router to signal to its
   neighbors that a (re)start is in progress, that an existing
   adjacency SHOULD be maintained even under circumstances when the
   normal operation of the adjacency state machine would require the
   adjacency to be reinitialized, to request a set of CSNPs, and to
   request setting of SRMflags.

   The RA bit is sent by the neighbor of a (re)starting router to
   acknowledge the receipt of a restart TLV with the RR bit set.

   When the neighbor of a (re)starting router receives an IIH with the
   restart TLV having the RR bit set, if there exists on this interface
   an adjacency in state "Up" with the same System ID, and in the case
   of a LAN circuit, with the same source LAN address, then,
   irrespective of the other contents of the "Intermediate System
   Neighbors" option (LAN circuits), or the "Point-to-Point Three-Way
   Adjacency" option (Point-to-Point circuits):

   a) The state of the adjacency is not changed. If this is the first
      IIH with the RR bit set that this system has received associated
      with this adjacency then the adjacency is marked as being in
      "Restart mode" and the adjacency holding time is refreshed;
      otherwise the holding time is not refreshed. The "remaining
      time" transmitted according to (b) below MUST reflect the actual
      time after which the adjacency will now expire. Receipt of a
      normal IIH with RR bit reset will clear the "Restart mode"
      state. This procedure allows the restarting router to cause the
      neighbor to maintain the adjacency long enough for restart to
      successfully complete while also preventing repetitive restarts
      from maintaining an adjacency indefinitely.
      Whether an adjacency is marked as being in "Restart mode" or
      not has no effect on adjacency state transitions.

   b) immediately (i.e. without waiting for any currently running
      timer interval to expire, but with a small random delay of a few
      tens of milliseconds on LANs to avoid "storms"), transmit over
      the corresponding interface an IIH including the restart TLV
      with the RR bit clear and the RA bit set, in the case of
      Point-to-Point adjacencies having updated the "Point-to-Point
      Three-Way Adjacency" option to reflect any new values received
      from the (re)starting router. (This allows a restarting router
      to quickly acquire the correct information to place in its
      hellos.) The "Remaining Time" MUST be set to the current time
      (in seconds) before the holding timer on this adjacency is due
      to expire. If the corresponding interface is a LAN interface,
      then the Restarting Neighbor System ID SHOULD be set to the
      System ID of the router from whom the IIH with RR bit set was
      received. This is required to correctly associate the
      acknowledgement and holding time in the case where multiple
      systems on a LAN restart at approximately the same time. This
      IIH SHOULD be transmitted before any LSPs or SNPs transmitted as
      a result of the receipt of the original IIH.

   c) if the corresponding interface is a Point-to-Point interface, or
      if the receiving router has the highest LnRouterPriority (with
      highest source MAC address breaking ties) among those routers to
      which the receiving router has an adjacency in state "Up" on
      this interface whose IIHs contain the restart TLV, excluding
      adjacencies to all routers which are considered in "Restart
      mode" (note the actual DIS is NOT changed by this process),
      initiate the transmission over the corresponding interface of a
      complete set of CSNPs, and set SRMflags on the corresponding
      interface for all LSPs in the local LSP database.
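
   The acknowledgement built in step (b) can be sketched as follows.
   This is only an illustration under stated assumptions: the flags
   diagram in 4.2 is read most significant bit first (so RR is 0x01),
   the Remaining Time is carried in network byte order, and the
   System ID is 6 octets long. All function and constant names are
   hypothetical and do not come from this document.

```python
import struct

RESTART_TLV_TYPE = 211
# Assumed bit values: the flags diagram is read most significant bit first.
RR, RA, SA = 0x01, 0x02, 0x04


def encode_restart_tlv(flags, remaining_time=None, neighbor_id=b""):
    """Build a restart TLV; the extra fields are only present with RA."""
    value = bytes([flags])
    if flags & RA:
        if remaining_time is None:
            raise ValueError("Remaining Time is required when RA is set")
        # Assumption: 2-octet Remaining Time in network byte order.
        value += struct.pack("!H", remaining_time) + neighbor_id
    return bytes([RESTART_TLV_TYPE, len(value)]) + value


def decode_restart_tlv(tlv, id_length=6):
    """Return (flags, remaining_time, neighbor_id) from an encoded TLV."""
    tlv_type, length = tlv[0], tlv[1]
    value = tlv[2:2 + length]
    if tlv_type != RESTART_TLV_TYPE or length < 1 or len(value) != length:
        raise ValueError("not a well-formed restart TLV")
    flags, remaining_time, neighbor_id = value[0], None, None
    if length >= 3:
        (remaining_time,) = struct.unpack("!H", value[1:3])
    if length >= 3 + id_length:
        # Older implementations may omit this field (see the note in 4.2).
        neighbor_id = value[3:3 + id_length]
    return flags, remaining_time, neighbor_id


def build_ack(holding_time_left, restarting_id, is_lan):
    """Acknowledgement of step (b): RR clear, RA set."""
    neighbor_id = restarting_id if is_lan else b""
    return encode_restart_tlv(RA, holding_time_left, neighbor_id)
```

   In this sketch the Restarting Neighbor System ID is only filled in
   on LAN circuits, following the SHOULD in step (b); a decoder that
   finds no such field returns None for it, which matches the
   fallback described for earlier implementations.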

   Otherwise (i.e. if there was no adjacency in the "UP" state to the
   system ID in question), process the IIH as normal by reinitializing
   the adjacency, and setting the RA bit in the returned IIH.

4.2.2 Use of SA bit

   The SA bit is used by a starting router to request that its neighbor
   suppress advertisement of the adjacency to the starting router in
   the neighbor's LSPs.

   A router which is starting has no maintained forwarding function
   state. This may or may not be the first time the router has started.
   If this is not the first time the router has started, copies of LSPs
   generated by this router in its previous incarnation may exist in
   the LSP databases of other routers in the network. These copies are
   likely to appear "newer" than LSPs initially generated by the
   starting router due to the reinitialization of LSP fragment sequence
   numbers by the starting router. This may cause temporary blackholes
   to occur until the normal operation of the update process causes the
   starting router to regenerate and flood copies of its own LSPs with
   higher sequence numbers. The temporary blackholes can be avoided if
   the starting router's neighbors suppress advertising an adjacency to
   the starting router until the starting router has been able to
   propagate newer versions of LSPs generated by previous incarnations.

   When the neighbor of a starting router receives an IIH with the
   restart TLV having the SA bit set, if there exists on this interface
   an adjacency in state "Up" with the same System ID, and in the case
   of a LAN circuit, with the same source LAN address, then
   advertisement of the adjacency to the starting router in LSPs MUST
   be suppressed. Until an IIH with the SA bit clear has been received,
   the adjacency advertisement MUST continue to be suppressed.
   If the adjacency transitions to the UP state, the new adjacency
   MUST NOT be advertised until an IIH with the SA bit clear has been
   received.

   Note that a router which suppresses advertisement of the adjacency
   to the starting router MUST NOT use this adjacency when performing
   its SPF calculation. In particular, if an implementation follows the
   example guidelines presented in [3] Annex C.2.5 Step 0:b) "pre-load
   TENT with the local adjacency database", the suppressed adjacency
   MUST NOT be loaded into the TENT.

4.3 Adjacency (re)acquisition

   Adjacency (re)acquisition is the first step in (re)initialization.
   Restarting and starting routers will make use of the RR bit in the
   restart TLV, though each will use it at different stages of the
   (re)start procedure.

4.3.1 Adjacency reacquisition during restart

   The restarting router explicitly notifies its neighbor that the
   adjacency is being reacquired, and hence that it SHOULD NOT
   reinitialize the adjacency. This is achieved by setting the RR bit
   in the restart TLV. When the neighbor of a restarting router
   receives an IIH with the restart TLV having the RR bit set, if there
   exists on this interface an adjacency in state "Up" with the same
   System ID, and in the case of a LAN circuit, with the same source
   LAN address, then the procedures described in 4.2.1 are followed.

   A router that does not support the restart capability will ignore
   the restart TLV and reinitialize the adjacency as normal, returning
   an IIH without the restart TLV.

   On restarting, a router initializes the timer T3, starts the timer
   T2 for each LSPDB and for each interface (and in the case of a LAN
   circuit, for each level) starts the timer T1 and transmits an IIH
   containing the restart TLV with the RR bit set.
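
   The timer setup performed on restarting might be sketched as
   follows. This is a simplified illustration, not a complete
   implementation: timers are represented as plain second counts, the
   values are the "typical" ones quoted in 4.1, and the class and
   function names are hypothetical. The T3 update applies the
   minimum rule stated in 4.1.

```python
from dataclasses import dataclass, field

# "Typical" values quoted in section 4.1; real values are configurable.
T1_SECONDS = 3
T2_SECONDS = 60
T3_INITIAL = 65535


@dataclass
class RestartTimers:
    t3: int = T3_INITIAL
    t2: dict = field(default_factory=dict)      # level -> seconds
    t1: dict = field(default_factory=dict)      # (interface, level) -> seconds
    hellos: list = field(default_factory=list)  # queued (interface, level, "RR")


def begin_restart(levels, p2p_interfaces, lan_interfaces):
    """Initialize T3, start T2 per LSPDB and T1 per interface (per level on LANs)."""
    timers = RestartTimers()
    for level in levels:
        timers.t2[level] = T2_SECONDS
    # Point-to-Point circuits run one T1; LAN circuits run one per level.
    keys = [(ifname, None) for ifname in p2p_interfaces]
    keys += [(ifname, level) for ifname in lan_interfaces for level in levels]
    for key in keys:
        timers.t1[key] = T1_SECONDS
        timers.hellos.append(key + ("RR",))  # IIH carrying the RR bit
    return timers


def note_acknowledgement(timers, remaining_time, neighbor_adjacency_up):
    """On an IIH with RA set, lower T3 if the neighbor's adjacency is UP."""
    if neighbor_adjacency_up:
        timers.t3 = min(timers.t3, remaining_time)
```

   Keeping T3 as the minimum over all acknowledging neighbors means
   the restarting router gives up no later than the first moment at
   which some neighbor's adjacency would otherwise expire.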

   On a Point-to-Point circuit the "Adjacency Three-Way State" SHOULD
   be set to "Init", because the receipt of the acknowledging IIH (with
   RA set) MUST cause the adjacency to enter "Up" state immediately.

   On a LAN circuit the LAN-ID assigned to the circuit SHOULD be the
   same as that used prior to the restart. In particular, for any
   circuits for which the restarting router was previously DIS, the use
   of a different LAN-ID would necessitate the generation of a new set
   of pseudonode LSPs, and corresponding changes in all the LSPs
   referencing them from other routers on the LAN. By preserving the
   LAN-ID across the restart, this churn can be prevented. To enable a
   restarting router to learn the LAN-ID used prior to restart, the
   LAN-ID specified in an IIH with RR set MUST be ignored.

   Transmission of "normal" IIHs is inhibited until the conditions
   described below are met (in order to avoid causing an unnecessary
   adjacency initialization). On expiry of the timer T1, it is
   restarted and the IIH is retransmitted as above.

   When a restarting router receives an IIH a local adjacency is
   established as usual, and if the IIH contains a restart TLV with the
   RA bit set (and on LAN circuits with a Restarting Neighbor System
   ID which matches that of the local system), the receipt of the
   acknowledgement over that interface is noted. When the RA bit is set
   and the state of the remote adjacency is UP then the timer T3 is set
   to the minimum of its current value and the value of the "Remaining
   Time" field in the received IIH.

   On a Point-to-Point link, receipt of an IIH not containing the
   restart TLV is also treated as an acknowledgement, since it
   indicates that the neighbor is not restart capable. However, since
   no CSNP is guaranteed to be received over this interface, the timer
   T1 is cancelled immediately without waiting for a complete set of
   CSNP(s).
   Synchronization may therefore be deemed complete even
   though there are some LSPs which are held (only) by this neighbor
   (see section 4.4). In this case we also want to be certain that the
   neighbor will reinitialize the adjacency in order to guarantee that
   SRMflags have been set on its database, thus ensuring eventual LSPDB
   synchronization. This is guaranteed to happen except in the case
   where the Adjacency Three-Way State in the received IIH is UP and
   the Neighbor Extended Local Circuit ID matches the extended local
   circuit ID assigned by the restarting router. In this case the
   restarting router MUST force the adjacency to reinitialize by
   setting the local Adjacency Three-Way State to DOWN and sending a
   normal IIH.

   In the case of a LAN interface, receipt of an IIH not containing the
   restart TLV is unremarkable since synchronization can still occur so
   long as at least one of the non-restarting neighboring routers on
   the LAN supports restart. Therefore T1 continues to run in this
   case. If none of the neighbors on the LAN are restart capable, T1
   will eventually expire after the locally defined number of retries.

   In the case of a Point-to-Point circuit, the "LocalCircuitID" and
   "Extended Local Circuit ID" information contained in the IIH can be
   used immediately to generate an IIH containing the correct 3-way
   handshake information. The presence of "Neighbor System ID" or
   "Neighbor Extended Local Circuit ID" information which does not
   match the values currently in use by the local system is ignored
   (since the IIH may have been transmitted before the neighbor had
   received the new values from the restarting router), but the
   adjacency remains in the initializing state until the correct
   information is received.

   In the case of a LAN circuit the source neighbor information (e.g.
   SNPAAddress) is recorded and used for adjacency establishment and
   maintenance as normal.

   When BOTH a complete set of CSNP(s) (for each active level, in the
   case of a pt-pt circuit) and an acknowledgement have been received
   over the interface, the timer T1 is cancelled.

   Once the timer T3 has expired or been cancelled, subsequent IIHs are
   transmitted according to the normal algorithms, but including the
   restart TLV with both RR and RA clear.

   If a LAN contains a mixture of systems, only some of which support
   the new algorithm, database synchronization is still guaranteed, but
   the "old" systems will have reinitialized their adjacencies.

   If an interface is active, but does not have any neighboring router
   reachable over that interface, the timer T1 would never be
   cancelled, and according to clause 4.4.1.1 the SPF would never be
   run. Therefore timer T1 is cancelled after some pre-determined
   number of expirations (which MAY be 1). (By this time any existing
   adjacency on a remote system would probably have expired anyway.)

4.3.2 Adjacency acquisition during start

   The starting router wants to ensure that in the event a neighboring
   router has an adjacency to the starting router in the UP state (from
   a previous incarnation of the starting router) that this adjacency
   is reinitialized. The starting router also wants neighboring routers
   to suppress advertisement of an adjacency to the starting router
   until LSP database synchronization is achieved. This is achieved by
   sending IIHs with the RR bit clear and the SA bit set in the restart
   TLV. The RR bit remains clear and the SA bit remains set in
   subsequent transmissions of IIHs until the adjacency has reached the
   UP state and the initial T1 timer interval (see below) has expired.

   Receipt of an IIH with RR bit clear will result in the neighboring
   router utilizing normal operation of the adjacency state machine.
   This will ensure that any old adjacency on the neighboring router
   will be reinitialized.

   On receipt of an IIH with SA bit set the behavior described in 4.2.2
   is followed.

   On starting, a router starts timer T2 for each LSPDB.

   For each interface (and in the case of a LAN circuit, for each
   level), when an adjacency reaches the UP state, the starting router
   starts a timer T1 and transmits an IIH containing the restart TLV
   with the RR bit clear and SA bit set. On expiry of the timer T1, it
   is restarted and the IIH is retransmitted with both RR and SA bits
   set (only the RR bit has changed state from earlier IIHs).

   On receipt of an IIH with RR bit set (regardless of whether SA is
   set or not) the behavior described in 4.2.1 is followed.

   When an IIH is received by the starting router and the IIH contains
   a restart TLV with the RA bit set (and on LAN circuits with a
   Restarting Neighbor System ID which matches that of the local
   system), the receipt of the acknowledgement over that interface is
   noted.

   On a Point-to-Point link, receipt of an IIH not containing the
   restart TLV is also treated as an acknowledgement, since it
   indicates that the neighbor is not restart capable. Since the
   neighbor will have reinitialized the adjacency this guarantees that
   SRMflags have been set on its database, thus ensuring eventual LSPDB
   synchronization. However, since no CSNP is guaranteed to be received
   over this interface, the timer T1 is cancelled immediately without
   waiting for a complete set of CSNP(s). Synchronization may therefore
   be deemed complete even though there are some LSPs which are held
   (only) by this neighbor (see section 4.4).

   In the case of a LAN interface, receipt of an IIH not containing the
   restart TLV is unremarkable since synchronization can still occur so
   long as at least one of the non-restarting neighboring routers on
   the LAN supports restart.
   Therefore T1 continues to run in this
   case. If none of the neighbors on the LAN are restart capable, T1
   will eventually expire after the locally defined number of retries.
   The usual operation of the update process will ensure that
   synchronization is eventually achieved.

   When BOTH a complete set of CSNP(s) (for each active level, in the
   case of a pt-pt circuit) and an acknowledgement have been received
   over the interface, the timer T1 is cancelled. Subsequent IIHs sent
   by the starting router have the RR and RA bits clear and the SA bit
   set in the restart TLV.

   Timer T1 is cancelled after some pre-determined number of
   expirations (which MAY be 1).

   When the T2 timer(s) are cancelled or expire, transmission of
   "normal" IIHs (with RR, RA, and SA bits clear) will begin.

4.3.3 Multiple levels

   A router which is operating as both a Level 1 and a Level 2 router
   on a particular interface MUST perform the above operations for each
   level.

   On a LAN interface, it MUST send and receive both Level 1 and
   Level 2 IIHs and perform the CSNP synchronizations independently for
   each level.

   On a pt-pt interface, only a single IIH (indicating support for both
   levels) is required, but it MUST perform the CSNP synchronizations
   independently for each level.

4.4 Database synchronization

   When a router is started or restarted it can expect to receive a
   (set of) CSNP(s) over each interface. The arrival of the CSNP(s) is
   now guaranteed, since an IIH with RR bit set will be retransmitted
   until the CSNP(s) are correctly received.

   The CSNPs describe the set of LSPs that are currently held by each
   neighbor. Synchronization will be complete when all these LSPs have
   been received.

   When (re)starting, a router starts an instance of timer T2 for each
   LSPDB as described in 4.3.1 or 4.3.2.
   In addition to normal
   processing of the CSNPs, the set of LSPIDs contained in the first
   complete set of CSNP(s) received over each interface is recorded,
   together with their remaining lifetime. In the case of a LAN
   interface, a complete set of CSNPs MUST consist of CSNPs received
   from neighbor(s) which are not restarting. If there are multiple
   interfaces on the (re)starting router, the recorded set of LSPIDs is
   the union of those received over each interface. LSPs with a
   remaining lifetime of zero are NOT so recorded.

   As LSPs are received (by the normal operation of the update process)
   over any interface, the corresponding LSPID entry is removed (it is
   also removed if the LSP had arrived before the CSNP containing the
   reference). When an LSPID has been held in the list for its
   indicated remaining lifetime, it is removed from the list. When the
   list of LSPIDs is empty and the timer T1 has been cancelled for all
   the interfaces that have an adjacency at this level, the timer T2 is
   cancelled.

   At this point the local database is guaranteed to contain all the
   LSP(s) (either the same sequence number, or a more recent sequence
   number) which were present in the neighbors' databases at the time
   of (re)starting. LSPs that arrived in a neighbor's database after
   the time of (re)starting may or may not be present, but the normal
   operation of the update process will guarantee that they will
   eventually be received. At this point the local database is deemed
   to be "synchronized".

   Since LSPs mentioned in the CSNP(s) with a zero remaining lifetime
   are not recorded, and those with a short remaining lifetime are
   deleted from the list when the lifetime expires, cancellation of the
   timer T2 will not be prevented by waiting for an LSP that will never
   arrive.
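
   The bookkeeping described in this section can be sketched for a
   single level as follows. This is an illustration only: LSPIDs are
   plain strings, time is an integer number of seconds supplied by
   the caller, and the class and method names are hypothetical. Where
   the same LSPID is reported over several interfaces, the sketch
   keeps the first recorded lifetime, which is a choice made here and
   not something the text specifies.

```python
class SyncTracker:
    """Tracks the LSPIDs that one level is still waiting for."""

    def __init__(self, interfaces):
        self.pending = {}                  # LSPID -> absolute expiry time
        self.received = set()              # LSPIDs already received
        self.t1_running = set(interfaces)  # interfaces whose T1 still runs
        self.t2_running = True

    def record_csnp_set(self, entries, now):
        """Record the first complete CSNP set seen on one interface.

        entries is an iterable of (lspid, remaining_lifetime) pairs.
        """
        for lspid, lifetime in entries:
            # Zero-lifetime LSPs and LSPs that already arrived are skipped.
            if lifetime == 0 or lspid in self.received:
                continue
            self.pending.setdefault(lspid, now + lifetime)
        self._check()

    def lsp_received(self, lspid):
        self.received.add(lspid)
        self.pending.pop(lspid, None)
        self._check()

    def t1_cancelled(self, interface):
        self.t1_running.discard(interface)
        self._check()

    def tick(self, now):
        """Drop entries held for their whole remaining lifetime."""
        for lspid in [l for l, expiry in self.pending.items() if expiry <= now]:
            del self.pending[lspid]
        self._check()

    def _check(self):
        # T2 is cancelled once nothing is pending and no T1 is running.
        if not self.pending and not self.t1_running:
            self.t2_running = False
```

   Because entries age out in tick(), an LSP that is advertised in a
   CSNP but expires before it can be flooded does not hold T2 open.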

4.4.1 LSP generation and flooding and SPF computation

   The operation of a router starting, as opposed to restarting, is
   somewhat different. These two cases are dealt with separately below.

4.4.1.1. Restarting

   In order to avoid causing unnecessary routing churn in other
   routers, it is highly desirable that the own LSPs generated by the
   restarting system are the same as those previously present in the
   network (assuming no other changes have taken place). It is
   important therefore not to regenerate and flood the LSPs until all
   the adjacencies have been re-established and any information
   required for propagation into the local LSPs is fully available.
   Ideally, the information is loaded into the LSPs in a deterministic
   way, such that the same information occurs in the same place in the
   same LSP (and hence the LSPs are identical to their previous
   versions). If this can be achieved, the new versions will not even
   cause SPF to be run in other systems. However, provided the same
   information is included in the set of LSPs (albeit in a different
   order, and possibly different LSPs), the result of running the SPF
   will be the same and will not cause churn to the forwarding tables.

   In the case of a restarting router, none of the router's LSPs are
   transmitted, nor are the router's own forwarding tables updated
   while the timer T3 is running.

   Redistribution of inter-level information MUST be regenerated before
   this router's LSP is flooded to other nodes. Therefore the Level-n
   non-pseudonode LSP(s) MUST NOT be flooded until the other level's T2
   timer has expired and its SPF has been run. This ensures that any
   inter-level information which is to be propagated can be included in
   the Level-n LSP(s).

   During this period, if one of the router's own (including
   pseudonodes) LSPs is received, which the local router does not
   currently have in its own database, it is NOT purged. Under normal
   operation, such an LSP would be purged, since the LSP clearly should
   not be present in the global LSP database. However, in the present
   circumstances, this would be highly undesirable, because it could
   cause premature removal of an own LSP - and hence churn in remote
   routers. Even if the local system has one or more own LSPs (which it
   has generated, but not yet transmitted) it is still not valid to
   compare the received LSP against this set, since it may be that as a
   result of propagation between Level 1 and Level 2 (or vice versa) a
   further own LSP will need to be generated when the LSP databases
   have synchronized.

   During this period a restarting router SHOULD send CSNPs as it
   normally would. Information about the router's own LSPs MAY be
   included, but if it is included it MUST be based on LSPs which have
   been received, not on versions which have been generated (but not
   yet transmitted). This restriction is necessary to prevent premature
   removal of an LSP from the global LSP database.

   When the timer T2 expires or is cancelled indicating that
   synchronization for that level is complete, the SPF for that level
   is run in order to derive any information which is required to be
   propagated to another level, but the forwarding tables are not yet
   updated.

   Once the other level's SPF has run and any inter-level propagation
   has been resolved, the 'own' LSPs can be generated and flooded. Any
   'own' LSPs which were previously ignored, but which are not part of
   the current set of 'own' LSPs (including pseudonodes) MUST then be
   purged.
Note that it is possible that a Designated Router change may
have taken place, and consequently the router SHOULD purge those
pseudonode LSPs which it previously owned, but which are now no
longer part of its set of pseudonode LSPs.

When all the T2 timers have expired or been cancelled, the timer T3
is cancelled and the local forwarding tables are updated.

If the timer T3 expires before all the T2 timers have expired or
been cancelled, this indicates that the synchronization process is
taking longer than the minimum holding time of the neighbors. The
router's own LSP(s) for levels which have not yet completed their
first SPF computation are then flooded with the overload bit set to
indicate that the router's LSPDB is not yet synchronized (and
therefore other routers MUST NOT compute routes through this
router). Normal operation of the update process resumes and the
local forwarding tables are updated. In order to prevent the
neighbors' adjacencies from expiring, IIHs with the normal interface
value for the holding time are transmitted over all interfaces with
neither RR nor RA set in the restart TLV. This will cause the
neighbors to refresh their adjacencies. The own LSP(s) will continue
to have the overload bit set until timer T2 has expired or been
cancelled.

4.4.1.2. Starting

In the case of a starting router, as soon as each adjacency is
established, and before any CSNP exchanges, the router's own zeroth
LSP is transmitted with the overload bit set. This prevents other
routers from computing routes through the router until it has
reliably acquired the complete set of LSPs. The overload bit remains
set in subsequent transmissions of the zeroth LSP (such as will
occur if a previous copy of the router's LSP is still present in the
network) while any timer T2 is running.

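The overload-bit handling of a starting router described above can be
sketched as follows. This is a minimal Python illustration under
stated assumptions: the class StartingRouter, its fields, and the
tuple format of the transmission log are hypothetical, and the sketch
models only the rule that the bit stays set while any T2 timer is
running, not any other reason a router might have for keeping it set.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class StartingRouter:
    # True while the T2 timer for that level is still running (hypothetical model).
    t2_running: Dict[int, bool]
    # Log of (adjacency, lsp_number, overload_bit) tuples "transmitted" so far.
    sent: List[Tuple[str, int, bool]] = field(default_factory=list)

    def overload_bit(self) -> bool:
        # The overload bit remains set while any T2 timer is running.
        return any(self.t2_running.values())

    def on_adjacency_up(self, adjacency: str) -> None:
        # Transmit the zeroth own LSP as soon as the adjacency is established,
        # before any CSNP exchange takes place.
        self.sent.append((adjacency, 0, self.overload_bit()))
```
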
When all the T2 timers have been cancelled, the own LSP(s) MAY be
regenerated with the overload bit clear (assuming the router isn't
in fact overloaded, and there is no other reason, such as incomplete
BGP convergence, to keep the overload bit set), and flooded as
normal.

Other 'own' LSPs (including pseudonodes) are generated and flooded
as normal, irrespective of the timer T2. The SPF is also run as
normal and the RIB and FIB updated as routes become available.

To avoid the possible formation of temporary blackholes the starting
router sets the SA bit in the restart TLV (as described in 4.3.2) in
all IIHs that it sends.

When all T2 timers have been cancelled the starting router MUST
transmit IIHs with the SA bit clear.

5. State Tables

This section presents state tables which summarize the behaviors
described in this document. Other behaviors, in particular adjacency
state transitions and LSP database update operation, are NOT
included in the state tables except where this document modifies the
behaviors described in [3] and [5].

Three state tables are presented from the point of view of a running
router, a restarting router, and a starting router.

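One way to encode such a state table in software is a lookup keyed by
(state, event). The following Python sketch transcribes the Running
Router table of Section 5.1 as an illustration. The string labels, the
name RUNNING_ROUTER_TABLE, and the function step are hypothetical, and
treating an empty table cell as "no action, state unchanged" is an
assumption of this sketch rather than something the tables state.

```python
from typing import Dict, List, Tuple

# (state, event) -> (actions, next state), transcribed from the Running
# Router table in Section 5.1. The string labels are illustrative only.
RUNNING_ROUTER_TABLE: Dict[Tuple[str, str], Tuple[List[str], str]] = {
    ("Running", "RX RR"): (
        [
            "Maintain ADJ State",
            "Send RA",
            "Set SRM, send CSNP (if ADJ is UP)",
            "Update Hold Time, set Restart Mode (if Restart Mode clear)",
        ],
        "Running",
    ),
    ("Running", "RX RR clr"): (["Clr Restart Mode"], "Running"),
    ("Running", "RX SA set"): (
        ["Suppress IS neighbor TLV in LSP(s)"],
        "ADJ Suppressed",
    ),
    ("ADJ Suppressed", "RX SA clr"): (
        ["Unsuppress IS neighbor TLV in LSP(s)"],
        "Running",
    ),
}


def step(state: str, event: str) -> Tuple[List[str], str]:
    """Look up one transition.

    Assumption: an empty cell means no action and no state change.
    """
    return RUNNING_ROUTER_TABLE.get((state, event), ([], state))
```

The same shape would extend to the restarting and starting tables by
adding their states and events as further keys.
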
5.1 Running Router

Event        | Running              | ADJ Suppressed
==============================================================
RX RR        | Maintain ADJ State   |
             | Send RA              |
             | Set SRM, send CSNP   |
             | (Note 1)             |
             | Update Hold Time,    |
             | set Restart Mode     |
             | (Note 2)             |
-------------+----------------------+-------------------------
RX RR clr    | Clr Restart Mode     |
-------------+----------------------+-------------------------
RX SA set    | Suppress IS neighbor |
             | TLV in LSP(s)        |
             | Goto ADJ Suppressed  |
-------------+----------------------+-------------------------
RX SA clr    |                      | Unsuppress IS neighbor
             |                      | TLV in LSP(s)
             |                      | Goto Running
==============================================================

Note 1: If ADJ is UP
Note 2: If Restart Mode clear

5.2 Restarting Router

Event       | Restarting | ADJ Seen RA | ADJ Seen CSNP | SPF Wait
====================================================================
Router      | Send IIH/RR|             |               |
restarts    | ADJ Init   |             |               |
            | Start T1,  |             |               |
            | T2,T3      |             |               |
------------+------------+-------------+---------------+------------
RX RA       | Adjust T3  |             | Cancel T1     |
            | Goto ADJ   |             |               |
            | Seen RA    |             |               |
------------+------------+-------------+---------------+------------
RX CSNP     | Goto ADJ   | Cancel T1   |               |
Set         | Seen CSNP  |             |               |
------------+------------+-------------+---------------+------------
RX IIH w/o  | Cancel T1  |             |               |
Restart TLV |            |             |               |
------------+------------+-------------+---------------+------------
T1 expires  | Send IIH/RR| Send IIH/RR | Send IIH/RR   |
            | Restart T1 | Restart T1  | Restart T1    |
------------+------------+-------------+---------------+------------
T1 expires  | Send IIH/  | Send IIH/   | Send IIH/     |
nth time    | normal     | normal      | normal        |
------------+------------+-------------+---------------+------------
T2 expires  | Trigger SPF|             |               |
            | Goto SPF   |             |               |
            | Wait       |             |               |
------------+------------+-------------+---------------+------------
T3 expires  | Set OL     |             |               |
            | Flood local|             |               |
            | LSPs       |             |               |
            | Update fwd |             |               |
            | plane      |             |               |
------------+------------+-------------+---------------+------------
LSP DB Sync | Cancel T2, |             |               |
            | and T3     |             |               |
            | Trigger SPF|             |               |
            | Goto SPF   |             |               |
            | Wait       |             |               |
------------+------------+-------------+---------------+------------
All SPF     |            |             |               | Clear OL
done        |            |             |               | Update fwd
            |            |             |               | plane
            |            |             |               | Flood local
            |            |             |               | LSPs
            |            |             |               | Goto Running
====================================================================

5.3 Starting Router

Event        | Starting    | ADJ Seen RA | ADJ Seen CSNP
=========================================================
Router       | Send IIH/SA |             |
starts       | Start T1,T2 |             |
-------------+-------------+-------------+---------------
RX RA        | Goto ADJ    |             | Cancel T1
             | Seen RA     |             |
-------------+-------------+-------------+---------------
RX CSNP      | Goto ADJ    | Cancel T1   |
Set          | Seen CSNP   |             |
-------------+-------------+-------------+---------------
RX IIH w     | Cancel T1   |             |
no Restart   |             |             |
TLV          |             |             |
-------------+-------------+-------------+---------------
ADJ UP       | Start T1    |             |
             | Send local  |             |
             | LSPs w OL   |             |
-------------+-------------+-------------+---------------
T1 expires   | Send IIH/RR | Send IIH/RR | Send IIH/RR
             | and SA      | and SA      | and SA
             | Restart T1  | Restart T1  | Restart T1
-------------+-------------+-------------+---------------
T1 expires   | Send IIH/SA | Send IIH/SA | Send IIH/SA
nth time     |             |             |
-------------+-------------+-------------+---------------
T2 expires   | Clear OL    |             |
             | Send IIH    |             |
             | normal      |             |
             | Goto Running|             |
-------------+-------------+-------------+---------------
LSP DB Sync  | Cancel T2   |             |
             | Clear OL    |             |
             | Send IIH    |             |
             | normal      |             |
=========================================================

6. Security Considerations

This memo does not create any new security issues for the IS-IS
protocol. Security considerations for the base IS-IS protocol are
covered in [2] and [3].

7. References

[1] Bradner, S., "The Internet Standards Process -- Revision 3", BCP
    9, RFC 2026, October 1996.

[2] Callon, R., "OSI IS-IS for IP and Dual Environment", RFC 1195,
    December 1990.

[3] ISO, "Intermediate system to Intermediate system routeing
    information exchange protocol for use in conjunction with the
    Protocol for providing the Connectionless-mode Network Service
    (ISO 8473)", ISO/IEC 10589:2002, Second Edition.

[4] Bradner, S., "Key words for use in RFCs to Indicate Requirement
    Levels", BCP 14, RFC 2119, March 1997.

[5] Katz, D., "Three-Way Handshake for IS-IS Point-to-Point
    Adjacencies", RFC 3373, September 2002.

8. Acknowledgments

The authors would like to acknowledge contributions made by Jeff
Parker, Radia Perlman, Mark Schaefer, Naiming Shen, Nischal Sheth,
Russ White, and Rena Yang.

9. Authors' Addresses

Mike Shand
Cisco Systems
250 Longwater Avenue,
Reading,
Berkshire,
RG2 6GB
UK
Phone: +44 208 824 8690
Email: mshand@cisco.com

Les Ginsberg
Cisco Systems
510 McCarthy Blvd.
Milpitas, Ca. 95035 USA
Email: ginsberg@cisco.com