Network Working Group                                          M. Shand
Internet Draft                                              L. Ginsberg
Expiration Date: July 2004                                Cisco Systems
                                                           January 2004

                     Restart signaling for IS-IS
                    draft-ietf-isis-restart-05.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts. Internet-Drafts are draft documents valid for a
maximum of six months and may be updated, replaced, or obsoleted by
other documents at any time. It is inappropriate to use
Internet-Drafts as reference material or to cite them other than as
"work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Copyright Notice Copyright (C) The Internet Society (2003). All
Rights Reserved.

Abstract

The IS-IS routing protocol (RFC 1195, ISO/IEC 10589) is a link state
intra-domain routing protocol. Normally, when an IS-IS router is
restarted, temporary disruption of routing occurs due to events in
both the restarting router and the neighbors of the restarting
router.

The router which has been restarted computes its own routes before
achieving database synchronization with its neighbors.
The results of this computation are likely to be non-convergent with
the routes computed by other routers in the area/domain.

Neighbors of the restarting router detect the restart event and
cycle their adjacencies with the restarting router through the down
state. The cycling of the adjacency state causes the neighbors to
regenerate their LSPs describing the adjacency concerned. This in
turn causes temporary disruption of routes passing through the
restarting router.

In certain scenarios the temporary disruption of the routes is
highly undesirable. This draft describes mechanisms to avoid or
minimize the disruption due to both of these causes.

Table of Contents

1. Conventions used in this document
2. Overview
3. Approach
   3.1 Timers
   3.2 Restart TLV
      3.2.1 Use of RR and RA bits
      3.2.2 Use of SA bit
   3.3 Adjacency (re)acquisition
      3.3.1 Adjacency reacquisition during restart
      3.3.2 Adjacency acquisition during start
      3.3.3 Multiple levels
   3.4 Database synchronization
      3.4.1 LSP generation and flooding and SPF computation
         3.4.1.1. Restarting
         3.4.1.2. Starting
4. State Tables
   4.1 Running Router
   4.2 Restarting Router
   4.3 Starting Router
5. Security Considerations
6. IANA Considerations
7. Normative References
8. Acknowledgments
9. Authors' Addresses
10. Full Copyright Statement

1. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
this document are to be interpreted as described in RFC-2119 [4].

If the control and forwarding functions in a router can be
maintained independently, it is possible for the forwarding function
state to be maintained across a control function restart. This
functionality is assumed when the terms "restart/restarting" are
used in this document.

The terms "start/starting" are used to refer to a router in which
the control function has either been started for the first time or
has been restarted but the forwarding functions have not been
maintained in a prior state.

The terms "(re)start/(re)starting" are used when the text is
applicable to both a "starting" and a "restarting" router.

2. Overview

When an adjacency is reinitialized as a result of a neighbor
restarting, a router does three things:

1. It causes its own LSP(s) to be regenerated, thus triggering
   SPF runs throughout the area (or in the case of Level 2,
   throughout the domain).

2. It sets SRMflags on its own LSP database on the adjacency
   concerned.

3. In the case of a Point-to-Point link it transmits a (set of)
   CSNP(s) over the adjacency.

In the case of a restarting router process, the first of these is
highly undesirable, but the second is essential in order to ensure
synchronization of the LSP database.

The third action above minimizes the number of LSPs which must be
exchanged and, if made reliable, provides a means of determining
when the LSP databases of the neighboring routers have been
synchronized. This is desirable whether the router is being
restarted or not (so that the overload bit can be cleared in the
router's own LSP, for example).

This draft describes a mechanism for a restarting router to signal
to its neighbors that it is restarting, and to allow them to
reestablish their adjacencies without cycling through the down
state, while still correctly initiating database synchronization.

This draft additionally describes a mechanism for a restarting
router to determine when it has achieved LSP database
synchronization with its neighbors and a mechanism to optimize LSP
database synchronization and minimize transient routing disruption
when a router starts.

It is assumed that the three-way handshake [5] is being used on
Point-to-Point circuits.

3. Approach

3.1 Timers

Three additional timers, T1, T2 and T3, are required to support the
functionality defined in this document.

An instance of the timer T1 is maintained per interface, and
indicates the time after which an unacknowledged (re)start attempt
will be repeated. A typical value might be 3 seconds.

An instance of the timer T2 is maintained for each LSP database
present in the system, i.e. for a Level 1/2 system there will be an
instance of the timer T2 for Level 1 and an instance for Level 2.
This is the maximum time that the system will wait for LSPDB
synchronization. A typical value might be 60 seconds.
A single instance of the timer T3 is maintained for the entire
system. It indicates the time after which the router will declare
that it has failed to achieve database synchronization (by setting
the overload bit in its own LSP). This is initialized to 65535
seconds, but is set to the minimum of the remaining times of
received IIHs containing a restart TLV with RA set and an indication
that the neighbor has an adjacency in the UP state to the restarting
router.

NOTE: The timer T3 is only used by a restarting router.

3.2 Restart TLV

A new TLV is defined to be included in IIH PDUs. The presence of
this TLV indicates that the sender supports the functionality
defined in this document and it carries flags that are used to
convey information during a (re)start. All IIHs transmitted by a
router that supports this capability MUST include this TLV.

Type   211
Length # of octets in the value field (1 to (3 + ID Length))
Value

                              No. of octets
   +-----------------------+
   | Flags                 |     1
   +-----------------------+
   | Remaining Time        |     2
   +-----------------------+
   | Restarting Neighbor ID|     ID Length
   +-----------------------+

   Flags (1 octet)

      0  1  2  3  4  5  6  7
     +--+--+--+--+--+--+--+--+
     |  Reserved    |SA|RA|RR|
     +--+--+--+--+--+--+--+--+

     RR - Restart Request
     RA - Restart Acknowledgment
     SA - Suppress adjacency advertisement

   (Note: Remaining fields are required when RA bit is set)

   Remaining Time (2 octets)

      Remaining holding time (in seconds)

   Restarting Neighbor System ID (ID Length octets)

      The system ID of the neighbor to which an RA refers. Note:
      Implementations based on earlier versions of this document
      may not include this field in the TLV when RA is set. In
      this case a router which is expecting an RA on a LAN
      circuit SHOULD assume that the acknowledgement is directed
      at the local system.
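As an illustration only, the TLV layout above could be encoded and
decoded along the following lines. This is a minimal sketch, not
part of the specification. It assumes that bit 0 in the flags
diagram is the most significant bit (so RR is 0x01, RA is 0x02 and
SA is 0x04), that Remaining Time is carried in network byte order,
and that ID Length is 6. The names RestartTLV and
decode_restart_tlv are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import Optional

RESTART_TLV_TYPE = 211
# Flag masks, assuming bit 0 in the diagram is the most significant bit.
RR, RA, SA = 0x01, 0x02, 0x04


@dataclass
class RestartTLV:
    rr: bool = False
    ra: bool = False
    sa: bool = False
    remaining_time: Optional[int] = None   # seconds; present when RA is set
    neighbor_id: Optional[bytes] = None    # Restarting Neighbor System ID

    def encode(self) -> bytes:
        flags = (RR if self.rr else 0) | (RA if self.ra else 0) | (SA if self.sa else 0)
        value = bytes([flags])
        if self.remaining_time is not None:
            value += struct.pack("!H", self.remaining_time)
            if self.neighbor_id is not None:
                value += self.neighbor_id
        # Type octet, length octet, then the value field.
        return bytes([RESTART_TLV_TYPE, len(value)]) + value


def decode_restart_tlv(data: bytes, id_length: int = 6) -> RestartTLV:
    if len(data) < 2 or data[0] != RESTART_TLV_TYPE:
        raise ValueError("not a restart TLV")
    length = data[1]
    value = data[2:2 + length]
    # This sketch accepts only: flags, flags + time, flags + time + ID.
    if len(value) != length or length not in (1, 3, 3 + id_length):
        raise ValueError("unexpected restart TLV length")
    flags = value[0]
    tlv = RestartTLV(rr=bool(flags & RR), ra=bool(flags & RA), sa=bool(flags & SA))
    if length >= 3:
        tlv.remaining_time = struct.unpack("!H", value[1:3])[0]
    if length == 3 + id_length:
        tlv.neighbor_id = value[3:]
    return tlv
```

Accepting a 3-octet value (no Restarting Neighbor System ID) mirrors
the note above about implementations based on earlier versions of
this document.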
3.2.1 Use of RR and RA bits

The RR bit is used by a (re)starting router to signal to its
neighbors that a (re)start is in progress, that an existing
adjacency SHOULD be maintained even under circumstances when the
normal operation of the adjacency state machine would require the
adjacency to be reinitialized, to request a set of CSNPs, and to
request setting of SRMflags.

The RA bit is sent by the neighbor of a (re)starting router to
acknowledge the receipt of a restart TLV with the RR bit set.

When the neighbor of a (re)starting router receives an IIH with the
restart TLV having the RR bit set, if there exists on this interface
an adjacency in state "Up" with the same System ID, and in the case
of a LAN circuit, with the same source LAN address, then,
irrespective of the other contents of the "Intermediate System
Neighbors" option (LAN circuits), or the "Point-to-Point Three-Way
Adjacency" option (Point-to-Point circuits):

a) The state of the adjacency is not changed. If this is the first
   IIH with the RR bit set that this system has received associated
   with this adjacency then the adjacency is marked as being in
   "Restart mode" and the adjacency holding time is refreshed;
   otherwise the holding time is not refreshed. The "remaining time"
   transmitted according to (b) below MUST reflect the actual time
   after which the adjacency will now expire. Receipt of a normal
   IIH with the RR bit clear will clear the "Restart mode" state.
   This procedure allows the restarting router to cause the neighbor
   to maintain the adjacency long enough for restart to successfully
   complete while also preventing repetitive restarts from
   maintaining an adjacency indefinitely. Whether an adjacency is
   marked as being in "Restart mode" or not has no effect on
   adjacency state transitions.

b) immediately (i.e. without waiting for any currently running timer
   interval to expire, but with a small random delay of a few tens
   of milliseconds on LANs to avoid "storms"), transmit over the
   corresponding interface an IIH including the restart TLV with the
   RR bit clear and the RA bit set, in the case of Point-to-Point
   adjacencies having updated the "Point-to-Point Three-Way
   Adjacency" option to reflect any new values received from the
   (re)starting router. (This allows a restarting router to quickly
   acquire the correct information to place in its hellos.) The
   "Remaining Time" MUST be set to the time (in seconds) remaining
   before the holding timer on this adjacency is due to expire. If
   the corresponding interface is a LAN interface, then the
   Restarting Neighbor System ID SHOULD be set to the System ID of
   the router from which the IIH with RR bit set was received. This
   is required to correctly associate the acknowledgement and
   holding time in the case where multiple systems on a LAN restart
   at approximately the same time. This IIH SHOULD be transmitted
   before any LSPs or SNPs transmitted as a result of the receipt of
   the original IIH.

c) if the corresponding interface is a Point-to-Point interface, or
   if the receiving router has the highest LnRouterPriority (with
   highest source MAC address breaking ties) among those routers to
   which the receiving router has an adjacency in state "Up" on this
   interface whose IIHs contain the restart TLV, excluding
   adjacencies to all routers which are considered in "Restart mode"
   (note the actual DIS is NOT changed by this process), initiate
   the transmission over the corresponding interface of a complete
   set of CSNPs, and set SRMflags on the corresponding interface for
   all LSPs in the local LSP database.

Otherwise (i.e. if there was no adjacency in the "UP" state to the
system ID in question), process the IIH as normal by reinitializing
the adjacency, and setting the RA bit in the returned IIH.

3.2.2 Use of SA bit

The SA bit is used by a starting router to request that its neighbor
suppress advertisement of the adjacency to the starting router in
the neighbor's LSPs.

A router which is starting has no maintained forwarding function
state. This may or may not be the first time the router has started.
If this is not the first time the router has started, copies of LSPs
generated by this router in its previous incarnation may exist in
the LSP databases of other routers in the network. These copies are
likely to appear "newer" than LSPs initially generated by the
starting router due to the reinitialization of LSP fragment sequence
numbers by the starting router. This may cause temporary blackholes
to occur until the normal operation of the update process causes the
starting router to regenerate and flood copies of its own LSPs with
higher sequence numbers. The temporary blackholes can be avoided if
the starting router's neighbors suppress advertising an adjacency to
the starting router until the starting router has been able to
propagate newer versions of LSPs generated by previous incarnations.

When a router receives an IIH with the restart TLV having the SA bit
set, if there exists on this interface an adjacency in state "Up"
with the same System ID, and in the case of a LAN circuit, with the
same source LAN address, then the router MUST suppress advertisement
of the adjacency to the neighbor in its own LSPs. Until an IIH with
the SA bit clear has been received, the neighbor advertisement MUST
continue to be suppressed. If the adjacency transitions to the UP
state, the new adjacency MUST NOT be advertised until an IIH with
the SA bit clear has been received.
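The receive-side rules of 3.2.1 (steps a to c and the "otherwise"
case) together with the SA rule above can be summarized in a short
sketch. This is illustrative only and makes several assumptions: the
Adjacency record, the on_restart_iih function and the action strings
are hypothetical, the LAN election of step (c) is reduced to a
boolean supplied by the caller, and the holding time is refreshed
from a value passed in by the caller.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Adjacency:
    state: str                  # "Up", "Initializing" or "Down"
    expires_in: int             # seconds until the holding timer expires
    restart_mode: bool = False
    suppressed: bool = False    # True while the adjacency must not be advertised


def on_restart_iih(adj: Adjacency, rr: bool, sa: bool, holding_time: int,
                   is_p2p: bool, sends_csnps_on_lan: bool) -> List[str]:
    """Actions taken by a neighbor for one IIH that carries the restart TLV."""
    # 3.2.2: suppression lasts until an IIH with SA clear is received.
    adj.suppressed = sa
    if not rr:
        # A normal IIH (RR clear) clears "Restart mode".
        adj.restart_mode = False
        return ["process as a normal IIH"]
    if adj.state != "Up":
        # The "otherwise" case: no Up adjacency to this system ID.
        return ["reinitialize the adjacency", "send IIH with RA set"]
    # Step (a): refresh the holding time only for the first RR received.
    if not adj.restart_mode:
        adj.restart_mode = True
        adj.expires_in = holding_time
    # Step (b): acknowledge at once, reporting the real time to expiry.
    actions = [f"send IIH with RA set, RR clear, Remaining Time {adj.expires_in}"]
    # Step (c): only one router per LAN answers with CSNPs.
    if is_p2p or sends_csnps_on_lan:
        actions += ["send a complete set of CSNPs",
                    "set SRMflags for all LSPs on this interface"]
    return actions
```

Because the holding time is refreshed only once per "Restart mode"
episode, a router that keeps sending RR cannot hold the adjacency up
indefinitely, which is the property step (a) is designed to provide.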
Note that a router which suppresses advertisement of an adjacency
MUST NOT use this adjacency when performing its SPF calculation. In
particular, if an implementation follows the example guidelines
presented in [3] Annex C.2.5 Step 0:b) "pre-load TENT with the local
adjacency database", the suppressed adjacency MUST NOT be loaded
into TENT.

3.3 Adjacency (re)acquisition

Adjacency (re)acquisition is the first step in (re)initialization.
Restarting and starting routers will make use of the RR bit in the
restart TLV, though each will use it at different stages of the
(re)start procedure.

3.3.1 Adjacency reacquisition during restart

The restarting router explicitly notifies its neighbor that the
adjacency is being reacquired, and hence that it SHOULD NOT
reinitialize the adjacency. This is achieved by setting the RR bit
in the restart TLV. When the neighbor of a restarting router
receives an IIH with the restart TLV having the RR bit set, if there
exists on this interface an adjacency in state "Up" with the same
System ID, and in the case of a LAN circuit, with the same source
LAN address, then the procedures described in 3.2.1 are followed.

A router that does not support the restart capability will ignore
the restart TLV and reinitialize the adjacency as normal, returning
an IIH without the restart TLV.

On restarting, a router initializes the timer T3, starts the timer
T2 for each LSPDB, and for each interface (and in the case of a LAN
circuit, for each level) starts the timer T1 and transmits an IIH
containing the restart TLV with the RR bit set.

On a Point-to-Point circuit the restarting router SHOULD set the
"Adjacency Three-Way State" to "Init", because the receipt of the
acknowledging IIH (with RA set) MUST cause the adjacency to enter
"Up" state immediately.
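The initial actions of a restarting router described above can be
sketched as follows. This is illustrative only. The begin_restart
function, the "lan"/"p2p" interface kinds and the dictionary
layouts are hypothetical, and the T1 and T2 values are the
"typical" values suggested in section 3.1 rather than required ones.

```python
from typing import Dict, List, Tuple

T1_SECONDS = 3        # "typical" value from section 3.1
T2_SECONDS = 60       # "typical" value from section 3.1
T3_INITIAL = 65535    # initial value from section 3.1


def begin_restart(interfaces: Dict[str, str],
                  levels: List[int]) -> Tuple[dict, List[dict]]:
    """Return the timers to start and the IIHs to send when restarting."""
    timers: dict = {"T3": T3_INITIAL}
    for level in levels:
        timers[("T2", level)] = T2_SECONDS      # one T2 per LSP database
    hellos: List[dict] = []
    for name, kind in interfaces.items():
        # One T1 per LAN interface and level, one per point-to-point interface.
        if kind == "lan":
            instances = [(name, level) for level in levels]
        else:
            instances = [(name, None)]
        for ifname, level in instances:
            timers[("T1", ifname, level)] = T1_SECONDS
            hello = {"interface": ifname, "level": level,
                     "rr": True, "ra": False, "sa": False}
            if kind == "p2p":
                # SHOULD be "Init" so that the acknowledging IIH can
                # move the adjacency to "Up" immediately.
                hello["three_way_state"] = "Init"
            hellos.append(hello)
    return timers, hellos
```

For example, a Level 1/2 router with one LAN and one point-to-point
interface would start three T1 instances, two T2 instances and one
T3, and send three IIHs with RR set.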
On a LAN circuit the LAN-ID assigned to the circuit SHOULD be the
same as that used prior to the restart. In particular, for any
circuits for which the restarting router was previously DIS, the use
of a different LAN-ID would necessitate the generation of a new set
of pseudonode LSPs, and corresponding changes in all the LSPs
referencing them from other routers on the LAN. By preserving the
LAN-ID across the restart, this churn can be prevented. To enable a
restarting router to learn the LAN-ID used prior to restart, the
LAN-ID specified in an IIH with RR set MUST be ignored.

Transmission of "normal" IIHs is inhibited until the conditions
described below are met (in order to avoid causing an unnecessary
adjacency initialization). On expiry of the timer T1, it is
restarted and the IIH is retransmitted as above.

When a restarting router receives an IIH a local adjacency is
established as usual, and if the IIH contains a restart TLV with the
RA bit set (and on LAN circuits with a Restarting Neighbor System ID
which matches that of the local system), the receipt of the
acknowledgement over that interface is noted. When the RA bit is set
and the state of the remote adjacency is UP then the timer T3 is set
to the minimum of its current value and the value of the "Remaining
Time" field in the received IIH.

On a Point-to-Point link, receipt of an IIH not containing the
restart TLV is also treated as an acknowledgement, since it
indicates that the neighbor is not restart capable. However, since
no CSNP is guaranteed to be received over this interface, the timer
T1 is cancelled immediately without waiting for a complete set of
CSNP(s). Synchronization may therefore be deemed complete even
though there are some LSPs which are held (only) by this neighbor
(see section 3.4).
In this case we also want to be certain that the
neighbor will reinitialize the adjacency in order to guarantee that
SRMflags have been set on its database, thus ensuring eventual LSPDB
synchronization. This is guaranteed to happen except in the case
where the Adjacency Three-Way State in the received IIH is UP and
the Neighbor Extended Local Circuit ID matches the extended local
circuit ID assigned by the restarting router. In this case the
restarting router MUST force the adjacency to reinitialize by
setting the local Adjacency Three-Way State to DOWN and sending a
normal IIH.

In the case of a LAN interface, receipt of an IIH not containing the
restart TLV is unremarkable since synchronization can still occur so
long as at least one of the non-restarting neighboring routers on
the LAN supports restart. Therefore T1 continues to run in this
case. If none of the neighbors on the LAN are restart capable, T1
will eventually expire after the locally defined number of retries.

In the case of a Point-to-Point circuit, the "LocalCircuitID" and
"Extended Local Circuit ID" information contained in the IIH can be
used immediately to generate an IIH containing the correct 3-way
handshake information. The presence of "Neighbor Extended Local
Circuit ID" information which does not match the value currently in
use by the local system is ignored (since the IIH may have been
transmitted before the neighbor had received the new value from the
restarting router), but the adjacency remains in the initializing
state until the correct information is received.

In the case of a LAN circuit the source neighbor information (e.g.
SNPAAddress) is recorded and used for adjacency establishment and
maintenance as normal.
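Two of the decisions a restarting router makes on receipt of an IIH,
as described in this section, can be sketched as small functions:
the T3 update rule, and the Point-to-Point case in which the
adjacency has to be forced to reinitialize. This is a sketch only.
The function and parameter names are hypothetical, and how the
router learns that the remote adjacency is UP is left to the caller.

```python
T3_INITIAL = 65535   # initial value of T3 from section 3.1


def update_t3(t3: int, ra: bool, neighbor_adj_up: bool,
              remaining_time: int) -> int:
    """New T3 value after an IIH carrying the restart TLV is received."""
    if ra and neighbor_adj_up:
        # T3 tracks the smallest remaining holding time among neighbors.
        return min(t3, remaining_time)
    return t3


def must_force_reinit(has_restart_tlv: bool, three_way_state: str,
                      neighbor_ext_circuit_id: int,
                      local_ext_circuit_id: int) -> bool:
    """Point-to-Point only: does the restarting router have to set its
    Adjacency Three-Way State to DOWN and send a normal IIH?"""
    if has_restart_tlv:
        return False    # a restart-capable neighbor sets SRMflags itself
    # A neighbor without restart support would normally reinitialize;
    # it will not if it already reports UP with our circuit ID.
    return (three_way_state == "Up"
            and neighbor_ext_circuit_id == local_ext_circuit_id)
```

With these rules T3 can only decrease during a restart, so it ends up
no larger than the shortest remaining holding time reported by any
neighbor that still has an adjacency in the UP state.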
When BOTH a complete set of CSNP(s) (for each active level, in the
case of a pt-pt circuit) and an acknowledgement have been received
over the interface, the timer T1 is cancelled.

Once the timer T3 has expired or been cancelled, subsequent IIHs are
transmitted according to the normal algorithms, but including the
restart TLV with both RR and RA clear.

If a LAN contains a mixture of systems, only some of which support
the new algorithm, database synchronization is still guaranteed, but
the "old" systems will have reinitialized their adjacencies.

If an interface is active, but does not have any neighboring router
reachable over that interface the timer T1 would never be cancelled,
and according to clause 3.4.1.1 the SPF would never be run.
Therefore timer T1 is cancelled after some pre-determined number of
expirations (which MAY be 1).

3.3.2 Adjacency acquisition during start

The starting router wants to ensure that in the event a neighboring
router has an adjacency to the starting router in the UP state (from
a previous incarnation of the starting router) that this adjacency
is reinitialized. The starting router also wants neighboring routers
to suppress advertisement of an adjacency to the starting router
until LSP database synchronization is achieved. This is achieved by
sending IIHs with the RR bit clear and the SA bit set in the restart
TLV. The RR bit remains clear and the SA bit remains set in
subsequent transmissions of IIHs until the adjacency has reached the
UP state and the initial T1 timer interval (see below) has expired.

Receipt of an IIH with RR bit clear will result in the neighboring
router utilizing normal operation of the adjacency state machine.
This will ensure that any old adjacency on the neighboring router
will be reinitialized.

On receipt of an IIH with SA bit set the behavior described in 3.2.2
is followed.
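The restart TLV flags that a starting router places in its IIHs
change as it progresses through the procedure in this section. The
sketch below collects those settings in one place, including the T1
and T2 rules given in the remainder of 3.3.2. It is illustrative
only: the function name and parameters are hypothetical, and
t2_running is assumed to mean that a T2 timer relevant to the
interface is still running.

```python
from typing import Dict


def starting_router_iih_flags(t1_running: bool, t1_expirations: int,
                              t2_running: bool) -> Dict[str, bool]:
    """Restart TLV flags for the next IIH sent by a starting router."""
    if not t2_running:
        # T2 cancelled or expired: "normal" IIHs with all bits clear.
        return {"rr": False, "ra": False, "sa": False}
    # RR is set only on retransmissions triggered by a T1 expiry;
    # before the first expiry, and after T1 is cancelled, it is clear.
    rr = t1_running and t1_expirations >= 1
    # SA stays set for as long as T2 is running.
    return {"rr": rr, "ra": False, "sa": True}
```

The first IIH after the adjacency comes up therefore carries RR
clear and SA set, so that a neighbor holding a stale adjacency
reinitializes it; RR is set only once the initial T1 interval has
expired.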
On starting, a router starts timer T2 for each LSPDB.

For each interface (and in the case of a LAN circuit, for each
level), when an adjacency reaches the UP state, the starting router
starts a timer T1 and transmits an IIH containing the restart TLV
with the RR bit clear and SA bit set. On expiry of the timer T1, it
is restarted and the IIH is retransmitted with both RR and SA bits
set (only the RR bit has changed state from earlier IIHs).

On receipt of an IIH with RR bit set (regardless of whether SA is
set or not) the behavior described in 3.2.1 is followed.

When an IIH is received by the starting router and the IIH contains
a restart TLV with the RA bit set (and on LAN circuits with a
Restarting Neighbor System ID which matches that of the local
system), the receipt of the acknowledgement over that interface is
noted.

On a Point-to-Point link, receipt of an IIH not containing the
restart TLV is also treated as an acknowledgement, since it
indicates that the neighbor is not restart capable. Since the
neighbor will have reinitialized the adjacency this guarantees that
SRMflags have been set on its database, thus ensuring eventual LSPDB
synchronization. However, since no CSNP is guaranteed to be received
over this interface, the timer T1 is cancelled immediately without
waiting for a complete set of CSNP(s). Synchronization may therefore
be deemed complete even though there are some LSPs which are held
(only) by this neighbor (see section 3.4).

In the case of a LAN interface, receipt of an IIH not containing the
restart TLV is unremarkable since synchronization can still occur so
long as at least one of the non-restarting neighboring routers on
the LAN supports restart. Therefore T1 continues to run in this
case. If none of the neighbors on the LAN are restart capable, T1
will eventually expire after the locally defined number of retries.
The usual operation of the update process will ensure that
synchronization is eventually achieved.

When BOTH a complete set of CSNP(s) (for each active level, in the
case of a pt-pt circuit) and an acknowledgement have been received
over the interface, the timer T1 is cancelled. Subsequent IIHs sent
by the starting router have the RR and RA bits clear and the SA bit
set in the restart TLV.

Timer T1 is cancelled after some pre-determined number of
expirations (which MAY be 1).

When the T2 timer(s) are cancelled or expire transmission of
"normal" IIHs (with RR, RA, and SA bits clear) will begin.

3.3.3 Multiple levels

A router which is operating as both a Level 1 and a Level 2 router
on a particular interface MUST perform the above operations for each
level.

On a LAN interface, it MUST send and receive both Level 1 and
Level 2 IIHs and perform the CSNP synchronizations independently for
each level.

On a pt-pt interface, only a single IIH (indicating support for both
levels) is required, but it MUST perform the CSNP synchronizations
independently for each level.

3.4 Database synchronization

When a router is started or restarted it can expect to receive a
(set of) CSNP(s) over each interface. The arrival of the CSNP(s) is
now guaranteed, since an IIH with RR bit set will be retransmitted
until the CSNP(s) are correctly received.

The CSNPs describe the set of LSPs that are currently held by each
neighbor. Synchronization will be complete when all these LSPs have
been received.

When (re)starting, a router starts an instance of timer T2 for each
LSPDB as described in 3.3.1 or 3.3.2. In addition to normal
processing of the CSNPs, the set of LSPIDs contained in the first
complete set of CSNP(s) received over each interface is recorded,
together with their remaining lifetime.
In the case of a LAN
interface, a complete set of CSNPs MUST consist of CSNPs received
from neighbor(s) which are not restarting. If there are multiple
interfaces on the (re)starting router, the recorded set of LSPIDs is
the union of those received over each interface. LSPs with a
remaining lifetime of zero are NOT so recorded.

As LSPs are received (by the normal operation of the update process)
over any interface, the corresponding LSPID entry is removed (it is
also removed if the LSP had arrived before the CSNP containing the
reference). When an LSPID has been held in the list for its
indicated remaining lifetime, it is removed from the list. When the
list of LSPIDs is empty and the timer T1 has been cancelled for all
the interfaces that have an adjacency at this level, the timer T2 is
cancelled.

At this point the local database is guaranteed to contain all the
LSP(s) (either the same sequence number, or a more recent sequence
number) which were present in the neighbors' databases at the time
of (re)starting. LSPs that arrived in a neighbor's database after
the time of (re)starting may or may not be present, but the normal
operation of the update process will guarantee that they will
eventually be received. At this point the local database is deemed
to be "synchronized".

Since LSPs mentioned in the CSNP(s) with a zero remaining lifetime
are not recorded, and those with a short remaining lifetime are
deleted from the list when the lifetime expires, cancellation of the
timer T2 will not be prevented by waiting for an LSP that will never
arrive.

3.4.1 LSP generation and flooding and SPF computation

The operation of a router starting, as opposed to restarting, is
somewhat different. These two cases are dealt with separately below.

3.4.1.1. Restarting

In order to avoid causing unnecessary routing churn in other
routers, it is highly desirable that the own LSPs generated by the
restarting system are the same as those previously present in the
network (assuming no other changes have taken place). It is
important therefore not to regenerate and flood the LSPs until all
the adjacencies have been re-established and any information
required for propagation into the local LSPs is fully available.
Ideally, the information is loaded into the LSPs in a deterministic
way, such that the same information occurs in the same place in the
same LSP (and hence the LSPs are identical to their previous
versions). If this can be achieved, the new versions may not even
cause SPF to be run in other systems. However, provided the same
information is included in the set of LSPs (albeit in a different
order, and possibly different LSPs), the result of running the SPF
will be the same and will not cause churn to the forwarding tables.

In the case of a restarting router, none of the router's own LSPs
are transmitted, nor are the router's own forwarding tables updated
while the timer T3 is running.

Redistribution of inter-level information MUST be regenerated before
this router's LSP is flooded to other nodes. Therefore the Level-n
non-pseudonode LSP(s) MUST NOT be flooded until the other level's T2
timer has expired and its SPF has been run. This ensures that any
inter-level information which is to be propagated can be included in
the Level-n LSP(s).

During this period, if one of the router's own (including
pseudonodes) LSPs is received, which the local router does not
currently have in its own database, it is NOT purged. Under normal
operation, such an LSP would be purged, since the LSP clearly should
not be present in the global LSP database.
However, in the present 608 circumstances, this would be highly undesirable, because it could 609 cause premature removal of an own LSP - and hence churn in remote 610 routers. Even if the local system has one or more own LSPs (which it 611 has generated, but not yet transmitted) it is still not valid to 612 compare the received LSP against this set, since it may be that as a 613 result of propagation between Level 1 and Level 2 (or vice versa) a 614 further own LSP will need to be generated when the LSP databases 615 have synchronized. 617 During this period a restarting router SHOULD send CSNPs as it 618 normally would. Information about the router's own LSPs MAY be 619 included, but if it is included it MUST be based on LSPs which have 620 been received, not on versions which have been generated (but not 621 yet transmitted). This restriction is necessary to prevent premature 622 removal of an LSP from the global LSP database. 624 When the timer T2 expires or is cancelled indicating that 625 synchronization for that level is complete, the SPF for that level 626 is run in order to derive any information which is required to be 627 propagated to another level, but the forwarding tables are not yet 628 updated. 630 Once the other level's SPF has run and any inter-level propagation 631 has been resolved, the own LSPs can be generated and flooded. Any 632 own LSPs which were previously ignored, but which are not part of 633 the current set of own LSPs (including pseudonodes) MUST then be 634 purged. Note that it is possible that a Designated Router change may 635 have taken place, and consequently the router SHOULD purge those 636 pseudonode LSPs which it previously owned, but which are now no 637 longer part of its set of pseudonode LSPs. 639 When all the T2 timers have expired or been cancelled, the timer T3 640 is cancelled and the local forwarding tables are updated. 
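
The ordering constraints described so far for a restarting router can be sketched in a few lines of Python. This is an illustrative model only, not part of the protocol specification: the class name, attribute names, and log strings are hypothetical, a Level-1-2 router is assumed, and the sketch simplifies the per-level flooding rule to "flood own LSPs once every level has synchronized and run SPF". Expiry of T3 and the purging of stale own LSPs are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class RestartingRouter:
    """Toy model of restart sequencing (all names are hypothetical)."""

    levels: frozenset = frozenset({1, 2})
    t2_running: set = field(default_factory=set)
    spf_done: set = field(default_factory=set)
    t3_running: bool = True
    own_lsps_flooded: bool = False
    forwarding_updated: bool = False
    log: list = field(default_factory=list)

    def __post_init__(self):
        # One T2 timer is started per level when the router restarts.
        self.t2_running = set(self.levels)

    def t2_finished(self, level):
        """T2 for `level` expired or was cancelled (level synchronized)."""
        self.t2_running.discard(level)
        # SPF is run for this level to derive inter-level information,
        # but the forwarding tables are deliberately left untouched.
        self.spf_done.add(level)
        self.log.append(f"spf L{level}")
        self._maybe_complete()

    def _maybe_complete(self):
        # While any T2 is still running, T3 keeps running, so own LSPs
        # are not transmitted and forwarding tables are not updated.
        if self.t2_running or self.spf_done != set(self.levels):
            return
        self.t3_running = False         # cancel T3
        self.own_lsps_flooded = True    # generate and flood own LSPs
        self.forwarding_updated = True  # update local forwarding tables
        self.log.append("flood own LSPs; update forwarding")
```

In this model, calling t2_finished(1) alone leaves T3 running and nothing flooded; only after t2_finished(2) are the own LSPs flooded and the forwarding tables updated.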

642   If the timer T3 expires before all the T2 timers have expired or
643   been cancelled, this indicates that the synchronization process is
644   taking longer than the minimum holding time of the neighbors. The
645   router's own LSP(s) for levels which have not yet completed their
646   first SPF computation are then flooded with the overload bit set to
647   indicate that the router's LSPDB is not yet synchronized (and
648   therefore other routers MUST NOT compute routes through this
649   router). Normal operation of the update process resumes and the
650   local forwarding tables are updated. In order to prevent the
651   neighbor's adjacencies from expiring, IIHs with the normal interface
652   value for the holding time are transmitted over all interfaces with
653   neither RR nor RA set in the restart TLV. This will cause the
654   neighbors to refresh their adjacencies. The own LSP(s) will continue
655   to have the overload bit set until timer T2 has expired or been
656   cancelled.

658   3.4.1.2. Starting

660   In the case of a starting router, as soon as each adjacency is
661   established, and before any CSNP exchanges, the router's own zeroth
662   LSP is transmitted with the overload bit set. This prevents other
663   routers from computing routes through the router until it has
664   reliably acquired the complete set of LSPs. The overload bit remains
665   set in subsequent transmissions of the zeroth LSP (such as will
666   occur if a previous copy of the router's LSP is still present in the
667   network) while any timer T2 is running.

669   When all the T2 timers have been cancelled, the own LSP(s) MAY be
670   regenerated with the overload bit clear (assuming the router isn't
671   in fact overloaded, and there is no other reason, such as incomplete
672   BGP convergence, to keep the overload bit set), and flooded as
673   normal.

675   Other own LSPs (including pseudonodes) are generated and flooded as
676   normal, irrespective of the timer T2. The SPF is also run as normal
677   and the RIB and FIB updated as routes become available.

679   To avoid the possible formation of temporary blackholes, the starting
680   router sets the SA bit in the restart TLV (as described in 3.3.2) in
681   all IIHs that it sends.

683   When all T2 timers have been cancelled the starting router MUST
684   transmit IIHs with the SA bit clear.

686   4. State Tables

688   This section presents state tables which summarize the behaviors
689   described in this document. Other behaviors, in particular adjacency
690   state transitions and LSP database update operation, are NOT
691   included in the state tables except where this document modifies the
692   behaviors described in [3] and [5].

694   The states named in the columns of the tables below are a mixture of
695   states that are specific to a single adjacency (ADJ Suppressed, ADJ
696   Seen RA, ADJ Seen CSNP) and states which are indicative of the state
697   of the protocol instance (Running, Restarting, Starting, SPF Wait).

699   Three state tables are presented from the point of view of a running
700   router, a restarting router, and a starting router.
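
One way to read the state tables in this section is as a transition function from (state, event) to a list of actions and a next state. The following Python sketch encodes the Running Router table (4.1) in that form. It is an illustration under stated assumptions, not a reference implementation: the function name, event strings, and action strings are hypothetical labels, empty table cells are treated as "no action", and the conditions under which a CSNP is actually sent (Note 1, Section 3.2.1c) are not modelled.

```python
from enum import Enum


class AdjState(Enum):
    RUNNING = "Running"
    ADJ_SUPPRESSED = "ADJ Suppressed"


def running_router_step(state, event, restart_mode):
    """Return (next_state, restart_mode, actions) for one event.

    `restart_mode` is the per-adjacency Restart Mode flag from table 4.1.
    Empty table cells are assumed to mean "do nothing".
    """
    actions = []
    if state is AdjState.RUNNING:
        if event == "RX RR":
            actions += ["maintain ADJ state", "send RA", "set SRM, send CSNP"]
            if not restart_mode:
                # Note 2: only done if Restart Mode is currently clear.
                actions.append("update hold time")
                restart_mode = True
        elif event == "RX RR clr":
            restart_mode = False
        elif event == "RX SA":
            actions.append("suppress IS neighbor TLV in LSP(s)")
            state = AdjState.ADJ_SUPPRESSED
    elif state is AdjState.ADJ_SUPPRESSED:
        if event == "RX SA clr":
            actions.append("unsuppress IS neighbor TLV in LSP(s)")
            state = AdjState.RUNNING
    return state, restart_mode, actions
```

Keeping the Restart Mode flag separate from the adjacency state mirrors the distinction the tables draw between per-adjacency states and per-instance states; a fuller model would carry one such flag per adjacency.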

702   4.1 Running Router

704   Event        | Running              | ADJ Suppressed
705   ==============================================================
706   RX RR        | Maintain ADJ State   |
707                | Send RA              |
708                | Set SRM, send CSNP   |
709                | (Note 1)             |
710                | Update Hold Time,    |
711                | set Restart Mode     |
712                | (Note 2)             |
713   -------------+----------------------+-------------------------
714   RX RR clr    | Clr Restart Mode     |
715   -------------+----------------------+-------------------------
716   RX SA        | Suppress IS neighbor |
717                | TLV in LSP(s)        |
718                | Goto ADJ Suppressed  |
719   -------------+----------------------+-------------------------
720   RX SA clr    |                      | Unsuppress IS neighbor
721                |                      | TLV in LSP(s)
722                |                      | Goto Running
723   ==============================================================

725   Note 1: CSNPs are sent by routers in accordance with Section 3.2.1c
726   Note 2: If Restart Mode clear

728   4.2 Restarting Router

730   Event       | Restarting         | ADJ Seen  | ADJ Seen  | SPF Wait
731               |                    | RA        | CSNP      |
732   ======================================================================
733   Router      | Send IIH/RR        |           |           |
734   restarts    | ADJ Init           |           |           |
735               | Start T1,T2,T3     |           |           |
736   ------------+--------------------+-----------+-----------+------------
737   RX RR       | Send RA            |           |           |
738   ------------+--------------------+-----------+-----------+------------
739   RX RA       | Adjust T3          |           | Cancel T1 |
740               | Goto ADJ Seen RA   |           | Adjust T3 |
741   ------------+--------------------+-----------+-----------+------------
742   RX CSNP set | Goto ADJ Seen CSNP | Cancel T1 |           |
743   ------------+--------------------+-----------+-----------+------------
744   RX IIH w/o  | Cancel T1 (Point-  |           |           |
745   Restart TLV | to-point only)     |           |           |
746   ------------+--------------------+-----------+-----------+------------
747   T1 Expires  | Send IIH/RR        |Send IIH/RR|Send IIH/RR|
748               | Restart T1         | Restart T1| Restart T1|
749   ------------+--------------------+-----------+-----------+------------
750   T1 Expires  | Send IIH/          | Send IIH/ | Send IIH/ |
751   nth time    | normal             | normal    | normal    |
752   ------------+--------------------+-----------+-----------+------------
753   T2 expires  | Trigger SPF        |           |           |
754               | Goto SPF Wait      |           |           |
755   ------------+--------------------+-----------+-----------+------------
756   T3 expires  | Set OL             |           |           |
757               | Flood local LSPs   |           |           |
758               | Update fwd plane   |           |           |
759   ------------+--------------------+-----------+-----------+------------
760   LSP DB Sync | Cancel T2 and T3   |           |           |
761               | Trigger SPF        |           |           |
762               | Goto SPF Wait      |           |           |
763   ------------+--------------------+-----------+-----------+------------
764   All SPF     |                    |           |           | Clear OL
765   done        |                    |           |           | Update fwd
766               |                    |           |           | plane
767               |                    |           |           | Flood local
768               |                    |           |           | LSPs
769               |                    |           |           | Goto Running
770   ======================================================================

772   4.3 Starting Router

774   Event        | Starting          | ADJ Seen RA| ADJ Seen CSNP
775   ==============================================================
776   Router       | Send IIH/SA       |            |
777   starts       | Start T1,T2       |            |
778   -------------+-------------------+------------+---------------
779   RX RR        | Send RA           |            |
780   -------------+-------------------+------------+---------------
781   RX RA        | Goto ADJ Seen RA  |            | Cancel T1
782   -------------+-------------------+------------+---------------
783   RX CSNP set  | Goto ADJ Seen CSNP| Cancel T1  |
784   -------------+-------------------+------------+---------------
785   RX IIH w     | Cancel T1         |            |
786   no Restart   | (Point-to-Point   |            |
787   TLV          | only)             |            |
788   -------------+-------------------+------------+---------------
789   ADJ UP       | Start T1          |            |
790                | Send local LSPs   |            |
791                | w OL              |            |
792   -------------+-------------------+------------+---------------
793   T1 Expires   | Send IIH/RR       | Send IIH/RR| Send IIH/RR
794                | and SA            | and SA     | and SA
795                | Restart T1        | Restart T1 | Restart T1
796   -------------+-------------------+------------+---------------
797   T1 Expires   | Send IIH/SA       | Send IIH/SA| Send IIH/SA
798   nth time     |                   |            |
799   -------------+-------------------+------------+---------------
800   T2 expires   | Clear OL          |            |
801                | Send IIH normal   |            |
802                | Goto Running      |            |
803   -------------+-------------------+------------+---------------
804   LSP DB Sync  | Cancel T2         |            |
805                | Clear OL          |            |
806                | Send IIH normal   |            |
807   ==============================================================

809   5. Security Considerations

811   Any new security issues raised by the procedures in this document
812   depend upon the ability of an attacker to inject a false but
813   apparently valid IIH, the ease/difficulty of which has not been
814   altered.

816   If the RR bit is set in a false IIH, neighbors who receive such an
817   IIH will continue to maintain an existing adjacency in the UP state
818   and may (re)send a complete set of CSNPs. While the latter action is
819   wasteful, neither action causes any disruption in correct protocol
820   operation.

822   If the RA bit is set in a false IIH, a (re)starting router which
823   receives such an IIH may falsely believe that there is a neighbor on
824   the corresponding interface which supports the procedures described
825   in this document. In the absence of receipt of a complete set of
826   CSNPs on that interface, this could delay the completion of
827   (re)start procedures by requiring the timer T1 to time out the
828   locally defined maximum number of retries. This behavior is the same
829   as would occur on a LAN where none of the (re)starting router's
830   neighbors support the procedures in this document and is covered in
831   Sections 3.3.1 and 3.3.2.

833   If an SA bit is set in a false IIH, this could cause suppression of
834   the advertisement of an IS neighbor which could either continue for
835   an indefinite period or occur intermittently with the result being
836   possible loss of reachability to some destinations in the network
837   and/or increased frequency of LSP flooding and SPF calculation.

839   The possibility of IS-IS PDU spoofing can be reduced by the use of
840   authentication as described in [2] and [3], and especially the use
841   of cryptographic authentication as described in [6].

843   6. IANA Considerations

845   This document defines the following new ISIS TLV that needs to be
846   reflected in the ISIS TLV code-point registry:

848   Type Description                         IIH LSP SNP
849   ---- ----------------------------------- --- --- ---
850   211  Restart TLV                          y   n   n

852   7. Normative References

854   1  Bradner, S., "The Internet Standards Process -- Revision 3", BCP
855      9, RFC 2026, October 1996.

857   2  Callon, R., "OSI IS-IS for IP and Dual Environment", RFC 1195,
858      December 1990.

860   3  ISO, "Intermediate system to Intermediate system routeing
861      information exchange protocol for use in conjunction with the
862      Protocol for providing the Connectionless-mode Network Service
863      (ISO 8473)", ISO/IEC 10589:2002, Second Edition.

865   4  Bradner, S., "Key words for use in RFCs to Indicate Requirement
866      Levels", BCP 14, RFC 2119, March 1997.

868   5  Katz, D., and Saluja, R., "Three-Way Handshake for IS-IS Point-
869      to-Point Adjacencies", RFC 3373, September 2002.

871   6  Li, T., and Atkinson, R., "Intermediate System to Intermediate
872      System (IS-IS) Cryptographic Authentication", RFC 3567, July
873      2003.

875   7  Narten, T., and Alvestrand, H., "Guidelines for Writing an IANA
876      Considerations Section in RFCs", BCP 26, RFC 2434, October 1998.

878   8. Acknowledgments

880   The authors would like to acknowledge contributions made by Jeff
881   Parker, Radia Perlman, Mark Schaefer, Naiming Shen, Nischal Sheth,
882   Russ White, and Rena Yang.

884   9. Authors' Addresses

886   Mike Shand
887   Cisco Systems
888   250 Longwater Avenue,
889   Reading,
890   Berkshire,
891   RG2 6GB
892   UK
893   Phone: +44 208 824 8690
894   Email: mshand@cisco.com

896   Les Ginsberg
897   Cisco Systems
898   510 McCarthy Blvd.
899   Milpitas, Ca. 95035 USA
900   Email: ginsberg@cisco.com

902   10. Full Copyright Statement

904   Copyright (C) The Internet Society (2003). All Rights Reserved.

906   This document and translations of it may be copied and furnished to
907   others, and derivative works that comment on or otherwise explain it
908   or assist in its implementation may be prepared, copied, published
909   and distributed, in whole or in part, without restriction of any
910   kind, provided that the above copyright notice and this paragraph
911   are included on all such copies and derivative works. However, this
912   document itself may not be modified in any way, such as by removing
913   the copyright notice or references to the Internet Society or other
914   Internet organizations, except as needed for the purpose of
915   developing Internet standards in which case the procedures for
916   copyrights defined in the Internet Standards process must be
917   followed, or as required to translate it into languages other than
918   English.

920   The limited permissions granted above are perpetual and will not be
921   revoked by the Internet Society or its successors or assigns.

923   This document and the information contained herein is provided on an
924   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
925   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
926   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
927   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
928   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.