Network Working Group                                              W. Lu
Internet-Draft                                                   A. Tian
Intended status: Standards Track                                Ericsson
Expires: September 6, 2012                                 March 5, 2012


                          ISIS Transaction TLV
                    draft-lu-isis-transaction-tlv-00

Abstract

   A local ISIS update may require multiple LSPs to convey. Receiving
   routers, whose decision processes lack this knowledge, may generate
   incorrect routing table updates from the partial set of LSPs they
   have received so far, causing traffic outages until the errors are
   corrected by another run of the decision process.
   This memo describes a method that makes the decision process more
   informed so that the interim results can be minimized or avoided.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 6, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
     1.2.  Acronyms
     1.3.  Implicit Purging
     1.4.  PDU Transaction Knowledge
   2.  Transaction TLV
     2.1.
           LSP Transaction Set
     2.2.  TLV Format
     2.3.  T-TLV Count
     2.4.  Transaction ID
   3.  Operation
     3.1.  Originator
     3.2.  Receiver
       3.2.1.  Opening the Transaction
       3.2.2.  Invalid Transaction
       3.2.3.  Processing T-TLV
       3.2.4.  Closing the Transaction
       3.2.5.  Aborting the Transaction
       3.2.6.  Exit Transaction
       3.2.7.  Timer Expiry
       3.2.8.  TID Wrap
   4.  Multiple Transaction Sets
   5.  Use Cases
     5.1.  Avoid Unwanted Purging
     5.2.  Allow reordering of TLVs in GR case
     5.3.  Help Precise SPF Scheduling
     5.4.  Other than SPFs
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.
Introduction

   Link state protocols rely on knowledge of the entire topology.
   Incomplete topology information, even if temporary, can result in
   traffic outages or routing loops. While transitional routing changes
   are inevitable and common to both OSPF [RFC2328] and ISIS
   [RFC1195][ISO.10589.1992], impacts on unchanged network connectivity
   are unnecessary and should be minimized, if not avoided entirely.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.2.  Acronyms

   IS-IS - Intermediate System to Intermediate System

   OSPF  - Open Shortest Path First

   TLV   - Type Length Value

   PDU   - Protocol Data Unit

   LSP   - Link State PDU

   SPF   - Shortest Path First

1.3.  Implicit Purging

   Compared to OSPF, these impacts are unique to ISIS, for two reasons.
   The first is that ISIS LSPs use implicit TLV purging. Although an
   LSP has an age field that can be used for purging, ISIS has no age
   granularity at the TLV level, and the TLV is the atomic unit of ISIS
   link state information. If for some reason a TLV needs to be
   relocated to a different LSP fragment (e.g., TLV-B in Figure 1 and
   Figure 2), the TLV can be perceived as being purged from the
   original LSP fragment. If the receiving IS starts its decision
   process before it sees the second LSP fragment, any reachability
   conveyed by this TLV will be lost.

            LSP-Foo.00-00                     LSP-Foo.00-01
   --------------------------------  -------------------------------
   | ............ | TLV-A | TLV-B |  | ........... | TLV-X | ----- |
   --------------------------------  -------------------------------

                     Figure 1: Frag 00 Almost Full

            LSP-Foo.00-00                     LSP-Foo.00-01
   --------------------------------  -------------------------------
   | ............
                  |   TLV-A   | - |  | ........... | TLV-X | TLV-B |
   --------------------------------  -------------------------------

               Figure 2: TLV-A Grows and Displaces TLV-B

   The second reason is that ISIS, operating directly over the data
   link layer, cannot extend the LSP size beyond the MTU limit, whereas
   OSPF can rely on IP fragmentation to extend its LSU size.
   Consequently, when an LSP fragment is full or nearly full and some
   of its TLVs need to expand, they have to be relocated to another LSP
   fragment. Alternatively, other TLVs can be moved out of the fragment
   to make room. Either way, an implicit purge condition is created.

1.4.  PDU Transaction Knowledge

   The above issues can be mitigated if the receiving routers are
   provided with LSP transaction information. In other words, if the
   receivers know how many LSPs to expect from a particular originating
   Intermediate System in order to acquire a complete topology update
   from that system, they can avoid running their decision process on
   incomplete, transitional link state information.

   There are many ways to accomplish this. To be practical, however,
   the solution should meet the following requirements:

   1.  It must be backward compatible. Adding a new TLV easily fulfills
       this;

   2.  The TLV should be simple and short so that it does not take
       significant LSP space;

   3.  The solution should be able to fall back. That is, in case of
       errors or mistakes, it can revert to the operational state that
       exists without the solution;

   4.  It can be implemented easily without adding much burden on the
       originator and its update process. In particular, it should not
       delay or change the timing of LSP flooding;

   5.  On the receiver side, the new logic should be simple and easy to
       integrate into the existing logic, such as the SPF scheduler.
       The performance cost of the addition should be negligible.

   This document describes a TLV, based on transaction knowledge, that
   receiving routers can use to make informed decisions.

2.  Transaction TLV

   A new TLV is introduced to indicate that the LSP carrying it needs
   additional LSPs to complement it. For example, in Figure 2, LSPs
   "00" and "01" both have to be included in the SPF to reflect the
   change correctly. If the receiver kicks off its SPF right after
   receiving LSP "00" and before seeing LSP "01", the reachability
   pertaining to TLV-B will be incorrectly removed, causing temporary
   traffic loss. The TLV is called the Transaction TLV (T-TLV) because
   it provides transaction knowledge about a change that involves a set
   of LSPs.

2.1.  LSP Transaction Set

   LSPs whose contents depend on one another are called an LSP
   Transaction Set. These LSPs must be processed atomically by the
   receiver's decision process to avoid incorrect results. In Figure 2,
   LSPs "00" and "01" form the LSP transaction set.

   The LSP transaction set is a subset of all LSPs originated by an IS.
   In other words, the transaction set belongs to a single IS, and its
   LSPs share the same LSP ID apart from the fragment number, that is,
   the same System ID plus pseudo node ID.

2.2.  TLV Format

   Figure 3 describes the Transaction TLV data fields:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |      Len      |              TID              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        TID (continued)        |     Count     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 3: Transaction TLV Format

   Type
      1 byte, value TBD (T-TLV);

   Len
      1 byte, 4 or 5;

   TID
      4 bytes, Transaction ID;

   Count
      1 byte, optional.

   The TLV can appear at most once in an LSP (fragment).
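
   As a non-normative illustration, the two forms of the T-TLV listed
   above (Len 4 without the count, Len 5 with it) could be encoded and
   parsed as in the Python sketch below. Several details are
   assumptions and not part of this document: the type code 250 is a
   placeholder because the type is still TBD, network byte order is
   assumed for the TID, and the function names are hypothetical.

```python
import struct
from typing import Optional, Tuple

# Placeholder only: this document leaves the T-TLV type code as TBD.
T_TLV_TYPE = 250


def encode_t_tlv(tid: int, count: Optional[int] = None) -> bytes:
    """Build a T-TLV. Pass count only for the form that carries it."""
    if not 0 <= tid <= 0xFFFFFFFF:
        raise ValueError("TID must fit in 4 bytes")
    if count is None:
        # Short form: Type, Len=4, TID (assumed network byte order).
        return struct.pack("!BBI", T_TLV_TYPE, 4, tid)
    if not 1 <= count <= 0xFF:
        raise ValueError("count must fit in 1 byte and be at least 1")
    # Long form: Type, Len=5, TID, Count.
    return struct.pack("!BBIB", T_TLV_TYPE, 5, tid, count)


def decode_t_tlv(data: bytes) -> Tuple[int, Optional[int]]:
    """Return (tid, count); count is None for the short form."""
    if len(data) < 2:
        raise ValueError("truncated TLV header")
    tlv_type, length = data[0], data[1]
    if tlv_type != T_TLV_TYPE:
        raise ValueError("not a T-TLV")
    if length not in (4, 5) or len(data) != 2 + length:
        raise ValueError("bad T-TLV length")
    (tid,) = struct.unpack("!I", data[2:6])
    count = data[6] if length == 5 else None
    return tid, count
```

   With this sketch, encode_t_tlv(7) yields a 6-byte TLV (2-byte header
   plus the 4-byte TID) and encode_t_tlv(7, count=2) yields a 7-byte
   TLV; decode_t_tlv reverses both.
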
   Each LSP in an LSP transaction set carries the same T-TLV, except
   that the last LSP MUST include the "count" field in this TLV, which
   gives the total number of LSPs in the transaction set.

2.3.  T-TLV Count

   The count field is 1 byte, the same size as the LSP number field.
   The first "count-1" LSPs use the shorter T-TLV, which has length 4.
   The last LSP uses the longer T-TLV, which contains the count field;
   the count includes the last LSP itself. This design lets the
   originator encode the TLV without having to know the count
   beforehand.

2.4.  Transaction ID

   Over time, an IS may generate and flood a number of LSP transaction
   sets. To differentiate one set from another, a monotonically
   increasing Transaction ID (TID) is used. Each IS maintains and
   manages its own TID. If an IS is also the DIS on one or more of its
   interfaces, each pseudo node has its own TID, which is independent
   of the TID of the non-pseudo node and of any other pseudo node.

   In case of a TID conflict (due to a race condition in flooding, or
   to errors), the higher TID invalidates the lower TID. When a TID
   reaches the maximum value, the TID wrap mechanism detailed in
   Section 3.2.8 is used.

3.  Operation

3.1.  Originator

   When an IS decides that its update process has to use multiple LSPs
   to convey some information atomically, it labels these LSPs with the
   T-TLV and follows the procedure below:

   1.  The T-TLV should be used for SPF-sensitive changes only;

   2.  It starts a new TID, which is maintained per LSP ID. How to
       choose the initial TID is a local decision, though the natural
       choice is 1. Each new TID MUST be greater than the previous
       one, so that the TID increases monotonically;

   3.  It is recommended that the T-TLV be added at the end of an LSP.
       The TID is stored in the LSP set space (covering all LSP
       fragments), as opposed to the individual LSP space;

   4.
       Repeat step 3 until there is nothing more to pack, at which
       point a T-TLV with the count field is inserted into the last LSP
       of the transaction set.

3.2.  Receiver

   By the nature of the ISIS protocol, a receiver that does not
   understand or support the T-TLV silently ignores it. This ensures
   backward compatibility.

   Otherwise the receiving IS enters the transaction procedure.

   An IS does not engage its decision process in this procedure for
   T-TLVs whose carrying LSPs are already installed in the database. In
   other words, the procedure is activated only on receipt of T-TLVs
   whose carrying LSPs are new.

3.2.1.  Opening the Transaction

   When a T-TLV is received, the receiving IS enters the LSP transaction
   procedure:

   1.  The TID is recorded to indicate that the current TID is active;

   2.  A protection timer is started to guard against the error case
       where the transaction cannot close in time;

   3.  LSPs in a transaction set may not arrive in the order in which
       they were sent. Whichever arrives first opens the transaction;

   4.  The transaction record (TID/count/status) is maintained under
       the LSP set space (SystemID + PseudoNodeID).

3.2.2.  Invalid Transaction

   The transaction is invalid and MUST be aborted (exited) if:

   1.  The TID is outdated. This occurs if a higher TID has been seen,
       or if a transaction with the same TID was closed in the past;

   2.  More than one T-TLV is found in an LSP;

   3.  More than one T-TLV with the count field is found in an LSP
       transaction set.

3.2.3.  Processing T-TLV

   When a T-TLV is received, the following rules apply:

   1.  Open the transaction if it is not yet open, and increment the
       corresponding local count. If the received TLV contains the
       count, note down the announced count.

   2.  If the local count equals the announced count, close the
       transaction.

3.2.4.
Closing the Transaction

   If the received T-TLV causes the local count to match the announced
   count:

   1.  Change the current transaction TID from active to closed;

   2.  Cancel the protection timer;

   3.  Exit the transaction.

   Note that any T-TLV can close the transaction as long as it causes
   the counters to match. An implementation should not assume that the
   T-TLV with the count field arrives last.

3.2.5.  Aborting the Transaction

   Any error condition can abort the current transaction. The handling
   procedure is the same as the one in Section 3.2.4.

3.2.6.  Exit Transaction

   A transaction can be terminated normally (closing) or abnormally due
   to error conditions.

   Closing and aborting the transaction are technically the same
   operation. The difference is that closing the transaction fulfills
   the purpose of the T-TLV, which is to avoid unnecessary packet loss.

   Either way, after the transaction is terminated, the decision
   process MUST no longer block its SPF; it should start the
   computation immediately or follow whatever its SPF scheduling
   mandates.

3.2.7.  Timer Expiry

   The expiry of the protection timer indicates that some transaction
   error has occurred. The receiving IS MUST abort the transaction.

   The length of the timer is a local decision.

3.2.8.  TID Wrap

   When a TID reaches the maximum (0xFFFFFFFF), the originating IS has
   to refrain from using the T-TLV for the LSP maximum age (usually 21
   minutes). The logic is similar to that of LSP sequence number
   wrapping.

4.  Multiple Transaction Sets

5.
Use Cases

5.1.  Avoid Unwanted Purging

   The unwanted purging described in Section 1.3 can be avoided using
   T-TLVs. The originator can add a T-TLV to LSP-Foo.00-00 and a T-TLV
   (count=2) to LSP-Foo.00-01. The receivers will withhold the SPF
   until both LSPs are received. The TLV-B missing from the first LSP,
   as shown in Figure 2, will not be treated as an implicit purge,
   because it will be found in the second LSP.

5.2.  Allow reordering of TLVs in GR case

   If an IS advertises many redistributed routes in its LSPs, it is not
   trivial to maintain the order of its TLVs (such as TLV 135).

   This is especially true when an IS has just gone through the graceful
   restart (GR) process. Because the RIB does not necessarily supply
   the redistributed routes in the same order as before the restart,
   reconstructing the LSPs results in LSPs with reordered TLVs.

   If the number of redistributed routes is high, they spread over
   multiple LSPs. When this set of LSPs reaches other ISes, the issue
   described in Section 5.1 can arise even if the redistributed routes
   have not changed at all.

   It is not impossible for the originating IS to use sophisticated
   means to keep those TLVs in their original order. Nevertheless, this
   issue can easily be addressed with T-TLVs.

   The restarting IS can add T-TLVs to all LSPs that are subject to TLV
   reordering, and transmit them upon exiting its graceful restart
   process. Thus the receiving ISes will not mistakenly purge some of
   the IS's external reachability prefixes.

5.3.  Help Precise SPF Scheduling

   As a link state protocol, ISIS has two conflicting goals.
   One is to respond quickly to network changes. The other is network
   stability.

   If the SPF is scheduled too eagerly, the system (and even the
   network) can melt down under a storm of activity. On the other hand,
   if there is too much delay, the network adapts to changes too slowly.

   For example, if an IS receives a burst of redistributed routes from
   BGP, it may send out dozens of LSPs to advertise all those routes.
   The receiving ISes, upon receiving the first several LSPs, usually
   start the decision process to compute the new routing table. That
   routing table is incomplete and will soon be overwritten by another
   SPF run.

   If the number of routes (and hence LSPs) is high, most such SPF runs
   are wasted. Only the last SPF contributes to the final, correct
   routing table.

   The T-TLV, if used, guides the receiving ISes to run SPF only when
   all LSPs are in place. This SPF is equivalent to the final SPF
   mentioned above. It therefore saves many SPF runs and much network
   churn. Moreover, the T-TLV-driven SPF can be started immediately,
   whereas the final SPF usually incurs some delay.

5.4.  Other than SPFs

   The T-TLV may also be used for operations that are not SPF related.
   For example, a receiving IS may choose to defer its TE database
   upload until all LSPs that carry the TE information have been
   received.

6.  Security Considerations

   This proposal does not introduce additional security issues.

7.  IANA Considerations

   A new ISIS T-TLV is introduced. The type is to be assigned by IANA.

8.  Acknowledgements

   TBD

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

9.2.
Informative References

   [ISO.10589.1992]
              International Organization for Standardization,
              "Intermediate system to intermediate system intra-domain-
              routing routine information exchange protocol for use in
              conjunction with the protocol for providing the
              connectionless-mode Network Service (ISO 8473)",
              ISO Standard 10589, 1992.

   [RFC1195]  Callon, R., "Use of OSI IS-IS for routing in TCP/IP and
              dual environments", RFC 1195, December 1990.

   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.

Authors' Addresses

   Wenhu Lu
   Ericsson
   300 Holger Way
   San Jose, California  95134
   USA

   Email: Wenhu.Lu@ericsson.com


   Albert Tian
   Ericsson
   300 Holger Way
   San Jose, California  95134
   USA

   Email: Albert.Tian@ericsson.com
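
   The receiver procedure of Section 3.2 can be illustrated, in a
   non-normative way, by the Python sketch below. It is one possible
   reading of Sections 3.2.1 through 3.2.7, not a definitive
   implementation. The class and method names, the string results, the
   5-second default timer, and the choice to simply reject a T-TLV with
   an outdated TID are all assumptions. The caller is assumed to pass
   in only T-TLVs from new LSPs and to have already checked that each
   LSP carries at most one T-TLV.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Transaction:
    tid: int
    local_count: int = 0              # T-TLVs seen so far for this TID
    announced: Optional[int] = None   # count from the long-form T-TLV
    active: bool = True               # False once closed or aborted
    deadline: float = 0.0             # protection timer expiry time


class Receiver:
    """Keeps one transaction record per LSP set (System ID + pseudo node ID)."""

    def __init__(self, timeout: float = 5.0):
        self.timeout = timeout        # timer length is a local decision
        self.records: Dict[str, Transaction] = {}

    def spf_blocked(self, lsp_set: str) -> bool:
        rec = self.records.get(lsp_set)
        return rec is not None and rec.active

    def on_t_tlv(self, lsp_set: str, tid: int,
                 count: Optional[int], now: float) -> str:
        rec = self.records.get(lsp_set)
        # Outdated: a higher TID was seen, or this TID already exited.
        if rec is not None and (
            tid < rec.tid or (tid == rec.tid and not rec.active)
        ):
            return "outdated"
        # Open a transaction; a higher TID invalidates a lower one.
        if rec is None or tid > rec.tid:
            rec = Transaction(tid=tid, deadline=now + self.timeout)
            self.records[lsp_set] = rec
        if count is not None:
            if rec.announced is not None:
                rec.active = False    # second count in the set: abort
                return "aborted"
            rec.announced = count
        rec.local_count += 1
        if rec.local_count == rec.announced:
            rec.active = False        # counters match: close
            return "closed"
        return "open"

    def on_timer(self, lsp_set: str, now: float) -> bool:
        """Abort an active transaction whose timer expired; True if aborted."""
        rec = self.records.get(lsp_set)
        if rec is not None and rec.active and now >= rec.deadline:
            rec.active = False
            return True
        return False
```

   In the Figure 2 case, if LSP "01" (count=2) arrives before LSP "00",
   the first call opens the transaction and spf_blocked() stays true
   until the second call reports "closed", at which point SPF may run.
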