Network Working Group                                           Tony Li
INTERNET DRAFT                                            Li Consulting
                                                        Tony Przygienda
                                                          Siara Systems
                                                              Henk Smit
                                                          Cisco Systems
                                                              June 1999

        Domain-wide Prefix Distribution with Multi-Level IS-IS

Status

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference material
   or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

1.0 Abstract

   This document describes extensions to the IS-IS protocol to support
   optimal routing within a multi-level domain. The IS-IS protocol is
   specified in ISO 10589 [1], with extensions for supporting IPv4
   specified in RFC 1195 [2].

   This document extends the semantics presented in RFC 1195 so that a
   routing domain running with both Level 1 and Level 2 Intermediate
   Systems (IS) [routers] can distribute IP prefixes between Level 1 and
   Level 2 and vice versa. This distribution requires certain
   restrictions to ensure that persistent forwarding loops do not form.
   The goal of this domain-wide prefix distribution is to increase the
   granularity of the routing information within the domain.

2.0 Introduction

   An IS-IS routing domain (a.k.a., an autonomous system running IS-IS)
   can be partitioned into multiple level 1 (L1) areas, and a level 2
   (L2) connected subset of the topology that interconnects all of the
   L1 areas. Within each L1 area, all routers exchange link state
   information. L2 routers also exchange L2 link state information to
   compute routes between areas.

   RFC 1195 [2] defines the Type, Length and Value (TLV) tuples that are
   used to transport IPv4 routing information in IS-IS. RFC 1195 also
   specifies the semantics and procedures for interactions between
   levels. Specifically, routers in a L1 area will exchange information
   within the L1 area.
   For IP destinations not found in the prefixes in the L1 database,
   the L1 router should forward packets to the nearest router that is
   in both L1 and L2 (i.e., an L1L2 router) with the 'attach' bit set
   in its L1 Link State Protocol Data Unit (LSP).

   Also per RFC 1195, an L1L2 router should be manually configured with
   a set of prefixes that summarize the IP prefixes found in that L1
   area. These summaries are injected into L2. RFC 1195 specifies no
   further interactions between L1 and L2 for IPv4 prefixes.

2.1 Motivations for domain-wide prefix distribution

   The mechanisms specified in RFC 1195 are appropriate in many
   situations, and lead to excellent scalability properties. However,
   in certain circumstances, the domain administrator may wish to
   sacrifice some amount of scalability and distribute more specific
   information than is described by RFC 1195. This section discusses
   the various reasons why the domain administrator may wish to make
   such a tradeoff.

   One major reason for distributing more prefix information is to
   improve the quality of the resulting routes. A well-known property
   of prefix summarization or any abstraction mechanism is that it
   necessarily results in a loss of information. This loss of
   information in turn results in the computation of a route based upon
   less information, which will frequently result in routes that are
   not optimal.

   A simple example can serve to demonstrate this adequately. Suppose
   that a L1 area has two L1L2 routers that both advertise a single
   summary of all prefixes within the L1 area. To reach a destination
   inside the L1 area, any other L2 router is going to compute the
   shortest path to one of the two L1L2 routers for that area. Suppose,
   for example, that both of the L1L2 routers are equidistant from the
   L2 source, and that the L2 source arbitrarily selects one L1L2
   router.
   This router may not be the optimal router when viewed from the L1
   topology. In fact, it may be the case that the path from the
   selected L1L2 router to the destination router traverses the L1L2
   router that was not selected. If more detailed topological
   information or more detailed metric information were available to
   the L2 source router, it could make a more optimal route
   computation.

   This situation is symmetric in that an L1 router has no information
   about prefixes in L2 or within a different L1 area. In using the
   nearest L1L2 router, that L1L2 router is effectively injecting a
   default route without metric information into the L1 area. The
   route computation that the L1 router performs is similarly
   suboptimal.

   Besides the optimality of the routes computed, there are two other
   significant drivers for the domain-wide distribution of prefix
   information.

   When a router learns multiple possible paths to external
   destinations via BGP, it will select only one of those routes to be
   installed in the forwarding table. One of the factors in the BGP
   route selection is the IGP cost to the BGP next hop address. Many
   ISP networks depend on this technique, which is known as "shortest
   exit routing". If a L1 router does not know the exact IGP metric to
   all BGP speakers in other L1 areas, it cannot do effective shortest
   exit routing.

   The third driver is the current practice of using the IGP (IS-IS)
   metric as part of the BGP Multi-Exit Discriminator (MED). The value
   in the MED is advertised to other domains and is used to inform
   other domains of the optimal entry point into the current domain.
   Current practice is to take the IS-IS metric and insert it as the
   MED value. This tends to cause external traffic to enter the domain
   at the point closest to the exit router. Note that the receiving
   domain may, based upon policy, choose to ignore the MED that is
   advertised.
   However, current practice is to distribute the IGP metric in this
   way in order to optimize routing wherever possible. This is
   possible in current networks that are only a single area, but
   becomes problematic if hierarchy is to be installed into the
   network. This is again because the loss of end-to-end metric
   information means that the MED value will not reflect the true
   distance across the advertising domain. Full distribution of prefix
   information within the domain would alleviate this problem as it
   would allow accurate computation of the IS-IS metric across the
   domain, resulting in an accurate value presented in the MED.

2.2 Scalability

   The disadvantage to performing the domain-wide prefix distribution
   described above is that it has an impact on the scalability of
   IS-IS. Areas within IS-IS help scalability in that LSPs are
   contained within a single area. This limits the size of the link
   state database, which in turn limits the complexity of the shortest
   path computation.

   Further, the summarization of the prefix information aids
   scalability in that the abstraction of the prefix information
   reduces the sheer number of data items to be transported and the
   number of routes to be computed.

   It should be noted quite strongly that the distribution of prefixes
   on a domain-wide basis impacts the scalability of IS-IS in the
   second respect. It will increase the number of prefixes throughout
   the domain. This will result in increased memory consumption,
   transmission requirements and computation requirements throughout
   the domain.

   It must also be noted that the domain-wide distribution of prefixes
   has no effect whatsoever on the first aspect of scalability, namely
   the existence of areas and the limitation of the distribution of the
   link state database.

   Thus, the net result is that the introduction of domain-wide prefix
   distribution into a formerly flat, single area network is a clear
   benefit to the scalability of that network. However, it is a
   compromise and does not provide the maximum scalability available
   with IS-IS. Domains that choose to make use of this facility should
   be aware of the tradeoff that they are making between scalability
   and optimality, and should provision and monitor their networks
   accordingly. Normal provisioning guidelines that would apply to a
   fully hierarchical deployment of IS-IS will not apply to this type
   of configuration.

3.0 New semantics for external type metrics

   RFC 1195 defines two TLVs for carrying IP prefixes. TLV 128 is
   defined to carry 'internal' prefixes and TLV 130 is defined to carry
   'external' prefixes. The original intent of RFC 1195 was to carry
   intra-domain routes within the internal prefix TLV and inter-domain
   routes or intra-domain routes from alternate IGPs in an external
   prefix TLV. Interestingly, TLV type 130 is not documented to exist
   in Level 1 LSPs.

   In addition to this distinction, RFC 1195 provides for a bit in each
   of these TLVs that distinguishes between an internal metric type and
   an external metric type. Similarly, the clear intent was that the
   internal metric type should reflect a total metric that is the sum
   of the metrics to the advertising router plus the metric to the
   prefix. Further, for an external metric type, the total metric
   should simply be the metric advertised to the prefix, not including
   the total metric necessary to reach the exit router. Prefixes with
   internal metrics are always preferred over prefixes with external
   metrics, regardless of the value of the metrics.

   It should be noted that the combination of an internal prefix with
   an external metric type is not obviously useful, and is not allowed
   by RFC 1195.
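
   The metric-type semantics described above can be illustrated with a
   minimal Python sketch. It is not taken from RFC 1195 or from any
   implementation; the Candidate structure and the function names are
   hypothetical, chosen only to show how the total metric and the
   preference between metric types would be computed.

```python
from dataclasses import dataclass

INTERNAL = "internal"
EXTERNAL = "external"


@dataclass(frozen=True)
class Candidate:
    """One candidate route to a prefix (hypothetical structure)."""
    metric_type: str         # INTERNAL or EXTERNAL, from the TLV's I/E bit
    cost_to_advertiser: int  # SPF cost to the router advertising the prefix
    advertised_metric: int   # metric carried with the prefix in the TLV


def total_metric(c: Candidate) -> int:
    # Internal: path cost to the advertiser plus the advertised metric.
    # External: the advertised metric alone; the cost of reaching the
    # exit router is not included.
    if c.metric_type == INTERNAL:
        return c.cost_to_advertiser + c.advertised_metric
    return c.advertised_metric


def preference_key(c: Candidate) -> tuple:
    # Internal metrics always win over external ones; the metric value
    # only breaks ties between candidates of the same type.
    return (0 if c.metric_type == INTERNAL else 1, total_metric(c))


def best(candidates):
    return min(candidates, key=preference_key)
```

   With these definitions, an internal candidate with a total metric of
   60 is still selected over an external candidate advertised with a
   metric of 5, because the metric type is compared before the value.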

   It should also be noted that as of this writing, the author knows of
   no deployed implementations that make use of either the external
   prefix or the external metric type. The implication is that this
   proposal is free to redefine the semantics of the external metric
   type bit without conflicting with existing protocol deployment.

   An essential property when redistributing prefixes between levels is
   to ensure that no persistent loops form in the distribution of
   information (i.e., a routing loop), as this would lead to the
   indefinite propagation of the information, even in the event that
   the information was no longer originated by some system in the
   domain. Further, a routing loop is likely to form a forwarding loop,
   where actual traffic traverses the network in a cycle in the
   topology. Forwarding loops are known to consume large amounts of
   resources and are to be avoided.

3.1 Proposed semantics for inter-area routes

   To provide the above properties, this proposal defines the following
   syntax and semantics.

   An intra-area route is a route computed based on a prefix advertised
   by some IS-IS router in the area. Thus, a prefix advertised in the
   L1 link state database may become a L1 intra-area route within the
   area of the advertiser. Similarly, a prefix advertised in the L2
   link state database may become a L2 intra-area route within L2.
   Prefixes associated with an intra-area route are also said to be
   intra-area prefixes.

   An inter-area route is a route computed based on a prefix advertised
   by an IS-IS router not in the local area. Inter-area routes exist
   either in L2, in which case they are L1->L2 inter-area routes, or in
   L1, in which case they are L2->L1 inter-area routes. Prefixes
   associated with an inter-area route are also said to be inter-area
   prefixes.

   External prefixes are reserved for prefixes originating outside of
   the IS-IS system, usually learned from another routing protocol.

   The following tables describe the types of prefixes now defined
   within IS-IS and how they are encoded:

   Level-1 LSPs         | Internal TLV (128) | External TLV (130) |
   ----------------------------------------------------------------
   Internal metric-type | L1 intra-area      | external           |
   ----------------------------------------------------------------
   External metric-type | L2->L1 inter-area  | external           |
   ----------------------------------------------------------------

   Level-2 LSPs         | Internal TLV (128) | External TLV (130) |
   ----------------------------------------------------------------
   Internal metric-type | L2 intra-area or   | external           |
                        | L1->L2 inter-area  |                    |
   ----------------------------------------------------------------
   External metric-type | should not exist   | external           |
   ----------------------------------------------------------------

   Based on these definitions and encodings, this proposal defines the
   following redistribution rules:

   1) Only L1 intra-area prefixes and external prefixes are
   redistributed from L1 into L2.

   2) All prefixes can be redistributed from L2 into L1 and become
   L2->L1 inter-area routes. A L2 prefix must not be redistributed
   into a L1 area if that same prefix is an intra-area prefix in the
   L1 area.

   3) Within L1, an intra-area prefix is preferred over an inter-area
   prefix, regardless of the comparison of the metrics.

   Based on these rules, we first observe that this proposal is free
   from routing loops. No prefix can be redistributed from L2 to L1 and
   back into L2, because the route first becomes an L1 inter-area
   prefix by rule (2) and by rule (1) cannot be redistributed into L2.
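
   The three redistribution rules, and the L2 -> L1 -> L2 case just
   argued, can be sketched in Python as follows. This is only an
   illustration under stated assumptions: the Kind enumeration, the
   function names and the representation of prefixes as strings are
   hypothetical and are not part of this proposal or of any
   implementation.

```python
from enum import Enum


class Kind(Enum):
    """Prefix types from the encoding tables (names are hypothetical)."""
    L1_INTRA = "L1 intra-area"
    L2_INTRA = "L2 intra-area"
    L1_TO_L2_INTER = "L1->L2 inter-area"
    L2_TO_L1_INTER = "L2->L1 inter-area"
    EXTERNAL = "external"


def leak_up(kind):
    """Rule (1): kind of the prefix once in L2, or None if not leaked."""
    if kind is Kind.L1_INTRA:
        return Kind.L1_TO_L2_INTER
    if kind is Kind.EXTERNAL:
        return Kind.EXTERNAL
    return None  # L2->L1 inter-area prefixes never go back up


def leak_down(kind, prefix, l1_intra_prefixes):
    """Rule (2): kind of the prefix once in L1, or None if not leaked."""
    if kind is None or prefix in l1_intra_prefixes:
        return None  # the area already originates this prefix itself
    return Kind.L2_TO_L1_INTER


def prefer_in_l1(candidates):
    """Rule (3): pick among (kind, metric) pairs for one prefix in L1."""
    # Intra-area sorts first regardless of metric; the metric only
    # breaks ties between candidates of the same kind.
    return min(candidates, key=lambda c: (c[0] is not Kind.L1_INTRA, c[1]))


# Whatever kind a prefix has in L2, leaking it down and then up again
# yields None, i.e. it cannot re-enter L2.
L2_KINDS = (Kind.L2_INTRA, Kind.L1_TO_L2_INTER, Kind.EXTERNAL)
assert all(leak_up(leak_down(k, "10.0.0.0/8", set())) is None
           for k in L2_KINDS)
```

   In the sketch, leak_down also refuses a prefix that is already an
   intra-area prefix of the target L1 area, which is the second
   condition of rule (2).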
   Similarly, a prefix redistributed from L1 to L2 becomes an L2
   inter-area prefix by rule (1) but will not be redistributed into the
   original L1 area by rule (2).

   Even when following all the indicated rules, there is the
   possibility of a transient routing loop when the original prefix is
   withdrawn and the inter-area prefix is selected. However, all link
   state protocols are subject to transient routing loops, so this is
   no worse than the status quo.

   Note that this proposal is not radically different from the current
   semantics of RFC 1195: internal metric types are always preferred
   over externals, so rule (3) is an extension that allows external
   metric types in internal prefix TLVs. It does not introduce a new
   comparison between internal and external metric values.

3.2 Transition issues

   Because no implementations currently make use of the external metric
   type, the deployment of prefixes with an external metric type is
   somewhat problematic. There is the possibility that the new type of
   advertisement may result in software instability in systems that do
   not deal with even the original semantics correctly. Further, there
   is a danger that haphazard deployment of systems supporting this
   proposal and legacy systems would have an unfortunate interaction.
   It is required, for any L1 area that should perform the mutual
   redistribution described in this proposal, that the L1L2 systems be
   updated first. If these systems operate correctly, this is
   sufficient to ensure that there are no persistent routing loops. In
   the case where L1L2 systems are not upgraded, persistent routing
   loops are possible.
   Consider the following figure, which gives a corresponding example:

    Level 1/8 @ 200
           |
           |     +-- L2 link cost 1 --+
           |     |                    |
           |  computes 1/8            |
           |  @ 64 through (B)        |
          [1]    |                    |
           V     V                    ^
        +--+--------+           +-----+-----+
        | new style |           | old style |
    (A) |   L1/L2   |           |   L1/L2   | (B)
        | leaks 1/8 |           | leaks up  |
        | @E-cost 63|           | @E-cost 63|
        +----+------+           +-------+---+
             V                          ^
             |                          |
             |                  computes L1 route
             |                   1/8 @ cost 128
             |                          |
             +---- L1 link with cost 1 -+

   Originally, a route to prefix 1/8 with a cost of 200 is computed by
   the upgraded L1L2 router (A) as the best route towards 1/8, through
   interface 1. The prefix is leaked at the maximum cost of 63 (with
   the I/E bit set) into the L1 area and is used by L1L2 router (B),
   which has not been upgraded, to compute a best route to 1/8 at cost
   128 in L1. We assume that (B) does not mask out the I/E bit but
   uses it as part of the metric (63 + 64 = 127, plus the L1 link cost
   of 1); however, the scenario holds equally if (B) perceives the
   metric to be 63. This L1 route will be preferred by (B) over a
   computed L2 route. Assuming that (B) leaks 1/8 into L2, (A) will
   use it for another L2 computation that ends up with a shorter L2
   route to 1/8 through (B). Hence, a forwarding loop has been formed.

   As described in the previous section, rule (2) must be followed to
   prevent looping when this extension is deployed in an area where L1
   routers that understand the semantics of the L1 external metric are
   mixed with RFC 1195 routers that treat the metric as purely
   internal. The following example visualizes a forwarding loop
   encountered under those assumptions.

                               1/8 with L2 @ cost 200
                              /
                             /
                 +==========+
                 |  L1/L2   |    routing table for 1/8
                 | leaking  |    (top is active route)
             (A) | 1/8 down |    L1   @ 131 active
                 | with cost|    L2   @ 200
                 | of 63 and|    L1-E @ 127
                 | I/E set  |
                 +====+====++
                      |      \
                      |       \
                      |        \ cost 1
                      |         \
              cost 1  |        +-+--+  routing table
                      |    (C) | L1 |  L1-E @128 active
                      |        +---++  L1   @130
                      |              \
                      |               \
                      |            path with
                      |            total cost
routing table     +---+-+          of 130
L1-E @128 active  | L1  | (B)             \
L1   @131         +---+-+                  \
                      |                     \
                      |                      \
                      |                     +-+--+
                      +- path with ---------+ L1 | advertises
                         total cost         +----+ 1/8 as L1
                         of 131              (D)   attached
                                                   prefix

   (A) is an L1L2 router and leaks into L1 a prefix 1/8 that it
   computed through L2, at the maximum cost of 127 (or, expressed
   differently, at cost 63 with the I/E bit set), which violates rule
   (2). Routers (B), (C) and (D) are all purely RFC 1195 compliant
   routers, so they perceive the leaked prefix as internal L1. At the
   same time, (D) advertises the same prefix 1/8 into L1 as an L1
   directly attached subnet. To distinguish the different copies, the
   leaked prefix is shown as L1-E (for L1 external metric). (B)
   computes the L1-E route at a cost of 127+1 and prefers it to the one
   through (D), since that route has a cost of 131. Therefore (B)
   forwards a packet for 1/8 towards (A). (A) cannot prefer the L1-E
   route, since it could not really forward using it but would have to
   use L2 to get the packet into the L2 backbone. However, the L1
   route computed to (D) must be preferred to L2 based on the usual
   preference rules. Hence, (A) forwards the packet towards (C). (C)
   has the L1-E route as preferred (since it looks like a cheaper L1
   route to 1/8 than the L1 route through (D)) and forwards the packet
   back to (A).

4.0 Comparisons with other proposals

   Another proposal is currently being discussed which is similar to
   this one in nature.

   In [3], a new TLV is proposed to transport IP prefix information.
   Because this is a new TLV, it is somewhat harder to deploy, requiring
   that all systems understand the new TLV before it can become
   effective. For this reason, this proposal provides an alternative
   that can be deployed sooner. There is no effective semantic
   difference between the two proposals. In [3], a bit is defined to
   mark a prefix as 'up' or 'down'. This is essentially the same
   semantics as is proposed here.

5.0 Security Considerations

   This document raises no new security issues for IS-IS.

6.0 References

   [1] ISO 10589, "Intermediate System to Intermediate System
       Intra-Domain Routeing Exchange Protocol for use in Conjunction
       with the Protocol for Providing the Connectionless-mode Network
       Service (ISO 8473)" [Also republished as RFC 1142]

   [2] RFC 1195, "Use of OSI IS-IS for routing in TCP/IP and dual
       environments", R.W. Callon, Dec. 1990

   [3] Smit, H., Li, T. "IS-IS extensions for Traffic Engineering",
       draft-ietf-isis-traffic-00.txt, work in progress

7.0 Authors' Addresses

   Tony Li
   Li Consulting
   Email: tony1@home.net

   Tony Przygienda
   Siara Systems
   300 Ferguson Drive
   Mountain View, CA 94043
   Email: prz@siara.com
   Voice: +1 650 237 2173

   Henk Smit
   Cisco Systems, Inc.
   210 West Tasman Drive
   San Jose, CA 95134
   Email: hsmit@cisco.com
   Voice: +31 20 342 3736