idnits 2.17.1 draft-richardson-fib-reduction-00.txt: ** The Abstract section seems to be numbered Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-23) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 16 longer pages, the longest (page 2) being 60 lines == It seems as if not all pages are separated by form feeds - found 0 form feeds but 17 pages Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Introduction section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 18 instances of too long lines in the document, the longest one being 8 characters in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (June 1996) is 10174 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? '2' on line 707 looks like a reference -- Missing reference section? '3' on line 710 looks like a reference -- Missing reference section? '4' on line 712 looks like a reference -- Missing reference section? '1' on line 704 looks like a reference -- Missing reference section? '8' on line 731 looks like a reference -- Missing reference section? '10' on line 735 looks like a reference -- Missing reference section? '5' on line 717 looks like a reference -- Missing reference section? '6' on line 720 looks like a reference -- Missing reference section? '7' on line 724 looks like a reference -- Missing reference section? '9' on line 733 looks like a reference Summary: 11 errors (**), 0 flaws (~~), 3 warnings (==), 12 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group Steven J. Richardson 2 Internet Draft Merit Network, Inc. 3 June 1996 4 Expire in six months 6 Vertical Aggregation: A Strategy for FIB Reduction 7 9 1. Status of this Memo 11 This document is an Internet-Draft. Internet-Drafts are working 12 documents of the Internet Engineering Task Force (IETF), its 13 areas, and its working groups. Note that other groups may also 14 distribute working documents as Internet-Drafts. 16 Internet-Drafts are draft documents valid for a maximum of six 17 months and may be updated, replaced, or obsoleted by other 18 documents at any time. It is inappropriate to use Internet- 19 Drafts as reference material or to cite them other than as 20 ``work in progress.'' 22 To learn the current status of any Internet-Draft, please check 23 the ``1id-abstracts.txt'' listing contained in the Internet- 24 Drafts Shadow Directories on ftp.is.co.za (Africa), 25 nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), 26 ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast). 28 2. Abstract 30 This document analyzes Network Layer Reachability Information (NLRI) 31 flow within a router with emphasis on the Forwarding Information 32 Base(s)--FIB(s)--and the effects of currently-implemented aggregation 33 schemes on FIB size. The conditions for reduction of FIB information 34 before insertion into the kernel are considered, with preservation of 35 routing a primary criteria (essentially deriving in a more structured 36 manner the result that one wishes to remove extraneous routing 37 information extant in the FIB(s)). Finally, aggregation algorithms 38 consistent with these conditions are presented. 40 3. Table of Contents 42 1. Status of this Memo 43 2. Abstract 44 3. Table of Contents 45 4. Introduction 46 4.1 NLRI Flow and NLRI Interaction with a Router's FIB(s) and Conceptual RIBs 47 4.1.1 NLRI Flow Between Routers 48 4.1.2 NLRI Flow Within a Router 49 4.1.3 NLRI Flow in Relation to Router's Conceptual RIBs and FIB(s) 50 4.2 Horizontal and Vertical NLRI Flows 51 4.3 Horizontal Aggregation 52 5. Vertical Aggregation 53 5.1 Character of the Current FIB Filter Function f() 54 5.2 New FIB Filter Function--General Character and Constraints 55 5.3 Implications of the Constraint 56 5.3.1 CIDR-Related Implications of the Constraint 57 5.3.2 BGP4-Related Implications of the Constraint 58 5.3.3 Related Considerations 59 5.4 Solution of f() 60 5.4.1 FIB Representation 61 5.4.2 Solution of f()--Simple Case 62 5.4.3 Solution of f()--General Case 63 6. Conclusion and Next Steps 64 7. Security Considerations 65 8. Acknowledgements 66 9. Author's Address 67 10. Notes 68 11. References 70 4. Introduction 72 In order to better understand the proposed FIB reduction scheme, we 73 will first review 75 - the RIB and FIB structure and interaction in a typical router, 76 - the different flows of Network-Layer Reachability Information 77 (NLRI), and 78 - the type of aggregation ("horizontal aggregation") currently 79 implemented. 81 4.1 NLRI Flow and NLRI Interaction with a Router's FIB(s) and Conceptual 82 RIBs 84 4.1.1 NLRI Flow Between Routers 86 It is important to clearly have in mind the flow of NLRI both between 87 different routers and also within a single router. Diagram 1 depicts 88 the flow of NLRI in the former case, which we will label "horizontal" 89 flow of NLRI, in part because we speak of routers as being "peers" 90 and so, by implication, being of equal standing. (Note that NLRI and 91 packet flow, though both bidirectional in nature, have an opposite 92 sense in that packets flow in the reverse direction of NLRI.) 94 +----------+ NLRI +----------+ NLRI +----------+ 95 | |<========>| |<========>| | 96 | Router X | | Router Y | | Router Z | 97 | |<-------->| |<-------->| | 98 +----------+ Packets +----------+ Packets +----------+ 100 Diagram 1 101 Flow of NLRI Between Different Routers 102 ("Horizontal" Flow of NLRI) 104 The entire traffic of NLRI which flow on the Internet can be 105 considered to be a stream of NLRI into which routers are placed; the 106 routers can act to modify this stream via policy, including adding to 107 the stream. 109 4.1.2 NLRI Flow Within a Router 111 Internal to a router, a portion of this stream of NLRI is cached off 112 to one side; this is the FIB(s) of the router, the actual routing 113 table(s) used in packet forwarding. Since this flow of NLRI is 114 orthogonal to the peer-to-peer or horizontal flow, we will call this 115 the "vertical" flow of NLRI. This is shown in Diagram 2, which 116 presents a simplified view of NLRI flow within a router--e.g., Router 117 Y of Diagram 1--and which points out the difference between the 118 "horizontal" flow "through" the router versus the "vertical" NLRI 119 flow from this stream to the local FIB(s). 121 Horizontal NLRI Flow ---> 123 +------------------+ 124 | ________ | 125 | / Cloud \ | 126 Inbound NLRI ==>|==>( of )==>|==> Outbound NLRI 127 | \ RIBs / | 128 | -------- | 129 | || | | 130 | || | | Vertical NLRI Flow 131 | \/ | V 132 | +------+ | 133 | +------+| | 134 | | || | 135 | |FIB(s)|| | 136 | | |+ | 137 | +------+ | 138 +------------------+ 139 Router 141 Diagram 2 142 Flow of NLRI Within a Single Router 143 ("Vertical" Flow of NLRI) 145 4.1.3 NLRI Flow in Relation to Router's Conceptual RIBs and FIB(s) 147 Note that Diagram 2 has lumped the internal RIB structure of the 148 router into a cloud (well, OK, it's some sort of cheesy ASCII lump 149 ;-). In order to better model the internal flow of NLRI in a single 150 router, one must consider the detailed conceptual structure of a 151 router's RIB(s) and the relation of the RIB(s) to the FIB(s); this is 152 depicted in Diagram 3, below.[2] 153 +-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- + 154 |Adj-RIBs-In Adj-RIBs-Out| 155 +----+ +----+ 156 | +----+| I +----+| | 157 Peer 1 ==> | || ==>-\ +- - - - - - - - ->/-==> | || ==> Peer 1 158 | | |+ || | || | |+ | 159 +----+ || || +----+ 160 | || | || | 161 +----+ || || +----+ 162 | +----+| || | I || +----+| | 163 Peer 2 ==> | || ==>|| +- - - - - - - - ->||==> | || ==> Peer 2 164 | | |+ || | || | |+ | 165 +----+ || || +----+ 166 | || | || | 167 . . . . . . 168 . . . . . . 169 . . . . . . 170 | || | || | 171 +----+ || || +----+ 172 | +----+| || | I || +----+| | 173 Peer N ==> | || ==>|| +- - - - - - - - ->||==> | || ==> Peer N 174 | | |+ || | || | |+ | 175 +----+ || || +----+ 176 | || |I || | 177 Import || || Export 178 | Policies || | Loc-RIBs || Policies | 179 (Inbound || +----+ | || (Outbound 180 | Filters) || | +----+| | || Filters) | 181 \+==+==> | || =====>+/ 182 | | |+ | | 183 +----+ | 184 | || A() | 185 || 186 | RIB(s) || | 187 +-- -- -- -- -- -- -- -- ||- -- -- -- -- -- -- -- --+ 188 || 189 -----||----- f() 190 || 191 \/ 192 FIBs 193 +----+ 194 +----+| 195 | || 196 | |+ 197 +----+ 199 RIBs and FIBs in a Single Router 200 Diagram 3 202 Although Diagram 3 is considerably more complex than Diagram 2, it is 203 basically a more detailed enhancement of the RIB cloud and FIBs shown 204 in the latter figure; in particular: 206 - the dashed bounding box of Diagram 3 represents the 207 RIB cloud of Diagram 2, and it contains the various 208 component RIBs (Adj-RIBs-In/Out and the Loc-RIB) which 209 comprise the complete RIB(s) of the router[3] (in other 210 words, a complete RIB for a router has its Loc-RIB and 211 a set of Adj-RIBs-In and Adj-RIBs-Out); 213 - the RIBs associated with inbound and outbound NLRI 214 (i.e., the Adj-RIBs-In and Adj-RIBs-Out) are split up 215 on a per-peer basis with the former on the left side i 216 of the diagram and the latter on the right side; 218 - multiple RIB and FIB layers are indicated in order to 219 account for the fact that there may be protocols which 220 support Type-of-Service (ToS)/Quality-of-Service (QoS) 221 differentiation of different routes to the same 222 destination, resulting in one layer of RIB/FIB per set 223 of RIB-Attributes (RIB-Atts) which define a given ToS/QoS 224 (i.e., one RIB/FIB layer per ToS/QoS). 226 There are a few lines which the reader should locate, all near the 227 Loc-RIB(s) in Diagram 3; they are labelled "A()" and "f()". The 228 importance of these lines is that they show places where aggregation 229 of NLRI can occur. They are named as functions because aggregation 230 policies essentially are functions which modify the flow of NLRI, the 231 stream of routing information which passes through a router or is 232 cached in the router. 234 Since there are two directions of NLRI flow, there are also two 235 distinct aggregation policies: 237 - "horizontal" aggregation policy (see sec. 4.3, below)-- 238 aggregation which affects only the horizontal flow of NLRI-- 239 conceptually occurs at the line labelled "A()" (i.e., it is 240 a part of the process of creating the Adj-RIBs-Out[4]), while 242 - "vertical" aggregation (see sec. 5, below)--aggregation which 243 affects only the horizontal flow of NLRI--conceptually occurs 244 at the line labelled "f()". 246 (Note that any internal peers of the local router--i.e., those peers 247 which are in the same Routing Domain (RD) as the local router--will 248 receive routing updates via the path labelled "I" in Diagram 3; note 249 that this path circumvents the Loc-RIB(s) and export filter process. 251 External peers (those which are in a different RD) are updated via 252 the right-hand path labelled "Export Policies (Outbound 253 Filters)."[5]) 255 Horizontal aggregation is the form of aggregation familiar to most 256 readers and is very widely-discussed at the present time, 257 particularly in the context of CIDR and it helpfulness both at the 258 point of allocating IP address space and at the point of announcing 259 routing destinations. 261 4.2 Horizontal and Vertical NLRI Flows 263 As we've noted, Diagram 3 represents two different flows of Network- 264 Layer Reachability Information (NLRI): the horizontal flow (the 265 peer-to-peer flow, including the flow "through" a router as the NLRI 266 pass through it on the way from one peer to another), and the 267 vertical flow (the flow from Loc-RIB(s) to FIB(s) within a router. 268 These flows within a router can be detailed as follows (refer to 269 Diagram 3): 271 Horizontal NLRI flow: 273 - incoming NLRI and associated path attributes from peers 274 1, 2, .. N are stored in the Adjacent-RIB(s)-In (Adj-RIBs-In) 275 associated with the peer announcing the NLRI; 277 - all Adj-RIBs-In are filtered via the local router's 278 import policy, which determines which NLRI in these 279 Adj-RIBs-In end up in the local router's Loc-RIB(s); 281 - a "best route" to each destination in the Loc-RIB(s) is 282 selected via both protocol-specified criteria (see, e.g., 283 section 9.1 of RFC1771--and its subsections--for the 284 BGP4 decision process) and, possibly, implementation- 285 specific criteria (e.g., GateD's use of a second internal 286 metric, "preference2"[6]); 288 - the Loc-RIB's/Loc-RIBs' best routes are filtered via 289 the local router's export policy, which determines which 290 NLRI will end up in the router's Adjacent-RIBs-Out, which 291 are also arranged on a per-peer basis (just as the Adj-RIBs- 292 In). 294 Vertical NLRI flow: 296 - "best routes" selected in the Loc-RIB(s) are filtered 297 via a function f()--described below--into the local 298 router's FIB(s). 300 Note that these two information flows only intersect at the Loc- 301 RIB(s) in the form of the "best routes" selected for each destination 302 stored in the Loc-RIB(s); therefore, a change in the FIB filter 303 function f() only affects the "vertical" NLRI flow, and has no effect 304 on the "horizontal" NLRI flow (i.e., it has no effect on existing 305 routing protocol implementations). Also, as the NLRI flow across the 306 Internet, while the horizontal flow is some continuous flow 307 (modified, of course, by policy and outages), the vertical flow is 308 internal to each router (it does not cross router boundaries). 310 4.3 Horizontal Aggregation 312 Classless Inter-Domain Routing (CIDR; RFC1519) aids in the reduction 313 of routing information by making possible 315 a) "right-sizing" of IPv4 address allocations--so that fewer 316 destinations need be announced (aggregation at the source of 317 routing announcements), and 319 b) hierarchical addressing schemes which promote aggregation at 320 intermediate points in the routing topology (topologically-based 321 aggregation). 323 It is to be noted that both of these schemes represent reductions in 324 the number of destinations exchanged between routers; one can view 325 this as "horizontal aggregation" in that it affects the amount of 326 routing information exchanged between BGP4 peers in the routing 327 topology. 329 Though these aggregation actions, which reduce the total number of 330 destinations announced, clearly reduce local Forwarding Information 331 Bases (FIBs, the forwarding tables or routing tables of routers), 332 this effect is rather indirect, since most routing information is not 333 generated at any given local router. Any reduction of the FIB(s) of 334 a given router via these forms of aggregation are therefore primarily 335 due to distant actions rather than local actions; the FIB of a given 336 router has a size primarily dependent upon the decisions of other RDs 337 (in the absence of vertical aggregation). 339 Currently, there are on the order of 35,000 total destinations 340 announced in the "default-free" worldwide Internet.[1] However, even 341 the most well-connected border router (at, e.g., MAE-East in the 342 United States) might have only on the order of 40-50 peers; if we 343 approximate one next hop per peer, the ratio of destinations to next 344 hops suggests that there is a potential for substantial savings of a 345 very important resource: the number of slots in the FIBs of 346 "default-free" routers. The potential for savings is even better for 347 non-border routers of a "default-free" backbone: with about 35,000 348 total destinations and around 5 next hops, there is another order of 349 magnitude in the ratio of destinations to next hops. 351 5. Vertical Aggregation 353 It is precisely this niche which vertical aggregation occupies. The 354 aggregation is "vertical" because the aggregation occurs within a 355 single router and does not affect the flow of routing information to 356 peers of the router implementing the FIB aggregation. The FIB 357 entries produced by this procedure will generally consist of a subset 358 of the destinations available to the router, but FIB aggregation has 359 no affect on the destinations placed in Adj-RIBs-Out (Adjacent-RIBs- 360 Out; see Diagram 3, above) for announcement to routing peers. FIB 361 aggregation acts locally and directly on the "candidate" FIB; hence, 362 the potential for reduction of the local FIB looms much larger than 363 that achievable through manipulation of the local FIB via remote, 364 indirect means. 366 In order to understand how this might work, we need to examine the 367 specifics of this proposal. The proposal will be developed by 369 - considering the FIB filtering currently used, and 370 - developing the design criteria for the desired FIB filter; 371 - proposing a solution algorithm which fulfills these criteria. 373 5.1 Character of the Current FIB Filter Function f() 375 Currently, the selection function indicated by f() in Diagram 3 is a 376 unity filter which acts on either: 378 - the only Loc-RIB (this is the common case in which there is 379 only one layer of RIBs in Diagram 3, above; i.e., there is 380 only one ToS/QoS for all announced destinations), or 382 - a "chosen" Loc-RIB corresponding to the particular ToS/QoS 383 which will be supported by the router as its single FIB (i.e., 384 f() is a unity filter for one Loc-RIB and discards/ignores 385 all other Loc-RIBs). 387 (There are a few kernels which support multiple FIBs--i.e., multiple 388 ToS/QoS values--and which, therefore, have a filter function f() which 389 sets all [or >1] RIBs' routes for inclusion in the FIBs. This is merely 390 a variation on the "all-or-nothing" selection functions mentioned above.) 391 Therefore, the current use of f() is as a very simple transformation; 392 if we view the FIB and the Loc-RIB(s) as matrices, then we have 393 either: 395 FIB = 1 * Loc-RIB 396 FIB = Loc-RIBj = 1 * (KD jk) * Loc-RIBk, k = 1, 2, ... m 398 where (KD jk) is the author's ASCII attempt to express the Kronecker 399 Delta of indices 'j' and 'k' and 'k' varies over the values 1 to 'm' 400 (the number of ToS/QoS supported by the router, which therefore 401 determines the number of Loc-RIBs). 403 It is clear that f() currently depends upon the level of protocol and 404 router technology in terms of ToS/QoS support. 406 5.2 New FIB Filter Function--General Character and Constraints 408 For the case of FIB aggregation, however, f() is a more detailed 409 function, and it would replace any occurrence of f() where f() is 410 currently the unity filter; thus, this vertical or FIB aggregation 411 should occur after "best route" selection (update of the Loc-RIB(s)) 412 has occurred and just prior to insertion of routes into the FIB. 414 The function f() is a transformation of the form 416 FIB' = f() FIB = f() Loc-RIBj 418 i.e., it must produce a transformation of the single Loc-RIB or 419 selected Loc-RIB to produce a new FIB, FIB'. In this document, we 420 will consider a transformation f() such that a sole general 421 constraint is fulfilled: 423 GENERAL-C) Routing of packets must be unaffected by the 424 substitution of FIB' for FIB. 426 In other words, the aggregated FIB, FIB', must yield routing 427 equivalent to the unaggregated FIB;[7] GENERAL-C guarantees that 428 routing which uses any FIB' produced by f() is as reasonable as that 429 experienced by the use of the associated (unaggregated) FIB, so that 430 the "reasonableness" of routing is not determined by f(), but is 431 determined by the specific NLRI exchanged via routing protocols in 432 combination with local policies other than FIB aggregation policies. 434 5.3 Implications of the Constraint 436 5.3.1 CIDR-Related Implications of the Constraint 438 The CIDR draft has two fundamental rules[8]: 440 CIDR-1) Longest-match routing; routing matches for 441 forwarding decisions are based upon testing 442 the most-specific prefixes in the FIB before 443 less-specific prefixes. 445 CIDR-2) Aggregation-related rule; packets bound for 446 non-extant components of a aggregate must 447 be discarded by the aggregating routing 448 domain. 450 How do these rules affect the constraint on the FIB transformation 451 f()? 453 CIDR-2 is easy to deal with; the implementation of this rule has been 454 carried out previous to the Loc-RIB finalization which occurs before 455 the FIB is handed the "best" routes. Therefore, CIDR-2 does not 456 directly affect the constraint or transformation (though there is a 457 tangential effect due to different types of routes which is covered 458 in section 5.3.3, below). 460 The rule labelled CIDR-1 is more interesting, though, and it means 461 that we have an interesting observation in terms of the constraint: 463 CIDR-C) In order to satisfy the constraint in view of 464 CIDR-1, it is necessary to maintain the "stratification 465 of next hops". 467 What is meant will become clear with the help of a definition. 469 "related prefixes" in a FIB are any complete set of 470 destination prefixes in the FIB which: 472 - share a portion of IPv4 address space, no 473 matter what the size; 475 - can be arranged in an order such that each 476 new prefix in the set is a strict superset 477 of the previous prefix; and 479 - are ordered in that fashion, with the longest 480 prefix being encountered first. 482 If a given FIB is arranged into a radix trie, then sets of related 483 prefixes consist of all of the nodes in the path from any given leaf 484 or terminal node in the trie back to the root (in that order). 486 The "stratification of next hops" emerges if one considers arbitrary 487 sets of related prefixes; when the prefixes are ordered as above and 488 listed along with their next hops, it is clear that routing which 489 uses the given FIB works in the way that it does because different 490 parts of the sequence of related prefixes have different next hops. 491 However, adjacent prefixes in any set of related prefixes which have 492 the _same_ next hop show that there is an extra route in the FIB--the 493 more-specific route does not provide any more routing information 494 than does the less-specific route. Whenever a different next hop is 495 encountered in more-specific information, though, it must be 496 preserved. 498 Therefore, CIDR-C means that this layering of next hops cannot be 499 violated; it must be an invariant of the transformation f(). 500 Conversely, CIDR-C also means that adjacent prefixes in a set of 501 related prefixes which have the same next hop can be safely 502 aggregated, assuming that any other constraints developed below are 503 also enforced; clearly, this will not violate our general constraint 504 on f(). 506 5.3.2 BGP4-Related Implications of the Constraint 508 RFC1771, the current BGP4 specification, says that a BGP4 speaker 510 - must have next hops for all routes in the Loc-RIB 511 in its FIB,[9] and 513 - cannot advertise routes which it doesn't use.[10] 515 How are these to be handled? The first requirement of RFC1771 will 516 be broken in the letter of the rule but will kept in the spirit via 517 the constraint on f(), since FIB' will route in the same way that FIB 518 routes. Similarly, the second requirement is kept in spirit via the 519 announcement only of "best" routes (the horizontal path doesn't 520 change with FIB aggregation) and in routing fact via the 521 implementation of the new f(), though, again, not all routes in the 522 Loc-RIB are actually used. 524 5.3.3 Related Considerations 526 There is an additional, particularly salient feature of aggregation 527 which needs to be considered by any algorithm for aggregation, 528 particularly for FIB aggregation; though obvious, it must be 529 remembered that only similar kinds of routes can be subjected to 530 aggregation by f() in order to uphold the constraint. As alluded to 531 in section 5.3.1, above, CIDR-2 implies that, along with "normal" 532 routes to real destinations (i.e., those routes which have next hops 533 that result in furthering the delivery of packets), a router's kernel 534 may well support, e.g., "reject" or "blackhole" routes. Clearly, 535 these represent different "route types" which cannot be aggregated 536 with "normal" routes without causing changes to routing, in violation 537 of the invariance of routing which we require of the transformation 538 function f(). 540 Indeed, while NLRI at the peer-session level are often considered to 541 consist of triples of the form: 543 545 routing entries in terms of the kernel can be considered to consist 546 of the quintuples: 548 550 in that "route type" and ToS/QoS values affect the processing of NLRI 551 in actual forwarding decisions (note that "route type" is really 552 tagged only at the local router, and therefore varies with local 553 policy, as well as being kernel-dependent). 555 Just as with "route type", different ToS/QoS values represent 556 boundaries across which aggregation cannot occur. Since the FIB or 557 FIBs consist of either an elected ToS/QoS or a set of FIBs (one for 558 each value of ToS/QoS supported by the kernel or the routing domain), 559 the function f() must be applied to each FIB independently, and 560 therefore will not aggregate across different ToS/QoS values. 562 In fact, careful consideration of the implication of the existence of 563 different types of routes leads to a new criterion: 565 TYPE-C) In order to satisfy the constraint in view of the 566 existence of different types of routes, it is necessary 567 to maintain the "stratification of route types". 569 TYPE-C is to be understood in the manner analogous to the 570 understanding of CIDR-C; if a set of related prefixes which contains 571 prefixes of various types is examined, it will become clear that 572 differences and layering of route type also have to be kept invariant 573 by the transformation f(). 575 Therefore, to continue the analogy with CIDR-C, TYPE-C also means 576 that adjacent prefixes in a set of related prefixes which have the 577 same type can be safely aggregated, assuming that CIDR-C is also 578 kept; as above, this will not violate our general constraint on f(). 580 Finally, it is important for any implementor of f() to keep in mind 581 kernel-specific limitations when attempting to create code for a 582 range of machines, some which could have kernels which don't even 583 support CIDR routes (and, therefore, for which f() may be essentially 584 unimplementable or of extremely little utility). Though this should 585 be obvious, it may make adherence to TYPE-C difficult. 587 5.4 Solution of f() 589 5.4.1 FIB Representation 591 The FIB aggregation function can be described in terms of any 592 convenient basis representation of the FIB; let us consider the 593 "candidate FIB"--the FIB to be transformed via application of f() to 594 FIB', the aggregated FIB--to be stored in a radix trie (with the 595 default route, if any, at the root of the trie, and the longest 596 prefixes stored furthest away from the root; prefix length increases 597 as one becomes more distant from the root). The general advantages 598 of this representation are numerous; for our purposes, the fact that 599 potential aggregate prefixes of the prefix associated with a given 600 node will lie on the path to the root is of prime importance. 602 5.4.2 Solution of f()--Simple Case 604 For the case where this candidate FIB is entirely composed of normal, 605 active routes (i.e., where there are no reject or other routes which 606 are to be interpreted in a manner other than that of a normal, active 607 route) to the destinations, the algorithm is simple: 609 If 610 the next hop for a destination associated with a child node 611 is the same as 612 the next hop associated with the parent node, 613 then 614 the destination/next hop associated with the child node 615 is EXCLUDED FROM FIB'; 616 otherwise, 617 the destination/next hop associated with the child node 618 is INCLUDED IN FIB'. 620 Clearly, this will result in a FIB' which upholds the constraint 621 expressed in section 5.2 above; child nodes which have the same next 622 hop as the parent do not contribute additional information to 623 routing. When differences in next hops are detected, the child's 624 information is used, which fulfills the "stratification of next hops" 625 criterion CIDR-C from section 5.3.1 above. (The stratification of 626 types, criterion TYPE-C from section 5.3.3 above, is guaranteed by 627 the degenerate nature of the candidate FIB.) 629 5.4.3 Solution of f()--General Case 631 For the more general case of a candidate FIB which contains several 632 different types of routes we need only slightly revise the above 633 text: 635 If 636 the next hop and route type for a destination associated 637 with a child node 638 are the same as 639 the next hop and route type associated with the parent node, 640 then 641 the destination/next hop associated with the child node 642 is EXCLUDED FROM FIB'; 643 otherwise, 644 the destination/next hop associated with the child node 645 is INCLUDED IN FIB'. 647 Clearly, this solution satisfies both CIDR-C and TYPE-C as well as 648 the overriding constraint GENERAL-C. 650 6. Conclusion and Next Steps 652 The algorithms for f() presented above guarantee that the three 653 constraints presented in this document (GENERAL-C, CIDR-C, and TYPE- 654 C) are all enforced; via the removal of routing information which 655 contributes nothing to the actual forwarding decisions made by a 656 router, the FIB is transformed to a smaller FIB, FIB', which has the 657 same routing characteristics as FIB. 659 The measures propounded in this document will, in general, have an 660 effect which varies as a function of the number of next hops which a 661 given router has, and therefore may not have as much utility in 662 border routers as in non-border routers. However, this scheme will 663 certainly aid a given RD to have greater control over the size of its 664 own FIBs (reducing the impact of RDs which have done a poor job of 665 aggregating their outbound routing announcements) without 666 jeopardizing either the consistency of routing information or the 667 actual routing of IP packets. 669 The actual effects of the implementation and use of f() have yet to 670 be measured, and this is the pressing "next step" in the evaluation 671 of the usefulness of "vertical"/FIB aggregation. The author is 672 currently implementing f() in GateD and hopes to have concrete 673 numbers available shortly; other implementations and results are 674 obviously both welcome and useful, too, in order to attempt to 675 quantify the benefits of using vertical aggregation. 677 7. Security Considerations 679 Security considerations are not discussed in this document. 681 8. Acknowledgements 683 The author would like to thank Sue Hares of Merit Network, Inc. for 684 her encouragement and review of this document. Additionally, he 685 would like to thank both Ms. Hares and Laurent Joncheray (also of 686 Merit Network, Inc.) for providing helpful information and 687 suggestions related to implementing the algorithm described in this 688 document in GateD. Thanks also to Radia Perlman for her review of 689 this work. 691 9. Author's Address 693 Steven J. Richardson 694 Merit Network, Inc. 695 4251 Plymouth Rd., Suite C 696 Ann Arbor, MI 48105-2785 698 email: sjr@merit.edu 699 phone: (313) 647-4813 700 fax: (313) 647-3185 702 10. Notes 704 [1] This is taken from information exchanged on the cidrd email list 705 in late February/early March 1996. 707 [2] This diagram was inspired by Figure 7., entitled "Routeing 708 Information Base," on p. 91 of the reference ISO10747. 710 [3] See, e.g., RFC1771, section 3.2. 712 [4] See, e.g., RFC1771, sections 9.1, 9.1.3, and 9.2.4.2 for BGP4 713 aggregation and the decision process, and sections 7.16, 7.16.3, 714 7.16.3.1, 7.18.2, 7.18.2.1 - 7.18.2.3 of ISO10747 for IDRP 715 aggregation and the decision process. 717 [5] See, e.g., RFC1771, sections 9.1, 9.1.1 and 9.2.1 for the BGP4 718 internal update process and its placement in the decision process. 720 [6] This is documented, e.g., in the gated-R3_6Alpha_1 release 721 (available at ftp://ftp.gated.org/net-research/gated/gated- 722 R3_6Alpha_1.tar.gz), in the BGP configuration statement. 724 [7] Clearly, as with any routing policy, any FIB aggregation 725 function f() which is adopted by a given routing domain should be 726 applied consistently across all of its routers, though the 727 constraint's requirement that FIB' yield the same routing as that 728 generated by FIB means that a routing domain need _not_ implement f() 729 in all of its routers. 731 [8] RFC1771, sections 4.2 and 4.3. 733 [9] RFC1771, section 3.1. 735 [10] RFC1771, sections 3.1, 9.1 (esp. the paragraph labelled "c)"), 736 and 9.1.3. 738 11. References 740 ISO10747 "Information Processing Systems - Telecommunications and Information 741 Exchange between Systems - Protocol for Exchange of Inter-domain 742 Routeing Information among Intermediate Systems to Support Forwarding 743 of ISO 8473 PDUs", Proposed Draft of ISO/IEC 10747, 04/27/1993. 745 RFC1518 Y. Rekhter, T. Li, "An Architecture for IP Address Allocation with 746 CIDR", 09/24/1993. 748 RFC1519 V. Fuller, T. Li, J. Yu, K. Varadhan, "Classless Inter-Domain 749 Routing (CIDR): an Address Assignment and Aggregation Strategy", 750 09/24/1993. 752 RFC1520 Y. Rekhter, C. Topolcic, "Exchanging Routing Information Across 753 Provider Boundaries in the CIDR Environment", 09/24/1993. 755 RFC1771 Y. Rekhter, T. Li, "A Border Gateway Protocol 4 (BGP-4)", 756 03/21/1995. 758 RFC1772 Y. Rekhter, P. Gross, "Application of the Border Gateway Protocol 759 in the Internet", 03/21/1995. 761 RFC1817 Y. Rekhter, "CIDR and Classful Routing", 08/04/1995.