OSPF                                                          F.J. Baker
Internet-Draft                                             Cisco Systems
Intended status: Standards Track                            May 02, 2013
Expires: November 03, 2013


              IPv6 Source/Destination Routing using OSPFv3
                draft-baker-ipv6-ospf-dst-src-routing-02

Abstract

   This note describes the changes necessary for OSPFv3 to route classes
   of IPv6 traffic that are defined by an IPv6 source prefix and a
   destination prefix.
   This implies not simply routing "to a
   destination", but "traffic going to that destination AND coming from
   a specified source".  It may be combined with other qualifying
   attributes, such as "traffic going to that destination AND using a
   specified flow label AND from a specified source prefix".  The
   obvious application is egress routing, as required for a multihomed
   entity with a provider-allocated prefix from each of several upstream
   networks.  Traffic within the network could be source/destination
   routed as well, or could be routed from "any prefix", ::/0.  If
   traffic is routed from the relevant PA prefixes but in fact has a
   source address that is in none of them, the traffic in effect has no
   route.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 03, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Theory of Routing
     2.1.  Dealing with ambiguity
   3.  Extensions necessary for IPv6 Source/Destination Routing in
       OSPFv3
     3.1.  IPv6 Source Prefix TLV
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgements
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Change Log
   Appendix B.  Use case: Egress Routing
   Appendix C.  FIB Design
     C.1.  Linux Source-Address Forwarding
       C.1.1.  One FIB per source prefix
       C.1.2.  One FIB per source prefix plus a general FIB
     C.2.  PATRICIA
       C.2.1.  Virtual Bit String
       C.2.2.  Tree Construction
       C.2.3.  Tree Lookup
   Author's Address

1.  Introduction

   This specification builds on OSPF for IPv6 [RFC5340] and the
   extensible LSAs defined in [I-D.acee-ospfv3-lsa-extend].  It adds the
   sub-TLV option for an IPv6 [RFC2460] Source Prefix, in order to
   advertise routes defined by both a source and a destination prefix.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Theory of Routing

   Both IS-IS and OSPF perform their calculations by building a lattice
   of routers and routes from the router performing the calculation to
   each router, and then use those routes to get to destinations that
   those routes advertise connectivity to.  Following the SPF algorithm,
   calculation starts by selecting a starting point (typically the
   router doing the calculation), and successively adding {link, router}
   pairs until one has calculated a route to every router in the
   network.  As each router is added, including the original router,
   destinations that it is directly connected to are turned into routes
   in the route table: "to get to 2001:db8::/32, route traffic to
   {interface, list of next hop routers}".  For immediate neighbors to
   the originating router, of course, there is no next hop router;
   traffic is handled locally.

   In this context, the route is qualified by a source prefix; it is
   installed into the FIB with that value, and the FIB applies the route
   if and only if the IPv6 source address matches the advertised prefix.
   Of course, there may be multiple LSAs in the LSDB with the same
   destination and differing source prefixes; these may also have the
   same or differing next hop lists.
   The intended forwarding action is
   to forward matching traffic to one of the next hop routers associated
   with this destination and source prefix, or to discard non-matching
   traffic as "destination unreachable".

   LSAs that lack a source prefix sub-TLV match any source address, by
   definition.

2.1.  Dealing with ambiguity

   In any routing protocol, there is the possibility of ambiguity.  An
   area border router might, for example, summarize the routes to other
   areas into a small set of relatively short prefixes, which have more
   specific routes within the area.  Traditionally, we have dealt with
   that using a "longest match first" rule.  If the same datagram
   matches more than one destination prefix advertised within the area,
   we follow the route to the longest matching prefix.

   When routing a class of traffic, we follow an analogous "most
   specific match" rule; we follow the route for the most specific
   matching tuple.  In cases of simple overlap, such as routing to
   2001:db8::/32 or 2001:db8:1::/48, that is exactly analogous; we
   choose the route that specifies more bits.

   It is possible, however, to construct an ambiguous case in which
   neither class subsumes the other.  For example, presume that

   o  A is a prefix,

   o  B is a more-specific prefix within A,

   o  C is a different prefix, and

   o  D is a more-specific prefix of C.

   The two routes "from D to A" and "from C to B" are ambiguous: a
   datagram within "from D to B" matches both tuples, and it is not
   clear in the data plane what decision to make.  Solving this requires
   the addition of a third route in the FIB corresponding to the class
   "from D to B", which is more-specific than either of the first two.

   To avoid routing loops, the important question is what next hops
   would be relevant.
   The manufactured FIB route is the
   intersection of the two tuples; its list of next hops MUST
   be the intersection of the two sets of next hop routers and
   interfaces.  That intersection could be the null set, in which case
   the intersection route would be a discard (null) route.

3.  Extensions necessary for IPv6 Source/Destination Routing in OSPFv3

   The extensible LSA format defined in [I-D.acee-ospfv3-lsa-extend]
   requires one additional option to accomplish source/destination
   routing: the source prefix.  This is defined here.

      Editor's note-to-self: the following statement is my expectation.
      That said, the authors of [I-D.acee-ospfv3-lsa-extend] suggest
      that an area should have one type of LSA (as specified in
      [RFC5340]) or the extended LSA.  I'll leave the statement for the
      moment, and remove it if the OSPF working group tells me to.

   In addition, should (as one might expect is normal) destination-only
   intra-area-prefix, inter-area-prefix, and AS-external-prefix LSAs be
   encountered, we need a rule for interpretation.  The rule is that
   they are treated exactly as the extensible version if the source
   prefix option is not specified or is specified to be ::/0 (any IPv6
   address).

3.1.  IPv6 Source Prefix TLV

   The IPv6 Source Prefix TLV is derived from the Link-LSA and
   Inter-Area-Prefix-LSA address format.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Type             |             Length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | PrefixLength  | PrefixOptions |              0                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Address Prefix                         |
   |                             ...                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                           Source Prefix TLV

   Type:  assigned by IANA

   TLV Length:  Length of the value portion of the TLV in octets.
      This is by definition 20.

   PrefixLength, PrefixOptions, and Address Prefix:  Representation of
      an IPv6 address prefix, as described in [RFC5340] Appendix A.4.1.

4.  IANA Considerations

   The OSPF Working Group will need a registry for sub-TLV Types.  To be
   discussed.

5.  Security Considerations

   While source/destination routing could be used as part of a security
   solution, it is not really intended for the purpose.  The approach
   limits routing, in the sense that it routes traffic to an appropriate
   egress, or gives a way to prevent communication between systems not
   included in a source/destination route, and in that sense could be
   considered similar to an access list that is managed by and scales
   with routing.

6.  Acknowledgements

   Acee Lindem contributed to the concepts in this draft.

7.  References

7.1.  Normative References

   [I-D.acee-ospfv3-lsa-extend]
              Lindem, A., Mirtorabi, S., Roy, A., and F. Baker, "OSPFv3
              LSA Extendibility", draft-acee-ospfv3-lsa-extend-00 (work
              in progress), May 2013.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2460]  Deering, S.E. and R.M. Hinden, "Internet Protocol, Version
              6 (IPv6) Specification", RFC 2460, December 1998.

   [RFC5340]  Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF
              for IPv6", RFC 5340, July 2008.

7.2.  Informative References

   [PATRICIA]
              Morrison, D.R., "Practical Algorithm to Retrieve
              Information Coded in Alphanumeric", Journal of the ACM
              15(4) pp514-534, October 1968.

   [RFC2827]  Ferguson, P. and D. Senie, "Network Ingress Filtering:
              Defeating Denial of Service Attacks which employ IP Source
              Address Spoofing", BCP 38, RFC 2827, May 2000.

Appendix A.  Change Log

   Initial Version:  February 2013

   First revision:  April 2013

   Correction:  Corrected the reference to [I-D.acee-ospfv3-lsa-extend]

Appendix B.  Use case: Egress Routing

   Using this technology for egress routing is straightforward.  Presume
   a multihomed edge (residential or enterprise) network with multiple
   egress points to the various ISPs.  These ISPs allocate PA prefixes
   to the network.  Due to BCP 38 [RFC2827], the network must presume
   that its upstream ISPs will filter out any traffic presented to them
   that does not use their PA prefix.

   Within the network, presume that a /64 prefix from each of those PA
   prefixes is allocated on each LAN, and that hosts generate and use
   multiple addresses on each interface.

   Within the network, we permit any host to communicate with any other.
   Hence, routing advertisements within the network use traditional
   destination routing, which is understood to be advertising the
   traffic class

      {destination, ::/0}.

   From the egresses, the firewall or its neighboring router injects a
   default route for traffic "from" its PA prefix:

      {::/0, PA prefix}.

   Routing is calculated as normal, with the exception that traffic
   following a default route will select that route based on the source
   address.  Traffic will never be lost to BCP 38 filters, because by
   definition the only traffic sent to the ISP is using the PA prefix
   assigned by the ISP.  In addition, while hosts can use spoofed
   addresses outside of their PA prefixes to attack each other, they
   cannot send traffic using spoofed addresses to their upstream
   networks; such traffic has no route.

Appendix C.  FIB Design

   While the design of the Forwarding Information Base is not a matter
   for standardization, as it only has to work correctly, not
   interoperate with something else, the design of a FIB for this type
   of lookup may differ from approaches used in destination routing.  We
   describe one possible approach that is known to work, from the
   perspective of a proof of concept.

C.1.  Linux Source-Address Forwarding

   The University of Waikato has added to the Linux Advanced Routing &
   Traffic Control facility the ability to maintain multiple FIBs, one
   for each of a set of prefixes.  Implementing source/destination
   routing using this mechanism is not difficult.

   The router must know what source prefixes might be used in its
   domain.  This may be by configuration or, at least in concept,
   learned from the routing protocols themselves.  In whichever way that
   is done, one can imagine two fundamental FIB structures to serve N
   source prefixes: N FIBs, one per prefix, or N+1 FIBs, one per prefix
   plus one for destinations for which the source prefix is unspecified.

C.1.1.  One FIB per source prefix

   In an implementation with one FIB per source prefix, the routing
   algorithm has two possibilities.

   o  If it calculates a route to a prefix (such as a default route)
      associated with a given source prefix, it stores the route in the
      FIB for the relevant source prefix.

   o  If it calculates a route for which the source prefix is
      unspecified, it stores that route in all N FIBs.

   When forwarding a datagram, the IP forwarder looks at the source
   address of the datagram to determine which FIB it should use.  If it
   is from an address for which there is no FIB, the forwarder discards
   the datagram as containing a forged source address.  If it is from an
   address within one of the relevant prefixes, it looks up the
   destination in the indicated FIB and forwards it in the usual way.

   The argument for this approach is simplicity: there is one place to
   look in making a forwarding decision for any given datagram.  The
   argument against it is memory space; it is likely that the FIBs will
   be similar, but every destination route not associated with a source
   prefix is duplicated in each FIB.
   In addition, since it
   automatically removes traffic whose source address is not among the
   configured list, it limits the possibility of user software using
   improper addresses.

C.1.2.  One FIB per source prefix plus a general FIB

   In an implementation with N+1 FIBs, the algorithm is slightly more
   complex.

   o  If it calculates a route to a prefix (such as a default route)
      associated with a given source prefix, it stores the route in the
      FIB for the relevant source prefix.

   o  If it calculates a route for which the source prefix is
      unspecified, it stores that route in the FIB that is not
      associated with a source prefix.

   When forwarding a datagram, the IP forwarder looks at the source
   address of the datagram to determine which FIB it should use.  If it
   is from one of the configured prefixes, it looks the destination up
   in the indicated FIB.  In any event it also looks the destination up
   in the "unspecified source address" FIB.  If the destination is found
   in only one of the two, the indicated route is followed.  If the
   destination is found in both, the more specific route is followed.

   The argument for this approach is memory space; if a large percentage
   of routes are only in the general FIB, such as when egress routing is
   used for the default route and all other routes are internal, the
   other FIBs are likely to be very small - perhaps only a single
   default route.  The argument against this approach is complexity:
   most lookups if not all will be done in a prefix-specific FIB and in
   the general FIB.

C.2.  PATRICIA

   One approach is a [PATRICIA] Tree.  This is a relative of a Trie, but
   unlike a Trie, need not use every bit in classification, and does not
   need the bits used to be contiguous.  It depends on treating the bit
   string as a set of slices of some size, potentially of different
   sizes.
   Slice width is an implementation detail; since the algorithm
   is most easily described using a slice of a single bit, that will be
   presumed in this description.

C.2.1.  Virtual Bit String

   It is quite possible to view the fields in a datagram header
   incorporated into the classification tuple as a virtual bit string
   such as is shown in Figure 1.  This bit string has various regions
   within it.  Some vary and are therefore useful in a radix tree
   lookup.  Some may be essentially constant - all global IPv6 addresses
   at this writing are within 2000::/3, for example, so while it must be
   tested to assure a match, incorporating it into the radix tree may
   not be very helpful in classification.  Others are ignored; if the
   destination is a remote /64, we really don't care what the EID is.
   In addition, due to variation in prefix length and other details, the
   widths of those fields vary among themselves.  The algorithm the FIB
   implements, therefore, must efficiently deal with the fact of a
   discontiguous lookup key.

   +---------------------+----------------------+-----+-----------+
   |Destination Prefix   |Source Prefix         |DSCP | Flow Label|
   +------+------+-------+------+-------+-------+-----+-----------+
    Common|Varying|Ignored|Common|Varying|Ignored|Varying or ignored

         Figure 1: Treating a traffic class as a virtual bit string

C.2.2.  Tree Construction

   The tree is constructed by recursive slice-wise decomposition.  At
   each stage, the input is a set of classes to be classified.  At each
   stage, the result is the addition of a lookup node in the tree that
   identifies the location of its slice in the virtual bit string (which
   might be a bit number), the width of the slice to be inspected, and
   an enumerated set of results.  Each result is a similar set of
   classes, and is analyzed in a similar manner.
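
   The virtual bit string of Section C.2.1, which the lookup nodes
   index into, can be illustrated with a minimal sketch.  It is not
   part of the specification.  The field order and widths (128-bit
   destination, 128-bit source, 6-bit DSCP, 20-bit flow label) and all
   names (pack, packet_key, TrafficClass, make_class) are assumptions
   made for illustration only.  A class is held as a value plus a mask
   in which ignored bits are zero.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed field order and widths (not specified by the draft):
# destination (128 bits), source (128), DSCP (6), flow label (20).
KEY_BITS = 128 + 128 + 6 + 20


def pack(dst: int, src: int, dscp: int, flow_label: int) -> int:
    """Concatenate four fields into one integer, destination first."""
    return (((dst << 128 | src) << 6 | dscp) << 20) | flow_label


def packet_key(dst: str, src: str, dscp: int, flow_label: int) -> int:
    """Virtual bit string for a received datagram."""
    return pack(int(ipaddress.IPv6Address(dst)),
                int(ipaddress.IPv6Address(src)), dscp, flow_label)


def prefix_bits(prefix: str) -> Tuple[int, int]:
    """Return (value, mask); bits past the prefix length are ignored."""
    net = ipaddress.IPv6Network(prefix)
    return int(net.network_address), int(net.netmask)


@dataclass(frozen=True)
class TrafficClass:
    value: int  # bits that must match
    mask: int   # 1 = bit is compared, 0 = bit is ignored

    def matches(self, key: int) -> bool:
        """Full comparison of a packet key against this class."""
        return key & self.mask == self.value


def make_class(dst_prefix: str, src_prefix: str = "::/0",
               dscp: Optional[int] = None,
               flow_label: Optional[int] = None) -> TrafficClass:
    """Build a class; None means the field is a wildcard ("*")."""
    dv, dm = prefix_bits(dst_prefix)
    sv, sm = prefix_bits(src_prefix)
    qv, qm = (0, 0) if dscp is None else (dscp, 0x3F)
    fv, fm = (0, 0) if flow_label is None else (flow_label, 0xFFFFF)
    return TrafficClass(pack(dv, sv, qv, fv), pack(dm, sm, qm, fm))
```

   With this representation, make_class("2001:db8:1::/48", dscp=34)
   describes the class {2001:db8:1::/48, ::/0, AF41, *}, and its
   matches() method performs the bit-for-bit comparison against a key
   produced by packet_key().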

   The analysis is performed by enumerating which bits that have not
   already been considered are best suited to classification.  For a
   slice of N bits, one wants to select a slice that most evenly divides
   the set of classes into 2^N subsets.  If one or more bits in the
   slice is ignored in some of the classes, those classes must be
   included in every subset, as the actual classification of them will
   depend on other bits.

   Input:{2001:db8::/32, ::/0, *, *}
         {2001:db8:1::/48, ::/0, AF41, *}
         {2001:db8:1::/48, ::/0, AF42, *}
         {2001:db8:1::/48, ::/0, AF43, *}
   Common parts: Destination prefix 2001:db8, source prefix, and label
   Varying parts: DSCP and the third set of sixteen bits in the
         destination prefix
   One possible decomposition:
     (1) slice = DSCP
         enumerated cases:
         (a) { {2001:db8::/32, ::/0, *, *},
               {2001:db8:1::/48, ::/0, AF41, *} }
         (b) { {2001:db8::/32, ::/0, *, *},
               {2001:db8:1::/48, ::/0, AF42, *} }
         (c) { {2001:db8::/32, ::/0, *, *},
               {2001:db8:1::/48, ::/0, AF43, *} }
     (2) slice = third sixteen bit field in destination
         This divides each enumerated case into those containing 0001
         and "everything else", which would imply 2001:db8::/32

                           (1) DSCP
                  --------------------------
                 (1a)        (1b)        (1c)
                 /  \        /  \        /  \
               /32  /48    /32  /48    /32  /48

                    Figure 2: Example PATRICIA Tree

C.2.3.  Tree Lookup

   To look something up in a PATRICIA Tree, one starts at the root of
   the tree and performs the indicated comparisons recursively walking
   down the tree until one reaches a terminal node.  When the enumerated
   subset is empty or contains only a single class, classification
   stops.  Either classification has failed (there was no matching
   class), or one has presumably found the indicated class.  At that
   point, every bit in the virtual bit string must be compared to the
   classifier; classification is accepted on a perfect match.
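
   The construction of Section C.2.2 and the lookup just described can
   be sketched together, using single-bit slices.  This is a
   simplified illustration, not a definitive implementation.  The key
   layout, the names (TrafficClass, Node, build, lookup), and two
   design choices are assumptions: a bit is used as a slice only if it
   shrinks both subsets, so a leaf may hold several overlapping
   candidates; and the leaf picks the candidate with the most compared
   bits after the full comparison.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Union

KEY_BITS = 128 + 128 + 6 + 20   # assumed layout: dst, src, DSCP, flow label
AF41, AF42, AF43 = 34, 36, 38   # DSCP code points for the AF4x classes


@dataclass(frozen=True)
class TrafficClass:
    name: str
    value: int   # bits that must match
    mask: int    # 1 = compared, 0 = ignored


@dataclass
class Node:
    bit: int     # position of the one-bit slice in the key
    zero: "Tree"
    one: "Tree"


Tree = Union[Node, List[TrafficClass]]   # a leaf is a list of candidates


def pack(dst: int, src: int, dscp: int, flow_label: int) -> int:
    return (((dst << 128 | src) << 6 | dscp) << 20) | flow_label


def make_class(name, dst_prefix, src_prefix="::/0", dscp=None):
    """Hypothetical helper: None for dscp means wildcard; flow label ignored."""
    d = ipaddress.IPv6Network(dst_prefix)
    s = ipaddress.IPv6Network(src_prefix)
    q, qmask = (0, 0) if dscp is None else (dscp, 0x3F)
    value = pack(int(d.network_address), int(s.network_address), q, 0)
    mask = pack(int(d.netmask), int(s.netmask), qmask, 0)
    return TrafficClass(name, value, mask)


def packet_key(dst, src, dscp, flow_label):
    return pack(int(ipaddress.IPv6Address(dst)),
                int(ipaddress.IPv6Address(src)), dscp, flow_label)


def subset(classes, bit, want):
    """Classes that ignore this bit, or whose bit equals `want`."""
    return [c for c in classes
            if not ((c.mask >> bit) & 1) or ((c.value >> bit) & 1) == want]


def build(classes):
    """Pick the bit that divides the classes most evenly, then recurse."""
    best = None
    for bit in range(KEY_BITS):
        zero, one = subset(classes, bit, 0), subset(classes, bit, 1)
        worst = max(len(zero), len(one))
        # Only bits that shrink both subsets are useful slices.
        if worst < len(classes) and (best is None or worst < best[0]):
            best = (worst, bit, zero, one)
    if best is None:
        return list(classes)          # leaf: no bit separates these
    _, bit, zero, one = best
    return Node(bit, build(zero), build(one))


def lookup(tree, key):
    """Walk to a leaf, then compare every bit against each candidate."""
    while isinstance(tree, Node):
        tree = tree.one if (key >> tree.bit) & 1 else tree.zero
    hits = [c for c in tree if key & c.mask == c.value]
    return max(hits, key=lambda c: bin(c.mask).count("1"), default=None)
```

   Building the tree from the four classes of Figure 2 and looking up
   the key for {2001:db8:1:2:3:4:5:6, 2001:db8:2:3:4:5:6:7, AF41, 0}
   selects the {2001:db8:1::/48, ::/0, AF41, *} class in this sketch; a
   destination outside 2001:db8::/32 yields no class.  The sketch does
   not manufacture the intersection routes of Section 2.1; it merely
   prefers the candidate with more compared bits.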

   In the example in Figure 2, if a packet {2001:db8:1:2:3:4:5:6,
   2001:db8:2:3:4:5:6:7, AF41, 0} arrives, we start at the root.  Since
   it is an AF41 packet, we deduce that case (1a) applies, and since the
   third sixteen bit field of the destination address is 0001, we are
   comparing to {2001:db8:1::/48, ::/0, AF41, *}.  Since the destination
   address is within 2001:db8:1::/48, classification as that class
   succeeds.

Author's Address

   Fred Baker
   Cisco Systems
   Santa Barbara, California  93117
   USA

   Email: fred@cisco.com