Network Working Group                                         F.J. Baker
Internet-Draft                                             Cisco Systems
Intended status: Standards Track                       February 17, 2013
Expires: August 21, 2013

              IPv6 Source/Destination Routing using OSPFv3
                draft-baker-ipv6-ospf-dst-src-routing-00

Abstract

   This note describes the changes necessary for OSPFv3 to route classes
   of IPv6 traffic that are defined by a source prefix and a destination
   prefix.  This implies not routing "to a destination", but "traffic
   matching a classification tuple".  The obvious application is egress
   routing: routing traffic that uses a given source prefix to an
   upstream network whose BCP 38 filters will not drop it.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 21, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Theory of Routing
     2.1.  Dealing with ambiguity
   3.  Extensions necessary for IPv6 Source/Destination Routing in
       OSPFv3
     3.1.  IPv6 Source Prefix TLV
   4.  IANA Considerations
   5.  Security Considerations
   6.  Privacy Considerations
   7.  Acknowledgements
   8.  Change Log
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Use case: Egress Routing
   Appendix B.  FIB Design
     B.1.  Linux Source-Address Forwarding
       B.1.1.  One FIB per source prefix
       B.1.2.  One FIB per source prefix plus a general FIB
     B.2.  PATRICIA
       B.2.1.  Virtual Bit String
       B.2.2.  Tree Construction
       B.2.3.  Tree Lookup
   Author's Address

1.  Introduction

   This specification builds on the extensible LSAs defined in
   [I-D.baker-ipv6-ospf-extensible.txt].  It adds the option for an IPv6
   Source Prefix, to define routes defined by a source and a destination
   prefix.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Theory of Routing

   Both IS-IS and OSPF perform their calculations by building a lattice
   of routers and routes from the router performing the calculation to
   each router, and then use those routes to get to destinations that
   those routes advertise connectivity to.  Following the SPF algorithm,
   calculation starts by selecting a starting point (typically the
   router doing the calculation), and successively adding {link, router}
   pairs until one has calculated a route to every router in the
   network.  As each router is added, including the original router,
   destinations that it is directly connected to are turned into routes
   in the route table: "to get to 2001:db8::/32, route traffic to
   {interface, list of next hop routers}".  For immediate neighbors to
   the originating router, of course, there is no next hop router;
   traffic is handled locally.

2.1.  Dealing with ambiguity

   In any routing protocol, there is the possibility of ambiguity.  An
   area border router might, for example, summarize the routes to other
   areas into a small set of relatively short prefixes, which have more
   specific routes within the area.  Traditionally, we have dealt with
   that using a "longest match first" rule.  If the same datagram
   matches more than one destination prefix advertised within the area,
   we follow the route to the longest matching prefix.

   When routing a class of traffic, we follow an analogous "most
   specific match" rule; we follow the route for the most specific
   matching tuple.  In cases of simple overlap, such as routing to
   2001:db8::/32 or 2001:db8:1::/48, that is exactly analogous; we
   choose one of the two routes.

   It is possible, however, to construct an ambiguous case in which
   neither class subsumes the other.  For example, presume that

   o  A is a prefix,

   o  B is a more-specific prefix within A,

   o  C is a different prefix, and

   o  D is a more-specific prefix of C.

   The two classes {A, D, *, *} and {B, C, *, *} are ambiguous: a
   datagram within {B, D, *, *} matches both classes, and it is not
   clear in the data plane what decision to make.  Solving this requires
   the addition of a third route in the FIB corresponding to the class
   {B, D, *, *}, which is more-specific than either of the first two,
   and can be given routing guidance based on metrics or other policy in
   the usual way.

3.  Extensions necessary for IPv6 Source/Destination Routing in OSPFv3

   The several extensible LSAs defined in
   [I-D.baker-ipv6-ospf-extensible.txt] require one additional option to
   accomplish source/destination routing: the source prefix.  This is
   defined here.

   In addition, should (as one might expect is normal) destination-only
   intra-area-prefix, inter-area-prefix, and AS-external-prefix LSAs be
   encountered, we need a rule for interpretation.  The rule is that
   they are treated exactly as the extensible version if the source
   prefix option is not specified or is specified to be ::/0 (any IPv6
   address).

3.1.  IPv6 Source Prefix TLV

   The IPv6 Source Prefix TLV MAY be used with the IPv6 Destination
   Prefix TLV, but MUST NOT be used with the IPv4 Source Prefix TLV or
   the IPv4 Destination Prefix TLV.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     |Prefix Length  |    Prefix
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                           Source Prefix TLV

   Source Prefix Type:  assigned by IANA

   TLV Length:  Length of the TLV in octets

   Prefix Length:  Length of the prefix in bits, in the range 0..128

   Prefix:  (source prefix length + 7)/8 octets of prefix

4.  IANA Considerations

   This section will request an identifying value for the TLV defined.
   This is deferred to the -01 version of the draft.

5.  Security Considerations

   To be considered.

6.  Privacy Considerations

   To be considered.

7.  Acknowledgements

8.  Change Log

   Initial Version:  February 2013

9.  References

9.1.  Normative References

   [ISO.10589.1992]
              International Organization for Standardization,
              "Intermediate system to intermediate system intra-domain-
              routing routine information exchange protocol for use in
              conjunction with the protocol for providing the
              connectionless-mode Network Service (ISO 8473)", ISO
              Standard 10589, 1992.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

9.2.  Informative References

   [I-D.baker-ipv6-ospf-extensible.txt]
              Baker, F., "Extensible OSPF LSAs", February 2013.

   [PATRICIA]
              Morrison, D.R., "Practical Algorithm to Retrieve
              Information Coded in Alphanumeric", Journal of the ACM
              15(4) pp514-534, October 1968.

   [RFC2827]  Ferguson, P. and D. Senie, "Network Ingress Filtering:
              Defeating Denial of Service Attacks which employ IP Source
              Address Spoofing", BCP 38, RFC 2827, May 2000.

Appendix A.  Use case: Egress Routing

   Using this technology for egress routing is straightforward.  Presume
   a multihomed edge (residential or enterprise) network with multiple
   egress points to the various ISPs.  These ISPs allocate PA prefixes
   to the network.  Due to BCP 38 [RFC2827], the network must presume
   that its upstream ISPs will filter out any traffic presented to them
   that does not use their PA prefix.

   Within the network, presume that a /64 prefix from each of those PA
   prefixes is allocated on each LAN, and that hosts generate and use
   multiple addresses on each interface.

   Within the network, we permit any host to communicate with any other.
   Hence, routing advertisements within the network use traditional
   destination routing, which is understood to be advertising the
   traffic class

      {destination, ::/0}.

   From the egresses, the firewall or its neighboring router injects a
   default route for traffic "from" its PA prefix:

      {::/0, PA prefix}.

   Routing is calculated as normal, with the exception that traffic
   following a default route will select that route based on the source
   address.  Traffic will never be lost to BCP 38 filters, because by
   definition the only traffic sent to the ISP is using the PA prefix
   assigned by the ISP.  In addition, while hosts can use spoofed
   addresses outside of their PA prefixes to attack each other, they
   cannot send traffic using spoofed addresses to their upstream
   networks; such traffic has no route.

Appendix B.  FIB Design

   While the design of the Forwarding Information Base is not a matter
   for standardization, as it only has to work correctly, not
   interoperate with something else, the design of a FIB for this type
   of lookup may differ from approaches used in destination routing.  We
   describe one possible approach that is known to work, from the
   perspective of a proof of concept.

B.1.  Linux Source-Address Forwarding

   The University of Waikato has added to the Linux Advanced Routing &
   Traffic Control facility the ability to maintain multiple FIBs, one
   for each of a set of prefixes.  Implementing source/destination
   routing using this mechanism is not difficult.

   The router must know what source prefixes might be used in its
   domain.  This may be by configuration or, at least in concept,
   learned from the routing protocols themselves.  In whichever way that
   is done, one can imagine two fundamental FIB structures to serve N
   source prefixes: N FIBs, one per prefix, or N+1 FIBs, one per prefix
   plus one for destinations for which the source prefix is unspecified.

B.1.1.  One FIB per source prefix

   In an implementation with one FIB per source prefix, the routing
   algorithm has two possibilities.

   o  If it calculates a route to a prefix (such as a default route)
      associated with a given source prefix, it stores the route in the
      FIB for the relevant source prefix.

   o  If it calculates a route for which the source prefix is
      unspecified, it stores that route in all N FIBs.

   When forwarding a datagram, the IP forwarder looks at the source
   address of the datagram to determine which FIB it should use.  If it
   is from an address for which there is no FIB, the forwarder discards
   the datagram as containing a forged source address.  If it is from an
   address within one of the relevant prefixes, it looks up the
   destination in the indicated FIB and forwards it in the usual way.

   The argument for this approach is simplicity: there is one place to
   look in making a forwarding decision for any given datagram.  In
   addition, since it automatically removes traffic whose source address
   is not among the configured list, it limits the possibility of user
   software using improper addresses.  The argument against it is memory
   space; it is likely that the FIBs will be similar, but every
   destination route not associated with a source prefix is duplicated
   in each FIB.

B.1.2.  One FIB per source prefix plus a general FIB

   In an implementation with N+1 FIBs, the algorithm is slightly more
   complex.

   o  If it calculates a route to a prefix (such as a default route)
      associated with a given source prefix, it stores the route in the
      FIB for the relevant source prefix.

   o  If it calculates a route for which the source prefix is
      unspecified, it stores that route in the FIB that is not
      associated with a source prefix.

   When forwarding a datagram, the IP forwarder looks at the source
   address of the datagram to determine which FIB it should use.  If it
   is from one of the configured prefixes, it looks the destination up
   in the indicated FIB.  In any event it also looks the destination up
   in the "unspecified source address" FIB.  If the destination is found
   in only one of the two, the indicated route is followed.  If the
   destination is found in both, the more specific route is followed.

   The argument for this approach is memory space; if a large percentage
   of routes are only in the general FIB, such as when egress routing is
   used for the default route and all other routes are internal, the
   other FIBs are likely to be very small - perhaps only a single
   default route.  The argument against this approach is complexity:
   most lookups, if not all, will be done both in a prefix-specific FIB
   and in the general FIB.

B.2.  PATRICIA

   One approach is a [PATRICIA] Tree.  This is a relative of a Trie, but
   unlike a Trie, need not use every bit in classification, and does not
   need the bits used to be contiguous.  It depends on treating the bit
   string as a set of slices of some size, potentially of different
   sizes.  Slice width is an implementation detail; since the algorithm
   is most easily described using a slice of a single bit, that will be
   presumed in this description.

B.2.1.  Virtual Bit String

   It is quite possible to view the fields in a datagram header
   incorporated into the classification tuple as a virtual bit string
   such as is shown in Figure 1.  This bit string has various regions
   within it.  Some vary and are therefore useful in a radix tree
   lookup.  Some may be essentially constant - all global IPv6 addresses
   at this writing are within 2000::/3, for example, so while it must be
   tested to assure a match, incorporating it into the radix tree may
   not be very helpful in classification.  Others are ignored; if the
   destination is a remote /64, we really don't care what the EID is.
   In addition, due to variation in prefix length and other details, the
   widths of those fields vary among themselves.  The algorithm the FIB
   implements, therefore, must efficiently deal with the fact of a
   discontiguous lookup key.

   +---------------------+----------------------+-----+-----------+
   |Destination Prefix   |Source Prefix         |DSCP | Flow Label|
   +------+------+-------+------+-------+-------+-----+-----------+
    Common|Varying|Ignored|Common|Varying|Ignored|Varying or ignored

        Figure 1: Treating a traffic class as a virtual bit string

B.2.2.  Tree Construction

   The tree is constructed by recursive slice-wise decomposition.  At
   each stage, the input is a set of classes to be classified.  At each
   stage, the result is the addition of a lookup node in the tree that
   identifies the location of its slice in the virtual bit string (which
   might be a bit number), the width of the slice to be inspected, and
   an enumerated set of results.  Each result is a similar set of
   classes, and is analyzed in a similar manner.

   The analysis is performed by enumerating which bits that have not
   already been considered are best suited to classification.  For a
   slice of N bits, one wants to select a slice that most evenly divides
   the set of classes into 2^N subsets.  If one or more bits in the
   slice is ignored in some of the classes, those classes must be
   included in every subset, as the actual classification of them will
   depend on other bits.

   Input:{2001:db8::/32, ::/0, *, *}
         {2001:db8:1::/48, ::/0, AF41, *}
         {2001:db8:1::/48, ::/0, AF42, *}
         {2001:db8:1::/48, ::/0, AF43, *}
   Common parts: Destination prefix 2001:db8, source prefix, and label
   Varying parts: DSCP and the third set of sixteen bits in the
         destination prefix
   One possible decomposition:
   (1) slice = DSCP
       enumerated cases:
       (a) { {2001:db8::/32, ::/0, *, *}, {2001:db8:1::/48, ::/0, AF41, *} }
       (b) { {2001:db8::/32, ::/0, *, *}, {2001:db8:1::/48, ::/0, AF42, *} }
       (c) { {2001:db8::/32, ::/0, *, *}, {2001:db8:1::/48, ::/0, AF43, *} }
   (2) slice = third sixteen bit field in destination
       This divides each enumerated case into those containing 0001 and
       "everything else", which would imply 2001:db8::/32

                             (1) DSCP
                    --------------------------
                    (1a)      (1b)      (1c)
                    /  \      /  \      /  \
                  /32  /48  /32  /48  /32  /48

                    Figure 2: Example PATRICIA Tree

B.2.3.  Tree Lookup

   To look something up in a PATRICIA Tree, one starts at the root of
   the tree and performs the indicated comparisons, recursively walking
   down the tree until one reaches a terminal node.  When the enumerated
   subset is empty or contains only a single class, classification
   stops.  Either classification has failed (there was no matching
   class), or one has presumably found the indicated class.  At that
   point, every bit in the virtual bit string must be compared to the
   classifier; classification is accepted on a perfect match.

   In the example in Figure 2, if a packet {2001:db8:1:2:3:4:5:6,
   2001:db8:2:3:4:5:6:7, AF41, 0} arrives, we start at the root.  Since
   it is an AF41 packet, we deduce that case (1a) applies, and since the
   destination has 0001 in the third sixteen bit field of the
   destination address, we are comparing to {2001:db8:1::/48, ::/0,
   AF41, *}.  Since the destination address is within 2001:db8:1::/48,
   classification as that class succeeds.

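   The lookup walk and the final full comparison can be sketched as
   follows.  This is a minimal illustration under stated assumptions
   rather than a definitive implementation: the virtual bit string is
   a toy six-bit key (four destination bits followed by two DSCP
   bits), the tree is built by hand, and each leaf lists its candidate
   classes most-specific first so that a less specific class can serve
   as a fallback.  Node, matches, lookup, WIDE, NARROW_A, NARROW_B and
   TREE are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Node:
    bit: int        # position of the one-bit slice to inspect
    zero: "Tree"    # subtree followed when that bit is 0
    one: "Tree"     # subtree followed when that bit is 1


# A leaf is a list of candidate classes, most specific first (assumption).
Tree = Union[Node, List[str]]


def matches(pattern: str, key: str) -> bool:
    """Compare every bit of the key to the class; '*' bits are ignored."""
    return len(pattern) == len(key) and all(
        p in ("*", k) for p, k in zip(pattern, key))


def lookup(tree: Tree, key: str) -> Optional[str]:
    """Walk to a terminal node, then accept only a perfect match."""
    while isinstance(tree, Node):
        tree = tree.one if key[tree.bit] == "1" else tree.zero
    for pattern in tree:
        if matches(pattern, key):
            return pattern
    return None  # classification failed: no matching class


WIDE = "10****"       # short destination prefix, any DSCP
NARROW_A = "101101"   # longer destination prefix, DSCP bits 01
NARROW_B = "101110"   # longer destination prefix, DSCP bits 10


def dest_split(narrow: str) -> Node:
    # Second-level node: test the third destination bit (bit 2).
    return Node(bit=2, zero=[WIDE], one=[narrow, WIDE])


# Root node: test the first DSCP bit (bit 4), as Figure 2 slices on DSCP.
TREE = Node(bit=4, zero=dest_split(NARROW_A), one=dest_split(NARROW_B))

if __name__ == "__main__":
    for key in ("101101", "101001", "011101"):
        print(key, "->", lookup(TREE, key))
```

   The key 101001 reaches the same leaf as 101101 because the tree
   never inspects bit 3; only the final bit-by-bit comparison rejects
   the narrower class and falls back to the wider one.
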
Author's Address

   Fred Baker
   Cisco Systems
   Santa Barbara, California  93117
   USA

   Email: fred@cisco.com