idnits 2.17.1 draft-donnerhacke-sidr-bgp-verification-dnssec-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 17. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 945. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 956. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 963. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 969. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 31 instances of lines with non-RFC2606-compliant FQDNs in the document. == There are 1 instance of lines with multicast IPv4 addresses in the document. If these are generic example addresses, they should be changed to use the 233.252.0.x range defined in RFC 5771 Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year == The document seems to use 'NOT RECOMMENDED' as an RFC 2119 keyword, but does not include the phrase in its RFC 2119 key words list. == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). 
Found 'SHOULD not' in this paragraph: Ambigous domains names SHOULD not be abbrivated. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (May 5, 2008) is 5834 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Obsolete normative reference: RFC 4893 (Obsoleted by RFC 6793) Summary: 2 errors (**), 0 flaws (~~), 5 warnings (==), 7 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Draft L. Donnerhacke 2 Category: Proposed Standard IKS GmbH 3 Expires: November 2008 W. Wijngaards 4 NLnet Labs 5 May 5, 2008 7 DNSSEC protected routing announcements for BGP 8 draft-donnerhacke-sidr-bgp-verification-dnssec-04 10 Status of this Memo 12 Distribution of this memo is unlimited. 14 By submitting this Internet-Draft, each author represents that any 15 applicable patent or other IPR claims of which he or she is aware 16 have been or will be disclosed, and any of which he or she becomes 17 aware will be disclosed, in accordance with Section 6 of BCP 79. 19 Internet-Drafts are working documents of the Internet Engineering 20 Task Force (IETF), its areas, and its working groups. Note that 21 other groups may also distribute working documents as Internet- 22 Drafts. 
24 Internet-Drafts are draft documents valid for a maximum of six months 25 and may be updated, replaced, or obsoleted by other documents at any 26 time. It is inappropriate to use Internet-Drafts as reference 27 material or to cite them other than as "work in progress." 29 The list of current Internet-Drafts can be accessed at 30 http://www.ietf.org/1id-abstracts.html 32 The list of Internet-Draft Shadow Directories can be accessed at 33 http://www.ietf.org/shadow.html. 35 Abstract 37 This document describes an infrastructure for real-time verification 38 of routes received via BGP4. Some DNS query types are introduced to 39 check the origin of a prefix and validity of the AS path. The crypto 40 part can be offloaded from the routing engine by sending a DNS query 41 and checking the AD bit in the DNS response. The proposal depends on 42 the DNS scalability and caching mechanisms as well as the PKI introduced 43 by DNSSEC. 45 Table of Contents 47 1. Introduction .................................................. 3 48 2. DNS Mapping ................................................... 4 49 2.1. The ASSET Resource Record ................................ 4 50 2.1.1. ASSET RDATA wire format ............................. 4 51 2.1.2. ASSET RDATA representation format ................... 6 52 2.1.3. Fallback to TXT ..................................... 6 53 2.2. Prefix origin ............................................ 7 54 2.3. AS Peering ............................................... 7 55 2.4. Delegation hierarchy ..................................... 9 56 2.5. Private numbers .......................................... 10 57 2.6. Route and AS path aggregation ............................ 10 58 3. Verification .................................................. 11 59 3.1. Verification algorithm ................................... 11 60 3.2. Offloading crypto ........................................ 12 61 3.3. Zone slaving ............................................. 
12 62 3.4. Utilizing peer's cache ................................... 12 63 3.5. Bootstrapping ............................................ 13 64 3.5.1. Delaying verification ............................... 13 65 3.5.2. Utilizing peer's resolver ........................... 13 66 4. Related work .................................................. 15 67 5. Test environment .............................................. 16 68 6. Security Considerations ....................................... 17 69 7. IANA Considerations ........................................... 17 70 8. References .................................................... 17 71 8.1. Normative References ..................................... 17 72 8.2. Informal References ...................................... 18 73 9. Changes history ............................................... 19 74 10. Acknowledgements ............................................. 20 76 Nomenclature 78 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 79 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 80 document are to be interpreted as described in [RFC2119]. 82 The process of checking a DNS record set to match the DNSSEC key 83 hierarchy is called "validation" in this document. 85 The process of checking a BGP route for origin and path consistency 86 is called "verification" in this document. 88 An unordered collection of Autonomous System (AS) numbers is called 89 "AS number set" in this document. 91 The DNS resource record representing an AS number set is called 92 "ASSET" in this document. "ASSET RR" means the whole DNS resource 93 record, while "ASSET RDATA" names the payload section of the ASSET 94 RR. 96 The AS-SET object in the Internet Routing Databases (IIRB) is called 97 "AS database set" in this document. 99 The aggregated AS number set stored in the BGP path information is 100 called "aggregated AS path set" in this document. 102 1. 
Introduction 104 BGP hijacking is a serious problem in the current internet. In an 105 ideal world those cases can't happen at all, because honest operators 106 apply filters on their BGP4 [RFC4271] peerings in order to catch fat- 107 fingered misconfigurations. The filters can be automatically derived 108 from existing, well maintained routing databases. A look at actual 109 routing tables suffices for a reality check. 111 This document proposes a real-time verification method of received 112 BGP announcements for routers: An efficient, automatic, and external 113 filter. The described infrastructure allows the filtering of bogus 114 announcements even after some steps of transit. 116 All the routing resource meta information is simplified and mapped 117 into a DNS hierarchy. The allocation and assignment chains for AS 118 and IP numbers from the IANA via RIR and LIRs to the routing entities 119 are reflected by the appropriate DNS delegation chain [iananum]. 121 At the routing entity level (i.e. the ISP or customer) the delegated 122 prefix is mapped to the AS number set, which injects the route into 123 the DFZ. Furthermore the peering state is modeled as a two way 124 announcement at this level. 126 Because of DNSSEC [RFC4033] all those delegations and announcements 127 can be validated. When querying, the router can do the DNSSEC vali- 128 dation itself or delegate it to the next validating resolver. A val- 129 idated response carries a special bit (Authenticated Data), which the router can rely on if 130 the link between the resolver and the router is trustworthy. 131 So the router can work with validated data without performing expen- 132 sive cryptographic operations and difficult lookup algorithms. 134 Some special issues arise from the interaction of building the rout- 135 ing table while requiring a working interconnection for verification, 136 and from verification and other operational errors. 138 2. 
DNS Mapping 140 The mapping is designed to ease the route verification process. All 141 verification steps should be performed by building a simple DNS 142 query and looking for a single value in the validated DNS response 143 set. Furthermore the whole process should be easy to debug. 145 A new zone BGP.ARPA is introduced to hold the routing resources. For 146 AS number mapping, the zone AS.BGP.ARPA is used. IPv4 prefixes are 147 mapped into IPV4.BGP.ARPA and IPv6 prefixes are mapped into 148 IPV6.BGP.ARPA. 150 2.1. The ASSET Resource Record 152 The ASSET RR contains an AS number set in a compact format. ASSET RRs 153 can point to multiple other ASSET RRs. Merging those referenced 154 ASSET RRs allows including AS database sets (in the form of ASSET RRs) 155 and implementing really huge AS number sets (as smaller ASSET RRs). 157 The type value for the ASSET RR is TBD (decimal). 159 The ASSET RR is defined for class IN. 161 2.1.1. ASSET RDATA wire format 163 The ASSET RDATA starts with a single octet holding the subtype 164 and name nibbles. The name nibble is bits 4-7 and indicates how many 165 names of referenced ASSET RRs follow (zero or more). After 166 the names are zero or more number ranges up to the end of RDATA. The 167 subtype and name count are unsigned integers in network order. 169 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 170 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 171 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 172 |subtype| #names| domain name 0 ... domain name N | 173 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 174 | number range 0 ... number range M | 175 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 177 A sender MUST NOT use DNS name compression for the names. This 178 allows the ASSET RR to be handled by older software [RFC3597].
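The leading part of the wire format can be illustrated with a minimal sketch. It assumes that the subtype occupies the high nibble of the first octet (bits 0-3 in network bit order) and the name count the low nibble, which agrees with the single-octet encodings "16" and "32" given for subtypes 1 and 2. The function name split_asset_header is hypothetical and not part of this proposal.

```python
def split_asset_header(rdata: bytes):
    """Split ASSET RDATA into (subtype, referenced names, remaining range bytes)."""
    if not rdata:
        raise ValueError("empty ASSET RDATA")
    subtype = rdata[0] >> 4        # bits 0-3: subtype
    name_count = rdata[0] & 0x0F   # bits 4-7: number of names that follow
    pos = 1
    names = []
    for _ in range(name_count):
        labels = []
        while True:
            if pos >= len(rdata):
                raise ValueError("truncated domain name")
            length = rdata[pos]
            pos += 1
            if length == 0:        # root label terminates the name
                break
            if length > 63:        # compression pointers are not allowed here
                raise ValueError("compressed or invalid label")
            if pos + length > len(rdata):
                raise ValueError("truncated label")
            labels.append(rdata[pos:pos + length].decode("ascii"))
            pos += length
        names.append(".".join(labels) + ".")
    # Whatever is left holds the number ranges, up to the end of RDATA.
    return subtype, names, rdata[pos:]
```

Because the names are never compressed, the parser needs no access to the surrounding DNS message, which is what lets unaware software treat the record as opaque data.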
180 A number range is encoded using an unsigned 16-bit base value in 181 network byte order, a single octet range length which is an unsigned 182 integer with the number of entries - 1 and up to 256 entries of 183 16-bit offset values. Each range encodes 32-bit AS numbers by com- 184 bining the offset as the lower 16 bits with the base as the higher 16 bits. 186 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 187 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 188 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 189 | high 16-bit base value | entry count-1 | low 16-bit | 190 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 191 | of AS number | low 16-bit of AS number | ... | 192 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 194 Encoder software MAY reorder AS numbers for efficient encoding. 195 Encoders MAY issue a warning if the encoded RDATA exceeds 1000 196 bytes. Encoders SHOULD reject the input if the encoded RDATA exceeds 197 3500 bytes. Encoders MUST reject the input if the encoded RDATA 198 exceeds 55 kbytes. 200 The following values are defined for the subtype value: 202 0 Union of the referenced ASSET RRs and embedded number ranges 204 This ASSET RR corresponds to the union set of all AS number sets 205 corresponding to the embedded number ranges and all the AS number 206 sets corresponding to the named ASSET RRs recursively. While pro- 207 cessing ASSET references, the querier MUST provide loop protec- 208 tion. ASSET references are likely to become circular. An ASSET 209 RR with no reference names and no number ranges is allowed and 210 corresponds to the empty set. 212 1 Set of all possible AS numbers 214 This ASSET RR provides a catch all for any possible AS number. If 215 this resource record is found while recursively processing subtype 216 0 records, the whole recursion process can be aborted resulting in 217 the largest possible AS number set.
This RR contains neither 218 referenced names nor number ranges. So the RDATA wire format 219 of this subtype consists of the single octet "16" (decimal). 221 2 Transition marker 223 This ASSET RR is used while setting up the global infrastructure 224 to mark "to be done" points. Initially this RR has the same 225 semantics as the subtype 1 RR. In the medium term, the semantics 226 of this RR will be changed to generate warnings or errors. In the 227 long term this RR will vanish. This RR contains neither ref- 228 erenced names nor number ranges. So the RDATA wire format of 229 this subtype consists of the single octet "32" (decimal). 231 3-15 Reserved 233 2.1.2. ASSET RDATA representation format 235 Subtype 0 resource records are represented by a space-separated 236 sequence of domain names (of the referenced ASSETs) followed by a space- 237 separated sequence of AS numbers in the asdot format [as4byte]. Name 238 representations without a trailing dot are abbreviated names in the 239 current $ORIGIN of the zone file. The first term matching the asdot 240 format, i.e. consisting of digits and an optional single dot only, 241 terminates the domain name sequence and starts the AS number 242 sequence. 244 Subtype 1 resource records are represented by the case insensitive 245 term "any". 247 Subtype 2 resource records are represented by the case insensitive 248 term "transition". 250 Ambiguous domain names SHOULD NOT be abbreviated. 252 2.1.3. Fallback to TXT 254 To ease deployment, ASSET RRs can be implemented as TXT records, con- 255 taining the representation format of the ASSET RR as RDATA. This 256 allows providing DNS mapped data in the BGP.ARPA zone without run- 257 ning ASSET aware DNSSEC tools or DNS servers. 259 Routing devices MUST first query for and understand the ASSET RR.
260 Only if the final response contains an authenticated denial of exis- 261 tence (NSEC) record proving the existence of a TXT record for exactly 262 the queried name, the routing device MUST ask for the TXT record. 263 The TXT record is not queried for in other circumstances. So a mini- 264 mal number of queries is sent. 266 This fallback procedure will be declared obsolete in the medium term. 268 2.2. Prefix origin 270 To query the origin AS number set for a prefix, the prefix is trans- 271 formed similarly to reverse lookups and the DNS is queried for ASSET 272 RRs. The DNS response results in a (possibly empty) AS number set. 274 IPv4 prefixes are queried in the same way as classless IN-ADDR.ARPA 275 reverse delegation [RFC2317], but in IPV4.BGP.ARPA instead of IN- 276 ADDR.ARPA. The least specific label MUST contain the netmask of the 277 prefix. 279 IPv6 prefixes are queried in the same way as IP6.ARPA reverse delega- 280 tion [RFC3596], but in IPV6.BGP.ARPA instead of IP6.ARPA. If a dele- 281 gation misses the nibble boundary, the same technique MUST be used as 282 for IPv4. The least specific label MUST contain the netmask of the 283 prefix. 285 Prefixes which MUST NOT appear in global routing tables do not get an 286 entry in the delegation hierarchy. For example, IPV6.BGP.ARPA should not 287 contain an entry for F. For locally distributed prefixes, the local 288 resolver SHOULD provide more specific zones and trust anchors for 289 those prefixes. This way exception handling in the routing devices 290 is minimized: They simply ask for the data they have to verify. 292 During rollout of this proposal, a transition period is necessary to 293 allow the AS operators to set up the necessary zones and get the del- 294 egations. During the transition, the RIRs SHOULD derive the AS data 295 from the [irdb] or MAY add the "transition" ASSET subtype for the 296 allocated prefixes.
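The construction of an IPv4 query name can be sketched as follows. This is an illustration under a stated assumption, not a normative algorithm: the label carrying the netmask is taken to be built from the octet that contains the prefix boundary, with the more significant octets following in reversed order. That reading reproduces the name 192/20.17.217.IPV4.BGP.ARPA used for 217.17.192.0/20 in this document; other prefix lengths follow by analogy only. The function name ipv4_query_name is hypothetical.

```python
import ipaddress


def ipv4_query_name(prefix: str) -> str:
    """Build the assumed IPV4.BGP.ARPA query name for an IPv4 prefix."""
    net = ipaddress.IPv4Network(prefix, strict=True)
    if net.prefixlen == 0:
        raise ValueError("a zero-length prefix has no netmask label")
    octets = net.network_address.packed
    # Index of the octet in which the prefix boundary falls (assumption).
    idx = (net.prefixlen - 1) // 8
    labels = [f"{octets[idx]}/{net.prefixlen}"]
    # More significant octets follow in reversed order, as in IN-ADDR.ARPA.
    labels += [str(o) for o in reversed(octets[:idx])]
    return ".".join(labels) + ".IPV4.BGP.ARPA."
```

A router would then send an ASSET query for the returned name and compare the origin AS against the AS number set in the validated response.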
298 Please note that for multicast routing the destination addresses are 299 not distributed via BGP4, but only the source addresses. So the mul- 300 ticast group addresses from 224.0.0.0/4 and FF00::/8 are never looked 301 up and will not be delegated in BGP.ARPA. 303 Example: 304 $ORIGIN 192/20.17.217.IPV4.BGP.ARPA. 305 @ ASSET 15725 ; CNAME delegation not necessary 307 $ORIGIN 8.D.B.4.1.0.0.2.IPV6.BGP.ARPA. 308 8/32 ASSET 15725 ; Delegation will have a CNAME for it 310 2.3. AS Peering 312 Peering between two AS is fourfold: Sending and accepting on each 313 side of a peering session. Furthermore peering policies depend on the 314 address family of the prefix [RFC4012]. 316 To query the peering policy of AS A in regard to AS B, both AS num- 317 bers are put together with the protocol and the peering direction, 318 and the DNS is queried for ASSET records. The DNS response results 319 in a (possibly empty) AS number set. 321 Local use of private AS numbers SHOULD be announced by adding spe- 322 cific zones and trust anchors at the local resolver. This way excep- 323 tion handling in the routing devices is minimized: Routing devices 324 handle private numbers in the same way as ordinary assigned AS num- 325 bers. 327 During rollout of this proposal, a transition period is necessary to 328 allow the AS operators to set up the necessary zones and get the del- 329 egations. During the transition, the RIRs SHOULD derive the AS data 330 from the [irdb] or MAY insert the "transition" subtype of ASSET. 332 To ease the delegation of AS number ranges to a RIR and in order to 333 keep the zone size small for efficient DNSSEC operation, the combining 334 of the two AS numbers for a peering from AS A to AS B is processed in 335 the following way: The 32-bit AS number of A is written as its high-order 16 bits in decimal, a dot, and its low-order 16 bits expanded to five decimal digits; each digit of the low-order part forms a label of its own, then the order of the labels is reversed, 338 and AS.BGP.ARPA appended. The resulting zone SHOULD be under the 339 control of the AS operators.
The asdot format of AS B followed by 340 the peering direction ("import" or "export") and the protocol family 341 is prepended to this zone apex. 343 Conversion example: 344 AS15725 -> 0.15725 -> 5.2.7.5.1.0 345 AS3.10 -> 3.00010 -> 0.1.0.0.0.3 346 AS12.34 -> 12.00034 -> 4.3.0.0.0.12 348 Peering information example: 349 $ORIGIN multicast.ipv4.5.2.7.5.1.0.AS.BGP.ARPA. 350 3.3.export ASSET 15725 ; AS15725 exports to AS3.3 only itself 351 3.3.import ASSET 3.3 ; AS15725 imports from AS3.3 only 3.3 352 15725.export ASSET ANY ; AS15725 may prepend 353 15725.import ASSET ANY ; AS15725 may prepend 355 $ORIGIN ipv4.3.0.0.0.0.3.AS.BGP.ARPA. 356 5539.import.unicast ASSET ANY 357 5539.export.unicast ASSET 3.3 358 6695.import.unicast ASSET as-decix.5.9.6.6.0.0.AS.BGP.ARPA. 359 6695.export.unicast ASSET 3.3 360 15725.import.multicast ASSET 3.3 361 15725.export.multicast ASSET ANY 362 $ORIGIN 5.9.6.6.0.0.AS.BGP.ARPA. 363 as-decix ASSET local as-hosteurope.3.7.7.0.2.0.AS.BGP.ARPA. ... 364 local ASSET 12510 12989 20899 25286 31334 31529 41039 42416 366 $ORIGIN 3.7.7.0.2.0.AS.BGP.ARPA. 367 as-hosteurope ASSET 20773 369 2.4. Delegation hierarchy 371 Currently IPv4 addresses are allocated to the RIRs as /8. The dele- 372 gation at IPV4.BGP.ARPA follows this and delegates the zones to the RIR's 373 name servers. This mimics the delegation from IANA to the RIRs in 374 IN-ADDR.ARPA. 376 IPv4 addresses are allocated to the LIRs in various sizes. Delega- 377 tion of the allocation is done by the RIR in a classless manner. Further- 378 more the classless prefixes at this level up to the next classful 379 boundary have to be delegated to the LIR, too. The use of CNAME for 380 classless delegations and DNAME for smaller prefixes is REQUIRED. 382 Example: 383 $ORIGIN 17.217.IPV4.BGP.ARPA. 384 192/20 NS avalon.iks-jena.de.
385 $GENERATE 192-207/8 $/21 CNAME $/21.192/20 386 $GENERATE 192-207/4 $/22 CNAME $/22.192/20 387 $GENERATE 192-207/2 $/23 CNAME $/23.192/20 388 $GENERATE 192-207/1 $/24 CNAME $/24.192/20 389 $GENERATE 192-207 $ DNAME $.192/20 391 If the AS operator announces the full allocation, the LIR adds the 392 ASSET RR to the delegated zone. If the AS operator deaggregates the 393 allocation and/or permits assignments to be separately announced, the LIR 394 adds further ASSET records or sets up delegations to the AS operators. 396 IPv6 address delegation mimics the delegation in IP6.ARPA. Please 397 note the similarity to IPv4 if an allocation or assignment misses the 398 nibble boundary. Furthermore the classless prefixes at this level up 399 to the next classful boundary have to be delegated to the LIR, too. 400 The use of CNAME for classless delegations and DNAME for smaller pre- 401 fixes is REQUIRED. 403 Example: 404 $ORIGIN 0.1.0.0.2.IPV6.BGP.ARPA. 405 8/22 NS ns.ripe.net. 406 $GENERATE 8-15/2 ${0,0,x}/23 CNAME ${0,0,x}/23.8/22 407 $GENERATE 8-15/1 ${0,0,x}/24 CNAME ${0,0,x}/24.8/22 408 $GENERATE 8-15 ${0,0,x} DNAME ${0,0,x}.8/22 410 AS number allocations from IANA to the RIRs are done in large blocks. 411 IANA has to delegate every zone for which the RIR might be responsi- 412 ble, but not more. Additional zones MAY be introduced using DNAME to 413 delegate single AS numbers via RIRs, if the RIR can't maintain the 414 LIRs' data directly in the IANA zone (sometimes the IANA delegation 415 can be directly to the LIR). 417 RIRs assign single AS numbers to the LIRs and delegate the appropri- 418 ate zone. 420 AS database sets are a common tool in the Internet Routing Registry 421 [irdb] and are maintained by AS operators. AS operators SHOULD provide 422 their common AS database sets of the routing registry directly as 423 ASSET RR in their associated AS.BGP.ARPA zone.
Other AS operators 424 are encouraged to refer to those ASSET records instead of generating 425 their own ASSET RR using a database toolset. Referencing provides much 426 smaller zone files and "automatic" update of changes. On the other 427 hand generating the whole AS number set directly from the database 428 provides a locally cached and therefore more stable version of the 429 peering information. 431 2.5. Private numbers 433 The delegation described in the previous section can't cover usage of 434 private addresses or AS numbers. Private numbers are not delegated, 435 but only reserved by IANA. Instead of officially marking reserved 436 ranges to hand over the control to local router configuration, the 437 reserved ranges are simply not delegated at all. 439 If private addresses or numbers are in use, the DNS operators of this 440 environment SHOULD set up local zones in BGP.ARPA, sign them and 441 locally distribute the trust anchors. This way the verification pro- 442 cess for routers stays simple. The zones SHOULD be shared 443 between the involved AS to avoid duplication of configuration data. 445 Configured local zones for private space MUST NOT be redistributed in 446 the official BGP.ARPA tree. DNS operators need to make sure that 447 those zones are not visible in unrelated AS. The authoritative name 448 servers serving local zones in BGP.ARPA SHOULD be kept separate from 449 the authoritative name servers visible to the public. When using local 450 zones in BGP.ARPA, the recursive, validating resolver used for router 451 equipment SHOULD be kept separate from the DNS resolvers for cus- 452 tomers. 454 2.6. Route and AS path aggregation 456 A not uncommon BGP setup is to aggregate several more specific routes 457 to a larger prefix. The aggregated prefix is injected into the 458 global routing table by the aggregating AS.
Optionally the AS path 459 can contain an aggregated AS path set, in order to prevent the aggre- 460 gated route from being propagated back. 462 For the purpose of verifying the origin of a prefix, the whole aggre- 463 gation process as well as the aggregated AS path set can be ignored. 464 So aggregated AS path sets MUST be stripped from the AS path before 465 verification. The aggregating AS is considered as the origin of the 466 aggregated prefix. 468 3. Verification 470 A router receives routes in a given address family consisting of a 471 prefix and an AS path via BGP4. The router has to verify whether the 472 incoming route is allowed or not. 474 The router has to check the following criteria: 475 - is the originating AS allowed to inject the route? 476 - do all the AS in the path peer as claimed? 477 - does the recorded path fulfill the peering policies? 479 3.1. Verification algorithm 481 To check the origin, the router queries for the prefix as described 482 in 2.2. If the last AS in the path, which is not part of an aggre- 483 gated AS path set, is in the AS number set of the DNS response, the 484 origin is verified. If the prefix can't be found, the check fails. 486 To check the peering policies, for each pair of sequenced AS in the 487 path a query as described in 2.3 is performed. Aggregated AS path 488 sets are ignored. The policy of the sending AS MUST contain all AS 489 numbers of the path tail including the sending AS number for the 490 address family and for the direction "export". The policy of the 491 receiving AS MUST contain all AS numbers of the path tail including 492 the sending AS number for the address family and for the direction 493 "import". If an AS can't be found, the check fails. 495 The router SHOULD NOT check the recursive peering policy for dupli- 496 cate AS numbers, which are the result of prepending. AS operators 497 SHOULD add a self peering entry if they use prepending. 499 If all checks succeed, the route is accepted.
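The origin and peering checks can be sketched as follows. This is a minimal illustration, not a reference implementation. It assumes that the DNS lookups are hidden behind two caller-supplied functions (origin_of and policy_of, both hypothetical) which return a validated AS number set, the sentinel ANY for the "any" subtype, or None when nothing is found. The path is assumed to be ordered as received, with the origin last and aggregated AS path sets already stripped. Collapsing consecutive duplicates is one possible way to skip prepended AS numbers.

```python
ANY = object()  # stands for an ASSET RR of subtype "any"


def covers(asset, wanted):
    """True if the looked-up AS number set contains every AS in wanted."""
    if asset is None:          # name not found: the check fails
        return False
    return asset is ANY or wanted <= asset


def verify_route(prefix, path, origin_of, policy_of, family="ipv4.unicast"):
    # Drop consecutive duplicates, which are the result of prepending.
    hops = [asn for i, asn in enumerate(path) if i == 0 or asn != path[i - 1]]
    if not hops:
        return False
    # Origin check: the last AS in the path must be allowed to inject the prefix.
    if not covers(origin_of(prefix), {hops[-1]}):
        return False
    # Peering check for each pair of sequenced AS in the path.
    for i in range(len(hops) - 1):
        receiver, sender = hops[i], hops[i + 1]
        tail = set(hops[i + 1:])   # path tail including the sending AS
        if not covers(policy_of(sender, receiver, "export", family), tail):
            return False
        if not covers(policy_of(receiver, sender, "import", family), tail):
            return False
    return True
```

A False result corresponds to a failed check in the text above; what the router then does with the route is a separate matter.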
501 If the check fails, the processing for this route MUST be delayed and 502 retried. This is necessary because BGP4 announces a route only 503 once during a peering session. If the problem with the DNS disap- 504 pears, the route will not be reannounced in the BGP4 session, but 505 MUST be accepted now. 507 Routers MAY record the TTL of the responses and assign the route the 508 minimum of all TTLs to regularly reverify the route. Routers MUST 509 NOT drop the route solely because the TTL times out. 511 3.2. Offloading crypto 513 Routers are not designed for DNS processing and should not do it. 514 DNSSEC offers a validating resolver and an Authenticated Data bit in 515 the response header. Routers SHOULD ask a validating resolver and 516 rely on the AD bit in the response [RFC4033]. 518 Using this approach, PKI processing, caching, and debugging are handed 519 over to specialized software and admins. 521 3.3. Zone slaving 523 Normally name servers of a new AS can't be reached, because the new 524 route to the prefix of the AS can't be verified until the route to 525 the nameserver is active. 527 That's why all zones in BGP.ARPA MUST have secondaries in other AS. 528 The RIRs are urged to provide public secondaries for their LIRs and 529 their routing customers. 531 To avoid a net split after a hypothetical major outage, running sec- 532 ondaries of other zones, especially of those of the peering AS, is 533 RECOMMENDED. Name server operators in BGP.ARPA SHOULD allow zone 534 transfers to everyone [RFC1034]. 536 3.4. Utilizing peer's cache 538 Querying each record from the authoritative name servers for every 539 recursive resolver would cause a storm of queries from the whole 540 internet if a prefix is injected or flaps. Such a query storm is 541 similar to a DDoS and should be avoided. 543 Any received prefix comes from a peer router which should have veri- 544 fied the prefix before sending.
So the peer's router knows its 545 local resolver which in turn may have cached all the necessary data 546 to validate the prefix. 548 Routing devices SHOULD add the peer's router name as NS for BGP.ARPA 549 in the authority section, and the peer's router address as A or AAAA 550 for the router name to the additional section of its own queries to 551 its own validating resolver. The name for the NS and A/AAAA entry 552 is not important; it only connects the NS RR and the A/AAAA RR. The 553 qname of the NS RR can be considered as the maximum scope of allowed 554 DNS queries. 556 The resolver SHOULD ask the mentioned address first for all necessary 557 recursive queries regarding this query. It MUST NOT add the router 558 address into the cache as a valid nameserver for the zone BGP.ARPA. 559 If the peer's resolver denies access or is unreachable, the resolver 560 MUST NOT query the peer's resolver for a reasonable time. If the 561 necessary data cannot be obtained from the peer's resolver, the 562 resolver MUST start the normal DNS resolving algorithm. Sending DNS 563 queries to a different host is a security risk, so resolvers SHOULD 564 permit this redirection only for known sources (their own routers) 565 and MAY limit this feature to zones under BGP.ARPA. 567 The peer's router SHOULD forward the queries to its local resolver. 568 It is NOT RECOMMENDED for the router to provide this service for 569 everyone, so the routing device SHOULD permit DNS forwarding only for 570 sources of the peering AS and MAY use its BGP routing table for this 571 purpose. 573 The peer's resolver SHOULD respond using its cache data as a regular 574 recursor providing forwarding service. The resolver MUST take care 575 not to serve information for private zones; this can also be accom- 576 plished by having two resolvers, one for the router, one for outside 577 queries. 579 3.5. Bootstrapping 581 There are two strategies to handle the startup of AS routing. 583 3.5.1. 
Delaying verification 585 Routers SHOULD postpone all the checks but accept all the routes 586 as long as the routing table stays below a configurable size. 587 This behaviour allows a cold start after disastrous problems: The 588 verification is postponed until DNS becomes usable. 590 3.5.2. Utilizing peer's resolver 592 While bootstrapping, foreign AS will need security information to 593 accept routes originating from an AS. This can be accomplished by 594 putting master authority DNS servers for the AS's AS.BGP.ARPA zone and the 595 AS prefixes in IPV4.BGP.ARPA and IPV6.BGP.ARPA inside the AS and 596 reachable by the forwarding resolver. Far away AS can then query 597 their neighboring routers, which will forward the query to their 598 resolver, which will ask routers that are closer, and so on, towards 599 the authority server. 601 A resolver performing such router forwarding MUST be able to get the 602 address from its router for the resolver in a neighboring AS that is 603 closer to a destination AS or prefix. The router consults its 604 routing tables to determine the AS neighbor closer for a prefix. For 605 unrouted prefixes, the router has no answer, because it does not know 606 a closer AS, or the resolver address for the closer AS. 608 The router is queried for the neighboring resolver address with a 609 query of type NEIGHBOR_NS, and name in AS.BGP.ARPA, IPV4.BGP.ARPA, 610 IPV6.BGP.ARPA. The reply contains an NS for BGP.ARPA and addresses 611 for the remote server that handles forwarded router queries in the 612 neighboring AS. This NS MUST NOT be stored in the validator cache as 613 a nameserver for BGP.ARPA. Query RR type NEIGHBOR_NS has type code 614 TBD3 (decimal). 616 To be able to validate the DNSSEC chain of trust while the root, 617 IANA, RIR and other servers are unreachable during bootstrapping, the 618 DNSSEC chain of trust information MUST be stored.
   The AS stores such information in CHAINOFTRUST RRs at the zone apex
   of its AS.BGP.ARPA, IPV4.BGP.ARPA and IPV6.BGP.ARPA zones. The
   information was inserted when the zone was last signed, so it may be
   out of date regarding current information served by parent zones,
   but the information MUST be verifiable using the current trust
   anchors.

   The CHAINOFTRUST RR has type code TBD2 (decimal) and is class
   independent. Its wire format consists of the 16-bit type code of the
   original RR and the uncompressed original domain name; the remainder
   up to the rdata length is the original rdata, which is presented in
   base64. The RR type is used to wrap DNSSEC chain of trust data so
   that it can be stored at the authority servers of the AS without
   conflicting with data from other ASes. It is compliant with
   [RFC3597]. The data can be copied from the parent authority servers
   verbatim. The CHAINOFTRUST RRset must also be signed by the ZSK as
   usual. An example:

      $ORIGIN 3.0.0.0.3.as.bgp.arpa.
      @ CHAINOFTRUST DNSKEY .
        CHAINOFTRUST RRSIG .
        CHAINOFTRUST DS arpa.
        CHAINOFTRUST RRSIG .
        CHAINOFTRUST DNSKEY arpa.
        CHAINOFTRUST RRSIG arpa.
        CHAINOFTRUST DS bgp.arpa.
        CHAINOFTRUST RRSIG arpa.
        CHAINOFTRUST DNSKEY bgp.arpa.
        CHAINOFTRUST RRSIG bgp.arpa.
        CHAINOFTRUST DS as.bgp.arpa.
        CHAINOFTRUST RRSIG bgp.arpa.
        CHAINOFTRUST DNSKEY as.bgp.arpa.
        CHAINOFTRUST RRSIG as.bgp.arpa.
        CHAINOFTRUST DS 3.as.bgp.arpa.
        CHAINOFTRUST RRSIG as.bgp.arpa.
        CHAINOFTRUST DNSKEY 3.as.bgp.arpa.
        CHAINOFTRUST RRSIG 3.as.bgp.arpa.
        CHAINOFTRUST DS 3.0.0.0.3.as.bgp.arpa.
        CHAINOFTRUST RRSIG 3.as.bgp.arpa.
        CHAINOFTRUST DNSKEY 3.0.0.0.0.3.as.bgp.arpa.
        CHAINOFTRUST RRSIG 3.0.0.0.3.as.bgp.arpa.
        RRSIG 3.0.0.0.3.as.bgp.arpa.

   The CHAINOFTRUST RRset can thus become fairly large and will
   probably require fallback to TCP when queried.
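
   As a non-normative illustration, the CHAINOFTRUST wire format
   described above (16-bit original type code, uncompressed original
   domain name, original rdata) might be packed and unpacked as in the
   following sketch. Network byte order for the type code is an
   assumption based on general DNS convention, and all function names
   are hypothetical. The type codes for DS, RRSIG and DNSKEY are the
   registered ones; the code for CHAINOFTRUST itself remains TBD2.

```python
import struct

# Registered RR type codes of the records wrapped in the example above.
RR_TYPES = {"DS": 43, "RRSIG": 46, "DNSKEY": 48}


def encode_name(name: str) -> bytes:
    """Encode a domain name in uncompressed DNS wire format."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        if not label:
            continue  # the root name "." has no labels
        raw = label.encode("ascii")
        if len(raw) > 63:
            raise ValueError("label longer than 63 octets")
        out.append(len(raw))
        out += raw
    out.append(0)  # the zero-length root label terminates the name
    return bytes(out)


def decode_name(buf: bytes, pos: int):
    """Decode an uncompressed name at pos; return (name, next position)."""
    labels = []
    while True:
        length = buf[pos]
        pos += 1
        if length == 0:
            break
        if length > 63:
            raise ValueError("compressed or invalid label")
        labels.append(buf[pos:pos + length].decode("ascii"))
        pos += length
    return ".".join(labels) + ".", pos


def pack_chainoftrust(orig_type: str, owner: str, orig_rdata: bytes) -> bytes:
    """Build CHAINOFTRUST RDATA: type code, original name, original rdata."""
    # Assumption: the 16-bit type code is sent in network byte order.
    return struct.pack("!H", RR_TYPES[orig_type]) + encode_name(owner) + orig_rdata


def unpack_chainoftrust(rdata: bytes):
    """Split CHAINOFTRUST RDATA into (type code, original name, rdata)."""
    (type_code,) = struct.unpack_from("!H", rdata, 0)
    name, pos = decode_name(rdata, 2)
    return type_code, name, rdata[pos:]
```

   Because the original rdata simply occupies the remainder of the
   RDATA, no inner length field is needed; the unpacking side relies on
   the outer rdata length alone.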
   Storing a CHAINOFTRUST with original type CHAINOFTRUST can be used
   to refer a validator to more CHAINOFTRUST RRs, which can be found at
   the name pointed to by the stored domain name.

4. Related work

   The idea is not new. Directly after the specification of DNSSEC, the
   provided infrastructure was applied to verifying BGP announcements.
   Prefix origin verification was proposed by [bates] and discussed by
   [liauth]. AS mapping to DNS was proposed by [eastlake].

   [bates] preferred to define the new record type AS in order to keep
   the current semantics of TXT. This proposal initially preferred TXT.

   Filling the testbed with real world data revealed AS database sets
   with more than 20000 AS numbers after deaggregation. Using TXT
   records, the record set exceeds 100 kbyte and all limits for DNS
   packets. Such record sets cannot be retrieved. Mr. Wijngaards
   developed the ASSET type with bitfields and name chaining. Following
   the responsibility principle, chaining was extended to multiple
   references.

   Multiple encoding variants of ASSET were tried with real world data:
   decimal encoding as TXT, binary encoding of 32-bit numbers, binary
   encoding of 16-bit numbers within a high 16-bit window, and
   NSEC-like bitmaps within a 32-bit base window. Bitmap encoding is
   more efficient if RDATA exceeds about 700 bytes. In all other cases
   the 16-bit encoding as described in 2.1.1 is more compact.

   [eastlake] defines the AS mapping to DNS using the asplain notation
   combined with a length indicator of the significant digits. With the
   introduction of four-byte AS numbers [RFC4893], IANA chooses to
   allocate a whole to a single RIR only, which suggests asdot usage.
   Furthermore, fixed-size formats are easier to handle in embedded
   devices.

   The current proposal chooses to expand the to five decimal digits
   and append the whole as a single decimal number.
   This decision only scales as long as the number of allocated stays
   small.

   The alternate approach of coding the AS number in hex as in IP6.ARPA
   offers the possibility to follow the IANA allocation policy more
   closely (allocation step is 0x100). Tests show that currently 3455
   delegations from IANA to RIRs are necessary based on decimal
   numbers, but only 3440 based on hexadecimal numbers. Only if lookups
   were done on binary numbers would the number of delegations drop to
   70. In order to ease debugging, this proposal chooses to stick to
   decimal numbers.

   The current work of the SIDR WG focuses on automatic generation and
   validation of filters [sidrwg]. AS path checks are not yet
   developed.

   [wijngaards] is very similar to the current proposal, so the results
   were merged.

   There are other proposals, e.g. a redesign of the BGP4 protocol to
   include cryptographic authentication of the path and origin
   [bartels].

5. Test environment

   A testbed was built to test implementations and verify assumptions
   based on this recommendation. The data in the testbed is derived
   from snapshots of the Internet Routing Registry [irdb] with a focus
   on the RIPE region.

   The primary NS for the testbed of BGP.ARPA is IANA.BGP.IKS-JENA.DE.
   If you run secondaries, Lutz is happy to add them as name servers
   for the test zones. If you would like to get a delegation to
   maintain your own part of the testbed, please contact Lutz
   Donnerhacke.

   IANA and RIRs are especially encouraged to maintain their own area
   of responsibility. This way the testbed would be more accurate and
   the communication channels between the participating parties could
   be covered.

   To gain experience with DNSSEC signed domains up to the root, Lutz
   Donnerhacke runs a signed root [iksroot], which is expanded to cover
   the BGP.ARPA testbed. You MUST NOT consider this environment as a
   permanent resource.
   It will vanish as soon as the root gets signed [rootsign].

6. Security Considerations

   All zones in BGP.ARPA MUST be signed. Local infrastructure between
   the routers and the validating recursive resolvers SHOULD be secured
   against data modification or spoofing attacks.

   Operational errors in DNSSEC or DNS handling will cause routing
   problems. Operational errors at a RIR or IANA will cause larger
   shutdowns of global routing. These errors may be mitigated if the
   CHAINOFTRUST types are queried and contain data from before the
   error.

   Injecting or flapping routes may cause a storm of DNS queries from
   routers of the whole Internet. Such a request storm is similar to a
   DDoS attack. Be prepared. Have secondaries. Don't flap.

7. IANA Considerations

   IANA should gracefully add the BGP.ARPA zone and maintain the
   delegations to the RIRs.

   IANA should sign all the zones from the RIR delegation point up to
   the root. IANA should maintain the resigning and key rollover
   procedures for those zones.

   IANA should set up a Delegation Signer (i.e. manual) update protocol
   for the delegation points to allow the RIRs to change their keys.

   IANA should maintain a registry of ASSET subtype numbers. Those
   numbers should be updated by IETF consensus.

   IANA should assign RR type codes for ASSET, CHAINOFTRUST and
   NEIGHBOR_NS.

8. References

8.1. Normative References

   [RFC1034] Mockapetris, P., "Domain Names - Concepts and Facilities",
             RFC 1034, November 1987

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", RFC 2119, March 1997

   [RFC2317] Eidnes, H. and de Groot, G. and Vixie, P., "Classless
             IN-ADDR.ARPA delegation", RFC 2317, March 1998

   [RFC3596] Thomson, S. and Huitema, C. and Ksinant, V.
             and Souissi, M., "DNS Extensions to support IP version 6",
             RFC 3596, October 2003

   [RFC3597] Gustafsson, A., "Handling of Unknown DNS Resource Record
             (RR) Types", RFC 3597, September 2003

   [RFC4012] Blunk, L. and Damas, J. and Parent, F. and Robachevsky,
             A., "Routing Policy Specification Language next
             generation", RFC 4012, March 2005

   [RFC4033] Arends, R. and Austein, R. and Larson, M. and Massey, D.
             and Rose, S., "DNS Security Introduction and
             Requirements", RFC 4033, March 2005

   [RFC4271] Rekhter, Y. and Li, T. and Hares, S., "A Border Gateway
             Protocol 4", RFC 4271, January 2006

   [RFC4893] Vohra, Q. and Chen, E., "BGP Support for Four-octet AS
             Number Space", RFC 4893, May 2007

   [as4byte] Michaelson, G. and Huston, G., "Canonical Text
             Representation of Four-octet AS Numbers", Work in
             Progress: draft-michaelson-4byte-as-representation-05,
             December 2007

8.2. Informative References

   [bates] Bates, T. and Bush, R. and Li, T.
             and Rekhter, Y., "DNS-based NLRI origin AS verification in
             BGP", Expired work in progress:
             draft-bates-bgp4-nlri-orig-verif-00, December 1997

   [eastlake] Eastlake, D., "Mapping Autonomous Systems Number into
             the Domain Name System", Expired work in progress:
             draft-ietf-dnssec-as-map-05, July 1997

   [liauth] Li, T., "Origin Authentication in BGP", Expired work in
             progress: http://www.academ.com/nanog/feb1998/origin.html,
             February 1998

   [irdb] "The Internet Routing Registry: History and Purpose",
             http://www.ripe.net/db/irr.html

   [iananum] "Number Resources", http://www.iana.org/numbers/

   [sidrwg] "Secure Inter-Domain Routing",
             http://tools.ietf.org/wg/sidr/

   [rootsign] "IANA (DEMO) DNSSEC Status",
             https://ns.iana.org/dnssec/status.html

   [youtube] RIPE NCC, "YouTube Hijacking: A RIPE NCC RIS case study",
             http://www.ripe.net/news/study-youtube-hijacking.html,
             February 2008

   [bartels] Bartels, O., "Requirements for a new routing protocol",
             Work in Progress:
             news:6msps3tgjug-mvlkk1hcr26jpo8nrfhbmj0@4ax.com,
             March 2008

   [wijngaards] Wijngaards, W., "Securing BGP using DNSSEC",
             unpublished, April 2008

   [iksroot] Donnerhacke, L., "Instructions for a signed root",
             https://www.iks-jena.de/leistungen/keys.txt, December 2007

9. Changes history

   This section will not appear in the final document. It provides some
   convenience hints about what changed between the document versions.
   It is neither complete nor normative.

   Important differences from 03 to 04:
   - IP delegation always requires a netmask for proper delegation

   Important differences from 02 to 03:
   - Wouter Wijngaards added as author
   - ASSET RR added in favor of TXT RR
   - Peering direction and address family moved from RDATA to NAME.
   - DNAMEs for delegations are now REQUIRED instead of RECOMMENDED.
   - IRDB AS-Set mappings added
   - Bootstrapping separated out as an extra section
   - Utilizing peer's cache section added
   - Testbed responsibility assigned to Lutz Donnerhacke
   - Added DDoS risks
   - Added subtype registry for IANA

   Important differences from 01 to 02:
   - Removed reserved handling in favor of locally served DNS zones.
   - Added aggregate handling.

   Important differences from 00 to 01:
   - Added handling of reserved address space using wildcards.
   - Added handling of non routable address space using denial of
     existence.
   - Added classification of multicast address space as non routable
     space.
   - Added transition phase where information is copied from [irdb] or
     verification is explicitly turned off.
   - Added recommendation to explicitly announce prepending as self
     peering.
   - Raised recheck of delayed verifications from SHOULD to MUST.
   - Added a section about related work and reasons for design
     decisions.

10. Acknowledgements

   The proposal was developed with the help of Gert Doering and Oliver
   Bartels in a USENET News discussion about the YouTube hijacking in
   February 2008 [youtube].

   Many thanks go to Tony Li for pointing out several historic
   documents, and for his invaluable comments on the transition phase,
   reserved areas, and readvertisement of received prefixes.

   Wouter Wijngaards independently developed a very similar proposal.
   Both proposals were merged. Mr. Wijngaards does a wonderful job in
   developing the DNS related parts.

Authors' Addresses

   Lutz Donnerhacke
   IKS GmbH
   Leutragraben 1
   07743 Jena
   Germany
   Phone: +49-3641-573561
   EMail: lutz@iks-jena.de

   Wouter Wijngaards
   NLnet Labs
   Kruislaan 419
   Amsterdam 1098 VA
   The Netherlands
   Phone: +31-20-888-4551
   EMail: wouter@nlnetlabs.nl

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

Disclaimer

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.