idnits 2.17.1

draft-ietf-sidr-rpki-rtr-24.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([I-D.ietf-sidr-arch]), which
     it shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (January 12, 2012) is 4489 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-10) exists of
     draft-ietf-sidr-pfx-validate-03

  ** Obsolete normative reference: RFC 2385 (Obsoleted by RFC 5925)

  ** Obsolete normative reference: RFC 5226 (Obsoleted by RFC 8126)

  ** Obsolete normative reference: RFC 5246 (Obsoleted by RFC 8446)

  == Outdated reference: A later version (-01) exists of
     draft-ymbk-rpki-rtr-impl-00

     Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                            R. Bush
Internet-Draft                                 Internet Initiative Japan
Intended status: Standards Track                              R. Austein
Expires: July 15, 2012                              Dragon Research Labs
                                                        January 12, 2012

                        The RPKI/Router Protocol
                      draft-ietf-sidr-rpki-rtr-24

Abstract

   In order to formally validate the origin ASs of BGP announcements,
   routers need a simple but reliable mechanism to receive RPKI
   [I-D.ietf-sidr-arch] prefix origin data from a trusted cache. This
   document describes a protocol to deliver validated prefix origin data
   to routers.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 15, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Glossary
   3. Deployment Structure
   4. Operational Overview
   5. Protocol Data Units (PDUs)
      5.1. Serial Notify
      5.2. Serial Query
      5.3. Reset Query
      5.4. Cache Response
      5.5. IPv4 Prefix
      5.6. IPv6 Prefix
      5.7. End of Data
      5.8. Cache Reset
      5.9. Error Report
      5.10. Fields of a PDU
   6. Protocol Sequences
      6.1. Start or Restart
      6.2. Typical Exchange
      6.3. No Incremental Update Available
      6.4. Cache has No Data Available
   7. Transport
      7.1. SSH Transport
      7.2. TLS Transport
      7.3. TCP MD5 Transport
      7.4. TCP-AO Transport
   8. Router-Cache Set-Up
   9. Deployment Scenarios
   10. Error Codes
   11. Security Considerations
   12. IANA Considerations
   13. Acknowledgments
   14. References
      14.1. Normative References
      14.2. Informative References
   Authors' Addresses

1. Introduction

   In order to formally validate the origin ASs of BGP announcements,
   routers need a simple but reliable mechanism to receive RPKI
   [I-D.ietf-sidr-arch] formally validated prefix origin data from a
   trusted cache. This document describes a protocol to deliver
   validated prefix origin data to routers.

   Section 3 describes the deployment structure and Section 4 then
   presents an operational overview. The binary payloads of the
   protocol are formally described in Section 5, and the expected PDU
   sequences are described in Section 6. The transport protocol options
   are described in Section 7. Section 8 details how routers and caches
   are configured to connect and authenticate. Section 9 describes
   likely deployment scenarios. The traditional security and IANA
   considerations end the document.

   The protocol is extensible to support new PDUs with new semantics
   when and as needed, as indicated by deployment experience. PDUs are
   versioned should deployment experience call for change.

   For an implementation (not inter-op) report, see
   [I-D.ymbk-rpki-rtr-impl].

2. Glossary

   The following terms are used with special meaning:

   Global RPKI: The authoritative data of the RPKI are published in a
      distributed set of servers at the IANA, RIRs, NIRs, and ISPs, see
      [I-D.ietf-sidr-repos-struct].

   Cache: A coalesced copy of the RPKI which is periodically
      fetched/refreshed directly or indirectly from the global RPKI
      using the [RFC5781] protocol/tools. Relying party software is
      used to gather and validate the distributed data of the RPKI into
      a cache. Trusting this cache further is a matter between the
      provider of the cache and a relying party.

   Serial Number: A 32-bit monotonically increasing ordinal which wraps
      from 2^32-1 to 0. It denotes the logical version of a cache. A
      cache increments the value by one when it successfully updates its
      data from a parent cache or from primary RPKI data. As a cache is
      receiving, new incoming data and implicit deletes are marked with
      the new serial but MUST NOT be sent until the fetch is complete.
      A serial number is not commensurate between caches, nor need it be
      maintained across resets of the cache server. See [RFC1982] on
      DNS Serial Number Arithmetic for too much detail on serial number
      arithmetic.

   Session ID: When a cache server is started, it generates a session
      identifier to uniquely identify the instance of the cache and to
      bind it to the sequence of Serial Numbers that cache instance will
      generate. This allows the router to restart a failed session
      knowing that the Serial Number it is using is commensurate with
      that of the cache.

3. Deployment Structure

   Deployment of the RPKI to reach routers has a three level structure
   as follows:

   Global RPKI: The authoritative data of the RPKI are published in a
      distributed set of servers, RPKI publication repositories, e.g.
      the IANA, RIRs, NIRs, and ISPs, see [I-D.ietf-sidr-repos-struct].

   Local Caches: A local set of one or more collected and verified
      caches. A relying party, e.g. router or other client, MUST have a
      trust relationship with, and a trusted transport channel to, any
      authoritative cache(s) it uses.

   Routers: A router fetches data from a local cache using the protocol
      described in this document. It is said to be a client of the
      cache. There MAY be mechanisms for the router to assure itself of
      the authenticity of the cache and to authenticate itself to the
      cache.

4. Operational Overview

   A router establishes and keeps open a connection to one or more
   caches with which it has client/server relationships. It is
   configured with a semi-ordered list of caches, and establishes a
   connection to the most preferred cache, or set of caches, which
   accept the connections.

   The router MUST choose the most preferred, by configuration, cache or
   set of caches so that the operator may control load on their caches
   and the Global RPKI.

   Periodically, the router sends to the cache the serial number of the
   highest numbered data it has received from that cache, i.e. the
   router's current serial number. When a router establishes a new
   connection to a cache, or wishes to reset a current relationship, it
   sends a Reset Query.

   The Cache responds with all data records which have serial numbers
   greater than that in the router's query. This may be the null set,
   in which case the End of Data PDU is still sent. Note that 'greater'
   must take wrap-around into account, see [RFC1982].

   When the router has received all data records from the cache, it sets
   its current serial number to that of the serial number in the End of
   Data PDU.

   When the cache updates its database, it sends a Notify message to
   every currently connected router. This is a hint that now would be a
   good time for the router to poll for an update, but is only a hint.
   The protocol requires the router to poll for updates periodically in
   any case.

   Strictly speaking, a router could track a cache simply by asking for
   a complete data set every time it updates, but this would be very
   inefficient. The serial number based incremental update mechanism
   allows an efficient transfer of just the data records which have
   changed since last update. As with any update protocol based on
   incremental transfers, the router must be prepared to fall back to a
   full transfer if for any reason the cache is unable to provide the
   necessary incremental data. Unlike some incremental transfer
   protocols, this protocol requires the router to make an explicit
   request to start the fallback process; this is deliberate, as the
   cache has no way of knowing whether the router has also established
   sessions with other caches that may be able to provide better
   service.

   As a cache server must evaluate certificates and ROAs which are time
   dependent, servers' clocks MUST be correct to a tolerance of
   approximately an hour.

5. Protocol Data Units (PDUs)

   The exchanges between the cache and the router are sequences of
   exchanges of the following PDUs according to the rules described in
   Section 6.

   Fields with unspecified content MUST be zero on transmission and MAY
   be ignored on receipt.

5.1. Serial Notify

   The cache notifies the router that the cache has new data.

   The Session ID reassures the router that the serial numbers are
   commensurate, i.e. the cache session has not been changed.

   Serial Notify is the only message that the cache can send that is
   not in response to a message from the router.

   0          8          16         24        31
   .-------------------------------------------.
   | Protocol |   PDU    |                     |
   | Version  |   Type   |     Session ID      |
   |    0     |    0     |                     |
   +-------------------------------------------+
   |                                           |
   |                 Length=12                 |
   |                                           |
   +-------------------------------------------+
   |                                           |
   |               Serial Number               |
   |                                           |
   `-------------------------------------------'

5.2. Serial Query

   Serial Query: The router sends Serial Query to ask the cache for all
   payload PDUs which have serial numbers higher than the serial number
   in the Serial Query.

   The cache replies to this query with a Cache Response PDU
   (Section 5.4) if the cache has a, possibly null, record of the
   changes since the serial number specified by the router. If there
   have been no changes since the router last queried, the cache then
   sends an End Of Data PDU.

   If the cache does not have the data needed to update the router,
   perhaps because its records do not go back to the Serial Number in
   the Serial Query, then it responds with a Cache Reset PDU
   (Section 5.8).

   The Session ID tells the cache what instance the router expects to
   ensure that the serial numbers are commensurate, i.e. the cache
   session has not been changed.

   0          8          16         24        31
   .-------------------------------------------.
   | Protocol |   PDU    |                     |
   | Version  |   Type   |     Session ID      |
   |    0     |    1     |                     |
   +-------------------------------------------+
   |                                           |
   |                 Length=12                 |
   |                                           |
   +-------------------------------------------+
   |                                           |
   |               Serial Number               |
   |                                           |
   `-------------------------------------------'

5.3. Reset Query

   Reset Query: The router tells the cache that it wants to receive the
   total active, current, non-withdrawn, database. The cache responds
   with a Cache Response PDU (Section 5.4).

   0          8          16         24        31
   .-------------------------------------------.
   | Protocol |   PDU    |                     |
   | Version  |   Type   |   reserved = zero   |
   |    0     |    2     |                     |
   +-------------------------------------------+
   |                                           |
   |                 Length=8                  |
   |                                           |
   `-------------------------------------------'

5.4. Cache Response

   Cache Response: The cache responds with zero or more payload PDUs.
   When replying to a Serial Query request (Section 5.2), the cache
   sends the set of all data records it has with serial numbers greater
   than that sent by the client router. When replying to a Reset Query,
   the cache sends the set of all data records it has; in this case the
   withdraw/announce field in the payload PDUs MUST have the value 1
   (announce).

   In response to a Reset Query, the new value of the Session ID tells
   the router the instance of the cache session for future confirmation.
   In response to a Serial Query, the Session ID being the same
   reassures the router that the serial numbers are commensurate, i.e.
   the cache session has not changed.

   0          8          16         24        31
   .-------------------------------------------.
   | Protocol |   PDU    |                     |
   | Version  |   Type   |     Session ID      |
   |    0     |    3     |                     |
   +-------------------------------------------+
   |                                           |
   |                 Length=8                  |
   |                                           |
   `-------------------------------------------'

5.5. IPv4 Prefix

   0          8          16         24        31
   .-------------------------------------------.
   | Protocol |   PDU    |                     |
   | Version  |   Type   |   reserved = zero   |
   |    0     |    4     |                     |
   +-------------------------------------------+
   |                                           |
   |                 Length=20                 |
   |                                           |
   +-------------------------------------------+
   |          |  Prefix  |   Max    |          |
   |  Flags   |  Length  |  Length  |   zero   |
   |          |   0..32  |   0..32  |          |
   +-------------------------------------------+
   |                                           |
   |                IPv4 Prefix                |
   |                                           |
   +-------------------------------------------+
   |                                           |
   |         Autonomous System Number          |
   |                                           |
   `-------------------------------------------'

   The lowest order bit of the Flags field is 1 for an announcement and
   0 for a withdrawal.

   In the RPKI, nothing prevents a signing certificate from issuing two
   identical ROAs, and nothing prohibits the existence of two identical
   route: or route6: objects in the IRR. In this case there would be no
   semantic difference between the objects, merely a process redundancy.

   In the RPKI, there is also an actual need for what might appear to a
   router as identical IPvX PDUs. This can occur when an upstream
   certificate is being reissued or there is an address ownership
   transfer up the validation chain. The ROA would be identical in the
   router sense, i.e. have the same {prefix, len, max-len, asn}, but a
   different validation path in the RPKI. This is important to the
   RPKI, but not to the router.

   The cache server MUST ensure that it has told the router client to
   have one and only one IPvX PDU for a unique {prefix, len, max-len,
   asn} at any one point in time. Should the router client receive an
   IPvX PDU with a {prefix, len, max-len, asn} identical to one it
   already has active, it SHOULD raise a Duplicate Announcement Received
   error.

5.6. IPv6 Prefix

   0          8          16         24        31
   .-------------------------------------------.
   | Protocol |   PDU    |                     |
   | Version  |   Type   |   reserved = zero   |
   |    0     |    6     |                     |
   +-------------------------------------------+
   |                                           |
   |                 Length=32                 |
   |                                           |
   +-------------------------------------------+
   |          |  Prefix  |   Max    |          |
   |  Flags   |  Length  |  Length  |   zero   |
   |          |  0..128  |  0..128  |          |
   +-------------------------------------------+
   |                                           |
   +---                                     ---+
   |                                           |
   +---           IPv6 Prefix               ---+
   |                                           |
   +---                                     ---+
   |                                           |
   +-------------------------------------------+
   |                                           |
   |         Autonomous System Number          |
   |                                           |
   `-------------------------------------------'

5.7. End of Data

   End of Data: Cache tells router it has no more data for the request.

   The Session ID MUST be the same as that of the corresponding Cache
   Response which began the, possibly null, sequence of data PDUs.

   0          8          16         24        31
   .-------------------------------------------.
   | Protocol |   PDU    |                     |
   | Version  |   Type   |     Session ID      |
   |    0     |    7     |                     |
   +-------------------------------------------+
   |                                           |
   |                 Length=12                 |
   |                                           |
   +-------------------------------------------+
   |                                           |
   |               Serial Number               |
   |                                           |
   `-------------------------------------------'

5.8. Cache Reset

   The cache may respond to a Serial Query informing the router that the
   cache cannot provide an incremental update starting from the serial
   number specified by the router. The router must decide whether to
   issue a Reset Query or switch to a different cache.

   0          8          16         24        31
   .-------------------------------------------.
   | Protocol |   PDU    |                     |
   | Version  |   Type   |   reserved = zero   |
   |    0     |    8     |                     |
   +-------------------------------------------+
   |                                           |
   |                 Length=8                  |
   |                                           |
   `-------------------------------------------'

5.9. Error Report

   This PDU is used by either party to report an error to the other.

   Error reports are only sent as responses to other PDUs.

   The Error Code is described in Section 10.
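
   The fixed-length PDUs of Sections 5.1 through 5.8 share an eight-byte
   header (version, type, a 16-bit Session ID or reserved field, and a
   32-bit length) in network byte order. The following is a minimal,
   non-normative sketch of how such PDUs might be packed, together with
   a wrap-aware serial number comparison in the style of [RFC1982]. The
   function and constant names are illustrative assumptions and are not
   defined by this document.

```python
import struct

# PDU type codes as shown in the diagrams of Sections 5.1 to 5.8.
SERIAL_NOTIFY, SERIAL_QUERY, RESET_QUERY, CACHE_RESPONSE = 0, 1, 2, 3
IPV4_PREFIX, IPV6_PREFIX, END_OF_DATA, CACHE_RESET = 4, 6, 7, 8
PROTOCOL_VERSION = 0

# Common header: version (8 bits), type (8), Session ID or zero (16), length (32).
HEADER = struct.Struct("!BBHI")


def pack_serial_pdu(pdu_type, session_id, serial):
    """Pack a 12-byte Serial Notify, Serial Query, or End of Data PDU."""
    return HEADER.pack(PROTOCOL_VERSION, pdu_type, session_id, 12) + struct.pack("!I", serial)


def pack_reset_query():
    """Pack the 8-byte Reset Query PDU; the reserved field is zero."""
    return HEADER.pack(PROTOCOL_VERSION, RESET_QUERY, 0, 8)


def pack_ipv4_prefix(announce, prefix_len, max_len, prefix, asn):
    """Pack a 20-byte IPv4 Prefix PDU; `prefix` is the 4 raw address bytes."""
    flags = 1 if announce else 0  # lowest order bit: 1 announce, 0 withdraw
    body = struct.pack("!BBBB4sI", flags, prefix_len, max_len, 0, prefix, asn)
    return HEADER.pack(PROTOCOL_VERSION, IPV4_PREFIX, 0, 20) + body


def parse_header(data):
    """Return (version, pdu_type, session_id_or_zero, length) from the first 8 bytes."""
    return HEADER.unpack_from(data)


def serial_gt(a, b):
    """True if 32-bit serial `a` is 'greater' than `b`, taking wrap-around into account.

    Pairs exactly 2^31 apart are left undefined by RFC 1982; this sketch
    treats them as not greater.
    """
    return a != b and ((a - b) & 0xFFFFFFFF) < 0x80000000
```

   For example, serial_gt(0, 0xFFFFFFFF) is true because the serial
   number has wrapped from 2^32-1 to 0, while serial_gt(5, 5) is false.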

   If the error is not associated with any particular PDU, the Erroneous
   PDU field MUST be empty and the Length of Encapsulated PDU field MUST
   be zero.

   An Error Report PDU MUST NOT be sent for an Error Report PDU. If an
   erroneous Error Report PDU is received, the session SHOULD be
   dropped.

   If the error is associated with a PDU of excessive, or possibly
   corrupt, length, the Erroneous PDU field MAY be truncated.

   The diagnostic text is optional; if it is not present, the Length of
   Error Text field SHOULD be zero. If error text is present, it SHOULD
   be a string in US-ASCII, for maximum portability; if non-US-ASCII
   characters are absolutely required, the error text MUST use UTF-8
   encoding.

   0          8          16         24        31
   .-------------------------------------------.
   | Protocol |   PDU    |                     |
   | Version  |   Type   |     Error Code      |
   |    0     |    10    |                     |
   +-------------------------------------------+
   |                                           |
   |                  Length                   |
   |                                           |
   +-------------------------------------------+
   |                                           |
   |        Length of Encapsulated PDU         |
   |                                           |
   +-------------------------------------------+
   |                                           |
   ~           Copy of Erroneous PDU           ~
   |                                           |
   +-------------------------------------------+
   |                                           |
   |           Length of Error Text            |
   |                                           |
   +-------------------------------------------+
   |                                           |
   |              Arbitrary Text               |
   |                    of                     |
   ~         Error Diagnostic Message          ~
   |                                           |
   `-------------------------------------------'

5.10. Fields of a PDU

   PDUs contain the following data elements:

   Protocol Version: An ordinal, currently 0, denoting the version of
      this protocol.

   PDU Type: An ordinal, denoting the type of the PDU, e.g. IPv4
      Prefix, etc.

   Serial Number: The serial number of the RPKI Cache when this ROA was
      received from the cache's up-stream cache server or gathered from
      the global RPKI. A cache increments its serial number when
      completing a rigorously validated update from a parent cache, for
      example via rcynic.
      See [RFC1982] on DNS Serial Number Arithmetic
      for too much detail on serial number arithmetic.

   Session ID: When a cache server is started, it generates a Session
      ID to identify the instance of the cache and to bind it to the
      sequence of Serial Numbers that cache instance will generate.
      This allows the router to restart a failed session knowing that
      the Serial Number it is using is commensurate with that of the
      cache. If, at any time, either the router or the cache finds the
      value of the session identifiers they hold disagree, they MUST
      completely drop the session and the router MUST flush all data
      learned from that cache.

      Should a cache erroneously reuse a Session ID so that a router
      does not realize that the session has changed (old session ID and
      new session ID have same numeric value), the router may become
      confused as to the content of the cache. The time it takes the
      router to discover it is confused will depend on whether the
      serial numbers are also reused. If the serial numbers in the old
      and new sessions are different enough, the cache will respond to
      the router's Serial Query with a Cache Reset, which will solve the
      problem. If, however, the serial numbers are close, the cache may
      respond with a Cache Response, which may not be enough to bring
      the router into sync. In such cases, it's likely but not certain
      that the router will detect some discrepancy between the state
      that the cache expects and its own state. For example, the Cache
      Response may tell the router to drop a record which the router
      does not hold, or may tell the router to add a record which the
      router already has. In such cases, a router will detect the error
      and reset the session. The one case in which the router may stay
      out of sync is when nothing in the Cache Response contradicts any
      data currently held by the router.

      Using persistent storage for the session identifier or a
      clock-based scheme for generating session identifiers should avoid
      the risk of session identifier collisions.

      The Session ID might be a pseudo-random value, a monotonically
      increasing value if the cache has reliable storage, etc.

   Length: A 32 bit ordinal which has as its value the count of the
      bytes in the entire PDU, including the eight bytes of header which
      end with the length field.

   Flags: The lowest order bit of the Flags field is 1 for an
      announcement and 0 for a withdrawal, indicating whether this PDU
      announces a new right to announce the prefix or withdraws a
      previously announced right. A withdraw effectively deletes one
      previously announced IPvX Prefix PDU with the exact same Prefix,
      Length, Max-Len, and ASN.

   Prefix Length: An ordinal denoting the shortest prefix allowed for
      the prefix.

   Max Length: An ordinal denoting the longest prefix allowed by the
      prefix. This MUST NOT be less than the Prefix Length element.

   Prefix: The IPv4 or IPv6 prefix of the ROA.

   Autonomous System Number: ASN allowed to announce this prefix, a 32
      bit ordinal.

   Zero: Fields shown as zero or reserved MUST be zero. The value of
      such a field MUST be ignored on receipt.

6. Protocol Sequences

   The sequences of PDU transmissions fall into three conversations as
   follows:

6.1. Start or Restart

   Cache                         Router
     ~                             ~
     | <----- Reset Query -------- | R requests data (or Serial Query)
     |                             |
     | ----- Cache Response -----> | C confirms request
     | ------- IPvX Prefix ------> | C sends zero or more
     | ------- IPvX Prefix ------> |   IPv4 and IPv6 Prefix
     | ------- IPvX Prefix ------> |   Payload PDUs
     | ------  End of Data ------> | C sends End of Data
     |                             |   and sends new serial
     ~                             ~

   When a transport session is first established, the router MAY send a
   Reset Query and the cache responds with a data sequence of all data
   it contains.

   Alternatively, if the router has significant unexpired data from a
   broken session with the same cache, it MAY start with a Serial Query
   containing the Session ID from the previous session to ensure the
   serial numbers are commensurate.

   This Reset Query sequence is also used when the router receives a
   Cache Reset, chooses a new cache, or fears that it has otherwise lost
   its way.

   To limit the length of time a cache must keep the data necessary to
   generate incremental updates, a router MUST send either a Serial
   Query or a Reset Query no less frequently than once an hour. This
   also acts as a keep alive at the application layer.

   As the cache MAY keep updates for little more than one hour, the
   router MUST have a polling interval no longer than one hour.

6.2. Typical Exchange

   Cache                         Router
     ~                             ~
     | -------- Notify ----------> | (optional)
     |                             |
     | <----- Serial Query ------- | R requests data
     |                             |
     | ----- Cache Response -----> | C confirms request
     | ------- IPvX Prefix ------> | C sends zero or more
     | ------- IPvX Prefix ------> |   IPv4 and IPv6 Prefix
     | ------- IPvX Prefix ------> |   Payload PDUs
     | ------  End of Data ------> | C sends End of Data
     |                             |   and sends new serial
     ~                             ~

   The cache server SHOULD send a notify PDU with its current serial
   number when the cache's serial changes, with the expectation that the
   router MAY then issue a serial query earlier than it otherwise might.
   This is analogous to DNS NOTIFY in [RFC1996]. The cache MUST rate
   limit Serial Notifies to no more frequently than one per minute.

   When the transport layer is up and either a timer has gone off in the
   router, or the cache has sent a Notify, the router queries for new
   data by sending a Serial Query, and the cache sends all data newer
   than the serial in the Serial Query.

   To limit the length of time a cache must keep old withdraws, a router
   MUST send either a Serial Query or a Reset Query no less frequently
   than once an hour.

6.3. No Incremental Update Available

   Cache                         Router
     ~                             ~
     | <----- Serial Query ------  | R requests data
     | ------- Cache Reset ------> | C cannot supply update
     |                             |   from specified serial
     | <------ Reset Query ------- | R requests new data
     | ----- Cache Response -----> | C confirms request
     | ------- IPvX Prefix ------> | C sends zero or more
     | ------- IPvX Prefix ------> |   IPv4 and IPv6 Prefix
     | ------- IPvX Prefix ------> |   Payload PDUs
     | ------  End of Data ------> | C sends End of Data
     |                             |   and sends new serial
     ~                             ~

   The cache may respond to a Serial Query with a Cache Reset, informing
   the router that the cache cannot supply an incremental update from
   the serial number specified by the router. This might be because the
   cache has lost state, or because the router has waited too long
   between polls and the cache has cleaned up old data that it no longer
   believes it needs, or because the cache has run out of storage space
   and had to expire some old data early. Regardless of how this state
   arose, the cache replies with a Cache Reset to tell the router that
   it cannot honor the request. When a router receives this, the router
   SHOULD attempt to connect to any more preferred caches in its cache
   list. If there are no more preferred caches it MUST issue a Reset
   Query and get an entire new load from the cache.

6.4. Cache has No Data Available

   Cache                         Router
     ~                             ~
     | <----- Serial Query ------  | R requests data
     | ---- Error Report PDU ----> | C No Data Available
     ~                             ~

   Cache                         Router
     ~                             ~
     | <----- Reset Query -------  | R requests data
     | ---- Error Report PDU ----> | C No Data Available
     ~                             ~

   The cache may respond to either a Serial Query or a Reset Query
   informing the router that the cache cannot supply any update at all.
   The most likely cause is that the cache has lost state, perhaps due
   to a restart, and has not yet recovered.
   While it is possible that a
   cache might go into such a state without dropping any of its active
   sessions, a router is more likely to see this behavior when it
   initially connects and issues a Reset Query while the cache is still
   rebuilding its database.

   When a router receives this kind of error, the router SHOULD attempt
   to connect to any other caches in its cache list, in preference
   order. If no other caches are available, the router MUST issue
   periodic Reset Queries until it gets a new usable load from the
   cache.

7. Transport

   The transport layer session between a router and a cache carries the
   binary Protocol Data Units (PDUs) in a persistent session.

   To prevent cache spoofing and DoS attacks by illegitimate routers, it
   is highly desirable that the router and the cache are authenticated
   to each other. Integrity protection for payloads is also desirable
   to protect against monkey in the middle (MITM) attacks.
   Unfortunately, there is no protocol to do so on all currently used
   platforms. Therefore, as of this document, there is no mandatory to
   implement transport which provides authentication and integrity
   protection.

   To reduce exposure to dropped but non-terminated sessions, both
   caches and routers SHOULD enable keep alives when available in the
   chosen transport protocol.

   It is expected that, when TCP-AO [RFC5925] is available on all
   platforms deployed by operators, it will become the mandatory to
   implement transport.

   Caches and routers MUST implement unprotected transport over TCP
   using a port, rpki-rtr, to be assigned, see Section 12. Operators
   SHOULD use procedural means, e.g. access control lists (ACLs), ... to
   reduce the exposure to authentication issues.

   Caches and routers SHOULD use TCP-AO, SSH, TCP MD5, or IPsec
   transport.
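
   As a non-normative illustration of how a router might drive the
   Reset Query sequence of Section 6.1 over an already connected stream
   transport, the sketch below reads whole PDUs using the length field
   and collects payload PDUs until End of Data. All function names are
   assumptions made for this example. The rpki-rtr port is not given
   because it is still to be assigned, and Error Report handling is
   reduced to raising an exception.

```python
import socket
import struct

HEADER = struct.Struct("!BBHI")  # version, type, Session ID or zero, length
RESET_QUERY, CACHE_RESPONSE, END_OF_DATA = 2, 3, 7


def enable_keepalive(sock):
    """Turn on transport keep alives, as the text above recommends (TCP sockets)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes the session."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the session mid-PDU")
        buf.extend(chunk)
    return bytes(buf)


def read_pdu(sock):
    """Return (pdu_type, session_or_reserved, body) for the next PDU."""
    _version, pdu_type, field, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    if length < HEADER.size:
        raise ValueError("PDU length is shorter than the header")
    return pdu_type, field, recv_exact(sock, length - HEADER.size)


def full_reload(sock):
    """Send a Reset Query and return (session_id, serial, payload_pdus)."""
    sock.sendall(HEADER.pack(0, RESET_QUERY, 0, 8))
    pdu_type, session_id, _ = read_pdu(sock)
    if pdu_type != CACHE_RESPONSE:
        # For example an Error Report or another unexpected PDU.
        raise RuntimeError(f"expected Cache Response, got PDU type {pdu_type}")
    payloads = []
    while True:
        pdu_type, _field, body = read_pdu(sock)
        if pdu_type == END_OF_DATA:
            (serial,) = struct.unpack("!I", body)
            return session_id, serial, payloads
        payloads.append((pdu_type, body))
```

   A router following this sketch would store the returned Session ID
   and serial number and use them in its next Serial Query.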
   If unprotected TCP is the transport, the cache and routers MUST be on
   the same trusted and controlled network.

   If available to the operator, caches and routers MUST use one of the
   following more protected protocols.

   Caches and routers SHOULD use TCP-AO transport [RFC5925] over the
   rpki-rtr port.

   Caches and routers MAY use SSH transport [RFC4252] using the normal
   SSH port. For an example, see Section 7.1.

   Caches and routers MAY use TCP MD5 transport [RFC2385] using the
   rpki-rtr port. Note that TCP MD5 has been obsoleted by TCP-AO
   [RFC5925].

   Caches and routers MAY use IPsec transport [RFC4301] using the
   rpki-rtr port.

   Caches and routers MAY use TLS transport [RFC5246] using a port,
   rpki-rtr-tls, to be assigned; see Section 12.

7.1. SSH Transport

   To run over SSH, the client router first establishes an SSH transport
   connection using the SSH transport protocol, and the client and
   server exchange keys for message integrity and encryption. The
   client then invokes the "ssh-userauth" service to authenticate the
   application, as described in the SSH authentication protocol RFC 4252
   [RFC4252]. Once the application has been successfully authenticated,
   the client invokes the "ssh-connection" service, also known as the
   SSH connection protocol.

   After the ssh-connection service is established, the client opens a
   channel of type "session", which results in an SSH session.

   Once the SSH session has been established, the application invokes
   the application transport as an SSH subsystem called "rpki-rtr".
   Subsystem support is a feature of SSH version 2 (SSHv2) and is not
   included in SSHv1. Running this protocol as an SSH subsystem avoids
   the need for the application to recognize shell prompts or skip over
   extraneous information, such as a system message that is sent at
   shell start-up.
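
   The subsystem invocation described above might be sketched as
   follows. This is only an illustration: it assumes an OpenSSH
   command-line client is installed on the router side (its -s option
   requests a subsystem instead of a shell), and the function names,
   host name, and user name are hypothetical.

```python
import subprocess
from typing import List


def rpki_rtr_ssh_command(host: str, user: str, port: int = 22) -> List[str]:
    """Build an OpenSSH client command that requests the rpki-rtr subsystem."""
    # -s asks the client to invoke the named subsystem rather than a shell,
    # so no shell prompt or start-up banner precedes the PDU stream.
    return ["ssh", "-p", str(port), "-l", user, "-s", host, "rpki-rtr"]


def open_rpki_rtr_session(host: str, user: str, port: int = 22) -> subprocess.Popen:
    """Start the SSH client; PDUs are then written to stdin and read from stdout."""
    return subprocess.Popen(
        rpki_rtr_ssh_command(host, user, port),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
```

   For example, rpki_rtr_ssh_command("cache.example.net", "rtr-client")
   yields a command that connects to the normal SSH port and requests
   the "rpki-rtr" subsystem.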
   It is assumed that the router and cache have exchanged keys out of
   band by some reasonably secured means.

   Cache servers supporting SSH transport MUST accept RSA and DSA
   authentication, and SHOULD accept ECDSA authentication. User
   authentication MUST be supported; host authentication MAY be
   supported. Implementations MAY support password authentication.
   Client routers SHOULD verify the public key of the cache, to avoid
   monkey in the middle attacks.

7.2. TLS Transport

   Client routers using TLS transport MUST use client-side certificates
   for authentication. While in principle any type of certificate and
   certificate authority may be used, cache operators will generally
   wish to create their own small-scale CA and issue certificates to
   each authorized router. This simplifies credential roll-over; any
   unrevoked, unexpired certificate from the proper CA may be used. If
   such certificates are used, the CN field [RFC5280] MUST be used to
   denote the router's identity.

   Clients SHOULD verify the cache's certificate as well, to avoid
   monkey in the middle attacks.

7.3. TCP MD5 Transport

   If TCP-MD5 is used, implementations MUST support key lengths of at
   least 80 printable ASCII bytes, per section 4.5 of [RFC2385].
   Implementations MUST also support hexadecimal sequences of at least
   32 characters, i.e., 128 bits.

   Key rollover with TCP-MD5 is problematic. Cache servers SHOULD
   support [RFC4808].

7.4. TCP-AO Transport

   Implementations MUST support key lengths of at least 80 printable
   ASCII bytes. Implementations MUST also support hexadecimal sequences
   of at least 32 characters, i.e., 128 bits. MAC lengths of at least
   96 bits MUST be supported, per section 5.3 of [RFC2385].

   The cryptographic algorithms and associated parameters described in
   [RFC5926] MUST be supported.

8. Router-Cache Set-Up

   A cache has the public authentication data for each router it is
   configured to support.

   A router may be configured to peer with a selection of caches, and a
   cache may be configured to support a selection of routers. Each must
   have the name of, and authentication data for, each peer. In
   addition, in a router, this list has a non-unique preference value
   for each server in order of preference. This preference merely
   denotes proximity, not trust, preferred belief, etc. The client
   router attempts to establish a session with each potential serving
   cache in preference order, and then starts to load data from the most
   preferred cache to which it can connect and authenticate. The
   router's list of caches has the following elements:

   Preference: An ordinal denoting the router's preference to connect
      to that cache, the lower the value the more preferred.

   Name: The IP Address or fully qualified domain name of the cache.

   Key: Any needed public key of the cache.

   MyKey: Any needed private key or certificate of this client.

   Due to the distributed nature of the RPKI, caches simply can not be
   rigorously synchronous. A client may hold data from multiple caches,
   but MUST keep the data marked as to source, as later updates MUST
   affect the correct data.

   Just as there may be more than one covering ROA from a single cache,
   there may be multiple covering ROAs from multiple caches. The
   results are as described in [I-D.ietf-sidr-pfx-validate].

   If data from multiple caches are held, implementations MUST NOT
   distinguish between data sources when performing validation.

   When a more preferred cache becomes available, if resources allow, it
   would be prudent for the client to start fetching from that cache.
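
   The cache list and the preference-ordered selection described in
   this section might be represented as in the sketch below. The type
   and function names are hypothetical, and the can_connect callback
   stands in for whatever connect-and-authenticate step an
   implementation actually performs.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class CacheEntry:
    preference: int               # lower value means more preferred
    name: str                     # IP address or fully qualified domain name
    key: Optional[str] = None     # any needed public key of the cache
    my_key: Optional[str] = None  # any needed private key or certificate


def pick_cache(
    caches: Iterable[CacheEntry],
    can_connect: Callable[[CacheEntry], bool],
) -> Optional[CacheEntry]:
    """Return the most preferred cache that can be connected to, if any."""
    # sorted() is stable, so entries sharing a (non-unique) preference
    # value are tried in their configured order.
    for entry in sorted(caches, key=lambda c: c.preference):
        if can_connect(entry):
            return entry
    return None
```

   Because preference denotes proximity rather than trust, this sketch
   uses it only to order connection attempts and does not let it
   influence how the resulting data are validated.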
   The client SHOULD attempt to maintain at least one set of data,
   regardless of whether it has chosen a different cache or established
   a new connection to the previous cache.

   A client MAY drop the data from a particular cache when it is fully
   in synch with one or more other caches.

   A client SHOULD delete the data from a cache when it has been unable
   to refresh from that cache for a configurable timer value. The
   default for that value is twice the polling period for that cache.

   If a client loses connectivity to a cache it is using, or otherwise
   decides to switch to a new cache, it SHOULD retain the data from the
   previous cache until it has a full set of data from one or more other
   caches. Note that this may already be true at the point of
   connection loss if the client has connections to more than one cache.

9. Deployment Scenarios

   For illustration, we present three likely deployment scenarios.

   Small End Site: The small multi-homed end site may wish to outsource
      the RPKI cache to one or more of their upstream ISPs. They would
      exchange authentication material with the ISP using some out of
      band mechanism, and their router(s) would connect to one or more
      up-streams' caches. The ISPs would likely deploy caches intended
      for customer use separately from the caches with which their own
      BGP speakers peer.

   Large End Site: A larger multi-homed end site might run one or more
      caches, arranging them in a hierarchy of client caches, each
      fetching from a serving cache which is closer to the global RPKI.
      They might configure fall-back peerings to up-stream ISP caches.

   ISP Backbone: A large ISP would likely have one or more redundant
      caches in each major PoP, and these caches would fetch from each
      other in an ISP-dependent topology so as not to place undue load
      on the global RPKI publication infrastructure.
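
   Where caches fetch from other caches, as in the Large End Site and
   ISP Backbone scenarios, an operator may want to confirm that the
   configured fetch relationships contain no loops. The following is a
   minimal sketch of such a check using a depth-first search; the
   function name, the input mapping, and the cache names are
   hypothetical and are not part of this protocol.

```python
from typing import Dict, List


def has_fetch_loop(fetches_from: Dict[str, List[str]]) -> bool:
    """Return True if the cache fetch topology contains a cycle.

    fetches_from maps each cache name to the caches it fetches from.
    """
    IN_PROGRESS, DONE = 1, 2
    state: Dict[str, int] = {}

    def visit(node: str) -> bool:
        state[node] = IN_PROGRESS
        for upstream in fetches_from.get(node, []):
            seen = state.get(upstream)
            if seen == IN_PROGRESS:
                return True  # walked back onto the current path: a loop
            if seen is None and visit(upstream):
                return True
        state[node] = DONE
        return False

    return any(node not in state and visit(node) for node in fetches_from)
```

   A simple hierarchy such as {"pop-a": ["core"], "pop-b": ["core"],
   "core": []} passes the check, while two caches configured to fetch
   from each other do not.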
   Experience with large DNS cache deployments has shown that complex
   topologies are ill-advised as it is easy to make errors in the graph,
   e.g. not maintaining a loop-free condition.

   Of course, these are illustrations and there are other possible
   deployment strategies. It is expected that minimizing load on the
   global RPKI servers will be a major consideration.

   To keep load on global RPKI services from unnecessary peaks, it is
   recommended that primary caches which load from the distributed
   global RPKI not all do so at the same time, e.g. on the hour.
   Choose a random time, perhaps the ISP's AS number modulo 60, and
   jitter the inter-fetch timing.

10. Error Codes

   This section contains a preliminary list of error codes. The authors
   expect additions to this section during development of the initial
   implementations. Errors which are considered fatal SHOULD cause the
   session to be dropped.

   0: Corrupt Data (fatal): The receiver believes the received PDU to
      be corrupt in a manner not specified by other error codes.

   1: Internal Error (fatal): The party reporting the error experienced
      some kind of internal error unrelated to protocol operation (ran
      out of memory, a coding assertion failed, et cetera).

   2: No Data Available: The cache believes itself to be in good
      working order, but is unable to answer either a Serial Query or a
      Reset Query because it has no useful data available at this time.
      This is likely to be a temporary error, and most likely indicates
      that the cache has not yet completed pulling down an initial
      current data set from the global RPKI system after some kind of
      event that invalidated whatever data it might have previously held
      (reboot, network partition, et cetera).

   3: Invalid Request (fatal): The cache server believes the client's
      request to be invalid.

   4: Unsupported Protocol Version (fatal): The Protocol Version is not
      known by the receiver of the PDU.

   5: Unsupported PDU Type (fatal): The PDU Type is not known by the
      receiver of the PDU.

   6: Withdrawal of Unknown Record (fatal): The received PDU has Flag=0
      but a record for the Prefix/PrefixLength/MaxLength triple does not
      exist in the receiver's database.

   7: Duplicate Announcement Received (fatal): The received PDU has an
      identical {prefix, len, max-len, asn} tuple as a PDU which is
      still active in the router.

11. Security Considerations

   As this document describes a security protocol, many aspects of
   security interest are described in the relevant sections. This
   section points out issues which may not be obvious in other sections.

   Cache Validation: In order for a collection of caches as described
      in Section 9 to guarantee a consistent view, they need to be given
      consistent trust anchors to use in their internal validation
      process. Distribution of a consistent trust anchor is assumed to
      be out of band.

   Cache Peer Identification: The router initiates a transport session
      to a cache, which it identifies by either IP address or fully
      qualified domain name. Be aware that a DNS or address spoofing
      attack could make the correct cache unreachable. No session would
      be established, as the authorization keys would not match.

   Transport Security: The RPKI relies on object, not server or
      transport, trust. I.e. the IANA root trust anchor is distributed
      to all caches through some out of band means, and can then be used
      by each cache to validate certificates and ROAs all the way down
      the tree. The inter-cache relationships are based on this object
      security model, hence the inter-cache transport can be lightly
      protected.

      But this protocol document assumes that the routers can not do the
      validation cryptography.
      Hence the last link, from cache to router, is secured by server
      authentication and transport level security. This is dangerous,
      as server authentication and transport have very different threat
      models than object security.

      So the strength of the trust relationship and the transport
      between the router(s) and the cache(s) are critical. You're
      betting your routing on this.

      While we can not say the cache must be on the same LAN, if only
      due to the issue of an enterprise wanting to off-load the cache
      task to their upstream ISP(s), locality, trust, and control are
      very critical issues here. The cache(s) really SHOULD be as
      close, in the sense of controlled and protected (against DDoS,
      MITM) transport, to the router(s) as possible. It also SHOULD be
      topologically close so that a minimum of validated routing data
      are needed to bootstrap a router's access to a cache.

      The identity of the cache server SHOULD be verified and
      authenticated by the router client, and vice versa, before any
      data are exchanged.

      Transports which can not provide the necessary authentication and
      integrity (see Section 7) must rely on network design and
      operational controls to provide protection against spoofing/
      corruption attacks. As pointed out in Section 7, TCP-AO is the
      long term plan. Protocols which provide integrity and
      authenticity SHOULD be used, and if they can not, i.e. TCP is
      used as the transport, the router and cache MUST be on the same
      trusted, controlled network.

12. IANA Considerations

   This document requests the IANA to assign 'well known' TCP Port
   Numbers to the RPKI-Router Protocol for the following, see Section 7:

      rpki-rtr
      rpki-rtr-tls

   This document requests the IANA to create a registry for tuples of
   Protocol Version / PDU Type, each of which may range from 0 to 255.
   The name of the registry should be rpki-rtr-pdu.
   The policy for adding to the registry is RFC Required per [RFC5226],
   either standards track or experimental. The initial entries should
   be as follows:

           Protocol
           Version    PDU Type
           --------   -------------------
              0         0 - Serial Notify
              0         1 - Serial Query
              0         2 - Reset Query
              0         3 - Cache Response
              0         4 - IPv4 Prefix
              0         6 - IPv6 Prefix
              0         7 - End of Data
              0         8 - Cache Reset
              0        10 - Error Report
              0       255 - Reserved

   This document requests the IANA to create a registry for Error Codes
   0 to 255. The name of the registry should be rpki-rtr-error. The
   policy for adding to the registry is Expert Review per [RFC5226],
   where the responsible IESG area director should appoint the Expert
   Reviewer. The initial entries should be as follows:

             0 - Corrupt Data
             1 - Internal Error
             2 - No Data Available
             3 - Invalid Request
             4 - Unsupported Protocol Version
             5 - Unsupported PDU Type
             6 - Withdrawal of Unknown Record
             7 - Duplicate Announcement Received
           255 - Reserved

   This document requests the IANA to add an SSH Connection Protocol
   Subsystem Name, as defined in [RFC4250], of 'rpki-rtr'.

13. Acknowledgments

   The authors wish to thank Steve Bellovin, Rex Fernando, Paul Hoffman,
   Russ Housley, Pradosh Mohapatra, Keyur Patel, Sandy Murphy, Robert
   Raszuk, John Scudder, Ruediger Volk, and David Ward. Particular
   thanks go to Hannes Gredler for showing us the dangers of unnecessary
   fields.

14. References

14.1. Normative References

   [I-D.ietf-sidr-pfx-validate]
              Mohapatra, P., Scudder, J., Ward, D., Bush, R., and R.
              Austein, "BGP Prefix Origin Validation",
              draft-ietf-sidr-pfx-validate-03 (work in progress),
              October 2011.

   [RFC1982]  Elz, R. and R. Bush, "Serial Number Arithmetic", RFC 1982,
              August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2385]  Heffernan, A., "Protection of BGP Sessions via the TCP MD5
              Signature Option", RFC 2385, August 1998.

   [RFC4250]  Lehtinen, S. and C. Lonvick, "The Secure Shell (SSH)
              Protocol Assigned Numbers", RFC 4250, January 2006.

   [RFC4252]  Ylonen, T. and C. Lonvick, "The Secure Shell (SSH)
              Authentication Protocol", RFC 4252, January 2006.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
              Internet Protocol", RFC 4301, December 2005.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

   [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", RFC 5246, August 2008.

   [RFC5280]  Cooper, D., Santesson, S., Farrell, S., Boeyen, S.,
              Housley, R., and W. Polk, "Internet X.509 Public Key
              Infrastructure Certificate and Certificate Revocation List
              (CRL) Profile", RFC 5280, May 2008.

   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", RFC 5925, June 2010.

   [RFC5926]  Lebovitz, G. and E. Rescorla, "Cryptographic Algorithms
              for the TCP Authentication Option (TCP-AO)", RFC 5926,
              June 2010.

14.2. Informative References

   [I-D.ietf-sidr-arch]
              Lepinski, M. and S. Kent, "An Infrastructure to Support
              Secure Internet Routing", draft-ietf-sidr-arch-13 (work in
              progress), May 2011.

   [I-D.ietf-sidr-repos-struct]
              Huston, G., Loomans, R., and G. Michaelson, "A Profile for
              Resource Certificate Repository Structure",
              draft-ietf-sidr-repos-struct-09 (work in progress),
              July 2011.

   [I-D.ymbk-rpki-rtr-impl]
              Bush, R., Austein, R., Patel, K., and H. Gredler, "RPKI
              Router Implementation Report", draft-ymbk-rpki-rtr-impl-00
              (work in progress), January 2012.

   [RFC1996]  Vixie, P., "A Mechanism for Prompt Notification of Zone
              Changes (DNS NOTIFY)", RFC 1996, August 1996.

   [RFC4808]  Bellovin, S., "Key Change Strategies for TCP-MD5",
              RFC 4808, March 2007.

   [RFC5781]  Weiler, S., Ward, D., and R. Housley, "The rsync URI
              Scheme", RFC 5781, February 2010.

Authors' Addresses

   Randy Bush
   Internet Initiative Japan
   5147 Crystal Springs
   Bainbridge Island, Washington 98110
   US

   Phone: +1 206 780 0431 x1
   Email: randy@psg.com

   Rob Austein
   Dragon Research Labs

   Email: sra@hactrn.net