idnits 2.17.1

draft-ietf-sidr-rpki-rtr-26.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([I-D.ietf-sidr-arch]), which
     it shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (February 3, 2012) is 4458 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC5280' is mentioned on line 815, but not defined

  == Outdated reference: A later version (-10) exists of
     draft-ietf-sidr-pfx-validate-03

  ** Obsolete normative reference: RFC 2385 (Obsoleted by RFC 5925)

  ** Downref: Normative reference to an Informational RFC: RFC 3269

  ** Obsolete normative reference: RFC 5226 (Obsoleted by RFC 8126)

  ** Obsolete normative reference: RFC 5246 (Obsoleted by RFC 8446)

  ** Obsolete normative reference: RFC 6125 (Obsoleted by RFC 9525)

     Summary: 6 errors (**), 0 flaws (~~), 3 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------
2    Network Working Group                                            R. Bush
3    Internet-Draft                                 Internet Initiative Japan
4    Intended status: Standards Track                              R. Austein
5    Expires: August 6, 2012                             Dragon Research Labs
6                                                            February 3, 2012

8                            The RPKI/Router Protocol
9                          draft-ietf-sidr-rpki-rtr-26

11   Abstract

13      In order to verifiably validate the origin ASs of BGP announcements,
14      routers need a simple but reliable mechanism to receive RPKI
15      [I-D.ietf-sidr-arch] prefix origin data from a trusted cache. This
16      document describes a protocol to deliver validated prefix origin data
17      to routers.

19   Requirements Language

21      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
22      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
23      document are to be interpreted as described in [RFC2119].

25   Status of this Memo

27      This Internet-Draft is submitted in full conformance with the
28      provisions of BCP 78 and BCP 79.

30      Internet-Drafts are working documents of the Internet Engineering
31      Task Force (IETF). Note that other groups may also distribute
32      working documents as Internet-Drafts. The list of current Internet-
33      Drafts is at http://datatracker.ietf.org/drafts/current/.

35      Internet-Drafts are draft documents valid for a maximum of six months
36      and may be updated, replaced, or obsoleted by other documents at any
37      time. It is inappropriate to use Internet-Drafts as reference
38      material or to cite them other than as "work in progress."

40      This Internet-Draft will expire on August 6, 2012.

42   Copyright Notice

44      Copyright (c) 2012 IETF Trust and the persons identified as the
45      document authors. All rights reserved.

47      This document is subject to BCP 78 and the IETF Trust's Legal
48      Provisions Relating to IETF Documents
49      (http://trustee.ietf.org/license-info) in effect on the date of
50      publication of this document. Please review these documents
51      carefully, as they describe your rights and restrictions with respect
52      to this document. Code Components extracted from this document must
53      include Simplified BSD License text as described in Section 4.e of
54      the Trust Legal Provisions and are provided without warranty as
55      described in the Simplified BSD License.

57   Table of Contents

59      1.  Introduction
60      2.  Glossary
61      3.  Deployment Structure
62      4.  Operational Overview
63      5.  Protocol Data Units (PDUs)
64        5.1.  Fields of a PDU
65        5.2.  Serial Notify
66        5.3.  Serial Query
67        5.4.  Reset Query
68        5.5.  Cache Response
69        5.6.  IPv4 Prefix
70        5.7.  IPv6 Prefix
71        5.8.  End of Data
72        5.9.  Cache Reset
73        5.10. Error Report
74      6.  Protocol Sequences
75        6.1.  Start or Restart
76        6.2.  Typical Exchange
77        6.3.  No Incremental Update Available
78        6.4.  Cache has No Data Available
79      7.  Transport
80        7.1.  SSH Transport
81        7.2.  TLS Transport
82        7.3.  TCP MD5 Transport
83        7.4.  TCP-AO Transport
84      8.  Router-Cache Set-Up
85      9.  Deployment Scenarios
86      10. Error Codes
87      11. Security Considerations
88      12. IANA Considerations
89      13. Acknowledgments
90      14. References
91        14.1. Normative References
92        14.2. Informative References
93      Authors' Addresses

95   1.  Introduction

97      In order to verifiably validate the origin ASs of BGP announcements,
98      routers need a simple but reliable mechanism to receive RPKI
99      (Resource Public Key Infrastructure) [I-D.ietf-sidr-arch]
100     cryptographically validated prefix origin data from a trusted cache.
101     This document describes a protocol to deliver validated prefix origin
102     data to routers. The design is intentionally constrained to be
103     usable on much of the current generation of ISP router platforms.

105     Section 3 describes the deployment structure and Section 4 then
106     presents an operational overview. The binary payloads of the
107     protocol are formally described in Section 5, and the expected PDU
108     sequences are described in Section 6. The transport protocol options
109     are described in Section 7. Section 8 details how routers and caches
110     are configured to connect and authenticate. Section 9 describes
111     likely deployment scenarios. The traditional security and IANA
112     considerations end the document.
114     The protocol is extensible to support new PDUs with new semantics
115     when and as needed, as indicated by deployment experience. PDUs are
116     versioned should deployment experience call for change.

118     For an implementation (not inter-op) report, see
119     [I-D.ymbk-rpki-rtr-impl].

121  2.  Glossary

123     The following terms are used with special meaning:

125     Global RPKI:  The authoritative data of the RPKI are published in a
126        distributed set of servers at the IANA, RIRs, NIRs, and ISPs; see
127        [I-D.ietf-sidr-repos-struct].

129     Cache:  A coalesced copy of the RPKI which is periodically fetched/
130        refreshed directly or indirectly from the global RPKI using the
131        rsync [RFC5781] protocol/tools. Relying party software is used to
132        gather and validate the distributed data of the RPKI into a cache.
133        Trusting this cache further is a matter between the provider of
134        the cache and a relying party.

136     Serial Number:  A 32-bit strictly increasing unsigned integer which
137        wraps from 2^32-1 to 0. It denotes the logical version of a
138        cache. A cache increments the value when it successfully updates
139        its data from a parent cache or from primary RPKI data. As a
140        cache is receiving, new incoming data and implicit deletes are
141        associated with the new serial but MUST NOT be sent until the
142        fetch is complete. A serial number is not commensurate between
143        caches, nor need it be maintained across resets of the cache
144        server. See [RFC1982] on DNS Serial Number Arithmetic for too
145        much detail on serial number arithmetic.

147     Session ID:  When a cache server is started, it generates a session
148        identifier to uniquely identify the instance of the cache and to
149        bind it to the sequence of Serial Numbers that cache instance will
150        generate. This allows the router to restart a failed session
151        knowing that the Serial Number it is using is commensurate with
152        that of the cache.

154  3.  Deployment Structure

156     Deployment of the RPKI to reach routers has a three level structure
157     as follows:

159     Global RPKI:  The authoritative data of the RPKI are published in a
160        distributed set of servers, RPKI publication repositories, e.g.
161        the IANA, RIRs, NIRs, and ISPs; see [I-D.ietf-sidr-repos-struct].

163     Local Caches:  A local set of one or more collected and verified
164        caches. A relying party, e.g. router or other client, MUST have a
165        trust relationship with, and a trusted transport channel to, any
166        authoritative cache(s) it uses.

168     Routers:  A router fetches data from a local cache using the protocol
169        described in this document. It is said to be a client of the
170        cache. There MAY be mechanisms for the router to assure itself of
171        the authenticity of the cache and to authenticate itself to the
172        cache.

174  4.  Operational Overview

176     A router establishes and keeps open a connection to one or more
177     caches with which it has client/server relationships. It is
178     configured with a semi-ordered list of caches, and establishes a
179     connection to the most preferred cache, or set of caches, which
180     accept the connections.

182     The router MUST choose the most preferred, by configuration, cache or
183     set of caches so that the operator may control load on their caches
184     and the Global RPKI.

186     Periodically, the router sends to the cache the serial number of the
187     highest numbered data it has received from that cache, i.e. the
188     router's current serial number. When a router establishes a new
189     connection to a cache, or wishes to reset a current relationship, it
190     sends a Reset Query.

192     The cache responds with all data records which have serial numbers
193     greater than that in the router's query. This may be the null set,
194     in which case the End of Data PDU is still sent. Note that 'greater'
195     must take wrap-around into account; see [RFC1982].
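The wrap-around comparison mentioned above can be sketched as follows. This is a minimal illustration of the RFC 1982 comparison rule for 32-bit serial numbers, not code taken from this document; the function names serial_gt and serial_next are hypothetical. RFC 1982 leaves the comparison undefined when two serials differ by exactly 2^31, and this sketch assumes that treating such a pair as "not greater" in both directions is acceptable.

```python
SERIAL_BITS = 32
MOD = 1 << SERIAL_BITS          # serial numbers live in [0, 2^32 - 1]
HALF = 1 << (SERIAL_BITS - 1)   # 2^31, the RFC 1982 comparison window


def serial_gt(a: int, b: int) -> bool:
    """Return True if serial a is 'greater' than serial b, with wrap-around.

    a is greater than b when the forward distance from b to a, taken
    modulo 2^32, is non-zero and smaller than 2^31.  A distance of
    exactly 2^31 is undefined in RFC 1982; this sketch returns False.
    """
    distance = (a - b) % MOD
    return distance != 0 and distance < HALF


def serial_next(s: int) -> int:
    """Return the serial that follows s, wrapping from 2^32 - 1 to 0."""
    return (s + 1) % MOD


if __name__ == "__main__":
    # A serial of 0 that follows 2^32 - 1 is still the newer one.
    print(serial_gt(0, MOD - 1))
    print(serial_gt(MOD - 1, 0))
```

A cache built this way would answer a Serial Query by selecting the records whose serial satisfies serial_gt(record_serial, query_serial).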
197     When the router has received all data records from the cache, it sets
198     its current serial number to that of the serial number in the End of
199     Data PDU.

201     When the cache updates its database, it sends a Notify message to
202     every currently connected router. This is a hint that now would be a
203     good time for the router to poll for an update, but is only a hint.
204     The protocol requires the router to poll for updates periodically in
205     any case.

207     Strictly speaking, a router could track a cache simply by asking for
208     a complete data set every time it updates, but this would be very
209     inefficient. The serial number based incremental update mechanism
210     allows an efficient transfer of just the data records which have
211     changed since last update. As with any update protocol based on
212     incremental transfers, the router must be prepared to fall back to a
213     full transfer if for any reason the cache is unable to provide the
214     necessary incremental data. Unlike some incremental transfer
215     protocols, this protocol requires the router to make an explicit
216     request to start the fallback process; this is deliberate, as the
217     cache has no way of knowing whether the router has also established
218     sessions with other caches that may be able to provide better
219     service.

221     As a cache server must evaluate certificates and ROAs (Route Origin
222     Attestations, see [I-D.ietf-sidr-arch]) which are time dependent,
223     servers' clocks MUST be correct to a tolerance of approximately an
224     hour.

226  5.  Protocol Data Units (PDUs)

228     The exchanges between the cache and the router are sequences of
229     exchanges of the following PDUs according to the rules described in
230     Section 6.

232     Fields with unspecified content MUST be zero on transmission and MAY
233     be ignored on receipt.

235  5.1.  Fields of a PDU

237     PDUs contain the following data elements:

239     Protocol Version:  An eight-bit unsigned integer, currently 0,
240        denoting the version of this protocol.

242     PDU Type:  An eight-bit unsigned integer, denoting the type of the
243        PDU, e.g. IPv4 Prefix, etc.

245     Serial Number:  The serial number of the RPKI Cache when this set of
246        PDUs was received from an up-stream cache server or gathered from
247        the global RPKI. A cache increments its serial number when
248        completing a rigorously validated update from a parent cache or
249        the Global RPKI.

251     Session ID:  When a cache server is started, it generates a Session
252        ID to identify the instance of the cache and to bind it to the
253        sequence of Serial Numbers that cache instance will generate.
254        This allows the router to restart a failed session knowing that
255        the Serial Number it is using is commensurate with that of the
256        cache. If, at any time, either the router or the cache finds that
257        the session identifiers they hold disagree, they MUST
258        completely drop the session and the router MUST flush all data
259        learned from that cache.

261        Should a cache erroneously reuse a Session ID so that a router
262        does not realize that the session has changed (old session ID and
263        new session ID have same numeric value), the router may become
264        confused as to the content of the cache. The time it takes the
265        router to discover it is confused will depend on whether the
266        serial numbers are also reused. If the serial numbers in the old
267        and new sessions are different enough, the cache will respond to
268        the router's Serial Query with a Cache Reset, which will solve the
269        problem. If, however, the serial numbers are close, the cache may
270        respond with a Cache Response, which may not be enough to bring
271        the router into sync. In such cases, it's likely but not certain
272        that the router will detect some discrepancy between the state
273        that the cache expects and its own state. For example, the Cache
274        Response may tell the router to drop a record which the router
275        does not hold, or may tell the router to add a record which the
276        router already has. In such cases, a router will detect the error
277        and reset the session. The one case in which the router may stay
278        out of sync is when nothing in the Cache Response contradicts any
279        data currently held by the router.

281        Using persistent storage for the session identifier or a clock-
282        based scheme for generating session identifiers should avoid the
283        risk of session identifier collisions.

285        The Session ID might be a pseudo-random value, a strictly increasing
286        value if the cache has reliable storage, etc.

288     Length:  A 32-bit unsigned integer which has as its value the count
289        of the bytes in the entire PDU, including the eight bytes of
290        header which end with the length field.

292     Flags:  The lowest order bit of the Flags field is 1 for an
293        announcement and 0 for a withdrawal, i.e., whether this PDU
294        announces a new right to announce the prefix or withdraws a
295        previously announced right. A withdrawal effectively deletes one
296        previously announced IPvX Prefix PDU with the exact same Prefix,
297        Length, Max-Len, and ASN.

299     Prefix Length:  An eight-bit unsigned integer denoting the shortest
300        prefix allowed for the prefix.

302     Max Length:  An eight-bit unsigned integer denoting the longest
303        prefix allowed by the prefix. This MUST NOT be less than the
304        Prefix Length element.

306     Prefix:  The IPv4 or IPv6 prefix of the ROA.

308     Autonomous System Number:  ASN allowed to announce this prefix, a 32-
309        bit unsigned integer.

311     Zero:  Fields shown as zero or reserved MUST be zero. The value of
312        such a field MUST be ignored on receipt.

314  5.2.  Serial Notify

316     The cache notifies the router that the cache has new data.

318     The Session ID reassures the router that the serial numbers are
319     commensurate, i.e. the cache session has not been changed.

321     Serial Notify is the only message that the cache can send that is not
322     in response to a message from the router.

324     0          8          16         24        31
325     .-------------------------------------------.
326     | Protocol |   PDU    |                     |
327     | Version  |   Type   |     Session ID      |
328     |    0     |    0     |                     |
329     +-------------------------------------------+
330     |                                           |
331     |                 Length=12                 |
332     |                                           |
333     +-------------------------------------------+
334     |                                           |
335     |               Serial Number               |
336     |                                           |
337     `-------------------------------------------'

339  5.3.  Serial Query

341     Serial Query: The router sends Serial Query to ask the cache for all
342     payload PDUs which have serial numbers higher than the serial number
343     in the Serial Query.

345     The cache replies to this query with a Cache Response PDU
346     (Section 5.5) if the cache has a, possibly null, record of the
347     changes since the serial number specified by the router. If there
348     have been no changes since the router last queried, the cache then
349     sends an End of Data PDU.

351     If the cache does not have the data needed to update the router,
352     perhaps because its records do not go back to the Serial Number in
353     the Serial Query, then it responds with a Cache Reset PDU
354     (Section 5.9).

356     The Session ID tells the cache what instance the router expects to
357     ensure that the serial numbers are commensurate, i.e. the cache
358     session has not been changed.

360     0          8          16         24        31
361     .-------------------------------------------.
362     | Protocol |   PDU    |                     |
363     | Version  |   Type   |     Session ID      |
364     |    0     |    1     |                     |
365     +-------------------------------------------+
366     |                                           |
367     |                 Length=12                 |
368     |                                           |
369     +-------------------------------------------+
370     |                                           |
371     |               Serial Number               |
372     |                                           |
373     `-------------------------------------------'

375  5.4.  Reset Query

377     Reset Query: The router tells the cache that it wants to receive the
378     total active, current, non-withdrawn, database. The cache responds
379     with a Cache Response PDU (Section 5.5).

381     0          8          16         24        31
382     .-------------------------------------------.
383     | Protocol |   PDU    |                     |
384     | Version  |   Type   |   reserved = zero   |
385     |    0     |    2     |                     |
386     +-------------------------------------------+
387     |                                           |
388     |                 Length=8                  |
389     |                                           |
390     `-------------------------------------------'

392  5.5.  Cache Response

394     Cache Response: The cache responds with zero or more payload PDUs.
395     When replying to a Serial Query request (Section 5.3), the cache
396     sends the set of all data records it has with serial numbers greater
397     than that sent by the client router. When replying to a Reset Query,
398     the cache sends the set of all data records it has; in this case the
399     withdraw/announce field in the payload PDUs MUST have the value 1
400     (announce).

402     In response to a Reset Query, the new value of the Session ID tells
403     the router the instance of the cache session for future confirmation.
404     In response to a Serial Query, the Session ID being the same
405     reassures the router that the serial numbers are commensurate, i.e.
406     the cache session has not changed.

408     0          8          16         24        31
409     .-------------------------------------------.
410     | Protocol |   PDU    |                     |
411     | Version  |   Type   |     Session ID      |
412     |    0     |    3     |                     |
413     +-------------------------------------------+
414     |                                           |
415     |                 Length=8                  |
416     |                                           |
417     `-------------------------------------------'

419  5.6.  IPv4 Prefix

421     0          8          16         24        31
422     .-------------------------------------------.
423     | Protocol |   PDU    |                     |
424     | Version  |   Type   |   reserved = zero   |
425     |    0     |    4     |                     |
426     +-------------------------------------------+
427     |                                           |
428     |                 Length=20                 |
429     |                                           |
430     +-------------------------------------------+
431     |          |  Prefix  |   Max    |          |
432     |  Flags   |  Length  |  Length  |   zero   |
433     |          |  0..32   |  0..32   |          |
434     +-------------------------------------------+
435     |                                           |
436     |                IPv4 Prefix                |
437     |                                           |
438     +-------------------------------------------+
439     |                                           |
440     |         Autonomous System Number          |
441     |                                           |
442     `-------------------------------------------'

444     The lowest order bit of the Flags field is 1 for an announcement and
445     0 for a withdrawal.

447     In the RPKI, nothing prevents a signing certificate from issuing two
448     identical ROAs. In this case there would be no semantic difference
449     between the objects, merely a process redundancy.

451     In the RPKI, there is also an actual need for what might appear to a
452     router as identical IPvX (IPv4 or IPv6) PDUs. This can occur when an
453     upstream certificate is being reissued or there is an address
454     ownership transfer up the validation chain. The ROA would be
455     identical in the router sense, i.e. have the same {prefix, len, max-
456     len, asn}, but a different validation path in the RPKI. This is
457     important to the RPKI, but not to the router.

459     The cache server MUST ensure that it has told the router client to
460     have one and only one IPvX PDU for a unique {prefix, len, max-len,
461     asn} at any one point in time. Should the router client receive an
462     IPvX PDU with a {prefix, len, max-len, asn} identical to one it
463     already has active, it SHOULD raise a Duplicate Announcement Received
464     error.

466  5.7.  IPv6 Prefix

468     0          8          16         24        31
469     .-------------------------------------------.
470     | Protocol |   PDU    |                     |
471     | Version  |   Type   |   reserved = zero   |
472     |    0     |    6     |                     |
473     +-------------------------------------------+
474     |                                           |
475     |                 Length=32                 |
476     |                                           |
477     +-------------------------------------------+
478     |          |  Prefix  |   Max    |          |
479     |  Flags   |  Length  |  Length  |   zero   |
480     |          |  0..128  |  0..128  |          |
481     +-------------------------------------------+
482     |                                           |
483     +---                                     ---+
484     |                                           |
485     +---             IPv6 Prefix             ---+
486     |                                           |
487     +---                                     ---+
488     |                                           |
489     +-------------------------------------------+
490     |                                           |
491     |         Autonomous System Number          |
492     |                                           |
493     `-------------------------------------------'

495     Analogous to the IPv4 Prefix PDU, with 96 more bits and no magic.

497  5.8.  End of Data

499     End of Data: Cache tells router it has no more data for the request.

501     The Session ID MUST be the same as that of the corresponding Cache
502     Response which began the, possibly null, sequence of data PDUs.

504     0          8          16         24        31
505     .-------------------------------------------.
506     | Protocol |   PDU    |                     |
507     | Version  |   Type   |     Session ID      |
508     |    0     |    7     |                     |
509     +-------------------------------------------+
510     |                                           |
511     |                 Length=12                 |
512     |                                           |
513     +-------------------------------------------+
514     |                                           |
515     |               Serial Number               |
516     |                                           |
517     `-------------------------------------------'

519  5.9.  Cache Reset

521     The cache may respond to a Serial Query informing the router that the
522     cache cannot provide an incremental update starting from the serial
523     number specified by the router. The router must decide whether to
524     issue a Reset Query or switch to a different cache.

526     0          8          16         24        31
527     .-------------------------------------------.
528     | Protocol |   PDU    |                     |
529     | Version  |   Type   |   reserved = zero   |
530     |    0     |    8     |                     |
531     +-------------------------------------------+
532     |                                           |
533     |                 Length=8                  |
534     |                                           |
535     `-------------------------------------------'

537  5.10.  Error Report

539     This PDU is used by either party to report an error to the other.

541     Error reports are only sent as responses to other PDUs.
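As an aside on the fixed-length layouts in Sections 5.3 and 5.6, the following is a minimal sketch of how a Serial Query and an IPv4 Prefix PDU could be packed and unpacked. It assumes that multi-byte fields are sent in network byte order (big-endian), which the text above does not state explicitly. The function names are hypothetical, and the sample prefix 192.0.2.0/24 and AS 64496 in the test are documentation values, not data from this document.

```python
import ipaddress
import struct

PROTOCOL_VERSION = 0
PDU_SERIAL_QUERY = 1
PDU_IPV4_PREFIX = 4

# Common 8-byte header: version, PDU type, 16-bit Session ID (or
# reserved) field, 32-bit Length.  "!" selects network byte order,
# which is an assumption of this sketch.
HEADER = struct.Struct("!BBHI")
SERIAL_QUERY = struct.Struct("!BBHII")   # header + Serial Number
IPV4_BODY = struct.Struct("!BBBB4sI")    # flags, len, max-len, zero, prefix, ASN


def encode_serial_query(session_id: int, serial: int) -> bytes:
    """Build the 12-byte Serial Query PDU of Section 5.3."""
    return SERIAL_QUERY.pack(PROTOCOL_VERSION, PDU_SERIAL_QUERY,
                             session_id, SERIAL_QUERY.size, serial)


def encode_ipv4_prefix(announce: bool, prefix: str, prefix_len: int,
                       max_len: int, asn: int) -> bytes:
    """Build the 20-byte IPv4 Prefix PDU of Section 5.6."""
    if not 0 <= prefix_len <= max_len <= 32:
        raise ValueError("need 0 <= Prefix Length <= Max Length <= 32")
    header = HEADER.pack(PROTOCOL_VERSION, PDU_IPV4_PREFIX, 0,
                         HEADER.size + IPV4_BODY.size)
    body = IPV4_BODY.pack(1 if announce else 0, prefix_len, max_len, 0,
                          ipaddress.IPv4Address(prefix).packed, asn)
    return header + body


def decode_ipv4_prefix(pdu: bytes) -> dict:
    """Parse an IPv4 Prefix PDU and check its type and length fields."""
    _version, pdu_type, _reserved, length = HEADER.unpack_from(pdu, 0)
    if pdu_type != PDU_IPV4_PREFIX or length != 20 or len(pdu) != 20:
        raise ValueError("not a well-formed IPv4 Prefix PDU")
    flags, prefix_len, max_len, _zero, addr, asn = IPV4_BODY.unpack_from(
        pdu, HEADER.size)
    if max_len < prefix_len:
        raise ValueError("Max Length is less than Prefix Length")
    return {
        "announce": bool(flags & 1),   # lowest order bit of Flags
        "prefix": str(ipaddress.IPv4Address(addr)),
        "prefix_len": prefix_len,
        "max_len": max_len,
        "asn": asn,
    }
```

The IPv6 Prefix PDU would follow the same pattern with a 16-byte prefix field and a Length of 32.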
543     The Error Code is described in Section 10.

545     If the error is not associated with any particular PDU, the Erroneous
546     PDU field MUST be empty and the Length of Encapsulated PDU field MUST
547     be zero.

549     An Error Report PDU MUST NOT be sent for an Error Report PDU. If an
550     erroneous Error Report PDU is received, the session SHOULD be
551     dropped.

553     If the error is associated with a PDU of excessive length, i.e. too
554     long to be any legal PDU other than another Error Report, or possibly
555     corrupt length, the Erroneous PDU field MAY be truncated.

557     The diagnostic text is optional; if not present, the Length of Error
558     Text field MUST be zero. If error text is present, it MUST be a
559     string in UTF-8 encoding (see [RFC3269]).

561     0          8          16         24        31
562     .-------------------------------------------.
563     | Protocol |   PDU    |                     |
564     | Version  |   Type   |     Error Code      |
565     |    0     |    10    |                     |
566     +-------------------------------------------+
567     |                                           |
568     |                  Length                   |
569     |                                           |
570     +-------------------------------------------+
571     |                                           |
572     |        Length of Encapsulated PDU         |
573     |                                           |
574     +-------------------------------------------+
575     |                                           |
576     ~           Copy of Erroneous PDU           ~
577     |                                           |
578     +-------------------------------------------+
579     |                                           |
580     |           Length of Error Text            |
581     |                                           |
582     +-------------------------------------------+
583     |                                           |
584     |              Arbitrary Text               |
585     |                    of                     |
586     ~         Error Diagnostic Message          ~
587     |                                           |
588     `-------------------------------------------'

590  6.  Protocol Sequences

592     The sequences of PDU transmissions fall into four conversations as
593     follows:

595  6.1.  Start or Restart

597     Cache                         Router
598       ~                             ~
599       | <----- Reset Query -------- | R requests data (or Serial Query)
600       |                             |
601       | ----- Cache Response -----> | C confirms request
602       | ------- IPvX Prefix ------> | C sends zero or more
603       | ------- IPvX Prefix ------> |   IPv4 and IPv6 Prefix
604       | ------- IPvX Prefix ------> |   Payload PDUs
605       | ------  End of Data ------> | C sends End of Data
606       |                             |   and sends new serial
607       ~                             ~

609     When a transport session is first established, the router MAY send a
610     Reset Query and the cache responds with a data sequence of all data
611     it contains.

613     Alternatively, if the router has significant unexpired data from a
614     broken session with the same cache, it MAY start with a Serial Query
615     containing the Session ID from the previous session to ensure the
616     serial numbers are commensurate.

618     This Reset Query sequence is also used when the router receives a
619     Cache Reset, chooses a new cache, or fears that it has otherwise lost
620     its way.

622     To limit the length of time a cache must keep the data necessary to
623     generate incremental updates, a router MUST send either a Serial
624     Query or a Reset Query no less frequently than once an hour. This
625     also acts as a keep alive at the application layer.

627     As the cache MAY keep updates for little more than one hour, the
628     router MUST have a polling interval of no greater than one hour.

630  6.2.  Typical Exchange

632     Cache                         Router
633       ~                             ~
634       | -------- Notify ----------> | (optional)
635       |                             |
636       | <----- Serial Query ------- | R requests data
637       |                             |
638       | ----- Cache Response -----> | C confirms request
639       | ------- IPvX Prefix ------> | C sends zero or more
640       | ------- IPvX Prefix ------> |   IPv4 and IPv6 Prefix
641       | ------- IPvX Prefix ------> |   Payload PDUs
642       | ------  End of Data ------> | C sends End of Data
643       |                             |   and sends new serial
644       ~                             ~

646     The cache server SHOULD send a Serial Notify PDU with its current
647     serial number when the cache's serial changes, with the expectation
648     that the router MAY then issue a Serial Query earlier than it
649     otherwise might. This is analogous to DNS NOTIFY in [RFC1996]. The
650     cache MUST rate limit Serial Notifies to no more than one per minute.

652     When the transport layer is up and either a timer has gone off in the
653     router, or the cache has sent a Notify, the router queries for new
654     data by sending a Serial Query, and the cache sends all data newer
655     than the serial in the Serial Query.

657     To limit the length of time a cache must keep old withdraws, a router
658     MUST send either a Serial Query or a Reset Query no less frequently
659     than once an hour.

661  6.3.  No Incremental Update Available

663     Cache                         Router
664       ~                             ~
665       | <-----  Serial Query ------ | R requests data
666       | ------- Cache Reset ------> | C cannot supply update
667       |                             |   from specified serial
668       | <------ Reset Query ------- | R requests new data
669       | ----- Cache Response -----> | C confirms request
670       | ------- IPvX Prefix ------> | C sends zero or more
671       | ------- IPvX Prefix ------> |   IPv4 and IPv6 Prefix
672       | ------- IPvX Prefix ------> |   Payload PDUs
673       | ------  End of Data ------> | C sends End of Data
674       |                             |   and sends new serial
675       ~                             ~

677     The cache may respond to a Serial Query with a Cache Reset, informing
678     the router that the cache cannot supply an incremental update from
679     the serial number specified by the router. This might be because the
680     cache has lost state, or because the router has waited too long
681     between polls and the cache has cleaned up old data that it no longer
682     believes it needs, or because the cache has run out of storage space
683     and had to expire some old data early. Regardless of how this state
684     arose, the cache replies with a Cache Reset to tell the router that
685     it cannot honor the request. When a router receives this, the router
686     SHOULD attempt to connect to any more-preferred caches in its cache
687     list. If there are no more-preferred caches, it MUST issue a Reset
688     Query and get an entire new load from the cache.

690  6.4.  Cache has No Data Available

692     Cache                         Router
693       ~                             ~
694       | <-----  Serial Query ------ | R requests data
695       | ---- Error Report PDU ----> | C No Data Available
696       ~                             ~

698     Cache                         Router
699       ~                             ~
700       | <-----  Reset Query ------- | R requests data
701       | ---- Error Report PDU ----> | C No Data Available
702       ~                             ~

704     The cache may respond to either a Serial Query or a Reset Query
705     informing the router that the cache cannot supply any update at all.
706     The most likely cause is that the cache has lost state, perhaps due
707     to a restart, and has not yet recovered. While it is possible that a
708     cache might go into such a state without dropping any of its active
709     sessions, a router is more likely to see this behavior when it
710     initially connects and issues a Reset Query while the cache is still
711     rebuilding its database.

713     When a router receives this kind of error, the router SHOULD attempt
714     to connect to any other caches in its cache list, in preference
715     order. If no other caches are available, the router MUST issue
716     periodic Reset Queries until it gets a new usable load from the
717     cache.

719  7.  Transport

721     The transport layer session between a router and a cache carries the
722     binary Protocol Data Units (PDUs) in a persistent session.

724     To prevent cache spoofing and DoS attacks by illegitimate routers, it
725     is highly desirable that the router and the cache are authenticated
726     to each other. Integrity protection for payloads is also desirable
727     to protect against monkey in the middle (MITM) attacks.
728     Unfortunately, there is no protocol to do so on all currently used
729     platforms. Therefore, as of this document, there is no mandatory-to-
730     implement transport which provides authentication and integrity
731     protection.

733     To reduce exposure to dropped but non-terminated sessions, both
734     caches and routers SHOULD enable keep alives when available in the
735     chosen transport protocol.

737     It is expected that, when TCP-AO [RFC5925] is available on all
738     platforms deployed by operators, it will become the mandatory-to-
739     implement transport.

741     Caches and routers MUST implement unprotected transport over TCP
742     using a port, rpki-rtr, to be assigned; see Section 12. Operators
743     SHOULD use procedural means, e.g. access control lists (ACLs), to
744     reduce the exposure to authentication issues.

746     Caches and routers SHOULD use TCP-AO, SSHv2, TCP MD5, or IPsec
747     transport.
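Because the PDUs are carried back to back in a persistent byte stream, a receiver has to use the Length field of each header to find PDU boundaries. The following is a minimal sketch of that framing over a plain TCP socket with keep alives enabled, not a complete client. It assumes network byte order for header fields. The names connect, recv_exact, read_pdu and send_reset_query are hypothetical, MAX_PDU_LEN is an arbitrary sanity limit chosen for this sketch, and the port is left as a parameter because the rpki-rtr port is still to be assigned.

```python
import socket
import struct

HEADER = struct.Struct("!BBHI")   # version, type, session/reserved, length
PDU_RESET_QUERY = 2
MAX_PDU_LEN = 65536               # hypothetical limit, not from the document


def connect(host: str, port: int) -> socket.socket:
    """Open an unprotected TCP session and enable TCP keep alives."""
    sock = socket.create_connection((host, port))
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-PDU")
        buf.extend(chunk)
    return bytes(buf)


def read_pdu(sock: socket.socket):
    """Read one PDU: the 8-byte header first, then Length - 8 more bytes."""
    header = recv_exact(sock, HEADER.size)
    version, pdu_type, field, length = HEADER.unpack(header)
    if length < HEADER.size or length > MAX_PDU_LEN:
        raise ValueError("implausible PDU length %d" % length)
    return version, pdu_type, field, header + recv_exact(sock, length - HEADER.size)


def send_reset_query(sock: socket.socket) -> None:
    """Send the 8-byte Reset Query PDU of Section 5.4."""
    sock.sendall(HEADER.pack(0, PDU_RESET_QUERY, 0, HEADER.size))
```

The same read_pdu loop applies unchanged when the byte stream is provided by one of the protected transports instead of bare TCP.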
749     If unprotected TCP is the transport, the cache and routers MUST be on
750     the same trusted and controlled network.

752     If available to the operator, caches and routers MUST use one of the
753     following more protected protocols.

755     Caches and routers SHOULD use TCP-AO transport [RFC5925] over the
756     rpki-rtr port.

758     Caches and routers MAY use SSHv2 transport [RFC4252] using the
759     normal SSH port. For an example, see Section 7.1.

761     Caches and routers MAY use TCP MD5 transport [RFC2385] using the
762     rpki-rtr port. Note that TCP MD5 has been obsoleted by TCP-AO
763     [RFC5925].

765     Caches and routers MAY use IPsec transport [RFC4301] using the rpki-
766     rtr port.

768     Caches and routers MAY use TLS transport [RFC5246] using a
769     port, rpki-rtr-tls, to be assigned, see Section 12.

771  7.1. SSH Transport

773     To run over SSH, the client router first establishes an SSH transport
774     connection using the SSHv2 transport protocol, and the client and
775     server exchange keys for message integrity and encryption. The
776     client then invokes the "ssh-userauth" service to authenticate the
777     application, as described in the SSH authentication protocol RFC 4252
778     [RFC4252]. Once the application has been successfully authenticated,
779     the client invokes the "ssh-connection" service, also known as the
780     SSH connection protocol.

782     After the ssh-connection service is established, the client opens a
783     channel of type "session", which results in an SSH session.

785     Once the SSH session has been established, the application invokes
786     the application transport as an SSH subsystem called "rpki-rtr".
787     Subsystem support is a feature of SSH version 2 (SSHv2) and is not
788     included in SSHv1. Running this protocol as an SSH subsystem avoids
789     the need for the application to recognize shell prompts or skip over
790     extraneous information, such as a system message that is sent at
791     shell start-up.
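
The subsystem invocation described above can be sketched from the client side. The example below assumes that an OpenSSH client is installed and uses its -s option to request a subsystem rather than a shell. The function names, host name, and user name are hypothetical, and nothing here is mandated by this document.

```python
import subprocess


def rpki_rtr_ssh_command(host, user, port=22):
    """Build an OpenSSH command line that requests the "rpki-rtr" subsystem.

    Assumes the OpenSSH client; "-s" asks for a subsystem instead of a
    shell, so no prompts or start-up banners appear on the channel.
    """
    return ["ssh", "-s", "-p", str(port), "-l", user, host, "rpki-rtr"]


def open_rpki_rtr_channel(host, user, port=22):
    """Start ssh and return the process; PDUs are then written to
    proc.stdin and read from proc.stdout as raw bytes."""
    return subprocess.Popen(
        rpki_rtr_ssh_command(host, user, port),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
```

For example, rpki_rtr_ssh_command("cache.example.net", "rtr") yields the argument list for connecting as the hypothetical user "rtr" on the normal SSH port.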
793     It is assumed that the router and cache have exchanged keys out of
794     band by some reasonably secured means.

796     Cache servers supporting SSH transport MUST accept RSA and DSA
797     authentication, and SHOULD accept ECDSA authentication. User
798     authentication MUST be supported; host authentication MAY be
799     supported. Implementations MAY support password authentication.
800     Client routers SHOULD verify the public key of the cache, to avoid
801     monkey in the middle attacks.

803  7.2. TLS Transport

805     Client routers using TLS transport MUST present client-side
806     certificates to authenticate themselves to the cache, to allow the
807     cache to manage load by rejecting connections from unauthorized
808     routers. While in principle any type of certificate and certificate
809     authority (CA) may be used, cache operators will generally
810     wish to create their own small-scale CA and issue certificates to
811     each authorized router. This simplifies credential roll-over; any
812     unrevoked, unexpired certificate from the proper CA may be used.

814     Certificates used to authenticate client routers in this protocol
815     MUST include a subjectAltName extension [RFC5280] containing one or
816     more iPAddress identities; when authenticating the router's
817     certificate, the cache MUST check the IP address of the TLS
818     connection against these iPAddress identities and SHOULD reject the
819     connection if none of the iPAddress identities match the connection.

821     Routers MUST also verify the cache's TLS server certificate, using
822     subjectAltName dNSName identities as described in [RFC6125], to avoid
823     monkey in the middle attacks. The rules and guidelines defined in
824     [RFC6125] apply here, with the following considerations:

826        Support for DNS-ID identifier type (that is, the dNSName identity
827        in the subjectAltName extension) is REQUIRED in rpki-rtr server
828        and client implementations which use TLS.
Certification
829        authorities which issue rpki-rtr server certificates MUST support
830        the DNS-ID identifier type, and the DNS-ID identifier type MUST be
831        present in rpki-rtr server certificates.

833        DNS names in rpki-rtr server certificates SHOULD NOT contain the
834        wildcard character "*".

836        rpki-rtr implementations which use TLS MUST NOT use CN-ID
837        identifiers; a CN field may be present in the server certificate's
838        subject name, but MUST NOT be used for authentication within the
839        rules described in [RFC6125].

841        The client router MUST set its "reference identifier" to the DNS
842        name of the rpki-rtr cache.

844  7.3. TCP MD5 Transport

846     If TCP-MD5 is used, implementations MUST support key lengths of at
847     least 80 printable ASCII bytes, per section 4.5 of [RFC2385].
848     Implementations MUST also support hexadecimal sequences of at least
849     32 characters, i.e., 128 bits.

851     Key rollover with TCP-MD5 is problematic. Cache servers SHOULD
852     support [RFC4808].

854  7.4. TCP-AO Transport

856     Implementations MUST support key lengths of at least 80 printable
857     ASCII bytes. Implementations MUST also support hexadecimal sequences
858     of at least 32 characters, i.e., 128 bits. MAC lengths of at least
859     96 bits MUST be supported, per section 5.3 of [RFC2385].

861     The cryptographic algorithms and associated parameters described in
862     [RFC5926] MUST be supported.

864  8. Router-Cache Set-Up

866     A cache has the public authentication data for each router it is
867     configured to support.

869     A router may be configured to peer with a selection of caches, and a
870     cache may be configured to support a selection of routers. Each must
871     have the name of, and authentication data for, each peer. In
872     addition, in a router, this list has a non-unique preference value
873     for each server in order of preference. This preference merely
874     denotes proximity, not trust, preferred belief, etc.
The client
875     router attempts to establish a session with each potential serving
876     cache in preference order, and then starts to load data from the most
877     preferred cache to which it can connect and authenticate. The
878     router's list of caches has the following elements:

880     Preference: An unsigned integer denoting the router's preference to
881        connect to that cache, the lower the value the more preferred.

883     Name: The IP Address or fully qualified domain name of the cache.

885     Key: Any needed public key of the cache.

887     MyKey: Any needed private key or certificate of this client.

889     Due to the distributed nature of the RPKI, caches simply can not be
890     rigorously synchronous. A client may hold data from multiple caches,
891     but MUST keep the data marked as to source, as later updates MUST
892     affect the correct data.

894     Just as there may be more than one covering ROA from a single cache,
895     there may be multiple covering ROAs from multiple caches. The
896     results are as described in [I-D.ietf-sidr-pfx-validate].

898     If data from multiple caches are held, implementations MUST NOT
899     distinguish between data sources when performing validation.

901     When a more preferred cache becomes available, if resources allow, it
902     would be prudent for the client to start fetching from that cache.

904     The client SHOULD attempt to maintain at least one set of data,
905     regardless of whether it has chosen a different cache or established
906     a new connection to the previous cache.

908     A client MAY drop the data from a particular cache when it is fully
909     in synch with one or more other caches.

911     A client SHOULD delete the data from a cache when it has been unable
912     to refresh from that cache for a configurable timer value. The
913     default for that value is twice the polling period for that cache.
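
The cache list and the default expiry rule described above can be sketched as a small data structure. This is an illustration under stated assumptions rather than a required design: the class and function names are hypothetical, and the poll_period and last_refresh fields are additions for the example that do not appear in the element list above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CacheEntry:
    preference: int                 # lower value means more preferred
    name: str                       # IP address or fully qualified domain name
    key: Optional[str] = None       # any needed public key of the cache
    my_key: Optional[str] = None    # any needed private key or certificate
    poll_period: float = 3600.0     # seconds; assumed field for this sketch
    last_refresh: Optional[float] = None  # time of last successful refresh


def connection_order(caches: List[CacheEntry]) -> List[CacheEntry]:
    """Return caches in the order a router would try them.

    Preference values are non-unique; sorted() is stable, so caches
    with equal preference keep their configured order.
    """
    return sorted(caches, key=lambda c: c.preference)


def data_expired(cache: CacheEntry, now: float, factor: float = 2.0) -> bool:
    """True if data held from this cache should be deleted.

    The default factor of 2 reflects "twice the polling period".
    """
    if cache.last_refresh is None:
        return False  # no data is held from this cache
    return now - cache.last_refresh > factor * cache.poll_period
```

A router would walk connection_order() until a session is established and authenticated, and would evaluate data_expired() separately for each cache so that data stays marked by source.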
915     If a client loses connectivity to a cache it is using, or otherwise
916     decides to switch to a new cache, it SHOULD retain the data from the
917     previous cache until it has a full set of data from one or more other
918     caches. Note that this may already be true at the point of
919     connection loss if the client has connections to more than one cache.

921  9. Deployment Scenarios

923     For illustration, we present three likely deployment scenarios.

925     Small End Site: The small multi-homed end site may wish to outsource
926        the RPKI cache to one or more of their upstream ISPs. They would
927        exchange authentication material with the ISP using some out of
928        band mechanism, and their router(s) would connect to one or more
929        up-streams' caches. The ISPs would likely deploy caches intended
930        for customer use separately from the caches with which their own
931        BGP speakers peer.

933     Large End Site: A larger multi-homed end site might run one or more
934        caches, arranging them in a hierarchy of client caches, each
935        fetching from a serving cache which is closer to the global RPKI.
936        They might configure fall-back peerings to up-stream ISP caches.

938     ISP Backbone: A large ISP would likely have one or more redundant
939        caches in each major PoP, and these caches would fetch from each
940        other in an ISP-dependent topology so as not to place undue load
941        on the global RPKI publication infrastructure.

943     Experience with large DNS cache deployments has shown that complex
944     topologies are ill-advised as it is easy to make errors in the graph,
945     e.g. not maintaining a loop-free condition.

947     Of course, these are illustrations and there are other possible
948     deployment strategies. It is expected that minimizing load on the
949     global RPKI servers will be a major consideration.
951     To keep load on global RPKI services from unnecessary peaks, it is
952     recommended that primary caches which load from the distributed
953     global RPKI not all do so at the same time, e.g. on the hour.
954     Choose a random time, perhaps the ISP's AS number modulo 60, and
955     jitter the inter-fetch timing.

957  10. Error Codes

959     This section contains a preliminary list of error codes. The authors
960     expect additions to this section during development of the initial
961     implementations. There is an IANA registry where valid error codes
962     are listed, see Section 12. Errors which are considered fatal SHOULD
963     cause the session to be dropped.

965     0: Corrupt Data (fatal): The receiver believes the received PDU to
966        be corrupt in a manner not specified by other error codes.

968     1: Internal Error (fatal): The party reporting the error experienced
969        some kind of internal error unrelated to protocol operation (ran
970        out of memory, a coding assertion failed, et cetera).

972     2: No Data Available: The cache believes itself to be in good
973        working order, but is unable to answer either a Serial Query or a
974        Reset Query because it has no useful data available at this time.
975        This is likely to be a temporary error, and most likely indicates
976        that the cache has not yet completed pulling down an initial
977        current data set from the global RPKI system after some kind of
978        event that invalidated whatever data it might have previously held
979        (reboot, network partition, et cetera).

981     3: Invalid Request (fatal): The cache server believes the client's
982        request to be invalid.

984     4: Unsupported Protocol Version (fatal): The Protocol Version is not
985        known by the receiver of the PDU.

987     5: Unsupported PDU Type (fatal): The PDU Type is not known by the
988        receiver of the PDU.

990     6: Withdrawal of Unknown Record (fatal): The received PDU has Flag=0
991        but a record for the Prefix/PrefixLength/MaxLength triple does not
992        exist in the receiver's database.
994     7: Duplicate Announcement Received (fatal): The received PDU has an
995        identical {prefix, len, max-len, asn} tuple as a PDU which is
996        still active in the router.

998  11. Security Considerations

1000    As this document describes a security protocol, many aspects of
1001    security interest are described in the relevant sections. This
1002    section points out issues which may not be obvious in other sections.

1004    Cache Validation: In order for a collection of caches as described
1005       in Section 9 to guarantee a consistent view, they need to be given
1006       consistent trust anchors to use in their internal validation
1007       process. Distribution of a consistent trust anchor is assumed to
1008       be out of band.

1010    Cache Peer Identification: The router initiates a transport session
1011       to a cache, which it identifies by either IP address or fully
1012       qualified domain name. Be aware that a DNS or address spoofing
1013       attack could make the correct cache unreachable. No session would
1014       be established, as the authorization keys would not match.

1016    Transport Security: The RPKI relies on object, not server or
1017       transport, trust. I.e. the IANA root trust anchor is distributed
1018       to all caches through some out of band means, and can then be used
1019       by each cache to validate certificates and ROAs all the way down
1020       the tree. The inter-cache relationships are based on this object
1021       security model, hence the inter-cache transport can be lightly
1022       protected.

1024       But this protocol document assumes that the routers can not do the
1025       validation cryptography. Hence the last link, from cache to
1026       router, is secured by server authentication and transport level
1027       security. This is dangerous, as server authentication and
1028       transport have very different threat models than object security.

1030       So the strength of the trust relationship and the transport
1031       between the router(s) and the cache(s) are critical. You're
1032       betting your routing on this.
1034       While we can not say the cache must be on the same LAN, if only
1035       due to the issue of an enterprise wanting to off-load the cache
1036       task to their upstream ISP(s), locality, trust, and control are
1037       very critical issues here. The cache(s) really SHOULD be as
1038       close, in the sense of controlled and protected (against DDoS,
1039       MITM) transport, to the router(s) as possible. It also SHOULD be
1040       topologically close so that a minimum of validated routing data
1041       are needed to bootstrap a router's access to a cache.

1043       The identity of the cache server SHOULD be verified and
1044       authenticated by the router client, and vice versa, before any
1045       data are exchanged.

1047       Transports which can not provide the necessary authentication and
1048       integrity (see Section 7) must rely on network design and
1049       operational controls to provide protection against spoofing/
1050       corruption attacks. As pointed out in Section 7, TCP-AO is the
1051       long term plan. Protocols which provide integrity and
1052       authenticity SHOULD be used, and if they can not, i.e. TCP is
1053       used as the transport, the router and cache MUST be on the same
1054       trusted, controlled network.

1056 12. IANA Considerations

1058    This document requests the IANA to assign 'well known' TCP Port
1059    Numbers to the RPKI-Router Protocol for the following, see Section 7:

1061       rpki-rtr
1062       rpki-rtr-tls

1064    This document requests the IANA to create a registry for tuples of
1065    Protocol Version / PDU Type, each of which may range from 0 to 255.
1066    The name of the registry should be rpki-rtr-pdu. The policy for
1067    adding to the registry is RFC Required per [RFC5226], either
1068    standards track or experimental.
The initial entries should be as
1069    follows:

1071          Protocol
1072          Version    PDU Type
1073          --------   -------------------
1074             0       0 - Serial Notify
1075             0       1 - Serial Query
1076             0       2 - Reset Query
1077             0       3 - Cache Response
1078             0       4 - IPv4 Prefix
1079             0       6 - IPv6 Prefix
1080             0       7 - End of Data
1081             0       8 - Cache Reset
1082             0       10 - Error Report
1083             0       255 - Reserved

1085    This document requests the IANA to create a registry for Error Codes
1086    0 to 255. The name of the registry should be rpki-rtr-error. The
1087    policy for adding to the registry is Expert Review per [RFC5226],
1088    where the responsible IESG area director should appoint the Expert
1089    Reviewer. The initial entries should be as follows:

1091            0 - Corrupt Data
1092            1 - Internal Error
1093            2 - No Data Available
1094            3 - Invalid Request
1095            4 - Unsupported Protocol Version
1096            5 - Unsupported PDU Type
1097            6 - Withdrawal of Unknown Record
1098            7 - Duplicate Announcement Received

1100          255 - Reserved

1102    This document requests the IANA to add an SSH Connection Protocol
1103    Subsystem Name, as defined in [RFC4250], of 'rpki-rtr'.

1105 13. Acknowledgments

1107    The authors wish to thank Steve Bellovin, Rex Fernando, Paul Hoffman,
1108    Russ Housley, Pradosh Mohapatra, Keyur Patel, Sandy Murphy, Robert
1109    Raszuk, John Scudder, Ruediger Volk, and David Ward. Particular
1110    thanks go to Hannes Gredler for showing us the dangers of unnecessary
1111    fields.

1113 14. References

1115 14.1. Normative References

1117    [I-D.ietf-sidr-pfx-validate]
1118               Mohapatra, P., Scudder, J., Ward, D., Bush, R., and R.
1119               Austein, "BGP Prefix Origin Validation",
1120               draft-ietf-sidr-pfx-validate-03 (work in progress),
1121               October 2011.

1123    [RFC1982]  Elz, R. and R. Bush, "Serial Number Arithmetic", RFC 1982,
1124               August 1996.

1126    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1127               Requirement Levels", BCP 14, RFC 2119, March 1997.
1129    [RFC2385]  Heffernan, A., "Protection of BGP Sessions via the TCP MD5
1130               Signature Option", RFC 2385, August 1998.

1132    [RFC3269]  Kermode, R. and L. Vicisano, "Author Guidelines for
1133               Reliable Multicast Transport (RMT) Building Blocks and
1134               Protocol Instantiation documents", RFC 3269, April 2002.

1136    [RFC4250]  Lehtinen, S. and C. Lonvick, "The Secure Shell (SSH)
1137               Protocol Assigned Numbers", RFC 4250, January 2006.

1139    [RFC4252]  Ylonen, T. and C. Lonvick, "The Secure Shell (SSH)
1140               Authentication Protocol", RFC 4252, January 2006.

1142    [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
1143               Internet Protocol", RFC 4301, December 2005.

1145    [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
1146               IANA Considerations Section in RFCs", BCP 26, RFC 5226,
1147               May 2008.

1149    [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
1150               (TLS) Protocol Version 1.2", RFC 5246, August 2008.

1152    [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
1153               Authentication Option", RFC 5925, June 2010.

1155    [RFC5926]  Lebovitz, G. and E. Rescorla, "Cryptographic Algorithms
1156               for the TCP Authentication Option (TCP-AO)", RFC 5926,
1157               June 2010.

1159    [RFC6125]  Saint-Andre, P. and J. Hodges, "Representation and
1160               Verification of Domain-Based Application Service Identity
1161               within Internet Public Key Infrastructure Using X.509
1162               (PKIX) Certificates in the Context of Transport Layer
1163               Security (TLS)", RFC 6125, March 2011.

1165 14.2. Informative References

1167    [I-D.ietf-sidr-arch]
1168               Lepinski, M. and S. Kent, "An Infrastructure to Support
1169               Secure Internet Routing", draft-ietf-sidr-arch-13 (work in
1170               progress), May 2011.

1172    [I-D.ietf-sidr-repos-struct]
1173               Huston, G., Loomans, R., and G. Michaelson, "A Profile for
1174               Resource Certificate Repository Structure",
1175               draft-ietf-sidr-repos-struct-09 (work in progress),
1176               July 2011.
1178    [I-D.ymbk-rpki-rtr-impl]
1179               Bush, R., Austein, R., Patel, K., Gredler, H., and M.
1180               Waehlisch, "RPKI Router Implementation Report",
1181               draft-ymbk-rpki-rtr-impl-01 (work in progress),
1182               January 2012.

1184    [RFC1996]  Vixie, P., "A Mechanism for Prompt Notification of Zone
1185               Changes (DNS NOTIFY)", RFC 1996, August 1996.

1187    [RFC4808]  Bellovin, S., "Key Change Strategies for TCP-MD5",
1188               RFC 4808, March 2007.

1190    [RFC5781]  Weiler, S., Ward, D., and R. Housley, "The rsync URI
1191               Scheme", RFC 5781, February 2010.

1193 Authors' Addresses

1195    Randy Bush
1196    Internet Initiative Japan
1197    5147 Crystal Springs
1198    Bainbridge Island, Washington 98110
1199    US

1201    Phone: +1 206 780 0431 x1
1202    Email: randy@psg.com

1204    Rob Austein
1205    Dragon Research Labs

1207    Email: sra@hactrn.net