Secure Inter-Domain Routing                                    K. Sriram
Internet-Draft                                             D. Montgomery
Intended status: Informational                                   US NIST
Expires: March 27, 2014                               September 23, 2013

Design Discussion and Comparison of Replay-Attack Protection Mechanisms
                               for BGPSEC
          draft-sriram-replay-protection-design-discussion-02

Abstract

   The BGPSEC protocol requires a method for protection from replay
   attacks, at least to control the window of exposure.
   In the context of BGPSEC, a replay attack occurs when an adversary
   suppresses a prefix withdrawal (implicit or explicit) or replays a
   previously received BGPSEC announcement for a prefix that has since
   been withdrawn. This informational document provides design
   discussion and comparison of multiple alternative replay-attack
   protection mechanisms weighing their pros and cons. It is meant to
   be a companion document to the standards track
   I-D.-ietf-sidr-bgpsec-rollover that will specify a method to be used
   with BGPSEC for replay-attack protection.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 27, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definition of Replay Attack
   3.  Classification of Solutions
   4.  Expire Time Method
   5.  Key Rollover Method
     5.1.  Periodic Key Rollover Method
     5.2.  Event-driven Key Rollover Method
       5.2.1.  EKR-A: EKR where Update Expiry is Enforced by CRL
       5.2.2.  EKR-B: EKR where Update Expiry is Enforced by
               NotValidAfter Time
       5.2.3.  EKR with Separate Key for Each Incoming-Outgoing
               Peering-Pair
   6.  Summary of Pros and Cons
   7.  Summary and Conclusions
   8.  Acknowledgements
   9.  IANA Considerations
   10. Security Considerations
   11. Informative References
   Authors' Addresses

1.  Introduction

   The BGPSEC protocol [bgpsec-protocol] requires a method for
   protection from replay attacks, at least to control the window of
   exposure [bgpsec-reqs]. In the context of BGPSEC, a replay attack
   occurs when an adversary suppresses a prefix withdrawal or replays a
   previously received BGPSEC announcement for a prefix that has since
   been withdrawn.

   In this informational document, we provide design discussion and
   comparison of various replay-attack protection mechanisms that may be
   used in conjunction with the BGPSEC protocol.
   It is meant to be a companion document to the standards track
   document [bgpsec-rollover] that will specify a method to be used with
   BGPSEC for replay-attack protection. Here we consider four
   alternative mechanisms: one based on the explicit Expire Time
   approach and three different variants based on the Key Rollover
   approach. We provide a detailed comparison between these mechanisms,
   weighing their pros and cons. This document is meant to help inform
   the decision process leading to an exact description for the
   mechanism to be finalized and formally specified in
   [bgpsec-rollover].

2.  Definition of Replay Attack

   In the context of BGPSEC, a replay attack occurs when an adversary
   suppresses a prefix withdrawal (implicit or explicit). A replay
   attack also occurs when the adversary replays a previously received
   BGPSEC announcement for a prefix that has since been withdrawn. In
   the rest of this document, we refer to either of these two
   situations as a replay attack. The following are examples of replay
   attacks:

   Example 1: AS1 has AS2 and AS3 as eBGPSEC peers. At time x, AS1 had
   announced a prefix P to AS2 and AS3. At a later time x+d, AS1 sends
   a Withdraw for prefix P to AS2. AS2 suppresses the Withdraw (does
   not send to its peers any explicit or implicit Withdraw). AS2
   continues to attract some of the data for prefix P towards itself by
   pretending to still have a signed and valid route for P. In effect,
   AS2 can conduct a DOS attack on a server located at AS1 at prefix P.
   (See slide #15 in [replay-discussion] for an illustration.)

   Example 2: AS1 has AS2 and AS3 as eBGPSEC peers. AS2 and AS3 are
   also eBGPSEC peers. At time x, AS1 had announced a prefix P to AS2
   and AS3. AS3 also propagates to AS2 its route (via AS1) for prefix
   P. At a later time x+d, AS1 discontinues its peering with AS2.
   AS2 should propagate an alternate longer path via AS3 for prefix P
   and thus send an implicit Withdraw. However, AS2 suppresses it. AS2
   can thus cause a significant part of the traffic destined for prefix
   P to flow via itself and eavesdrop on the data without causing a DOS
   attack. (See slide #16 in [replay-discussion] for an illustration.)

   Example 3: AS1 has AS2 and AS3 as eBGPSEC peers. AS2 and AS3 are
   also eBGPSEC peers. At time x, AS1 had announced a prefix P to AS2
   without prepending (Update: AS1{pCount=1} P) but announced the same
   prefix to AS3 with prepending (Update: AS1{pCount=2} P). Thus AS1
   had preferred its ingress data traffic for prefix P to come in via
   AS2. At a later time x+d, AS1 switches its ingress data path
   preference to AS3 over AS2: it announces prefix P without prepending
   (Update: AS1{pCount=1} P) to AS3 and with prepending (Update:
   AS1{pCount=2} P) to AS2. AS2 suppresses the new prepended path
   announcement (does not send to its peers any new update about P).
   Thus AS2 carries more of AS1's ingress data traffic and generates
   more revenue for itself at the expense of AS1. (See slide #17 in
   [replay-discussion] for an illustration.)

   Thus the scenarios and motivations for replay attacks may differ as
   illustrated by the examples above.

   A requirement for replay-attack protection can be stated as follows.
   The update that AS1 sent to AS2 at time x should expire at time x+w.
   That means AS2 can suppress the Withdraw or possibly replay the
   update from AS1 for prefix P until at most x+w. This limits the
   replay vulnerability window. (Note: If no peering or policy change
   affecting prefix P occurs during the vulnerability window, then a
   typical solution would include a method for extending the validity
   period of the route(s) beyond x+w.)

3.  Classification of Solutions

   Mechanisms for replay-attack protection can be classified into two
   broad categories as follows:

   o  Expire Time (ET) Method: This method uses an explicit expire time
      field in the BGPSEC update.

   o  Key Rollover (KR) Method: In this method, the update expiry is
      enforced by a key rollover. The router rolls over to a new
      signing cert with a new pair of keys, and the previous router
      cert either expires or is revoked.

   The Key Rollover method can be further divided into the following
   subcategories:

   o  Periodic Key Rollover (PKR): Key rollovers happen at periodic
      intervals.

   o  Event-driven Key Rollover (EKR): Key rollovers happen only when
      peering or policy change events occur.

      *  EKR-A: EKR where expiry of previous update is enforced by CRL.

      *  EKR-B: EKR where expiry of previous update is controlled by
         NotValidAfter time.

   In Section 4, Section 5, and Section 6 we describe the various
   methods listed above, and discuss their pros and cons.

4.  Expire Time Method

   The details of the Expire Time (ET) method are as follows:

   o  Explicit Expire Time is used for origin's signature.

   o  Expire Time field is required in the BGPSEC update.

   o  Periodic re-origination (beaconing) of prefixes is performed by
      origin ASes. The value in the ET field in the update is extended
      at beaconing time, and thereby the update is refreshed. Every
      prefix in the Internet is re-originated and propagates through the
      Internet once every 'beacon' interval.

   o  These beacons are distributed actions by prefix owners and
      jittered in time by design to reduce burstiness. The beacon
      interval can be different at different originating ASes.

   o  Beacon interval granularity: TBD but preferably in fairly coarse
      units (days).

   Discussion of Pros and Cons:

   Pro: This method is easy on transit routers.
   In the event of a peering or policy change, BGPSEC with the ET method
   behaves the same way as BGP-4 in terms of which prefix routes are
   propagated. That is, the router re-evaluates best paths factoring in
   peering or policy changes, and propagates only those prefix routes
   that have a change in best path. In other words, there is no
   necessity for the BGPSEC router to re-propagate and refresh prefixes
   on all peering links. This is because prefix updates are refreshed
   anyway once every beacon interval by all prefix originators. There
   is low steady-state traffic associated with beaconing (see Figure on
   slide #8 in [replay-discussion]), but there are no huge bursts or
   spikes in workload due to peering or policy change events at transit
   routers.

   Con: An equipment vendor could facilitate unnecessarily frequent
   beaconing if an ISP urges and pays for it (a "dollar attack"). This
   possibility is mitigated by choosing a well thought-out granularity
   for ET, for example, a unit of one day rather than one minute.

   Con: A change in the on-the-wire BGPSEC protocol would be needed if
   the unit (granularity) of the ET field needs to be changed.

5.  Key Rollover Method

   The Key Rollover (KR) method has three variations as outlined in
   Section 3. Those will be discussed later in this section. The
   following features are common to all variants of the KR method:

   o  In the KR method, it is best if the BGPSEC router has two pairs of
      certs as follows: A pair of origination certs (current and next)
      for signing prefixes being originated by the AS of the router, and
      a pair of transit certs (current and next) for signing transit
      prefixes.

   o  Note: If a BGPSEC router only originates prefixes (i.e., has no
      transit prefixes), then it needs to maintain only a pair of
      origination certs and need not maintain the extra pair of transit
      certs.

   o  The three KR methods differ in how the rollover of certs (or keys)
      is done:

      *  Cert rollovers are Periodic vs. Event-driven.

      *  In the Event-driven method, the expiry of the old update is (A)
         Enforced by CRL vs. (B) Controlled by NotValidAfter time.

      *  In (A), the cert's NotValidAfter field is set to a very large
         value and a CRL is issued to revoke the cert when necessary.
         In (B), the NotValidAfter field is set to a permissible
         vulnerability window time and a CRL to revoke the cert is not
         required.

   Discussion of Pros and Cons (common to all Key Rollover methods):

   Pro: The KR method functions by manipulating the RPKI objects (certs,
   keys, NotValidAfter field in cert, etc.) to refresh updates or to
   cause expiry of previously propagated updates. Unlike the ET method,
   it does not rely on any explicit field in the update. Hence, an
   advantage of the KR method over the ET method is that if any
   parameters need to change or if the method itself is modified, then
   there is no impact on the BGPSEC protocol on the wire.

   Con: The KR method introduces additional churn in the global RPKI
   system.

   Con: There is also added update churn. The amount of update churn
   varies depending on the type of KR method used (see Section 5.1 and
   Section 5.2).

   We will now describe and discuss in detail the variants of the KR
   method.

5.1.  Periodic Key Rollover Method

   The details of the Periodic Key Rollover (PKR) method are as follows:

   o  Router's origination cert's NotValidAfter time is used as the
      implicit expire time for origin's signature.

   o  Each origination router re-originates (i.e., beacons) before
      NotValidAfter time of the current cert. Beaconing is periodic
      re-origination of prefixes by origin ASes.

   o  At beaconing time, the next cert becomes the new current cert, and
      the update is signed with the private key of this new current
      cert and re-originated.

   o  A new 'next' cert is created and propagated at beaconing time.
      This can also be done with a good lead time. In practice,
      multiple 'next' certs can be kept in the pipeline. They must have
      contiguous or slightly overlapping validity periods.

   o  Every prefix in the Internet is re-originated and propagates
      through the Internet once every 'beacon' interval.

   o  The re-originations or beacons are distributed actions by prefix
      owners and jittered in time by design to reduce burstiness. The
      beacon interval can be different at different originating ASes.

   o  Beacon (or re-origination) interval granularity: TBD but
      preferably in fairly coarse units (days).

   o  Transit certs can have very large NotValidAfter time (say ~years).

   o  When a peering or policy change event occurs at a transit router,
      the router (i.e., BGPSEC router with PKR) does not perform any key
      rollover. The router re-evaluates best paths factoring in peering
      or policy changes, and propagates only those prefix routes that
      have a change in best path (similar to BGP-4). There is no
      necessity for the BGPSEC router to re-propagate and refresh
      prefixes on all peering links. This is because prefix updates are
      refreshed anyway once every re-origination (i.e., beaconing)
      interval by all prefix originators.

   Discussion of Pros and Cons:

   Several of the same pros/cons of the Expire Time method also apply
   here for the PKR method.

   Pro: The main pro for the PKR method is the same as that for the
   Expire Time (ET) method, namely being easy on transit routers as
   discussed in Section 4. Just as in the ET method, there is low
   steady-state traffic associated with periodic re-originations (i.e.,
   beaconing) (see Figure on slide #8 in [replay-discussion]), but there
   are no huge bursts or spikes in workload due to peering or policy
   change events at transit routers.
   (See comparisons with the EKR methods in Section 5.2.)

   Pro: The pro discussed above for the KR method regarding parameter
   changes (e.g., beacon interval units) not requiring change of
   protocol on the wire is naturally applicable here.

   Con: Churn in the RPKI is of concern. Every BGPSEC router rolls two
   origination certs (current and next) once in every beacon (i.e.,
   re-origination) interval.

5.2.  Event-driven Key Rollover Method

   The common details of the Event-driven Key Rollover (EKR) methods are
   as follows:

   o  Key rollover is reactive to events (not periodic).

   o  If a peering or policy change event involves only prefixes being
      originated at the AS of the router, then the router rolls only the
      origination key.

   o  If a peering change event involves transit prefixes at the AS of
      the router, then the router rolls the transit key as well as the
      origination key.

   o  If a key rollover takes place, then a corresponding (origination
      or transit) new 'next' cert is propagated in RPKI.

   Discussion of Pros and Cons:

   Pro: As long as no triggering events occur, there is no added update
   churn in BGPSEC.

   Con: Whenever the transit key is rolled, there is a storm of BGPSEC
   updates at routers in transit ASes. For example, consider a
   BGPSEC-capable transit AS5 that is connected to four BGPSEC non-stub
   customers (AS1, AS2, AS3, AS4). Assume each AS has a single BGPSEC
   router in it. AS1 through AS4 each receive an almost full table
   (400K signed prefix updates) from AS5. Assume also that AS1 and its
   customers together originate 100 prefixes in total; likewise for AS2,
   AS3 and AS4. Now consider that an event occurs whereby the peering
   between AS1 and AS5 is discontinued. As a result of this event, in
   the EKR method, the AS5 router signs and re-propagates approximately
   3x400K = 1.2 Million signed prefix updates to AS2, AS3 and AS4
   combined.
   In addition, it also sends 4x100 = 400 Withdraws, which are
   negligible. In comparison, in the PKR method, following the same
   event, the router at AS5 sends only 4x100 = 400 Withdraws and
   signs/re-propagates ZERO prefix updates. (An illustration can be
   found in slide #9 in [replay-discussion]. Also, additional peering
   change scenarios and quantitative comparisons can be found in slides
   #10 and #11 in [replay-discussion].)

   It remains to be seen through measurement and modeling how the impact
   of such large bursts of workload in the EKR method at the time of
   event occurrence can be managed in route processors, e.g., by
   jittering and throttling the workload.

5.2.1.  EKR-A: EKR where Update Expiry is Enforced by CRL

   EKR-A builds on the common principles as described for EKR above in
   Section 5.2. The additional details of EKR-A operation are as
   follows:

   o  NotValidAfter time of origination and transit certs is set to a
      large value (~year).

   o  Whenever key rollover (for origination or transit) occurs, a CRL
      is propagated for the old cert. So the old update expires (due to
      invalid state) only when the CRL propagates and reaches the
      relying router.

   o  This method relies on end-to-end CRL propagation through the RPKI
      system to enforce expiry of a previous update whenever the need
      arises.

   o  The cert CRL either propagates all the way to the relying router,
      or the RPKI cache server of the router receives the CRL and then
      sends a withdrawal of the {AS, SKI, Pub Key} tuple to the router.
      Either way, the CRL must in effect propagate all the way to the
      relying router.

   o  Thus the attack vulnerability window with the EKR-A method is
      governed by the end-to-end CRL propagation time.
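
   As a rough illustration of the EKR-A behavior listed above, the
   following Python sketch models a relying router that accepts an
   update only while every signing key is still in its set of valid
   {AS, SKI, Pub Key} tuples. The class and function names, the SKI
   strings, and the delay values are hypothetical and are not taken
   from any BGPSEC or RPKI implementation; the sketch only shows that
   the replay window at each router equals the time the CRL (or the
   resulting tuple withdrawal) takes to reach it.

```python
from dataclasses import dataclass, field


@dataclass
class RelyingRouter:
    """Toy model of a router fed by an RPKI cache (all names hypothetical)."""
    name: str
    crl_delay_hours: float  # assumed CRL propagation time to this router
    valid_skis: set = field(default_factory=set)

    def advance_to(self, hours_since_rollover: float, revoked_ski: str) -> None:
        # The tuple for the old cert is withdrawn only once the CRL has arrived.
        if hours_since_rollover >= self.crl_delay_hours:
            self.valid_skis.discard(revoked_ski)

    def accepts(self, signing_skis: list) -> bool:
        # An update is valid only if every signature's key is still valid.
        return all(ski in self.valid_skis for ski in signing_skis)


def replay_accepted(router: RelyingRouter, hours: float) -> bool:
    """Would a replayed update signed with the old key be accepted at `hours`?"""
    router.advance_to(hours, revoked_ski="AS1-old")
    return router.accepts(["AS1-old", "AS2-key"])


near = RelyingRouter("near", crl_delay_hours=2.0, valid_skis={"AS1-old", "AS2-key"})
far = RelyingRouter("far", crl_delay_hours=20.0, valid_skis={"AS1-old", "AS2-key"})

for r in (near, far):
    # Check the replayed update 1, 6, and 24 hours after the key rollover.
    print(r.name, [replay_accepted(r, h) for h in (1.0, 6.0, 24.0)])
```

   In this toy model the router with the longer assumed CRL delay keeps
   accepting the replayed update for longer, which is the
   router-to-router variation that the discussion of EKR-A is concerned
   with.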

   Discussion of Pros and Cons:

   The following pro and con for the EKR-A method are in addition to the
   common pros and cons listed above for the KR and EKR methods
   (Section 5 and Section 5.2).

   Pro: EKR-A has much less RPKI churn than PKR or EKR-B (see
   Section 5.2.2).

   Con: The router needs to receive a CRL or a withdrawal of the {AS,
   SKI, Pub Key} tuple in order to know that an update has expired.
   Hence, the replay-attack vulnerability window is determined by the
   CRL propagation time, which can vary widely from one relying router
   to another router that may be in a different region. It is
   anticipated that this would be no worse than 24 hours, but this needs
   to be confirmed by measurements in an operational or emulated RPKI
   system [rpki-delay].

5.2.2.  EKR-B: EKR where Update Expiry is Enforced by NotValidAfter Time

   EKR-B builds on the common principles as described for EKR above in
   Section 5.2. The additional details of EKR-B operation are as
   follows:

   o  NotValidAfter time of current origination and transit certs is set
      to a value determined by the desired vulnerability window (~day).

   o  Update expiry is controlled by NotValidAfter time, and a CRL is
      not sent for the old cert when key rollover happens.

   o  If no triggering event occurs to cause origination key rollover
      within a pre-set time (NotValidAfter), then new origination
      (current and next) certs are issued only to extend the
      NotValidAfter time but the corresponding key pairs and SKIs remain
      unchanged.

   o  A previous update automatically becomes invalid at the earliest
      NotValidAfter time of the certs used in the signatures unless each
      of those certs' NotValidAfter time has been extended.

   o  Likewise for the transit (current and next) certs and keys.

   o  Changes in certs to extend their NotValidAfter time need not
      propagate end-to-end (all the way to the relying routers); they
      may propagate only up to the RPKI cache server of the relying
      router. The RPKI cache server would send a withdrawal for an {AS,
      SKI, Pub Key} tuple to a relying router if the NotValidAfter time
      of the cert has passed.

   o  The changes in certs to advance NotValidAfter time can be
      scheduled and propagated in RPKI well in advance.

   Discussion of Pros and Cons:

   The following pros and con for EKR-B are in addition to the common
   pros and cons listed above for the KR and EKR methods (Section 5 and
   Section 5.2).

   Pro: Update expiry is automatic if the NotValidAfter time of any of
   the certs used to sign the update has not been extended. So the
   replay-attack vulnerability window is predictable and not influenced
   by the RPKI end-to-end propagation time.

   Pro: Routers do not get any RPKI updates from the RPKI cache server
   when a cert changes but the key pair and SKI remain unchanged.
   Routers do not receive NotValidAfter time from their RPKI cache
   server. There is no need for it. Instead, the RPKI cache server
   keeps track of NotValidAfter time, and provides to routers only valid
   {AS, SKI, Pub Key} tuples. This saves some RPKI state maintenance
   workload at the routers.

   Con: EKR-B has much more RPKI churn than EKR-A because both
   origination and transit certs need to be reissued periodically to
   extend their validity time (in the absence of any events).

5.2.3.  EKR with Separate Key for Each Incoming-Outgoing Peering-Pair

   This is a placeholder section where we mention another variant of
   the EKR method. This idea has not been considered or vetted by the
   SIDR WG yet. So we only mention it here briefly.

   As noted earlier, the EKR methods considered so far generate a huge
   spike in workload whenever the transit key rollover takes place at a
   router. One way to reduce that workload is to have a separate
   signing key for each incoming-outgoing peering pair. For example,
   consider a BGPSEC router in AS4 that has peers in AS1, AS2, and AS3.
   The router will hold six signing keys, one each corresponding to
   (AS1, AS2), (AS2, AS1), (AS1, AS3), (AS3, AS1), (AS2, AS3), and (AS3,
   AS2) peering-pairs. Note that the directionality of peering is
   included here and is necessary. The key corresponding to (AS-i,
   AS-j) would only be used to sign updates received from AS-i and being
   forwarded to AS-j. In the general case, when the BGPSEC router has n
   peers, the number of transit keys will be n(n-1). Since there would
   be a Current and a Next key (for rollover), the number of transit
   keys held in the router for signing will actually be 2n(n-1). When a
   peering or policy change occurs, the router would roll over only
   those specific keys that correspond to the peering-pairs over which
   the prefix updates are affected. In the above example, suppose a
   policy change between AS4 and AS1 causes AS4 to prepend prefixes sent
   to AS1 (pCount changed from 1 to 2). Then AS4 would do key rollover
   only for the (AS2, AS1) and (AS3, AS1) peering-pairs, and not for any
   of the others. This would substantially reduce the quantity of
   prefix updates that are signed and re-propagated. In general, when
   peering or policy changes occur, this method will reduce the number
   of prefix updates to be re-propagated to exactly the same as that
   with normal BGP. That means that this method would also be on par
   with the ET and PKR methods in terms of update churn when a peering
   or policy change takes place. The downside of this method is that
   the router needs to maintain 2n(n-1) key pairs if it has n BGPSEC
   peers.
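
   The key count and the selective rollover in the example above can be
   sketched in Python as follows. This is a minimal illustration only:
   the function names are hypothetical, peers are represented by plain
   strings, and the rule used to pick the affected pairs (all pairs
   whose outgoing peer is the one affected by the policy change) is an
   assumption that matches the prepending example in the text but may
   not cover every kind of peering or policy change.

```python
from itertools import permutations


def peering_pairs(peers):
    """All ordered (incoming, outgoing) peer pairs; n peers give n(n-1) pairs."""
    return list(permutations(peers, 2))


def signing_keys_held(peers):
    """Current plus Next key per ordered pair: 2n(n-1) keys in total."""
    return 2 * len(peering_pairs(peers))


def pairs_to_roll(peers, outgoing_peer):
    """Pairs whose updates are forwarded to `outgoing_peer` (assumed rule)."""
    return [(i, j) for (i, j) in peering_pairs(peers) if j == outgoing_peer]


peers_of_as4 = ["AS1", "AS2", "AS3"]
print(len(peering_pairs(peers_of_as4)))    # 6 ordered pairs
print(signing_keys_held(peers_of_as4))     # 12 keys (Current and Next)
print(pairs_to_roll(peers_of_as4, "AS1"))  # [('AS2', 'AS1'), ('AS3', 'AS1')]
```

   With three peers, only two of the six peering-pair keys are rolled
   in the prepending example, so only the updates forwarded to AS1 need
   to be re-signed.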

   Detailed discussion and comparison of this method with other methods
   can be provided in a later version of this document if the idea picks
   up interest in the WG.

6.  Summary of Pros and Cons

   Table 1 below summarizes the pros and cons for the various
   replay-attack protection methods. This summary follows from the
   discussion above in Section 4 and Section 5.

   +----------+---------------------------+----------------------------+
   | Method   | Pros                      | Cons                       |
   +----------+---------------------------+----------------------------+
   | Expire   | 1. The background load    | 1. Prefix owner can abuse  |
   | Time     | due to beaconing is low   | by beaconing too           |
   | (ET)     | and not bursty.           | frequently.                |
   |          | ---                       | ---                        |
   |          | 2. Transit AS does NOT    | 2. Any change to the units |
   |          | have a huge spike in      | (granularity) of ET field  |
   |          | workload even when a      | entails a change to on-    |
   |          | peering or policy change  | the-wire BGPSEC protocol.  |
   |          | happens at that AS.       |                            |
   |          | Beaconing facilitates     |                            |
   |          | this.                     |                            |
   |          | ---                       | ---                        |
   |          | 3. Does not add to RPKI   |                            |
   |          | churn.                    |                            |
   | -------- | ------------------------- | -------------------------- |
   | Periodic | 1. The background load    | 1. Prefix owner can abuse  |
   | Key      | due to beaconing is low   | by beaconing (i.e. re-     |
   | Rollover | and not bursty.           | originating) too           |
   | (PKR)    |                           | frequently.                |
   |          | ---                       | ---                        |
   |          | 2. Transit AS does NOT    | 2. Adds to RPKI churn. A   |
   |          | have a huge spike in      | pair of certs (current and |
   |          | workload even when a      | next) for each origination |
   |          | peering change happens at | router are rolled once     |
   |          | that AS. Beaconing (i.e.  | every beacon (i.e. re-     |
   |          | periodic re-origination)  | origination) interval.     |
   |          | facilitates this.         | Significantly more RPKI    |
   |          |                           | churn than that with EKR-A |
   |          |                           | or EKR-B methods.          |
   |          | ---                       | ---                        |
   |          | 3. If the periodic re-    |                            |
   |          | origination (i.e.,        |                            |
   |          | beaconing) interval units |                            |
   |          | change, BGPSEC protocol   |                            |
   |          | on the wire remains       |                            |
   |          | unaffected.               |                            |
   |          | ---                       | ---                        |
   |          | 4. Changes in the method  |                            |
   |          | (while still based on Key |                            |
   |          | Rollover) can be          |                            |
   |          | accommodated without      |                            |
   |          | requiring any change to   |                            |
   |          | on-the-wire BGPSEC        |                            |
   |          | protocol.                 |                            |
   | -------- | ------------------------- | -------------------------- |
   | Event    | 1. No update churn for    | 1. Whenever the transit    |
   | driven   | long periods when no      | key is rolled (in response |
   | Key      | peering or policy changes | to a peering or policy     |
   | Rollover | occur.                    | change event), there is a  |
   | Type A   |                           | storm of BGPSEC updates,   |
   | (EKR-A)  |                           | especially at routers in   |
   |          |                           | large transit ASes.        |
   |          | ---                       | ---                        |
   |          | 2. The added churn in     | 2. The replay-attack       |
   |          | RPKI is much lower than   | vulnerability window is    |
   |          | that in the EKR-B method. | dependent on end-to-end    |
   |          |                           | CRL propagation. It may    |
   |          |                           | vary significantly from    |
   |          |                           | one relying router to      |
   |          |                           | another that may be in     |
   |          |                           | different regions.         |
   |          | ---                       | ---                        |
   |          | 3. Same as Pro #4 for the |                            |
   |          | PKR method.               |                            |
   | -------- | ------------------------- | -------------------------- |
   | Event    | 1. Same as Pro #1 for the | 1. Same as Con #1 for the  |
   | driven   | EKR-A method.             | EKR-A method.              |
   | Key      |                           |                            |
   | Rollover |                           |                            |
   | Type B   |                           |                            |
   | (EKR-B)  |                           |                            |
   |          | ---                       | ---                        |
   |          | 2. The replay-attack      | 2. The added churn in RPKI |
   |          | vulnerability window is   | is much higher than that   |
   |          | enforced by NotValidAfter | in the EKR-A method.       |
   |          | time in certs and is      |                            |
   |          | therefore predictable.    |                            |
   |          | ---                       | ---                        |
   |          | 3. Same as Pro #4 for the |                            |
   |          | PKR method.               |                            |
   +----------+---------------------------+----------------------------+

                   Table 1: Summary of Pros and Cons

7.  Summary and Conclusions

   We have attempted to provide insights into the operation of multiple
   alternative methods for replay-attack protection. It is hoped that
   the SIDR WG will take the insights and trade-offs presented here as
   input for deciding on the choice of a mechanism for protection from
   replay attacks. Once that decision is made, the chosen mechanism
   would be included in the standards track document [bgpsec-rollover].

   Some important considerations for the decision making can be listed
   as follows:

   1.  The Expire Time (ET) method is best (on par with the PKR method)
       in terms of preventing huge update workloads during peering and
       policy change events at transit routers with several peers. It
       has no added RPKI churn. But the ET method has the disadvantage
       of requiring on-the-wire protocol change if some parameters
       (e.g., the units of beacon interval) change.

   2.  The Periodic Key Rollover (PKR) method operates the same way as
       the ET method for preventing huge update workloads during peering
       and policy change events at transit routers with several peers.
       It does not have the disadvantage of requiring on-the-wire
       protocol change if some parameters (e.g., the units of
       beaconing/re-origination periodicity) change. But it has the
       downside of added RPKI churn.

   3.  The Event-driven Key Rollover (EKR-A and EKR-B) methods have
       significantly less RPKI churn than the PKR method. They also
       have no BGPSEC update churn during long quiet periods when no
       peering or policy change events occur. But they suffer the
       drawback of creating huge update workloads during peering and
       policy change events at transit routers with several peers.
       Can this workload be jittered or flow-controlled to spread it
       over time without raising convergence delay concerns? Maybe;
       this needs further study.

   4.  The EKR-A method relies on end-to-end CRL propagation through the
       RPKI system to enforce expiry of a previous update when needed.
       By contrast, in the EKR-B method the update expiry is controlled
       by NotValidAfter time of the certs used in update signatures. In
       EKR-B, a previous update automatically becomes invalid at the
       earliest NotValidAfter time of the certs used in the signatures
       unless each of those certs' NotValidAfter time has been extended.
       In the latter method, changes in certs to extend their
       NotValidAfter time need not propagate end-to-end (all the way to
       the relying routers); they may propagate only up to the RPKI
       cache server of the relying router (see Section 5.2.2). The
       changes in certs to advance NotValidAfter time can be scheduled
       and propagated in RPKI well in advance.

   5.  Besides being out-of-band relative to the BGPSEC protocol on the
       wire, the other advantage of the Key Rollover method is that
       once the basics of the mechanism are implemented, there may be
       flexibility to implement PKR, EKR-A or EKR-B on top of it. It
       may also be possible to switch from one method to another (within
       this class) if necessary based on operational experience; this
       transition would not require any change to on-the-wire BGPSEC
       protocol.

8.  Acknowledgements

   The authors would like to thank Roque Gagliano, Brian Weis and Steve
   Kent for helpful discussions. Further, we are thankful to fellow
   NIST BGP team members for comments and suggestions.

9.  IANA Considerations

   This memo includes no request to IANA.

10.  Security Considerations

   This memo requires no security considerations of its own since it is
   targeted to be an informational RFC in support of [bgpsec-rollover]
   and [bgpsec-protocol].
   The reader is therefore directed to the security considerations
   provided in those documents.

11.  Informative References

   [bgpsec-protocol]
              Lepinski (Ed.), M., "BGPSEC Protocol Specification", Work
              in Progress, February 2013.

   [bgpsec-reqs]
              Bellovin, S., Bush, R., and D. Ward, "Security
              Requirements for BGP Path Validation", Work in Progress,
              April 2013.

   [bgpsec-rollover]
              Gagliano, R., Patel, K., and B. Weis, "BGPSEC router key
              rollover as an alternative to beaconing", Work in
              Progress, April 2013.

   [replay-discussion]
              Sriram, K. and D. Montgomery, "Discussion of Key Rollover
              Mechanisms for Replay-Attack Protection", Presented at
              IETF-85 SIDR WG Meeting, November 2012.

   [rpki-delay]
              Kent, S. and K. Sriram, "RPKI rsync Download Delay
              Modeling", Presented at IETF-86 SIDR WG Meeting, March
              2013.

Authors' Addresses

   Kotikalapudi Sriram
   US NIST

   Email: ksriram@nist.gov

   Doug Montgomery
   US NIST

   Email: dougm@nist.gov