idnits 2.17.1 

draft-sriram-replay-protection-design-discussion-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 18, 2016) is 2747 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-23) exists of
     draft-ietf-sidr-bgpsec-protocol-18

  == Outdated reference: A later version (-06) exists of
     draft-ietf-sidr-bgpsec-rollover-05


     Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Secure Inter-Domain Routing                                    K. Sriram
3	Internet-Draft                                             D. Montgomery
4	Intended status: Informational                                   US NIST
5	Expires: April 21, 2017                                 October 18, 2016

7	  Design Discussion and Comparison of Protection Mechanisms for Replay
8	              Attack and Withdrawal Suppression in BGPsec
9	          draft-sriram-replay-protection-design-discussion-07

11	Abstract

13	   In the context of BGPsec, a withdrawal suppression occurs when an
14	   adversary AS suppresses a prefix withdrawal with the intention of
15	   continuing to attract traffic for that prefix based on a previous
16	   (signed and valid) BGPsec announcement that was earlier propagated.
17	   Subsequently, if the adversary AS has a BGPsec session reset with a
18	   neighboring BGPsec speaker and, when the session is restored, the AS
19	   replays said previous BGPsec announcement (even though it was
20	   withdrawn), then such a replay action is called a replay attack.  The
21	   BGPsec protocol should incorporate a method for protection from
22	   Replay Attack and Withdrawal Suppression (RAWS), at least to control
23	   the window of exposure.  This informational document provides design
24	   discussion and comparison of multiple alternative RAWS protection
25	   mechanisms weighing their pros and cons.  This is meant to be a
26	   companion document to the standards track document
27	   I-D.ietf-sidr-bgpsec-rollover that will specify a method to be used
28	   with BGPsec for RAWS protection.

30	Status of This Memo

32	   This Internet-Draft is submitted in full conformance with the
33	   provisions of BCP 78 and BCP 79.

35	   Internet-Drafts are working documents of the Internet Engineering
36	   Task Force (IETF).  Note that other groups may also distribute
37	   working documents as Internet-Drafts.  The list of current Internet-
38	   Drafts is at http://datatracker.ietf.org/drafts/current/.

40	   Internet-Drafts are draft documents valid for a maximum of six months
41	   and may be updated, replaced, or obsoleted by other documents at any
42	   time.  It is inappropriate to use Internet-Drafts as reference
43	   material or to cite them other than as "work in progress."

45	   This Internet-Draft will expire on April 21, 2017.

47	Copyright Notice

49	   Copyright (c) 2016 IETF Trust and the persons identified as the
50	   document authors.  All rights reserved.

52	   This document is subject to BCP 78 and the IETF Trust's Legal
53	   Provisions Relating to IETF Documents
54	   (http://trustee.ietf.org/license-info) in effect on the date of
55	   publication of this document.  Please review these documents
56	   carefully, as they describe your rights and restrictions with respect
57	   to this document.  Code Components extracted from this document must
58	   include Simplified BSD License text as described in Section 4.e of
59	   the Trust Legal Provisions and are provided without warranty as
60	   described in the Simplified BSD License.

62	Table of Contents

64	   1.  Introduction
65	   2.  Description and Scenarios of Replay Attacks and Withdrawal
66	       Suppression
67	   3.  Classification of Solutions
68	   4.  Expiration Time Method
69	   5.  Key Rollover Method
70	     5.1.  Periodic Key Rollover Method
71	     5.2.  Event-driven Key Rollover Method
72	       5.2.1.  EKR-A: EKR where Update Expiry is Enforced by CRL
73	       5.2.2.  EKR-B: EKR where Update Expiry is Enforced by
74	               NotAfter Time
75	       5.2.3.  EKR with Separate Key for Each Incoming-Outgoing
76	               Peering-Pair
77	   6.  Summary of Pros and Cons
78	   7.  Summary and Conclusions
79	   8.  Acknowledgements
80	   9.  IANA Considerations
81	   10. Security Considerations
82	   11. Informative References
83	   Authors' Addresses

85	1.  Introduction

87	   In BGP or BGPsec, prefix or route withdrawals happen, and a
88	   withdrawal can be explicit (i.e., the route is simply withdrawn) or
89	   implicit (i.e., a new route announcement replaces the previous one).
90	   In the context of BGPsec, a withdrawal suppression occurs when an
91	   adversary AS suppresses a prefix withdrawal with the intention of
92	   continuing to attract traffic for that prefix based on a previous
93	   (signed and valid) BGPsec announcement that was earlier propagated.
94	   Subsequently, if the adversary AS has a BGPsec session reset with a
95	   neighboring BGPsec speaker and, when the session is restored, the AS
96	   replays said previous BGPsec announcement (even though it was
97	   withdrawn), then such a replay action is called a replay attack.  The
98	   BGPsec protocol [I-D.ietf-sidr-bgpsec-protocol] requires a method for
99	   protection from Replay Attack and Withdrawal Suppression (RAWS), at
100	   least to control the window of exposure (see Sections 4.3 and 4.4 of
101	   [RFC7353]).

102	   In this informational document, we provide design discussion and
103	   comparison of various RAWS protection mechanisms that may be used in
104	   conjunction with the BGPsec protocol.  This is meant to be a
105	   companion document to the standards track document
106	   [I-D.ietf-sidr-bgpsec-rollover] that will specify a method to be used
107	   with BGPsec for RAWS protection.  Here we consider four alternative
108	   mechanisms - one based on the explicit Expiration Time approach and
109	   three variants based on the Key Rollover approach.  We provide a
110	   detailed comparison among these mechanisms, weighing their pros and
111	   cons.  This document is meant to help inform the decision process
112	   leading to an exact description for the mechanism to be finalized and
113	   formally specified in [I-D.ietf-sidr-bgpsec-rollover].

115	2.  Description and Scenarios of Replay Attacks and Withdrawal
116	    Suppression

118	   The following are examples of various forms of replay attack and
119	   withdrawal suppression (RAWS):

121	   Example 1: AS1 has AS2 and AS3 as eBGPsec peers.  At time x, AS1 had
122	   announced a prefix (P) to AS2 and AS3.  At a later time (x+d), AS1
123	   sends a Withdraw for prefix P to AS2.  AS2 suppresses the Withdraw
124	   (does not send to its peers any explicit or implicit Withdraw).  AS2
125	   continues to attract some of the data for prefix P by pretending to
126	   still have a valid (signed) route for P.  In effect, AS2 can conduct
127	   a Denial of Service (DoS) attack on a server located at prefix P.
128	   (See slide #15 in [RAWS-discussion] for an illustration.)

130	   Example 2: AS1 has AS2 and AS3 as eBGPsec peers.  AS2 and AS3 are
131	   also eBGPsec peers.  At time x, AS1 announced a prefix P to AS2 and
132	   AS3.  AS3 also propagates to AS2 its route (via AS1) for prefix P.
133	   At a later time (x+d), AS1 discontinues its peering with AS2.  AS2
134	   should propagate an alternate longer path via AS3 for prefix P and
135	   thus implicitly withdraw the route via AS1.  However, AS2 suppresses
136	   it.  AS2 can thus cause some traffic destined for prefix P to flow
137	   via itself.  This enables AS2 to eavesdrop on the data without
138	   necessarily causing a DoS attack.  AS2 may also choose to conduct a
139	   DoS attack on hosts in prefix P.  (See slide #16 in [RAWS-discussion]
140	   for an illustration.)

141	   Example 3: AS1 has AS2 and AS3 as eBGPsec peers.  AS2 and AS3 are
142	   also eBGPsec peers.  At time x, AS1 announced a prefix P to AS2
143	   without prepending (Update: AS1{pCount=1} P) but announced the same
144	   prefix to AS3 with prepending (Update: AS1{pCount=2} P).  Thus AS1
145	   had preferred its ingress data traffic for prefix P to come in via
146	   AS2.  At a later time (x+d), AS1 switches ingress data path
147	   preference to AS3 over AS2 - announcing prefix P to AS3 without
148	   prepending (Update: AS1{pCount=1} P) and to AS2 with prepending
149	   (Update: AS1{pCount=2} P).  AS2 suppresses the new prepended path
150	   announcement (does not send to its peers any new update about P).
151	   Thus AS2 continues to attract more of AS1's ingress data traffic and
152	   generates more revenue for itself at the expense of AS1.  (See slide
153	   #17 in [RAWS-discussion] for an illustration.)

155	   As illustrated above, the mechanisms and motivations for RAWS may
156	   differ.

158	   In the context of the examples mentioned above, a requirement for
159	   RAWS protection can be stated as follows.  An update that AS1 sends
160	   to AS2 at time x should expire at time x+w.  This capability would
161	   allow other ASes to detect actions by AS2 to suppress the Withdraw or
162	   replay the update from AS1 for prefix P after time x+w.  This limits
163	   the RAWS vulnerability window.  (Note: If no peering or policy change
164	   affecting prefix P occurs during the vulnerability window, then a
165	   typical solution would include a method for extending the validity
166	   period of the route(s) beyond x+w.)  We will later discuss what a
167	   reasonable window size, w, should be.
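
   The expiry requirement stated above can be illustrated with a small
   sketch.  This is only an illustration under stated assumptions: the
   names SignedUpdate, is_acceptable, and refresh are hypothetical, and
   times are plain numbers in seconds rather than any BGPsec encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedUpdate:
    """Hypothetical model of an update sent at time x with window w."""
    prefix: str
    origin_as: int
    sent_at: float   # time x, in seconds
    window: float    # window size w, in seconds

    def expires_at(self) -> float:
        # The update expires at time x + w.
        return self.sent_at + self.window


def is_acceptable(update: SignedUpdate, now: float) -> bool:
    """Return True while 'now' is still inside the window (now < x + w)."""
    return now < update.expires_at()


def refresh(update: SignedUpdate, now: float) -> SignedUpdate:
    """Model the origin sending a new update, which restarts the window."""
    return SignedUpdate(update.prefix, update.origin_as, now, update.window)
```

   A suppressed Withdraw or a replayed update is detectable once
   is_acceptable returns False, which bounds the exposure to w.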

169	   The obvious downside of any mechanism that supports this capability
170	   is that it will require AS1 to send a new update before time x+w, and
171	   this update will need to propagate via all the paths that the
172	   original update traversed.  Thus more update traffic will result than
173	   if the RAWS protection mechanism were not employed, and this traffic
174	   will require cryptographic processing by all of the routers along the
175	   paths.  Thus the creation of a mechanism to counter RAWS attacks
176	   potentially introduces a new opportunity for DoS attacks against
177	   eBGPsec routers.

179	3.  Classification of Solutions

181	   Mechanisms for RAWS protection can be classified into two broad
182	   categories as follows:

184	   o  Expiration Time (ET) Method: This method uses an explicit
185	      Expiration Time field in the BGPsec update.  (Note: An explicit
186	      Expire Time field was included in an earlier version of the BGPsec
187	      protocol specification [draft-ietf-sidr-bgpsec-protocol-01].)

189	   o  Key Rollover (KR) Method: In this method, the update expiration is
190	      enforced by a key rollover.  The router transitions to a new
191	      certificate with a new pair of keys, and the previous router
192	      certificate either expires or is revoked.

194	   The Key Rollover method can be further classified into the
195	   following subcategories:

197	   o  Periodic Key Rollover (PKR): Key rollovers happen at periodic
198	      intervals.

200	   o  Event-driven Key Rollover (EKR): Key rollovers happen only when
201	      peering or policy change events occur.

203	      *  EKR-A: EKR where expiry of previous update is enforced by CRL.

205	      *  EKR-B: EKR where expiry of previous update is controlled by
206	         NotAfter time (router certificate is not revoked at the time
207	         when the event happens).

209	   In Section 4, Section 5, and Section 6 we describe the various
210	   methods listed above, and discuss their pros and cons.

212	4.  Expiration Time Method

214	   The details of the Expiration Time (ET) method are as follows:

216	   o  An explicit Expiration Time is used for the origin's signature.

218	   o  An Expiration Time field is required in the BGPsec update.

220	   o  Periodic re-origination (beaconing) of prefixes is performed by
221	      origin ASes.  The value in the ET field in the update is extended
222	      at beaconing time, and thereby the update is refreshed.  Every
223	      prefix in the Internet is re-originated and propagates through the
224	      Internet once every 'beacon' interval.

226	   o  These beacons are distributed actions by prefix owners and are
227	      intended to be jittered in time to reduce burstiness.  The beacon
228	      interval can be different at each originating AS.

230	   o  Beacon interval granularity: TBD but preferably in fairly coarse
231	      units (days).  It is important to limit the ability of each AS to
232	      specify a short beacon interval, to prevent an AS from using this
233	      mechanism to cause BGPsec to thrash.
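
   The beaconing behavior listed above can be sketched as follows.
   This is a minimal sketch, not a specification: the one-day lower
   bound (MIN_BEACON_DAYS), the jitter_fraction parameter, and the
   function names are assumptions made for illustration, since the
   beacon interval unit is still TBD.

```python
import random

# Assumed lower bound on the beacon interval, in days.
MIN_BEACON_DAYS = 1


def clamp_interval(requested_days: int) -> int:
    """Refuse intervals shorter than the assumed minimum unit."""
    return max(MIN_BEACON_DAYS, requested_days)


def expiration_day(beacon_day: float, interval_days: int) -> float:
    """ET value carried in the update: one interval past the beacon."""
    return beacon_day + clamp_interval(interval_days)


def next_beacon_day(last_beacon_day: float, interval_days: int,
                    rng: random.Random,
                    jitter_fraction: float = 0.1) -> float:
    """Schedule the next beacon slightly early, by a random amount.

    The jitter only moves the beacon earlier, so the refreshed update
    is sent no later than the ET of the previous one.
    """
    interval = clamp_interval(interval_days)
    jitter = rng.uniform(0.0, jitter_fraction) * interval
    return last_beacon_day + interval - jitter
```

   Jittering only toward earlier times is a design choice of this
   sketch; it keeps each re-origination inside the validity of the
   update it replaces while still spreading beacons in time.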

235	   Discussion of Pros and Cons:

237	   Pro: This method is easy on transit routers.  In the event of peering
238	   or policy change, BGPsec with the ET method behaves the same way as
239	   BGP-4 in terms of which prefix routes are propagated.  That is, the
240	   router re-evaluates best paths factoring in peering or policy
241	   changes, and propagates only those prefix routes that have a change
242	   in best path.  In other words, there is no necessity for a transit
243	   BGPsec router to re-propagate and refresh prefixes on all peering
244	   links.  This is because prefix updates are refreshed anyway once
245	   every beacon interval by all prefix originators.  There is low
246	   steady-state traffic associated with beaconing (see Figure on slide
247	   #8 in [RAWS-discussion]), but there are no huge bursts or spikes in
248	   workload due to peering or policy change events at transit routers.

250	   Con: An equipment vendor can potentially facilitate unnecessarily
251	   frequent beaconing if an ISP urges it to and pays for it (a "dollar
252	   attack").  This possibility is mitigated by having a well thought-out
253	   granularity for ET, for example, setting the unit for advertising ET
254	   to one day (rather than one minute).

256	   Con: A change in the on-the-wire BGPsec protocol would be needed if
257	   the unit (granularity) of the ET field needs to be changed.

259	5.  Key Rollover Method

261	   The Key Rollover (KR) method has three variations as outlined in
262	   Section 3.  Those will be discussed later in this section.  The
263	   following features are common to all variants of the KR method:

265	   o  In the KR method, it is best if the BGPsec router has two pairs of
266	      certificates as follows: A pair of origination certificates
267	      (current and next) for signing prefixes being originated by the AS
268	      of the router, and a pair of transit certificates (current and
269	      next) for signing transit prefixes.

271	   o  Note: If a BGPsec router only originates prefixes (i.e. has no
272	      transit prefixes), then it needs to maintain only a pair of
273	      origination certificates and need not maintain the extra pair of
274	      transit certificates.  (This would be the case for the vast
275	      majority of ASes, since most are stubs.)

277	   o  The three KR methods differ in how the rollover of certificates
278	      (or keys) is done:

280	      *  Certificate rollovers are Periodic vs. Event-driven.

282	      *  In the Event-driven method, the expiry of the old update is (A)
283	         enforced by CRL vs. (B) controlled by NotAfter time.

285	      *  In (A), the certificate's NotAfter field is set to a very large
286	         value and a CRL is issued to revoke the certificate when
287	         necessary.  In (B), the NotAfter field is set to a permissible
288	         vulnerability window time, and a CRL to revoke the certificate
289	         is not required.

291	   Discussion of Pros and Cons (common to all Key Rollover methods):

293	   Pro: The KR method functions by manipulating the RPKI objects
294	   (certificates, keys, NotAfter field in certificate, etc.) to refresh
295	   updates or to cause expiry of previously propagated updates.  Unlike
296	   the ET method, it does not rely on any explicit field in the update.
297	   Hence, an advantage of the KR method over the ET method is that in
298	   case any parameters need to change or if the method itself is
299	   modified, then there is no impact on the BGPsec protocol on the wire.

301	   Con: The KR method increases the number of objects in the RPKI
302	   repository system, by requiring at least two certificates for every
303	   transit AS.  It also introduces additional churn in the global RPKI
304	   as these certificates expire (or are revoked) and are replaced.

306	   Con: There is also added update churn.  The amount of update churn
307	   varies depending on the type of KR method used (see Section 5.1 and
308	   Section 5.2).

310	   We will now describe and discuss in detail the variants of the KR
311	   method.

313	5.1.  Periodic Key Rollover Method

315	   The details of the Periodic Key Rollover (PKR) method are as follows:

317	   o  The NotAfter time of the router's origination certificate is used
318	      effectively as the expiration time for the origin's signature.

320	   o  Each origination router re-originates (i.e. beacons) before
321	      NotAfter time of the current origination certificate.  Beaconing
322	      is periodic re-origination of prefixes by origin ASes.

324	   o  At beaconing time, the next origination certificate becomes the
325	      new current certificate, and the new update is signed with the
326	      private key of this new current certificate and re-originated.

328	   o  A new 'next' origination certificate is created and propagated at
329	      or before beaconing time.  This can also be done with a good lead
330	      time.  In practice, multiple 'next' certificates for each router
331	      could be propagated and kept in the RPKI repositories.
332	      They must have contiguous or slightly overlapping validity
333	      periods.

335	   o  Every prefix in the Internet is re-originated and propagates
336	      through the Internet once every 'beacon' interval.

338	   o  The re-originations or beacons are distributed actions by prefix
339	      owners and jittered in time by design to reduce burstiness.  The
340	      beacon interval can be different at different originating ASes.

342	   o  Beacon (or re-origination) interval granularity: TBD but
343	      preferably in fairly coarse units (days).

345	   o  Transit certificates can have a large NotAfter time (e.g.,
346	      whatever duration is normally required for key maintenance).

348	   o  When a peering or policy change event occurs at a transit router,
349	      the router does not perform any reactive key rollover.  The router
350	      re-evaluates best paths factoring in peering or policy changes,
351	      and propagates only those prefix routes that have a change in best
352	      path (similar to BGP-4).  There is no necessity for the BGPsec
353	      router to re-propagate and refresh prefixes on all peering links.
354	      This is because prefix updates are refreshed anyway once every re-
355	      origination (i.e. beaconing) interval by all prefix originators.
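
   The current/next certificate handling at beaconing time can be
   sketched as below.  The classes RouterCert and OriginationKeys are
   hypothetical, validity times are integers in days, and the check
   only models the "contiguous or slightly overlapping" requirement;
   it is not an RPKI implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouterCert:
    """Hypothetical stand-in for a router origination certificate."""
    ski: str
    not_before: int  # day on which validity starts
    not_after: int   # day on which validity ends


class OriginationKeys:
    """Holds the 'current' and 'next' origination certificates."""

    def __init__(self, current: RouterCert, nxt: RouterCert):
        self._check_contiguous(current, nxt)
        self.current = current
        self.next = nxt

    @staticmethod
    def _check_contiguous(earlier: RouterCert, later: RouterCert) -> None:
        # Validity periods must be contiguous or overlapping (no gap).
        if later.not_before > earlier.not_after:
            raise ValueError("gap between certificate validity periods")

    def roll(self, new_next: RouterCert) -> RouterCert:
        """At beaconing time, promote 'next' and install a new 'next'.

        Returns the certificate whose key signs the re-originated update.
        """
        self._check_contiguous(self.next, new_next)
        self.current, self.next = self.next, new_next
        return self.current
```

   Transit certificates are deliberately left out of this sketch,
   since in the PKR method they are not rolled at beaconing time.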

357	   Discussion of Pros and Cons:

359	   Several of the same pros/cons of the Expiration Time method also
360	   apply here for the PKR method.

362	   Pro: The main pro for the PKR method is the same as that for the
363	   Expiration Time (ET) method.  That is, being easy on transit routers
364	   as discussed in Section 4.  Just as in the ET method, there is low
365	   steady-state traffic associated with periodic re-originations (i.e.
366	   beaconing) (see Figure on slide #8 in [RAWS-discussion]), but there
367	   are no huge bursts or spikes in workload due to peering or policy
368	   change events at transit routers.  (See comparisons with the EKR
369	   methods in Section 5.2.)

371	   Pro: The common pro discussed previously for all KR methods, namely,
372	   not requiring change of protocol on the wire when a parameter change
373	   occurs (e.g., change of beacon interval units) is naturally
374	   applicable here.

376	   Con: Churn in the RPKI is of concern.  Every BGPsec router renews and
377	   propagates its 'next' origination certificate once in every beacon
378	   (i.e. re-origination) interval.

380	5.2.  Event-driven Key Rollover Method

382	   The common details of the Event-driven Key Rollover (EKR) methods are
383	   as follows:

385	   o  Key rollover is reactive to events (not periodic).

387	   o  If a peering or policy change event involves only prefixes being
388	      originated at the AS of the router, then the router rolls only the
389	      origination key.

391	   o  If a peering change event involves transit prefixes at the AS of
392	      the router, then the router rolls its transit key as well as the
393	      origination key.  Both keys are rolled because any peering
394	      relationship change also requires refresh of prefixes originated
395	      by the router.

397	   o  If a key rollover takes place, then a corresponding (origination
398	      or transit) new 'next' certificate is propagated in RPKI.

400	   Discussion of Pros and Cons:

402	   Pro: As long as no triggering events occur, there is no added update
403	   churn in BGPsec.

405	   Con: Whenever the transit key is rolled, there is a storm of BGPsec
406	   updates at routers in transit ASes.  For example, consider a
407	   BGPsec-capable transit AS5 that is connected to four BGPsec non-stub
408	   customers (AS1, AS2, AS3, AS4).  Assume each AS has a single BGPsec
409	   router in it.  AS1 through AS4 each receive an almost full table
410	   (approximately 600K signed prefix updates) from AS5.  Assume also
411	   that AS1 and its customers together originate 100 prefixes in total;
412	   likewise for AS2, AS3 and AS4.  Now consider that an event occurs
413	   whereby the peering between AS1 and AS5 is discontinued.  As a result
414	   of this event, in the EKR method, the AS5 router signs and re-
415	   propagates approximately 3x600K = 1.8 Million signed prefix updates
416	   to AS2, AS3 and AS4 combined.  In addition, it also sends 4x100 = 400
417	   Withdraws, which are negligible.  In comparison, in the PKR method,
418	   reacting to the same event, the BGPsec router at AS5 sends only 4x100
419	   = 400 Withdraws and signs/re-propagates ZERO prefix updates.  (An
420	   illustration can be found in slide #9 in [RAWS-discussion].  Also,
421	   additional peering change scenarios and quantitative comparisons can
422	   be found in slides #10 and #11 in [RAWS-discussion].)
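
   The arithmetic of the AS5 example can be reproduced with a short
   sketch.  The constants come from the example above; the function
   name resigned_updates and the string labels "EKR" and "PKR" are
   assumptions used only for this illustration.

```python
FULL_TABLE = 600_000   # approx. signed prefix updates sent to each customer
CUSTOMERS = 4          # AS1 through AS4
WITHDRAWS = 4 * 100    # Withdraw count as given in the example text


def resigned_updates(method: str, remaining_customers: int,
                     table_size: int) -> int:
    """Signed updates AS5 must re-propagate after losing one peering."""
    if method == "EKR":
        # The transit key is rolled, so the full table is re-signed
        # and re-sent to every remaining customer.
        return remaining_customers * table_size
    if method == "PKR":
        # No reactive rollover; only best-path changes are propagated.
        return 0
    raise ValueError(f"unknown method: {method}")


ekr_load = resigned_updates("EKR", CUSTOMERS - 1, FULL_TABLE)
pkr_load = resigned_updates("PKR", CUSTOMERS - 1, FULL_TABLE)
```

   With these inputs, ekr_load is 1,800,000 and pkr_load is 0, while
   the 400 Withdraws are sent in both cases.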

424	   It remains to be seen through measurement and modeling how the impact
425	   of such large bursts of workload in the EKR method at the time of
426	   event occurrence can be managed in route processors, e.g., by
427	   jittering and throttling the workload.

429	5.2.1.  EKR-A: EKR where Update Expiry is Enforced by CRL

431	   EKR-A builds on the common principles as described for EKR above in
432	   Section 5.2.  The additional details of EKR-A operation are as
433	   follows:

435	   o  NotAfter time of origination and transit certificates is set to a
436	      large value (e.g., one year or whatever period needed for normal
437	      key maintenance).

439	   o  Whenever key rollover (for origination or transit) occurs, then a
440	      CRL is propagated for the certificate that was used until that
441	      time.  So the old update expires (due to invalid state) only when
442	      the CRL propagates and reaches each relying router.

444	   o  This method relies on end-to-end CRL propagation through the RPKI
445	      system to enforce expiry of a previous update whenever the need
446	      arises.

448	   o  The CRL either propagates all the way to the relying router, or
449	      the RPKI cache server of the router receives the CRL and then
450	      sends a withdrawal of the {AS, SKI, Pub Key} tuple to the router.
451	      Either way, the CRL must in effect propagate all the way to the
452	      relying router.

454	   o  Thus the attack vulnerability window with the EKR-A method is
455	      governed by the end-to-end CRL propagation time.
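
   The router-side effect of EKR-A can be sketched as follows.  The
   class RouterKeyTable and the function exposure_window are
   hypothetical; they model a router that validates only with the
   {AS, SKI, Pub Key} tuples it currently holds, and delays are plain
   numbers in hours supplied by the caller, not measured values.

```python
class RouterKeyTable:
    """Hypothetical table of valid {AS, SKI, Pub Key} tuples."""

    def __init__(self):
        self._keys = {}  # (asn, ski) -> public key bytes

    def announce(self, asn: int, ski: str, pub_key: bytes) -> None:
        self._keys[(asn, ski)] = pub_key

    def withdraw(self, asn: int, ski: str) -> None:
        # Sent by the RPKI cache server once it has processed the CRL.
        self._keys.pop((asn, ski), None)

    def can_validate(self, asn: int, ski: str) -> bool:
        """An old update stays verifiable until the tuple is withdrawn."""
        return (asn, ski) in self._keys


def exposure_window(crl_delay_by_router: dict) -> float:
    """Worst case: the slowest relying router bounds the exposure."""
    return max(crl_delay_by_router.values())
```

   The sketch makes the dependency explicit: until withdraw is called
   on a given router, that router still accepts the old update.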

457	   Discussion of Pros and Cons:

459	   The following pro and con for the EKR-A method are in addition to the
460	   common pros and cons listed above for the KR and EKR methods
461	   (Section 5 and Section 5.2).

463	   Pro: EKR-A has much less RPKI churn than PKR or EKR-B (see
464	   Section 5.2.2).

466	   Con: A router needs to receive a CRL or a withdrawal of the {AS, SKI,
467	   Pub Key} tuple in order to know that an update has expired.  Hence,
468	   the RAWS vulnerability window is determined by the CRL propagation
469	   time, which can vary widely from one relying router to another that
470	   may be in a different region.  It is anticipated that this would be
471	   no worse than 24 hours, but this needs to be confirmed by
472	   measurements in an operational or emulated RPKI system [rpki-delay].

474	5.2.2.  EKR-B: EKR where Update Expiry is Enforced by NotAfter Time

476	   EKR-B builds on the common principles as described for EKR above in
477	   Section 5.2.  The additional details of EKR-B operation are as
478	   follows:

480	   o  NotAfter time of current origination and transit certificates is
481	      set to a value determined by the desired vulnerability window
482	      (approximately one day).

484	   o  Update expiry is controlled by NotAfter time (router certificate
485	      is not revoked at the time when the event happens).

487	   o  If no triggering event occurs to cause origination key rollover
488	      within a pre-set time (NotAfter), then new origination (current
489	      and next) certificates are issued only to extend the NotAfter time
490	      but the corresponding key pairs and SKIs remain unchanged.

492	   o  Do likewise (i.e. similar to what the above bullet says) for the
493	      transit (current and next) certificates and keys.

495	   o  A previous update automatically becomes invalid at the earliest
496	      NotAfter time of the certificates used in the signatures unless
497	      each of those certificates' NotAfter time has been extended.

499	   o  Changes in certificates to extend their NotAfter time need not
500	      propagate end-to-end (all the way to the relying routers); they
501	      may propagate only up to the RPKI cache server of the relying
502	      router.  The RPKI cache server would send a withdrawal for an {AS,
503	      SKI, Pub Key} tuple to a relying router if the NotAfter time of
504	      the certificate has passed.

506	   o  Changes in certificates to advance NotAfter time can be scheduled
507	      and propagated (in RPKI) reasonably well in advance.
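
   The expiry rule for EKR-B, where an update becomes invalid at the
   earliest NotAfter time among the certificates used in its
   signatures, can be sketched as below.  The function names and the
   dictionary keyed by SKI are assumptions for illustration; times are
   integers in arbitrary units.

```python
def update_valid_until(path_skis, not_after):
    """Earliest NotAfter time among the certificates on the path."""
    return min(not_after[ski] for ski in path_skis)


def is_update_valid(path_skis, not_after, now):
    """The update is valid only while every certificate is unexpired."""
    return now < update_valid_until(path_skis, not_after)


def extend_not_after(not_after, ski, new_not_after):
    """Model reissuing a certificate with a later NotAfter time.

    The key pair and SKI stay the same; only the validity end moves.
    A new mapping is returned and the input is left unchanged.
    """
    updated = dict(not_after)
    updated[ski] = max(updated[ski], new_not_after)
    return updated
```

   Extending only one certificate does not extend the update beyond
   the next-earliest NotAfter time, which is why each certificate used
   in the signatures has to be extended.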

509	   Discussion of Pros and Cons:

511	   The following pro and con for EKR-B are in addition to the common
512	   pros and cons listed above for the KR and EKR methods (Section 5 and
513	   Section 5.2).

515	   Pro: Update expiration is automatic in case the NotAfter time of any
516	   of the certificates used to validate the update has not been
517	   extended.  So the RAWS vulnerability window is predictable and not
518	   influenced by the RPKI end-to-end propagation time.

520	   Pro: Routers do not get any RPKI updates from the RPKI cache server
521	   when a certificate changes but the corresponding key pair and SKI
522	   remain unchanged.  Routers do not receive NotAfter time from their
523	   RPKI cache server.  There is no need for it.  Instead, the RPKI cache
524	   server keeps track of NotAfter time, and provides to routers only
525	   valid {AS, SKI, Pub Key} tuples.  This saves some RPKI state
526	   maintenance workload at the routers.

528	   Con: EKR-B has much more RPKI churn than EKR-A because both
529	   origination and transit certificates need to be reissued periodically
530	   to extend their validity time (even in the absence of any peering or
531	   policy change events).

533	5.2.3.  EKR with Separate Key for Each Incoming-Outgoing Peering-Pair

535	   This is a placeholder section where we mention another variant of
536	   the EKR method.  This idea has not been considered or vetted by the
537	   SIDR WG yet.  So we only mention it here briefly.

539	   As noted earlier, the EKR methods considered so far generate a huge
540	   spike in workload whenever the transit key rollover takes place.  One
541	   way to reduce that workload is to have a separate signing key for
542	   each incoming-outgoing peering pair.  For example, consider a BGPsec
543	   router in AS4 that has peers in AS1, AS2, and AS3.  The router will
544	   hold six signing keys, one each corresponding to (AS1, AS2), (AS2,
545	   AS1), (AS1, AS3), (AS3, AS1), (AS2, AS3), and (AS3, AS2) peering-
546	   pairs.  Note that the directionality of peering is included here and
547	   is necessary.  The key corresponding to (AS-i, AS-j) would only be
548	   used to sign updates received from AS-i and being forwarded to AS-j.
549	   In the general case, when the BGPsec router has n peers, the number
550	   of transit keys will be n(n-1).  Since there would be a Current and a
551	   Next key (for rollover), the number of transit keys held in the
552	   router for signing will be actually 2n(n-1).  When a peering or
553	   policy change occurs, the router would roll over only those specific
554	   keys that correspond to the peering-pairs over which the prefix
555	   updates are affected.  In the above example, suppose a policy change
556	   between AS4 and AS1 causes AS4 to prepend prefixes sent to AS1
557	   (pCount changed from 1 to 2).  Then AS4 would do key rollover only
558	   for (AS2, AS1) and (AS3, AS1) peering-pairs, and not for any of the
559	   others.  This would substantially reduce the quantity of prefix
560	   updates that are signed and re-propagated.  In general, when peering
561	   or policy changes occur, this method will reduce the number of prefix
562	   updates to be re-propagated to exactly the same as that with normal
563	   BGP.  That means that this method would also be on par with the ET
564	   and PKR methods in terms of update churn when a peering or policy
565	   change takes place.  The downside of this method is that the router
566	   needs to maintain 2n(n-1) key pairs if it has n BGPsec peers.
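
   The per-peering-pair bookkeeping described above can be sketched as
   follows.  This is only an illustration under stated assumptions, not
   a vetted design: the function names are hypothetical, and the sketch
   assumes that a policy change affects only the updates sent to one
   outgoing peer, as in the AS4/AS1 prepending example.

```python
from itertools import permutations


def transit_key_pairs(peers):
    """All directed (incoming, outgoing) peering pairs: n(n-1) of them."""
    return list(permutations(peers, 2))


def keys_held(peers):
    """Current plus Next key for every directed pair: 2n(n-1)."""
    n = len(peers)
    return 2 * n * (n - 1)


def pairs_to_roll(peers, affected_outgoing_peer):
    """Pairs whose key is rolled when updates sent to one peer change.

    Assumption: the change touches only updates forwarded to that peer.
    """
    return [(i, j) for (i, j) in transit_key_pairs(peers)
            if j == affected_outgoing_peer]


# The router in AS4 with peers in AS1, AS2, and AS3.
peers = ["AS1", "AS2", "AS3"]
print(len(transit_key_pairs(peers)))   # six directed peering pairs
print(keys_held(peers))                # twelve keys (Current and Next)
print(pairs_to_roll(peers, "AS1"))     # only (AS2, AS1) and (AS3, AS1)
```

   Only the keys for pairs whose outgoing side is AS1 are rolled; the
   other four peering pairs keep their keys, so updates over those
   pairs need not be re-signed.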

568	   Detailed discussion and comparison of this method with other methods
569	   can be provided in a later version of this document if the idea picks
570	   up interest in the WG.

572	6.  Summary of Pros and Cons

574	   Table 1 below summarizes the pros and cons for the various RAWS
575	   protection methods.  This summary follows from the discussion above
576	   in Section 4 and Section 5.

578	   +----------+---------------------------+----------------------------+
579	   | Method   | Pros                      | Cons                       |
580	   +----------+---------------------------+----------------------------+
581	   | Expira-  | 1. The background load    | 1. Prefix owner can abuse  |
582	   | tion     | due to beaconing is low   | by beaconing too           |
583	   | Time     | and not bursty.           | frequently.                |
584	   | (ET)     | ---                       | ---                        |
585	   |          | 2. Transit AS does NOT    | 2. Any change to the units |
586	   |          | have a huge spike in      | (granularity) of ET field  |
587	   |          | workload even when a      | entails a change to on-    |
588	   |          | peering or policy change  | the-wire BGPsec protocol.  |
589	   |          | happens at that AS.       |                            |
590	   |          | Beaconing facilitates     |                            |
591	   |          | this.                     |                            |
592	   |          | ---                       | ---                        |
593	   |          | 3. Does not add to RPKI   |                            |
594	   |          | churn.                    |                            |
595	   | -------- | ------------------------- | -------------------------- |
596	   | Periodic | 1. The background load    | 1. Prefix owner can abuse  |
597	   | Key      | due to beaconing is low   | by beaconing (i.e. re-     |
598	   | Rollover | and not bursty.           | originating) too           |
599	   | (PKR)    |                           | frequently.                |
600	   |          | ---                       | ---                        |
601	   |          | 2. Transit AS does NOT    | 2. Adds to RPKI churn. A   |
602	   |          | have a huge spike in      | pair of certificates       |
603	   |          | workload even when a      | (current and next) for     |
604	   |          | peering change happens at | each origination router    |
605	   |          | that AS. Beaconing (i.e.  | are rolled once every      |
606	   |          | periodic re-origination)  | beacon (i.e. re-           |
607	   |          | facilitates this.         | origination) interval.     |
608	   |          |                           | Significantly more RPKI    |
609	   |          |                           | churn than that with EKR-A |
610	   |          |                           | or EKR-B methods.          |
611	   |          | ---                       | ---                        |
612	   |          | 3. If the periodic re-    |                            |
613	   |          | origination (i.e.         |                            |
614	   |          | beaconing) interval units |                            |
615	   |          | change, BGPsec protocol   |                            |
616	   |          | on the wire remains       |                            |
617	   |          | unaffected.               |                            |
618	   |          | ---                       | ---                        |
619	   |          | 4. Changes in the method  |                            |
620	   |          | (while still based on Key |                            |
621	   |          | Rollover) can be          |                            |
622	   |          | accommodated without      |                            |
623	   |          | requiring any change to   |                            |
624	   |          | on-the-wire BGPsec        |                            |
625	   |          | protocol.                 |                            |
626	   | -------- | ------------------------- | -------------------------- |
627	   | Event    | 1. No update churn for    | 1. Whenever the transit    |
628	   | driven   | long periods when no      | key is rolled (in response |
629	   | Key      | peering or policy changes | to a peering or policy     |
630	   | Rollover | occur.                    | change event), there is a  |
631	   | Type A   |                           | storm of BGPsec updates,   |
632	   | (EKR-A)  |                           | especially at routers in   |
633	   |          |                           | large transit ASes.        |
634	   |          | ---                       | ---                        |
635	   |          | 2. The added churn in     | 2. The RAWS vulnerability  |
636	   |          | RPKI is much lower than   | window is dependent on     |
637	   |          | that in the EKR-B method. | end-to-end CRL             |
638	   |          |                           | propagation. It may vary   |
639	   |          |                           | significantly from one     |
640	   |          |                           | relying router to another  |
641	   |          |                           | that may be in different   |
642	   |          |                           | regions.                   |
643	   |          | ---                       | ---                        |
644	   |          | 3. Same as Pro #4 for the |                            |
645	   |          | PKR method.               |                            |
646	   | -------- | ------------------------- | -------------------------- |
647	   | Event    | 1. Same as Pro #1 for the | 1. Same as Con #1 for the  |
648	   | driven   | EKR-A method.             | EKR-A method.              |
649	   | Key      |                           |                            |
650	   | Rollover |                           |                            |
651	   | Type B   |                           |                            |
652	   | (EKR-B)  |                           |                            |
653	   |          | ---                       | ---                        |
654	   |          | 2. The RAWS vulnerability | 2. The added churn in RPKI |
655	   |          | window is enforced by     | is much higher than that   |
656	   |          | NotAfter time in          | in the EKR-A method.       |
657	   |          | certificates and is       |                            |
658	   |          | therefore predictable.    |                            |
659	   |          | ---                       | ---                        |
660	   |          | 3. Same as Pro #4 for the |                            |
661	   |          | PKR method.               |                            |
662	   +----------+---------------------------+----------------------------+
663	                    Table 1: Summary of Pros and Cons

665	7.  Summary and Conclusions

667	   We have attempted to provide insights into the operation of multiple
668	   alternative methods for RAWS protection.  It is hoped that the SIDR
669	   WG will utilize the analysis presented here as input for deciding on
670	   the choice of a mechanism for protection from RAWS.  Once that
671	   decision is made, the chosen mechanism would be included in the
672	   standards track document [I-D.ietf-sidr-bgpsec-rollover].

674	   Some important considerations for the decision making can be
675	   listed as follows:

677	   1.  The Expiration Time (ET) method is best (on par with the PKR
678	       method) in terms of preventing huge update workloads during
679	       peering and policy change events at transit routers with several
680	       peers.  It has no added RPKI churn.  But the ET method has the
681	       disadvantage of requiring on-the-wire protocol change if some
682	       parameters (e.g., the units of beacon interval) change.

684	   2.  The Periodic Key Rollover (PKR) method operates the same way as
685	       the ET method for preventing huge update workloads during peering
686	       and policy change events at transit routers with several peers.
687	       It does not have the disadvantage of requiring on-the-wire
688	       protocol change if some parameters (e.g., the units of beaconing/
689	       re-origination periodicity) change.  But it has the downside of
690	       added RPKI churn.

692	   3.  The Event-driven Key Rollover (EKR-A and EKR-B) methods have
693	       significantly less RPKI churn than the PKR method.  They also
694	       have no BGPsec update churn during long quiet periods when no
695	       peering or policy change events occur.  But they suffer the
696	       drawback of creating huge update workloads during peering and
697	       policy change events at transit routers with several peers.  Can
698	       this workload be jittered or flow controlled to spread it over
699	       time without convergence delay concerns?  Maybe; this needs further
700	       study.

702	   4.  The EKR-A method relies on end-to-end CRL propagation through the
703	       RPKI system to enforce expiry of a previous update when needed.
704	       By contrast, in the EKR-B method the update expiry is controlled
705	       by NotAfter time of the certificates used in update signatures.
706	       In the EKR-B method, a previous update automatically becomes
707	       invalid at the earliest NotAfter time of the certificates used in
708	       the signatures unless each of those certificates' NotAfter time
709	       has been extended.  Also, in EKR-B, changes in certificates to
710	       extend their NotAfter time need not propagate end-to-end (all the
711	       way to the relying routers); they may propagate only up to the
712	       RPKI cache server of the relying router (see Section 5.2.2).
713	       The changes in certificates to advance NotAfter time can be
714	       scheduled and propagated (in RPKI) reasonably well in advance.

716	   5.  Besides being out-of-band relative to the BGPsec protocol on the
717	       wire, the other good thing about the Key Rollover method is that
718	       once the basics of the mechanism are implemented, there may be
719	       flexibility to implement PKR, EKR-A or EKR-B on top of it.  It
720	       may also be possible to switch from one method to another (within
721	       this class) if necessary based on operational experience; this
722	       transition would not require any change to on-the-wire BGPsec
723	       protocol.
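
   The EKR-B expiry rule in item 4 above can be illustrated with a
   short sketch.  It is a simplified model rather than an
   implementation of certificate validation: the function names, the
   AS path, and the dates are all hypothetical, and the only check
   modeled is the NotAfter time of each certificate used in the update
   signatures.

```python
from datetime import datetime, timezone


def update_expiry(not_after_times):
    """Earliest NotAfter among the certificates used in the signatures."""
    return min(not_after_times)


def is_update_valid(not_after_times, now):
    """The update stays usable only while every certificate is unexpired."""
    return now <= update_expiry(not_after_times)


def utc(day):
    # Hypothetical helper: a day in January 2017, for illustration only.
    return datetime(2017, 1, day, tzinfo=timezone.utc)


# NotAfter times of three certificates along a hypothetical AS path.
path_certs = {"AS1": utc(10), "AS2": utc(5), "AS3": utc(20)}
now = utc(7)

# AS2's certificate lapsed on day 5, so the update is no longer valid.
print(is_update_valid(path_certs.values(), now))

# Extending AS2's NotAfter moves the expiry to the next-earliest time.
path_certs["AS2"] = utc(15)
print(update_expiry(path_certs.values()))
print(is_update_valid(path_certs.values(), now))
```

   In this model the update expires as soon as any one certificate on
   the path reaches its NotAfter time, which is why the vulnerability
   window is predictable in the EKR-B method.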

725	8.  Acknowledgements

727	   The authors would like to thank Steve Kent for extensive review and
728	   many useful suggestions on an earlier version of this document.
729	   Thanks are also due to Roque Gagliano and Brian Weis for helpful
730	   discussions.  Further, we are thankful to Oliver Borchert and Okhee
731	   Kim for comments and suggestions.

733	9.  IANA Considerations

735	   This memo includes no request to IANA.

737	10.  Security Considerations

739	   This memo requires no security considerations of its own since it is
740	   targeted to be an informational RFC in support of
741	   [I-D.ietf-sidr-bgpsec-rollover] and [I-D.ietf-sidr-bgpsec-protocol].
742	   The reader is therefore directed to the security considerations
743	   provided in those documents.

745	11.  Informative References

747	   [I-D.ietf-sidr-bgpsec-protocol]
748	              Lepinski, M. and K. Sriram, "BGPsec Protocol
749	              Specification", draft-ietf-sidr-bgpsec-protocol-18 (work
750	              in progress), August 2016.

752	   [I-D.ietf-sidr-bgpsec-rollover]
753	              Gagliano, R., Patel, K., and B. Weis, "BGPsec Router
754	              Certificate Rollover", draft-ietf-sidr-bgpsec-rollover-05
755	              (work in progress), March 2016.

757	   [RAWS-discussion]
758	              Sriram, K. and D. Montgomery, "Discussion of Key Rollover
759	              Mechanisms for Replay-Attack Protection", Presented
760	              at IETF-85 SIDR WG Meeting, November 2012,
761	              <http://www.ietf.org/proceedings/85/slides/
762	              slides-85-sidr-4.pdf>.

764	   [RFC7353]  Bellovin, S., Bush, R., and D. Ward, "Security
765	              Requirements for BGP Path Validation", RFC 7353,
766	              DOI 10.17487/RFC7353, August 2014,
767	              <http://www.rfc-editor.org/info/rfc7353>.

769	   [rpki-delay]
770	              Kent, S. and K. Sriram, "RPKI rsync Download Delay
771	              Modeling", Presented at IETF-86 SIDR WG Meeting, March
772	              2013, <http://www.ietf.org/proceedings/86/slides/
773	              slides-86-sidr-1.pdf>.

775	Authors' Addresses

777	   Kotikalapudi Sriram
778	   US NIST

780	   Email: ksriram@nist.gov

782	   Doug Montgomery
783	   US NIST

785	   Email: dougm@nist.gov