| < draft-decraene-rtgwg-backoff-algo-00.txt | draft-decraene-rtgwg-backoff-algo-01.txt > | |||
|---|---|---|---|---|
| Network Working Group B. Decraene | Network Working Group B. Decraene | |||
| Internet-Draft Orange | Internet-Draft Orange | |||
| Intended status: Standards Track July 3, 2014 | Intended status: Standards Track March 9, 2015 | |||
| Expires: January 4, 2015 | Expires: September 10, 2015 | |||
| Back-off SPF algorithm for link state IGP | Back-off SPF algorithm for link state IGP | |||
| draft-decraene-rtgwg-backoff-algo-00 | draft-decraene-rtgwg-backoff-algo-01 | |||
| Abstract | Abstract | |||
| This document defines a standard algorithm to back-off link-state IGP | This document defines a standard algorithm to back-off link-state IGP | |||
| SPF computations. | SPF computations. | |||
| This improves interoperability by reducing the probability and/or | Having one standardized algorithm improves interoperability by | |||
| duration of transient forwarding loops during the IGP convergence in | reducing the probability and/or duration of transient forwarding | |||
| the area/level when the network reacts to multiple consecutive | loops during the IGP convergence in the area/level when the network | |||
| events. | reacts to multiple consecutive events. | |||
| Requirements Language | Requirements Language | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in [RFC2119]. | |||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on January 4, 2015. | This Internet-Draft will expire on September 10, 2015. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2014 IETF Trust and the persons identified as the | Copyright (c) 2015 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| skipping to change at page 2, line 46 ¶ | skipping to change at page 2, line 46 ¶ | |||
| Different implementations choose different algorithms, hence in a | Different implementations choose different algorithms, hence in a | |||
| multi-vendor network, it's not possible to enforce that all routers | multi-vendor network, it's not possible to enforce that all routers | |||
| triggers their SPF computation after the same waiting delay. This | triggers their SPF computation after the same waiting delay. This | |||
| situation increases the average differential delay between routers | situation increases the average differential delay between routers | |||
| end of RIB computation. It also increases the probability that | end of RIB computation. It also increases the probability that | |||
| different routers compute their RIB based on a different LSDB. Both | different routers compute their RIB based on a different LSDB. Both | |||
| increases the probability and/or duration of micro-loops. | increases the probability and/or duration of micro-loops. | |||
| To allow for multi-vendors networks having all the routers delaying | To allow for multi-vendors networks having all the routers delaying | |||
| their SPF for the same duration, this document specifies a | their SPF for the same duration, this document specifies a | |||
| standardized algorithm. The algorithm is proposed based on its | standardized algorithm. Implementations may offer alternative | |||
| popularity on existing implementations and its large deployed base. | optional algorithms. | |||
| It's not implied that this algorithm is the best. Implementations | ||||
| may offer alternative optional algorithms. | ||||
| 2. Exponential back off algorithm | 2. High level goals | |||
| This backoff algorithm introduces a delay between the event | The high level goals of this algorithm are the following: | |||
| triggering a new RIB computation and the start of the computation. | ||||
| The initial wait time is set to INITIAL_WAIT. | o Very fast convergence for single simple events (link failure). | |||
| Subsequent wait times are exponentially delayed by INCREMENTAL_WAIT, | o Fast convergence in general while the IGP stability is considered | |||
| 2*INCREMENTAL_WAIT, 4* INCREMENTAL_WAIT, 8* INCREMENTAL_WAIT... up to | under control. | |||
| reaching the maximum value MAX_WAIT. | ||||
| If no new trigger is received for two times MAX_WAIT_TIME, the delay | o A long delay when the IGP stability is considered out of control, | |||
| is set back to INITIAL_WAIT. | in order to let all related process calm down. | |||
| The back off algorithm makes no difference regarding the type of | o At any time, try to avoid using different SPF_TIMERS values for | |||
| computation performed to compute the updated RIB. For example no | nodes in the area/level. Even though not all nodes will receive | |||
| distinction is made between a full SPF, an incremental SPF or a PRC | IGP message at the same time (due to difference in distance from | |||
| computation. | the source and due to different flooding implementations on the | |||
| path from the source). | ||||
| 3. Parameters | 3. Definitions and parameters | |||
| INITIAL_WAIT SHOULD be configurable from 0 ms to at least 5 s. | IGP events: An LSDB change requiring a new RIB computation (topology | |||
| change, prefix change, metric change). No distinction is done | ||||
| between the type of computation performed (e.g. full SPF, incremental | ||||
| SPF, PRC). The type of computation is a local consideration. | ||||
| INCREMENTAL_WAIT SHOULD be configurable from 0 ms to at least 5 s. | The SPF_DELAY timer can take the following values: | |||
| MAX_WAIT SHOULD be configurable from 0 ms to at least 10 s. | INITIAL_WAIT: a very small delay to quickly handle link failure. | |||
| e.g. 0 millisecond. | ||||
| In this version of the draft, it's proposed to not define default | FAST_WAIT: a small delay to have a fast convergence. e.g. 50-100 | |||
| values because such values are subject to change over time as | millisecond. Note: we want to be fast, but as this failure requires | |||
| hardware and software improve and as customers requirements increase. | multiple IGP events, being too fast increase the probability to | |||
| In addition, such timers may be very network dependant. | receive additional IGP events just after the RIB computation. | |||
| 4. Impact on micro-loops | LONG_WAIT: a long delay as IGP is unstable. e.g. 2 seconds. Note: | |||
| let's bring calm in the IGP. | ||||
| The TIME_TO_CONVERGE timer is the time to learn all the IGP events | ||||
| related to a single failure (e.g. node failure, SRLG failure). e.g. 1 | ||||
| second. It's mostly dependent on variation of failure detection | ||||
| times between all nodes which are neighbour to the failure, and then | ||||
| may depend on different flooding algorithms of nodes in the network. | ||||
| The HOLD_DOWN timer is the time needed with no IGP events received, | ||||
| before considering that the IGP is quiet again and we can set the | ||||
| SPF_DELAY back to INITAL_WAIT. e.g. 5 seconds. | ||||
| 4. Principle of SPF delay algorithm | ||||
| The first IGP event is handled very quickly (INITIAL_WAIT) in order | ||||
| to be very reactive for the first event if it only needs one IGP | ||||
| event (e.g. link failure, prefix change). | ||||
| If more IGP events are received quickly after, we consider that they | ||||
| are related to the same single failure, and handle the IGP events | ||||
| relatively quickly (FAST_WAIT) during the time needed to receive all | ||||
| the IGP events related to the failure (TIME_TO_CONVERGE). | ||||
| If IGP events are still received after this time, then the network is | ||||
| presumably experiencing multiple independent failures and the while | ||||
| waiting for its stability, the computations are delayed for a longer | ||||
| time (LONG_WAIT). | ||||
| Note: previous SPF delay algorithms used to count the number of RIB | ||||
| computations. However, as all nodes may receive the LSP events in a | ||||
| different way we cannot assume that all nodes will perform the same | ||||
| number of SPF computations or that they will schedule them at the | ||||
| same time. For example, assuming that the SPF delay is 50 ms, node | ||||
| R1 may receive 3 IGP events (E1, E2, E3) in those 50 ms and hence | ||||
| will perform a single routing computation. While another node R2 may | ||||
| only receive 2 events (E1, E2) in those 50ms and hence will schedule | ||||
| another routing computation when further receiving E3. That's why | ||||
| this document prefers to define a time limit (TIME_TO_CONVERGE) since | ||||
| the first event, rather than a number of routing computations. | ||||
| 5. Specification of SPF delay algorithm | ||||
| When the previous IGP events is more than HOLD_DOWN ago: | ||||
| o The IGP is set to the QUIET state. | ||||
| When the IGP is in the QUIET state and an IGP event is received: | ||||
| o The time of this first IGP event is stored in FIRST_EVENT_TIME. | ||||
| o The next RIB computation time is set to LSP receive time + | ||||
| INITIAL_WAIT. | ||||
| o The IGP is set to the FAST_WAIT state. | ||||
| When the IGP is in the FAST_WAIT state and an IGP event is received: | ||||
| o If more than TIME_TO_CONVERGE has passed since FIRST_EVENT_TIME, | ||||
| then the IGP is set to the HOLD_DOWN state. | ||||
| o If the next RIB_computation time is in the past, set the next RIB | ||||
| computation time to LSP receive time + FAST_WAIT. | ||||
| When the IGP is in the HOLD_DOWN state and an IGP event is received: | ||||
| o If the next RIB_computation time is in the past, set the next RIB | ||||
| computation time to LSP receive time + LONG_WAIT. | ||||
| 6. Impact on micro-loops | ||||
| Micro-loops during IGP convergence are due to a non synchronized or | Micro-loops during IGP convergence are due to a non synchronized or | |||
| non ordered update of the forwarding information tables (FIB) RFC | non ordered update of the forwarding information tables (FIB) | |||
| 5715 [RFC5715] RFC 6976 [RFC6976] draft.litkowski-rtgwg-spf-uloop-pb- | [RFC5715] [RFC6976] [I-D.litkowski-rtgwg-spf-uloop-pb-statement]. | |||
| statement [I-D.litkowski-rtgwg-spf-uloop-pb-statement]. FIB are | FIB are installed after multiple steps such as SPF wait time, SPF | |||
| installed after multiple steps such as SPF wait time, SPF | ||||
| computation, FIB distribution and FIB update. This document only | computation, FIB distribution and FIB update. This document only | |||
| address the first contribution. This standardized procedure reduces | address the first contribution. This standardized procedure reduces | |||
| the probability and/or duration of micro-loops when the IGP | the probability and/or duration of micro-loops when the IGP | |||
| experience multiple consecutive events. It does not remove all | experience multiple consecutive events. It does not remove all | |||
| micro-loops. However, it is beneficial and its cost seems limited | micro-loops. However, it is beneficial and its cost seems limited | |||
| compared to full solutions such as RFC 5715 [RFC5715] or RFC 6976 | compared to full solutions such as [RFC5715] or [RFC6976]. | |||
| [RFC6976]. | ||||
| 5. IANA Considerations | 7. IANA Considerations | |||
| No IANA actions required. | No IANA actions required. | |||
| 6. Security considerations | 8. Security considerations | |||
| This document has no impact on the security of the IGP. | This document has no impact on the security of the IGP. | |||
| 7. Acknowledgements | 9. Acknowledgements | |||
| We would like to acknowledge Hannes Gredler and Les Ginsberg for the | We would like to acknowledge Hannes Gredler, Les Ginsberg and Pierre | |||
| discussions related to this document. | Francois for the discussions related to this document. | |||
| 8. References | 10. References | |||
| 8.1. Normative References | 10.1. Normative References | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
| 8.2. Informative References | 10.2. Informative References | |||
| [I-D.litkowski-rtgwg-spf-uloop-pb-statement] | [I-D.litkowski-rtgwg-spf-uloop-pb-statement] | |||
| Litkowski, S., "Link State protocols SPF trigger and delay | Litkowski, S., "Link State protocols SPF trigger and delay | |||
| algorithm impact on IGP microloops", draft-litkowski- | algorithm impact on IGP microloops", draft-litkowski- | |||
| rtgwg-spf-uloop-pb-statement-00 (work in progress), June | rtgwg-spf-uloop-pb-statement-02 (work in progress), March | |||
| 2014. | 2015. | |||
| [ISO10589-Second-Edition] | [ISO10589-Second-Edition] | |||
| International Organization for Standardization, | International Organization for Standardization, | |||
| "Intermediate system to Intermediate system intra-domain | "Intermediate system to Intermediate system intra-domain | |||
| routeing information exchange protocol for use in | routeing information exchange protocol for use in | |||
| conjunction with the protocol for providing the | conjunction with the protocol for providing the | |||
| connectionless-mode Network Service (ISO 8473)", ISO/IEC | connectionless-mode Network Service (ISO 8473)", ISO/IEC | |||
| 10589:2002, Second Edition, Nov 2002. | 10589:2002, Second Edition, Nov 2002. | |||
| [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998. | [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998. | |||
| End of changes. 29 change blocks. | ||||
| 52 lines changed or deleted | 120 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||