< draft-decraene-rtgwg-backoff-algo-00.txt   draft-decraene-rtgwg-backoff-algo-01.txt >
Network Working Group B. Decraene Network Working Group B. Decraene
Internet-Draft Orange Internet-Draft Orange
Intended status: Standards Track July 3, 2014 Intended status: Standards Track March 9, 2015
Expires: January 4, 2015 Expires: September 10, 2015
Back-off SPF algorithm for link state IGP Back-off SPF algorithm for link state IGP
draft-decraene-rtgwg-backoff-algo-00 draft-decraene-rtgwg-backoff-algo-01
Abstract Abstract
This document defines a standard algorithm to back-off link-state IGP This document defines a standard algorithm to back-off link-state IGP
SPF computations. SPF computations.
This improves interoperability by reducing the probability and/or Having one standardized algorithm improves interoperability by
duration of transient forwarding loops during the IGP convergence in reducing the probability and/or duration of transient forwarding
the area/level when the network reacts to multiple consecutive loops during the IGP convergence in the area/level when the network
events. reacts to multiple consecutive events.
Requirements Language Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119]. document are to be interpreted as described in [RFC2119].
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 4, 2015. This Internet-Draft will expire on September 10, 2015.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
skipping to change at page 2, line 46 skipping to change at page 2, line 46
Different implementations choose different algorithms, hence in a Different implementations choose different algorithms, hence in a
multi-vendor network, it's not possible to enforce that all routers multi-vendor network, it's not possible to enforce that all routers
triggers their SPF computation after the same waiting delay. This triggers their SPF computation after the same waiting delay. This
situation increases the average differential delay between routers situation increases the average differential delay between routers
end of RIB computation. It also increases the probability that end of RIB computation. It also increases the probability that
different routers compute their RIB based on a different LSDB. Both different routers compute their RIB based on a different LSDB. Both
increases the probability and/or duration of micro-loops. increases the probability and/or duration of micro-loops.
To allow for multi-vendors networks having all the routers delaying To allow for multi-vendors networks having all the routers delaying
their SPF for the same duration, this document specifies a their SPF for the same duration, this document specifies a
standardized algorithm. The algorithm is proposed based on its standardized algorithm. Implementations may offer alternative
popularity on existing implementations and its large deployed base. optional algorithms.
It's not implied that this algorithm is the best. Implementations
may offer alternative optional algorithms.
2. Exponential back off algorithm 2. High level goals
This backoff algorithm introduces a delay between the event The high level goals of this algorithm are the following:
triggering a new RIB computation and the start of the computation.
The initial wait time is set to INITIAL_WAIT. o Very fast convergence for single simple events (link failure).
Subsequent wait times are exponentially delayed by INCREMENTAL_WAIT, o Fast convergence in general while the IGP stability is considered
2*INCREMENTAL_WAIT, 4* INCREMENTAL_WAIT, 8* INCREMENTAL_WAIT... up to under control.
reaching the maximum value MAX_WAIT.
If no new trigger is received for two times MAX_WAIT_TIME, the delay o A long delay when the IGP stability is considered out of control,
is set back to INITIAL_WAIT. in order to let all related process calm down.
The back off algorithm makes no difference regarding the type of o At any time, try to avoid using different SPF_TIMERS values for
computation performed to compute the updated RIB. For example no nodes in the area/level. Even though not all nodes will receive
distinction is made between a full SPF, an incremental SPF or a PRC IGP message at the same time (due to difference in distance from
computation. the source and due to different flooding implementations on the
path from the source).
3. Parameters 3. Definitions and parameters
INITIAL_WAIT SHOULD be configurable from 0 ms to at least 5 s. IGP events: An LSDB change requiring a new RIB computation (topology
change, prefix change, metric change). No distinction is done
between the type of computation performed (e.g. full SPF, incremental
SPF, PRC). The type of computation is a local consideration.
INCREMENTAL_WAIT SHOULD be configurable from 0 ms to at least 5 s. The SPF_DELAY timer can take the following values:
MAX_WAIT SHOULD be configurable from 0 ms to at least 10 s. INITIAL_WAIT: a very small delay to quickly handle link failure.
e.g. 0 millisecond.
In this version of the draft, it's proposed to not define default FAST_WAIT: a small delay to have a fast convergence. e.g. 50-100
values because such values are subject to change over time as millisecond. Note: we want to be fast, but as this failure requires
hardware and software improve and as customers requirements increase. multiple IGP events, being too fast increase the probability to
In addition, such timers may be very network dependant. receive additional IGP events just after the RIB computation.
4. Impact on micro-loops LONG_WAIT: a long delay as IGP is unstable. e.g. 2 seconds. Note:
let's bring calm in the IGP.
The TIME_TO_CONVERGE timer is the time to learn all the IGP events
related to a single failure (e.g. node failure, SRLG failure). e.g. 1
second. It's mostly dependent on variation of failure detection
times between all nodes which are neighbour to the failure, and then
may depend on different flooding algorithms of nodes in the network.
The HOLD_DOWN timer is the time needed with no IGP events received,
before considering that the IGP is quiet again and we can set the
SPF_DELAY back to INITAL_WAIT. e.g. 5 seconds.
4. Principle of SPF delay algorithm
The first IGP event is handled very quickly (INITIAL_WAIT) in order
to be very reactive for the first event if it only needs one IGP
event (e.g. link failure, prefix change).
If more IGP events are received quickly after, we consider that they
are related to the same single failure, and handle the IGP events
relatively quickly (FAST_WAIT) during the time needed to receive all
the IGP events related to the failure (TIME_TO_CONVERGE).
If IGP events are still received after this time, then the network is
presumably experiencing multiple independent failures and the while
waiting for its stability, the computations are delayed for a longer
time (LONG_WAIT).
Note: previous SPF delay algorithms used to count the number of RIB
computations. However, as all nodes may receive the LSP events in a
different way we cannot assume that all nodes will perform the same
number of SPF computations or that they will schedule them at the
same time. For example, assuming that the SPF delay is 50 ms, node
R1 may receive 3 IGP events (E1, E2, E3) in those 50 ms and hence
will perform a single routing computation. While another node R2 may
only receive 2 events (E1, E2) in those 50ms and hence will schedule
another routing computation when further receiving E3. That's why
this document prefers to define a time limit (TIME_TO_CONVERGE) since
the first event, rather than a number of routing computations.
5. Specification of SPF delay algorithm
When the previous IGP events is more than HOLD_DOWN ago:
o The IGP is set to the QUIET state.
When the IGP is in the QUIET state and an IGP event is received:
o The time of this first IGP event is stored in FIRST_EVENT_TIME.
o The next RIB computation time is set to LSP receive time +
INITIAL_WAIT.
o The IGP is set to the FAST_WAIT state.
When the IGP is in the FAST_WAIT state and an IGP event is received:
o If more than TIME_TO_CONVERGE has passed since FIRST_EVENT_TIME,
then the IGP is set to the HOLD_DOWN state.
o If the next RIB_computation time is in the past, set the next RIB
computation time to LSP receive time + FAST_WAIT.
When the IGP is in the HOLD_DOWN state and an IGP event is received:
o If the next RIB_computation time is in the past, set the next RIB
computation time to LSP receive time + LONG_WAIT.
6. Impact on micro-loops
Micro-loops during IGP convergence are due to a non synchronized or Micro-loops during IGP convergence are due to a non synchronized or
non ordered update of the forwarding information tables (FIB) RFC non ordered update of the forwarding information tables (FIB)
5715 [RFC5715] RFC 6976 [RFC6976] draft.litkowski-rtgwg-spf-uloop-pb- [RFC5715] [RFC6976] [I-D.litkowski-rtgwg-spf-uloop-pb-statement].
statement [I-D.litkowski-rtgwg-spf-uloop-pb-statement]. FIB are FIB are installed after multiple steps such as SPF wait time, SPF
installed after multiple steps such as SPF wait time, SPF
computation, FIB distribution and FIB update. This document only computation, FIB distribution and FIB update. This document only
address the first contribution. This standardized procedure reduces address the first contribution. This standardized procedure reduces
the probability and/or duration of micro-loops when the IGP the probability and/or duration of micro-loops when the IGP
experience multiple consecutive events. It does not remove all experience multiple consecutive events. It does not remove all
micro-loops. However, it is beneficial and its cost seems limited micro-loops. However, it is beneficial and its cost seems limited
compared to full solutions such as RFC 5715 [RFC5715] or RFC 6976 compared to full solutions such as [RFC5715] or [RFC6976].
[RFC6976].
5. IANA Considerations 7. IANA Considerations
No IANA actions required. No IANA actions required.
6. Security considerations 8. Security considerations
This document has no impact on the security of the IGP. This document has no impact on the security of the IGP.
7. Acknowledgements 9. Acknowledgements
We would like to acknowledge Hannes Gredler and Les Ginsberg for the We would like to acknowledge Hannes Gredler, Les Ginsberg and Pierre
discussions related to this document. Francois for the discussions related to this document.
8. References 10. References
8.1. Normative References 10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
8.2. Informative References 10.2. Informative References
[I-D.litkowski-rtgwg-spf-uloop-pb-statement] [I-D.litkowski-rtgwg-spf-uloop-pb-statement]
Litkowski, S., "Link State protocols SPF trigger and delay Litkowski, S., "Link State protocols SPF trigger and delay
algorithm impact on IGP microloops", draft-litkowski- algorithm impact on IGP microloops", draft-litkowski-
rtgwg-spf-uloop-pb-statement-00 (work in progress), June rtgwg-spf-uloop-pb-statement-02 (work in progress), March
2014. 2015.
[ISO10589-Second-Edition] [ISO10589-Second-Edition]
International Organization for Standardization, International Organization for Standardization,
"Intermediate system to Intermediate system intra-domain "Intermediate system to Intermediate system intra-domain
routeing information exchange protocol for use in routeing information exchange protocol for use in
conjunction with the protocol for providing the conjunction with the protocol for providing the
connectionless-mode Network Service (ISO 8473)", ISO/IEC connectionless-mode Network Service (ISO 8473)", ISO/IEC
10589:2002, Second Edition, Nov 2002. 10589:2002, Second Edition, Nov 2002.
[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998. [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.
 End of changes. 29 change blocks. 
52 lines changed or deleted 120 lines changed or added

This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/