IS-IS Fast Flooding

Orange, bruno.decraene@orange.com
Cisco Systems, 821 Alder Drive, Milpitas, CA 95035, USA, ginsberg@cisco.com
Arista Networks, 5453 Great America Parkway, Santa Clara, California 95054, USA, tony.li@tony.li
gsoligna@protonmail.com
Cisco Systems, Pujmanove 1753/10a, Prague 4 - Nusle, Prague 10 14000, Czech Republic, mkarasek@cisco.com
Juniper Networks, Inc., 1194 N. Mathilda Avenue, Sunnyvale, CA 94089, USA, cbowers@juniper.net
Nokia, Copernicuslaan 50, Antwerp 2018, Belgium, gunter.van_de_velde@nokia.com
Cisco Systems, Apollo Business Center, Mlynske nivy 43, Bratislava 821 09, Slovakia, ppsenak@cisco.com
Juniper, 1137 Innovation Way, Sunnyvale, CA, USA, prz@juniper.net

Current Link State Protocol Data Unit (PDU)
flooding rates are much slower than what modern
networks can support. The use of IS-IS at larger
scale requires faster flooding rates to achieve
desired convergence goals. This document
discusses the need for faster flooding, the issues
around faster flooding, and some example
approaches to achieve faster flooding. It also
defines protocol extensions relevant to faster
flooding.
Link state IGPs such as Intermediate-System-to-Intermediate-System
(IS-IS) depend upon having consistent Link State Databases (LSDB) on all
Intermediate Systems (ISs) in the network in order to provide correct
forwarding of data packets. When topology changes occur, new/updated
Link State PDUs (LSPs) are propagated network-wide. The speed of
propagation is a key contributor to convergence time.

Historically, flooding rates have been conservative, on the order of
tens of LSPs per second. This is the result of guidance in the base specification
and early deployments when both CPU speeds and
interface speeds were much slower and the scale of
an area was much smaller than they are today.

As IS-IS is deployed in greater scale both in the number of nodes in an
area and in the number of neighbors per node, the impact of the historic
flooding rates becomes more significant. Consider the bringup or failure
of a node with 1000 neighbors. This will result in a minimum of 1000 LSP
updates. At typical LSP flooding rates used today
(33 LSPs/second), it would take 30+ seconds simply to send the updated
LSPs to a given neighbor. Depending on the diameter of the network,
achieving a consistent LSDB on all nodes in the network could easily
take a minute or more.

Increasing the LSP flooding rate therefore becomes an essential element
of supporting greater network scale.

Improving the LSP flooding rate is complementary to protocol
extensions that reduce LSP flooding traffic by reducing the
flooding topology, such as Mesh Groups or Dynamic Flooding. Reduction of the
flooding topology does not alter the number of LSPs required
to be exchanged between two nodes, so increasing the overall
flooding speed is still beneficial when such extensions are in
use. It is also possible that the flooding topology can be
reduced in ways that prefer the use of neighbors that support
improved flooding performance.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 when, and only when, they appear in all capitals,
as shown here.

The base specification for IS-IS was first
published in 1992 and updated in 2002. The update made no changes
with regard to suggested timer values. Convergence targets at the time were
on the order of seconds and the specified timer values reflect that.
Here are some examples:
The recommended value is 30 seconds.
The recommended value is 5 seconds.
The recommended value is 2 seconds.
Most relevant to a discussion of the LSP flooding rate is the recommended
interval between the transmission of two different LSPs on a given
interface.

For broadcast interfaces, the base specification
defined:
The default value was defined as 33 milliseconds.
It is permitted to send multiple LSPs "back-to-back"
as a burst, but this was limited to 10 LSPs in a one second
period.
Although this value was specific to LAN interfaces, implementations have commonly
applied it to all interfaces, though that was not
the original intent of the base specification. In fact, Section
12.1.2.4.3 states:

Although modern implementations have not strictly adhered to the 33
millisecond interval, it is commonplace for implementations to limit
the flooding rate to an order of magnitude similar to the 33 ms value.

In the past 20 years, significant work on achieving faster
convergence - more specifically sub-second convergence - has resulted in
implementations modifying a number of the above timers in order to
support faster signaling of topology changes. For example,
minimumLSPGenerationInterval has been modified to support millisecond
intervals, often with a backoff algorithm applied to prevent LSP
generation storms in the event of a series of rapid oscillations.

However, the flooding rate has not been fundamentally altered.
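
To make the cost of the historic rate concrete, the arithmetic used in this document can be written down as a small helper. This is only an illustrative sketch; the function name is hypothetical, and the inputs are the figures quoted in this document (1000 LSPs at 33 LSPs/second, and 2100 LSPs at a target of 300 LSPs/second).

```python
def flooding_time_seconds(lsp_count: int, lsps_per_second: float) -> float:
    """Seconds needed to send lsp_count LSPs to one neighbor at a fixed rate."""
    if lsps_per_second <= 0:
        raise ValueError("rate must be positive")
    return lsp_count / lsps_per_second


# 1000 LSPs at the typical rate of 33 LSPs/second cited in this document.
print(round(flooding_time_seconds(1000, 33), 1))   # 30.3
# 2100 LSPs at a target rate of 300 LSPs/second.
print(flooding_time_seconds(2100, 300))            # 7.0
```

The result is the time to update a single neighbor; network-wide LSDB consistency also depends on the diameter of the network.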
This document defines a new Type-Length-Value
tuple (TLV) called the "Flooding Parameters TLV"
that may be included in IS to IS Hellos (IIH) or
Partial Sequence Number PDUs (PSNPs). It allows
IS-IS implementations to advertise flooding
related parameters and capabilities which may be
of use to the peer in support of faster flooding.
Type: TBD1
Length: variable, the size in octets of the Value field
Value: One or more sub-TLVs

Several sub-TLVs are defined in this document. The support of any sub-TLV is OPTIONAL.

For a given IS-IS adjacency, the Flooding
Parameters TLV does not need to be advertised
in each IIH or PSNP. An IS uses the latest
received value for each parameter until a new
value is advertised by the peer. However, as
IIHs and PSNPs are not reliably exchanged, and
may never be received, parameters SHOULD be
sent even if there is no change in value since
the last transmission. For a parameter which
has never been advertised, an IS SHOULD use
its local default value. That value SHOULD be
configurable on a per node basis and MAY be
configurable on a per interface basis.
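
A minimal sketch of how the TLV might be built and parsed is given below, using the sub-TLV types and value lengths that this document defines (types 1, 2, 3, and 5). Several details are assumptions made for illustration only: the 1-octet type and length fields for sub-TLVs, network byte order for values, the function names, the local default values, and the TLV type 250, which is a placeholder because the code point is still TBD1. The Flags sub-TLV is omitted. The last lines illustrate the rule above that the latest received value is kept per parameter and that local defaults apply to parameters never advertised.

```python
import struct
from typing import Dict

# Sub-TLV type -> struct format of its value (lengths taken from this document).
# Network byte order and 1-octet type/length fields are assumptions.
SUBTLV_FORMATS = {
    1: "!I",  # LSP Burst Window, 4 octets
    2: "!I",  # LSP Transmission Interval in microseconds, 4 octets
    3: "!H",  # LSPs Per PSNP, 2 octets
    5: "!H",  # Partial SNP Interval in milliseconds, 2 octets
}


def encode_flooding_parameters(tlv_type: int, params: Dict[int, int]) -> bytes:
    """Build a Flooding Parameters TLV from {sub-TLV type: value}."""
    value = b""
    for sub_type in sorted(params):
        fmt = SUBTLV_FORMATS[sub_type]
        value += struct.pack("!BB", sub_type, struct.calcsize(fmt))
        value += struct.pack(fmt, params[sub_type])
    if len(value) > 255:
        raise ValueError("TLV value does not fit in a 1-octet length")
    return struct.pack("!BB", tlv_type, len(value)) + value


def decode_subtlvs(value: bytes) -> Dict[int, int]:
    """Parse the Value field; unknown or malformed sub-TLVs are skipped."""
    params: Dict[int, int] = {}
    offset = 0
    while offset + 2 <= len(value):
        sub_type, sub_len = struct.unpack_from("!BB", value, offset)
        offset += 2
        chunk = value[offset:offset + sub_len]
        offset += sub_len
        fmt = SUBTLV_FORMATS.get(sub_type)
        if fmt is not None and len(chunk) == struct.calcsize(fmt):
            (params[sub_type],) = struct.unpack(fmt, chunk)
    return params


PLACEHOLDER_TLV_TYPE = 250  # hypothetical; the real code point is TBD1
tlv = encode_flooding_parameters(PLACEHOLDER_TLV_TYPE, {1: 30, 3: 15})

learned: Dict[int, int] = {}            # latest value received per parameter
learned.update(decode_subtlvs(tlv[2:]))
local_defaults = {1: 10, 2: 33000, 3: 15, 5: 200}  # hypothetical defaults
effective = {**local_defaults, **learned}
```

Keeping `learned` separate from `local_defaults` means a later IIH or PSNP that omits a parameter does not erase the value previously received for it.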
The LSP Burst Window sub-TLV advertises the maximum number of LSPs that the node can receive with no separation interval between LSPs.

Type: 1
Length: 4 octets
Value: number of LSPs that can be sent back to back

The LSP Transmission Interval sub-TLV advertises the minimum interval, in microseconds, between LSP arrivals which can be received on this interface, after the maximum number of un-acknowledged LSPs has been sent.

Type: 2
Length: 4 octets
Value: minimum interval, in microseconds, between two consecutive LSPs sent after the burst window has been used

The LSP Transmission Interval is an advertisement of the receiver's steady-state LSP reception rate.

The LSPs Per PSNP (LPP) sub-TLV advertises the number of received LSPs that triggers the immediate sending of a PSNP to acknowledge them.

Type: 3
Length: 2 octets
Value: number of LSPs acknowledged per PSNP

A node advertising this sub-TLV with a value LPP MUST send a PSNP once LPP LSPs have been received and need to be acknowledged.

The Flags sub-TLV advertises a set of flags.

Type: 4
Length: Indicates the length in octets (1-8) of the Value field. The length SHOULD be the minimum required to send all bits that are set.
Value: List of flags.

When the O flag is set, the LSPs will be
acknowledged in the order they are received: a
PSNP acknowledging N LSPs is acknowledging the
N oldest LSPs received. The order inside the
PSNP is meaningless. If the sender keeps track
of the order of LSPs sent, this indication
allows a fast detection of the loss of an
LSP. This MUST NOT be used to trigger faster
retransmission of LSPs. This MAY be used to
trigger a congestion signal.

The Partial SNP Interval sub-TLV advertises the amount of
time in milliseconds between periodic action for transmission of Partial
Sequence Number PDUs. This time will trigger the sending of a PSNP
even if the number of unacknowledged LSPs received on a given
interface does not exceed LPP. The time is
measured from the reception of the first unacknowledged LSP.

Type: 5
Length: 2 octets
Value: partialSNPInterval in milliseconds

A node advertising this sub-TLV SHOULD send a PSNP at least once
per Partial SNP Interval if one or more unacknowledged LSPs have been
received on a given interface.

On a LAN interface, all LSPs are link-level multicasts. Each LSP sent will be received by all ISs on the LAN and each IS will receive LSPs from all transmitters. In this section, we clarify how the flooding parameters should be interpreted in the context of a LAN.

An LSP receiver on a LAN will communicate its desired flooding parameters using a single Flooding Parameters TLV, copies of which will be received by all transmitters. The flooding parameters sent by the LSP receiver MUST be understood as instructions from the receiver to each transmitter about the desired maximum transmit characteristics of each transmitter. The receiver is aware that there are multiple transmitters that can send LSPs to the receiver LAN interface. The receiver might want to take that into account by advertising more conservative values, e.g. a higher LSP Transmission Interval. When the transmitters receive the LSP Transmission Interval value advertised by an LSP receiver, the transmitters should rate limit LSPs according to the advertised flooding parameters. They should not apply any further interpretation to the flooding parameters advertised by the receiver.

A given LSP transmitter will receive multiple flooding parameter advertisements from different receivers that may carry different flooding parameter values. A given transmitter SHOULD use the most conservative value on a per parameter basis. For example, if the transmitter receives multiple LSP Burst Window values, it should use the smallest value.

This section defines two behaviors that SHOULD be implemented on the receiver.

On point-to-point networks, PSNPs provide acknowledgments for
received LSPs.
The base specification suggests that some delay be
used when sending PSNPs. This provides some optimization as multiple
LSPs can be acknowledged in a single PSNP.
Faster LSP flooding benefits from a faster feedback
loop. This requires a reduction in the delay in sending
PSNPs.
The receiver SHOULD reduce its partialSNPInterval. The choice of this lower value is a local choice. It may depend on the available processing power of the node, the number of adjacencies, and the requirement to synchronize the LSDB more quickly. 200 ms seems to be a reasonable value.
In addition to the timer based
partialSNPInterval, the receiver SHOULD keep
track of the number of unacknowledged LSPs
per circuit and level. When this number
exceeds a preset threshold of LSPs Per PSNP
(LPP), the receiver SHOULD immediately send
a PSNP without waiting for the PSNP timer to
expire. In case of a burst of LSPs, this
allows for more frequent PSNPs, giving
faster feedback to the sender. Outside of
the burst case, the usual time-based PSNP
approach comes into effect. The LPP SHOULD
also be less than or equal to 90 as this is
the maximum number of LSPs that can be
acknowledged in a PSNP at common MTU sizes,
hence waiting longer would not reduce the
number of PSNPs sent but would delay the
acknowledgements. Based on experimental
evidence, 15 unacknowledged LSPs is a good
value assuming that the LSP Burst Window is
at least 30 and reasonably fast CPUs for
both the transmitter and receiver. More
frequent PSNPs give the transmitter more
feedback on receiver progress, allowing the
transmitter to continue transmitting while
not burdening the receiver with undue
overhead.
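
The combination of the time-based and the threshold-based triggers described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the class and method names are hypothetical, one instance is assumed per circuit and level, times are integer milliseconds supplied by the caller, and the threshold fires once LPP LSPs are pending (matching the LPP sub-TLV definition). The timer is measured from the first unacknowledged LSP.

```python
class PsnpScheduler:
    """Decides when a receiver should send a PSNP (one per circuit and level)."""

    def __init__(self, lpp: int = 15, partial_snp_interval_ms: int = 200):
        self.lpp = lpp
        self.interval_ms = partial_snp_interval_ms
        self.unacked = 0                # LSPs received but not yet acknowledged
        self.first_unacked_at = None    # arrival time of the oldest pending LSP

    def _psnp_sent(self) -> bool:
        self.unacked = 0
        self.first_unacked_at = None
        return True

    def on_lsp_received(self, now_ms: int) -> bool:
        """Record one LSP; return True if a PSNP should be sent immediately."""
        if self.unacked == 0:
            self.first_unacked_at = now_ms
        self.unacked += 1
        if self.unacked >= self.lpp:    # threshold-based trigger
            return self._psnp_sent()
        return False

    def on_tick(self, now_ms: int) -> bool:
        """Periodic check; return True if the timer-based trigger fires."""
        if self.unacked and now_ms - self.first_unacked_at >= self.interval_ms:
            return self._psnp_sent()
        return False


scheduler = PsnpScheduler(lpp=3, partial_snp_interval_ms=200)
burst = [scheduler.on_lsp_received(t) for t in (0, 1, 2)]   # third LSP triggers
```

During a burst the threshold dominates and PSNPs go out every LPP LSPs; for isolated LSPs the timer bounds the acknowledgment delay.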
By deploying both the time-based and the threshold-based PSNP approaches, the receiver can be adaptive to both LSP bursts and infrequent LSP updates.

As PSNPs also consume link bandwidth, packet queue space, and
protocol processing time on receipt, the increased sending of PSNPs
should be taken into account when considering the rate at which LSPs
can be sent on an interface.

There are three classes of PDUs sent by IS-IS:

*  Hellos
*  LSPs
*  Complete Sequence Number PDUs (CSNPs) and PSNPs

Implementations today may prioritize the reception of Hellos
over LSPs and SNPs in order to prevent a burst of LSP updates from
triggering an adjacency timeout which in turn would require additional
LSPs to be updated.

CSNPs and PSNPs serve to trigger or acknowledge the transmission of specified
LSPs. On a point-to-point link, PSNPs acknowledge the receipt of one
or more LSPs.
For this reason, the base specification
specifies a delay
(partialSNPInterval) before sending a PSNP so that the number of PSNPs
required to be sent is reduced. On receipt of a PSNP, the set of LSPs
acknowledged by that PSNP can be marked so that they do not need to be
retransmitted.

If a PSNP is dropped on reception,
the set of LSPs advertised in the PSNP cannot be marked as
acknowledged and this results in needless retransmissions that will
further delay transmission of other LSPs that have yet to be
transmitted. It may also make it more likely that a receiver becomes
overwhelmed by LSP transmissions.

It is therefore RECOMMENDED that implementations prioritize the
receipt of Hellos and then SNPs over LSPs. Implementations MAY also prioritize IS-IS packets over other less critical protocols.

Ensuring the goodput between two entities is a layer 4 responsibility as per the OSI model. A typical example is the TCP protocol defined in
RFC 793, which relies on flow control, congestion control, and reliability mechanisms.

Flow control creates a control loop between a transmitter and a receiver so that the transmitter does not overwhelm the receiver. TCP provides a means for the receiver to govern the amount of data sent by the sender through the use of a sliding window.

Congestion control creates multiple interacting control loops between multiple transmitters and multiple receivers to prevent the transmitters from overwhelming the overall network. For an IS-IS adjacency, the network between two IS-IS neighbors is relatively limited in scope and consists of a link that is typically over-sized compared to the capability of the IS-IS speakers, but may also include components inside both routers such as a switching fabric, line card CPU, and forwarding plane buffers that may experience congestion. These resources may be shared across multiple IS-IS adjacencies for the system and it is the responsibility of congestion control to ensure that these are shared reasonably.

Reliability provides loss detection and recovery. IS-IS already has mechanisms to ensure the reliable transmission of LSPs. This is not changed by this document.

The following two sections provide examples of flow and/or congestion control algorithms that may be implemented by taking advantage of the extensions defined in this document. They are non-normative. An implementation may implement any congestion control algorithm.

A flow control mechanism creates a control loop
between a single instance of a transmitter and a
single receiver. This example uses a mechanism
similar to the TCP receive window to allow the
receiver to govern the amount of data sent by the
sender. This receive window ('rwin') indicates an
allowed number of LSPs that the sender may
transmit before waiting for an acknowledgment. The
size of the receive window, in units of LSPs, is
initialized with the value advertised by the
receiver in the LSP Burst Window sub-TLV. If no
value is advertised, the transmitter should
initialize rwin with its own local value.
When the transmitter sends a set of LSPs to the
receiver, it subtracts the number of LSPs sent
from rwin. If the transmitter receives a PSNP,
then rwin is incremented for each acknowledged
LSP. The transmitter must ensure that the value of
rwin never goes negative.
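
The receive window bookkeeping described above can be sketched in a few lines. The class and method names are hypothetical, as is the local default of 10 LSPs. Capping rwin at its initial size when acknowledgments arrive is an added assumption, so that duplicate acknowledgments cannot grow the window beyond what the receiver advertised.

```python
from typing import Optional


class ReceiveWindow:
    """Transmitter-side view of the receiver's window, in units of LSPs."""

    def __init__(self, advertised_burst_window: Optional[int] = None,
                 local_default: int = 10):
        # Use the advertised LSP Burst Window, else the local value.
        if advertised_burst_window is not None:
            self.size = advertised_burst_window
        else:
            self.size = local_default
        self.rwin = self.size

    def take(self, wanted: int) -> int:
        """Return how many LSPs may be sent now; rwin never goes negative."""
        allowed = min(wanted, self.rwin)
        self.rwin -= allowed
        return allowed

    def on_psnp(self, acked_lsps: int) -> None:
        """Credit the window for each LSP acknowledged by a PSNP."""
        self.rwin = min(self.size, self.rwin + acked_lsps)


window = ReceiveWindow(advertised_burst_window=30)
sent_now = window.take(50)   # only 30 of the 50 queued LSPs may be sent
window.on_psnp(15)           # a PSNP acknowledging 15 LSPs reopens the window
```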
By sending the LSP Burst Window sub-TLV, a node advertises to its neighbor its ability to receive that many un-acknowledged LSPs from the neighbor, with no separation interval. This is akin to a receive window or sliding window in flow control. In some implementations, this value should reflect the IS-IS socket buffer size. Special care must be taken to leave space for CSNPs, PSNPs, and IIHs if they share the same input queue. In this case, this document suggests advertising an LSP Burst Window corresponding to half the size of the IS-IS input queue.

By advertising an LSP Transmission Interval sub-TLV, a node advertises its ability to receive LSPs separated by at least the advertised value, outside of LSP bursts.

The LSP transmitter MUST NOT exceed these parameters. After having sent a full burst of un-acknowledged LSPs, it MUST send the following LSPs with an LSP Transmission Interval between LSP arrivals. For CPU scheduling reasons, this rate may be averaged over a small period, e.g. 10 to 30 ms.

If either the LSP transmitter or receiver does not adhere to these parameters, for example because of transient conditions, this causes no fatal condition to the operation of IS-IS. In the worst case, an LSP is lost at the receiver and this situation is already remedied by mechanisms in
the base specification. After a few seconds, neighbors will exchange PSNPs (for point to point interfaces) or CSNPs (for broadcast interfaces) and recover from the lost LSPs. This worst case should be avoided as those additional seconds impact convergence time as the LSDB is not fully synchronized. Hence it is better to err on the conservative side and to under-run the receiver rather than over-run it.
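
As an illustration of how the two parameters combine, the sketch below computes the earliest send time of each LSP in the case where no acknowledgment arrives while sending. The function name is hypothetical, and the model is deliberately simplified: it ignores acknowledgments, which would reopen the burst window, and it does not model the averaging over 10 to 30 ms mentioned above.

```python
from typing import List


def send_times_us(lsp_count: int, burst_window: int, interval_us: int) -> List[int]:
    """Earliest send time (microseconds from start) of each LSP, with no acks."""
    times = []
    for index in range(lsp_count):
        if index < burst_window:
            times.append(0)   # inside the burst: sent back to back
        else:
            # After the burst, one LSP per LSP Transmission Interval.
            times.append((index - burst_window + 1) * interval_us)
    return times


# Burst window of 3 LSPs, then one LSP every 1000 microseconds.
schedule = send_times_us(5, 3, 1000)   # [0, 0, 0, 1000, 2000]
```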
In order for the LSP Burst Window
to be a useful parameter, an LSP
transmitter needs to be able to
keep track of the number of
un-acknowledged LSPs it has sent
to a given LSP receiver. On a LAN
there is no explicit
acknowledgment of the receipt of
LSPs between a given LSP
transmitter and a given LSP
receiver. However, an LSP
transmitter on a LAN can infer
whether any LSP receiver on the
LAN has requested retransmission
of LSPs from the DIS by monitoring
PSNPs generated on the LAN. If no
PSNPs have been generated on the
LAN for a suitable period of time,
then an LSP transmitter can safely
set the number of un-acknowledged
LSPs to zero. Since this suitable
period of time is much higher than
the fast acknowledgment of LSPs
defined earlier in this document, the
sustainable transmission rate of
LSPs will be much slower on a LAN
interface than on a point to point
interface. The LSP Burst Window is
still very useful for the first
burst of LSPs sent, especially in
the case of a single node failure
that requires the flooding of a
relatively small number of LSPs.
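
On a LAN, the LSP Burst Window that a transmitter applies comes from several receivers, and this document states that the transmitter uses the most conservative value per parameter. A minimal sketch of that selection follows. The function name and dictionary keys are hypothetical, and only the two parameters whose conservative direction this document states are handled: the smallest LSP Burst Window and the largest LSP Transmission Interval.

```python
from typing import Dict, List


def most_conservative(advertisements: List[Dict[str, int]]) -> Dict[str, int]:
    """Combine per-receiver parameters heard on a LAN into the values to apply."""
    bursts = [adv["burst_window"] for adv in advertisements if "burst_window" in adv]
    intervals = [adv["interval_us"] for adv in advertisements if "interval_us" in adv]
    combined: Dict[str, int] = {}
    if bursts:
        combined["burst_window"] = min(bursts)    # smallest burst is safest
    if intervals:
        combined["interval_us"] = max(intervals)  # longest gap is safest
    return combined


lan_params = most_conservative([
    {"burst_window": 30, "interval_us": 1000},
    {"burst_window": 20},
    {"interval_us": 5000},
])
```

A parameter that no receiver on the LAN advertises is simply absent from the result, leaving the transmitter to fall back to its local default.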
Whereas flow control prevents the sender from overwhelming the receiver, congestion control prevents senders from overwhelming the network. For an IS-IS adjacency, the network between two IS-IS neighbors is relatively limited in scope and includes a single link which is typically over-sized compared to the capability of the IS-IS speakers.

This section describes one congestion control algorithm largely inspired by the TCP congestion control algorithm RFC 5681.

The proposed algorithm uses a variable congestion window 'cwin'. It plays a role similar to the receive window described above. The main difference is that cwin is dynamically changed according to various events described below.

In its simplest form, the congestion control algorithm works as follows.

The algorithm starts with cwin := LPP + 1. In the congestion avoidance phase, cwin increases as LSPs are acked: for every acked LSP, cwin += 1 / cwin. Thus, the sending rate roughly increases linearly with the RTT. Since the RTT is low in many IS-IS deployments, the sending rate can reach fast rates in short periods of time.

When updating cwin, it must not become higher than the number of LSPs waiting to be sent, otherwise the sending will not be paced by the receiving of acks. Said differently, tx pressure is needed to maintain and increase cwin.

When the congestion signal is triggered, cwin is set back to its initial value and the congestion avoidance phase starts again.

The congestion signal can take various forms. The more reactive the congestion signals, the fewer LSPs will be lost due to congestion. However, overly aggressive congestion signals will cause a sender to keep a very low sending rate even without actual congestion on the path.

Two practical signals are given hereafter.

Timers: when receiving acknowledgements, a sender estimates the acknowledgement time of the receiver. Based on this estimation, it can infer that a packet was lost, and infer congestion on the path.

There can be a timer per LSP, but this can become costly for implementations. It is possible to use only a single timer t1 for all LSPs: during t1, sent LSPs are recorded in a list list_1. Once the RTT is over, list_1 is kept and another list list_2 is used to store the next LSPs. LSPs are removed from the lists when acked. At the end of the second t1 period, every LSP in list_1 should have been acked, so list_1 is checked to be empty. list_1 can then be reused for the next RTT.

There are multiple strategies to set the timeout value t1. It should be based on measures of the maximum acknowledgement time (MAT) of each PSNP. The simplest one is to use an exponential moving average of the MATs, like RFC 6298. A more elaborate one is to take a running maximum of the MATs over a period of time of a few seconds. This value should include a margin of error to avoid false positives (e.g. estimated MAT measure variance) which would have a significant impact on performance.

Reordering: a sender can record its sending order and check that acknowledgements arrive in the same order as the LSPs were sent. This makes an additional assumption and should ideally be backed up by a confirmation by the receiver that this assumption stands. The O flag defined in the Flags sub-TLV serves this purpose.

With the algorithm presented above, if congestion is detected, cwin goes back to its initial value, and does not use the information gathered in previous congestion avoidance phases.

It is possible to use a fast recovery phase once congestion is detected, to avoid going through this linear rate of growth from scratch. When congestion is detected, a fast recovery threshold frthresh is set to frthresh := cwin / 2. In this fast recovery phase, for every acked LSP, cwin += 1.
Once cwin reaches frthresh, the algorithm goes back to the congestion avoidance phase.

The rates of increase were inspired by TCP RFC 5681, but it is possible that a different rate of increase for cwin in the congestion avoidance phase actually yields better results due to the low RTT values in most IS-IS deployments.
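
The congestion avoidance and fast recovery rules described above can be sketched as follows. This is one possible reading, not a definitive implementation: the class and method names are hypothetical, the congestion signal itself (timers or reordering) is left to the caller, and the rule that cwin must not exceed the number of LSPs waiting to be sent is interpreted here as skipping any increase that would cross that limit.

```python
class CongestionWindow:
    """Tracks cwin, in units of LSPs, for one adjacency."""

    def __init__(self, lpp: int):
        self.initial = float(lpp + 1)   # cwin := LPP + 1
        self.cwin = self.initial
        self.frthresh = None            # set only during fast recovery

    def on_ack(self, acked_lsps: int, lsps_waiting: int) -> None:
        """Grow cwin for each acknowledged LSP."""
        for _ in range(acked_lsps):
            if self.frthresh is not None:
                increase = 1.0               # fast recovery: cwin += 1
            else:
                increase = 1.0 / self.cwin   # congestion avoidance
            if self.cwin + increase > lsps_waiting:
                continue                     # no tx pressure, do not grow
            self.cwin += increase
            if self.frthresh is not None and self.cwin >= self.frthresh:
                self.frthresh = None         # back to congestion avoidance

    def on_congestion(self) -> None:
        """React to a congestion signal."""
        self.frthresh = self.cwin / 2        # frthresh := cwin / 2
        self.cwin = self.initial
        if self.cwin >= self.frthresh:
            self.frthresh = None             # nothing to recover


cc = CongestionWindow(lpp=15)
cc.on_ack(1, lsps_waiting=1000)   # cwin grows from 16 to 16 + 1/16
```

After a congestion signal with cwin at 64, for example, this sketch restarts at 16 and adds 1 per acknowledged LSP until it reaches 32, then resumes the slower 1 / cwin growth.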
This algorithm's performance is dependent
on the LPP value. Indeed, the smaller LPP
is, the more information is available for
the congestion control algorithm to
perform well. However, it also increases
the resources spent on sending PSNPs, so a
tradeoff must be made. This document
recommends using an LPP of 15 or less. If
an LSP Burst Window is advertised, LPP
SHOULD be lower and the best performance
is achieved when LPP is an integer
fraction of the LSP Burst Window.
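
One way to pick an LPP that satisfies the guidance above is to take the largest divisor of the advertised LSP Burst Window that is both lower than the window and no greater than 15. The helper below is a hypothetical illustration of that selection, reading "integer fraction" as the burst window divided by an integer; it is not a procedure defined by this document.

```python
def choose_lpp(burst_window: int, max_lpp: int = 15) -> int:
    """Largest LPP <= max_lpp that is below and evenly divides burst_window."""
    for lpp in range(min(max_lpp, burst_window - 1), 0, -1):
        if burst_window % lpp == 0:
            return lpp
    return 1   # degenerate windows (burst_window <= 1) fall back to 1


print(choose_lpp(30))   # 15
print(choose_lpp(32))   # 8
```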
Note that this congestion control algorithm benefits from the extensions proposed in this document. The advertisement of a receive window from the receiver avoids the use of an arbitrary maximum value by the sender. The faster acknowledgment of LSPs allows for a faster control loop and hence a faster increase of the congestion window in the absence of congestion.
The values that a receiver advertises do not need to be perfect. If the values are too low then the transmitter will not use the full bandwidth or available CPU resources. If the values are too high then the receiver may drop some LSPs during the first RTT; this loss will reduce the usable receive window, and the protocol mechanisms will allow the adjacency to recover. Flooding several orders of magnitude slower than both nodes can achieve will hurt performance, as will consistently overloading the receiver.

The values advertised need not be dynamic as feedback is provided by the acknowledgment of LSPs in SNP messages. Acknowledgments provide a feedback loop on how fast the LSPs are processed by the receiver. They also signal that the LSPs can be removed from the receive window, explicitly signaling to the sender that more LSPs may be sent. By advertising relatively static parameters, we expect to produce overall flooding behavior similar to what might be achieved by manually configuring per-interface LSP rate limiting on all interfaces in the network. The advertised values may be based, for example, on offline tests of the overall LSP processing speed for a particular set of hardware and the number of interfaces configured for IS-IS. With such a formula, the values advertised in the Flooding Parameters TLV would only change when additional IS-IS interfaces are configured.

The values may be updated dynamically, to reflect the relative change of load of the receiver, by improving the values when the receiver load is getting lower and degrading the values when the receiver load is getting higher. For example, if LSPs are regularly dropped, or if the queue regularly comes close to being filled, then the values may be too high.
On the other hand, if the queue is barely used (by IS-IS), then values may be too low.

The values may also be absolute values reflecting relevant average hardware resources that are being monitored, typically the amount of buffer space used by incoming LSPs. In this case, care must be taken when choosing the parameters influencing the values in order to avoid undesirable or unstable feedback loops. It would be undesirable to use a formula that depends, for example, on an active measurement of the instantaneous CPU load to modify the values advertised in the Flooding Parameters TLV. This could introduce feedback into the IGP flooding process that could produce unexpected behavior.

As discussed earlier, the solution is more effective on point to point adjacencies. Hence a broadcast interface (e.g. Ethernet) only shared by two IS-IS neighbors should be configured as point to point in order to have more effective flooding.

This section describes a congestion control algorithm based on
performance measured by the transmitter without dependence on
signaling from the receiver.

(The following description is an abstraction; implementation
details vary.)

Existing router architectures may utilize multiple input queues.
On a given line card, IS-IS PDUs from multiple interfaces may be
placed in a rate limited input queue. This queue may be dedicated to
IS-IS PDUs or may be shared with other routing related packets.

The input queue may then pass IS-IS PDUs to a "punt queue" which
is used to pass PDUs from the data plane to the control plane. The
punt queue typically also has controls on its size and the rate at
which packets will be punted.

An input queue in the control plane may then be used to assemble
PDUs from multiple linecards, separate the IS-IS PDUs from other
types of packets, and place the IS-IS PDUs in an input queue
dedicated to the IS-IS protocol.

The IS-IS input queue then separates the IS-IS PDUs and directs
them to an instance specific processing queue. The instance
specific processing queue may then further separate the IS-IS PDUs
by type (IIHs, SNPs, and LSPs) so that separate processing threads
with varying priorities may be employed to process the incoming
PDUs.

In such an architecture, it may be difficult for IS-IS in the
control plane to accurately track the state of the various input
queues and determine what value should be advertised as a current
receive window.

The following section describes a congestion control algorithm
based on performance measured by the transmitter without dependence
on signaling from the receiver.

The congestion control algorithm described in this section does
not depend upon direct signaling from the receiver. Instead it
adapts the transmission rate based on measurement of the actual
rate of acknowledgments received.

When flow control is necessary, it can be implemented in a
straightforward manner based on knowledge of the current flooding
rate and the current acknowledgement rate. Such an algorithm is a
local matter and there is no requirement or intent to standardize an
algorithm. There are a number of aspects which serve as guidelines
which can be described.

A maximum target LSP transmission rate (LSPTxMax) SHOULD be
configurable. This represents the fastest LSP transmission rate
which will be attempted. This value SHOULD be applicable to all
interfaces and SHOULD be consistent network wide.

When the current rate of LSP transmission (LSPTxRate) exceeds the
capabilities of the receiver, the flow control algorithm needs to
aggressively reduce the LSPTxRate within a few seconds. Slower
responsiveness is likely to result in a large number of
retransmissions which can introduce much larger delays in
convergence.

NOTE: Even with modest increases in flooding speed (for example,
a target LSPTxMax of 300 LSPs/second, 10 times the typical rate
supported today), a topology change triggering 2100 new LSPs would
only take 7 seconds to complete.

Dynamic adjustment of the rate of LSP transmission (LSPTxRate)
upwards (i.e., faster) SHOULD be done less aggressively and only be
done when the neighbor has demonstrated its ability to sustain the
current LSPTxRate.

The flow control algorithm MUST NOT assume the receive
capabilities of a neighbor are static, i.e., it MUST handle
transient conditions which result in a slower or faster receive rate
on the part of a neighbor.

The flow control algorithm needs to consider the expected delay
time in receiving an acknowledgment. It therefore incorporates the
neighbor's partialSNPInterval to help
determine whether acknowledgments are keeping pace with the rate of
LSPs transmitted. In the absence of an advertisement of
partialSNPInterval, a locally configured value can be used.

IANA is requested to allocate one TLV from the IS-IS TLV codepoint registry.

This document creates the following sub-TLV Registry:

Name: Sub-TLVs for TLV TBD1 (Flooding Parameters TLV).
Registration Procedure(s): Expert Review
Expert(s): TBD
Reference: TBD

   Type     Description
   ------   -------------------------
   0        Reserved
   1        LSP Burst Window
   2        LSP Transmission Interval
   3        LSPs Per PSNP
   4        Flags
   5        Partial SNP Interval
   6-255    Unassigned

This document also requests IANA to create a new registry for
assigning Flag bits advertised in the Flags sub-TLV.

Name: Flooding Parameters Flags Bits.
Registration Procedure: Expert Review
Expert(s): TBD

Security concerns for IS-IS are addressed in
existing specifications. These documents
describe mechanisms that provide the authentication and integrity of IS-IS
PDUs, including SNPs and IIHs. These authentication mechanisms are not
altered by this document.

With these cryptographic mechanisms in place, an attacker wanting to advertise an incorrect
Flooding Parameters TLV would first have to defeat them.

In the absence of cryptographic authentication, as IS-IS does not run over IP but directly over the link layer, it's considered difficult to inject a false SNP/IIH without having access to the link layer.

If a false SNP/IIH is sent with a Flooding Parameters TLV set to conservative values, the attacker can reduce the flooding speed between the two adjacent neighbors which can result in LSDB inconsistencies and transient forwarding loops. However, it is not significantly different from filtering or altering LSPs which would also be possible with access to the link layer. In addition, if the downstream flooding neighbor has multiple IGP neighbors, which is typically the case for reliability or topological reasons, it would receive LSPs at a regular speed from its other neighbors and hence would maintain LSDB consistency.

If a false SNP/IIH is sent with a Flooding Parameters TLV set to aggressive values, the attacker can increase the flooding speed which can either overload a node or more likely generate loss of LSPs. However, it is not significantly different from sending many LSPs which would also be possible with access to the link layer, even with cryptographic authentication enabled. In addition, IS-IS has procedures to detect the loss of LSPs and recover.

This TLV advertisement is not flooded across the network but only sent between adjacent IS-IS neighbors.
This would limit the consequences in case of forged messages, and also limits the dissemination of such information.

The following people gave a substantial contribution to the content of this document and should be considered as coauthors:

Acee Lindem, Cisco Systems, acee@cisco.com
Jayesh J, Juniper Networks, jayeshj@juniper.net

The authors would like to thank Henk Smit, Sarah Chen, Xuesong Geng, Pierre Francois and Hannes Gredler for their reviews, comments and suggestions.

The authors would like to thank David Jacquet, Sarah Chen, and Qiangzhou Gao for the tests performed on commercial implementations and their identification of some limiting factors.

International Organization for Standardization, "Intermediate system to Intermediate system intra-domain routeing information exchange protocol for use in conjunction with the protocol for providing the connectionless-mode Network Service (ISO 8473)".

[RFC Editor: Please remove this section before publication]

00: Initial version.

[RFC Editor: Please remove this section before publication]

This section captures issues which the authors either have not yet
had time to address or on which the authors have not yet reached
consensus. Future revisions of this document may include new/altered
text relevant to these issues.
There are no open issues at this time.