[tcpm] mitigating TCP ACK loop ("ACK storm") DoS attacks
TCP DoS scenarios involving ACK loops (aka "ACK storms" or "packet
wars") have come up previously on the TCPM list. For example, Anil
Agarwal brought them up in the Nov 2013 TCPM thread "TCP mismatched
sequence numbers issue".
I wanted to mention that our TCP team at Google has recently submitted
a patch series for Linux that mitigates such attacks by rate-limiting
the dupacks that are sent in response to out-of-window incoming
packets. The code has been in use at Google and was recently merged
into the official Linux "net-next" tree, which means that it should
land in the next official Linux release.
The patch series summary can be browsed at the following URL:
http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=f06535c599354816cfbc653ce8965804c7385c61
Below I'm including the cover letter summarizing the patch series, for
convenience/reference.
We are interested to hear any feedback folks may have.
Thanks!
neal
==============
tcp: mitigate TCP ACK loops due to out-of-window validation dupacks
This patch series mitigates "ack loop" DoS scenarios by rate-limiting
outgoing duplicate ACKs sent in response to incoming "out of window"
segments.
Background
-----------
There are several cases in which the TCP RFCs specify that a TCP
endpoint should send a pure duplicate ACK in response to a pure
duplicate ACK that appears to be invalid due to being "out of window":
(1) RFC 793 (section 3.9, page 69) specifies that endpoints should
    send a duplicate ACK in response to an ACK when the incoming
    sequence number is invalid due to being outside the receive
    window: "If an incoming segment is not acceptable, an
    acknowledgment should be sent in reply".
(2) RFC 793 (section 3.9, page 72) says: "If the ACK acknowledges
    something not yet sent (SEG.ACK > SND.NXT) then send an ACK".
(3) RFC 1323 (section 4.2.1, page 18) specifies that endpoints should
    send a duplicate ACK in response to an ACK when the PAWS check for
    the incoming timestamp value fails: "If .... SEG.TSval < TS.Recent
    and if TS.Recent is valid ... Send an acknowledgement in reply".
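
To make the three rules concrete, here is a minimal Python sketch of the
checks. It is an illustration only, not the Linux code: the Connection and
Segment types and the function names are hypothetical, and the function
simply reports every rule that would call for a dupack rather than following
the processing order of any real stack.

```python
from dataclasses import dataclass

MOD = 2**32  # sequence numbers and timestamps are 32-bit and wrap around


def before(a, b):
    """True if a precedes b in 32-bit wraparound arithmetic."""
    return (a - b) % MOD >= 2**31


def in_window(seq, rcv_nxt, rcv_wnd):
    """True if RCV.NXT <= seq < RCV.NXT + RCV.WND, modulo 2**32."""
    return (seq - rcv_nxt) % MOD < rcv_wnd


@dataclass
class Connection:  # hypothetical per-connection state
    rcv_nxt: int
    rcv_wnd: int
    snd_nxt: int
    ts_recent: int
    ts_recent_valid: bool


@dataclass
class Segment:  # hypothetical view of an incoming segment
    seq: int
    length: int
    ack: int
    tsval: int


def seq_acceptable(seg, conn):
    """RFC 793 acceptability test (four cases on length and window)."""
    if seg.length == 0:
        if conn.rcv_wnd == 0:
            return seg.seq == conn.rcv_nxt
        return in_window(seg.seq, conn.rcv_nxt, conn.rcv_wnd)
    if conn.rcv_wnd == 0:
        return False
    last = (seg.seq + seg.length - 1) % MOD
    return (in_window(seg.seq, conn.rcv_nxt, conn.rcv_wnd)
            or in_window(last, conn.rcv_nxt, conn.rcv_wnd))


def dupack_reasons(seg, conn):
    """Return which of the rules (1)-(3) would trigger a dupack."""
    reasons = []
    if not seq_acceptable(seg, conn):
        reasons.append("seq")   # rule (1): sequence number out of window
    if before(conn.snd_nxt, seg.ack):
        reasons.append("ack")   # rule (2): SEG.ACK > SND.NXT
    if conn.ts_recent_valid and before(seg.tsval, conn.ts_recent):
        reasons.append("paws")  # rule (3): SEG.TSval < TS.Recent
    return reasons
```

An empty list means the segment passes all three checks; any other result
names the rule(s) under which the RFCs ask for an ACK in reply.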
The problem
------------
Normally, these rules cause no problems. However, a buggy middlebox or
malicious man-in-the-middle can inject a few packets into the
conversation that advance each endpoint's notion of the current window
(sequence, ACK, or timestamp), without either side noticing. In this
case, from then on each side can think the other is sending invalid
segments. Thus an infinite feedback loop of duplicate ACKs can ensue,
as each endpoint receives a duplicate ACK, decides that it is invalid
(due to sequence number, ACK number, or timestamp), and then sends a
dupack in reply, which the other side decides is invalid, responding
with a dupack... ad infinitum. This ping-pong feedback loop can happen
at a very high rate.
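
The feedback loop can be reproduced with a toy model. The sketch below is a
simplification under stated assumptions: the Endpoint class is hypothetical,
segments are pure ACKs represented as (seq, ack) pairs, only the sequence and
ACK checks are modeled (no timestamps), and sequence-number wraparound is
ignored. The desynchronization is modeled as a single injected 100-byte
segment that endpoint A accepted as if it came from B.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:  # hypothetical, heavily simplified TCP endpoint
    snd_nxt: int
    rcv_nxt: int
    rcv_wnd: int = 1000

    def make_ack(self):
        """Build a pure ACK as a (seq, ack) pair."""
        return (self.snd_nxt, self.rcv_nxt)

    def receive(self, seg):
        """Return a dupack if the incoming pure ACK looks invalid, else None."""
        seq, ack = seg
        seq_ok = self.rcv_nxt <= seq < self.rcv_nxt + self.rcv_wnd
        ack_ok = ack <= self.snd_nxt
        if seq_ok and ack_ok:
            return None  # valid pure ACK: nothing to send in reply
        return self.make_ack()


def count_packets(a, b, max_packets):
    """A sends one ACK to B; count packets until the exchange stops or hits the cap."""
    seg = a.make_ack()
    receiver, other = b, a
    count = 0
    while seg is not None and count < max_packets:
        count += 1
        seg = receiver.receive(seg)
        receiver, other = other, receiver
    return count


# Synchronized connection: a single ACK is accepted silently.
in_sync = count_packets(Endpoint(snd_nxt=100, rcv_nxt=500),
                        Endpoint(snd_nxt=500, rcv_nxt=100), max_packets=1000)

# A accepted 100 injected bytes, so A.rcv_nxt is 100 ahead of B.snd_nxt.
desynced = count_packets(Endpoint(snd_nxt=100, rcv_nxt=600),
                         Endpoint(snd_nxt=500, rcv_nxt=100), max_packets=1000)
```

In the synchronized case the exchange ends after one packet. In the
desynchronized case neither side's state changes in response to a dupack, so
the exchange only stops when it reaches the artificial max_packets cap.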
This phenomenon can and does happen in practice. It has been seen in
datacenter and Internet contexts at Google, and has been documented by
Anil Agarwal in the Nov 2013 tcpm thread "TCP mismatched sequence
numbers issue", and Avery Fay in the Feb 2015 Linux netdev thread
"Invalid timestamp? causing tight ack loop (hundreds of thousands of
packets / sec)".
This patch series
------------------
This patch series mitigates such ack loops by rate-limiting outgoing
duplicate ACKs sent in response to incoming TCP packets that are for
an existing connection but that are invalid due to any of the reasons
mentioned above: sequence number (1), ACK field (2), or timestamp
value (3). The rate limit for such duplicate ACKs is specified by a
new sysctl, tcp_invalid_ratelimit, which sets the minimum interval
between such outbound duplicate ACKs, in milliseconds. The default is
500 (500ms), and 0 disables the mechanism.
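
As an illustration of the rate-limiting idea, here is a minimal Python
sketch. It is not the kernel implementation: the class and method names are
hypothetical, and it assumes a single per-connection timestamp of the last
dupack sent for an invalid segment, with the interval semantics described
above (a value in milliseconds, 0 meaning disabled).

```python
class InvalidDupackLimiter:
    """Hypothetical per-connection limiter for dupacks sent in response
    to invalid (out-of-window) segments."""

    def __init__(self, interval_ms=500):
        self.interval_ms = interval_ms  # minimum spacing; 0 disables limiting
        self.last_sent_ms = None        # time of the last dupack we allowed

    def allow(self, now_ms):
        """Return True if a dupack may be sent at time now_ms."""
        if self.interval_ms == 0:
            return True
        if (self.last_sent_ms is not None
                and now_ms - self.last_sent_ms < self.interval_ms):
            return False  # too soon after the previous dupack: suppress it
        self.last_sent_ms = now_ms
        return True


# One invalid segment arriving every millisecond for one second.
limiter = InvalidDupackLimiter(interval_ms=500)
sent = sum(limiter.allow(t) for t in range(1000))
```

With one invalid segment arriving per millisecond for a second, this sketch
lets through only the dupacks at t=0 and t=500, so a loop that depends on
every dupack being answered cannot sustain a high packet rate.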
We rate-limit these duplicate ACK responses rather than blocking them
entirely or resetting the connection, because legitimate connections
can rely on dupacks in response to some out-of-window segments. For
example, zero window probes are typically sent with a sequence number
that is below the current window, and ZWPs thus expect to elicit
a dupack in response.
Testing: this approach has been in use at Google for a while.