| < draft-ietf-tcpm-tcp-security-02.txt | draft-ietf-tcpm-tcp-security-03.txt > | |||
|---|---|---|---|---|
| TCP Maintenance and Minor F. Gont | TCP Maintenance and Minor Extensions F. Gont | |||
| Extensions (tcpm) UK CPNI | (tcpm) UK CPNI | |||
| Internet-Draft January 21, 2011 | Internet-Draft March 13, 2012 | |||
| Intended status: BCP | Intended status: Informational | |||
| Expires: July 25, 2011 | Expires: September 14, 2012 | |||
| Security Assessment of the Transmission Control Protocol (TCP) | Survey of Security Hardening Methods for Transmission Control Protocol | |||
| draft-ietf-tcpm-tcp-security-02.txt | (TCP) Implementations | |||
| draft-ietf-tcpm-tcp-security-03.txt | ||||
| Abstract | Abstract | |||
| This document contains a security assessment of the specifications of | This document surveys methods to harden Transmission Control Protocol | |||
| the Transmission Control Protocol (TCP), and of a number of | (TCP) implementations. It provides an overview of known attacks and | |||
| mechanisms and policies in use by popular TCP implementations. | refers to the corresponding solutions in the TCP standards. | |||
| Additionally, it contains best current practices for hardening a TCP | ||||
| implementation. | ||||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted to IETF in full conformance with the | This Internet-Draft is submitted to IETF in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on July 25, 2011. | This Internet-Draft will expire on September 14, 2012. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2011 IETF Trust and the persons identified as the | Copyright (c) 2012 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 1. Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 1.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 5 | 1.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 1.2. Scope of this document . . . . . . . . . . . . . . . . . 6 | 1.2. Scope of this document . . . . . . . . . . . . . . . . . . 6 | |||
| 1.3. Organization of this document . . . . . . . . . . . . . . 8 | 1.3. Organization of this document . . . . . . . . . . . . . . 7 | |||
| 2. The Transmission Control Protocol . . . . . . . . . . . . . . 8 | 2. The Transmission Control Protocol . . . . . . . . . . . . . . 7 | |||
| 3. TCP header fields . . . . . . . . . . . . . . . . . . . . . . 9 | 3. TCP header fields . . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 3.1. Source Port and Destination Port . . . . . . . . . . . . 10 | 3.1. Source Port and Destination Port . . . . . . . . . . . . . 8 | |||
| 3.2. Sequence number . . . . . . . . . . . . . . . . . . . . . 12 | 3.2. Sequence number . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 3.3. Acknowledgement Number . . . . . . . . . . . . . . . . . 14 | 3.3. Acknowledgement Number . . . . . . . . . . . . . . . . . . 10 | |||
| 3.4. Data Offset . . . . . . . . . . . . . . . . . . . . . . . 15 | 3.4. Data Offset . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
| 3.5. Control bits . . . . . . . . . . . . . . . . . . . . . . 15 | 3.5. Control bits . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
| 3.5.1. Reserved (four bits) . . . . . . . . . . . . . . . . 15 | 3.5.1. Reserved (four bits) . . . . . . . . . . . . . . . . . 10 | |||
| 3.5.2. CWR (Congestion Window Reduced) . . . . . . . . . . . 16 | 3.5.2. CWR (Congestion Window Reduced) . . . . . . . . . . . 11 | |||
| 3.5.3. ECE (ECN-Echo) . . . . . . . . . . . . . . . . . . . 16 | 3.5.3. ECE (ECN-Echo) . . . . . . . . . . . . . . . . . . . . 11 | |||
| 3.5.4. URG . . . . . . . . . . . . . . . . . . . . . . . . . 17 | 3.5.4. URG . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 3.5.5. ACK . . . . . . . . . . . . . . . . . . . . . . . . . 17 | 3.5.5. ACK . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 3.5.6. PSH . . . . . . . . . . . . . . . . . . . . . . . . . 17 | 3.5.6. PSH . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 3.5.7. RST . . . . . . . . . . . . . . . . . . . . . . . . . 19 | 3.5.7. RST . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 3.5.8. SYN . . . . . . . . . . . . . . . . . . . . . . . . . 19 | 3.5.8. SYN . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 3.5.9. FIN . . . . . . . . . . . . . . . . . . . . . . . . . 20 | 3.5.9. FIN . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 3.6. Window . . . . . . . . . . . . . . . . . . . . . . . . . 20 | 3.6. Window . . . . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 3.7. Checksum . . . . . . . . . . . . . . . . . . . . . . . . 22 | 3.6.1. Security implications arising from closed windows . . 14 | |||
| 3.8. Urgent pointer . . . . . . . . . . . . . . . . . . . . . 23 | 3.7. Checksum . . . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 3.9. Options . . . . . . . . . . . . . . . . . . . . . . . . . 24 | 3.8. Urgent pointer . . . . . . . . . . . . . . . . . . . . . . 16 | |||
| 3.10. Padding . . . . . . . . . . . . . . . . . . . . . . . . . 28 | 3.9. Options . . . . . . . . . . . . . . . . . . . . . . . . . 16 | |||
| 3.11. Data . . . . . . . . . . . . . . . . . . . . . . . . . . 28 | 3.10. Padding . . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
| 4. Common TCP Options . . . . . . . . . . . . . . . . . . . . . 29 | 3.11. Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
| 4.1. End of Option List (Kind = 0) . . . . . . . . . . . . . . 29 | 4. Common TCP Options . . . . . . . . . . . . . . . . . . . . . . 19 | |||
| 4.2. No Operation (Kind = 1) . . . . . . . . . . . . . . . . . 29 | 4.1. End of Option List (Kind = 0) . . . . . . . . . . . . . . 19 | |||
| 4.3. Maximum Segment Size (Kind = 2) . . . . . . . . . . . . . 29 | 4.2. No Operation (Kind = 1) . . . . . . . . . . . . . . . . . 19 | |||
| 4.4. Selective Acknowledgement Option . . . . . . . . . . . . 32 | 4.3. Maximum Segment Size (Kind = 2) . . . . . . . . . . . . . 19 | |||
| 4.4.1. SACK-permitted Option (Kind = 4) . . . . . . . . . . 32 | 4.4. Selective Acknowledgement Option . . . . . . . . . . . . . 20 | |||
| 4.4.2. SACK Option (Kind = 5) . . . . . . . . . . . . . . . 33 | 4.4.1. SACK-permitted Option (Kind = 4) . . . . . . . . . . . 20 | |||
| 4.5. MD5 Option (Kind=19) . . . . . . . . . . . . . . . . . . 35 | 4.4.2. SACK Option (Kind = 5) . . . . . . . . . . . . . . . . 20 | |||
| 4.6. Window scale option (Kind = 3) . . . . . . . . . . . . . 36 | 4.5. MD5 Option (Kind=19) . . . . . . . . . . . . . . . . . . . 21 | |||
| 4.7. Timestamps option (Kind = 8) . . . . . . . . . . . . . . 37 | 4.6. Window scale option (Kind = 3) . . . . . . . . . . . . . . 21 | |||
| 4.7.1. Generation of timestamps . . . . . . . . . . . . . . 37 | 4.7. Timestamps option (Kind = 8) . . . . . . . . . . . . . . . 22 | |||
| 4.7.2. Vulnerabilities . . . . . . . . . . . . . . . . . . . 38 | 4.7.1. Generation of timestamps . . . . . . . . . . . . . . . 22 | |||
| 5. Connection-establishment mechanism . . . . . . . . . . . . . 39 | 4.7.2. Vulnerabilities . . . . . . . . . . . . . . . . . . . 22 | |||
| 5.1. SYN flood . . . . . . . . . . . . . . . . . . . . . . . . 40 | 5. Connection-establishment mechanism . . . . . . . . . . . . . . 24 | |||
| 5.2. Connection forgery . . . . . . . . . . . . . . . . . . . 44 | 5.1. SYN flood . . . . . . . . . . . . . . . . . . . . . . . . 24 | |||
| 5.3. Connection-flooding attack . . . . . . . . . . . . . . . 45 | 5.2. Connection forgery . . . . . . . . . . . . . . . . . . . . 28 | |||
| 5.3.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 45 | 5.3. Connection-flooding attack . . . . . . . . . . . . . . . . 29 | |||
| 5.3.2. Countermeasures . . . . . . . . . . . . . . . . . . . 46 | 5.3.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 29 | |||
| 5.4. Firewall-bypassing techniques . . . . . . . . . . . . . . 48 | 5.3.2. Countermeasures . . . . . . . . . . . . . . . . . . . 30 | |||
| 6. Connection-termination mechanism . . . . . . . . . . . . . . 49 | 5.4. Firewall-bypassing techniques . . . . . . . . . . . . . . 32 | |||
| 6.1. FIN-WAIT-2 flooding attack . . . . . . . . . . . . . . . 49 | ||||
| 6.1.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 49 | 6. Connection-termination mechanism . . . . . . . . . . . . . . . 32 | |||
| 6.1.2. Countermeasures . . . . . . . . . . . . . . . . . . . 50 | 6.1. FIN-WAIT-2 flooding attack . . . . . . . . . . . . . . . . 32 | |||
| 7. Buffer management . . . . . . . . . . . . . . . . . . . . . . 52 | 6.1.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 32 | |||
| 7.1. TCP retransmission buffer . . . . . . . . . . . . . . . . 52 | 6.1.2. Countermeasures . . . . . . . . . . . . . . . . . . . 33 | |||
| 7.1.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 52 | 7. Buffer management . . . . . . . . . . . . . . . . . . . . . . 35 | |||
| 7.1.2. Countermeasures . . . . . . . . . . . . . . . . . . . 53 | 7.1. TCP retransmission buffer . . . . . . . . . . . . . . . . 36 | |||
| 7.2. TCP segment reassembly buffer . . . . . . . . . . . . . . 56 | 7.1.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 36 | |||
| 7.3. Automatic buffer tuning mechanisms . . . . . . . . . . . 59 | 7.1.2. Countermeasures . . . . . . . . . . . . . . . . . . . 37 | |||
| 7.3.1. Automatic send-buffer tuning mechanisms . . . . . . . 59 | 7.2. TCP segment reassembly buffer . . . . . . . . . . . . . . 40 | |||
| 7.3.2. Automatic receive-buffer tuning mechanism . . . . . . 61 | 7.3. Automatic buffer tuning mechanisms . . . . . . . . . . . . 42 | |||
| 8. TCP segment reassembly algorithm . . . . . . . . . . . . . . 63 | 7.3.1. Automatic send-buffer tuning mechanisms . . . . . . . 43 | |||
| 7.3.2. Automatic receive-buffer tuning mechanism . . . . . . 45 | ||||
| 8. TCP segment reassembly algorithm . . . . . . . . . . . . . . . 47 | ||||
| 8.1. Problems that arise from ambiguity in the reassembly | 8.1. Problems that arise from ambiguity in the reassembly | |||
| process . . . . . . . . . . . . . . . . . . . . . . . . . 63 | process . . . . . . . . . . . . . . . . . . . . . . . . . 47 | |||
| 9. TCP Congestion Control . . . . . . . . . . . . . . . . . . . 64 | 9. TCP Congestion Control . . . . . . . . . . . . . . . . . . . . 48 | |||
| 9.1. Congestion control with misbehaving receivers . . . . . . 66 | 9.1. Congestion control with misbehaving receivers . . . . . . 48 | |||
| 9.1.1. ACK division . . . . . . . . . . . . . . . . . . . . 66 | 9.1.1. ACK division . . . . . . . . . . . . . . . . . . . . . 48 | |||
| 9.1.2. DupACK forgery . . . . . . . . . . . . . . . . . . . 66 | 9.1.2. DupACK forgery . . . . . . . . . . . . . . . . . . . . 49 | |||
| 9.1.3. Optimistic ACKing . . . . . . . . . . . . . . . . . . 67 | 9.1.3. Optimistic ACKing . . . . . . . . . . . . . . . . . . 49 | |||
| 9.2. Blind DupACK triggering attacks against TCP . . . . . . . 68 | 9.2. Blind DupACK triggering attacks against TCP . . . . . . . 50 | |||
| 9.2.1. Blind throughput-reduction attack . . . . . . . . . . 70 | 9.2.1. Blind throughput-reduction attack . . . . . . . . . . 52 | |||
| 9.2.2. Blind flooding attack . . . . . . . . . . . . . . . . 70 | 9.2.2. Blind flooding attack . . . . . . . . . . . . . . . . 53 | |||
| 9.2.3. Difficulty in performing the attacks . . . . . . . . 71 | 9.2.3. Difficulty in performing the attacks . . . . . . . . . 53 | |||
| 9.2.4. Modifications to TCP's loss recovery algorithms . . . 72 | 9.2.4. Modifications to TCP's loss recovery algorithms . . . 54 | |||
| 9.2.5. Countermeasures . . . . . . . . . . . . . . . . . . . 74 | 9.2.5. Countermeasures . . . . . . . . . . . . . . . . . . . 55 | |||
| 9.3. TCP Explicit Congestion Notification (ECN) . . . . . . . 79 | 9.3. TCP Explicit Congestion Notification (ECN) . . . . . . . . 55 | |||
| 9.3.1. Possible attacks by a compromised router . . . . . . 79 | 10. TCP API . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 | |||
| 9.3.2. Possible attacks by a malicious TCP endpoint . . . . 80 | 10.1. Passive opens and binding sockets . . . . . . . . . . . . 56 | |||
| 10. TCP API . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 | 10.2. Active opens and binding sockets . . . . . . . . . . . . . 57 | |||
| 10.1. Passive opens and binding sockets . . . . . . . . . . . . 81 | 11. Blind in-window attacks . . . . . . . . . . . . . . . . . . . 59 | |||
| 10.2. Active opens and binding sockets . . . . . . . . . . . . 82 | 11.1. Blind TCP-based connection-reset attacks . . . . . . . . . 59 | |||
| 11. Blind in-window attacks . . . . . . . . . . . . . . . . . . . 84 | 11.1.1. RST flag . . . . . . . . . . . . . . . . . . . . . . . 60 | |||
| 11.1. Blind TCP-based connection-reset attacks . . . . . . . . 84 | 11.1.2. SYN flag . . . . . . . . . . . . . . . . . . . . . . . 60 | |||
| 11.1.1. RST flag . . . . . . . . . . . . . . . . . . . . . . 85 | 11.1.3. Security/Compartment . . . . . . . . . . . . . . . . . 60 | |||
| 11.1.2. SYN flag . . . . . . . . . . . . . . . . . . . . . . 86 | 11.1.4. Precedence . . . . . . . . . . . . . . . . . . . . . . 61 | |||
| 11.1.3. Security/Compartment . . . . . . . . . . . . . . . . 88 | 11.1.5. Illegal options . . . . . . . . . . . . . . . . . . . 61 | |||
| 11.1.4. Precedence . . . . . . . . . . . . . . . . . . . . . 89 | 11.2. Blind data-injection attacks . . . . . . . . . . . . . . . 61 | |||
| 11.1.5. Illegal options . . . . . . . . . . . . . . . . . . . 90 | 12. Information leaking . . . . . . . . . . . . . . . . . . . . . 62 | |||
| 11.2. Blind data-injection attacks . . . . . . . . . . . . . . 90 | ||||
| 12. Information leaking . . . . . . . . . . . . . . . . . . . . . 91 | ||||
| 12.1. Remote Operating System detection via TCP/IP stack | 12.1. Remote Operating System detection via TCP/IP stack | |||
| fingerprinting . . . . . . . . . . . . . . . . . . . . . 91 | fingerprinting . . . . . . . . . . . . . . . . . . . . . . 62 | |||
| 12.1.1. FIN probe . . . . . . . . . . . . . . . . . . . . . . 91 | 12.1.1. FIN probe . . . . . . . . . . . . . . . . . . . . . . 63 | |||
| 12.1.2. Bogus flag test . . . . . . . . . . . . . . . . . . . 92 | 12.1.2. Bogus flag test . . . . . . . . . . . . . . . . . . . 63 | |||
| 12.1.3. TCP ISN sampling . . . . . . . . . . . . . . . . . . 92 | 12.1.3. TCP ISN sampling . . . . . . . . . . . . . . . . . . . 63 | |||
| 12.1.4. TCP initial window . . . . . . . . . . . . . . . . . 92 | 12.1.4. TCP initial window . . . . . . . . . . . . . . . . . . 63 | |||
| 12.1.5. RST sampling . . . . . . . . . . . . . . . . . . . . 93 | 12.1.5. RST sampling . . . . . . . . . . . . . . . . . . . . . 64 | |||
| 12.1.6. TCP options . . . . . . . . . . . . . . . . . . . . . 94 | 12.1.6. TCP options . . . . . . . . . . . . . . . . . . . . . 65 | |||
| 12.1.7. Retransmission Timeout (RTO) sampling . . . . . . . . 94 | 12.1.7. Retransmission Timeout (RTO) sampling . . . . . . . . 65 | |||
| 12.2. System uptime detection . . . . . . . . . . . . . . . . . 94 | ||||
| 13. Covert channels . . . . . . . . . . . . . . . . . . . . . . . 95 | 12.2. System uptime detection . . . . . . . . . . . . . . . . . 66 | |||
| 14. TCP Port scanning . . . . . . . . . . . . . . . . . . . . . . 95 | 13. Covert channels . . . . . . . . . . . . . . . . . . . . . . . 66 | |||
| 14.1. Traditional connect() scan . . . . . . . . . . . . . . . 96 | 14. TCP Port scanning . . . . . . . . . . . . . . . . . . . . . . 66 | |||
| 14.2. SYN scan . . . . . . . . . . . . . . . . . . . . . . . . 96 | 14.1. Traditional connect() scan . . . . . . . . . . . . . . . . 67 | |||
| 14.3. FIN, NULL, and XMAS scans . . . . . . . . . . . . . . . . 96 | 14.2. SYN scan . . . . . . . . . . . . . . . . . . . . . . . . . 67 | |||
| 14.4. Maimon scan . . . . . . . . . . . . . . . . . . . . . . . 98 | 14.3. FIN, NULL, and XMAS scans . . . . . . . . . . . . . . . . 68 | |||
| 14.5. Window scan . . . . . . . . . . . . . . . . . . . . . . . 98 | 14.4. Maimon scan . . . . . . . . . . . . . . . . . . . . . . . 69 | |||
| 14.6. ACK scan . . . . . . . . . . . . . . . . . . . . . . . . 99 | 14.5. Window scan . . . . . . . . . . . . . . . . . . . . . . . 69 | |||
| 15. Processing of ICMP error messages by TCP . . . . . . . . . . 99 | 14.6. ACK scan . . . . . . . . . . . . . . . . . . . . . . . . . 70 | |||
| 16. TCP interaction with the Internet Protocol (IP) . . . . . . . 99 | 15. Processing of ICMP error messages by TCP . . . . . . . . . . . 70 | |||
| 16.1. TCP-based traceroute . . . . . . . . . . . . . . . . . . 99 | 16. TCP interaction with the Internet Protocol (IP) . . . . . . . 70 | |||
| 16.2. Blind TCP data injection through fragmented IP traffic . 100 | 16.1. TCP-based traceroute . . . . . . . . . . . . . . . . . . . 71 | |||
| 16.3. Broadcast and multicast IP addresses . . . . . . . . . . 102 | 16.2. Blind TCP data injection through fragmented IP traffic . . 71 | |||
| 17. Security Considerations . . . . . . . . . . . . . . . . . . . 102 | 16.3. Broadcast and multicast IP addresses . . . . . . . . . . . 73 | |||
| 18. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 102 | 17. Security Considerations . . . . . . . . . . . . . . . . . . . 73 | |||
| 19. References . . . . . . . . . . . . . . . . . . . . . . . . . 103 | 18. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 73 | |||
| 20. References . . . . . . . . . . . . . . . . . . . . . . . . . 113 | 19. References (to be translated to xml) . . . . . . . . . . . . . 74 | |||
| 20.1. Normative References . . . . . . . . . . . . . . . . . . 113 | 20. References . . . . . . . . . . . . . . . . . . . . . . . . . . 84 | |||
| 20.2. Informative References . . . . . . . . . . . . . . . . . 113 | 20.1. Normative References . . . . . . . . . . . . . . . . . . . 84 | |||
| Appendix A. TODO list . . . . . . . . . . . . . . . . . . . . . 113 | 20.2. Informative References . . . . . . . . . . . . . . . . . . 84 | |||
| Appendix A. TODO list . . . . . . . . . . . . . . . . . . . . . . 85 | ||||
| Appendix B. Change log (to be removed by the RFC Editor | Appendix B. Change log (to be removed by the RFC Editor | |||
| before publication of this document as an RFC) . . . 113 | before publication of this document as an RFC) . . . 85 | |||
| B.1. Changes from draft-ietf-tcpm-tcp-security-01 . . . . . . 113 | B.1. Changes from draft-ietf-tcpm-tcp-security-02 . . . . . . . 85 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 114 | B.2. Changes from draft-ietf-tcpm-tcp-security-01 . . . . . . . 86 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 86 | ||||
| 1. Preface | 1. Preface | |||
| 1.1. Introduction | 1.1. Introduction | |||
| The TCP/IP protocol suite was conceived in an environment that was | The TCP/IP protocol suite was conceived in an environment that was | |||
| quite different from the hostile environment it currently operates | quite different from the hostile environment it currently operates | |||
| in. However, the effectiveness of the protocols led to their early | in. However, the effectiveness of the protocols led to their early | |||
| adoption in production environments, to the point that, to some | adoption in production environments, to the point that, to some | |||
| extent, the current world's economy depends on them. | extent, the current world's economy depends on them. | |||
| skipping to change at page 6, line 11 ¶ | skipping to change at page 6, line 11 ¶ | |||
| interoperability [Silbersack, 2005]. | interoperability [Silbersack, 2005]. | |||
| Producing a secure TCP/IP implementation nowadays is a very difficult | Producing a secure TCP/IP implementation nowadays is a very difficult | |||
| task, in part because of the lack of a single document that serves as | task, in part because of the lack of a single document that serves as | |||
| a security roadmap for the protocols. Implementers are faced with | a security roadmap for the protocols. Implementers are faced with | |||
| the hard task of identifying relevant documentation and | the hard task of identifying relevant documentation and | |||
| differentiating between that which provides correct advice, and that | differentiating between that which provides correct advice, and that | |||
| which provides misleading advice based on inaccurate or wrong | which provides misleading advice based on inaccurate or wrong | |||
| assumptions. | assumptions. | |||
| There is a clear need for a companion document to the IETF | ||||
| specifications that discusses the security aspects and implications | ||||
| of the protocols, identifies the existing vulnerabilities, discusses | ||||
| the possible countermeasures, and analyzes their respective | ||||
| effectiveness. | ||||
| This document is the result of a security assessment of the IETF | This document is the result of a security assessment of the IETF | |||
| specifications of the Transmission Control Protocol (TCP), from a | specifications of the Transmission Control Protocol (TCP), from a | |||
| security point of view. Possible threats are identified and, where | security point of view. Possible threats are identified and, where | |||
| possible, countermeasures are proposed. Additionally, many | possible, countermeasures are described. Additionally, many | |||
| implementation flaws that have led to security vulnerabilities have | implementation flaws that have led to security vulnerabilities have | |||
| been referenced in the hope that future implementations will not | been referenced in the hope that future implementations will not | |||
| incur the same problems. | incur the same problems. | |||
| This document does not aim to be the final word on the security | This document is based on the "Security Assessment of the | |||
| aspects of TCP. On the contrary, it aims to raise awareness about a | ||||
| number of TCP vulnerabilities that have been faced in the past, those | ||||
| that are currently being faced, and some of those that we may still | ||||
| have to deal with in the future. | ||||
| Feedback from the community is more than encouraged to help this | ||||
| document be as accurate as possible and to keep it updated as new | ||||
| vulnerabilities are discovered. | ||||
| This document is heavily based on the "Security Assessment of the | ||||
| Transmission Control Protocol (TCP)" released by the UK Centre for | Transmission Control Protocol (TCP)" released by the UK Centre for | |||
| the Protection of National Infrastructure (CPNI), available at: http: | the Protection of National Infrastructure (CPNI), available at: http: | |||
| //www.cpni.gov.uk/Products/technicalnotes/ | //www.cpni.gov.uk/Products/technicalnotes/ | |||
| Feb-09-security-assessment-TCP.aspx . | Feb-09-security-assessment-TCP.aspx . | |||
| 1.2. Scope of this document | 1.2. Scope of this document | |||
| While there are a number of protocols that may affect the way TCP | While there are a number of protocols that may affect the way TCP | |||
| operates, this document focuses only on the specifications of the | operates, this document focuses only on the specifications of the | |||
| Transmission Control Protocol (TCP) itself. | Transmission Control Protocol (TCP) itself. | |||
| The following IETF RFCs were selected for assessment as part of this | The mechanisms described in the following documents were selected for | |||
| work: | assessment as part of this work: | |||
| o RFC 793, "Transmission Control Protocol. DARPA Internet Program. | o RFC 793, "Transmission Control Protocol. DARPA Internet Program. | |||
| Protocol Specification" (91 pages) | Protocol Specification" (91 pages) | |||
| o RFC 1122, "Requirements for Internet Hosts -- Communication | o RFC 1122, "Requirements for Internet Hosts -- Communication | |||
| Layers" (116 pages) | Layers" (116 pages) | |||
| o RFC 1191, "Path MTU Discovery" (19 pages) | o RFC 1191, "Path MTU Discovery" (19 pages) | |||
| o RFC 1323, "TCP Extensions for High Performance" (37 pages) | o RFC 1323, "TCP Extensions for High Performance" (37 pages) | |||
| skipping to change at page 8, line 19 ¶ | skipping to change at page 7, line 46 ¶ | |||
| their security implications, and discusses the possible | their security implications, and discusses the possible | |||
| countermeasures. The second part contains an analysis of the | countermeasures. The second part contains an analysis of the | |||
| security implications of the mechanisms and policies implemented by | security implications of the mechanisms and policies implemented by | |||
| TCP, and of a number of implementation strategies in use by a number | TCP, and of a number of implementation strategies in use by a number | |||
| of popular TCP implementations. | of popular TCP implementations. | |||
| 2. The Transmission Control Protocol | 2. The Transmission Control Protocol | |||
| The Transmission Control Protocol (TCP) is a connection-oriented | The Transmission Control Protocol (TCP) is a connection-oriented | |||
| transport protocol that provides a reliable byte-stream data transfer | transport protocol that provides a reliable byte-stream data transfer | |||
| service. | service. Very few assumptions are made about the reliability of | |||
| underlying data transfer services below the TCP layer. Basically, | ||||
| Very few assumptions are made about the reliability of underlying | TCP assumes it can obtain a simple, potentially unreliable datagram | |||
| data transfer services below the TCP layer. Basically, TCP assumes | service from the lower level protocols. | |||
| it can obtain a simple, potentially unreliable datagram service from | ||||
| the lower level protocols. Figure 1 illustrates where TCP fits in | ||||
| the DARPA reference model. | ||||
| +---------------+ | ||||
| | Application | | ||||
| +---------------+ | ||||
| | TCP | | ||||
| +---------------+ | ||||
| | IP | | ||||
| +---------------+ | ||||
| | Network | | ||||
| +---------------+ | ||||
| Figure 1: TCP in the DARPA reference model | ||||
| TCP provides facilities in the following areas: | ||||
| o Basic Data Transfer | ||||
| o Reliability | ||||
| o Flow Control | ||||
| o Multiplexing | ||||
| o Connections | ||||
| o Precedence and Security | ||||
| o Congestion Control | ||||
| The core TCP specification, RFC 793 [Postel, 1981c], dates back to | The core TCP specification, RFC 793 [RFC0793], dates back to 1981 and | |||
| 1981 and standardizes the basic mechanisms and policies of TCP. RFC | standardizes the basic mechanisms and policies of TCP. RFC 1122 | |||
| 1122 [Braden, 1989] provides clarifications and errata for the | [RFC1122] provides clarifications and errata for the original | |||
| original specification. RFC 2581 [Allman et al, 1999] specifies TCP | specification. RFC 5681 [RFC5681] specifies TCP congestion control | |||
| congestion control and avoidance mechanisms, not present in the | and avoidance mechanisms, not present in the original specification. | |||
| original specification. Other documents specify extensions and | Other documents specify extensions and improvements for TCP. | |||
| improvements for TCP. | ||||
| The large number of documents that specify extensions, improvements, | The large number of documents that specify extensions, improvements, | |||
| or modifications to existing TCP mechanisms has led the IETF to | or modifications to existing TCP mechanisms has led the IETF to | |||
| publish a roadmap for TCP, RFC 4614 [Duke et al, 2006], that | publish a roadmap for TCP, RFC 4614 [Duke et al, 2006], that | |||
| clarifies the relevance of each of those documents. | clarifies the relevance of each of those documents. | |||
| 3. TCP header fields | 3. TCP header fields | |||
| RFC 793 [Postel, 1981c] defines the syntax of a TCP segment, along | RFC 793 [RFC0793] defines the syntax of a TCP segment, along with the | |||
| with the semantics of each of the header fields. Figure 2 | semantics of each of the header fields. | |||
| illustrates the syntax of a TCP segment. | ||||
| 0 1 2 3 | ||||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | ||||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| | Source Port | Destination Port | | ||||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| | Sequence Number | | ||||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| | Acknowledgment Number | | ||||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| | Data | |C|E|U|A|P|R|S|F| | | ||||
| | Offset|Resrved|W|C|R|C|S|S|Y|I| Window | | ||||
| | | |R|E|G|K|H|T|N|N| | | ||||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| | Checksum | Urgent Pointer | | ||||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| | Options | Padding | | ||||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| | data | | ||||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| Note that one tick mark represents one bit position | ||||
| Figure 2: Transmission Control Protocol header format | ||||
| The minimum TCP header size is 20 bytes, and corresponds to a TCP | The minimum TCP header size is 20 bytes, and corresponds to a TCP | |||
| segment with no options and no data. However, a TCP module might be | segment with no options and no data. However, a TCP module might be | |||
| handed an (illegitimate) "TCP segment" of less than 20 bytes. | handed an (illegitimate) "TCP segment" of less than 20 bytes. | |||
| Therefore, before doing any processing of the TCP header fields, the | Therefore, before doing any processing of the TCP header fields, the | |||
| following check should be performed by TCP on the segments handed by | following check should be performed by TCP on the segments handed by | |||
| the internet layer: | the internet layer: | |||
| Segment.Size >= 20 | Segment.Size >= 20 | |||
| skipping to change at page 10, line 29 ¶ | skipping to change at page 8, line 44 ¶ | |||
| 3.1. Source Port and Destination Port | 3.1. Source Port and Destination Port | |||
| The Source Port field contains a 16-bit number that identifies the | The Source Port field contains a 16-bit number that identifies the | |||
| TCP end-point that originated this TCP segment. The TCP Destination | TCP end-point that originated this TCP segment. The TCP Destination | |||
| Port contains a 16-bit number that identifies the destination TCP | Port contains a 16-bit number that identifies the destination TCP | |||
| end-point of this segment. In most of the discussion we refer to | end-point of this segment. In most of the discussion we refer to | |||
| client-side (or "ephemeral") port-numbers and server-side port | client-side (or "ephemeral") port-numbers and server-side port | |||
| numbers, since that distinction is what usually affects the | numbers, since that distinction is what usually affects the | |||
| interpretation of a port number. | interpretation of a port number. | |||
| TCP SHOULD randomize its ephemeral (client-side) ports, to improve | Most active attacks against ongoing TCP connections require the | |||
| its resistance to off-path attacks. For the purpose of ephemeral | attacker to guess or know the four-tuple that identifies the | |||
| port selection, the largest possible port range SHOULD be used | connection. As a result, randomization of the TCP ephemeral ports | |||
| (ideally 1024-65535) [I-D.ietf-tsvwg-port-randomization]. | provides a (partial) mitigation against off-path attacks. [RFC6056] | |||
| provides guidance in this area. | ||||
| DISCUSSION: | ||||
| [I-D.ietf-tsvwg-port-randomization] provides advice on port | ||||
| randomization. | ||||
| TCP MUST NOT allocate port number 0, as its use could lead to | ||||
| interoperability problems. If a segment is received with port 0 as | ||||
| the Source Port or the Destination Port, a RST segment SHOULD be sent | ||||
| in response (provided that the incoming segment does not have the | ||||
| RST flag set). | ||||
| DISCUSSION: | ||||
| While port 0 is a legitimate port number, it has a special meaning | ||||
| in the UNIX Sockets API. For example, when a TCP port number of 0 | ||||
| is passed as an argument to the bind() function, rather than | ||||
| binding port 0, an ephemeral port is selected for the | ||||
| corresponding TCP end-point. As a result, the TCP port number 0 | ||||
| is never actually used in TCP segments. | ||||
| Different implementations have been found to respond differently | ||||
| to TCP segments that have a port number of 0 as the Source Port | ||||
| and/or the Destination Port. As a result, TCP segments with a | ||||
| port number of 0 are usually employed for remote OS detection via | ||||
| TCP/IP stack fingerprinting [Jones, 2003]. | ||||
| Since in practice TCP port 0 is not used by any legitimate | ||||
| application and is only used for fingerprinting purposes, a number | ||||
| of host implementations already reject TCP segments that use 0 as | ||||
| the Source Port and/or the Destination Port. Also, a number of | ||||
| firewalls filter (by default) any TCP segments that contain a port | ||||
| number of zero for the Source Port and/or the Destination Port. | ||||
| We therefore recommend that TCP implementations respond to | ||||
| incoming TCP segments that have a Source Port or a Destination | ||||
| Port of 0 with an RST (provided these incoming segments do not | ||||
| have the RST bit set). | ||||
| Responding with an RST segment to incoming segments that have the | ||||
| RST bit would open the door to RST-war attacks. | ||||
| TCP MUST be able to gracefully handle the case where the source end- | ||||
| point (IP Source Address, TCP Source Port) is the same as the | ||||
| destination end-point (IP Destination Address, TCP Destination Port). | ||||
| DISCUSSION: | ||||
| Some systems have been found to be unable to process TCP segments | ||||
| in which the source endpoint {Source Address, Source Port} is the | ||||
| same as the destination end-point {Destination Address, | ||||
| Destination Port}. Such TCP segments have been reported to cause | ||||
| malfunction of a number of implementations [CERT, 1996], and have | ||||
| been exploited in the past to perform Denial of Service (DoS) | ||||
| attacks [Meltman, 1997]. While these packets are very | ||||
| unlikely to exist in real and legitimate scenarios, TCP should | ||||
| nevertheless be able to process them without the need of any | ||||
| "extra" code. | ||||
| A SYN segment in which the source end-point {Source Address, | ||||
| Source Port} is the same as the destination end-point {Destination | ||||
| Address, Destination Port} will result in a "simultaneous open" | ||||
| scenario, such as the one described on page 32 of RFC 793 [Postel, | ||||
| 1981c]. Therefore, those TCP implementations that correctly | ||||
| handle simultaneous opens should already be prepared to handle | ||||
| these unusual TCP segments. | ||||
| TCP SHOULD NOT allocate port numbers that are in use by a TCP that | ||||
| is in the LISTEN or CLOSED states for use as ephemeral ports, as this | ||||
| could allow attackers on the local system to "steal" incoming TCP | ||||
| connections. | ||||
| DISCUSSION: | ||||
| While the only requirement for a selected ephemeral port is that | Some implementations have been known to crash upon receipt of a TCP segment in | |||
| the resulting four-tuple (connection-id) is unique (i.e., not | which the source end-point (IP Source Address, TCP Source Port) is | |||
| currently in use by any other TCP connection), in practice it may | the same as the destination end-point (IP Destination Address, TCP | |||
| be necessary to not allow the allocation of port numbers that are | Destination Port). [draft-gont-tcpm-tcp-mirrored-endpoints-00.txt] | |||
| in use by a TCP that is in the LISTEN or CLOSED states for use as | describes this issue in detail and provides advice in this area. | |||
| ephemeral ports, as this might allow an attacker to "steal" | ||||
| incoming connections from a local server application. Therefore, | ||||
| TCP SHOULD NOT allocate port numbers that are in use by a TCP in | ||||
| the LISTEN or CLOSED states for use as ephemeral ports. Section | ||||
| 10.2 of this document provides a detailed discussion of this | ||||
| issue. | ||||
| While some systems restrict use of the port numbers in the range | While some systems restrict use of the port numbers in the range | |||
| 0-1023 to privileged users, applications SHOULD NOT grant any trust | 0-1023 to privileged users, applications should not grant any trust | |||
| based on the port numbers used for a TCP connection. | based on the port numbers used for a TCP connection. | |||
| DISCUSSION: | ||||
| Not all systems require superuser privileges to bind port numbers | Not all systems require superuser privileges to bind port numbers | |||
| in that range. Besides, with desktop computers such "distinction" | in that range. Besides, with desktop computers such "distinction" | |||
| has generally become irrelevant. | has generally become irrelevant. | |||
| Middle-boxes such as packet filters MUST NOT assume that clients use | Middle-boxes such as packet filters must not assume that clients use | |||
| port numbers from only the Dynamic or Registered port ranges. | port numbers from only the Dynamic or Registered port ranges. | |||
| DISCUSSION: | ||||
| It should also be noted that some clients, such as DNS resolvers, | It should also be noted that some clients, such as DNS resolvers, | |||
| are known to use port numbers from the "Well Known Ports" range. | are known to use port numbers from the "Well Known Ports" range. | |||
| Therefore, middle-boxes such as packet filters MUST NOT assume | Therefore, middle-boxes such as packet filters MUST NOT assume | |||
| that clients use port numbers from only the Dynamic or Registered | that clients use port numbers from only the Dynamic or Registered | |||
| port ranges. | port ranges. | |||
| 3.2. Sequence number | 3.2. Sequence number | |||
| TCP SHOULD select its Initial Sequence Numbers (ISNs) with the | Predictable sequence numbers allow a variety of attacks against TCP, | |||
| following expression: | such as those described in Section 5.2 and Section 11 of this | |||
| document. This vulnerability was first described in [Morris1985], | ||||
| ISN = M + F(localhost, localport, remotehost, remoteport, secret_key) | and its exploitation was widely publicized about 10 years later | |||
| [Shimomura1995]. | ||||
| where M is a monotonically increasing counter maintained within TCP, | ||||
| and F() is a Pseudo-Random Function (PRF). As it is vital that F() | ||||
| not be computable from the outside, F() could be a PRF of the | ||||
| connection-id and some secret data. HMAC-SHA-256 would be a good | ||||
| choice for F(). | ||||
| DISCUSSION: | ||||
| The choice of the Initial Sequence Number of a connection is not | ||||
| arbitrary, but aims to minimize the chances of a stale segment | ||||
| being accepted by a new incarnation of a previous connection. | ||||
| RFC 793 [Postel, 1981c] suggests the use of a global 32-bit ISN | ||||
| generator, whose lower bit is incremented roughly every 4 | ||||
| microseconds. | ||||
| However, use of such an ISN generator makes it trivial to predict | ||||
| the ISN that a TCP will use for new connections, thus allowing a | ||||
| variety of attacks against TCP, such as those described in Section | ||||
| 5.2 and Section 11 of this document. This vulnerability was first | ||||
| described in [Morris, 1985], and its exploitation was widely | ||||
| publicized about 10 years later [Shimomura, 1995]. | ||||
| As a matter of fact, protection against old stale segments from a | ||||
| previous incarnation of the connection comes from allowing the | ||||
| creation of a new incarnation of a previous connection only after | ||||
| 2*MSL have passed since a segment corresponding to the old | ||||
| incarnation was last seen. This is accomplished by the TIME-WAIT | ||||
| state, and TCP's "quiet time" concept. However, as discussed in | ||||
| Section 3.1 and Section 11.1.2 of this document, the ISN can be | ||||
| used to perform some heuristics meant to avoid an interoperability | ||||
| problem that may arise when two systems establish connections at a | ||||
| high rate. In order for such heuristics to work, the ISNs | ||||
| generated by a TCP should be monotonically increasing. | ||||
| The ISN generation scheme recommended in this section was | ||||
| originally proposed in RFC 1948 [Bellovin, 1996], such that the | ||||
| chances of an attacker guessing the ISN of a TCP are reduced, | ||||
| while still producing a monotonically-increasing sequence that | ||||
| allows implementation of the optimization described in Section 3.1 | ||||
| and Section 11.1.2 of this document. | ||||
| [CERT, 2001] and [US-CERT, 2001] are advisories about the security | In order to mitigate this vulnerability, some implementations set | |||
| implications of weak ISN generators. [Zalewski, 2001a] and | the TCP ISN to the output of a PRNG. However, this has been known to cause | |||
| [Zalewski, 2002] contain a detailed analysis of ISN generators, | interoperability problems. [RFC6528] provides advice in this area. | |||
| and a survey of the algorithms in use by popular TCP | ||||
| implementations. | ||||
| Another security consideration that should be made about TCP | Another security consideration that should be made about TCP sequence | |||
| sequence numbers is that they might allow an attacker to count the | numbers is that they might allow an attacker to count the number of | |||
| number of systems behind a Network Address Translator (NAT) | systems behind a Network Address Translator (NAT) [Srisuresh and | |||
| [Srisuresh and Egevang, 2001]. Depending on the ISN generators | Egevang, 2001]. Depending on the ISN generators implemented by each | |||
| implemented by each of the systems behind the NAT, an attacker | of the systems behind the NAT, an attacker might be able to count the | |||
| might be able to count the number of systems behind the NAT by | number of systems behind the NAT by establishing a number of TCP | |||
| establishing a number of TCP connections (using the public address | connections (using the public address of the NAT) and identifying | |||
| of the NAT) and identifying the number of different sequence | the number of different sequence number "spaces". [Gont and | |||
| number "spaces". This information leakage could be eliminated by | Srisuresh, 2008] provides a detailed discussion of the security | |||
| rewriting the contents of all those header fields and options that | implications of NATs and of the possible mitigations for this and | |||
| make use of sequence numbers (such as the Sequence Number and the | other issues. | |||
| Acknowledgement Number fields, and the SACK Option) at the NAT. | ||||
| [Gont and Srisuresh, 2008] provides a detailed discussion of the | ||||
| security implications of NATs and of the possible mitigations for | ||||
| this and other issues. | ||||
| 3.3. Acknowledgement Number | 3.3. Acknowledgement Number | |||
| TCP SHOULD set the Acknowledgement Number to zero when sending a TCP | If the ACK bit is on, the Acknowledgement Number contains the value | |||
| segment that does not have the ACK bit set (i.e., a SYN segment). | of the next sequence number the sender of this segment is expecting | |||
| to receive. According to RFC 793, the Acknowledgement Number is | ||||
| TCP MUST check that, on segments that have the ACK bit set, the | considered valid as long as it does not acknowledge the receipt of | |||
| Acknowledgment Number satisfies the expression: | data that has not yet been sent. | |||
| SND.UNA - SND.MAX.WND <= SEG.ACK <= SND.NXT | ||||
| If a TCP segment does not pass this check, the segment MUST be | ||||
| dropped, and an ACK segment SHOULD be sent in response. | ||||
| DISCUSSION: | ||||
| If the ACK bit is on, the Acknowledgement Number contains the | ||||
| value of the next sequence number the sender of this segment is | ||||
| expecting to receive. According to RFC 793, the Acknowledgement | ||||
| Number is considered valid as long as it does not acknowledge the | ||||
| receipt of data that has not yet been sent. | ||||
| However, as a result of recent concerns about forgery attacks against | ||||
| TCP (see Section 11 of this document), ongoing work at the IETF | ||||
| [Ramaiah et al, 2008] has proposed to enforce a stricter check | ||||
| on the Acknowledgement Number of segments that have the ACK bit | ||||
| set: | ||||
| SND.UNA - SND.MAX.WND <= SEG.ACK <= SND.NXT | However, as a result of recent concerns about forgery attacks against | |||
| TCP (see Section 11 of this document), [RFC5961] has proposed to | ||||
| enforce a stricter check on the Acknowledgement Number of segments | ||||
| that have the ACK bit set; see that document for more details. | ||||
| If the ACK bit is off, the Acknowledgement Number field is not | If the ACK bit is off, the Acknowledgement Number field is not valid. | |||
| valid. We recommend that TCP implementations set the | We recommend that TCP implementations set the Acknowledgement Number to | |||
| Acknowledgement Number to zero when sending a TCP segment that | zero when sending a TCP segment that does not have the ACK bit set | |||
| does not have the ACK bit set (i.e., a SYN segment). Some TCP | (i.e., a SYN segment). Some TCP implementations have been known to | |||
| implementations have been known to fail to set the Acknowledgement | fail to set the Acknowledgement Number to zero, thus leaking | |||
| Number to zero, thus leaking information. | information. | |||
| TCP Acknowledgements are also used to perform heuristics for loss | TCP Acknowledgements are also used to perform heuristics for loss | |||
| recovery and congestion control. Section 9 of this document | recovery and congestion control. Section 9 of this document | |||
| describes a number of ways in which these mechanisms can be | describes a number of ways in which these mechanisms can be | |||
| exploited. | exploited. | |||
| 3.4. Data Offset | 3.4. Data Offset | |||
| TCP MUST enforce the following checks on the Data Offset field: | [draft-gont-tcpm-tcp-sanity-checks-00.txt] specifies a number of | |||
| sanity checks that should be performed on the Data Offset field. | ||||
| Data Offset >= 5 | ||||
| Data Offset * 4 <= TCP segment length | ||||
| If a TCP segment does not pass these checks, it should be silently | ||||
| dropped. | ||||
| The TCP segment length should be obtained from the IP layer, as | ||||
| TCP does not include a TCP segment length field. | ||||
| DISCUSSION: | ||||
| The Data Offset field indicates the length of the TCP header in | ||||
| 32-bit words. As the minimum TCP header size is 20 bytes, the | ||||
| minimum legal value for this field is 5. | ||||
| For obvious reasons, the TCP header cannot be larger than the | ||||
| whole TCP segment it is part of. | ||||
| 3.5. Control bits | 3.5. Control bits | |||
| The following subsections provide a discussion of the different | The following subsections provide a discussion of the different | |||
| control bits in the TCP header. TCP segments with unusual | control bits in the TCP header. TCP segments with unusual | |||
| combinations of flags set have been known in the past to cause | combinations of flags set have been known in the past to cause | |||
| malfunction of some implementations, sometimes to the extent of | malfunction of some implementations, sometimes to the extent of | |||
| causing them to crash [Postel, 1987] [Braden, 1992]. These packets | causing them to crash [RFC1025] [RFC1379]. These packets are still | |||
| are still usually employed for the purpose of TCP/IP stack | usually employed for the purpose of TCP/IP stack fingerprinting. | |||
| fingerprinting. Section 12.1 contains a discussion of TCP/IP stack | Section 12.1 contains a discussion of TCP/IP stack fingerprinting. | |||
| fingerprinting. | ||||
| 3.5.1. Reserved (four bits) | 3.5.1. Reserved (four bits) | |||
| TCP MUST ignore the Reserved field of incoming TCP segments. | These four bits are reserved for future use, and must be zero. As | |||
| with virtually every field, the Reserved field could be used as a | ||||
| DISCUSSION: | covert channel. While there exist intermediate devices such as | |||
| protocol scrubbers that clear these bits, and firewalls that drop/ | ||||
| These four bits are reserved for future use, and must be zero. As | reject segments with any of these bits set, these devices should | |||
| with virtually every field, the Reserved field could be used as a | consider the impact of these policies on TCP interoperability. For | |||
| covert channel. While there exist intermediate devices such as | example, as TCP continues to evolve, all or part of the bits in the | |||
| protocol scrubbers that clear these bits, and firewalls that drop/ | Reserved field could be used to implement some new functionality. If | |||
| reject segments with any of these bits set, these devices should | some middle-box or end-system implementation were to drop a TCP | |||
| consider the impact of these policies on TCP interoperability. | segment merely because some of these bits are not set to zero, | |||
| For example, as TCP continues to evolve, all or part of the bits | interoperability problems would arise. | |||
| in the Reserved field could be used to implement some new | ||||
| functionality. If some middle-box or end-system implementation | ||||
| were to drop a TCP segment merely because some of these bits are | ||||
| not set to zero, interoperability problems would arise. | ||||
| 3.5.2. CWR (Congestion Window Reduced) | 3.5.2. CWR (Congestion Window Reduced) | |||
| DISCUSSION: | The CWR flag, defined in RFC 3168 [Ramakrishnan et al, 2001], is used | |||
| as part of the Explicit Congestion Notification (ECN) mechanism. For | ||||
| The CWR flag, defined in RFC 3168 [Ramakrishnan et al, 2001], is | connections in any of the synchronized states, this flag indicates, | |||
| used as part of the Explicit Congestion Notification (ECN) | when set, that the TCP sending this segment has reduced its | |||
| mechanism. For connections in any of the synchronized states, | congestion window. | |||
| this flag indicates, when set, that the TCP sending this segment | ||||
| has reduced its congestion window. | ||||
| An analysis of the security implications of ECN can be found in | An analysis of the security implications of ECN can be found in | |||
| Section 9.3 of this document. | Section 9.3 of this document. | |||
| 3.5.3. ECE (ECN-Echo) | 3.5.3. ECE (ECN-Echo) | |||
| DISCUSSION: | The ECE flag, defined in RFC 3168 [Ramakrishnan et al, 2001], is used | |||
| as part of the Explicit Congestion Notification (ECN) mechanism. | ||||
| The ECE flag, defined in RFC 3168 [Ramakrishnan et al, 2001], is | ||||
| used as part of the Explicit Congestion Notification (ECN) | ||||
| mechanism. | ||||
| Once a TCP connection has been established, an ACK segment with | ||||
| the ECE bit set indicates that congestion was encountered in the | ||||
| network on the path from the sender to the receiver. This | ||||
| indication of congestion should be treated just as a congestion | ||||
| loss in non-ECN-capable TCP [Ramakrishnan et al, 2001]. | ||||
| Additionally, TCP should not increase the congestion window (cwnd) | ||||
| in response to such an ACK segment that indicates congestion, and | ||||
| should also not react to congestion indications more than once | ||||
| every window of data (or once per round-trip time). | ||||
| An analysis of the security implications of ECN can be found in | An analysis of the security implications of ECN can be found in | |||
| Section 9.3 of this document. | Section 9.3 of this document. | |||
| 3.5.4. URG | 3.5.4. URG | |||
| DISCUSSION: | When the URG flag is set, the Urgent Pointer field contains the | |||
| current value of the urgent pointer. | ||||
| When the URG flag is set, the Urgent Pointer field contains the | ||||
| current value of the urgent pointer. | ||||
| Receipt of an "urgent" indication generates, in a number of | ||||
| implementations (such as those in UNIX-like systems), a software | ||||
| interrupt (signal) that is delivered to the corresponding process. | ||||
| In UNIX-like systems, receipt of an urgent indication causes a | Receipt of an "urgent" indication generates, in a number of | |||
| SIGURG signal to be delivered to the corresponding process. | implementations (such as those in UNIX-like systems), a software | |||
| interrupt (signal) that is delivered to the corresponding process. | ||||
| In UNIX-like systems, receipt of an urgent indication causes a SIGURG | ||||
| signal to be delivered to the corresponding process. | ||||
| A number of applications handle TCP urgent indications by | A number of applications handle TCP urgent indications by installing | |||
| installing a signal handler for the corresponding signal (e.g., | a signal handler for the corresponding signal (e.g., SIGURG). As | |||
| SIGURG). As discussed in [Zalewski, 2001b], some signal handlers | discussed in [Zalewski, 2001b], some signal handlers can be | |||
| can be maliciously exploited by an attacker, for example to gain | maliciously exploited by an attacker, for example to gain remote | |||
| remote access to a system. While secure programming of signal | access to a system. While secure programming of signal handlers is | |||
| handlers is out of the scope of this document, we nevertheless | out of the scope of this document, we nevertheless raise awareness | |||
| raise awareness that TCP urgent indications might be exploited to | that TCP urgent indications might be exploited to abuse poorly- | |||
| abuse poorly-written signal handlers. | written signal handlers. | |||
| Section 3.9 discusses the security implications of the TCP urgent | Section 3.9 discusses the security implications of the TCP urgent | |||
| mechanism. | mechanism. | |||
| 3.5.5. ACK | 3.5.5. ACK | |||
| DISCUSSION: | When the ACK bit is one, the Acknowledgment Number field contains the | |||
| next sequence number expected, cumulatively acknowledging the receipt | ||||
| When the ACK bit is one, the Acknowledgment Number field contains | of all data up to the sequence number in the Acknowledgement Number, | |||
| the next sequence number expected, cumulatively acknowledging the | minus one. Section 3.3 of this document describes sanity checks that | |||
| receipt of all data up to the sequence number in the | should be performed on the Acknowledgement Number field. | |||
| Acknowledgement Number, minus one. Section 3.3 of this document | ||||
| describes sanity checks that should be performed on the | ||||
| Acknowledgement Number field. | ||||
| TCP Acknowledgements are also used to perform heuristics for loss | TCP Acknowledgements are also used to perform heuristics for loss | |||
| recovery and congestion control. Section 9 of this document | recovery and congestion control. Section 9 of this document | |||
| describes a number of ways in which these mechanisms can be | describes a number of ways in which these mechanisms can be | |||
| exploited. | exploited. | |||
| 3.5.6. PSH | 3.5.6. PSH | |||
| As a result of a SEND call, TCP SHOULD send all queued data (provided | [draft-gont-tcpm-tcp-push-semantics-00.txt] describes a number of | |||
| that TCP's flow control and congestion control algorithms allow it). | security issues that may arise as a result of the PUSH semantics, and | |||
| proposes a number of ways to mitigate these issues. | ||||
| Received data SHOULD be immediately delivered to an application | ||||
| calling the RECEIVE function, even if the data already available are | ||||
| less than those requested by the application. | ||||
| DISCUSSION: | ||||
| RFC 793 [Postel, 1981c] contains (in pages 54-64) a functional | ||||
| description of a TCP Application Programming Interface (API). One | ||||
| of the parameters of the SEND function is the PUSH flag which, | ||||
| when set, signals the local TCP that it must send all unsent data. | ||||
| The TCP PSH (PUSH) flag will be set in the last outgoing segment, | ||||
| to signal the push function to the receiving TCP. Upon receipt of | ||||
| a segment with the PSH flag set, the receiving user's buffer is | ||||
| returned to the user, without waiting for additional data to | ||||
| arrive. | ||||
| There are two security considerations arising from the PUSH | ||||
| function. On the sending side, an attacker could cause a large | ||||
| amount of data to be queued for transmission without setting the | ||||
| PUSH flag in the SEND call. This would prevent the local TCP from | ||||
| sending the queued data, causing system memory to be tied to those | ||||
| data for an unnecessarily long period of time. | ||||
| An analogous consideration should be made for the receiving TCP. | ||||
| TCP is allowed to buffer incoming data until the receiving user's | ||||
| buffer fills or a segment with the PSH bit set is received. If | ||||
| the receiving TCP implements this policy, an attacker could send a | ||||
| large amount of data, slightly less than the receiving user's | ||||
| buffer size, to cause system memory to be tied to these data for | ||||
| an unnecessarily long period of time. Both of these issues are | ||||
| discussed in Section 4.2.2.2 of RFC 1122 [Braden, 1989]. | ||||
| In order to mitigate these potential vulnerabilities, we suggest | ||||
| assuming an implicit "PUSH" in every SEND call. On the sending | ||||
| side, this means that as a result of a SEND call TCP should try to | ||||
| send all queued data (provided that TCP's flow control and | ||||
| congestion control algorithms allow it). On the receiving side, | ||||
| this means that the received data will be immediately delivered to | ||||
| an application calling the RECEIVE function, even if the data | ||||
| already available are less than those requested by the | ||||
| application. | ||||
| It is interesting to note that popular TCP APIs (such as | ||||
| "sockets") do not provide a PUSH flag in any of the interfaces | ||||
| they define, but rather perform some kind of "heuristics" to set | ||||
| the PSH bit in outgoing segments. As a result, the value of the | ||||
| PSH bit in the received TCP segments is usually a policy of the | ||||
| sending TCP, rather than a policy of the sending application. All | ||||
| robust applications that make use of those APIs (such as the | ||||
| sockets API) properly handle the case of a RECEIVE call returning | ||||
| less data (e.g., zero) than requested, usually by performing | ||||
| subsequent RECEIVE calls. | ||||
| Another potential malicious use of the PSH bit would be for an | ||||
| attacker to send small TCP segments (probably with zero bytes of | ||||
| data payload) to cause the receiving application to be | ||||
| unnecessarily woken up (increasing the CPU load), or to cause | ||||
| malfunction of poorly-written applications that may not properly | ||||
| handle the case of RECEIVE calls returning less data than requested. | ||||
| 3.5.7. RST | 3.5.7. RST | |||
| TCP MUST process RST segments (i.e., segments with the RST bit set) | The RST bit is used to request the abortion (abnormal close) of a TCP | |||
| as follows: | connection. RFC 793 [RFC0793] suggests that an RST segment should be | |||
| considered valid if its Sequence Number is valid (i.e., falls within | ||||
| o If the Sequence Number of the RST segment is not valid (i.e., | the receive window). However, in response to the security concerns | |||
| falls outside of the receive window), silently drop the segment. | raised by [Watson, 2004] and [NISCC, 2004], [RFC5961] proposed | |||
| stricter validity checks. Please see [RFC5961] for additional | ||||
| o If the Sequence Number of the RST segment matches the next | details. | |||
| expected sequence number (RCV.NXT), abort the corresponding | ||||
| connection. | ||||
| o If the Sequence Number is valid (i.e., falls within the receive | ||||
| window) but is not exactly RCV.NXT, send an ACK segment (a | ||||
| "challenge ACK") of the form: <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>. | ||||
| TCP SHOULD rate-limit these challenge ACK segments. | ||||
| DISCUSSION: | ||||
| The RST bit is used to request the abortion (abnormal close) of a | ||||
| TCP connection. RFC 793 [Postel, 1981c] suggests that an RST | ||||
| segment should be considered valid if its Sequence Number is valid | ||||
| (i.e., falls within the receive window). However, in response to | ||||
| the security concerns raised by [Watson, 2004] and [NISCC, 2004], | ||||
| [Ramaiah et al, 2008] proposed the aforementioned stricter | ||||
| validity checks. | ||||
| Section 11.1 of this document describes TCP-based connection-reset | Section 11.1 of this document describes TCP-based connection-reset | |||
| attacks, along with a number of countermeasures to mitigate their | attacks, along with a number of countermeasures to mitigate their | |||
| impact. | impact. | |||
| 3.5.8. SYN | 3.5.8. SYN | |||
| DISCUSSION: | The SYN bit is used during the connection-establishment phase, to | |||
| request the synchronization of sequence numbers. | ||||
| The SYN bit is used during the connection-establishment phase, to | ||||
| request the synchronization of sequence numbers. | ||||
| There are basically four different vulnerabilities that make use | There are basically four different vulnerabilities that make use of | |||
| of the SYN bit: SYN-flooding attacks, connection forgery attacks, | the SYN bit: SYN-flooding attacks, connection forgery attacks, | |||
| connection flooding attacks, and connection-reset attacks. They | connection flooding attacks, and connection-reset attacks. They are | |||
| are described in Section 5.1, Section 5.2, Section 5.3, and | described in Section 5.1, Section 5.2, Section 5.3, and Section | |||
| Section 11.1.2, respectively, along with the possible | 11.1.2, respectively, along with the possible countermeasures. | |||
| countermeasures. | ||||
| 3.5.9. FIN | 3.5.9. FIN | |||
| DISCUSSION: | The FIN flag is used to signal the remote end-point the end of the | |||
| data transfer in this direction. Receipt of a valid FIN segment | ||||
| The FIN flag is used to signal the remote end-point the end of the | (i.e., a TCP segment with the FIN flag set) causes the transition in | |||
| data transfer in this direction. Receipt of a valid FIN segment | the connection state, as part of what is usually referred to as the | |||
| (i.e., a TCP segment with the FIN flag set) causes the transition | "connection termination phase". | |||
| in the connection state, as part of what is usually referred to as | ||||
| the "connection termination phase". | ||||
| The connection-termination phase can be exploited to perform a | The connection-termination phase can be exploited to perform a number | |||
| number of resource-exhaustion attacks. Section 6 of this document | of resource-exhaustion attacks. Section 6 of this document describes | |||
| describes a number of attacks that exploit the connection- | a number of attacks that exploit the connection-termination phase | |||
| termination phase along with the possible countermeasures. | along with the possible countermeasures. | |||
| 3.6. Window | 3.6. Window | |||
| DISCUSSION: | The TCP Window field advertises how many bytes of data the remote | |||
| peer is allowed to send before a new advertisement is made. | ||||
| The TCP Window field advertises how many bytes of data the remote | Theoretically, the maximum transfer rate that can be achieved by TCP | |||
| peer is allowed to send before a new advertisement is made. | is limited to: | |||
| Theoretically, the maximum transfer rate that can be achieved by | ||||
| TCP is limited to: | ||||
| Maximum Transfer Rate = Window / RTT | Maximum Transfer Rate = Window / RTT | |||
| This means that, under ideal network conditions (e.g., no packet | This means that, under ideal network conditions (e.g., no packet | |||
| loss), the TCP Window in use should be at least: | loss), the TCP Window in use should be at least: | |||
| Window = 2 * Bandwidth * Delay | Window = 2 * Bandwidth * Delay | |||
| Using a larger Window than that resulting from the previous | Using a larger Window than that resulting from the previous equation | |||
| equation will not provide any improvements in terms of | will not provide any improvements in terms of performance. | |||
| performance. | ||||
| In practice, selection of the most convenient Window size may also | ||||
| depend on a number of other parameters, such as: packet loss rate, | ||||
| loss recovery mechanisms in use, etc. | ||||
| Security implications of the maximum TCP window size | In practice, selection of the most convenient Window size may also | |||
| depend on a number of other parameters, such as: packet loss rate, | ||||
| loss recovery mechanisms in use, etc. | ||||
| An aspect of the TCP Window that is usually overlooked is the | An aspect of the TCP Window that is usually overlooked is the | |||
| security implications of its size. Increasing the TCP window | security implications of its size. Increasing the TCP window | |||
| increases the sequence number space that will be considered | increases the sequence number space that will be considered "valid" | |||
| "valid" for incoming segments. Thus, use of unnecessarily large | for incoming segments. Thus, use of unnecessarily large TCP Window | |||
| TCP Window sizes increases TCP's vulnerability to forgery attacks | sizes increases TCP's vulnerability to forgery attacks unnecessarily. | |||
| unnecessarily. | ||||
| In those scenarios in which the network conditions are known | In those scenarios in which the network conditions are known and/or | |||
| and/or can be easily predicted, it is recommended that the TCP | can be easily predicted, it is recommended that the TCP Window is | |||
| Window is never set to a value larger than that resulting from the | never set to a value larger than that resulting from the equations | |||
| equations above. Additionally, the nature of the application | above. Additionally, the nature of the application running on top of | |||
| running on top of TCP should be considered when tuning the TCP | TCP should be considered when tuning the TCP window. As an example, | |||
| window. As an example, an H.245 signaling application certainly | an H.245 signaling application certainly does not have high | |||
| does not have high requirements on throughput, and thus a window | requirements on throughput, and thus a window size of around 4 KBytes | |||
| size of around 4 KBytes will usually fulfill its needs, while | will usually fulfill its needs, while keeping TCP's resistance to | |||
| keeping TCP's resistance to off-path forgery attacks at a decent | off-path forgery attacks at a decent level. Some rough measurements | |||
| level. Some rough measurements seem to indicate that a TCP window | seem to indicate that a TCP window of 4Kbytes is common practice for | |||
| of 4Kbytes is common practice for TCP connections servicing | TCP connections servicing applications such as BGP. | |||
| applications such as BGP. | ||||
| In principle, a possible approach to avoid requiring | In principle, a possible approach to avoid requiring administrators | |||
| administrators to manually set the TCP window would be to | to manually set the TCP window would be to implement an automatic | |||
| implement an automatic buffer tuning mechanism, such as that | buffer tuning mechanism, such as that described in [Heffner, 2002]. | |||
| described in [Heffner, 2002]. However, as discussed in Section | However, as discussed in Section 7.3.2 of this document these | |||
| 7.3.2 of this document these mechanisms can be exploited to | mechanisms can be exploited to perform other types of attacks. | |||
| perform other types of attacks. | ||||
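
The trade-off between throughput and forgery resistance described above can be made concrete with a short calculation. The sketch below is an illustration only: the function names are hypothetical, "Delay" is assumed to be the one-way delay (so that 2 * Delay corresponds to the RTT), and the guess count assumes an off-path attacker who spaces forged sequence numbers one window apart.

```python
SEQ_SPACE = 2 ** 32  # size of the TCP sequence number space


def window_upper_bound(bandwidth_bytes_per_s: int, delay_ms: int) -> int:
    """Window = 2 * Bandwidth * Delay, in bytes (integer arithmetic)."""
    return 2 * bandwidth_bytes_per_s * delay_ms // 1000


def blind_guesses_needed(window_bytes: int) -> int:
    """Forged segments needed so that one is certain to fall in the window."""
    return -(-SEQ_SPACE // window_bytes)  # ceiling division
```

Under these assumptions a 4 KByte window forces roughly a million guesses, while a window sized for a 10 Mbit/s path with 50 ms of one-way delay (125000 bytes) reduces that to a few tens of thousands.
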
| Security implications arising from closed windows | 3.6.1. Security implications arising from closed windows | |||
| The TCP window is a flow-control mechanism that prevents a fast | When a TCP end-point is not willing to receive any more data (before | |||
| data sender application from overwhelming a "slow" receiver. When | some of the data that have already been received are consumed), it | |||
| a TCP end-point is not willing to receive any more data (before | will advertise a TCP window of zero bytes. This will effectively | |||
| some of the data that have already been received are consumed), it | stop the sender from sending any new data to the TCP receiver. | |||
| will advertise a TCP window of zero bytes. This will effectively | Transmission of new data will resume when the TCP receiver advertises | |||
| stop the sender from sending any new data to the TCP receiver. | a nonzero TCP window, usually with a TCP segment that contains no | |||
| Transmission of new data will resume when the TCP receiver | data ("an ACK"). | |||
| advertises a nonzero TCP window, usually with a TCP segment that | ||||
| contains no data ("an ACK"). | ||||
| This segment is usually referred to as a "window update", as the | This segment is usually referred to as a "window update", as the | |||
| only purpose of this segment is to update the sender regarding the | only purpose of this segment is to update the sender regarding the | |||
| new window. | new window. | |||
| To accommodate those scenarios in which the ACK segment that | To accommodate those scenarios in which the ACK segment that "opens" | |||
| "opens" the window is lost, TCP implements a "persist timer" that | the window is lost, TCP implements a "persist timer" that causes the | |||
| causes the TCP sender to query the TCP receiver periodically if | TCP sender to query the TCP receiver periodically if the last segment | |||
| the last segment received advertised a window of zero bytes. This | received advertised a window of zero bytes. This probe simply | |||
| probe simply consists of sending one byte of new data that will | consists of sending one byte of new data that will force the TCP | |||
| force the TCP receiver to send an ACK segment back to the TCP | receiver to send an ACK segment back to the TCP sender, containing | |||
| sender, containing the current TCP window. Similarly to the | the current TCP window. Similarly to the retransmission timeout | |||
| retransmission timeout timer, an exponential back-off is used when | timer, an exponential back-off is used when calculating the | |||
| calculating the retransmission timer, so that the spacing between | retransmission timer, so that the spacing between probes increases | |||
| probes increases exponentially. | exponentially. | |||
| A fundamental difference between the "persist timer" and the | A fundamental difference between the "persist timer" and the | |||
| retransmission timer is that there is no limit on the amount of | retransmission timer is that there is no limit on the amount of time | |||
| time during which a TCP can advertise a zero window. This means | during which a TCP can advertise a zero window. This means that a | |||
| that a TCP end-point could potentially advertise a zero window | TCP end-point could potentially advertise a zero window forever, thus | |||
| forever, thus keeping kernel memory at the TCP sender tied to the | keeping kernel memory at the TCP sender tied to the TCP | |||
| TCP retransmission buffer. This could clearly be exploited as a | retransmission buffer. This could clearly be exploited as a vector | |||
| vector for performing a Denial of Service (DoS) attack against | for performing a Denial of Service (DoS) attack against TCP, such as | |||
| TCP, such as that described in Section 7.1 of this document. | that described in Section 7.1 of this document. | |||
| Section 7.1 of this document describes a Denial of Service attack | Section 7.1 of this document describes a Denial of Service attack | |||
| that aims at exhausting the kernel memory used for the TCP | that aims at exhausting the kernel memory used for the TCP | |||
| retransmission buffer, along with possible countermeasures. | retransmission buffer, along with possible countermeasures. | |||
| 3.7. Checksum | 3.7. Checksum | |||
| Middleboxes that process TCP segments MUST validate the Checksum | While in principle there should not be security implications arising | |||
| field, and silently discard the TCP segment if such validation fails. | from the Checksum field, due to non-RFC-compliant implementations, | |||
| the Checksum can be exploited to detect firewalls, evade network | ||||
| DISCUSSION: | intrusion detection systems (NIDS), and/or perform Denial of Service | |||
| attacks. | ||||
| The Checksum field is an error detection mechanism meant for the | ||||
| contents of the TCP segment and a number of important fields of | ||||
| the IP header. It is computed over the full TCP header pre-pended | ||||
| with a pseudo header that includes the IP Source Address, the IP | ||||
| Destination Address, the Protocol number, and the TCP segment | ||||
| length. While in principle there should not be security | ||||
| implications arising from this field, due to non-RFC-compliant | ||||
| implementations, the Checksum can be exploited to detect | ||||
| firewalls, evade network intrusion detection systems (NIDS), | ||||
| and/or perform Denial of Service attacks. | ||||
| If a stateful firewall does not check the TCP Checksum in the | If a stateful firewall does not check the TCP Checksum in the | |||
| segments it processes, an attacker can exploit this situation to | segments it processes, an attacker can exploit this situation to | |||
| perform a variety of attacks. For example, he could send a flood | perform a variety of attacks. For example, he could send a flood of | |||
| of TCP segments with invalid checksums, which would nevertheless | TCP segments with invalid checksums, which would nevertheless create | |||
| create state information at the firewall. When each of these | state information at the firewall. When each of these segments is | |||
| segments is received at its intended destination, the TCP checksum | received at its intended destination, the TCP checksum will be found | |||
| will be found to be incorrect, and the corresponding segment will be | to be incorrect, and the corresponding segment will be silently discarded. | |||
| silently discarded. As these segments will not elicit a response | As these segments will not elicit a response (e.g., an RST segment) | |||
| (e.g., an RST segment) from the intended recipients, the | from the intended recipients, the corresponding connection state | |||
| corresponding connection state entries at the firewall will not be | entries at the firewall will not be removed. Therefore, an attacker | |||
| removed. Therefore, an attacker may end up tying all the state | may end up tying all the state resources of the firewall to TCP | |||
| resources of the firewall to TCP connections that will never | connections that will never complete or be terminated, probably | |||
| complete or be terminated, probably leading to a Denial of Service | leading to a Denial of Service to legitimate users, or forcing the | |||
| to legitimate users, or forcing the firewall to randomly drop | firewall to randomly drop connection state entries. | |||
| connection state entries. | ||||
| If a NIDS does not check the Checksum of TCP segments, an attacker | If a NIDS does not check the Checksum of TCP segments, an attacker | |||
| may send TCP segments with an invalid checksum to cause the NIDS | may send TCP segments with an invalid checksum to cause the NIDS to | |||
| to obtain a TCP data stream different from that obtained by the | obtain a TCP data stream different from that obtained by the system | |||
| system being monitored. In order to "confuse" the NIDS, the | being monitored. In order to "confuse" the NIDS, the attacker would | |||
| attacker would send TCP segments with an invalid Checksum and a | send TCP segments with an invalid Checksum and a Sequence Number that | |||
| Sequence Number that would overlap the sequence number space being | would overlap the sequence number space being used for his malicious | |||
| used for his malicious activity. FTester [Barisani, 2006] is a | activity. FTester [Barisani, 2006] is a tool that can be used to | |||
| tool that can be used to assess NIDS on this issue. | assess NIDS on this issue. | |||
| Finally, an attacker performing port-scanning could potentially | Finally, an attacker performing port-scanning could potentially | |||
| exploit intermediate systems that do not check the TCP Checksum to | exploit intermediate systems that do not check the TCP Checksum to | |||
| detect whether a given TCP port is being filtered by an | detect whether a given TCP port is being filtered by an intermediate | |||
| intermediate firewall, or the port is actually closed by the host | firewall, or the port is actually closed by the host being port- | |||
| being port-scanned. If a given TCP port appeared to be closed, | scanned. If a given TCP port appeared to be closed, the attacker | |||
| the attacker would then send a SYN segment with an invalid | would then send a SYN segment with an invalid Checksum. If this | |||
| Checksum. If this segment elicited a response (either an ICMP | segment elicited a response (either an ICMP error message or a TCP | |||
| error message or a TCP RST segment) to this packet, then that | RST segment) to this packet, then that response should come from a | |||
| response should come from a system that does not check the TCP | system that does not check the TCP checksum. Since normal host | |||
| checksum. Since normal host implementations of the TCP protocol | implementations of the TCP protocol do check the TCP checksum, such a | |||
| do check the TCP checksum, such a response would most likely come | response would most likely come from a firewall or some other middle- | |||
| from a firewall or some other middle-box. | box. | |||
| [Ed3f, 2002] describes the exploitation of the TCP checksum for | [Ed3f, 2002] describes the exploitation of the TCP checksum for | |||
| performing the above activities. [US-CERT, 2005d] provides an | performing the above activities. [US-CERT, 2005d] provides an | |||
| example of a TCP implementation that failed to check the TCP | example of a TCP implementation that failed to check the TCP | |||
| checksum. | checksum. | |||
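
The validation that a firewall or NIDS would need to perform in order to avoid the issues above can be sketched as follows for IPv4. This is a minimal illustration, not code from the draft: the function names are hypothetical, and the checksum is the standard 16-bit one's-complement sum computed over the pseudo-header (source address, destination address, protocol number, TCP length) followed by the TCP segment.

```python
import socket
import struct


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """One's-complement checksum over the IPv4 pseudo-header plus segment."""
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment)))
    data = pseudo + segment
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """A segment carrying a correct Checksum field sums to zero."""
    return tcp_checksum(src_ip, dst_ip, segment) == 0
```

A middlebox using such a check would discard a failing segment before creating or updating any connection state for it.
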
| 3.8. Urgent pointer | 3.8. Urgent pointer | |||
| Segment.Size - Data Offset * 4 > 0 | Some implementations have been found to be unable to process TCP | |||
| urgent indications correctly. [Myst, 1997] originally described how | ||||
| If a TCP segment with the URG bit set does not pass this check, it | TCP urgent indications could be exploited to perform a Denial of | |||
| MUST be silently dropped. | Service (DoS) attack against some TCP/IP implementations, usually | |||
| leading to a system crash. | ||||
| For TCP segments that have the URG bit set to zero, sending TCP | ||||
| SHOULD set the Urgent Pointer to zero. | ||||
| A receiving TCP MUST ignore the Urgent Pointer field of TCP segments | ||||
| for which the URG bit is zero. | ||||
| DISCUSSION: | ||||
| Section 3.7 of RFC 793 [Postel, 1981c] states (on page 42) that to | ||||
| send an urgent indication the user must also send at least one | ||||
| byte of data. | ||||
| If the URG bit is zero, the Urgent Pointer is not valid, and thus | ||||
| should not be processed by the receiving TCP. Nevertheless, we | ||||
| recommend TCP implementations to set the Urgent Pointer to zero | ||||
| when sending a TCP segment that does not have the URG bit set, and | ||||
| to ignore the Urgent Pointer (as required by RFC 793) when the URG | ||||
| bit is zero. | ||||
| Some stacks have been known to fail to set the Urgent Pointer to | ||||
| zero when the URG bit is zero, thus leaking out the corresponding | ||||
| system memory contents. [Zalewski, 2008] provides further details | ||||
| about this issue. | ||||
| Some implementations have been found to be unable to process TCP | [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes a number of | |||
| urgent indications correctly. [Myst, 1997] originally described | sanity checks to be enforced on TCP segments regarding urgent | |||
| how TCP urgent indications could be exploited to perform a Denial | indications. [RFC6093] deprecates the use of urgent indications in | |||
| of Service (DoS) attack against some TCP/IP implementations, | new applications. | |||
| usually leading to a system crash. | ||||
| 3.9. Options | 3.9. Options | |||
| [IANA, 2007] contains the official list of the assigned option | [IANA, 2007] contains the official list of the assigned option | |||
| numbers. TCP Options have been specified in the past both within the | numbers. TCP Options have been specified in the past both within the | |||
| IETF and by other groups. [Hnes, 2007] contains an un-official | IETF and by other groups. [Hnes, 2007] contains an un-official | |||
| updated version of the IANA list of assigned option numbers. The | updated version of the IANA list of assigned option numbers. The | |||
| following table contains a summary of the assigned TCP option | following table contains a summary of the assigned TCP option | |||
| numbers, which is based on [Hnes, 2007]. | numbers, which is based on [Hnes, 2007]. | |||
| skipping to change at page 27, line 10 ¶ | skipping to change at page 19, line 10 ¶ | |||
| o Case 2: An option-kind byte, followed by an option-length byte, | o Case 2: An option-kind byte, followed by an option-length byte, | |||
| and the actual option-data bytes. | and the actual option-data bytes. | |||
| In options of the Case 2 above, the option-length byte counts the | In options of the Case 2 above, the option-length byte counts the | |||
| option-kind byte and the option-length byte, as well as the actual | option-kind byte and the option-length byte, as well as the actual | |||
| option-data bytes. | option-data bytes. | |||
| All options except "End of Option List" (Kind = 0) and "No Operation" | All options except "End of Option List" (Kind = 0) and "No Operation" | |||
| (Kind = 1), are of "Case 2". | (Kind = 1), are of "Case 2". | |||
| For options that belong to the "Case 2" described above, the | [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes a number of | |||
| following checks MUST be performed: | sanity checks that should be performed on TCP options. | |||
| option-length >= 2 | ||||
| option-offset + option-length <= Data Offset * 4 | ||||
| Where option-offset is the offset of the first byte of the option | ||||
| within the TCP header, with the first byte of the TCP header being | ||||
| assigned an offset of 0. | ||||
| If a TCP segment fails to pass any of these checks, it SHOULD be | ||||
| silently dropped. | ||||
| TCP MUST ignore unknown TCP options, provided they pass the | ||||
| validation checks specified above. In the same way, middle-boxes | ||||
| such as packet filters SHOULD NOT reject TCP segments containing | ||||
| "unknown" TCP options that pass the validation checks described | ||||
| earlier in this Section. | ||||
| DISCUSSION: | ||||
| The value "2" in the first equation accounts for the option-kind | ||||
| byte and the option-length byte, and assumes zero bytes of option- | ||||
| data. This check prevents, among other things, loops in option | ||||
| processing that may arise from incorrect option lengths. | ||||
| The second equation takes into account the limit on the legitimate | ||||
| option length imposed by the syntax of the TCP header, and is | ||||
| meant to detect forged option-length values that might make an | ||||
| option overlap with the TCP payload, or even go past the actual | ||||
| end of the TCP segment carrying the option. | ||||
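
The two option-length checks above can be sketched as a simple walk over the options area. This is an illustration under stated assumptions: the function name is hypothetical, the input is assumed to be the raw TCP header starting at offset 0, and the additional check that Data Offset covers at least the 20-byte fixed header is not part of the two equations above.

```python
def options_are_well_formed(header: bytes) -> bool:
    """Check option-length >= 2 and option-offset + option-length <= Data Offset * 4."""
    if len(header) < 20:
        return False
    header_len = (header[12] >> 4) * 4       # Data Offset * 4
    if header_len < 20 or header_len > len(header):
        return False                         # assumed extra sanity check
    offset = 20                              # options start after the fixed header
    while offset < header_len:
        kind = header[offset]
        if kind == 0:                        # End of Option List
            return True
        if kind == 1:                        # No Operation: single byte
            offset += 1
            continue
        if offset + 1 >= header_len:         # no room for the option-length byte
            return False
        length = header[offset + 1]
        if length < 2:                       # would loop forever or move backwards
            return False
        if offset + length > header_len:     # option would overlap the payload
            return False
        offset += length
    return True
```

Because every accepted option advances the offset by at least two bytes, the loop always terminates, which is the property the first check is meant to guarantee.
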
| Middle-boxes such as packet filters should not reject TCP segments | ||||
| containing unknown options solely because these options have not been | ||||
| present in the SYN/SYN-ACK handshake. | ||||
| DISCUSSION: | ||||
| There is renewed interest in defining new TCP options for purposes | ||||
| like improved connection management and maintenance, advanced | ||||
| congestion control schemes, and security features. The evolution | ||||
| of the TCP/IP protocol suite would be severely impacted by | ||||
| obstacles to deploying such new protocol mechanisms. | ||||
| Middle-boxes such as packet filters SHOULD NOT reject TCP segments | ||||
| containing unknown options solely because these options have not been | ||||
| present in the SYN/SYN-ACK handshake. | ||||
| DISCUSSION: | ||||
| In the past, TCP enhancements based on TCP options regularly have | ||||
| specified the exchange of a specific "enabling" option during the | ||||
| initial SYN/SYN-ACK handshake. Due to the severely limited TCP | ||||
| option space which has already become a concern, it should be | ||||
| expected that future specifications might introduce new options | ||||
| not negotiated or enabled in this way. Therefore, middle-boxes | ||||
| such as packet filters should not reject TCP segments containing | ||||
| unknown options solely because these options have not been present | ||||
| in the SYN/SYN-ACK handshake. | ||||
| TCP MUST NOT "echo" in any way unknown TCP options received in | ||||
| inbound TCP segments. | ||||
| DISCUSSION: | ||||
| Some TCP implementations have been known to "echo" unknown TCP | ||||
| options received in incoming segments. Here we stress that TCP | ||||
| must not "echo" in any way unknown TCP options received in inbound | ||||
| TCP segments. This is at the foundation for the introduction of | ||||
| new TCP options, ensuring unambiguous behavior of systems not | ||||
| supporting a new specification. | ||||
| Section 4 discusses the security implications of common TCP options. | Section 4 discusses the security implications of common TCP options. | |||
| 3.10. Padding | 3.10. Padding | |||
| The TCP header padding is used to ensure that the TCP header ends and | The TCP header padding is used to ensure that the TCP header ends and | |||
| data begins on a 32-bit boundary. The padding is composed of zeros. | data begins on a 32-bit boundary. The padding is composed of zeros. | |||
| 3.11. Data | 3.11. Data | |||
| The data field contains the upper-layer packet being transmitted by | The data field contains the upper-layer packet being transmitted by | |||
| means of TCP. This payload is processed by the application process | means of TCP. This payload is processed by the application process | |||
| making use of the transport services of TCP. Therefore, the security | making use of the transport services of TCP. Therefore, the security | |||
| implications of this field are out of the scope of this document. | implications of this field are out of the scope of this document. | |||
| 4. Common TCP Options | 4. Common TCP Options | |||
| 4.1. End of Option List (Kind = 0) | 4.1. End of Option List (Kind = 0) | |||
| TCP implementations MUST be able to gracefully handle those TCP | This option indicates the "End of Options". As noted in | |||
| segments in which the End of Option List should have been present, | [draft-gont-tcpm-tcp-sanity-checks-00.txt], some implementations pad | |||
| but is missing. | the end of options with "No Operation" options rather than including | |||
| an "End of Options List" option. | ||||
| DISCUSSION: | ||||
| This option is used to indicate the "end of options" in those | ||||
| cases in which the end of options would not coincide with the end | ||||
| of the TCP header. | ||||
| TCP implementations are required to ignore those options they do | ||||
| not implement, and to be able to handle options with illegal | ||||
| lengths. Therefore, TCP implementations should be able to | ||||
| gracefully handle those TCP segments in which the End of Option | ||||
| List should have been present, but is missing. | ||||
| It is interesting to note that some TCP implementations do not use | ||||
| the "End of Option List" option for indicating the "end of | ||||
| options", but simply pad the TCP header with several "No | ||||
| Operation" (Kind = 1) options to meet the header length specified | ||||
| by the Data Offset header field. | ||||
| 4.2. No Operation (Kind = 1) | 4.2. No Operation (Kind = 1) | |||
| The no-operation option is basically used to allow the sending system | The no-operation option is basically used to allow the sending system | |||
| to align subsequent options in, for example, 32-bit boundaries. | to align subsequent options in, for example, 32-bit boundaries. | |||
| This option does not have any known security implications. | This option does not have any known security implications. | |||
| 4.3. Maximum Segment Size (Kind = 2) | 4.3. Maximum Segment Size (Kind = 2) | |||
| The Maximum Segment Size (MSS) option is used to indicate to the | The Maximum Segment Size (MSS) option is used to indicate to the | |||
| remote TCP endpoint the maximum segment size this TCP is willing to | remote TCP endpoint the maximum segment size this TCP is willing to | |||
| receive. | receive. | |||
| The following check MUST be performed on a TCP segment that carries a | The MSS option has been employed for performing DoS attacks, by | |||
| MSS option: | advertising very small MSS values thus greatly increasing the packet- | |||
| rate used by the victim system. | ||||
| SYN == 1 | [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes this issue, and | |||
| proposes sanity checks to mitigate it. | ||||
| If the segment does not pass this check, it MUST be silently dropped. | ||||
| DISCUSSION: | ||||
| As stated in Section 3.1 of RFC 793 [Postel, 1981c], this option | ||||
| can only be sent in the initial connection request (i.e., in | ||||
| segments with the SYN control bit set). | ||||
| TCP MUST check that the option length is 4. If the option does not | ||||
| pass this check, it MUST be dropped. | ||||
| The received MSS SHOULD be sanitized as follows: | ||||
| Sanitized_MSS = max(MSS, 536) | ||||
| This "sanitized" MSS value SHOULD be used to compute the "effective | ||||
| send MSS" by the expression included in Section 4.2.2.6 of RFC 1122 | ||||
| [Braden, 1989], as follows: | ||||
| Eff.snd.MSS = min(Sanitized_MSS+20, MMS_S) - TCPhdrsize - IPoptionsize | ||||
| where: | ||||
| Sanitized_MSS: | ||||
| sanitized MSS value (the value received in the MSS option, with an | ||||
| enforced minimum value) | ||||
| MMS_S: | ||||
| maximum size for a transport-layer message that TCP may send | ||||
| TCPhdrsize: | ||||
| size of the TCP header, which typically was 20, but may be larger | ||||
| if TCP options are to be sent. | ||||
| IPoptionsize | ||||
| size of any IP options that TCP will pass to the IP layer with the | ||||
| current message. | ||||
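
The MSS sanitization and the "effective send MSS" expression above can be written out directly. The sketch below is an illustration only: the function name and the default argument values are hypothetical, and the figure of 1480 bytes used in the note that follows assumes an IPv4 path over a 1500-byte MTU.

```python
def effective_send_mss(received_mss: int, mms_s: int,
                       tcp_hdr_size: int = 20, ip_option_size: int = 0) -> int:
    """Eff.snd.MSS = min(Sanitized_MSS + 20, MMS_S) - TCPhdrsize - IPoptionsize."""
    sanitized_mss = max(received_mss, 536)   # enforce the 536-byte minimum
    return min(sanitized_mss + 20, mms_s) - tcp_hdr_size - ip_option_size
```

With MMS_S = 1480, an advertised MSS of 1 is raised to an effective value of 536, whereas an advertised MSS of 1460 is used unchanged (or becomes 1448 when 12 bytes of TCP options are sent).
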
| DISCUSSION: | ||||
| The advertised maximum segment size may be the result of the | ||||
| consideration of a number of factors. Firstly, if fragmentation | ||||
| is employed, the size of the IP reassembly buffer may impose a | ||||
| limit on the maximum TCP segment size that can be received. | ||||
| Considering that the minimum IP reassembly buffer size is 576 | ||||
| bytes, if an MSS option is not included in the | ||||
| connection-establishment phase, an MSS of 536 bytes should be assumed. | ||||
| Secondly, if Path-MTU Discovery (specified in RFC 1191 [Mogul and | ||||
| Deering, 1990] and RFC 1981 [McCann et al, 1996]) is expected to | ||||
| be used for the connection, an artificial maximum segment size may | ||||
| be enforced by a TCP to prevent the remote peer from sending TCP | ||||
| segments which would be too large to be transmitted without | ||||
| fragmentation. Finally, a system connected by a low-speed link | ||||
| may choose to introduce an artificial maximum segment size to | ||||
| enforce an upper limit on the network latency that would otherwise | ||||
| negatively affect its interactive applications [Stevens, 1994]. | ||||
| The TCP specifications do not impose any requirements on the | ||||
| maximum segment size value that is included in the MSS option. | ||||
| However, there are a number of values that may cause undesirable | ||||
| results. Firstly, an MSS of 0 could possibly "freeze" the TCP | ||||
| connection, as it would not allow data to be included in the | ||||
| payload of the TCP segments. Secondly, low values other than 0 | ||||
| would degrade the performance of the TCP connection (wasting more | ||||
| bandwidth in protocol headers than in actual data), and could | ||||
| potentially exhaust processing cycles at the sending TCP and/or | ||||
| the receiving TCP by producing an increase in the interrupt rate | ||||
| caused by the transmitted (or received) packets. | ||||
| The problems that might arise from low MSS values were first | ||||
| described by [Reed, 2001]. However, the community did not reach | ||||
| consensus on how to deal with these issues at that point. | ||||
| RFC 791 [Postel, 1981a] requires IP implementations to be able to | ||||
| receive IP datagrams of at least 576 bytes. Assuming an IPv4 | ||||
| header of 20 bytes, and a TCP header of 20 bytes, there should be | ||||
| room in each IP packet for 536 application data bytes. | ||||
| There are two cases to analyze when considering the possible | ||||
| interoperability impact of sanitizing the received MSS value: TCP | ||||
| connections relying on IP fragmentation and TCP connections | ||||
| implementing Path-MTU Discovery. In case the corresponding TCP | ||||
| connection relies on IP fragmentation, given that the minimum | ||||
| reassembly buffer size is required to be 576 bytes by RFC 791 | ||||
| [Postel, 1981a], the adoption of 536 bytes as a lower limit is | ||||
| safe. | ||||
| In case the TCP connection relies on Path-MTU Discovery, imposing | ||||
| a lower limit on the adopted MSS may ignore the advice of the | ||||
| remote TCP on the maximum segment size that can possibly be | ||||
| transmitted without fragmentation. As a result, this could lead | ||||
| to the first TCP data segment to be larger than the Path-MTU. | ||||
| However, in such a scenario, the TCP segment should elicit an ICMP | ||||
| Unreachable "fragmentation needed and DF bit set" error message | ||||
| that would cause the "effective send MSS" (E_MSS) to be decreased | ||||
| appropriately. Thus, imposing a lower limit on the accepted MSS | ||||
| will not cause any interoperability problems. | ||||
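The arithmetic and the lower limit discussed above can be sketched as follows. This is a minimal illustration, not text from either draft: it assumes IPv4 with no IP or TCP options, and the names MIN_MSS and sanitize_mss are hypothetical.

```python
# Hypothetical sketch: derive the 536-byte floor and apply it to a received MSS.
IPV4_MIN_REASSEMBLY_BUFFER = 576  # bytes, minimum required by RFC 791
IPV4_HEADER_LEN = 20              # bytes, assuming no IP options
TCP_HEADER_LEN = 20               # bytes, assuming no TCP options

# 576 - 20 - 20 = 536 bytes of application data per packet
MIN_MSS = IPV4_MIN_REASSEMBLY_BUFFER - IPV4_HEADER_LEN - TCP_HEADER_LEN


def sanitize_mss(received_mss=None):
    """Return the send MSS to adopt for a connection.

    received_mss is the value carried in the peer's MSS option, or None
    if no MSS option was included during connection establishment.
    """
    if received_mss is None:
        # No MSS option: assume the default of 536 bytes.
        return MIN_MSS
    # Enforce 536 bytes as a lower limit on the advertised value.
    return max(received_mss, MIN_MSS)
```

With this sketch, an advertised MSS of 0 or 1 is raised to 536, while a typical Ethernet value such as 1460 is adopted unchanged.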
| A possible scenario exists in which the proposed enforcement of a | ||||
| lower limit in the received MSS might lead to an interoperability | ||||
| problem. If a system was attached to the network by means of a | ||||
| link with an MTU of less than 576 bytes, and some intermediate | ||||
| system either silently dropped (i.e., without sending an ICMP | ||||
| error message) those packets equal to or larger than 576 bytes, | ||||
| or simply filtered | ||||
| ICMP "fragmentation needed and DF bit set" error messages, the | ||||
| proposed behavior would lead to an interoperability problem | ||||
| where communication could otherwise have succeeded. However, the | ||||
| interoperability problem would really be introduced by the network | ||||
| setup (e.g., the middle-box silently dropping packets), rather | ||||
| than by the mechanism proposed in this section. In any case, TCP | ||||
| should nevertheless implement a mechanism such as that specified | ||||
| by RFC 4821 [Mathis and Heffner, 2007] to deal with this type of | ||||
| "network black-holes". | ||||
| 4.4. Selective Acknowledgement Option | 4.4. Selective Acknowledgement Option | |||
| The Selective Acknowledgement option provides an extension to allow | The Selective Acknowledgement option provides an extension to allow | |||
| the acknowledgement of individual segments, to enhance TCP's loss | the acknowledgement of individual segments, to enhance TCP's loss | |||
| recovery. | recovery. | |||
| Two options are involved in the SACK mechanism. The "Sack-permitted | Two options are involved in the SACK mechanism. The "Sack-permitted | |||
| option" is sent during the connection-establishment phase, to | option" is sent during the connection-establishment phase, to | |||
| advertise that SACK is supported. If both TCP peers agree to use | advertise that SACK is supported. If both TCP peers agree to use | |||
| selective acknowledgements, the actual selective acknowledgements are | selective acknowledgements, the actual selective acknowledgements are | |||
| sent, if needed, by means of "SACK options". | sent, if needed, by means of "SACK options". | |||
| 4.4.1. SACK-permitted Option (Kind = 4) | 4.4.1. SACK-permitted Option (Kind = 4) | |||
| The SACK-permitted option is meant to advertise that the TCP sending | [draft-gont-tcpm-tcp-sanity-checks-00.txt] to be performed on this | |||
| this segment supports Selective Acknowledgements. | option. | |||
| The following check MUST be performed on a TCP segment that carries a | ||||
| SACK-permitted option: | ||||
| SYN == 1 | ||||
| If a segment does not pass this check, it MUST be silently dropped. | ||||
| DISCUSSION: | ||||
| The SACK-permitted option can be sent only in SYN segments. | ||||
| TCP MUST check that the option length is 2. If the option does not | ||||
| pass this check it MUST be silently dropped. | ||||
| 4.4.2. SACK Option (Kind = 5) | 4.4.2. SACK Option (Kind = 5) | |||
| The SACK option is used to convey extended acknowledgment information | The TCP receiving a SACK option is expected to keep track of the | |||
| from the receiver to the sender over an established TCP connection. | selectively-acknowledged blocks. Even when space in the TCP header | |||
| The option consists of an option-kind byte (which must be 5), an | is limited (and thus each TCP segment can selectively-acknowledge at | |||
| option-length byte, and a variable number of SACK blocks. | most four blocks of data), an attacker could try to perform a buffer | |||
| overflow or a resource-exhaustion attack by sending a large number of | ||||
| TCP MUST silently discard those TCP segments carrying a SACK option | SACK options. | |||
| that does not pass the following check: | ||||
| option-offset + option-length <= Data Offset * 4 | ||||
| TCP MUST silently discard those TCP segments carrying a SACK option | ||||
| that does not pass the following check: | ||||
| option-length >= 10 | ||||
| DISCUSSION: | ||||
| A SACK Option with zero SACK blocks is nonsensical. The value | ||||
| "10" accounts for the option-kind byte, the option-length byte, a | ||||
| 4-byte left-edge field, and a 4-byte right-edge field. | ||||
| TCP MUST silently discard those TCP segments carrying a SACK option | ||||
| that does not pass the following check: | ||||
| (option-length - 2) % 8 == 0 | ||||
| DISCUSSION: | ||||
| As stated in Section 3 of RFC 2018 [Mathis et al, 1996], a SACK | ||||
| option that specifies n blocks will have a length of 8*n+2. | ||||
| TCP MUST silently discard those TCP segments carrying a SACK option | ||||
| that contains a SACK block that does not pass the following check: | ||||
| Left Edge of Block < Right Edge of Block | ||||
| As in all the other occurrences in this document, all comparisons | ||||
| between sequence numbers should be performed using sequence number | ||||
| arithmetic. | ||||
| DISCUSSION: | ||||
| Each block included in a SACK option represents a number of | ||||
| received data bytes that are contiguous and isolated; that is, the | ||||
| bytes just below the block, (Left Edge of Block - 1), and just | ||||
| above the block, (Right Edge of Block), have not yet been | ||||
| received. | ||||
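The four SACK option checks listed above can be combined into a single validation routine. The following is a minimal sketch under stated assumptions: the function names seq_lt and sack_option_is_valid are hypothetical, the option is passed as raw bytes starting at the option-kind byte, and sequence number arithmetic is taken to be modulo 2^32.

```python
import struct


def seq_lt(a, b):
    """Return True if sequence number a precedes b (arithmetic modulo 2**32)."""
    return a != b and ((b - a) & 0xFFFFFFFF) < 0x80000000


def sack_option_is_valid(option_offset, option, data_offset):
    """Apply the sanity checks to one SACK option (Kind = 5).

    option_offset: byte offset of the option from the start of the TCP header
    option:        raw option bytes, starting at the option-kind byte
    data_offset:   TCP Data Offset field, in 32-bit words
    """
    if len(option) < 2 or option[0] != 5:
        return False
    option_length = option[1]
    # The option must not extend past the end of the TCP header.
    if option_offset + option_length > data_offset * 4:
        return False
    # At least one SACK block: kind + length + two 4-byte edges.
    if option_length < 10:
        return False
    # A SACK option carrying n blocks has a length of 8*n + 2.
    if (option_length - 2) % 8 != 0:
        return False
    if len(option) < option_length:
        return False
    # Every block must satisfy Left Edge < Right Edge.
    for i in range(2, option_length, 8):
        left, right = struct.unpack_from("!II", option, i)
        if not seq_lt(left, right):
            return False
    return True
```

A segment whose SACK option fails this routine would be silently discarded; a block that wraps around the sequence space (for example, Left Edge 0xFFFFFF00 and Right Edge 0x00000100) is still accepted because the comparison uses sequence number arithmetic.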
| TCP MUST enforce a limit on the number of SACK blocks that a TCP will | ||||
| store in memory for each connection at any time. | ||||
| DISCUSSION: | ||||
| The TCP receiving a SACK option is expected to keep track of the | ||||
| selectively-acknowledged blocks. Even when space in the TCP | ||||
| header is limited (and thus each TCP segment can selectively- | ||||
| acknowledge at most four blocks of data), an attacker could try to | ||||
| perform a buffer overflow or a resource-exhaustion attack by | ||||
| sending a large number of SACK options. | ||||
| For example, an attacker could send a large number of SACK | For example, an attacker could send a large number of SACK options, | |||
| options, each of them acknowledging one byte of data. | each of them acknowledging one byte of data. Additionally, for the | |||
| Additionally, for the purpose of wasting resources on the attacked | purpose of wasting resources on the attacked system, each of these | |||
| system, each of these blocks would be separated from each other by | blocks would be separated from each other by one byte, to prevent the | |||
| one byte, to prevent the attacked system from coalescing two (or | attacked system from coalescing two (or more) contiguous SACK blocks | |||
| more) contiguous SACK blocks into a single SACK block. If the | into a single SACK block. If the attacked system kept track of each | |||
| attacked system kept track of each SACKed block by storing both | SACKed block by storing both the Left Edge and the Right Edge of the | |||
| the Left Edge and the Right Edge of the block, then for each | block, then for each window of data, the attacker could waste up to 4 | |||
| window of data, the attacker could waste up to 4 * Window bytes of | * Window bytes of memory at the attacked TCP. | |||
| memory at the attacked TCP. | ||||
| The value "4 * Window" results from the expression "(Window / 2) * | The value "4 * Window" results from the expression "(Window / 2) * | |||
| 8", in which the value "2" accounts for the 1-byte block | 8", in which the value "2" accounts for the 1-byte block | |||
| selectively-acknowledged by each SACK block and 1 byte that would | selectively-acknowledged by each SACK block and 1 byte that would | |||
| be used to separate the SACK blocks from each other, and the | be used to separate the SACK blocks from each other, and the | |||
| value "8" accounts for the 8 bytes needed to store the Left Edge | value "8" accounts for the 8 bytes needed to store the Left Edge | |||
| and the Right Edge of each SACKed block. | and the Right Edge of each SACKed block. | |||
| Therefore, it is clear that a limit should be imposed on the | [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes sanity checks to | |||
| number of SACK blocks that a TCP will store in memory for each | be performed on this option such that this and other possible issues | |||
| connection at any time. Measurements in [Dharmapurikar and | are mitigated. | |||
| Paxson, 2005] indicate that in the vast majority of cases | ||||
| connections have a single hole in the data stream at any given | ||||
| time. Thus, a limit of 16 SACK blocks for each connection would | ||||
| handle even most of the more unusual cases in which there is more | ||||
| than one simultaneous hole at a time. | ||||
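The "4 * Window" figure and the effect of a per-connection limit can be checked with a short calculation. This is an illustrative sketch only: the function names are hypothetical, and it assumes the receiver stores exactly 8 bytes (Left Edge and Right Edge) per SACKed block.

```python
SACK_BLOCK_STATE_BYTES = 8  # 4-byte Left Edge + 4-byte Right Edge
MAX_SACK_BLOCKS = 16        # per-connection limit suggested in the text


def worst_case_sack_memory(window):
    """Bytes of state an attacker can force with no limit in place.

    Each 1-byte SACK block is followed by a 1-byte gap, so a window of
    `window` bytes holds window // 2 blocks that cannot be coalesced.
    """
    return (window // 2) * SACK_BLOCK_STATE_BYTES


def bounded_sack_memory(window, limit=MAX_SACK_BLOCKS):
    """Bytes of state when at most `limit` blocks are stored."""
    return min(window // 2, limit) * SACK_BLOCK_STATE_BYTES
```

For a 65536-byte window, the unbounded case costs 262144 bytes (4 * Window) per connection, while a limit of 16 blocks caps the state at 128 bytes.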
| 4.5. MD5 Option (Kind=19) | 4.5. MD5 Option (Kind=19) | |||
| The TCP MD5 option provides a mechanism for authenticating TCP | The TCP MD5 option provides a mechanism for authenticating TCP | |||
| segments with a 16-byte digest produced by the MD5 algorithm. The | segments with a 16-byte digest produced by the MD5 algorithm. The | |||
| option consists of an option-kind byte (which must be 19), an option- | option consists of an option-kind byte (which must be 19), an option- | |||
| length byte (which must be 18), and a 16-byte MD5 digest. | length byte (which must be 18), and a 16-byte MD5 digest. | |||
| TCP MUST silently drop a TCP segment that carries a TCP MD5 option | A basic weakness of the TCP MD5 option is that the MD5 algorithm | |||
| that does not pass the following checks: | itself has been known (for a long time) to be vulnerable to collision | |||
| search attacks. | ||||
| option-offset + option-length <= Data Offset * 4 | ||||
| option-length == 18 | ||||
| DISCUSSION: | ||||
| The TCP MD5 option is of "Case 2", and has a fixed length. | ||||
| DISCUSSION: | ||||
| A basic weakness of the TCP MD5 option is that the MD5 algorithm | ||||
| itself has been known (for a long time) to be vulnerable to | ||||
| collision search attacks. | ||||
| [Bellovin, 2006] argues that it has two other weaknesses, namely | [Bellovin, 2006] argues that it has two other weaknesses, namely that | |||
| that it does not provide a key identifier, and that it has no | it does not provide a key identifier, and that it has no provision | |||
| provision for automated key management. However, it is generally | for automated key management. However, it is generally accepted that | |||
| accepted that while a Key-ID field can be a good approach for | while a Key-ID field can be a good approach for providing smooth key | |||
| providing smooth key rollover, it is not actually a requirement. | rollover, it is not actually a requirement. For instance, most | |||
| For instance, most systems implementing the TCP MD5 option include | systems implementing the TCP MD5 option include a "keychain" | |||
| a "keychain" mechanism that fully supports smooth key rollover. | mechanism that fully supports smooth key rollover. Additionally, | |||
| Additionally, with some further work, ISAKMP/IKE could be used to | with some further work, ISAKMP/IKE could be used to configure the MD5 | |||
| configure the MD5 keys. | keys. | |||
| It is interesting to note that while the TCP MD5 option, as | It is interesting to note that while the TCP MD5 option, as specified | |||
| specified by RFC 2385 [Heffernan, 1998], addresses the TCP-based | by RFC 2385 [Heffernan, 1998], addresses the TCP-based forgery | |||
| forgery attacks against TCP discussed in Section 11, it does not | attacks against TCP discussed in Section 11, it does not address the | |||
| address the ICMP-based connection-reset attacks discussed in | ICMP-based connection-reset attacks discussed in Section 15. As a | |||
| Section 15. As a result, while a TCP connection may be protected | result, while a TCP connection may be protected from TCP-based | |||
| from TCP-based forgery attacks by means of the MD5 option, an | forgery attacks by means of the MD5 option, an attacker might still | |||
| attacker might still be able to successfully perform the ICMP- | be able to successfully perform the ICMP-based counterpart. | |||
| based counterpart. | ||||
| The TCP MD5 option has been obsoleted by the TCP-AO. | The TCP MD5 option has been obsoleted by the TCP-AO. | |||
| 4.6. Window scale option (Kind = 3) | 4.6. Window scale option (Kind = 3) | |||
| The window scale option provides a mechanism to expand the definition | The window scale option provides a mechanism to expand the definition | |||
| of the TCP window to 32 bits, such that the performance of TCP can be | of the TCP window to 32 bits, such that the performance of TCP can be | |||
| improved in some network scenarios. The Window scale option consists | improved in some network scenarios. The Window scale option consists | |||
| of an option-kind byte (which must be 3), followed by an option- | of an option-kind byte (which must be 3), followed by an option- | |||
| length byte (which must be 3), and a shift count (shift.cnt) byte | length byte (which must be 3), and a shift count (shift.cnt) byte | |||
| (the actual option-data). | (the actual option-data). | |||
| The option may be sent only in the initial SYN segment, but may also | While there are no known security implications arising from the | |||
| be sent in a SYN/ACK segment if the option was received in the | window scale mechanism itself, the size of the TCP window has a | |||
| initial SYN segment. If the option is received in any other segment, | number of security implications. In general, larger window sizes | |||
| it MUST be silently dropped. | increase the chances of an attacker successfully performing | |||
| forgery attacks against TCP, such as those described in Section 11 of | ||||
| TCP MUST silently discard TCP segments that contain a Window scale | this document. Additionally, large windows can exacerbate the impact | |||
| option whose option-length is not 3. | of resource exhaustion attacks such as those described in Section 7 | |||
| of this document. | ||||
| DISCUSSION: | ||||
| This option has a fixed length. | ||||
| TCP MUST silently discard TCP segments that contain a Window scale | ||||
| option that does not pass the following check: | ||||
| shift.cnt <= 14 | ||||
| DISCUSSION: | ||||
| As discussed in Section 2.3 of RFC 1323 [Jacobson et al, 1992], in | ||||
| order to prevent new data from being mistakenly considered as old | ||||
| and vice versa, the resulting window should be smaller than | ||||
| 2^30. | ||||
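The two Window scale checks, and the reason for the bound of 14, can be illustrated as follows. This is a hedged sketch: the function names are hypothetical, and the 16-bit window field and left-shift scaling are as described in RFC 1323.

```python
MAX_SHIFT_CNT = 14  # upper bound on shift.cnt


def window_scale_option_is_valid(option_length, shift_cnt):
    """Check the fixed option length and the shift count bound."""
    return option_length == 3 and 0 <= shift_cnt <= MAX_SHIFT_CNT


def scaled_window(window_field, shift_cnt):
    """Return the effective window for a 16-bit window field value."""
    return window_field << shift_cnt
```

With the largest 16-bit window field (65535) and the largest permitted shift count (14), the effective window is 1073725440 bytes, which stays below 2^30.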
| DISCUSSION: | ||||
| [Welzl, 2008] describes major problems with the use of the Window | ||||
| scale option in the Internet due to faulty equipment. | ||||
| While there are no known security implications arising from the | ||||
| window scale mechanism itself, the size of the TCP window has a | ||||
| number of security implications. In general, larger window sizes | ||||
| increase the chances of an attacker successfully performing | ||||
| forgery attacks against TCP, such as those described in Section 11 | ||||
| of this document. Additionally, large windows can exacerbate the | ||||
| impact of resource exhaustion attacks such as those described in | ||||
| Section 7 of this document. | ||||
| Section 3.7 provides a general discussion of the security | Section 3.7 provides a general discussion of the security | |||
| implications of the TCP window size. Section 7.3.2 discusses the | implications of the TCP window size. Section 7.3.2 discusses the | |||
| security implications of Automatic receive-buffer tuning | security implications of Automatic receive-buffer tuning mechanisms. | |||
| mechanisms. | ||||
| 4.7. Timestamps option (Kind = 8) | 4.7. Timestamps option (Kind = 8) | |||
| The Timestamps option, specified in RFC 1323 [Jacobson et al, 1992], | The Timestamps option, specified in RFC 1323 [Jacobson et al, 1992], | |||
| is used to perform two functions: Round-Trip Time Measurement (RTTM), | is used to perform two functions: Round-Trip Time Measurement (RTTM), | |||
| and Protection Against Wrapped Sequence Numbers (PAWS). | and Protection Against Wrapped Sequence Numbers (PAWS). | |||
| TCP MUST silently discard TCP segments that contain a Timestamps | ||||
| option that does not pass the following check: | ||||
| option-length == 10 | ||||
| DISCUSSION: | ||||
| As specified by RFC 1323, the option-length must be 10. | ||||
| 4.7.1. Generation of timestamps | 4.7.1. Generation of timestamps | |||
| TCP SHOULD generate timestamps with the following expression: | For the purpose of PAWS, the timestamps sent on a connection are | |||
| required to be monotonically increasing. While there is no | ||||
| timestamp = T() + F(localhost, localport, remotehost, remoteport, secret_key) | requirement that timestamps are monotonically increasing across TCP | |||
| connections, the generation of timestamps such that they are | ||||
| where the result of T() is a global system clock that complies with | monotonically increasing across connections between the same two | |||
| the requirements of Section 4.2.2 of RFC 1323 [Jacobson et al, 1992], | endpoints allows the use of timestamps for improving the handling of | |||
| and F() is a function that should not be computable from the outside. | SYN segments that are received while the corresponding four-tuple is | |||
| Therefore, we suggest F() to be a cryptographic hash function of the | in the TIME-WAIT state. This is discussed in Section 11.1.2 of this | |||
| connection-id and some secret data. | document. | |||
| DISCUSSION: | ||||
| For the purpose of PAWS, the timestamps sent on a connection are | ||||
| required to be monotonically increasing. While there is no | ||||
| requirement that timestamps are monotonically increasing across | ||||
| TCP connections, the generation of timestamps such that they are | ||||
| monotonically increasing across connections between the same two | ||||
| endpoints allows the use of timestamps for improving the handling | ||||
| of SYN segments that are received while the corresponding four- | ||||
| tuple is in the TIME-WAIT state. This is discussed in Section | ||||
| 11.1.2 of this document. | ||||
| F() provides an offset that will be the same for all incarnations | ||||
| of a connection between the same two endpoints, while T() provides | ||||
| the monotonically increasing values that are needed for PAWS. | ||||
| Further discussion about this algorithm is available in | ||||
| [I-D.gont-timestamps-generation]. | ||||
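The expression timestamp = T() + F(...) could be realized along the following lines. This is a minimal sketch, not the algorithm mandated by any specification: the choice of HMAC-SHA256 for F(), the 1 ms clock tick for T(), the use of a monotonic clock, and the name generate_timestamp are all assumptions made for illustration.

```python
import hashlib
import hmac
import os
import time

# Assumption: a per-boot secret that is never disclosed to remote systems.
SECRET_KEY = os.urandom(16)


def F(localhost, localport, remotehost, remoteport, secret_key):
    """Per-four-tuple offset that is not computable from the outside.

    Assumption: HMAC-SHA256 over the connection-id, truncated to 32 bits.
    """
    conn_id = f"{localhost}|{localport}|{remotehost}|{remoteport}".encode()
    digest = hmac.new(secret_key, conn_id, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def T():
    """Global timestamp clock, assumed here to tick once per millisecond."""
    return int(time.monotonic() * 1000) & 0xFFFFFFFF


def generate_timestamp(localhost, localport, remotehost, remoteport,
                       secret_key=SECRET_KEY):
    """timestamp = T() + F(...), truncated to the 32-bit TSval field."""
    offset = F(localhost, localport, remotehost, remoteport, secret_key)
    return (T() + offset) & 0xFFFFFFFF
```

Because F() depends only on the four-tuple and the secret, every incarnation of a connection between the same two endpoints gets the same offset, so timestamps keep increasing across incarnations, while different four-tuples see unrelated offsets.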
| TCP SHOULD NOT initialize a global timestamp counter to a fixed value | ||||
| when the system is bootstrapped. | ||||
| DISCUSSION: | ||||
| Some implementations are known to initialize their global | ||||
| timestamp clock to zero when the system is bootstrapped. This is | ||||
| undesirable, as the timestamp clock would disclose the system | ||||
| uptime. | ||||
| TCP SHOULD set the Timestamp Echo Reply (TSecr) field to zero when | ||||
| sending a TCP segment that does not have the ACK bit set (i.e., a SYN | ||||
| segment). | ||||
| DISCUSSION: | ||||
| Some TCP implementations have been found to fail to set the | Some implementations are known to initialize their global timestamp | |||
| Timestamp Echo Reply field (TSecr) to zero in TCP segments that do | clock to zero when the system is bootstrapped. This is undesirable, | |||
| not have the ACK bit set, thus potentially leaking information. | as the timestamp clock would disclose the system uptime. | |||
| [I-D.gont-timestamps-generation] discusses the generation of TCP | ||||
| timestamps in detail. | ||||
| 4.7.2. Vulnerabilities | 4.7.2. Vulnerabilities | |||
| Blind In-Window Attacks | Blind In-Window Attacks | |||
| Segments that contain a timestamp option smaller than the last | Segments that contain a timestamp option smaller than the last | |||
| timestamp option recorded by TCP are silently dropped. This allows | timestamp option recorded by TCP are silently dropped. This allows | |||
| for a subtle attack against TCP that would allow an attacker to cause | for a subtle attack against TCP that would allow an attacker to cause | |||
| one direction of data transfer of the attacked connection to freeze | one direction of data transfer of the attacked connection to freeze | |||
| [US-CERT, 2005c]. An attacker could forge a TCP segment that | [US-CERT, 2005c]. An attacker could forge a TCP segment that | |||
| skipping to change at page 40, line 7 ¶ | skipping to change at page 24, line 14 ¶ | |||
| proposes mitigations for this and other issues. | proposes mitigations for this and other issues. | |||
| 5. Connection-establishment mechanism | 5. Connection-establishment mechanism | |||
| The following subsections describe a number of attacks that can be | The following subsections describe a number of attacks that can be | |||
| performed against TCP by exploiting its connection-establishment | performed against TCP by exploiting its connection-establishment | |||
| mechanism. | mechanism. | |||
| 5.1. SYN flood | 5.1. SYN flood | |||
| TCP SHOULD implement (and enable by default) a syn-cache [Lemon, | TCP uses a mechanism known as the "three-way handshake" for the | |||
| 2002]. | establishment of a connection between two TCP peers. RFC 793 | |||
| [RFC0793] states that when a TCP that is in the LISTEN state receives | ||||
| TCP SHOULD implement syn-cookies, and SHOULD enable them only after a | a SYN segment (i.e., a TCP segment with the SYN flag set), it must | |||
| specified number of TCBs has been allocated for connections in the | transition to the SYN-RECEIVED state, record the control information | |||
| SYN-RECEIVED state. | (e.g., the ISN) contained in the SYN segment in a Transmission | |||
| Control Block (TCB), and respond with a SYN/ACK segment. | ||||
| DISCUSSION: | ||||
| TCP uses a mechanism known as the "three-way handshake" for the | ||||
| establishment of a connection between two TCP peers. RFC 793 | ||||
| [Postel, 1981c] states that when a TCP that is in the LISTEN state | ||||
| receives a SYN segment (i.e., a TCP segment with the SYN flag | ||||
| set), it must transition to the SYN-RECEIVED state, record the | ||||
| control information (e.g., the ISN) contained in the SYN segment | ||||
| in a Transmission Control Block (TCB), and respond with a SYN/ACK | ||||
| segment. | ||||
| A Transmission Control Block is the data structure used to store | A Transmission Control Block is the data structure used to store | |||
| (usually within the kernel) all the information relevant to a TCP | (usually within the kernel) all the information relevant to a TCP | |||
| connection. The concept of "TCB" is introduced in the core TCP | connection. The concept of "TCB" is introduced in the core TCP | |||
| specification RFC 793 [Postel, 1981c]. | specification RFC 793 [RFC0793]. | |||
| In practice, virtually all existing implementations do not modify | In practice, virtually all existing implementations do not modify the | |||
| the state of the TCP that was in the LISTEN state, but rather | state of the TCP that was in the LISTEN state, but rather create a | |||
| create a new TCP (i.e., a new "protocol machine"), and perform all | new TCP (i.e., a new "protocol machine"), and perform all the state | |||
| the state transitions on this newly-created TCP. This allows the | transitions on this newly-created TCP. This allows the application | |||
| application running on top of TCP to service more than one | running on top of TCP to service more than one client at the same | |||
| client at the same time. As a result, each connection request | time. As a result, each connection request results in the allocation | |||
| results in the allocation of system memory to store the TCB | of system memory to store the TCB associated with the newly created | |||
| associated with the newly created TCP. | TCP. | |||
| If TCP was implemented strictly as described in RFC 793, the | If TCP was implemented strictly as described in RFC 793, the | |||
| application running on top of TCP would have to finish servicing | application running on top of TCP would have to finish servicing the | |||
| the current client before being able to service the next one in | current client before being able to service the next one in line, or | |||
| line, or should instead be able to perform some kind of connection | should instead be able to perform some kind of connection hand-off. | |||
| hand-off. | ||||
| An attacker could exploit TCP's connection-establishment mechanism | An attacker could exploit TCP's connection-establishment mechanism to | |||
| to perform a Denial of Service (DoS) attack, by sending a large | perform a Denial of Service (DoS) attack, by sending a large number | |||
| number of connection requests to the target system, with the | of connection requests to the target system, with the intent of | |||
| intent of exhausting the system memory destined for storing TCBs | exhausting the system memory destined for storing TCBs (or related | |||
| (or related kernel data structures), thus preventing the attacked | kernel data structures), thus preventing the attacked system from | |||
| system from establishing new connections with legitimate users. | establishing new connections with legitimate users. This attack is | |||
| This attack is widely known as "SYN flood", and received a lot | widely known as "SYN flood", and received a lot of attention | |||
| of attention during the late 1990s [CERT, 1996]. | during the late 1990s [CERT, 1996]. | |||
| Given that the attacker does not need to complete the three-way | Given that the attacker does not need to complete the three-way | |||
| handshake for the attacked system to tie system resources to the | handshake for the attacked system to tie system resources to the | |||
| newly created TCBs, he will typically forge the source IP address | newly created TCBs, he will typically forge the source IP address of | |||
| of the malicious SYN segments he sends, thus concealing his own IP | the malicious SYN segments he sends, thus concealing his own IP | |||
| address. | address. | |||
| If the forged IP addresses corresponded to some reachable system, | If the forged IP addresses corresponded to some reachable system, the | |||
| the impersonated system would receive the SYN/ACK segment sent by | impersonated system would receive the SYN/ACK segment sent by the | |||
| the attacked host (in response to the forged SYN segment), which | attacked host (in response to the forged SYN segment), which would | |||
| would elicit an RST segment. This RST segment would be delivered | elicit an RST segment. This RST segment would be delivered to the | |||
| to the attacked system, causing the corresponding connection to be | attacked system, causing the corresponding connection to be aborted, | |||
| aborted, and the corresponding TCB to be removed. | and the corresponding TCB to be removed. | |||
| As the impersonated host would not have any state information for | As the impersonated host would not have any state information for the | |||
| the TCP connection being referred to by the SYN/ACK segment, it | TCP connection being referred to by the SYN/ACK segment, it would | |||
| would respond with a RST segment, as specified by the TCP segment | respond with a RST segment, as specified by the TCP segment | |||
| processing rules of RFC 793 [Postel, 1981c]. | processing rules of RFC 793 [RFC0793]. | |||
| However, if the forged IP source addresses were unreachable, the | However, if the forged IP source addresses were unreachable, the | |||
| attacked TCP would continue retransmitting the SYN/ACK segment | attacked TCP would continue retransmitting the SYN/ACK segment | |||
| corresponding to each connection request, until timing out and | corresponding to each connection request, until timing out and | |||
| aborting the connection. For this reason, a number of widely | aborting the connection. For this reason, a number of widely | |||
| available attack tools first check whether each of the (forged) IP | available attack tools first check whether each of the (forged) IP | |||
| addresses is reachable by sending an ICMP echo request to it. | addresses is reachable by sending an ICMP echo request to it. The | |||
| The receipt of an ICMP echo response is considered an indication | receipt of an ICMP echo response is considered an indication of the | |||
| of the IP address being reachable (and thus results in the | IP address being reachable (and thus results in the corresponding IP | |||
| corresponding IP address not being used for performing the | address not being used for performing the attack), while the receipt | |||
| attack), while the receipt of an ICMP unreachable error message is | of an ICMP unreachable error message is considered an indication of | |||
| considered an indication of the IP address being unreachable (and | the IP address being unreachable (and thus results in the | |||
| thus results in the corresponding IP address being used for | corresponding IP address being used for performing the attack). | |||
| performing the attack). | ||||
| [Gont, 2008b] describes how the so-called ICMP soft errors could | [Gont, 2008b] describes how the so-called ICMP soft errors could be | |||
| be used by TCP to abort connections in any of the non-synchronized | used by TCP to abort connections in any of the non-synchronized | |||
| states. While implementation of the mechanism described in that | states. While implementation of the mechanism described in that | |||
| document would certainly not eliminate the vulnerability of TCP to | document would certainly not eliminate the vulnerability of TCP to | |||
| SYN flood attacks (as the attacker could use addresses that are | SYN flood attacks (as the attacker could use addresses that are | |||
| simply "black-holed"), it provides an example of how signaling | simply "black-holed"), it provides an example of how signaling | |||
| information such as that provided by means of ICMP error messages | information such as that provided by means of ICMP error messages can | |||
| can provide valuable information that a transport protocol could | provide valuable information that a transport protocol could use to | |||
| use to perform heuristics. | perform heuristics. | |||
| In order to mitigate the impact of this attack, the amount of | In order to mitigate the impact of this attack, the amount of | |||
| information stored for non-established connections should be | information stored for non-established connections should be reduced | |||
| reduced (ideally, non-synchronized connections should not require | (ideally, non-synchronized connections should not require any state | |||
| any state information to be maintained at the TCP performing the | information to be maintained at the TCP performing the passive OPEN). | |||
| passive OPEN). There are basically two mitigation techniques for | There are basically two mitigation techniques for this vulnerability: | |||
| this vulnerability: a syn-cache and syn-cookies. | a syn-cache and syn-cookies. | |||
| [Borman, 1997] and RFC 4987 [Eddy, 2007] contain a general | [Borman, 1997] and RFC 4987 [Eddy, 2007] contain a general discussion | |||
| discussion of SYN-flooding attacks and common mitigation | of SYN-flooding attacks and common mitigation approaches. | |||
| approaches. | ||||
| The syn-cache [Lemon, 2002] approach aims at reducing the amount | The syn-cache [Lemon, 2002] approach aims at reducing the amount of | |||
| of state information that is maintained for connections in the | state information that is maintained for connections in the SYN- | |||
| SYN-RECEIVED state, and allocates a full TCB only after the | RECEIVED state, and allocates a full TCB only after the connection | |||
| connection has transited to the ESTABLISHED state. | has transited to the ESTABLISHED state. | |||
| The syn-cookie [Bernstein, 1996] approach aims at completely | The syn-cookie [Bernstein, 1996] approach aims at completely | |||
| eliminating the need to maintain state information at the TCP | eliminating the need to maintain state information at the TCP | |||
| performing the passive OPEN, by encoding the most elementary | performing the passive OPEN, by encoding the most elementary | |||
| information required to complete the three-way handshake in the | information required to complete the three-way handshake in the | |||
| Sequence Number of the SYN/ACK segment that is sent in response to | Sequence Number of the SYN/ACK segment that is sent in response to | |||
| the received SYN segment. Thus, TCP is relieved from keeping | the received SYN segment. Thus, TCP is relieved from keeping state | |||
| state for connections in the SYN-RECEIVED state. | for connections in the SYN-RECEIVED state. | |||
| The syn-cookie approach has a number of drawbacks: | The syn-cookie approach has a number of drawbacks: | |||
| * Firstly, given the limited space in the Sequence Number field, | o Firstly, given the limited space in the Sequence Number field, it | |||
| it is not possible to encode all the information included in | is not possible to encode all the information included in the | |||
| the initial segment, such as, for example, support of Selective | initial segment, such as, for example, support of Selective | |||
| Acknowledgements (SACK). | Acknowledgements (SACK). | |||
| * Secondly, in the event that the Acknowledgement segment sent in | o Secondly, in the event that the Acknowledgement segment sent in | |||
| response to the SYN/ACK sent by the TCP that performed the | response to the SYN/ACK sent by the TCP that performed the passive | |||
| passive OPEN (i.e., the TCP server) were lost, the connection | OPEN (i.e., the TCP server) were lost, the connection would end up | |||
| would end up in the ESTABLISHED state on the client-side, but | in the ESTABLISHED state on the client-side, but in the CLOSED | |||
| in the CLOSED state on the server side. This scenario is | state on the server side. This scenario is normally handled in | |||
| normally handled in TCP by having the TCP server retransmit its | TCP by having the TCP server retransmit its SYN/ACK. However, if | |||
| SYN/ACK. However, if syn-cookies are enabled, there would be | syn-cookies are enabled, there would be no connection state | |||
| no connection state information on the server side, and thus | information on the server side, and thus the SYN/ACK would never | |||
| the SYN/ACK would never be retransmitted. This could lead to a | be retransmitted. This could lead to a scenario in which the | |||
| scenario in which the connection could remain in the | connection could remain in the ESTABLISHED state on the client | |||
| ESTABLISHED state on the client side, but in the CLOSED state | side, but in the CLOSED state at the server side, indefinitely. | |||
| at the server side, indefinitely. If the application protocol | If the application protocol was such that it required the client | |||
| was such that it required the client to wait for some data from | to wait for some data from the server (e.g., a greeting message) | |||
| the server (e.g., a greeting message) before sending any data | before sending any data to the server, a deadlock would take | |||
| to the server, a deadlock would take place, with the client | place, with the client application waiting for such server data, | |||
| application waiting for such server data, and the server | and the server waiting for the TCP three-way handshake to | |||
| waiting for the TCP three-way handshake to complete. | complete. | |||
| * Thirdly, unless the function used to encode information in the | o Thirdly, unless the function used to encode information in the | |||
| SYN/ACK packet is cryptographically strong, an attacker could | SYN/ACK packet is cryptographically strong, an attacker could | |||
| forge TCP connections in the ESTABLISHED state by forging ACK | forge TCP connections in the ESTABLISHED state by forging ACK | |||
| segments that would be considered as "legitimate" by the | segments that would be considered as "legitimate" by the receiving | |||
| receiving TCP. | TCP. | |||
| * Fourthly, in those scenarios in which establishment of new | o Fourthly, in those scenarios in which establishment of new | |||
| connections is blocked by simply dropping segments with the SYN | connections is blocked by simply dropping segments with the SYN | |||
| bit set, use of SYN cookies could allow an attacker to bypass | bit set, use of SYN cookies could allow an attacker to bypass the | |||
| the firewall rules, as a connection could be established by | firewall rules, as a connection could be established by forging an | |||
| forging an ACK segment with the correct values, without the | ACK segment with the correct values, without the need to set | |||
| need to set the SYN bit. | the SYN bit. | |||
| As a result, syn-cookies are usually not employed as a first line | As a result, syn-cookies are usually not employed as a first line of | |||
| of defense against SYN-flood attacks, but are used only as the last | defense against SYN-flood attacks, but are used only as the last resort to | |||
| resort to cope with them. For example, some TCP implementations | cope with them. For example, some TCP implementations enable syn- | |||
| enable syn-cookies only after a certain number of TCBs has been | cookies only after a certain number of TCBs has been allocated for | |||
| allocated for connections in the SYN-RECEIVED state. We recommend | connections in the SYN-RECEIVED state. We recommend this | |||
| this implementation technique, with a syn-cache enabled by | implementation technique, with a syn-cache enabled by default, and | |||
| default, and use of syn-cookies triggered, for example, when the | use of syn-cookies triggered, for example, when the limit of TCBs for | |||
| limit of TCBs for non-synchronized connections with a given port | non-synchronized connections with a given port number has been | |||
| number has been reached. | reached. | |||
| It is interesting to note that a SYN-flood attack should only | It is interesting to note that a SYN-flood attack should only affect | |||
| affect the establishment of new connections. A number of books | the establishment of new connections. A number of books and online | |||
| and online documents seem to assume that TCP will not be able to | documents seem to assume that TCP will not be able to respond to any | |||
| respond to any TCP segment that is meant for a TCP port that is | TCP segment that is meant for a TCP port that is being SYN-flooded | |||
| being SYN-flooded (e.g., respond with an RST segment upon receipt | (e.g., respond with an RST segment upon receipt of a TCP segment that | |||
| of a TCP segment that refers to a non-existent TCP connection). | refers to a non-existent TCP connection). While SYN-flooding attacks | |||
| While SYN-flooding attacks have been successfully exploited in the | have been successfully exploited in the past for achieving such a | |||
| past for achieving such a goal [Shimomura, 1995], as clarified by | goal [Shimomura, 1995], as clarified by RFC 1948 [Bellovin, 1996] the | |||
| RFC 1948 [Bellovin, 1996] the effectiveness of SYN flood attacks | effectiveness of SYN flood attacks to silence a TCP implementation | |||
| to silence a TCP implementation arose as a result of a bug in the | arose as a result of a bug in the 4.4BSD TCP implementation [Wright | |||
| 4.4BSD TCP implementation [Wright and Stevens, 1994], rather than | and Stevens, 1994], rather than from a theoretical property of SYN- | |||
| from a theoretical property of SYN-flood attacks themselves. | flood attacks themselves. Therefore, those TCP implementations that | |||
| Therefore, those TCP implementations that do not suffer from such | do not suffer from such a bug should not be silenced as a result of a | |||
| a bug should not be silenced as a result of a SYN-flood attack. | SYN-flood attack. | |||
| [Zúquete, 2002] describes a mechanism that could theoretically | [Zúquete, 2002] describes a mechanism that could theoretically improve | |||
| improve the functionality of SYN cookies. It exploits the TCP | the functionality of SYN cookies. It exploits the TCP "simultaneous | |||
| "simultaneous open" mechanism, as illustrated in Figure 5. | open" mechanism, as illustrated in Figure 5. | |||
| See Figure 5, on page 46 of the UK CPNI document. | See Figure 5, on page 46 of the UK CPNI document. | |||
| Use of TCP simultaneous open for handling SYN floods | Use of TCP simultaneous open for handling SYN floods | |||
| In line 1, TCP A initiates the connection-establishment phase by | In line 1, TCP A initiates the connection-establishment phase by | |||
| sending a SYN segment to TCP B. In line 2, TCP B creates a SYN | sending a SYN segment to TCP B. In line 2, TCP B creates a SYN cookie | |||
| cookie as described by [Bernstein, 1996], but does not set the ACK | as described by [Bernstein, 1996], but does not set the ACK bit of | |||
| bit of the segment it sends (thus really sending a SYN segment, | the segment it sends (thus really sending a SYN segment, rather than | |||
| rather than a SYN/ACK). This "fools" TCP A into thinking that | a SYN/ACK). This "fools" TCP A into thinking that both SYN segments | |||
| both SYN segments "have crossed each other in the network" as if a | "have crossed each other in the network" as if a "simultaneous open" | |||
| "simultaneous open" scenario had taken place. As a result, in | scenario had taken place. As a result, in line 3 TCP A sends a SYN/ | |||
| line 3 TCP A sends a SYN/ACK segment containing the same options | ACK segment containing the same options that were contained in the | |||
| that were contained in the original SYN segment. In line 4, upon | original SYN segment. In line 4, upon receipt of this segment, TCP B | |||
| receipt of this segment, TCP B processes the cookie encoded in the | processes the cookie encoded in the ACK field as if it had been the | |||
| ACK field as if it had been the result of a traditional SYN cookie | result of a traditional SYN cookie scenario, and moves the connection | |||
| scenario, and moves the connection into the ESTABLISHED state. In | into the ESTABLISHED state. In line 5, TCP B sends a SYN/ACK | |||
| line 5, TCP B sends a SYN/ACK segment, which causes the connection | segment, which causes the connection at TCP A to move into the | |||
| at TCP A to move into the ESTABLISHED state. In line 6, TCP A | ESTABLISHED state. In line 6, TCP A sends a data segment on the | |||
| sends a data segment on the connection. | connection. | |||
| While this mechanism would work in theory, unfortunately there are | While this mechanism would work in theory, unfortunately there are a | |||
| a number of factors that prevent it from being usable in real | number of factors that prevent it from being usable in real network | |||
| network environments: | environments: | |||
| * Some systems are not able to perform the "simultaneous open" | o Some systems are not able to perform the "simultaneous open" | |||
| operation specified in RFC 793, and thus the connection | operation specified in RFC 793, and thus the connection | |||
| establishment will fail. | establishment will fail. | |||
| * Some firewalls might prevent the establishment of TCP | o Some firewalls might prevent the establishment of TCP connections | |||
| connections that rely on the "simultaneous open" mechanism | that rely on the "simultaneous open" mechanism (e.g., a given | |||
| (e.g., a given firewall might be allowing incoming SYN/ACK | firewall might be allowing incoming SYN/ACK segments, but not | |||
| segments, but not outgoing SYN/ACK segments). | outgoing SYN/ACK segments). | |||
| Therefore, we do not recommend implementation of this mechanism | Therefore, we do not recommend implementation of this mechanism for | |||
| for mitigating SYN-flood attacks. | mitigating SYN-flood attacks. | |||
| 5.2. Connection forgery | 5.2. Connection forgery | |||
| The process of causing a TCP connection to be illegitimately | The process of causing a TCP connection to be illegitimately | |||
| established between two arbitrary remote peers is usually referred to | established between two arbitrary remote peers is usually referred to | |||
| as "connection spoofing" or "connection forgery". This can have a | as "connection spoofing" or "connection forgery". This can have a | |||
| great negative impact when systems establish some sort of trust | great negative impact when systems establish some sort of trust | |||
| relationships based on the IP addresses used to establish a TCP | relationships based on the IP addresses used to establish a TCP | |||
| connection [daemon9 et al, 1996]. | connection [daemon9 et al, 1996]. | |||
| skipping to change at page 45, line 24 ¶ | skipping to change at page 29, line 22 ¶ | |||
| recommended that systems disable IP Source Routing by default, or at | recommended that systems disable IP Source Routing by default, or at | |||
| the very least, they disable source routing for IP packets that | the very least, they disable source routing for IP packets that | |||
| encapsulate TCP segments. | encapsulate TCP segments. | |||
| The IPv6 Routing Header Type 0, which provides a similar | The IPv6 Routing Header Type 0, which provides a similar | |||
| functionality to that provided by IPv4 source routing, has been | functionality to that provided by IPv4 source routing, has been | |||
| officially deprecated by RFC 5095 [Abley et al, 2007]. | officially deprecated by RFC 5095 [Abley et al, 2007]. | |||
| 5.3. Connection-flooding attack | 5.3. Connection-flooding attack | |||
| NOTE: THIS SECTION IS BEING EDITED. RFC2119-LANGUAGE IS BEING | ||||
| REMOVED. | ||||
| 5.3.1. Vulnerability | 5.3.1. Vulnerability | |||
| The creation and maintenance of a TCP connection requires system | The creation and maintenance of a TCP connection requires system | |||
| memory to maintain shared state between the local and the remote TCP. | memory to maintain shared state between the local and the remote TCP. | |||
| As system memory is a finite resource, there is a limit on the number | As system memory is a finite resource, there is a limit on the number | |||
| of TCP connections that a system can maintain at any time. When the | of TCP connections that a system can maintain at any time. When the | |||
| TCP API is employed to create a TCP connection with a remote peer, it | TCP API is employed to create a TCP connection with a remote peer, it | |||
| allocates system memory for maintaining shared state with the remote | allocates system memory for maintaining shared state with the remote | |||
| TCP peer, and thus the resulting connection would tie a similar | TCP peer, and thus the resulting connection would tie a similar | |||
| amount of resources at the remote host as at the local host. | amount of resources at the remote host as at the local host. | |||
| skipping to change at page 48, line 36 ¶ | skipping to change at page 32, line 36 ¶ | |||
| Some firewalls can be configured to limit the number of | Some firewalls can be configured to limit the number of | |||
| simultaneous connections that any system can maintain with a | simultaneous connections that any system can maintain with a | |||
| specific system and/or service at any given time. Limiting the | specific system and/or service at any given time. Limiting the | |||
| number of simultaneous connections that each system can establish | number of simultaneous connections that each system can establish | |||
| with a specific system and service would effectively limit the | with a specific system and service would effectively limit the | |||
| possibility of an attacker that controls a single IP address to | possibility of an attacker that controls a single IP address to | |||
| exhaust system resources at the attacked system/service. | exhaust system resources at the attacked system/service. | |||
| 5.4. Firewall-bypassing techniques | 5.4. Firewall-bypassing techniques | |||
| TCP MUST silently drop those TCP segments that have both the SYN and | [draft-gont-tcpm-tcp-sanity-checks-00.txt] discusses how packets with | |||
| the RST flags set. | both the SYN and RST bits set have been employed in the wild to | |||
| | bypass firewall rules, and provides advice in this area. | |||
| DISCUSSION: | ||||
| Some firewalls block incoming TCP connections by blocking only | ||||
| incoming SYN segments. However, there are inconsistencies in how | ||||
| different TCP implementations handle SYN segments that have | ||||
| additional flags set, which may allow an attacker to bypass | ||||
| firewall rules [US-CERT, 2003b]. | ||||
| For example, some firewalls have been known to mistakenly allow | ||||
| incoming SYN segments if they also have the RST bit set. As some | ||||
| TCP implementations will create a new connection in response to a | ||||
| TCP segment with both the SYN and RST bits set, an attacker could | ||||
| bypass the firewall rules and establish a connection with a | ||||
| "protected" system by setting the RST bit in his SYN segments. | ||||
| Here we advise TCP implementations to silently drop those TCP | ||||
| segments that have both the SYN and the RST flags set. | ||||
| 6. Connection-termination mechanism | 6. Connection-termination mechanism | |||
| 6.1. FIN-WAIT-2 flooding attack | 6.1. FIN-WAIT-2 flooding attack | |||
| 6.1.1. Vulnerability | 6.1.1. Vulnerability | |||
| TCP implements a connection-termination mechanism that is employed | TCP implements a connection-termination mechanism that is employed | |||
| for the graceful termination of a TCP connection. This mechanism | for the graceful termination of a TCP connection. This mechanism | |||
| usually consists of the exchange of four segments. Figure 6 | usually consists of the exchange of four segments. Figure 6 | |||
| skipping to change at page 49, line 40 ¶ | skipping to change at page 33, line 25 ¶ | |||
| As a result, an attacker could establish a large number of | As a result, an attacker could establish a large number of | |||
| connections with the target system, and cause it to close each of them. | connections with the target system, and cause it to close each of them. | |||
| For each connection, once the target system has sent its FIN segment, | For each connection, once the target system has sent its FIN segment, | |||
| the attacker would acknowledge the receipt of this segment, but would | the attacker would acknowledge the receipt of this segment, but would | |||
| send no further segments on that connection. As a result, an | send no further segments on that connection. As a result, an | |||
| attacker could cause the corresponding system resources (e.g., the | attacker could cause the corresponding system resources (e.g., the | |||
| system memory used for storing the TCB) to remain allocated, without the | system memory used for storing the TCB) to remain allocated, without the | |||
| need to send any further packets. | need to send any further packets. | |||
| While the CLOSE command described in RFC 793 [Postel, 1981c] simply | While the CLOSE command described in RFC 793 [RFC0793] simply signals | |||
| signals the remote TCP end-point that this TCP has finished sending | the remote TCP end-point that this TCP has finished sending data | |||
| data (i.e., it closes only one direction of the data transfer), the | (i.e., it closes only one direction of the data transfer), the | |||
| close() system-call available in most operating systems has different | close() system-call available in most operating systems has different | |||
| semantics: it marks the corresponding file descriptor as closed (and | semantics: it marks the corresponding file descriptor as closed (and | |||
| thus it is no longer usable), and assigns the operating system the | thus it is no longer usable), and assigns the operating system the | |||
| responsibility to deliver any queued data to the remote TCP peer and | responsibility to deliver any queued data to the remote TCP peer and | |||
| to terminate the TCP connection. This makes the FIN-WAIT-2 state | to terminate the TCP connection. This makes the FIN-WAIT-2 state | |||
| particularly attractive for performing memory exhaustion attacks, as | particularly attractive for performing memory exhaustion attacks, as | |||
| even if the application running on top of TCP were imposing limits on | even if the application running on top of TCP were imposing limits on | |||
| the maximum number of ongoing connections, and/or time limits on the | the maximum number of ongoing connections, and/or time limits on the | |||
| function calls performed on TCP connections, that application would | function calls performed on TCP connections, that application would | |||
| be unable to enforce these limits on the FIN-WAIT-2 state. | be unable to enforce these limits on the FIN-WAIT-2 state. | |||
| skipping to change at page 56, line 35 ¶ | skipping to change at page 40, line 27 ¶ | |||
| window to cause the target system to tie system memory to the TCP | window to cause the target system to tie system memory to the TCP | |||
| retransmission buffer, it is hard to perform any useful statistics | retransmission buffer, it is hard to perform any useful statistics | |||
| from the advertised window. While it is tempting to enforce a limit | from the advertised window. While it is tempting to enforce a limit | |||
| on the length of the persist state (see Section 3.7.2 of this | on the length of the persist state (see Section 3.7.2 of this | |||
| document), an attacker could simply open the window (i.e., advertise | document), an attacker could simply open the window (i.e., advertise | |||
| a TCP window larger than zero) from time to time to prevent this | a TCP window larger than zero) from time to time to prevent this | |||
| enforced limit from causing his malicious connections to be aborted. | enforced limit from causing his malicious connections to be aborted. | |||
| 7.2. TCP segment reassembly buffer | 7.2. TCP segment reassembly buffer | |||
| TCP MAY discard out-of-order data when system-memory exhaustion is | TCP buffers out-of-order segments to more efficiently handle the | |||
| imminent. | occurrence of packet reordering and segment loss. When out-of-order | |||
| data are received, a "hole" momentarily exists in the data stream | ||||
| DISCUSSION: | which must be filled before the received data can be delivered to the | |||
| application making use of TCP's services. This situation can be | ||||
| TCP buffers out-of-order segments to more efficiently handle the | exploited by an attacker, who could intentionally create a hole in | |||
| occurrence of packet reordering and segment loss. When out-of- | the data stream by sending a number of segments with a sequence | |||
| order data are received, a "hole" momentarily exists in the data | number larger than the next sequence number expected (RCV.NXT) by the | |||
| stream which must be filled before the received data can be | attacked TCP. Thus, the attacked TCP would tie system memory to | |||
| delivered to the application making use of TCP's services. This | buffer the out-of-order segments, without being able to hand the | |||
| situation can be exploited by an attacker, who could | received data to the corresponding application. | |||
| intentionally create a hole in the data stream by sending a number | ||||
| of segments with a sequence number larger than the next sequence | ||||
| number expected (RCV.NXT) by the attacked TCP. Thus, the attacked | ||||
| TCP would tie system memory to buffer the out-of-order segments, | ||||
| without being able to hand the received data to the corresponding | ||||
| application. | ||||
| If a large number of such connections were created, system memory | If a large number of such connections were created, system memory | |||
| could be exhausted, precluding the attacked TCP from servicing new | could be exhausted, precluding the attacked TCP from servicing new | |||
| connections and/or continuing to service TCP connections previously | connections and/or continuing to service TCP connections previously | |||
| established. | established. | |||
| Fortunately, these attacks can be easily mitigated, at the expense | Fortunately, these attacks can be easily mitigated, at the expense of | |||
| of degrading the performance of possibly legitimate connections. | degrading the performance of possibly legitimate connections. When | |||
| When out-of-order data are received, an Acknowledgement segment is | out-of-order data are received, an Acknowledgement segment is sent | |||
| sent with the next sequence number expected (RCV.NXT). This means | with the next sequence number expected (RCV.NXT). This means that | |||
| that receipt of the out-of-order data will not be actually | receipt of the out-of-order data will not be actually acknowledged by | |||
| acknowledged by the TCP's cumulative Acknowledgement Number. As a | the TCP's cumulative Acknowledgement Number. As a result, a TCP is | |||
| result, a TCP is free to discard any data that have been received | free to discard any data that have been received out-of-order, | |||
| out-of-order, without affecting the reliability of the data | without affecting the reliability of the data transfer. Given the | |||
| transfer. Given the performance implications of discarding out- | performance implications of discarding out-of-order segments for | |||
| of-order segments for legitimate connections, this pruning policy | legitimate connections, this pruning policy should be applied only if | |||
| should be applied only if memory exhaustion is imminent. | memory exhaustion is imminent. | |||
| As a result of discarding the out-of-order data, these data will | As a result of discarding the out-of-order data, these data will need | |||
| need to be unnecessarily retransmitted. Additionally, a loss | to be unnecessarily retransmitted. Additionally, a loss event will | |||
| event will be detected by the sending TCP, and thus the slow start | be detected by the sending TCP, and thus the slow start phase of | |||
| phase of TCP's congestion control will be entered, thus reducing | TCP's congestion control will be entered, thus reducing the data | |||
| the data transfer rate of the connection. | transfer rate of the connection. | |||
| It is interesting to note that this pruning policy could be | It is interesting to note that this pruning policy could be applied | |||
| applied even if Selective Acknowledgements (SACK) (specified in | even if Selective Acknowledgements (SACK) (specified in RFC 2018 | |||
| RFC 2018 [Mathis et al, 1996]) are in use, as SACK provides only | [Mathis et al, 1996]) are in use, as SACK provides only advisory | |||
| advisory information, and does not preclude the receiving TCP from | information, and does not preclude the receiving TCP from discarding | |||
| discarding data that have been previously selectively-acknowledged | data that have been previously selectively-acknowledged by means of | |||
| by means of TCP's SACK option, but not acknowledged by TCP's | TCP's SACK option, but not acknowledged by TCP's cumulative | |||
| cumulative Acknowledgement Number. | Acknowledgement Number. | |||
| There are a number of ways in which the pruning policy could be | There are a number of ways in which the pruning policy could be | |||
| triggered. For example, when out-of-order data are received, a | triggered. For example, when out-of-order data are received, a timer | |||
| timer could be set, and the sequence number of the out-of-order | could be set, and the sequence number of the out-of-order data could | |||
| data could be recorded. If the hole were filled before the timer | be recorded. If the hole were filled before the timer expires, the | |||
| expires, the timer would be turned off. However, if the timer | timer would be turned off. However, if the timer expired before the | |||
| expired before the hole were filled, all the out-of-order segments | hole were filled, all the out-of-order segments of the corresponding | |||
| of the corresponding connection would be discarded. This would be | connection would be discarded. This would be a proactive counter- | |||
| a proactive counter-measure for attacks that aim at exhausting the | measure for attacks that aim at exhausting the receive buffers. | |||
| receive buffers. | ||||
| In addition, an implementation could incorporate reactive | In addition, an implementation could incorporate reactive mechanisms | |||
| mechanisms for more carefully controlling buffer allocation when | for more carefully controlling buffer allocation when some predefined | |||
| some predefined buffer allocation threshold was reached. At that | buffer allocation threshold was reached. At that point, pruning | |||
| point, pruning policies would be applied. | policies would be applied. | |||
| A number of mechanisms can aid in the process of freeing system | A number of mechanisms can aid in the process of freeing system | |||
| resources. For example, a table of network prefixes corresponding | resources. For example, a table of network prefixes corresponding to | |||
| to the IP addresses of TCP peers that have ongoing TCP connections | the IP addresses of TCP peers that have ongoing TCP connections could | |||
| could record the aggregate amount of out-of-order data currently | record the aggregate amount of out-of-order data currently buffered | |||
| buffered for those connections. When the pruning policy was | for those connections. When the pruning policy was triggered, TCP | |||
| triggered, TCP connections with hosts that have network prefixes | connections with hosts that have network prefixes with large | |||
| with large aggregate out-of-order buffered data could be selected | aggregate out-of-order buffered data could be selected first for | |||
| first for pruning the out-of-order segments. | pruning the out-of-order segments. | |||
| Alternatively, if TCP segments were de-multiplexed by means of a | Alternatively, if TCP segments were de-multiplexed by means of a hash | |||
| hash table (as is currently the case in many TCP | table (as is currently the case in many TCP implementations), a | |||
| implementations), a counter could be held at each entry of the | counter could be held at each entry of the hash table that would | |||
| hash table that would record the aggregate out-of-order data | record the aggregate out-of-order data currently buffered for those | |||
| currently buffered for those connections belonging to that hash | connections belonging to that hash table entry. When the pruning | |||
| table entry. When the pruning policy is triggered, the out-of- | policy is triggered, the out-of-order data corresponding to those | |||
| order data corresponding to those connections linked by the hash | connections linked by the hash table entry with the largest amount of | |||
| table entry with the largest amount of aggregate out-of-order data | aggregate out-of-order data could be pruned first. It is important | |||
| could be pruned first. It is important that this hash is not | that this hash is not computable by an attacker, as this would allow | |||
| computable by an attacker, as this would allow him to maliciously | him to maliciously cause the performance of specific connections to | |||
| cause the performance of specific connections to be degraded. | be degraded. That is, given a four-tuple that identifies a | |||
| That is, given a four-tuple that identifies a connection, an | connection, an attacker should not be able to compute the | |||
| attacker should not be able to compute the corresponding hash | corresponding hash value used by the target system to de-multiplex | |||
| value used by the target system to de-multiplex incoming TCP | incoming TCP segments to that connection. | |||
| segments to that connection. | ||||
| Another variant of a resource exhaustion attack against TCP's | Another variant of a resource exhaustion attack against TCP's segment | |||
| segment reassembly mechanism would target the data structures used | reassembly mechanism would target the data structures used to link | |||
| to link the different holes in a data stream. For example, an | the different holes in a data stream. For example, an attacker could | |||
| attacker could send a burst of one-byte segments, leaving a one-byte | send a burst of one-byte segments, leaving a one-byte hole between each | |||
| hole between each of the data bytes sent. Depending on the data | of the data bytes sent. Depending on the data structures used for | |||
| structures used for holding and linking together each of the data | holding and linking together each of the data segments, such an | |||
| segments, such an attack might waste a large amount of system | attack might waste a large amount of system memory by exploiting the | |||
| memory by exploiting the overhead needed to store and link together | overhead needed to store and link together each of these one-byte | |||
| each of these one-byte segments. | segments. | |||
| For example, if a linked-list is used for holding and linking each | For example, if a linked-list is used for holding and linking each of | |||
| of the data segments, each of the involved data structures could | the data segments, each of the involved data structures could involve | |||
| involve one byte of kernel memory for storing the received data | one byte of kernel memory for storing the received data byte (the TCP | |||
| byte (the TCP payload), plus 4 bytes (32 bits) for storing a | payload), plus 4 bytes (32 bits) for storing a pointer to the next | |||
| pointer to the next node in the linked-list. Additionally, while | node in the linked-list. Additionally, while such a data structure | |||
| such a data structure would require only a few bytes of kernel | would require only a few bytes of kernel memory, it could result in | |||
| memory, it could result in the allocation of a whole memory page, | the allocation of a whole memory page, thus consuming much more | |||
| thus consuming much more memory than expected. | memory than expected. | |||
| Therefore, implementations should enforce a limit on the number of | Therefore, implementations should enforce a limit on the number of | |||
| holes that are allowed in the received data stream at any given | holes that are allowed in the received data stream at any given time. | |||
| time. When such a limit is reached, incoming TCP segments which | When such a limit is reached, incoming TCP segments which would | |||
| would create new holes would be silently dropped. Measurements in | create new holes would be silently dropped. Measurements in | |||
| [Dharmapurikar and Paxson, 2005] indicate that the vast | [Dharmapurikar and Paxson, 2005] indicate that the vast majority | |||
| majority of TCP connections have at most a single hole at any | of TCP connections have at most a single hole at any given time. A | |||
| given time. A limit of 16 holes for each connection would | limit of 16 holes for each connection would accommodate even most of | |||
| accommodate even most of the very unusual cases in which there can | the very unusual cases in which there can be more than one hole in the | |||
| be more than one hole in the data stream at a given time. | data stream at a given time. | |||
| [US-CERT, 2004a] is a security advisory about a Denial of Service | [US-CERT, 2004a] is a security advisory about a Denial of Service | |||
| vulnerability resulting from a TCP implementation that did not | vulnerability resulting from a TCP implementation that did not | |||
| enforce limits on the number of segments stored in the TCP | enforce limits on the number of segments stored in the TCP reassembly | |||
| reassembly buffer. | buffer. | |||
| Section 8 of this document describes the security implications of | Section 8 of this document describes the security implications of the | |||
| the TCP segment reassembly algorithm. | TCP segment reassembly algorithm. | |||
| 7.3. Automatic buffer tuning mechanisms | 7.3. Automatic buffer tuning mechanisms | |||
| NOTE: THIS SECTION IS BEING EDITED. PLEASE DISREGARD THE RFC2119- | ||||
| LANGUAGE RECOMMENDATIONS. | ||||
| 7.3.1. Automatic send-buffer tuning mechanisms | 7.3.1. Automatic send-buffer tuning mechanisms | |||
| A TCP implementing an automatic send-buffer tuning mechanism SHOULD | A TCP implementing an automatic send-buffer tuning mechanism SHOULD | |||
| enforce the following limit on the size of the send buffer of each | enforce the following limit on the size of the send buffer of each | |||
| TCP connection: | TCP connection: | |||
| send_buffer_size <= send_buffer_pool / (min_buffer_size * max_connections) | send_buffer_size <= send_buffer_pool / (min_buffer_size * max_connections) | |||
| where | where | |||
| skipping to change at page 63, line 37 ¶ | skipping to change at page 47, line 28 ¶ | |||
| It is worth noting that TCP Selective Acknowledgements (SACK) are | It is worth noting that TCP Selective Acknowledgements (SACK) are | |||
| advisory, in the sense that a TCP that has SACKed (but not ACKed) | advisory, in the sense that a TCP that has SACKed (but not ACKed) | |||
| a block of data is free to discard that block, and expect the TCP | a block of data is free to discard that block, and expect the TCP | |||
| sender to retransmit it when the retransmission timer of the | sender to retransmit it when the retransmission timer of the | |||
| peer TCP expires. | peer TCP expires. | |||
| 8. TCP segment reassembly algorithm | 8. TCP segment reassembly algorithm | |||
| 8.1. Problems that arise from ambiguity in the reassembly process | 8.1. Problems that arise from ambiguity in the reassembly process | |||
| If a TCP segment is received containing some data bytes that had | A security consideration that should be made for the TCP segment | |||
| already been received, the first copy of those data SHOULD be used | reassembly algorithm is that of data stream consistency between the | |||
| for reassembling the application data stream. | host performing the TCP segment reassembly, and a Network Intrusion | |||
| Detection System (NIDS) being employed to monitor the host in | ||||
| DISCUSSION: | question. | |||
| A security consideration that should be made for the TCP segment | ||||
| reassembly algorithm is that of data stream consistency between | ||||
| the host performing the TCP segment reassembly, and a Network | ||||
| Intrusion Detection System (NIDS) being employed to monitor the | ||||
| host in question. | ||||
| In the event a TCP segment was unnecessarily retransmitted, or | In the event a TCP segment was unnecessarily retransmitted, or there | |||
| there was packet duplication in any of the intervening networks, a | was packet duplication in any of the intervening networks, a TCP | |||
| TCP might get more than one copy of the same data. Also, as TCP | might get more than one copy of the same data. Also, as TCP segments | |||
| segments can be re-packetized when they are retransmitted, a given | can be re-packetized when they are retransmitted, a given TCP segment | |||
| TCP segment might partially overlap data already received in | might partially overlap data already received in earlier segments. | |||
| earlier segments. In all these cases, the question arises about | In all these cases, the question arises about which of the copies of | |||
| which of the copies of the received data should be used when | the received data should be used when reassembling the data stream. | |||
| reassembling the data stream. In legitimate and normal | In legitimate and normal circumstances, all copies would be | |||
| circumstances, all copies would be identical, and the same data | identical, and the same data stream would be obtained regardless of | |||
| stream would be obtained regardless of which copy of the data was | which copy of the data was used. However, an attacker could | |||
| used. However, an attacker could maliciously send overlapping | maliciously send overlapping segments containing different data, with | |||
| segments containing different data, with the intent of evading a | the intent of evading a Network Intrusion Detection System (NIDS), | |||
| Network Intrusion Detection System (NIDS), which might reassemble | which might reassemble the received TCP segments differently than the | |||
| the received TCP segments differently than the monitored system. | monitored system. [Ptacek and Newsham, 1998] provides a detailed | |||
| [Ptacek and Newsham, 1998] provides a detailed discussion of these | discussion of these issues. | |||
| issues. | ||||
| As suggested in Section 3.9 of RFC 793 [Postel, 1981c], if a TCP | As suggested in Section 3.9 of RFC 793 [RFC0793], if a TCP segment | |||
| segment arrives containing some data bytes that have already been | arrives containing some data bytes that have already been received, | |||
| received, the first copy of those data should be used for | the first copy of those data should be used for reassembling the | |||
| reassembling the application data stream. It should be noted that | application data stream. It should be noted that while convergence | |||
| while convergence to this policy might prevent some cases of | to this policy might prevent some cases of ambiguity in the | |||
| ambiguity in the reassembly process, there are a number of other | reassembly process, there are a number of other techniques that an | |||
| techniques that an attacker could still exploit to evade a NIDS | attacker could still exploit to evade a NIDS [CPNI, 2008]. These | |||
| [CPNI, 2008]. These techniques can generally be defeated if the | techniques can generally be defeated if the NIDS is placed in-line | |||
| NIDS is placed in-line with the monitored system, thus allowing | with the monitored system, thus allowing the NIDS to normalize the | |||
| the NIDS to normalize the network traffic or apply some other | network traffic or apply some other policy that could ensure | |||
| policy that could ensure consistency between the result of the | consistency between the result of the segment reassembly process | |||
| segment reassembly process obtained by the monitored host and that | obtained by the monitored host and that obtained by the NIDS. | |||
| obtained by the NIDS. | ||||
| [CERT, 2003] and [CORE, 2003] are advisories about a heap buffer | [CERT, 2003] and [CORE, 2003] are advisories about a heap buffer | |||
| overflow in a popular Network Intrusion Detection System resulting | overflow in a popular Network Intrusion Detection System resulting | |||
| from incorrect sequence number calculations in its TCP stream- | from incorrect sequence number calculations in its TCP stream- | |||
| reassembly module. | reassembly module. | |||
| 9. TCP Congestion Control | 9. TCP Congestion Control | |||
| NOTE: THIS SECTION IS BEING EDITED. | ||||
| TCP implements two algorithms, "slow start" and "congestion | TCP implements two algorithms, "slow start" and "congestion | |||
| avoidance", for controlling the rate at which data is transmitted on | avoidance", for controlling the rate at which data is transmitted on | |||
| a TCP connection [Allman et al, 1999]. These algorithms require the | a TCP connection [RFC5681]. | |||
| addition of two variables as part of TCP per-connection state: cwnd | ||||
| and ssthresh. | ||||
| The congestion window (cwnd) is a sender-side limit on the amount of | ||||
| outstanding data that the sender can have at any time, while the | ||||
| receiver's advertised window (rwnd) is a receiver-side limit on the | ||||
| amount of outstanding data. The minimum of cwnd and rwnd governs | ||||
| data transmission. | ||||
| Another state variable, the slow-start threshold (ssthresh), is used | ||||
| to determine whether it is the slow start or the congestion avoidance | ||||
| algorithm that should control data transmission. When cwnd < | ||||
| ssthresh, "slow start" governs data transmission, and the congestion | ||||
| window (cwnd) is exponentially increased. When cwnd > ssthresh, | ||||
| "congestion avoidance" governs data transmission, and the congestion | ||||
| window (cwnd) is only linearly increased. | ||||
| As specified in RFC 2581 [Allman et al, 1999], when cwnd and ssthresh | ||||
| are equal the sender may use either slow start or congestion | ||||
| avoidance. | ||||
| During slow start, TCP increments cwnd by at most SMSS bytes for each | ||||
| ACK received that acknowledges new data. During congestion | ||||
| avoidance, cwnd is incremented by 1 full-sized segment per round-trip | ||||
| time (RTT), until congestion is detected. | ||||
| Additionally, TCP uses two algorithms, Fast Retransmit and Fast | ||||
| Recovery, to mitigate the effects of packet loss. The "Fast | ||||
| Retransmit" algorithm infers packet loss when three Duplicate | ||||
| Acknowledgements (DupACKs) are received. | ||||
| The value "three" is meant to allow for fast-retransmission of | ||||
| "missing" data, while avoiding network packet reordering from | ||||
| triggering loss recovery. | ||||
| Once packet loss is detected by the receipt of three duplicate-ACKs, | ||||
| the "Fast Recovery" algorithm governs the transfer of new data until | ||||
| a non-duplicate ACK is received that acknowledges the receipt of new | ||||
| data. The Fast Retransmit and Fast Recovery algorithms are usually | ||||
| implemented together, as follows (from RFC 2581): | ||||
| o When the third duplicate ACK is received, set ssthresh to no more | ||||
| than the value given in the equation: ssthresh = max (FlightSize / | ||||
| 2, 2*SMSS) | ||||
| o Retransmit the lost segment and set cwnd to ssthresh plus 3*SMSS. | ||||
| This artificially "inflates" the congestion window by the number | ||||
| of segments (three) that have left the network and which the | ||||
| receiver has buffered. | ||||
| o For each additional duplicate ACK received, increment cwnd by | ||||
| SMSS. This artificially inflates the congestion window in order | ||||
| to reflect the additional segment that has left the network. | ||||
| o Transmit a segment, if allowed by the new value of cwnd and the | ||||
| receiver's advertised window. | ||||
| o When the next ACK arrives that acknowledges new data, set cwnd to | ||||
| ssthresh (the value set in step 1). This is termed "deflating" | ||||
| the window. | ||||
| 9.1. Congestion control with misbehaving receivers | 9.1. Congestion control with misbehaving receivers | |||
| [Savage et al, 1999] describes a number of ways in which TCP's | [Savage et al, 1999] describes a number of ways in which TCP's | |||
| congestion control mechanisms can be exploited by a misbehaving TCP | congestion control mechanisms can be exploited by a misbehaving TCP | |||
| receiver to obtain more than its fair share of bandwidth. The | receiver to obtain more than its fair share of bandwidth. The | |||
| following subsections provide a brief discussion of these | following subsections provide a brief discussion of these | |||
| vulnerabilities, along with the possible countermeasures. | vulnerabilities, along with the possible countermeasures. | |||
| 9.1.1. ACK division | 9.1.1. ACK division | |||
| TCP SHOULD increase cwnd by one SMSS only when a valid ACK covers the | Given that TCP updates cwnd based on the number of duplicate ACKs it | |||
| entire data segment sent | receives, rather than on the amount of data that each ACK is actually | |||
| acknowledging, a malicious TCP receiver could cause the TCP sender to | ||||
| (note: or should we recommend the other counter-measure (i.e., | illegitimately increase its congestion window by acknowledging a data | |||
| implementation of ABC?) | segment with a number of separate Acknowledgements, each covering a | |||
| distinct piece of the received data segment. | ||||
| DISCUSSION: | ||||
| Given that TCP updates cwnd based on the number of duplicate ACKs | ||||
| it receives, rather than on the amount of data that each ACK is | ||||
| actually acknowledging, a malicious TCP receiver could cause the | ||||
| TCP sender to illegitimately increase its congestion window by | ||||
| acknowledging a data segment with a number of separate | ||||
| Acknowledgements, each covering a distinct piece of the received | ||||
| data segment. | ||||
| See Figure 7, on page 64 of the UK CPNI document. | See Figure 7, on page 64 of the UK CPNI document. | |||
| ACK division attack | ACK division attack | |||
| [Savage et al, 1999] describes two possible countermeasures for | [Savage et al, 1999] describes two possible countermeasures for this | |||
| this vulnerability. One of them is to increment cwnd not by a | vulnerability. One of them is to increment cwnd not by a full SMSS, | |||
| full SMSS, but proportionally to the amount of data being | but proportionally to the amount of data being acknowledged by the | |||
| acknowledged by the received ACK, similarly to the policy | received ACK, similarly to the policy described in RFC 3465 [Allman, | |||
| described in RFC 3465 [Allman, 2003]. Another alternative is to | 2003]. Another alternative is to increase cwnd by one SMSS only when | |||
| increase cwnd by one SMSS only when a valid ACK covers the entire | a valid ACK covers the entire data segment sent. | |||
| data segment sent. | ||||
| 9.1.2. DupACK forgery | 9.1.2. DupACK forgery | |||
| TCP SHOULD keep track of the number of outstanding segments (o_seg), | The second vulnerability discussed in [Savage et al, 1999] allows an | |||
| and accept only up to (o_seg -1) duplicate Acknowledgements. | attacker to cause the TCP sender to illegitimately increase its | |||
| congestion window by forging a number of duplicate Acknowledgements | ||||
| DISCUSSION: | (DupACKs). Figure 8 shows a sample scenario. The first three | |||
| DupACKs trigger the Fast Recovery mechanism, while the rest of them | ||||
| The second vulnerability discussed in [Savage et al, 1999] allows | cause the congestion window at the TCP sender to be illegitimately | |||
| an attacker to cause the TCP sender to illegitimately increase its | inflated. Thus, the attacker is able to illegitimately cause the TCP | |||
| congestion window by forging a number of duplicate | sender to increase its data transmission rate. | |||
| Acknowledgements (DupACKs). Figure 8 shows a sample scenario. | ||||
| The first three DupACKs trigger the Fast Recovery mechanism, while | ||||
| the rest of them cause the congestion window at the TCP sender to | ||||
| be illegitimately inflated. Thus, the attacker is able to | ||||
| illegitimately cause the TCP sender to increase its data | ||||
| transmission rate. | ||||
| See Figure 8, on page 65 of the UK CPNI document. | See Figure 8, on page 65 of the UK CPNI document. | |||
| DupACK forgery attack | DupACK forgery attack | |||
| Fortunately, a number of sender-side heuristics can be implemented | Fortunately, a number of sender-side heuristics can be implemented to | |||
| to mitigate this vulnerability. First, the TCP sender could keep | mitigate this vulnerability. First, the TCP sender could keep track | |||
| track of the number of outstanding segments (o_seg), and accept | of the number of outstanding segments (o_seg), and accept only up to | |||
| only up to (o_seg -1) DupACKs. Secondly, a TCP sender might, for | (o_seg -1) DupACKs. Secondly, a TCP sender might, for example, | |||
| example, refuse to enter Fast Recovery multiple times in some | refuse to enter Fast Recovery multiple times in some period of time | |||
| period of time (e.g., one RTT). | (e.g., one RTT). | |||
| [Savage et al, 1999] also describes a modification to TCP to | [Savage et al, 1999] also describes a modification to TCP to | |||
| implement a nonce protocol that would eliminate this | implement a nonce protocol that would eliminate this vulnerability. | |||
| vulnerability. However, this would require modification of all | However, this would require modification of all implementations, | |||
| implementations, which makes this counter-measure hard to deploy. | which makes this counter-measure hard to deploy. | |||
| 9.1.3. Optimistic ACKing | 9.1.3. Optimistic ACKing | |||
| Another alternative for an attacker to exploit TCP's congestion | Another alternative for an attacker to exploit TCP's congestion | |||
| control mechanisms is to acknowledge data that has not yet been | control mechanisms is to acknowledge data that has not yet been | |||
| received, thus causing the congestion window at the TCP sender to be | received, thus causing the congestion window at the TCP sender to be | |||
| incremented faster than it should be. | incremented faster than it should be. | |||
| See Figure 9, on page 66 of the UK CPNI document. | See Figure 9, on page 66 of the UK CPNI document. | |||
| skipping to change at page 68, line 31 ¶ | skipping to change at page 50, line 37 ¶ | |||
| TCP", the third duplicate-ACK will cause the "lost" segment to be | TCP", the third duplicate-ACK will cause the "lost" segment to be | |||
| retransmitted, and each subsequent duplicate-ACK will cause cwnd to | retransmitted, and each subsequent duplicate-ACK will cause cwnd to | |||
| be artificially inflated. Thus, the "sending TCP" might end up | be artificially inflated. Thus, the "sending TCP" might end up | |||
| injecting more packets into the network than it really should, with | injecting more packets into the network than it really should, with | |||
| the potential of causing network congestion. This is a potential | the potential of causing network congestion. This is a potential | |||
| consequence of the "Duplicate-ACK spoofing attack" described in | consequence of the "Duplicate-ACK spoofing attack" described in | |||
| [Savage et al, 1999]. | [Savage et al, 1999]. | |||
| Secondly, if bursts of three duplicate ACKs are sent to the TCP | Secondly, if bursts of three duplicate ACKs are sent to the TCP | |||
| sender, the attacked system would infer packet loss, and ssthresh and | sender, the attacked system would infer packet loss, and ssthresh and | |||
| cwnd would be reduced. As noted in RFC 2581 [Allman et al, 1999], | cwnd would be reduced. As noted in RFC 5681 [RFC5681], causing two | |||
| causing two congestion control events back-to-back will often cut | congestion control events back-to-back will often cut ssthresh and | |||
| ssthresh and cwnd to their minimum value of 2*SMSS, with the | cwnd to their minimum value of 2*SMSS, with the connection | |||
| connection immediately entering the slower-performing congestion | immediately entering the slower-performing congestion avoidance | |||
| avoidance phase. While it would not be attractive for an attacker to | phase. While it would not be attractive for an attacker to perform | |||
| perform this attack against one of his TCP connections, the attack | this attack against one of his TCP connections, the attack might be | |||
| might be attractive when the TCP connection to be attacked is | attractive when the TCP connection to be attacked is established | |||
| established between two other parties. | between two other parties. | |||
| It is usually assumed that in order for an off-path attacker to | It is usually assumed that in order for an off-path attacker to | |||
| perform attacks against a third-party TCP connection, he should be | perform attacks against a third-party TCP connection, he should be | |||
| able to guess a number of values, including a valid TCP Sequence | able to guess a number of values, including a valid TCP Sequence | |||
| Number and a valid TCP Acknowledgement Number. While this is true if | Number and a valid TCP Acknowledgement Number. While this is true if | |||
| the attacker tries to "inject" valid packets into the connection by | the attacker tries to "inject" valid packets into the connection by | |||
| himself, a feature of TCP can be exploited to fool one of the TCP | himself, a feature of TCP can be exploited to fool one of the TCP | |||
| endpoints into transmitting valid duplicate Acknowledgements on behalf of | endpoints into transmitting valid duplicate Acknowledgements on behalf of | |||
| the attacker, hence relieving the attacker of the hard task of | the attacker, hence relieving the attacker of the hard task of | |||
| forging valid values for the Sequence Number and Acknowledgement | forging valid values for the Sequence Number and Acknowledgement | |||
| Number TCP header fields. | Number TCP header fields. | |||
| Section 3.9 of RFC 793 [Postel, 1981c] describes the processing of | Section 3.9 of RFC 793 [RFC0793] describes the processing of incoming | |||
| incoming TCP segments as a function of the connection state and the | TCP segments as a function of the connection state and the contents | |||
| contents of the various header fields of the received segment. For | of the various header fields of the received segment. For | |||
| connections in the ESTABLISHED state, the first check that is | connections in the ESTABLISHED state, the first check that is | |||
| performed on incoming segments is that they contain "in window" data. | performed on incoming segments is that they contain "in window" data. | |||
| That is, | That is, | |||
| RCV.NXT <= SEG.SEQ < RCV.NXT+RCV.WND, or | RCV.NXT <= SEG.SEQ < RCV.NXT+RCV.WND, or | |||
| RCV.NXT <= SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND | RCV.NXT <= SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND | |||
| If a segment does not pass this check, it is dropped, and an | If a segment does not pass this check, it is dropped, and an | |||
| Acknowledgement is sent in response: | Acknowledgement is sent in response: | |||
| skipping to change at page 71, line 22 ¶ | skipping to change at page 53, line 28 ¶ | |||
| segments (in red) sent by the attacker causes the TCP sender to enter | segments (in red) sent by the attacker causes the TCP sender to enter | |||
| the loss recovery phase and illegitimately inflate the congestion | the loss recovery phase and illegitimately inflate the congestion | |||
| window, leading to an increase in the data transmission rate. Once a | window, leading to an increase in the data transmission rate. Once a | |||
| segment that acknowledges new data is received by the TCP sender, the | segment that acknowledges new data is received by the TCP sender, the | |||
| loss recovery phase ends, and the data transmission rate is reduced. | loss recovery phase ends, and the data transmission rate is reduced. | |||
| See Figure 12, on page 70 of the UK CPNI document. | See Figure 12, on page 70 of the UK CPNI document. | |||
| Blind flooding attack (time-line graph) | Blind flooding attack (time-line graph) | |||
| Figure 13 is a time-sequence graph produced from packet logs obtained | ||||
| from tests of the described attack in a real network. A burst of | ||||
| segments is sent upon receipt of the burst of Duplicate | ||||
| Acknowledgements illegitimately elicited by the attacker. Figure 14 | ||||
| is an averaged-throughput graphic for the same time frame, which | ||||
| clearly shows the effect of the attack in terms of throughput. | ||||
| See Figure 13, on page 71 of the UK CPNI document. | ||||
| Blind flooding attack (time sequence graph) | ||||
| See Figure 14, on page 71 of the UK CPNI document. | ||||
| Blind flooding attack (averaged throughput graph) | ||||
| These graphics were produced with Shawn Ostermann's tcptrace tool | ||||
| [Ostermann, 2008]. An explanation of the format of the graphics can | ||||
| be found in tcptrace's manual (available at the project's web site: | ||||
| http://www.tcptrace.org). | ||||
| 9.2.3. Difficulty in performing the attacks | 9.2.3. Difficulty in performing the attacks | |||
| In order to exploit the technique described in Section 9.2 of this | In order to exploit the technique described in Section 9.2 of this | |||
| document, an attacker would need to know the four-tuple {IP Source | document, an attacker would need to know the four-tuple {IP Source | |||
| Address, TCP Source Port, IP Destination Address, TCP Destination | Address, TCP Source Port, IP Destination Address, TCP Destination | |||
| Port} that identifies the connection to be attacked. As discussed by | Port} that identifies the connection to be attacked. As discussed by | |||
| [Watson, 2004] and RFC 4953 [Touch, 2007], there are a number of | [Watson, 2004] and RFC 4953 [Touch, 2007], there are a number of | |||
| scenarios in which these values may be known or easily guessed. | scenarios in which these values may be known or easily guessed. | |||
| It is interesting to note that the attacks described in Section 9.2 | It is interesting to note that the attacks described in Section 9.2 | |||
| skipping to change at page 73, line 10 ¶ | skipping to change at page 54, line 43 ¶ | |||
| interesting in the case of the blind-flooding attack, as the attack | interesting in the case of the blind-flooding attack, as the attack | |||
| would elicit even more packets from the TCP sender. | would elicit even more packets from the TCP sender. | |||
| Whether a full-window or just half a window of data is retransmitted | Whether a full-window or just half a window of data is retransmitted | |||
| depends on the Acknowledgement policy at the TCP receiver. If the | depends on the Acknowledgement policy at the TCP receiver. If the | |||
| TCP receiver sends an Acknowledgement (ACK) for every segment, a | TCP receiver sends an Acknowledgement (ACK) for every segment, a | |||
| full-window of data will be retransmitted. If the TCP receiver sends | full-window of data will be retransmitted. If the TCP receiver sends | |||
| an Acknowledgement (ACK) for every other segment, then only half a | an Acknowledgement (ACK) for every other segment, then only half a | |||
| window of data will be retransmitted. | window of data will be retransmitted. | |||
| Figure 15 is a time-sequence graph produced from packet logs obtained | ||||
| from tests performed in a real network. Once loss recovery is | ||||
| illegitimately triggered by the duplicate-ACKs elicited by the | ||||
| attacker, an entire flight of data is unnecessarily retransmitted. | ||||
| Figure 16 is an averaged-throughput graphic for the same time-frame, | ||||
| which shows an increase in the throughput of the connection resulting | ||||
| from the retransmission of segments governed by NewReno's loss | ||||
| recovery. | ||||
| See Figure 15, in page 73 of the UK CPNI document. | ||||
| NewReno loss recovery (time-sequence graph) | ||||
| See Figure 16, in page 74 of the UK CPNI document. | ||||
| NewReno loss recovery (averaged throughput graph) | ||||
| Limited Transmit | Limited Transmit | |||
| RFC 3042 [Allman et al, 2001] proposes an enhancement to TCP to more | RFC 3042 [Allman et al, 2001] proposes an enhancement to TCP to more | |||
| effectively recover lost segments when a connection's congestion | effectively recover lost segments when a connection's congestion | |||
| window is small, or when a large number of segments are lost in a | window is small, or when a large number of segments are lost in a | |||
| single transmission window. The "Limited Transmit" algorithm calls | single transmission window. The "Limited Transmit" algorithm calls | |||
| for sending a new data segment in response to each of the first two | for sending a new data segment in response to each of the first two | |||
| Duplicate Acknowledgements that arrive at the TCP sender. This would | Duplicate Acknowledgements that arrive at the TCP sender. This would | |||
| provide two additional transmitted packets that may be useful for the | provide two additional transmitted packets that may be useful for the | |||
| attacker in the case that the blind flooding attack described in | attacker in the case that the blind flooding attack described in | |||
| Section 9.2.2 is performed. | Section 9.2.2 is performed. | |||
| SACK-based loss recovery | SACK-based loss recovery | |||
| RFC 3517 [Blanton et al, 2003] specifies a conservative loss-recovery | [I-D.ietf-tcpm-3517bis] specifies a conservative loss-recovery | |||
| algorithm that is based on the use of the selective acknowledgement | algorithm that is based on the use of the selective acknowledgement | |||
| (SACK) TCP option. The algorithm uses DupACKs as an indication of | (SACK) TCP option. The algorithm uses DupACKs as an indication of | |||
| congestion, as specified in RFC 2581 [Allman et al, 1999]. However, | congestion, as specified in RFC 2581 [RFC5681]. However, a | |||
| a difference between this algorithm and the basic algorithm described | difference between this algorithm and the basic algorithm described | |||
| in RFC 2581 is that it clocks out segments only with the SACK | in RFC 2581 is that it clocks out segments only with the SACK | |||
| information included in the DupACKs. That is, during the loss | information included in the DupACKs. That is, during the loss | |||
| recovery phase, segments will be injected in the network only if the | recovery phase, segments will be injected in the network only if the | |||
| SACK information included in the received DupACKs indicates that one | SACK information included in the received DupACKs indicates that one | |||
| or more segments have left the network. As a result, those systems | or more segments have left the network. As a result, those systems | |||
| that implement SACK-based loss recovery will not be vulnerable to the | that implement SACK-based loss recovery will not be vulnerable to the | |||
| blind flooding attack described in Section 9.2.2. However, as RFC | blind flooding attack described in Section 9.2.2. Additionally, as | |||
| 3517 does not actually require DupACKs to include new SACK | [I-D.ietf-tcpm-3517bis] requires DupACKs to include new SACK | |||
| information (corresponding to data that has not yet been acknowledged | information (corresponding to data that has not yet been acknowledged | |||
| by TCP's cumulative Acknowledgement), systems that implement SACK- | by TCP's cumulative Acknowledgement), systems that implement SACK- | |||
| based loss-recovery may still remain vulnerable to the blind | based loss-recovery will not be vulnerable to the blind throughput- | |||
| throughput-reduction attack described in Section 9.2.1. SACK-based | reduction attack described in Section 9.2.1. | |||
| loss recovery implementations should be updated to implement the | ||||
| countermeasure ("Use of SACK information to validate DupACKs") | ||||
| described in Section 9.2.5. | ||||
| 9.2.5. Countermeasures | 9.2.5. Countermeasures | |||
| TCP SHOULD validate the Sequence Number of an incoming TCP segment | [draft-gont-tcpm-limiting-aow-segments-00.txt] proposes to rate-limit | |||
| as follows: | the reaction to out-of-window segments. This would mitigate the | |||
| attacks described earlier in this section. | ||||
| RCV.NXT - MAX.RCV.WND <= SEG.SEQ <= RCV.NXT + RCV.WND | ||||
| where MAX.RCV.WND is the largest TCP window that has so far been | ||||
| advertised to the remote endpoint. | ||||
| If a segment passes this check, the processing rules specified in RFC | ||||
| 793 [Postel, 1981c] MUST be applied. Otherwise, TCP SHOULD send an ACK | ||||
| (as specified by the processing rules in RFC 793 [Postel, 1981c]), | ||||
| applying rate-limiting to the Acknowledgement segments sent in | ||||
| response to out-of-window segments. | ||||
| DISCUSSION: | ||||
| As discussed in Section 9.2, TCP responds with an ACK when an out- | ||||
| of-window segment is received, to accommodate those scenarios in | ||||
| which the Acknowledgement segments that correspond to some | ||||
| received data are lost in the network, and to help discover half- | ||||
| open TCP connections. | ||||
| However, it is possible to restrict the sequence numbers that are | ||||
| considered acceptable, and have TCP respond with ACKs only when it | ||||
| is strictly necessary. | ||||
| A feature of TCP is that, in some scenarios, it can detect half- | ||||
| open connections. If an implementation chose to silently drop | ||||
| those TCP segments that do not pass the check enforced by the | ||||
| equation above, it could prevent TCP from detecting half-open | ||||
| connections. Figure 17 shows a scenario in which, provided that | ||||
| "TCP B" behaves as specified in RFC 793, a half-open connection | ||||
| would be discovered and aborted. | ||||
| An established connection is said to be "half open" if one of the | ||||
| TCPs has closed or aborted the connection at its end without the | ||||
| knowledge of the other, or if the two ends of the connection have | ||||
| become desynchronized owing to a crash that resulted in loss of | ||||
| memory. | ||||
| See Figure 17, in page 76 of the UK CPNI document. | ||||
| Half-Open Connection Discovery | ||||
| In the scenario illustrated by Figure 17, TCP A crashes losing the | ||||
| connection-state information of the TCP connection with TCP B. In | ||||
| line 3, TCP A tries to establish a new connection with TCP B, | ||||
| using the same four-tuple {IP Source Address, TCP source port, IP | ||||
| Destination Address, TCP destination port}. In line 4, as the SYN | ||||
| segment is out of window, TCP B responds with an ACK. This ACK | ||||
| elicits an RST segment from TCP A, which causes the half-open | ||||
| connection at TCP B to be aborted. | ||||
| If the SYN segment had been "in window", TCP B would have sent an | ||||
| RST segment instead, which would have closed the half-open | ||||
| connection. Ongoing work at the TCPM WG of the IETF proposes to | ||||
| change this behavior, and make TCP respond to a SYN segment | ||||
| received for any of the synchronized states with an ACK segment, | ||||
| to prevent in-window SYN segments from being used to perform | ||||
| connection-reset attacks [Ramaiah et al, 2008]. | ||||
| However, in case the out-of-window segment was silently dropped, | ||||
| the scenario in Figure 17 would change into that in Figure 18. | ||||
| See Figure 18, in page 76 of the UK CPNI document. | ||||
| Half-Open Connection Discovery with the proposed counter-measure | ||||
| In line 3, the SYN segment sent by TCP A is silently dropped by | ||||
| TCP B because it does not pass the check enforced by the equation | ||||
| above (i.e., it contains an out-of-window sequence number). As a | ||||
| result, some time later (an RTO) TCP A retransmits its SYN | ||||
| segment. Even after TCP A times out, the half-open connection at | ||||
| TCP B will remain in the same state. | ||||
| Thus, a conservative reaction to those segments that do not pass | ||||
| the check enforced by the equation above would be to respond with | ||||
| an Acknowledgement segment (as specified by RFC 793), applying | ||||
| rate-limiting to those Acknowledgement segments sent in response | ||||
| to segments that do not pass the check enforced by that equation. | ||||
| An implementation might choose to enforce a rate-limit of, e.g., | ||||
| one ACK per five seconds, as a single ACK segment is needed for | ||||
| the Half-Open Connection Discovery mechanism to work. | ||||
| As the only reason to respond with an ACK to those segments that | ||||
| do not pass the check enforced by the equation above is to allow | ||||
| TCP to discover half-open connections, an aggressive rate-limit | ||||
| can be enforced. As long as the rate-limit prevents out-of-window | ||||
| segments from eliciting three Acknowledgment segments in a Round- | ||||
| trip Time (RTT), an attacker would not be able to trigger TCP's | ||||
| loss-recovery, and thus would not be able to perform the attacks | ||||
| described in the previous sections. | ||||
| It is interesting to note that RFC 793 [Postel, 1981c] itself | ||||
| states that half-open connections are expected to be unusual. | ||||
| Additionally, given that in many scenarios it may be unlikely for | ||||
| a TCP connection request to be issued with the same four-tuple as | ||||
| that of the half-open connection, a complete solution for the | ||||
| discovery of half-open connections cannot rely on the mechanism | ||||
| illustrated by Figure 17, either. Therefore, some implementations | ||||
| might choose to sacrifice TCP's ability to detect half-open | ||||
| connections, and have a more aggressive reaction to those segments | ||||
| that do not pass the check enforced by the equation above by | ||||
| silently dropping them. | ||||
| This validation check can also help to avoid ACK wars in some | ||||
| scenarios that may arise from the use of transparent proxies. In | ||||
| those scenarios, when the transparent proxy fails to wire (i.e., | ||||
| is disabled), the sequence numbers of the two end-points of the | ||||
| TCP connection become desynchronized, and both TCPs begin to send | ||||
| duplicate Acknowledgements to each other, with the intention of | ||||
| re-synchronizing them. As the sequence numbers never get re- | ||||
| synchronized, the ACK war can only be stopped by an external | ||||
| agent. | ||||
| TCP SHOULD limit the number of duplicate acknowledgements it will | ||||
| honour to: | ||||
| Max_DupACKs = (FlightSize / SMSS) - 1 | ||||
| Where FlightSize and SMSS are the values defined in RFC 2581 [Allman | ||||
| et al, 1999]. When more than Max_DupACKs duplicate acknowledgements | ||||
| are received, the exceeding DupACKs should be silently dropped. | ||||
| DISCUSSION: | ||||
| Note that duplicate acknowledgements should be elicited by out-of- | ||||
| order segments. | ||||
| In the case of TCP connections that have agreed to employ SACK, TCP | ||||
| SHOULD validate duplicate ACKs with the following criteria: Valid | ||||
| Duplicate ACKs MUST contain new SACK information. The SACK | ||||
| information MUST refer to data that has already been sent, but that | ||||
| has not yet been acknowledged by TCP's cumulative Acknowledgement. A | ||||
| TCP segment that does not pass this check SHOULD NOT be considered a | ||||
| "duplicate Acknowledgement". | ||||
| DISCUSSION: | ||||
| SACK, specified in RFC 2018 [Mathis et al, 1996], provides a mechanism | ||||
| for TCP to be able to acknowledge the receipt of out-of-order TCP | ||||
| segments. For connections that have agreed to use SACK, each | ||||
| legitimate DupACK will contain new SACK information that reflects | ||||
| the data bytes contained in the out-of-order data segment that | ||||
| elicited the DupACK. | ||||
| RFC 3517 [Blanton et al, 2003] specifies a SACK-based loss | ||||
| recovery algorithm for TCP. However, it does not recommend that TCP | ||||
| implementations validate DupACKs by requiring that they contain | ||||
| new SACK information. Results obtained from auditing a number of | ||||
| TCP implementations seem to indicate that most TCP implementations | ||||
| do not enforce this validation check on incoming DupACKs, either. | ||||
| In the case of TCP connections that have agreed to use SACK, a | ||||
| validation check should be performed on incoming ACK segments to | ||||
| completely eliminate the attacks described in Section 9.2.1 and | ||||
| Section 9.2.2 of this document: "Duplicate ACKs should contain new | ||||
| SACK information. The SACK information should refer to data that | ||||
| has already been sent, but that has not yet been acknowledged by | ||||
| TCP's cumulative Acknowledgement". | ||||
| Those ACK segments that do not comply with this validation check | ||||
| should not be considered "duplicate ACKs", and thus should not | ||||
| trigger the loss-recovery phase. | ||||
| In case at least one segment in a window of data has been lost, | ||||
| the successive segments will elicit the generation of Duplicate | ||||
| ACKs containing new SACK information. This SACK information will | ||||
| indicate the receipt of these successive segments by the TCP | ||||
| receiver. | ||||
| In the case of pure ACKs illegitimately elicited by out-of-window | ||||
| segments, however, the ACKs will not contain any SACK information. | ||||
| If DSACK (specified in RFC 2883 [Floyd et al, 2000]) were implemented | ||||
| by the TCP receiver, then the illegitimately elicited DupACKs | ||||
| might contain out-of-window SACK information if the sequence | ||||
| number of the forged TCP segment (SEG.SEQ) is lower than the next | ||||
| expected sequence number (RCV.NXT) at the TCP receiver. Such | ||||
| segments should be considered to indicate the receipt of duplicate | ||||
| data, rather than an indication of lost data, and therefore should | ||||
| not trigger loss recovery. | ||||
| Other possible general mitigations are discussed in the following | ||||
| paragraphs: | ||||
| TCP port number randomization | ||||
| Because, to perform the blind attacks described in Section 9.2.1 | ||||
| and Section 9.2.2 the attacker needs to know the TCP port numbers in | ||||
| use by the connection to be attacked, obfuscating the TCP source port | ||||
| used for outgoing TCP connections will increase the number of packets | ||||
| required to successfully perform these attacks. Section 3.1 of this | ||||
| document discusses the use of port randomization. | ||||
| It must be noted that given that these blind DupACK triggering | ||||
| attacks do not require the attacker to forge valid TCP Sequence | ||||
| numbers and TCP Acknowledgement numbers, port randomization should | ||||
| not be relied upon as a first line of defense. | ||||
| Ingress and Egress filtering | ||||
| Ingress and Egress filtering reduces the number of systems in the | ||||
| global Internet that can perform attacks that rely on forged source | ||||
| IP addresses. While protection from the blind attacks discussed in | ||||
| Section 9.2 should not rely only on Ingress and Egress filtering, its | ||||
| deployment is recommended to help prevent all attacks that rely on | ||||
| forged IP addresses. RFC 3704 [Baker and Savola, 2004], RFC 2827 | ||||
| [Ferguson and Senie, 2000], and [NISCC, 2006] provide advice on | ||||
| Ingress and Egress filtering. | ||||
| Generalized TTL Security Mechanism (GTSM) | ||||
| RFC 5082 [Gill et al, 2007] proposes a check on the TTL field of the | ||||
| IP packets that correspond to a given TCP connection to reduce the | ||||
| number of systems that could successfully attack the protected TCP | ||||
| connection. It provides for the attacks discussed in this document | ||||
| the same level of protection as for the attacks described in | ||||
| [Watson, 2004] and RFC 4953 [Touch, 2007]. While implementation of | ||||
| this mechanism may be useful in some scenarios, it should be clear | ||||
| that countermeasures discussed in the previous sections provide a | ||||
| more effective and simpler solution than that provided by the GTSM. | ||||
| 9.3. TCP Explicit Congestion Notification (ECN) | 9.3. TCP Explicit Congestion Notification (ECN) | |||
| ECN (Explicit Congestion Notification) provides a mechanism for | ECN (Explicit Congestion Notification) provides a mechanism for | |||
| intermediate systems to signal congestion to the communicating | intermediate systems to signal congestion to the communicating | |||
| endpoints that in some scenarios can be used as an alternative to | endpoints that in some scenarios can be used as an alternative to | |||
| dropping packets. | dropping packets. | |||
| RFC 3168 [Ramakrishnan et al, 2001] contains a detailed discussion of | RFC 3168 [Ramakrishnan et al, 2001] contains a detailed discussion of | |||
| the possible ways and scenarios in which ECN could be exploited by an | the possible ways and scenarios in which ECN could be exploited by an | |||
| skipping to change at page 79, line 27 ¶ | skipping to change at page 56, line 6 ¶ | |||
| on nonces, that protects against accidental or malicious concealment | on nonces, that protects against accidental or malicious concealment | |||
| of marked packets from the TCP sender. The specified mechanism | of marked packets from the TCP sender. The specified mechanism | |||
| defines a "NS" ("Nonce Sum") field in the TCP header that makes use | defines a "NS" ("Nonce Sum") field in the TCP header that makes use | |||
| of one bit from the Reserved field, and requires a modification in | of one bit from the Reserved field, and requires a modification in | |||
| both of the endpoints of a TCP connection to process this new field. | both of the endpoints of a TCP connection to process this new field. | |||
| This mechanism is still in "Experimental" status, and since it might | This mechanism is still in "Experimental" status, and since it might | |||
| suffer from the behavior of some middle-boxes such as firewalls or | suffer from the behavior of some middle-boxes such as firewalls or | |||
| packet-scrubbers, we defer a recommendation of this mechanism until | packet-scrubbers, we defer a recommendation of this mechanism until | |||
| more experience is gained. | more experience is gained. | |||
| There also is ongoing work in the research community and the IETF to | There also is ongoing work in the research community and the IETF | |||
| define alternate semantics for the ECN field of the IP header (e.g., | to define alternate semantics for the ECN field of the IP header | |||
| see [PCNWG, 2009]). | (e.g., see [PCNWG, 2009]). | |||
| The following subsections try to summarize the security implications | ||||
| of ECN. | ||||
| 9.3.1. Possible attacks by a compromised router | ||||
| Firstly, a router controlled by a malicious user could erase the CE | ||||
| codepoint (by replacing it with the ECT(0), ECT(1), or non-ECT | ||||
| codepoints), effectively eliminating the congestion indication. As a | ||||
| result, the corresponding TCP sender would not reduce its data | ||||
| transmission rate, possibly leading to network congestion. This | ||||
| could also lead to unfairness, as this flow could experience better | ||||
| performance than other flows for which the congestion indication is | ||||
| not erased (and thus their transmission rate is reduced). | ||||
| Secondly, a router controlled by a malicious user could | ||||
| illegitimately set the CE codepoint, falsely indicating congestion, | ||||
| to cause the TCP sender to reduce its data transmission rate. | ||||
| However, this particular attack is no worse than the malicious router | ||||
| simply dropping the packets rather than setting their CE codepoint. | ||||
| Thirdly, a malicious router could turn off the ECT codepoint of a | ||||
| packet, thus disabling ECN support. As a result, if the packet later | ||||
| arrives at a router that is experiencing congestion, it may be | ||||
| dropped rather than marked. As with the previous scenario, though, | ||||
| this is no worse than the malicious router simply dropping the | ||||
| corresponding packet. | ||||
| It should be noted that a compromised on-path IP router could engage | ||||
| in a much broader range of attacks, with broader impacts, and at much | ||||
| lower attacker cost than the ones described here. Such a compromised | ||||
| router is extremely unlikely to engage in the attack vectors | ||||
| discussed in this section, given the existence of more effective | ||||
| attack vectors that have lower attacker cost. | ||||
| 9.3.2. Possible attacks by a malicious TCP endpoint | ||||
| If a packet with the ECT codepoint set arrives at an ECN-capable | ||||
| router that is experiencing moderate congestion, the router may | ||||
| decide to set its CE codepoint instead of dropping it. If either of | ||||
| the TCP endpoints does not honour the congestion indication provided by | ||||
| an ECN-capable router, this would result in unfairness, as other | ||||
| (legitimate) ECN-capable flows would still reduce their sending rate | ||||
| in response to the ECN marking of packets. Furthermore, under | ||||
| moderate congestion, non-ECN-capable flows would be subject to packet | ||||
| drops by the same router. As a result, the flow with a malicious TCP | ||||
| end-point would obtain better service than the legitimate flows. | ||||
| As noted in RFC 3168 [Ramakrishnan et al, 2001], a TCP endpoint | ||||
| falsely indicating ECN capability could lead to unfairness, allowing | ||||
| the misbehaving flow to get more than its fair share of the | ||||
| bandwidth. This could be the result of the misbehavior of either of | ||||
| the TCP endpoints. For example, the sending TCP could indicate ECN | ||||
| capability, but then send a CWR in response to an ECE without | ||||
| actually reducing its congestion window. Alternatively (or in | ||||
| addition), the receiving TCP could simply ignore those packets with | ||||
| the CE codepoint set, thus preventing the sending TCP from receiving | ||||
| the congestion indication. | ||||
| In the case of the sending TCP ignoring the ECN congestion | ||||
| indication, this would be no worse than the sending TCP ignoring the | ||||
| congestion indication provided by a lost segment. However, the case | ||||
| of a TCP receiver ignoring the CE codepoint allows the TCP receiver | ||||
| to get more than its fair share of bandwidth in a way that was | ||||
| previously unavailable. If congestion was kept "moderate", then the | ||||
| malicious TCP receiver could maintain the unfairness, as the router | ||||
| experiencing congestion would mark the offending packets of the | ||||
| misbehaving flow rather than dropping them. At the same time, | ||||
| legitimate ECN-capable flows would respond to the congestion | ||||
| indication provided by the CE codepoint, while legitimate non-ECN- | ||||
| capable flows would be subject to packet dropping. However, if | ||||
| congestion became sufficiently heavy, the router experiencing | ||||
| congestion would switch from marking packets to dropping packets, and | ||||
| at that point the attack vector provided by ECN could no longer be | ||||
| exploited (until congestion returns to a moderate state). | ||||
| RFC 3168 [Ramakrishnan et al, 2001] describes the use of "penalty | RFC 3168 [RFC3168] provides a very thorough security assessment of | |||
| boxes" which would act on flows that do not respond appropriately to | ECN. Among the possible mitigations, it describes the use of | |||
| congestion indications. Section 10 of RFC 3168 suggests that a first | "penalty boxes" which would act on flows that do not respond | |||
| action taken at a penalty box for an ECN-capable flow would be to | appropriately to congestion indications. Section 10 of RFC 3168 | |||
| switch to dropping packets (instead of marking them), and, if the | suggests that a first action taken at a penalty box for an ECN- | |||
| flow does not respond appropriately to the congestion indication, the | capable flow would be to switch to dropping packets (instead of | |||
| penalty box could reset the misbehaving connection. Here we | marking them), and, if the flow does not respond appropriately to the | |||
| discourage implementation of such a policy, as it would create a | congestion indication, the penalty box could reset the misbehaving | |||
| vector for connection-reset attacks. For example, an attacker could | connection. Here we discourage implementation of such a policy, as | |||
| forge TCP segments with the same four-tuple as the targeted | it would create a vector for connection-reset attacks. For example, | |||
| connection and cause them to transit the penalty box. The penalty | an attacker could forge TCP segments with the same four-tuple as the | |||
| box would first switch from marking to dropping packets. However, | targeted connection and cause them to transit the penalty box. The | |||
| the attacker would continue sending forged segments, at a steady | penalty box would first switch from marking to dropping packets. | |||
| rate. As a result, if the penalty box implemented such a severe | However, the attacker would continue sending forged segments, at a | |||
| policy of resetting connections for flows that still do not respond | steady rate. As a result, if the penalty box implemented such a | |||
| to end-to-end congestion control after switching from marking to | severe policy of resetting connections for flows that still do not | |||
| dropping, the attacked connection would be reset. | respond to end-to-end congestion control after switching from marking | |||
| to dropping, the attacked connection would be reset. | ||||
| 10. TCP API | 10. TCP API | |||
| Section 3.8 of RFC 793 [Postel, 1981c] describes the minimum set of | NOTE: THIS SECTION IS BEING EDITED. | |||
| TCP User Commands required of all TCP Implementations. Most | ||||
| operating systems provide an Application Programming Interface (API) | Section 3.8 of RFC 793 [RFC0793] describes the minimum set of TCP | |||
| that allows applications to make use of the services provided by TCP. | User Commands required of all TCP Implementations. Most operating | |||
| One of the most popular APIs is the Sockets API, originally | systems provide an Application Programming Interface (API) that | |||
| introduced in the BSD networking package [McKusick et al, 1996]. | allows applications to make use of the services provided by TCP. One | |||
| of the most popular APIs is the Sockets API, originally introduced in | ||||
| the BSD networking package [McKusick et al, 1996]. | ||||
| 10.1. Passive opens and binding sockets | 10.1. Passive opens and binding sockets | |||
| When there is already a pending passive OPEN for some local port | When there is already a pending passive OPEN for some local port | |||
| number, TCP SHOULD NOT allow processes that do not belong to the same | number, TCP SHOULD NOT allow processes that do not belong to the same | |||
| user to "reuse" the local port for another passive OPEN. | user to "reuse" the local port for another passive OPEN. | |||
| Additionally, reuse of a local port SHOULD default to "off", and be | Additionally, reuse of a local port SHOULD default to "off", and be | |||
| enabled only by an explicit command (e.g., the setsockopt() function | enabled only by an explicit command (e.g., the setsockopt() function | |||
| of the Sockets API). | of the Sockets API). | |||
| skipping to change at page 82, line 14 ¶ | skipping to change at page 57, line 18 ¶ | |||
| OPEN (local port, foreign socket, active/passive [, timeout] [, | OPEN (local port, foreign socket, active/passive [, timeout] [, | |||
| precedence] [, security/compartment] [, options]) -> local | precedence] [, security/compartment] [, options]) -> local | |||
| connection name | connection name | |||
| When this command is used to perform a passive open (i.e., the | When this command is used to perform a passive open (i.e., the | |||
| active/passive flag is set to passive), the foreign socket | active/passive flag is set to passive), the foreign socket | |||
| parameter may be either fully-specified (to wait for a particular | parameter may be either fully-specified (to wait for a particular | |||
| connection) or unspecified (to wait for any call). | connection) or unspecified (to wait for any call). | |||
| As discussed in Section 2.7 of RFC 793 [Postel, 1981c], if there | As discussed in Section 2.7 of RFC 793 [RFC0793], if there are | |||
| are several passive OPENs with the same local socket (recorded in | several passive OPENs with the same local socket (recorded in the | |||
| the corresponding TCB), an incoming connection will be matched to | corresponding TCB), an incoming connection will be matched to the | |||
| the TCB with the more specific foreign socket. This means that | TCB with the more specific foreign socket. This means that when | |||
| when the foreign socket of a passive OPEN matches that of the | the foreign socket of a passive OPEN matches that of the incoming | |||
| incoming connection request, that passive OPEN takes precedence | connection request, that passive OPEN takes precedence over those | |||
| over those passive OPENs with an unspecified foreign socket. | passive OPENs with an unspecified foreign socket. | |||
| Popular implementations such as the Sockets API let the user | Popular implementations such as the Sockets API let the user | |||
| specify the local socket as fully-specified {local IP address, | specify the local socket as fully-specified {local IP address, | |||
| local TCP port} pair, or as just the local TCP port (leaving the | local TCP port} pair, or as just the local TCP port (leaving the | |||
| local IP address unspecified). In the former case, only those | local IP address unspecified). In the former case, only those | |||
| connection requests sent to {local port, local IP address} will be | connection requests sent to {local port, local IP address} will be | |||
| accepted. In the latter case, connection requests sent to any of | accepted. In the latter case, connection requests sent to any of | |||
| the system's IP addresses will be accepted. In a similar fashion | the system's IP addresses will be accepted. In a similar fashion | |||
| to the generic API described in Section 2.7 of RFC 793, if there | to the generic API described in Section 2.7 of RFC 793, if there | |||
| is a pending passive OPEN with a fully-specified local socket that | is a pending passive OPEN with a fully-specified local socket that | |||
| skipping to change at page 83, line 6 ¶ | skipping to change at page 58, line 8 ¶ | |||
| port" argument of the "OPEN" command. | port" argument of the "OPEN" command. | |||
| An implementation MAY relax the aforementioned restriction when the | An implementation MAY relax the aforementioned restriction when the | |||
| process or system user requesting allocation of such a port number is | process or system user requesting allocation of such a port number is | |||
| the same as the process or system user controlling the TCP in the | the same as the process or system user controlling the TCP in the | |||
| CLOSED or LISTEN states with the same port number. | CLOSED or LISTEN states with the same port number. | |||
| DISCUSSION: | DISCUSSION: | |||
| As discussed in Section 10.1, the "OPEN" command specified in | As discussed in Section 10.1, the "OPEN" command specified in | |||
| Section 3.8 of RFC 793 [Postel, 1981c] can be used to perform | Section 3.8 of RFC 793 [RFC0793] can be used to perform active | |||
| active opens. In case of active opens, the parameter "local port" | opens. In case of active opens, the parameter "local port" will | |||
| will contain a so-called "ephemeral port". While the only | contain a so-called "ephemeral port". While the only requirement | |||
| requirement for such an ephemeral port is that the resulting | for such an ephemeral port is that the resulting connection-id is | |||
| connection-id is unique, port numbers that are currently in use by | unique, port numbers that are currently in use by a TCP in the | |||
| a TCP in the LISTEN state should not be allowed for use as | LISTEN state should not be allowed for use as ephemeral ports. If | |||
| ephemeral ports. If this rule is not complied with, an attacker could | this rule is not complied with, an attacker could potentially "steal" | |||
| potentially "steal" an incoming connection to a local server | an incoming connection to a local server application by issuing a | |||
| application by issuing a connection request to the victim client | connection request to the victim client at roughly the same time | |||
| at roughly the same time the client tries to connect to the victim | the client tries to connect to the victim server application. If | |||
| server application. If the SYN segment corresponding to the | the SYN segment corresponding to the attacker's connection request | |||
| attacker's connection request and the SYN segment corresponding to | and the SYN segment corresponding to the victim client "cross each | |||
| the victim client "cross each other in the network", and provided | other in the network", and provided the attacker is able to know | |||
| the attacker is able to know or guess the ephemeral port used by | or guess the ephemeral port used by the client, a TCP simultaneous | |||
| the client, a TCP simultaneous open scenario would take place, and | open scenario would take place, and the incoming connection | |||
| the incoming connection request sent by the client would be | request sent by the client would be matched with the attacker's | |||
| matched with the attacker's socket rather than with the victim | socket rather than with the victim server application's socket. | |||
| server application's socket. | ||||
| As already noted, in order for this attack to succeed, the | As already noted, in order for this attack to succeed, the | |||
| attacker should be able to guess or know (in advance) the | attacker should be able to guess or know (in advance) the | |||
| ephemeral port selected by the victim client, and be able to know | ephemeral port selected by the victim client, and be able to know | |||
| the right moment to issue a connection request to the victim | the right moment to issue a connection request to the victim | |||
| client. While in many scenarios this may prove to be a difficult | client. While in many scenarios this may prove to be a difficult | |||
| task, some factors such as an inadequate ephemeral port selection | task, some factors such as an inadequate ephemeral port selection | |||
| policy at the victim client could make this attack feasible. | policy at the victim client could make this attack feasible. | |||
| It should be noted that most applications based on popular | It should be noted that most applications based on popular | |||
| skipping to change at page 84, line 13 ¶ | skipping to change at page 59, line 13 ¶ | |||
| ports. | ports. | |||
| An implementation might choose to relax the aforementioned | An implementation might choose to relax the aforementioned | |||
| restriction when the process or system user requesting allocation | restriction when the process or system user requesting allocation | |||
| of such a port number is the same as the process or system user | of such a port number is the same as the process or system user | |||
| controlling the TCP in the CLOSED or LISTEN states with the same | controlling the TCP in the CLOSED or LISTEN states with the same | |||
| port number. | port number. | |||
| 11. Blind in-window attacks | 11. Blind in-window attacks | |||
| NOTE: THIS SECTION IS BEING EDITED. | ||||
| In the last few years awareness has been raised about a number of | In the last few years awareness has been raised about a number of | |||
| "blind" attacks that can be performed against TCP by forging TCP | "blind" attacks that can be performed against TCP by forging TCP | |||
| segments that fall within the receive window [NISCC, 2004] [Watson, | segments that fall within the receive window [NISCC, 2004] [Watson, | |||
| 2004]. | 2004]. | |||
| The term "blind" refers to the fact that the attacker does not have | The term "blind" refers to the fact that the attacker does not have | |||
| access to the packets that belong to the attacked connection. | access to the packets that belong to the attacked connection. | |||
| The effects of these attacks range from connection resets to data | The effects of these attacks range from connection resets to data | |||
| injection. While these attacks were known in the research community, | injection. While these attacks were known in the research community, | |||
| skipping to change at page 85, line 7 ¶ | skipping to change at page 60, line 7 ¶ | |||
| reset attacks against TCP. [Watson, 2004] and [NISCC, 2004] raised | reset attacks against TCP. [Watson, 2004] and [NISCC, 2004] raised | |||
| awareness about connection-reset attacks that exploit the RST flag of | awareness about connection-reset attacks that exploit the RST flag of | |||
| TCP segments. [Ramaiah et al, 2008] noted that carefully crafted SYN | TCP segments. [Ramaiah et al, 2008] noted that carefully crafted SYN | |||
| segments could also be used to perform connection-reset attacks. | segments could also be used to perform connection-reset attacks. | |||
| This document describes two further, previously undocumented vectors for | This document describes two further, previously undocumented vectors for | |||
| performing connection-reset attacks: the Precedence field of IP | performing connection-reset attacks: the Precedence field of IP | |||
| packets that encapsulate TCP segments, and illegal TCP options. | packets that encapsulate TCP segments, and illegal TCP options. | |||
| 11.1.1. RST flag | 11.1.1. RST flag | |||
| TCP SHOULD implement the mitigation for RST-based attacks specified | The RST flag signals a TCP peer that the connection should be | |||
| in [Ramaiah et al, 2008]. | aborted. In contrast with the FIN handshake (which gracefully | |||
| terminates a TCP connection), an RST segment causes the connection to | ||||
| DISCUSSION: | be abnormally closed. | |||
| The RST flag signals a TCP peer that the connection should be | ||||
| aborted. In contrast with the FIN handshake (which gracefully | ||||
| terminates a TCP connection), an RST segment causes the connection | ||||
| to be abnormally closed. | ||||
| As stated in Section 3.4 of RFC 793 [Postel, 1981c], all reset | ||||
| segments are validated by checking their Sequence Numbers, with | ||||
| the Sequence Number considered valid if it is within the receive | ||||
| window. In the SYN-SENT state, however, an RST is valid if the | ||||
| Acknowledgement Number acknowledges the SYN segment that | ||||
| supposedly elicited the reset. | ||||
| [Ramaiah et al, 2008] proposes a modification to TCP's transition | ||||
| diagram to address this attack vector. The counter-measure is a | ||||
| combination of enforcing a more strict validation check on the | ||||
| sequence number of reset segments, and the addition of a | ||||
| "challenge" mechanism. With the implementation of the proposed | ||||
| mechanism, TCP would behave as follows: | ||||
| If the Sequence Number of an RST segment is outside the receive | ||||
| window, the segment is silently dropped (as stated by RFC 793). | ||||
| That is, a reset segment is discarded unless it passes the | ||||
| following check: | ||||
| RCV.NXT <= Sequence Number < RCV.NXT+RCV.WND | ||||
| If the sequence number falls exactly on the left-edge of the | ||||
| receive window, the reset is honoured. That is, the connection is | ||||
| reset if the following condition is true: | ||||
| Sequence Number == RCV.NXT | ||||
| If an RST segment passes the first check (i.e., it is within the | ||||
| receive window) but does not pass the second check (i.e., it does | ||||
| not fall exactly on the left edge of the receive window), an | ||||
| Acknowledgement segment ("challenge ACK") is sent in response: | ||||
| <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK> | As stated in Section 3.4 of RFC 793 [RFC0793], all reset segments are | |||
| validated by checking their Sequence Numbers, with the Sequence | ||||
| Number considered valid if it is within the receive window. In the | ||||
| SYN-SENT state, however, an RST is valid if the Acknowledgement | ||||
| Number acknowledges the SYN segment that supposedly elicited the | ||||
| reset. | ||||
| This Acknowledgement segment is referred to as a "challenge ACK" | [RFC5961] proposes a modification to TCP's transition diagram to | |||
| as, in the event the RST segment that elicited it had been | address this attack vector. The counter-measure is a combination of | |||
| legitimate (but silently dropped as a result of enforcing the | enforcing a more strict validation check on the sequence number of | |||
| above checks), the challenge ACK would elicit a new reset segment | reset segments, and the addition of a "challenge" mechanism. | |||
| that would fall exactly on the left edge of the window and would | ||||
| thus pass all the above checks, finally resetting the connection. | ||||
| We recommend the implementation of this countermeasure. However, | We note that we are aware of patent claims on this | |||
| we are aware of patent claims on this counter-measure, and suggest | counter-measure, and suggest that vendors research the consequences | |||
| that vendors research the consequences of the possible patents that | of the possible patents that may apply. | |||
| may apply. | ||||
| [US-CERT, 2003a] is an advisory about a firewall system that was | [US-CERT, 2003a] is an advisory about a firewall system that was found | |||
| found particularly vulnerable to reset attacks because of not | particularly vulnerable to reset attacks because of not validating | |||
| validating the TCP Sequence Number of RST segments. Clearly, all | the TCP Sequence Number of RST segments. Clearly, all TCPs | |||
| TCPs (including those in middle-boxes) should validate RST | (including those in middle-boxes) should validate RST segments as | |||
| segments as discussed in this section. | discussed in this section. | |||
| 11.1.2. SYN flag | 11.1.2. SYN flag | |||
| Processing of SYN segments received for connections in the | Section 3.9 (page 71) of RFC 793 [RFC0793] states that if a SYN | |||
| synchronized states SHOULD occur as follows: | segment is received with a valid (i.e., "in window") Sequence Number, | |||
| an RST segment should be sent in response, and the connection should | ||||
| o If a SYN segment is received for a connection in any synchronized | be aborted. This could be leveraged to perform a blind connection- | |||
| state other than TIME-WAIT, respond with an ACK, applying rate- | reset attack. [RFC5961] proposes a change in TCP's state diagram to | |||
| throttling. [Ramaiah et al, 2008] | mitigate this attack vector. | |||
| o If the corresponding connection is in the TIME-WAIT state, then | ||||
| process the incoming SYN as specified in | ||||
| [I-D.ietf-tcpm-tcp-timestamps]. | ||||
| DISCUSSION: | ||||
| Section 3.9 (page 71) of RFC 793 [Postel, 1981c] states that if a | ||||
| SYN segment is received with a valid (i.e., "in window") Sequence | ||||
| Number, an RST segment should be sent in response, and the | ||||
| connection should be aborted. | ||||
| The IETF has published an RFC, "Improving TCP's Resistance to | ||||
| Blind In-Window Attacks" [Ramaiah et al, 2008] which addresses, | ||||
| among others, this variant of TCP-based connection-reset attack. | ||||
| This section describes the counter-measure proposed by the IETF, a | ||||
| problem that may arise from the implementation of that solution, | ||||
| and a workaround to it. | ||||
| In order to mitigate this attack vector, [Ramaiah et al, 2008] | ||||
| proposes to change TCP's reaction to SYN segments as follows. | ||||
| When a SYN segment is received for a connection in any of the | ||||
| synchronized states, an Acknowledgement (ACK) segment is sent in | ||||
| response. | ||||
| As discussed in [Ramaiah et al, 2008], there is a corner-case that | ||||
| would not be properly handled by this mechanism. If a host (TCP | ||||
| A) establishes a TCP connection with a remote peer (TCP B), and | ||||
| then crashes, reboots and tries to initiate a new incarnation of | ||||
| the same connection (i.e., a connection with the same four-tuple | ||||
| as the previous connection) using an Initial Sequence Number equal | ||||
| to the RCV.NXT value at the remote peer (TCP B), the ACK segment | ||||
| sent by TCP B in response to the SYN segment would contain an | ||||
| Acknowledgement number that would be considered valid by TCP A, | ||||
| and thus an RST segment would not be sent in response to the | ||||
| Acknowledgement (ACK) segment. As this ACK would not have the SYN | ||||
| bit set, TCP A (being in the SYN-SENT state) would silently drop | ||||
| it (as stated on page 68 of RFC 793). After a Retransmission | ||||
| Timeout (RTO), TCP A would retransmit its SYN segment, which would | ||||
| lead to the same sequence of events as before. Eventually, TCP A | ||||
| would time out, and the connection would be aborted. This is a | ||||
| corner case in which the introduced change would lead to a non- | ||||
| desirable behavior. However, we consider this scenario to be | ||||
| extremely unlikely and, in the event it ever took place, the | ||||
| connection would nevertheless be aborted after retrying for a | ||||
| period of USER TIMEOUT seconds. | ||||
| However, when this change is implemented exactly as described in | ||||
| [Ramaiah et al, 2008], the potential of interoperability problems | ||||
| is introduced, as a heuristic widely incorporated in many TCP | ||||
| implementations is disabled. | ||||
| In a number of scenarios a socket pair may need to be reused while | ||||
| the corresponding four-tuple is still in the TIME-WAIT state in a | ||||
| remote TCP peer. For example, a client accessing some service on | ||||
| a host may try to create a new incarnation of a previous | ||||
| connection, while the corresponding four-tuple is still in the | ||||
| TIME-WAIT state at the remote TCP peer (the server). This may | ||||
| happen if the ephemeral port numbers are being reused too quickly, | ||||
| either because of a bad policy of selection of ephemeral ports, or | ||||
| simply because of a high connection rate to the corresponding | ||||
| service. In such scenarios, the establishment of new connections | ||||
| that reuse a four-tuple that is in the TIME-WAIT state would fail. | ||||
| In order to avoid this problem, RFC 1122 [Braden, 1989] states (in | ||||
| Section 4.2.2.13) that when a connection request is received with | ||||
| a four-tuple that is in the TIME-WAIT state, the connection | ||||
| request could be accepted if the sequence number of the incoming | ||||
| SYN segment is greater than the last sequence number seen on the | ||||
| previous incarnation of the connection (for that direction of the | ||||
| data transfer). | ||||
| This requirement aims to prevent the sequence number spaces of the | ||||
| new and old incarnations of the connection from overlapping, so | ||||
| that old segments from the previous incarnation of the | ||||
| connection are not accepted as valid by the new connection. | ||||
| The requirement in [Ramaiah et al, 2008] to disregard SYN segments | ||||
| received for connections in any of the synchronized states forbids | ||||
| the implementation of the heuristic described above. As a result, | ||||
| we argue that the processing of SYN segments proposed in [Ramaiah | ||||
| et al, 2008] should apply only for connections in any of the | ||||
| synchronized states other than the TIME-WAIT state. | ||||
| 11.1.3. Security/Compartment | 11.1.3. Security/Compartment | |||
| If the security/compartment field of an incoming TCP segment does not | Section 3.9 (page 71) of RFC 793 [RFC0793] states that if the IP | |||
| match the value recorded in the corresponding TCB, TCP SHOULD NOT | security/compartment of an incoming segment does not exactly match | |||
| abort the connection, but simply discard the corresponding packet. | the security/compartment in the TCB, a RST segment should be sent, | |||
| Additionally, this whole event SHOULD be logged as a security | and the connection should be aborted. This certainly provides | |||
| violation. | another attack vector for performing connection-reset attacks, as an | |||
| attacker could forge TCP segments with a security/compartment that is | ||||
| DISCUSSION: | different from that recorded in the corresponding TCB and, as a | |||
| result, the attacked connection would be reset. | ||||
| Section 3.9 (page 71) of RFC 793 [Postel, 1981c] states that if | ||||
| the IP security/compartment of an incoming segment does not | ||||
| exactly match the security/compartment in the TCB, a RST segment | ||||
| should be sent, and the connection should be aborted. | ||||
| A discussion of the IP security options relevant to this section | ||||
| can be found in Section 3.13.2.12, Section 3.13.2.13, and Section | ||||
| 3.13.2.14 of [CPNI, 2008]. | ||||
| This certainly provides another attack vector for performing | ||||
| connection-reset attacks, as an attacker could forge TCP segments | ||||
| with a security/compartment that is different from that recorded | ||||
| in the corresponding TCB and, as a result, the attacked connection | ||||
| would be reset. | ||||
| It is interesting to note that for connections in the ESTABLISHED | ||||
| state, this check is performed after validating the TCP Sequence | ||||
| Number and checking the RST bit, but before validating the | ||||
| Acknowledgement field. Therefore, even if the stricter validation | ||||
| of the Acknowledgement field (described in Section 3.4) were | ||||
| implemented, it would not help to mitigate this attack vector. | ||||
| This attack vector can be easily mitigated by relaxing the | [draft-gont-tcpm-tcp-seccomp-prec-00.txt] aims to update RFC 793 such | |||
| reaction to TCP segments with "incorrect" security/compartment | that this issue is eliminated. | |||
| values as specified in this section. | ||||
| 11.1.4. Precedence | 11.1.4. Precedence | |||
| If the Precedence field of an incoming TCP segment does not match | Section 3.9 (page 71) of RFC 793 [RFC0793] states that if the IP | |||
| the value recorded in the corresponding TCB, TCP MUST NOT abort the | precedence of an incoming segment does not exactly match the | |||
| connection, and MUST instead continue processing the segment as | precedence in the TCB, a RST segment should be sent, and the | |||
| specified by RFC 793. | connection should be aborted. This certainly provides another attack | |||
| vector for performing connection-reset attacks, as an attacker could | ||||
| DISCUSSION: | forge TCP segments with a precedence that is different from that | |||
| recorded in the corresponding TCB and, as a result, the attacked | ||||
| Section 3.9 (page 71) of RFC 793 [Postel, 1981c] states that if | connection would be reset. | |||
| the IP Precedence of an incoming segment does not exactly match | ||||
| the Precedence recorded in the TCB, a RST segment should be sent, | ||||
| and the connection should be aborted. | ||||
| This certainly provides another attack vector for performing | ||||
| connection-reset attacks, as an attacker could forge TCP segments | ||||
| with an IP Precedence that is different from that recorded in the | ||||
| corresponding TCB and, as a result, the attacked connection would | ||||
| be reset. | ||||
| It is interesting to note that for connections in the ESTABLISHED | ||||
| state, this check is performed after validating the TCP Sequence | ||||
| Number and checking the RST bit, but before validating the | ||||
| Acknowledgement field. Therefore, even if the stricter validation | ||||
| of the Acknowledgement field (described in Section 3.4) were | ||||
| implemented, it would not help to mitigate this attack vector. | ||||
| This attack vector can be easily mitigated by relaxing the | ||||
| reaction to TCP segments with "incorrect" IP Precedence values. | ||||
| That is, even if the Precedence field does not match the value | ||||
| recorded in the corresponding TCB, TCP should not abort the | ||||
| connection, and should instead continue processing the segment as | ||||
| specified by RFC 793. | ||||
| It is interesting to note that resetting a connection due to a | ||||
| change in the Precedence value might have a negative impact on | ||||
| interoperability. For example, the packets that correspond to the | ||||
| connection could temporarily take a different internet path, in | ||||
| which some middle-box could re-mark the Precedence field (due to | ||||
| administration policies at the network to be transited). In such | ||||
| a scenario, an implementation following the advice in RFC 793 | ||||
| would abort the connection, when the connection would have | ||||
| probably survived. | ||||
| While the IPv4 Type of Service field (and hence the Precedence | [draft-gont-tcpm-tcp-seccomp-prec-00.txt] aims to update RFC 793 such | |||
| field) has been redefined by the Differentiated Services (DS) | that this issue is eliminated. | |||
| field specified in RFC 2474 [Nichols et al, 1998], RFC 793 | ||||
| [Postel, 1981c] was never formally updated in this respect. We | ||||
| note that both legacy systems that have not been upgraded to | ||||
| implement the differentiated services architecture described in | ||||
| RFC 2475 [Blake et al, 1998] and current implementations that have | ||||
| extrapolated the discussion of the Precedence field to the | ||||
| Differentiated Services field may still be vulnerable to the | ||||
| connection reset vector discussed in this section. | ||||
| 11.1.5. Illegal options | 11.1.5. Illegal options | |||
| TCP MUST silently drop those TCP segments that contain TCP options | Section 4.2.2.5 of RFC 1122 [RFC1122] discusses the processing of TCP | |||
| with illegal option lengths. | options. It states that TCP should be prepared to handle an illegal | |||
| option length (e.g., zero) without crashing, and suggests handling | ||||
| DISCUSSION: | such illegal options by resetting the corresponding connection and | |||
| logging the reason. However, this suggested behavior could be | ||||
| exploited to perform connection-reset attacks. | ||||
| Section 4.2.2.5 of RFC 1122 [Braden, 1989] discusses the | [draft-gont-tcpm-tcp-illegal-option-lengths-00] aims at formally | |||
| processing of TCP options. It states that TCP must be able to | updating RFC 1122, such that this issue is eliminated. | |||
| receive a TCP option in any segment, and must ignore without error | ||||
| any option it does not implement. Additionally, it states that | ||||
| TCP should be prepared to handle an illegal option length (e.g., | ||||
| zero) without crashing, and suggests handling such illegal options | ||||
| by resetting the corresponding connection and logging the reason. | ||||
| However, this suggested behavior could be exploited to perform | ||||
| connection-reset attacks. Therefore, as discussed in Section 3.10 | ||||
| of this document, we advise TCP implementations to silently drop | ||||
| those TCP segments that contain illegal option lengths. | ||||
| 11.2. Blind data-injection attacks | 11.2. Blind data-injection attacks | |||
| An attacker could try to inject data in the stream of data being | An attacker could try to inject data in the stream of data being | |||
| transferred on the connection. As with the other attacks described | transferred on the connection. As with the other attacks described | |||
| in Section 11 of this document, in order to perform a blind data | in Section 11 of this document, in order to perform a blind data | |||
| injection attack the attacker would need to know or guess the four- | injection attack the attacker would need to know or guess the four- | |||
| tuple that identifies the TCP connection to be attacked. | tuple that identifies the TCP connection to be attacked. | |||
| Additionally, the attacker would need to guess a valid ("in window") TCP | Additionally, the attacker would need to guess a valid ("in window") TCP | |||
| Sequence Number, and a valid Acknowledgement Number. | Sequence Number, and a valid Acknowledgement Number. | |||
| As discussed in Section 3.4 of this document, [Ramaiah et al, 2008] | As discussed in Section 3.4 of this document, [Ramaiah et al, 2008] | |||
| proposes to enforce a more strict check on the Acknowledgement Number | proposes to enforce a more strict check on the Acknowledgement Number | |||
| of incoming segments than that specified in RFC 793 [Postel, 1981c]. | of incoming segments than that specified in RFC 793 [RFC0793]. | |||
| Implementation of the proposed check requires more packets on the | Implementation of the proposed check requires more packets on the | |||
| side of the attacker to successfully perform a blind data-injection | side of the attacker to successfully perform a blind data-injection | |||
| attack. However, it should be noted that applications concerned with | attack. However, it should be noted that applications concerned with | |||
| any of the attacks discussed in Section 11 of this document should | any of the attacks discussed in Section 11 of this document should | |||
| make use of proper authentication techniques, such as those specified | make use of proper authentication techniques, such as those specified | |||
| for IPsec in RFC 4301 [Kent and Seo, 2005]. | for IPsec in RFC 4301 [Kent and Seo, 2005]. | |||
| 12. Information leaking | 12. Information leaking | |||
| NOTE: THIS SECTION IS BEING EDITED. | ||||
| 12.1. Remote Operating System detection via TCP/IP stack fingerprinting | 12.1. Remote Operating System detection via TCP/IP stack fingerprinting | |||
| Clearly, remote Operating System (OS) detection is a useful tool for | Clearly, remote Operating System (OS) detection is a useful tool for | |||
| attackers. Tools such as nmap [Fyodor, 2006b] can usually detect the | attackers. Tools such as nmap [Fyodor, 2006b] can usually detect the | |||
| operating system type and version of a remote system with | operating system type and version of a remote system with | |||
| remarkable precision. This information can in turn be used | remarkable precision. This information can in turn be used | |||
| by attackers to tailor their exploits to the identified operating | by attackers to tailor their exploits to the identified operating | |||
| system type and version. | system type and version. | |||
| Evasion of OS fingerprinting can prove to be a very difficult task. | Evasion of OS fingerprinting can prove to be a very difficult task. | |||
| skipping to change at page 92, line 6 ¶ | skipping to change at page 63, line 15 ¶ | |||
| 12.1.1. FIN probe | 12.1.1. FIN probe | |||
| TCP MUST silently drop any TCP segments received for a connection in | TCP MUST silently drop any TCP segments received for a connection in | |||
| the LISTEN state that do not have the SYN, RST, or ACK flags set. In | the LISTEN state that do not have the SYN, RST, or ACK flags set. In | |||
| the rest of the cases, the processing rules in RFC 793 MUST be | the rest of the cases, the processing rules in RFC 793 MUST be | |||
| applied. | applied. | |||
| DISCUSSION: | DISCUSSION: | |||
| The attacker sends a FIN (or any packet without the SYN or the ACK | The attacker sends a FIN (or any packet without the SYN or the ACK | |||
| flags set) to an open port. RFC 793 [Postel, 1981c] leaves the | flags set) to an open port. RFC 793 [RFC0793] leaves the reaction | |||
| reaction to such segments unspecified. As a result, some | to such segments unspecified. As a result, some implementations | |||
| implementations silently drop the received segment, while others | silently drop the received segment, while others respond with a | |||
| respond with a RST. | RST. | |||
| 12.1.2. Bogus flag test | 12.1.2. Bogus flag test | |||
| TCP MUST ignore any flags not supported, and MUST NOT reflect them if | TCP MUST ignore any flags not supported, and MUST NOT reflect them if | |||
| a TCP segment is sent in response to the one just received. | a TCP segment is sent in response to the one just received. | |||
| DISCUSSION: | DISCUSSION: | |||
| The attacker sends a TCP segment setting at least one bit of the | The attacker sends a TCP segment setting at least one bit of the | |||
| Reserved field. Some implementations ignore this field, while | Reserved field. Some implementations ignore this field, while | |||
| skipping to change at page 93, line 41 ¶ | skipping to change at page 64, line 49 ¶ | |||
| DISCUSSION: | DISCUSSION: | |||
| [Fyodor, 1998] reports that many implementations differ in the | [Fyodor, 1998] reports that many implementations differ in the | |||
| Acknowledgement Number they use in response to segments received | Acknowledgement Number they use in response to segments received | |||
| for connections in the CLOSED state. In particular, these | for connections in the CLOSED state. In particular, these | |||
| implementations differ in the way they construct the RST segment | implementations differ in the way they construct the RST segment | |||
| that is sent in response to those TCP segments received for | that is sent in response to those TCP segments received for | |||
| connections in the CLOSED state. | connections in the CLOSED state. | |||
| RFC 793 [Postel, 1981c] describes (in pages 36-37) how RST | RFC 793 [RFC0793] describes (in pages 36-37) how RST segments are | |||
| segments are to be generated. According to this RFC, the ACK bit | to be generated. According to this RFC, the ACK bit (and the | |||
| (and the Acknowledgment Number) is set in a RST only if the | Acknowledgment Number) is set in a RST only if the incoming | |||
| incoming segment that elicited the RST did not have the ACK bit | segment that elicited the RST did not have the ACK bit set (and | |||
| set (and thus the Sequence Number of the outgoing RST segment must | thus the Sequence Number of the outgoing RST segment must be set | |||
| be set to zero). However, we recommend that TCP implementations set | to zero). However, we recommend that TCP implementations set the | |||
| the ACK bit (and the Acknowledgement Number) in all outgoing RST | ACK bit (and the Acknowledgement Number) in all outgoing RST | |||
| segments, as it allows for additional validation checks to be | segments, as it allows for additional validation checks to be | |||
| enforced at the system receiving the segment. | enforced at the system receiving the segment. | |||
| 12.1.6. TCP options | 12.1.6. TCP options | |||
| Different implementations differ in the TCP options they enable by | Different implementations differ in the TCP options they enable by | |||
| default. Additionally, they differ in the actual contents of the | default. Additionally, they differ in the actual contents of the | |||
| options, and in the order in which the options are included in a TCP | options, and in the order in which the options are included in a TCP | |||
| segment. There is currently no recommendation on the order in which | segment. There is currently no recommendation on the order in which | |||
| to include TCP options in TCP segments. | to include TCP options in TCP segments. | |||
| skipping to change at page 95, line 36 ¶ | skipping to change at page 66, line 47 ¶ | |||
| [Rowland, 1996] contains a discussion of covert channels in the | [Rowland, 1996] contains a discussion of covert channels in the | |||
| TCP/IP protocol suite, with some TCP-based examples. [Giffin et al, | TCP/IP protocol suite, with some TCP-based examples. [Giffin et al, | |||
| 2002] describes the use of TCP timestamps for the establishment of | 2002] describes the use of TCP timestamps for the establishment of | |||
| covert channels. [Zander, 2008] contains an extensive bibliography | covert channels. [Zander, 2008] contains an extensive bibliography | |||
| of papers on covert channels, and a list of freely-available tools | of papers on covert channels, and a list of freely-available tools | |||
| that implement covert channels with the TCP/IP protocol suite. | that implement covert channels with the TCP/IP protocol suite. | |||
| 14. TCP Port scanning | 14. TCP Port scanning | |||
| NOTE: THIS SECTION IS BEING EDITED. | ||||
| TCP port scanning aims at identifying TCP port numbers on which there | TCP port scanning aims at identifying TCP port numbers on which there | |||
| is a process listening for incoming connections. That is, it aims at | is a process listening for incoming connections. That is, it aims at | |||
| identifying TCPs at the target system that are in the LISTEN state. | identifying TCPs at the target system that are in the LISTEN state. | |||
| The following subsections describe different TCP port scanning | The following subsections describe different TCP port scanning | |||
| techniques that have been implemented in freely-available tools. | techniques that have been implemented in freely-available tools. | |||
| These subsections focus only on those port scanning techniques that | These subsections focus only on those port scanning techniques that | |||
| exploit features of TCP itself, and not of other communication | exploit features of TCP itself, and not of other communication | |||
| protocols. | protocols. | |||
| For example, the following subsections do not discuss the | For example, the following subsections do not discuss the | |||
| skipping to change at page 97, line 5 ¶ | skipping to change at page 68, line 17 ¶ | |||
| scanning tool. | scanning tool. | |||
| 14.3. FIN, NULL, and XMAS scans | 14.3. FIN, NULL, and XMAS scans | |||
| TCP SHOULD respond with an RST when a TCP segment is received for a | TCP SHOULD respond with an RST when a TCP segment is received for a | |||
| connection in the LISTEN state, and the incoming segment has neither | connection in the LISTEN state, and the incoming segment has neither | |||
| the SYN bit nor the RST bit set. | the SYN bit nor the RST bit set. | |||
| DISCUSSION: | DISCUSSION: | |||
| RFC 793 [Postel, 1981c] states, in page 65, that an incoming | RFC 793 [RFC0793] states, in page 65, that an incoming segment | |||
| segment that does not have the RST bit set and that is received | that does not have the RST bit set and that is received for a | |||
| for a connection in the fictional state CLOSED causes an RST to be | connection in the fictional state CLOSED causes an RST to be sent | |||
| sent in response. Pages 65-66 of RFC 793 describe the processing | in response. Pages 65-66 of RFC 793 describe the processing of | |||
| of incoming segments for connections in the state LISTEN, and | incoming segments for connections in the state LISTEN, and | |||
| implicitly states that an incoming segment that does not have the | implicitly states that an incoming segment that does not have the | |||
| ACK bit set (and is not a SYN or an RST) should be silently | ACK bit set (and is not a SYN or an RST) should be silently | |||
| dropped. | dropped. | |||
| As a result, an attacker can exploit this situation to perform a | As a result, an attacker can exploit this situation to perform a | |||
| port scan by sending TCP segments that do not have the ACK bit set | port scan by sending TCP segments that do not have the ACK bit set | |||
| to the target system. When a port is "open" (i.e., there is a TCP | to the target system. When a port is "open" (i.e., there is a TCP | |||
| in the LISTEN state on the corresponding port), the target system | in the LISTEN state on the corresponding port), the target system | |||
| will respond with an RST segment. On the other hand, if the port | will respond with an RST segment. On the other hand, if the port | |||
| is "closed" (i.e., there is a TCP in the fictional state CLOSED) | is "closed" (i.e., there is a TCP in the fictional state CLOSED) | |||
| skipping to change at page 97, line 45 ¶ | skipping to change at page 69, line 9 ¶ | |||
| It should be clear that while the aforementioned control-bits | It should be clear that while the aforementioned control-bits | |||
| combinations are the most popular ones, other combinations could | combinations are the most popular ones, other combinations could | |||
| be used to exploit this port-scanning vector. For example, the | be used to exploit this port-scanning vector. For example, the | |||
| CWR, ECE, and/or any of the Reserved bits could be set in the | CWR, ECE, and/or any of the Reserved bits could be set in the | |||
| probe segments. | probe segments. | |||
| The advantage of this port-scanning technique is that it can | The advantage of this port-scanning technique is that it can | |||
| bypass some stateless firewalls. However, the downside is that a | bypass some stateless firewalls. However, the downside is that a | |||
| number of implementations do not comply strictly with RFC 793 | number of implementations do not comply strictly with RFC 793 | |||
| [Postel, 1981c], and thus always respond to the probe segments | [RFC0793], and thus always respond to the probe segments with an | |||
| with an RST, regardless of whether the port is open or closed. | RST, regardless of whether the port is open or closed. | |||
| This port-scanning vector can be easily defeated by responding | This port-scanning vector can be easily defeated by responding | |||
| with an RST when a TCP segment is received for a connection in the | with an RST when a TCP segment is received for a connection in the | |||
| LISTEN state, and the incoming segment has neither the SYN bit nor | LISTEN state, and the incoming segment has neither the SYN bit nor | |||
| the RST bit set. | the RST bit set. | |||
| 14.4. Maimon scan | 14.4. Maimon scan | |||
| If a TCP that is in the CLOSED or LISTEN states receives a TCP | If a TCP that is in the CLOSED or LISTEN states receives a TCP | |||
| segment with both the FIN and ACK bits set, it MUST respond with a | segment with both the FIN and ACK bits set, it MUST respond with a | |||
| RST. | RST. | |||
| DISCUSSION: | DISCUSSION: | |||
| This port scanning technique was introduced in [Maimon, 1996] with | This port scanning technique was introduced in [Maimon, 1996] with | |||
| the name "StealthScan" (method #1), and was later incorporated | the name "StealthScan" (method #1), and was later incorporated | |||
| into the nmap tool [Fyodor, 2006b] as the "Maimon scan". | into the nmap tool [Fyodor, 2006b] as the "Maimon scan". | |||
| This port scanning technique employs TCP segments that have both | This port scanning technique employs TCP segments that have both | |||
| the FIN and ACK bits set as the probe segments. While according | the FIN and ACK bits set as the probe segments. While according | |||
| to RFC 793 [Postel, 1981c] these segments should elicit an RST | to RFC 793 [RFC0793] these segments should elicit an RST | |||
| regardless of whether the corresponding port is open or closed, a | regardless of whether the corresponding port is open or closed, a | |||
| programming flaw found in a number of TCP implementations has | programming flaw found in a number of TCP implementations has | |||
| caused some systems to silently drop the probe segment if the | caused some systems to silently drop the probe segment if the | |||
| corresponding port was open (i.e., there was a TCP in the LISTEN | corresponding port was open (i.e., there was a TCP in the LISTEN | |||
| state), and respond with an RST only if the port was closed. | state), and respond with an RST only if the port was closed. | |||
| Therefore, an RST would indicate that the scanned port is closed, | Therefore, an RST would indicate that the scanned port is closed, | |||
| while the absence of a response from the target system would | while the absence of a response from the target system would | |||
| indicate that the scanned port is open. | indicate that the scanned port is open. | |||
| skipping to change at page 99, line 18 ¶ | skipping to change at page 70, line 33 ¶ | |||
| implement this policy. | implement this policy. | |||
| 14.6. ACK scan | 14.6. ACK scan | |||
| The so-called "ACK scan" is not really a port-scanning technique | The so-called "ACK scan" is not really a port-scanning technique | |||
| (i.e., it does not aim at determining whether a specific port is open | (i.e., it does not aim at determining whether a specific port is open | |||
| or closed), but rather aims at determining whether some intermediate | or closed), but rather aims at determining whether some intermediate | |||
| system is filtering TCP segments sent to that specific port number. | system is filtering TCP segments sent to that specific port number. | |||
| The probe packet is a TCP segment with the ACK bit set which, | The probe packet is a TCP segment with the ACK bit set which, | |||
| according to RFC 793 [Postel, 1981c], should elicit an RST from the | according to RFC 793 [RFC0793], should elicit an RST from the target | |||
| target system regardless of whether the corresponding TCP port is | system regardless of whether the corresponding TCP port is open or | |||
| open or closed. If no response is received from the target system, | closed. If no response is received from the target system, it is | |||
| it is assumed that some intermediate system is filtering the probe | assumed that some intermediate system is filtering the probe packets | |||
| packets sent to the target system. | sent to the target system. | |||
| It should be noted that this "port scanning" technique exploits | It should be noted that this "port scanning" technique exploits | |||
| basic TCP processing rules, and therefore cannot be defeated at an | basic TCP processing rules, and therefore cannot be defeated at an | |||
| end-system. | end-system. | |||
| 15. Processing of ICMP error messages by TCP | 15. Processing of ICMP error messages by TCP | |||
| TCP SHOULD silently ignore received ICMP Source Quench messages. | [RFC5927] analyzes a number of vulnerabilities based on crafted ICMP | |||
| messages, along with possible counter-measures. | ||||
| TCP SHOULD process ICMP "hard errors" as "soft errors" when they are | ||||
| received for connections that are in any of the synchronized states. | ||||
| TCP SHOULD process ICMP "fragmentation needed and DF bit set" and | ||||
| ICMPv6 "Packet Too Big" error messages as described in [RFC5927]. | ||||
| DISCUSSION: | ||||
| [RFC5927] analyzes a number of vulnerabilities based on crafted | ||||
| ICMP messages, along with possible counter-measures. | ||||
| 16. TCP interaction with the Internet Protocol (IP) | 16. TCP interaction with the Internet Protocol (IP) | |||
| 16.1. TCP-based traceroute | 16.1. TCP-based traceroute | |||
| The traceroute tool is used to identify the intermediate systems between the | The traceroute tool is used to identify the intermediate systems between the | |||
| local system and the destination system. It is usually implemented | local system and the destination system. It is usually implemented | |||
| by sending "probe" packets with increasing IP Time to Live values | by sending "probe" packets with increasing IP Time to Live values | |||
| (starting from 0), without maintaining any state with the final | (starting from 0), without maintaining any state with the final | |||
| destination. | destination. | |||
| Some traceroute implementations use ICMP "echo request" messages as | Some traceroute implementations use ICMP "echo request" messages as | |||
| the probe packets, while others use UDP packets or TCP SYN segments. | the probe packets, while others use UDP packets or TCP SYN segments. | |||
| skipping to change at page 102, line 28 ¶ | skipping to change at page 73, line 36 ¶ | |||
| This document provides a thorough security assessment of the | This document provides a thorough security assessment of the | |||
| Transmission Control Protocol (TCP), identifies a number of | Transmission Control Protocol (TCP), identifies a number of | |||
| vulnerabilities, and specifies possible counter-measures. | vulnerabilities, and specifies possible counter-measures. | |||
| Additionally, it provides implementation guidance such that the | Additionally, it provides implementation guidance such that the | |||
| resilience of TCP implementations is improved. | resilience of TCP implementations is improved. | |||
| 18. Acknowledgements | 18. Acknowledgements | |||
| The author would like to thank (in alphabetical order) David Borman, | The author would like to thank (in alphabetical order) David Borman, | |||
| Wesley Eddy, and Alfred Hoenes, for providing valuable feedback on | Wesley Eddy, Alfred Hoenes, and Michael Scharf, for providing | |||
| earlier versions of this document. | valuable feedback on earlier versions of this document. | |||
| This document is heavily based on the document "Security Assessment | This document is heavily based on the document "Security Assessment | |||
| of the Transmission Control Protocol (TCP)" [CPNI, 2009] written by | of the Transmission Control Protocol (TCP)" [CPNI, 2009] written by | |||
| Fernando Gont on behalf of CPNI (Centre for the Protection of | Fernando Gont on behalf of CPNI (Centre for the Protection of | |||
| National Infrastructure). | National Infrastructure). | |||
| The author would like to thank (in alphabetical order) Randall | The author would like to thank (in alphabetical order) Randall | |||
| Atkinson, Guillermo Gont, Alfred Hoenes, Jamshid Mahdavi, Stanislav | Atkinson, Guillermo Gont, Alfred Hoenes, Jamshid Mahdavi, Stanislav | |||
| Shalunov, Michael Welzl, Dan Wing, Andrew Yourtchenko, Michal | Shalunov, Michael Welzl, Dan Wing, Andrew Yourtchenko, Michal | |||
| Zalewski, and Christos Zoulas, for providing valuable feedback on | Zalewski, and Christos Zoulas, for providing valuable feedback on | |||
| skipping to change at page 103, line 6 ¶ | skipping to change at page 74, line 13 ¶ | |||
| Additionally, the author would like to thank (in alphabetical order) | Additionally, the author would like to thank (in alphabetical order) | |||
| Mark Allman, David Black, Ethan Blanton, David Borman, James Chacon, | Mark Allman, David Black, Ethan Blanton, David Borman, James Chacon, | |||
| John Heffner, Jerrold Leichter, Jamshid Mahdavi, Keith Scott, Bill | John Heffner, Jerrold Leichter, Jamshid Mahdavi, Keith Scott, Bill | |||
| Squier, and David White, who generously answered a number of | Squier, and David White, who generously answered a number of | |||
| questions that arose while the aforementioned document was being | questions that arose while the aforementioned document was being | |||
| written. | written. | |||
| Finally, the author would like to thank CPNI (formerly NISCC) for | Finally, the author would like to thank CPNI (formerly NISCC) for | |||
| their continued support. | their continued support. | |||
| 19. References | 19. References (to be translated to xml) | |||
| Abley, J., Savola, P., Neville-Neil, G. 2007. Deprecation of Type 0 | Abley, J., Savola, P., Neville-Neil, G. 2007. Deprecation of Type 0 | |||
| Routing Headers in IPv6. RFC 5095. | Routing Headers in IPv6. RFC 5095. | |||
| Allman, M. 2003. TCP Congestion Control with Appropriate Byte | Allman, M. 2003. TCP Congestion Control with Appropriate Byte | |||
| Counting (ABC). RFC 3465. | Counting (ABC). RFC 3465. | |||
| Allman, M. 2008. Comments On Selecting Ephemeral Ports. Available | Allman, M. 2008. Comments On Selecting Ephemeral Ports. Available | |||
| at: http://www.icir.org/mallman/share/ports-dec08.pdf | at: http://www.icir.org/mallman/share/ports-dec08.pdf | |||
| skipping to change at page 108, line 13 ¶ | skipping to change at page 79, line 22 ¶ | |||
| Protocol. RFC 4301. | Protocol. RFC 4301. | |||
| Klensin, J. 2008. Simple Mail Transfer Protocol. RFC 5321. | Klensin, J. 2008. Simple Mail Transfer Protocol. RFC 5321. | |||
| Ko, Y., Ko, S., and Ko, M. 2001. NIDS Evasion Method named SeolMa. | Ko, Y., Ko, S., and Ko, M. 2001. NIDS Evasion Method named SeolMa. | |||
| Phrack Magazine, Volume 0x0b, Issue 0x39, phile #0x03 of 0x12. | Phrack Magazine, Volume 0x0b, Issue 0x39, phile #0x03 of 0x12. | |||
| Available at: http://www.phrack.org/issues.html?issue=57&id=3#article | Available at: http://www.phrack.org/issues.html?issue=57&id=3#article | |||
| Lahey, K. 2000. TCP Problems with Path MTU Discovery. RFC 2923. | Lahey, K. 2000. TCP Problems with Path MTU Discovery. RFC 2923. | |||
| Larsen, M., Gont, F. 2008. Port Randomization. IETF Internet-Draft | ||||
| (draft-ietf-tsvwg-port-randomization-02), work in progress. | ||||
| Lemon, 2002. Resisting SYN flood DoS attacks with a SYN cache. | Lemon, 2002. Resisting SYN flood DoS attacks with a SYN cache. | |||
| Proceedings of the BSDCon 2002 Conference, pp 89-98. | Proceedings of the BSDCon 2002 Conference, pp 89-98. | |||
| Maimon, U. 1996. Port Scanning without the SYN flag. Phrack | Maimon, U. 1996. Port Scanning without the SYN flag. Phrack | |||
| Magazine, Volume Seven, Issue Forty-Nine, phile #0x0f of 0x10. | Magazine, Volume Seven, Issue Forty-Nine, phile #0x0f of 0x10. | |||
| Available at: | Available at: | |||
| http://www.phrack.org/issues.html?issue=49&id=15#article | http://www.phrack.org/issues.html?issue=49&id=15#article | |||
| Mathis, M., Mahdavi, J., Floyd, S., Romanow, A. 1996. TCP Selective | Mathis, M., Mahdavi, J., Floyd, S., Romanow, A. 1996. TCP Selective | |||
| Acknowledgment Options. RFC 2018. | Acknowledgment Options. RFC 2018. | |||
| skipping to change at page 113, line 9 ¶ | skipping to change at page 84, line 10 ¶ | |||
| IFIP Communications and Multimedia Security Conference (CMS 2002). | IFIP Communications and Multimedia Security Conference (CMS 2002). | |||
| Available at: http://www.ieeta.pt/~avz/pubs/CMS02.html | Available at: http://www.ieeta.pt/~avz/pubs/CMS02.html | |||
| Zweig, J., Partridge, C. 1990. TCP Alternate Checksum Options. RFC | Zweig, J., Partridge, C. 1990. TCP Alternate Checksum Options. RFC | |||
| 1146. | 1146. | |||
| 20. References | 20. References | |||
| 20.1. Normative References | 20.1. Normative References | |||
| [I-D.ietf-tcpm-tcp-timestamps] | [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, | |||
| Gont, F., "Reducing the TIME-WAIT state using TCP | RFC 793, September 1981. | |||
| timestamps", draft-ietf-tcpm-tcp-timestamps-03 (work in | ||||
| progress), December 2010. | ||||
| [I-D.ietf-tsvwg-port-randomization] | [RFC1122] Braden, R., "Requirements for Internet Hosts - | |||
| Larsen, M. and F. Gont, "Transport Protocol Port | Communication Layers", STD 3, RFC 1122, October 1989. | |||
| Randomization Recommendations", | ||||
| draft-ietf-tsvwg-port-randomization-09 (work in progress), | [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
| of Explicit Congestion Notification (ECN) to IP", | ||||
| RFC 3168, September 2001. | ||||
| [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion | ||||
| Control", RFC 5681, September 2009. | ||||
| [RFC5961] Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's | ||||
| Robustness to Blind In-Window Attacks", RFC 5961, | ||||
| August 2010. | August 2010. | |||
| [RFC6056] Larsen, M. and F. Gont, "Recommendations for Transport- | ||||
| Protocol Port Randomization", BCP 156, RFC 6056, | ||||
| January 2011. | ||||
| [RFC6093] Gont, F. and A. Yourtchenko, "On the Implementation of the | [RFC6093] Gont, F. and A. Yourtchenko, "On the Implementation of the | |||
| TCP Urgent Mechanism", RFC 6093, January 2011. | TCP Urgent Mechanism", RFC 6093, January 2011. | |||
| [RFC6191] Gont, F., "Reducing the TIME-WAIT State Using TCP | ||||
| Timestamps", BCP 159, RFC 6191, April 2011. | ||||
| [RFC6528] Gont, F. and S. Bellovin, "Defending against Sequence | ||||
| Number Attacks", RFC 6528, February 2012. | ||||
| 20.2. Informative References | 20.2. Informative References | |||
| [I-D.gont-timestamps-generation] | [I-D.gont-timestamps-generation] | |||
| Gont, F. and A. Oppermann, "On the generation of TCP | Gont, F. and A. Oppermann, "On the generation of TCP | |||
| timestamps", draft-gont-timestamps-generation-00 (work in | timestamps", draft-gont-timestamps-generation-00 (work in | |||
| progress), June 2010. | progress), June 2010. | |||
| [I-D.ietf-tcpm-3517bis] | ||||
| Blanton, E., Jarvinen, I., Wang, L., Allman, M., Kojo, M., | ||||
| and Y. Nishida, "A Conservative Selective Acknowledgment | ||||
| (SACK)-based Loss Recovery Algorithm for TCP", | ||||
| draft-ietf-tcpm-3517bis-01 (work in progress), | ||||
| January 2012. | ||||
| [Morris1985] | ||||
| Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP | ||||
| Software", CSTR 117, AT&T Bell Laboratories, Murray Hill, | ||||
| NJ, 1985. | ||||
| [RFC1025] Postel, J., "TCP and IP bake off", RFC 1025, | ||||
| September 1987. | ||||
| [RFC1379] Braden, B., "Extending TCP for Transactions -- Concepts", | ||||
| RFC 1379, November 1992. | ||||
| [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010. | [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010. | |||
| [RFC6429] Bashyam, M., Jethanandani, M., and A. Ramaiah, "TCP Sender | ||||
| Clarification for Persist Condition", RFC 6429, | ||||
| December 2011. | ||||
| [Shimomura1995] | ||||
| Shimomura, T., "Technical details of the attack described | ||||
| by Markoff in NYT", | ||||
| http://www.gont.com.ar/docs/post-shimomura-usenet.txt, | ||||
| Message posted in USENET's comp.security.misc newsgroup, | ||||
| Message-ID: <3g5gkl$5j1@ariel.sdsc.edu>, 1995. | ||||
| Appendix A. TODO list | Appendix A. TODO list | |||
| A number of formatting issues still have to be fixed in this | A number of formatting issues still have to be fixed in this | |||
| document. Among others are: | document. Among others are: | |||
| o The ASCII-art corresponding to some figures is still missing. We | o The ASCII-art corresponding to some figures is still missing. We | |||
| still have to convert the nice JPGs of the UK CPNI document into | still have to convert the nice JPGs of the UK CPNI document into | |||
| ugly ASCII-art. | ugly ASCII-art. | |||
| o The references have not yet been converted to xml, but are | o The references have not yet been converted to xml, but are | |||
| hardcoded, instead. That's why they may not look as expected. | hardcoded, instead. That's why they may not look as expected. | |||
| Appendix B. Change log (to be removed by the RFC Editor before | Appendix B. Change log (to be removed by the RFC Editor before | |||
| publication of this document as an RFC) | publication of this document as an RFC) | |||
| B.1. Changes from draft-ietf-tcpm-tcp-security-01 | B.1. Changes from draft-ietf-tcpm-tcp-security-02 | |||
| o Lots of text has been removed from the document. | ||||
| o The document track has been changed from BCP to Informational | ||||
| (RFC2119-language recommendations have been removed). | ||||
| o Where necessary, stand-alone Standards Track documents have been | ||||
| produced. | ||||
| B.2. Changes from draft-ietf-tcpm-tcp-security-01 | ||||
| A number of formatting issues still have to be fixed in this | A number of formatting issues still have to be fixed in this | |||
| document. Among others are: | document. Among others are: | |||
| o The whole document was reformatted with RFC 1122 style. | o The whole document was reformatted with RFC 1122 style. | |||
| Author's Address | Author's Address | |||
| Fernando Gont | Fernando Gont | |||
| UK Centre for the Protection of National Infrastructure | UK Centre for the Protection of National Infrastructure | |||
| End of changes. 192 change blocks. | ||||
| 2391 lines changed or deleted | 1059 lines changed or added | |||
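
The segment processing that Section 14.3 attributes to pages 65-66 of RFC 793, and the RST policy that section proposes, can be compared with a minimal Python sketch. This is an illustration under stated assumptions, not code from either draft or from any TCP implementation: the names `State`, `Reply`, `rfc793_reply`, `hardened_reply`, and `leaks_port_state` are hypothetical, and only the SYN, ACK, and RST bits are modeled because they are the only bits the quoted rules test.

```python
from enum import Enum


class State(Enum):
    CLOSED = "CLOSED"   # the "fictional" state: no TCP bound to the port
    LISTEN = "LISTEN"   # a process is waiting for incoming connections


class Reply(Enum):
    RST = "RST"
    SYN_ACK = "SYN/ACK"
    DROP = "silently dropped"


def rfc793_reply(state, syn=False, ack=False, rst=False):
    """Reply chosen by the RFC 793 rules summarized in Section 14.3."""
    if rst:
        return Reply.DROP       # an incoming RST is discarded in both states
    if state is State.CLOSED:
        return Reply.RST        # CLOSED: anything that is not an RST gets an RST
    if ack:
        return Reply.RST        # LISTEN: a segment carrying an ACK gets an RST
    if syn:
        return Reply.SYN_ACK    # LISTEN: a SYN starts the handshake
    return Reply.DROP           # LISTEN: no SYN, ACK, or RST -> implicit drop


def hardened_reply(state, syn=False, ack=False, rst=False):
    """Same rules, but LISTEN answers non-SYN, non-RST segments with an RST."""
    reply = rfc793_reply(state, syn=syn, ack=ack, rst=rst)
    if state is State.LISTEN and not syn and not rst:
        return Reply.RST
    return reply


def leaks_port_state(reply_fn, **flags):
    """True if the same probe is answered differently in CLOSED and LISTEN."""
    return reply_fn(State.CLOSED, **flags) is not reply_fn(State.LISTEN, **flags)


if __name__ == "__main__":
    # FIN, NULL, and XMAS probes all have SYN, ACK, and RST clear.
    print("RFC 793 rules leak port state:", leaks_port_state(rfc793_reply))
    print("Hardened rules leak port state:", leaks_port_state(hardened_reply))
```

In this model a probe with the ACK bit set, such as the FIN+ACK segment of the Maimon scan, is answered with an RST in both states under the RFC 793 rules, which matches the statement in Section 14.4 that the scan only works against implementations with a programming flaw.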