Network Working Group                                Bellovin and Leech
Internet Draft                                       AT&T Labs Research
Expiration Date: December 2001                               March 2000


                        ICMP Traceback Messages

                        draft-ietf-itrace-00.txt


1. Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


2. Abstract

   It is often useful to learn the path that packets take through the
   Internet, especially when dealing with certain denial-of-service
   attacks.  We propose a new ICMP [RFC792] message, emitted randomly by
   routers along the path and sent to the destination.


3. Introduction

   It is often useful to learn the path that packets take through the
   Internet.  This is especially important for dealing with certain
   denial-of-service attacks, where the source IP is forged.  There are
   other uses as well, including path characterization and detection of
   asymmetric routes.  There are existing tools, such as traceroute, but
   these generally provide the forward path, not the reverse.

   We propose an ICMP Traceback message to help solve this problem.
   When forwarding packets, routers can, with a low probability,
   generate a Traceback message that is sent along to the destination.
   With enough Traceback messages from enough routers along the path,
   the traffic source and path can be determined.


3.1. Requirements Keywords

   The keywords "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT",
   and "MAY" that appear in this document are to be interpreted as
   described in [RFC2119].


4. Message Definition

   A router implementing this scheme SHOULD generate and emit an ICMP
   Traceback packet with probability of about 1/20,000, although local
   site policy MAY adjust this to better suit local link utilization
   metrics.

   The message is carried in an ICMP packet, with ICMP TYPE of TRACEBACK
   and ICMP CODE of NOTIFY.  (The numeric values for these fields will
   be assigned by IANA.)

   Any ICMP TRACEBACK message contains individual elements that are
   self-identifying, using a TAG,LENGTH,VALUE scheme as follows:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      TAG      |            LENGTH             |   VALUE...    .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Elements may appear in any order, and a receiver MUST be capable of
   processing elements in any order.

   The TAG field is a single octet, with values defined below.

   LENGTH is always set to the length of the VALUE field, and always
   occupies two octets, even when the length of the VALUE field is less
   than 256 octets.


4.1. Link Fields

   The purpose of the link fields is to permit easy construction of a
   chain of Traceback messages.  They are further designed for
   examination by network operations personnel, and thus contain
   human-useful information such as interface names.

   The subfields of a link field are always arranged in "forward order".
   Each subfield is a separate TLV within the link field TLV.  That is,
   the "destination" subfield is always the address of the router closer
   to the ultimate recipient of the traceback packet.
   Thus, on back link packets, the generator's own address is the
   destination; on forward link packets, the generator's address is the
   source address.

   A link field consists of three subfields: the interface name of the
   generator (it is assumed that the generator does not know its
   neighbors' interface names), the source and destination IP addresses
   of the two routers (with appropriate IPv4/IPv6 indicators), and the
   link-level association string.

   The association string is an opaque blob that is known to and used by
   both routers.  On LANs, it is constructed by concatenating the source
   and destination MAC addresses of the pair of machines.  If there are
   no such addresses (say, for a point-to-point link), a suitable string
   MUST be provisioned in both routers.  This field is used to tie
   together Traceback messages emitted by adjacent routers.

   Recipients SHOULD use the TTL field differences in conjunction with
   the link fields to verify the chain.


4.1.1. Back Link (TAG=0x01)

   This is a compound element, which may contain one or more MAC address
   elements, IPv4 address elements, IPv6 address elements, and
   Vendor-defined elements.  It is intended to provide identifying
   information, from the perspective of the router, about the link that
   the traced packet arrived from.

   This element MUST contain an Interface Name element.

   Address elements must appear in pairs, with the first in the pair
   being the "source" and the second in the pair being "destination"
   (see below).


4.1.2. Forward link (TAG=0x02)

   This element is a compound element that can contain the same elements
   as the Back Link element.  It is intended to provide identifying
   information, from the perspective of the router, about the link that
   the traced packet was forwarded on.


4.1.3. MAC address pair (TAG=0x03)

   This element is usually contained within a Forward or Back link
   element, and contains two 6-octet IEEE MAC addresses of the
   corresponding link.


4.1.4.
IPv4 address pair (TAG=0x04)

   This element is usually contained within a Forward or Back link
   element, and contains two 4-octet IPv4 addresses of the corresponding
   link.


4.1.5. IPv6 address pair (TAG=0x05)

   This element is usually contained within a Forward or Back link
   element, and contains two 16-octet IPv6 addresses of the
   corresponding link.


4.1.6. Vendor-defined link identifier (TAG=0x06)

   This element is usually contained within the Forward or Back link
   element, and is an opaque field of varying length.  Further
   definition will emerge in a later document.


4.1.7. Interface name (TAG=0x07)

   This element is usually contained within the Forward or Back link
   element, and contains the interface name of the generating router.


4.2. Timestamp (TAG=0x08)

   This element contains the time, in NTP timestamp format, that the
   traced packet arrived at the router.


4.3. Traced packet (TAG=0x09)

   This element provides the contents of the traced packet, as much as
   can reasonably fit, subject to link and router resource constraints.


4.4. Probability (TAG=0x0A)

   This element contains the inverse of the probability used to select
   the traced packet.  It appears as an unsigned integer, of one, two,
   or four octets.


4.5. RouterId (TAG=0x0B)

   This element contains opaque identifying information, useful to the
   organization that operates the router emitting the ITRACE message.


4.6. Public-key Information (TAG=0x0C)

   This element contains a URL, pointing to an XML page that contains
   the public key used to sign key-disclosure elements.


4.7.
Key disclosure list (TAG=0x0D)

   This element contains one or more key disclosure elements constructed
   as follows:

   algorithm identifier, one octet:
      PKCS7-RSA-MD5, ????-DSS-SHA1

   keyid: eight octets

   validity: two NTP timestamps giving validity period (start, end)

   key length: one octet

   key material: variable [key length] octets
      Keying material for the chosen HMAC function MUST conform to the
      requirements for keys outlined in [RFC2104].

   siglength: two octets.  Unsigned integer number of octets of
      signature.

   signature: variable [siglength] octets
      This field is variable, depending on the selected signature
      algorithm and format.  The signature covers the entire key
      disclosure element, less the signature field itself.


4.8. Authentication

   Some requirements are imposed on the IP header of the Traceback
   message.  In particular, the source address SHOULD be that associated
   with the interface on which the packet arrived.  If that interface
   has multiple addresses, the address chosen SHOULD, if possible, be
   the one by which this router is known to the previous hop.  If the
   interface has no IP address, the "primary" IP address associated with
   the router MAY be used.  ("Primary" is discussed below.)

   The initial TTL field MUST be set to 255.  If the Traceback packet
   follows the same path as the data packets, this provides an
   unambiguous indication of the distance from this router to the
   destination.  More importantly, by comparing the distances with the
   link fields, a chain can be constructed and partially verified even
   without examining the authentication fields.


4.9. Authentication data

   An attacker may try to generate fake Traceback messages, primarily to
   conceal the source of the real attack traffic, but also to act as
   another form of attack.  We thus need authentication techniques that
   are robust but quite cheap to verify.

   The ideal form of authentication would be a digital signature.
   It is unlikely, though, that routers will be able to afford such
   signatures on all Traceback packets.  Thus, although we leave hooks
   for such a variant, we do not further define it at this time.


4.9.1. HMAC Authentication data (TAG=0x0E)

   This element contains three subfields:

   algorithm, one octet:
      HMAC-MD5-128, HMAC-MD5-96, HMAC-SHA1-160, HMAC-SHA1-96

   keyid: eight-octet key identifier

   MAC data: variable
      The MAC data field covers the entire IP datagram, including header
      information.  Where header information is mutable during
      transport, such information is set to zero (0x00) for purposes of
      calculating the HMAC.  This field is as long as is appropriate for
      the given MAC algorithm.


4.9.2. Key Disclosure

   A packet SHOULD contain a list of recently-used keys for hash
   algorithms.  Each key is a separate TLV within the keylist TLV;
   within the key TLV, there are subfields for original use time (in NTP
   format), lifetime in seconds (two bytes), algorithm identifier (one
   byte), and key (the balance of the field).


4.9.3. PKI Requirements

   Digital signatures are useless without some way of authenticating the
   public key of the signer.  The ideal form of authentication would be
   a certificate-based scheme rooted in the address registries.  That
   is, the registries are the authoritative source of information on who
   owns which addresses; they are thus the only party that can easily
   issue such certificates.

   Until such a PKI is in existence, we suggest that each ISP publish
   its own root public key.  Current registry-based databases can be
   used to verify the owner of an address block; this information can in
   turn be used to locate the appropriate root key.

   The public-key information element can be used to discover the
   appropriate public keys, and other related information.


5.
Implementation Requirements

   The probability of Traceback generation SHOULD be adjustable by the
   operator of the router.  A default value of about 1/20000 is
   suggested.  If the average maximum diameter of the Internet is 20
   hops, that translates to a net increase in traffic at the destination
   of about 0.1%; this should not be an undue burden on the recipient.
   The probability SHOULD NOT be greater than 1/1000.

   Packet selection SHOULD be based on a pseudo-random number, rather
   than a simple counter.  This will help block attempts to time attack
   bursts.  There does not appear to be any requirement for
   cryptographically strong pseudo-random numbers.

   A suggested scheme involves examination of the low-order bits of a
   linear congruential pseudo-random number generator.  If they are all
   set to 1, the packet should be emitted.  This permits easy selection
   of probabilities 1/8192, 1/16384, etc.

   N.B. While the low-order bits of LCPRNGs are not very random, that
   does not matter here.  As long as the period of the generator is
   maximal, all values, including all 1s in the low-order bits, will
   occur with the proper probability.

   Although this document describes a router-based implementation of
   Traceback messages, most of the functionality can be implemented via
   outboard devices.  For example, suitable laptop computers can be used
   to monitor LANs, and emit the traceback messages as appropriate, on
   behalf of all of the routers on that LAN.


6. Related Work

   Another scheme proposed for packet Traceback is by Savage et al.
   [SWKA00].  It relies on a very clever encoding of the path in the IP
   header's ID field.  That is, in-flight packets may have their ID
   field changed to provide information about the path.  The recipient
   can decode this information.

   There are a number of advantages of this compared to ICMP Traceback.
   No extra traffic is generated.
   More importantly, the trace information is bound to the packets, and
   hence doesn't follow a different path and isn't differentially
   blocked by firewalls or policy routing mechanisms.

   However, there are disadvantages as well.  For one thing, the ID
   field cannot be changed if fragmentation is necessary (though the
   authors propose some schemes to ameliorate this).  AH [RFC2402]
   provides cryptographic protection for the ID field; if it is
   modified, the packet will be discarded by the receiving system.  And
   IPv6 has no ID field at all.

   A number of other packet-marking schemes have been proposed.

   A different approach is hash-based traceback, by Snoeren et al.
   [SPSSJTK01].  In this scheme, routers along the path are queried
   about whether or not they have seen a certain packet; a very compact
   representation is used to store recent history.  The problem is that
   queries must be done very soon after the attack, unless the routers
   have some way of offloading historical data to bulk storage.

   [SDS00] describes a scheme for coupling intrusion detection systems.
   A sensor that detects an attack tells its neighbors; they in turn
   look for the same signature, and notify their neighbors.  The current
   prototype only works within an administrative domain; work is
   currently under way to produce an inter-domain version.


7. Security Considerations

   It is quite clear that this scheme cannot cope with all conceivable
   denial of service attacks.  It is limited to those where a
   significant amount of traffic is coming from a relatively small
   number of sources.  Furthermore, those sources must themselves be in
   some sense evil or corrupted.  An attack based on inducing innocent
   and uncorrupted machines to send traffic to the victim would be
   traceable only to these machines, and not to the real attackers.


8. Acknowledgements

   The ICMP Traceback message is the product of an informal research
   group; members include (in alphabetical order) Steven M.
   Bellovin, Matt Blaze, Bill Cheswick, Cory Cohen, Jon David, Jim
   Duncan, Jim Ellis, Paul Ferguson, John Ioannidis, Marcus Leech, Perry
   Metzger, Robert Stone, Vern Paxson, Ed Vielmetti, Wietse Venema.


9. References

   [RFC792]     "Internet Control Message Protocol".  J. Postel.
                September 1981.

   [RFC2104]    "HMAC: Keyed-Hashing for Message Authentication".
                H. Krawczyk, M. Bellare, R. Canetti.  February 1997.

   [RFC2119]    "Key words for use in RFCs to Indicate Requirement
                Levels".  S. Bradner.  March 1997.

   [RFC2402]    "IP Authentication Header".  S. Kent and R. Atkinson.
                November 1998.

   [SWKA00]     "Practical Network Support for IP Traceback".  Stefan
                Savage, David Wetherall, Anna Karlin and Tom Anderson.
                Department of Computer Science and Engineering,
                University of Washington, Technical Report
                UW-CSE-2000-02-01.
                http://www.cs.washington.edu/homes/savage/traceback.html.

   [SDS00]      "Infrastructure for Intrusion Detection and Response".
                D. Schnackenberg, K. Djahandari, and D. Sterne.
                Proceedings of the DARPA Information Survivability
                Conference and Exposition (DISCEX), Hilton Head Island,
                SC, January 25-27, 2000.

   [SPSSJTK01]  "Hash-Based IP Traceback".  A.C. Snoeren, C. Partridge,
                L.A. Sanchez, W.T. Strayer, C.E. Jones, F. Tchakountio,
                and S.T. Kent.  BBN Technical Memorandum No. 1284.
                http://www.ir.bbn.com/documents/techmemos/TM1284.ps.


10. Author Information

   Steven M. Bellovin, Editor
   AT&T Labs Research
   Shannon Laboratory
   180 Park Avenue
   Florham Park, NJ 07974
   USA
   Phone: +1 973-360-8656
   Email: smb@research.att.com

   Marcus D. Leech
   Nortel Networks
   P.O. Box 3511, Station C
   Ottawa, ON
   Canada, K1Y 4H7
   Phone: +1 613-763-9145
   Email: mleech@nortelnetworks.com
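
   As an illustration of the TAG,LENGTH,VALUE layout defined in the
   Message Definition section, the following Python sketch encodes and
   decodes elements, including a compound Back Link element whose VALUE
   is itself a sequence of TLVs.  It is not part of the specification.
   Several details are assumptions made only for this example: the
   two-octet LENGTH and the Probability integer are taken to be in
   network byte order (the text above does not state a byte order), the
   interface name is taken to be ASCII, and the function names, the
   interface name "eth0", and the addresses are hypothetical.

```python
import struct
from typing import Iterator, Tuple

# Tag values taken from the element definitions in Section 4.
TAG_BACK_LINK = 0x01
TAG_IPV4_PAIR = 0x04
TAG_IFNAME = 0x07
TAG_PROBABILITY = 0x0A

# ASSUMPTION: LENGTH is in network byte order (big-endian); the draft
# does not state the byte order explicitly.
_HEADER = struct.Struct("!BH")  # one-octet TAG, two-octet LENGTH


def encode_tlv(tag: int, value: bytes) -> bytes:
    """Return TAG, LENGTH, VALUE; LENGTH counts only the VALUE octets."""
    return _HEADER.pack(tag, len(value)) + value


def decode_tlvs(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (tag, value) pairs in the order they appear in data."""
    offset = 0
    while offset < len(data):
        if len(data) - offset < _HEADER.size:
            raise ValueError("truncated TLV header")
        tag, length = _HEADER.unpack_from(data, offset)
        offset += _HEADER.size
        if len(data) - offset < length:
            raise ValueError("truncated TLV value")
        yield tag, data[offset:offset + length]
        offset += length


def build_example() -> bytes:
    """Build a Back Link element followed by a Probability element."""
    # Hypothetical interface name; ASCII encoding is an assumption.
    ifname = encode_tlv(TAG_IFNAME, b"eth0")
    # Hypothetical addresses; source first, destination second.
    source = bytes([192, 0, 2, 1])
    destination = bytes([192, 0, 2, 2])
    pair = encode_tlv(TAG_IPV4_PAIR, source + destination)
    # A compound element carries its subfields as nested TLVs.
    back_link = encode_tlv(TAG_BACK_LINK, ifname + pair)
    # Inverse probability 20000 as a two-octet unsigned integer
    # (byte order is an assumption, as noted above).
    probability = encode_tlv(TAG_PROBABILITY, struct.pack("!H", 20000))
    return back_link + probability


if __name__ == "__main__":
    for tag, value in decode_tlvs(build_example()):
        print(f"tag=0x{tag:02X} length={len(value)}")
        if tag == TAG_BACK_LINK:
            for sub_tag, sub_value in decode_tlvs(value):
                print(f"  tag=0x{sub_tag:02X} value={sub_value.hex()}")
```

   Because elements may appear in any order, the decoder in this sketch
   simply walks the buffer and leaves it to the caller to dispatch on
   the tag; a receiver would apply the same walk recursively to the
   VALUE of each Back Link or Forward Link element.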