TCP Maintenance and Minor                                M. Jethanandani
Extensions                                                 Cisco Systems
Internet-Draft                                                M. Bashyam
Intended status: Informational                      Ocarina Systems, Inc
Expires: August 11, 2007                                February 7, 2007


                  draft-mahesh-persist-timeout-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 11, 2007.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This informational document describes how a connection can get stuck
   in persist state and the implications for the system if there is no
   mechanism to time out this state.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Jethanandani & Bashyam   Expires August 11, 2007                [Page 1]

Internet-Draft   Improving TCP robustness in persist state   February 2007

Table of Contents

   1.  Introduction
   2.  Solution
   3.  Role of Application
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgements
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  An Appendix
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   RFC 1122 [RFC1122] Section 4.2.2.17, page 92, says:

      A TCP MAY keep its offered receive window closed indefinitely.  As
      long as the receiving TCP continues to send acknowledgments in
      response to the probe segments, the sending TCP MUST allow the
      connection to stay open.

   The RFC goes on to say that it is important to remember that ACK
   (acknowledgement) segments that contain no data are not reliably
   transmitted by TCP.  Therefore, zero window probing SHOULD be
   supported to prevent a connection from hanging forever if an ACK
   segment that re-opens the window is lost.

   While it is clear why the sender needs to continue to probe the
   receiver, it is not clear why this process needs to be indefinite,
   particularly if the receiver reliably responds with an ACK and a
   window of zero.

   The particular situation we ran into was with a gaming client that
   received regular updates of the ongoing game from the server.  At
   some point the client decided to pause the game, effectively telling
   the application to stop reading data from the TCP connection.
   Another example of such a setup is HTTP-based Web conferencing.

   The effect of a client that stops reading data is that the server
   continues to send data until the advertised window drops to zero, at
   which point the connection enters persist state.  Since the server
   has more buffered data for the client, it will continue to probe the
   receiver.  However, it is not clear what the sender is supposed to do
   if the receiver never exits this state.  If the sender is servicing
   several such clients, the effect compounds to the point where the
   system runs out of buffers and/or connection resources.  The
   situation therefore lends itself to a DoS attack, especially because
   legitimate connections get dropped or start seeing degraded service.

   It is quite possible that the receiving end enters the persist state
   by advertising a zero window and that all subsequent window probes
   result in a zero window being advertised to the sender.  This could
   result in the sender holding on to a large number of buffers and a
   large amount of data.

   The problem applies to TCP and to TCP-derived transport protocols
   such as SCTP.

2.  Solution

   The current behavior of a connection in persist state SHALL remain
   the default.  We propose an option that places an upper bound on the
   persist state, expressed either as an absolute time limit or as a set
   number of retries.

   To enable an upper bound on the persist state, the administrator MAY
   configure an option.  The option SHOULD be configured as a time or as
   a number of retries.  If both are configured, whichever limit is
   reached first takes effect.

   If the configured option is a time, it specifies how long the
   connection is allowed to stay in persist state.  This option is
   called persist-state-expiry-time.  When the connection enters persist
   state, i.e., the receiver advertises a window of zero, the current
   time is saved in the connection entry.
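
   As a rough illustration only, the bookkeeping for a bounded persist
   state might be sketched as follows.  The class, field, and method
   names are hypothetical and do not come from any TCP stack.  The
   sketch covers the time bound introduced above and, for completeness,
   the retry bound and the expiry check that the remainder of this
   section describes.  It assumes the retry count is incremented once
   per window probe sent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersistConfig:
    """Administrator-set bounds; None means that bound is not configured."""
    expiry_time: Optional[float] = None   # persist-state-expiry-time (seconds)
    expiry_retries: Optional[int] = None  # persist-state-expiry-retries


@dataclass
class Connection:
    """Hypothetical connection entry holding the persist bookkeeping."""
    persist_entry_time: Optional[float] = None  # None: not in persist state
    persist_retry_count: int = 0

    def on_window(self, window: int, now: float) -> None:
        """Record entry into persist state, or clear it on a non-zero window."""
        if window > 0:
            self.persist_entry_time = None
            self.persist_retry_count = 0
        elif self.persist_entry_time is None:
            # First zero window: save the entry time and clear the count.
            self.persist_entry_time = now
            self.persist_retry_count = 0

    def on_probe_sent(self) -> None:
        """Count one window probe (assumed meaning of a 'retry')."""
        self.persist_retry_count += 1

    def should_reset(self, cfg: PersistConfig, now: float) -> bool:
        """True once any configured bound is exceeded, whichever comes first."""
        if self.persist_entry_time is None:
            return False
        timed_out = (cfg.expiry_time is not None
                     and now - self.persist_entry_time > cfg.expiry_time)
        out_of_retries = (cfg.expiry_retries is not None
                          and self.persist_retry_count > cfg.expiry_retries)
        return timed_out or out_of_retries
```

   With neither bound configured, should_reset never returns true, which
   matches the default of an unbounded persist state.  A caller would
   run the check when the persist timer expires or when an ACK arrives
   that still advertises a zero window.
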
   The saved value is called persist-entry-time.  Thereafter, every time
   the persist timer expires (before it is set again), or when an ACK is
   received that continues to advertise a zero window, a check is made
   that the difference between the current time and persist-entry-time
   does not exceed persist-state-expiry-time.  If it does, the
   connection is reset and the connection resources are reclaimed by
   TCP.  If the receiver advertises a non-zero window at any time after
   the connection has entered persist state and before the connection is
   reset, persist-entry-time is cleared.

   If the configured option is a number of retries, it specifies how
   many retries are made before the connection is aborted.  This option
   is called persist-state-expiry-retries.  When the connection enters
   persist state, i.e., the receiver advertises a window of zero, the
   retry count in the connection entry, called
   persist-state-retry-count, is cleared.  Thereafter, every time the
   persist timer expires (before it is set again), or when an ACK is
   received that continues to advertise a zero window, a check is made
   that persist-state-retry-count does not exceed
   persist-state-expiry-retries.  If it does, the connection is reset
   and the connection resources are reclaimed by TCP.  Otherwise,
   persist-state-retry-count is incremented by one.  If the receiver
   advertises a non-zero window at any time after the connection has
   entered persist state and before the connection is reset,
   persist-state-retry-count is cleared.  The
   persist-state-expiry-retries option is coarser grained than the
   persist-state-expiry-time option.

3.  Role of Application

   In order to understand whether the application can play a role in
   solving this problem, one needs to understand the current behavior of
   applications vis-a-vis TCP.

   Applications today do not know whether a connection is stuck in
   persist state.  In most cases the application is not even aware of
   why TCP is not sending any more data.  It cannot distinguish between
   packets being dropped because of network issues and the send window
   not advancing because the other end has closed the window.

   Keeping the application apprised of what is causing the problem takes
   care of only that particular connection and that particular
   application.  It does not take care of all applications and all
   connections that might be in persist state.

   TCP in most cases will not signal that a connection is blocked.  This
   is particularly true if there are buffers available or the
   application has no more data to send.  If the application were to
   poll TCP for the information, it is not clear how often it would need
   to poll.  As described before, TCP may stop sending data for several
   reasons, and in most cases the polling will show that the connection
   is not even in persist state.

   It is quite possible that the application that encounters the problem
   has not implemented a way to detect and close the connection.  Since
   the impact of a connection in persist state is system wide, all
   applications have to implement the option for the solution to be
   effective.  Even one application that has not implemented the option
   can cause the entire system to be impacted.  It is also not possible
   to get every application to implement detection of persist state and
   turn on the option.

   It is also possible for applications to write data and exit before
   the data is sent.  An example of such an application is an HTTP
   server.  When an HTTP server receives an HTTP request such as a GET,
   the server responds with data and closes the socket even before TCP
   has finished sending all the data.
   In that case, TCP has no application it can inform to take action on
   a connection stuck in persist state.

   There are cases where the system is application agnostic.  A classic
   case is a TCP proxy.  There, no end application exists that can be
   informed of the state of the connection and take action.

   Resources like TCP buffers are system-wide resources and are not tied
   to any particular application.  TCP needs to be able to monitor
   buffer usage on a per-connection basis in order to detect and drop
   packets on connections that are taking up a lot of buffers.  TCP
   cannot rely on an application to perform the task of looking at
   buffers system wide.

   Therefore, we believe applications have at best a limited role to
   play in solving this problem.

   TCP already keeps track of connections in persist state.  It is in a
   central position to look at this state system wide.  The advantage of
   doing this in TCP is that, once enabled, the entire system, including
   all the applications, benefits.  Moreover, resources like buffers,
   which are system wide, can be monitored by TCP to determine when to
   reset a connection and reclaim the resources.  The code change
   required to time bound persist state is minimal and easy to
   implement.

4.  IANA Considerations

   This document makes no request of IANA.

5.  Security Considerations

   This document discusses one security consideration: the possible
   denial-of-service attack discussed in Section 1.

6.  Acknowledgements

   Thanks to Anantha Ramiah for providing feedback on this draft.

7.  References

7.1.  Normative References

   [RFC1122]  Braden, R., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122, October 1989.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

7.2.  Informative References

Appendix A.  An Appendix

Authors' Addresses

   Mahesh Jethanandani
   Cisco Systems
   170 West Tasman Drive
   San Jose, California  95134
   USA

   Phone: +1-408-527-8230
   Fax:   +1-408-527-0147
   Email: mahesh@cisco.com
   URI:   www.cisco.com


   Murali Bashyam
   Ocarina Systems, Inc
   Fremont, CA
   USA

   Email: mbashyam@ocarinatech.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).