Internet Draft                                                   C. Burri
                                                       Synecta Informatik
                                                         Expires Jan 2002

                  Handling IRC continuation message lines
              draft-burri-irc-continuation-message-lines-00.txt

Status of this Memo

   This document is an Internet-Draft and is subject to
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use
   Internet-Drafts as reference material or to cite them other than
   as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Abstract

   Due to the way the IRC protocol is implemented, a client may
   receive incomplete messages from a server; the data that completes
   such a message arrives as a so-called continuation message line.

   There seems to be confusion about how to handle continuation
   message lines; many implementations are broken and do not respect
   them at all. Others rely on timers to complete continuation lines,
   which is not recommended due to the asynchronous nature of IRC
   communications.

   This Memo proposes an algorithm to handle continuation lines
   received from an IRC connection in such a way that no timers are
   needed. It is intended as a supplement to the existing RFC 1459,
   which describes the Internet Relay Chat protocol.

Copyright Notice

   Copyright (C) The Internet Society (2001). All Rights Reserved.

Table of Contents

   1. Introduction
   2. Handling continuation message lines
      2.1. IRC Message Format
      2.2. Discovery
      2.3. Reassembly
   3. Expiration Notice
   4. Credits and Authors' Address

1. Introduction

   Current IRC implementations utilize input and output buffers for
   asynchronous network I/O, where the input buffers are always
   processed first. All output is placed in the send queue and is not
   sent to the client until processing of the input buffer has
   completed. This process helps TCP build larger packets, as possibly
   multiple messages are bundled into one network transmission (TCP
   segment). For more information, consult RFC 1459, Sections 8.2 and
   8.3.

   The same process can, however, lead to incomplete messages, which
   are cropped due to TCP limitations, namely the TCP window size. Such
   lines appear incomplete to the client, which does not normally cause
   any problems by itself. However, the line that follows the truncated
   line will be incomplete too, containing only the data that did not
   fit within the previous TCP segment. This line is special in that,
   if it is treated like a normal line, arbitrary data may be parsed
   as a complete message from the server.

   This circumstance has been observed to cause problems in various IRC
   clients. Most of them seem to completely ignore the existence of the
   problem, which may possibly result in severe brain damage, or even
   loss of chanop status in case the broken implementation is being
   used in a bot that maintains an IRC channel. The cause is that it is
   not clearly defined what happens when a continuation message line is
   received. This undefined behaviour could possibly be exploited by
   making the implementation believe that it received a message from
   one origin when in fact the true origin is spoofed.
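
   The effect described above can be illustrated with a minimal Python
   sketch. Everything in it is hypothetical: the helper names
   naive_split and prefix_of, the nicknames, and the host names are
   assumptions made for illustration and come neither from RFC 1459 nor
   from any real client. The sketch treats the result of each network
   read as a list of complete lines, which is the broken behaviour
   this Memo warns about.

```python
from typing import List, Optional


def naive_split(chunk: bytes) -> List[bytes]:
    """Broken approach: treat every piece of one read as a complete line."""
    return [piece for piece in chunk.split(b"\r\n") if piece]


def prefix_of(line: bytes) -> Optional[bytes]:
    """Return the message prefix (the claimed origin), if the line has one."""
    if line.startswith(b":"):
        return line[1:].split(b" ", 1)[0]
    return None


# Hypothetical traffic: one PRIVMSG whose text was chosen by user "mallory",
# delivered to the client in two reads and cut right after "hi ".
first_read = b":mallory!m@host.example PRIVMSG #chan :hi "
second_read = b":irc.example.net MODE #chan +o mallory\r\n"

messages = naive_split(first_read) + naive_split(second_read)
for line in messages:
    print(prefix_of(line), line)
```

   With these inputs the second piece is handed to the parser as a
   separate message whose prefix is irc.example.net, although it is
   only the tail of text that mallory sent to the channel.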

   The algorithm proposed in this Memo has been designed to reassemble
   continuation message lines before processing them in the message
   parser. It does this without the use of any timers or delays. Timers
   could lead to loss of data if the reassembly timeout has been set
   too low, or to undesirably high delays in reading from the network
   if the reassembly timeout has been chosen too high.

   The proposed algorithm relies on the facts that:

            - for any incomplete message line, there will be a
              resulting completion message line received in the next
              TCP segment that is read from the network.

            - the transmission of the completion message line following
              the incomplete line will always occur before any other
              network transmission occurs, or in other words, the
              completion message line will always be the first line of
              the next delivered TCP segment.

   The proposed algorithm does not depend on any particular programming
   language. Instead, it is designed to work with every programming
   language that provides buffers (variables) and comparison tests.

   It can be implemented as a preprocessor that is located in front of
   the IRC message parser in the data stream.

   The author further notes that the proposed algorithm is in the
   public domain and may be freely used and implemented without paying
   any fee whatsoever to anyone.

2. Handling continuation message lines

   In order to reassemble continuation message lines, truncated lines
   must be detected in a reliable way. Once detected, they need to be
   marked as incomplete and stored in a temporary buffer for later
   reassembly.

2.1. IRC Message Format

   IRC RFC 1459 defines the message format as follows:

   <message>  ::= [':' <prefix> <SPACE> ] <command> <params> <crlf>
   <prefix>   ::= <servername> | <nick> [ '!' <user> ] [ '@' <host> ]
   <command>  ::= <letter> { <letter> } | <number> <number> <number>
   <SPACE>    ::= ' ' { ' ' }
   <params>   ::= <SPACE> [ ':' <trailing> | <middle> <params> ]
   <middle>   ::= <Any *non-empty* sequence of octets not including
                   SPACE or NUL or CR or LF, the first of which may
                   not be ':'>
   <trailing> ::= <Any, possibly *empty*, sequence of octets not
                   including NUL or CR or LF>

   <crlf>     ::= CR LF

   As we can see from the above representation, every complete IRC
   message must end in the sequence <crlf>. Section 8 of RFC 1459 also
   reports the usage of either CR *or* LF as message delimiter.

   It might be a good idea for any implementation to accept all three
   variants; for the sake of simplicity, however, we will refer to CRLF
   as the message delimiter in this document.
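
   A lenient splitter that accepts all three delimiter variants could
   look like the following Python sketch. The name split_lines and the
   decision to drop empty pieces are assumptions of this illustration,
   not requirements taken from RFC 1459. The rest of this document
   still assumes CRLF only.

```python
import re
from typing import List

# Try CRLF first so that a CR LF pair counts as one delimiter, not two.
_DELIMITER = re.compile(rb"\r\n|\r|\n")


def split_lines(data: bytes) -> List[bytes]:
    """Split on CRLF, bare CR, or bare LF and drop empty pieces."""
    return [piece for piece in _DELIMITER.split(data) if piece]


lines = split_lines(b"PING :a\r\nPING :b\nPING :c\r")
```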

2.2. Discovery

   Once we know how IRC messages are delimited, we can check any IRC
   message line for completeness. Any complete line must end in the line
   delimiter sequence. If a given IRC message line does not end in that
   sequence, then it must have been truncated.

   To speed up performance, the proposed discovery algorithm performs
   the end delimiter test on each received TCP segment instead of on
   each received line, since each received TCP segment must also end in
   a CRLF sequence if its last line has not been truncated by the
   sending TCP.

   Notice that we are not looking for continuation lines, since the
   algorithm cannot recognize a continuation line by itself. That is
   impossible because of the arbitrary structure of that line. The
   line might in fact be composed of data that has been sent to either
   a channel or to the client via PRIVMSG or other means.

   The solution to this problem is to recognize the lines that have
   been truncated and store them for reassembly instead of parsing
   them. The discovery of a truncated line is also used to recognize
   the following line as the continuation message line.

   Thus, the discovery algorithm sets a flag whenever a TCP segment
   that contains a truncated line is received.
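
   The end delimiter test can be sketched in a few lines of Python.
   The function name is_truncated is hypothetical, and treating the
   bytes returned by a single read as one "TCP segment" is an
   assumption of this sketch. An empty read is reported as not
   truncated, which is a further assumption.

```python
CRLF = b"\r\n"


def is_truncated(segment: bytes) -> bool:
    """Discovery flag: True if the last line of the segment lacks CRLF."""
    return bool(segment) and not segment.endswith(CRLF)


complete_flag = is_truncated(b"PING :a\r\nPING :b\r\n")
truncated_flag = is_truncated(b"PING :a\r\nPIN")
```

   Only the last two octets of the segment are inspected, so the cost
   of the test does not grow with the number of lines in the segment.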

2.3. Reassembly

   After receiving a TCP segment and performing discovery on it, the
   preprocessor checks a buffer for the presence of any data. This
   buffer holds data if the previous segment contained a truncated line
   (described below). If there is data in the buffer, the preprocessor
   appends the received data to the data in the buffer (this
   effectively reassembles the truncated line and the continuation
   line), clears the buffer (so it is empty after reassembly has taken
   place), and passes the resulting data to the next step.

   The next step parses the received data segment into IRC messages by
   splitting the data on each occurrence of the delimiter sequence CRLF.

   If the received, tested, possibly reassembled, and then split data
   did not contain an incomplete last line (the discovery flag was not
   set), then the preprocessor calls the message parser normally for
   each received line.

   If the received and split data has been marked as incomplete
   (discovery flag set), then the preprocessor calls the IRC message
   parser for each received message (line) except the last one (which
   lacks the message delimiter because it is incomplete). Instead of
   passing the incomplete line to the message parser, the preprocessor
   resets the discovery flag set by the end-delimiter test and
   temporarily stores the incomplete line in the buffer until arrival
   of the next TCP segment.
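
   The steps of Sections 2.2 and 2.3 can be combined into a small
   preprocessor, sketched here in Python. The class name
   ContinuationPreprocessor, the method name feed, and the callback
   parse_message are hypothetical names chosen for this illustration.
   The sketch assumes that each call to feed receives the bytes of one
   network read and that CRLF is the only delimiter. It is one possible
   reading of the algorithm, not a reference implementation.

```python
from typing import Callable, List

CRLF = b"\r\n"


class ContinuationPreprocessor:
    """Sits in front of the message parser and reassembles truncated lines."""

    def __init__(self, parse_message: Callable[[bytes], None]) -> None:
        self._parse_message = parse_message
        self._buffer = b""  # holds a truncated line from the previous segment

    def feed(self, segment: bytes) -> None:
        # Discovery: flag the segment if its last line lacks the delimiter.
        truncated = not segment.endswith(CRLF)

        # Reassembly: join any stored truncated line with the new data,
        # then empty the buffer.
        data = self._buffer + segment
        self._buffer = b""

        # Split into lines; the final piece is whatever follows the last CRLF.
        lines = data.split(CRLF)
        tail = lines.pop()
        if truncated:
            # Store the incomplete line instead of parsing it.
            self._buffer = tail

        for line in lines:
            if line:  # skip empty lines
                self._parse_message(line)


received: List[bytes] = []
pre = ContinuationPreprocessor(received.append)
pre.feed(b":mallory!m@host.example PRIVMSG #chan :hi ")
pre.feed(b":irc.example.net MODE #chan +o mallory\r\nPING :1\r\nPI")
pre.feed(b"NG :2\r")
pre.feed(b"\n")
```

   After these four calls the parser has been invoked three times: once
   with the full PRIVMSG, whose text contains the MODE string, and once
   each for the two PING messages. No timer is involved; an incomplete
   line simply stays in the buffer until the next read arrives.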

3. Expiration Notice

   This document expires in Jan 2002.

4. Credits and Authors' Address

   Christian Burri
   jun. System & Network Engineer
   Synecta Informatik AG
   Zwinglistrasse 3
   9000 St. Gallen
   SWITZERLAND

   Email: christian.burri@synecta.ch

   Special credits to Jarkko Oikarinen and all other contributors for
   creating such a cool thing as IRC :)