idnits 2.17.1 

draft-sparks-sip-nit-problems-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 253 has weird spacing: '...his can  ultim...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (February 6, 2004) is 7385 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? '1' on line 343 looks like a reference

  -- Missing reference section? '2' on line 347 looks like a reference

  -- Missing reference section? '3' on line 350 looks like a reference


     Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                          R. Sparks
3	Internet-Draft                                               dynamicsoft
4	Expires: August 6, 2004                                 February 6, 2004

6	 Problems identified associated with the Session Initiation Protocol's
7	                         non-INVITE Transaction
8	                    draft-sparks-sip-nit-problems-00

10	Status of this Memo

12	   This document is an Internet-Draft and is in full conformance with
13	   all provisions of Section 10 of RFC2026.

15	   Internet-Drafts are working documents of the Internet Engineering
16	   Task Force (IETF), its areas, and its working groups. Note that other
17	   groups may also distribute working documents as Internet-Drafts.

19	   Internet-Drafts are draft documents valid for a maximum of six months
20	   and may be updated, replaced, or obsoleted by other documents at any
21	   time. It is inappropriate to use Internet-Drafts as reference
22	   material or to cite them other than as "work in progress."

24	   The list of current Internet-Drafts can be accessed at
25	   http://www.ietf.org/ietf/1id-abstracts.txt.

27	   The list of Internet-Draft Shadow Directories can be accessed at
28	   http://www.ietf.org/shadow.html.

30	   This Internet-Draft will expire on August 6, 2004.

32	Copyright Notice

34	   Copyright (C) The Internet Society (2004). All Rights Reserved.

36	Abstract

38	   This draft describes several problems that have been identified with
39	   the Session Initiation Protocol's non-INVITE transaction.

41	Table of Contents

43	   1.  Problems under the current specifications
44	   1.1 NITs must complete immediately or risk losing a race
45	   1.2 Provisional responses can delay recovery from lost final
46	       responses
47	   1.3 Delayed responses will temporarily blacklist an element
48	   1.4 408 for non-INVITE is not useful
49	   1.5 Non-INVITE timeouts doom forking proxies
50	   1.6 Mismatched timer values make winning the race harder
51	   2.  Acknowledgments
52	       References
53	       Author's Address
54	       Intellectual Property and Copyright Statements

56	1. Problems under the current specifications

58	   There are a number of unpleasant edge conditions created by the SIP
59	   non-INVITE transaction model's fixed duration. The negative aspects
60	   of some of these are exacerbated by the effect provisional responses
61	   have on the non-INVITE transaction state machines as currently
62	   defined.

64	1.1 NITs must complete immediately or risk losing a race

66	   The non-INVITE transaction defined in RFC 3261 [1] is designed to
67	   have a fixed and finite duration (dependent on T1). A consequence of
68	   this design is that participants must strive to complete the
69	   transaction as quickly as possible. Consider the race condition shown
70	   in Figure 1.

72	                      UAC           UAS
73	                       |   request   |
74	                  ---  |---.         |
75	                   ^   |    `---.    |
76	                   |   |         `-->|  ---
77	                   |   |             |   ^
78	                   |   |             |   |
79	                 64*T1 |             |   |
80	                   |   |             |   |
81	                   |   |             | 64*T1
82	                   |   |             |   |
83	                   |   |             |   |
84	                   v   |             |   |
85	     timeout <=== ---  |   200 OK    |   |
86	                       |         .---|   v
87	                       |    .---'    |  ---
88	                       |<--'         |

90	                      Figure 1: NI Race Condition

92	   The UAS in this figure believes it has responded to the request in
93	   time, and that the request succeeded. The UAC, on the other hand,
94	   believes the request has timed out, hence failed. No longer having a
95	   matching client transaction, the UAC core will ignore what it
96	   believes to be a spurious response. As far as the UAC is concerned,
97	   it received no response at all to its request. The ultimate result is
98	   the UAS and UAC have conflicting views of the outcome of the
99	   transaction.

101	   Therefore, a UAS cannot wait until the last possible moment to send a
102	   final response within a NIT. It must, instead, send its response so
103	   that it will arrive at the UAC before that UAC times out.
104	   Unfortunately, the UAS has no way to accurately measure the
105	   propagation time of the request or predict the propagation time of
106	   the response. The uncertainty it faces is compounded by each proxy
107	   that participates in the transaction. Thus, the UAS's only choice is
108	   to send its final response as soon as it possibly can and hope for
109	   the best.
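
   The race can be made concrete with a small sketch. It is only an
   illustration: the function name and its three inputs (one-way
   request latency, time the UAS takes to answer, one-way response
   latency) are assumptions introduced here, and T1 is taken to be the
   RFC 3261 default of 500 ms.

```python
T1_MS = 500  # RFC 3261 default for T1, in milliseconds


def uac_sees_response(request_latency_ms, uas_delay_ms, response_latency_ms,
                      t1_ms=T1_MS):
    """Return True if the final response reaches the UAC before it times out.

    The UAC abandons the transaction 64*T1 after sending the request.
    All three arguments are hypothetical inputs that a real UAS cannot
    measure reliably, which is the point of the surrounding text.
    """
    uac_timeout_ms = 64 * t1_ms
    arrival_ms = request_latency_ms + uas_delay_ms + response_latency_ms
    return arrival_ms < uac_timeout_ms


# The UAS answers 31.9 s after receipt, inside its own 64*T1 window,
# yet the response reaches the UAC after the UAC has given up.
print(uac_sees_response(100, 31_900, 100))  # False
# Answering promptly leaves a wide margin.
print(uac_sees_response(100, 1_000, 100))   # True
```

   The UAS controls only the middle argument, so the only way it can
   improve its odds is to keep that value as small as possible.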

111	   This result constrains the set of problems that can be solved with a
112	   single NIT. Any delay introduced during processing of a request
113	   increases the probability of losing the race. If the timing
114	   characteristics of that processing are not predictable and
115	   controllable, a single NIT is an inappropriate model for handling the
116	   request. One viable alternative is to accept the request with a 202
117	   and send the ultimate results in a new request in the reciprocal
118	   direction.

120	   In specialized networks, a UAS might have some reliable knowledge of
121	   inter-hop latency and could use that knowledge to determine if it has
122	   time to delay its final response in order to perform some processing
123	   such as a database lookup while mitigating its risk of losing the
124	   race in Figure 1. Establishing this knowledge across arbitrary
125	   networks (perhaps using resource reservation techniques and
126	   deterministic transports) is not currently feasible.

128	1.2 Provisional responses can delay recovery from lost final responses

130	   The non-INVITE client transaction state machine provides reliability
131	   for NITs over unreliable transports (UDP) through retransmission of
132	   the request message. Timer E is set to T1 when a request is initially
133	   transmitted. As long as the machine remains in the Trying state, each
134	   time Timer E fires, it will be reset to twice its previous value
135	   (capping at T2) and the request is retransmitted.

137	   If the non-INVITE client transaction state machine sees a provisional
138	   response, it transitions to the Proceeding state, where
139	   retransmission continues, but the algorithm for resetting Timer E is
140	   simply to use T2 instead of doubling at each firing. (Note that Timer
141	   E is not altered during the transition to Proceeding).
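
   The two reset rules described above can be sketched as follows.
   This is a minimal model, not an implementation of the RFC 3261
   state machine: the function name and the provisional_at_ms
   parameter are assumptions made for illustration, times are in
   milliseconds, and the defaults are T1 = 500 ms and T2 = 4 s.

```python
def request_send_times(t1_ms=500, t2_ms=4000, provisional_at_ms=None):
    """List the times (ms) at which the client transaction sends the request.

    Timer E starts at T1. Each time it fires the request is retransmitted
    and the timer is reset: doubled (capped at T2) while in Trying, or set
    straight to T2 once a provisional response has moved the machine to
    Proceeding. Sending stops when the transaction times out at 64*T1.
    """
    times = [0]
    interval = t1_ms
    now = 0
    while now + interval < 64 * t1_ms:
        now += interval
        times.append(now)
        # Assumption: a provisional counts if it arrived by the time E fires.
        proceeding = provisional_at_ms is not None and provisional_at_ms <= now
        interval = t2_ms if proceeding else min(2 * interval, t2_ms)
    return times


print(request_send_times()[:5])                       # [0, 500, 1500, 3500, 7500]
print(request_send_times(provisional_at_ms=100)[:3])  # [0, 500, 4500]
```

   The sketch lists only transmission times. It does not model which
   responses are lost, so it shows where the retransmission schedule
   diverges rather than the recovery time for a particular loss.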

143	   Making the transition to the Proceeding state before Timer E is reset
144	   to T2 can cause recovery from a lost final response to take extra
145	   time. Figure 2 shows recovery from a lost final response with and
146	   without a provisional message during this window. Recovery occurs
147	   within 2*T1 in the case without the provisional. With the
148	   provisional, recovery is delayed until T2, which by default is 8*T1.

150	   In practical terms, a provisional response to a NIT in currently
151	   deployed networks can delay transaction completion by up to 3.5
152	   seconds.

154	              UAC       UAS               UAC        UAS
155	               |         |                 |          |
156	         ---   |----.    |            ---  |----.     |
157	          ^    |     `-->|             ^   |     `--->|
158	      E = T1   |         |         E = T1  |    .-----|(provisional)
159	          v    |         |             v   |<--'      |
160	         ---   |----.    |            ---  |----.     |
161	          ^    |     `-->|             ^   |     `--->|
162	          |    |   X<----|(lost final) |   |   X<-----|(lost final)
163	          |    |         |             |   |          |
164	      E = 2*T1 |         |             |   |          |
165	          |    |         |             |   |          |
166	          |    |         |             |   |          |
167	          v    |         |             |   |          |
168	         ---   |----.    |             |   |          |
169	               |     `-->|             |   |          |
170	               |   .-----|(final)      |   |          |
171	               |<-'      |             |   |          |
172	               |         |             |   |          |
173	              \/\       /\/           /\/ /\/        /\/
174	                                   E = T2
175	              \/\       /\/           /\/ /\/        /\/
176	               |         |             |   |          |
177	               |         |             v   |          |
178	               |         |            ---  |----.     |
179	               |         |                 |     `--->|
180	               |         |                 |    .-----|(final)
181	               |         |                 |<--'      |
182	               |         |                 |          |

184	                Figure 2: Provisionals can harm recovery

186	   No additional delay is introduced if the first provisional response
187	   is received after Timer E has reached its maximum reset interval of
188	   T2.

190	1.3 Delayed responses will temporarily blacklist an element

192	   A SIP element's use of SRV is specified in RFC 3263 [2]. That
193	   specification discusses how SIP assures high availability by having
194	   upstream elements detect failure of downstream elements. It proceeds
195	   to define several types of failure detection and instructions for
196	   failover. Two of the behaviors it describes are important to this
197	   document:

199	   o  Within a transaction, transport failure is detected either through
200	      an explicit report from the transport layer or through timeout.
201	      Note specifically that timeout will indicate transport failure
202	      regardless of the transport in use. When transport failure is
203	      detected, the request is retried at the next element from the
204	      sorted results of the SRV query.

206	   o  Between transactions, locations reporting temporary failure
207	      (through 503/Retry-After for example) are not used until their
208	      requested black-out period expires.

210	   The specification notes the benefit of caching locations that are
211	   successfully contacted, but does not discuss how such a cache is
212	   maintained. It is unclear whether an element should stop using
213	   (temporarily blacklist) a location returned in the SRV query that
214	   results in a transport error. If it does, when should such a location
215	   be removed from the blacklist?

217	   Without such a blacklist (or equivalent mechanism), the intended
218	   availability mechanism fails miserably. Consider traffic between two
219	   domains. Proxy pA in domain A needs to forward a sequence of
220	   non-INVITE requests to domain B. Through DNS SRV, pA discovers pB1
221	   and pB2, and the ordering rules of [2] and [3] indicate it should use
222	   pB1 first. The first request to pB1 times out. Since pA is a proxy
223	   and a NIT has a fixed duration, pA has no opportunity to retry the
224	   request at pB2. If pA does not remember pB1's failure, the second
225	   request (and all subsequent non-INVITE requests until pB1 recovers)
226	   are doomed to the same failure. Caching would allow the subsequent
227	   requests to be tried at pB2.
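
   One possible shape for such a cache is sketched below. Everything
   in it is an assumption: RFC 3263 defines no such structure, the
   class and method names are invented for illustration, and the
   30-second hold-off is an arbitrary placeholder for the open
   question of when a location should leave the blacklist.

```python
class FailureCache:
    """Remember targets that timed out so later requests skip them."""

    def __init__(self, holdoff_s=30.0):
        # holdoff_s is a hypothetical policy knob; the specification
        # gives no value for it.
        self.holdoff_s = holdoff_s
        self._failed_at = {}

    def mark_failed(self, target, now_s):
        """Record that a request to target timed out at now_s."""
        self._failed_at[target] = now_s

    def pick(self, ordered_targets, now_s):
        """Return the first target, in SRV order, not currently blacklisted."""
        for target in ordered_targets:
            failed = self._failed_at.get(target)
            if failed is None or now_s - failed >= self.holdoff_s:
                return target
        return None


cache = FailureCache()
srv_order = ["pB1", "pB2"]                   # ordering from the SRV query
first = cache.pick(srv_order, now_s=0.0)     # pB1
cache.mark_failed(first, now_s=32.0)         # the request to pB1 timed out
second = cache.pick(srv_order, now_s=33.0)   # pB2
```

   Note that the cache helps only the second and later requests. The
   first request has already consumed the whole transaction lifetime
   by the time the failure is recorded.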

229	   Since miserable failure is not acceptable in deployed networks, we
230	   should anticipate that elements will, in fact, cache timeout failures
231	   between transactions. Then the race in Figure 1 becomes important. If
232	   an element fails to respond "soon enough", it has effectively not
233	   responded at all, and will be blacklisted at its peer for some period
234	   of time.

236	   (Note that even with caching, the first request timeout results in a
237	   timeout failure all the way back to the original submitter. The
238	   failover mechanisms in [2] work well to increase the resiliency of a
239	   given INVITE transaction, but do nothing for a given non-INVITE
240	   transaction.)

242	1.4 408 for non-INVITE is not useful
243	   Consider the race condition in Figure 1 when the final response is
244	   408 instead of 200. Under the current specification, the race is
245	   guaranteed to be lost. Most existing endpoints will emit a 408 for a
246	   non-INVITE request 64*T1 after receiving the request if they haven't
247	   emitted an earlier final response. Such a 408 is guaranteed to arrive
248	   at the next upstream element too late to be useful. In fact, in the
249	   presence of proxies, these messages are even harmful. When the 408
250	   arrives, each proxy will have already terminated its associated
251	   client transaction due to timeout. So, each proxy must forward the
252	   408 upstream statelessly. This, in turn, is guaranteed to arrive too
253	   late. As Figure 3 shows, this can  ultimately result in bombarding
254	   the original requester with spurious 408s.  (Note that the proxy's
255	   client transaction state machine never enters the Completed state, so
256	   Timer K does not enter into play).
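
   The timing behind Figure 3 can be checked with a short sketch. It
   assumes symmetric per-hop latency, no processing delay, and that
   every element downstream of the UAC emits its own 408 exactly
   64*T1 after receiving the request, as drawn in the figure. The
   function name and the latency values are illustrative only.

```python
def late_408_arrivals(hop_latency_ms, t1_ms=500):
    """Return (uac_timeout_ms, arrival times at the UAC of each 408).

    hop_latency_ms[i] is the assumed one-way latency of hop i, counted
    from the UAC (hop 0 is UAC to P1). Each downstream element emits a
    408 exactly 64*T1 after it receives the request, and every element
    upstream of it relays that 408 statelessly with no added delay.
    """
    timeout_ms = 64 * t1_ms
    arrivals = []
    received_at = 0
    for latency in hop_latency_ms:
        received_at += latency            # when this element gets the request
        emitted_at = received_at + timeout_ms
        # The 408 crosses the same hops on its way back to the UAC,
        # which under symmetric latency takes received_at again.
        arrivals.append(emitted_at + received_at)
    return timeout_ms, arrivals


timeout_ms, arrivals = late_408_arrivals([10, 10, 10, 10])  # P1, P2, P3, UAS
print(len(arrivals), all(t > timeout_ms for t in arrivals))  # 4 True
```

   With any non-zero latency, every 408 reaches the UAC after its own
   64*T1 timeout, and the UAC receives one such message per
   downstream element.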

258	                  UAC        P1         P2         P3         UAS
259	                   |          |          |          |          |
260	             ---  ===---.     |          |          |          |
261	              ^    |     `-->===---.     |          |          |
262	              |    |          |     `-->===---.     |          |
263	              |    |          |          |     `-->===---.     |
264	            64*T1  |          |          |          |     `-->===
265	              |    |          |          |          |          |
266	              |    |          |          |          |          |
267	              v    |          |          |          |          |
268	   (timeout) ---  ===         |          |          |          |
269	                   |    .-408===         |          |          |
270	                   |<--'      |    .-408===         |          |
271	                   |    .-408-|<--'      |    .-408===         |
272	                   |<--'      |    .-408-|<--'      |    .-408===
273	                   |    .-408-|<--'      |    .-408-|<--'      |
274	                   |<--'      |    .-408-|<--'      |          |
275	                   |    .-408-|<--'      |          |          |
276	                   |<--'      |          |          |          |
277	                   |          |          |          |          |

279	                  Figure 3: late 408s to non-INVITEs

281	   This response bombardment is not limited to the 408 response, though
282	   it only exists when participating client transaction state machines
283	   are timing out. Figure 4 generalizes Figure 1 to include multiple
284	   hops. Note that even though the UAS responds "in time" to P3, the
285	   response is too late for P2, P1 and the UAC.

287	                  UAC        P1         P2         P3         UAS
288	                   |          |          |          |          |
289	             ---  ===---.     |          |          |          |
290	              ^    |     `-->===---.     |          |          |
291	              |    |          |     `-->===---.     |          |
292	              |    |          |          |     `-->===---.     |
293	            64*T1  |          |          |          |     `-->===
294	              |    |          |          |          |          |
295	              |    |          |          |          |          |
296	              v    |          |          |          |          |
297	   (timeout) ---  ===         |          |          |          |
298	                   |    .-408===         |          |    .-200-|
299	                   |<--'      |    .-408===   .-200-|<--'      |
300	                   |    .-408-|<--'.-200-|<--'     ===         |
301	                   |<--'.-200-|<--'      |          |         ===
302	                   |<--'      |          |          |          |
303	                   |          |          |          |          |

305	               Figure 4: Additional timeout related error

307	1.5 Non-INVITE timeouts doom forking proxies

309	   A single branch with a delayed or missing final response will
310	   dominate the processing at a proxy that receives no 2xx responses to
311	   a forked non-INVITE request, since this proxy is required to allow
312	   all of its client transactions to terminate before choosing a "best
313	   response". This forces the proxy's server transaction to lose the
314	   race in Figure 1. Any response it ultimately forwards (a 401 for
315	   example) will arrive at the upstream elements too late to be used.
316	   Thus, if no element among the branches would return a 2xx response,
317	   failure of a single element (or its transport) dooms the proxy to
318	   failure.
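
   A rough model of this behavior follows. It is a sketch under
   simplifying assumptions: a branch is treated as finished the
   moment its final response arrives, a silent branch finishes at its
   64*T1 timeout, and at least one branch exists. The function name
   and the branch data are hypothetical.

```python
def forwarding_time_ms(branches, t1_ms=500):
    """Return when a forking proxy can forward a final response upstream.

    branches maps a branch name to (status, arrival_ms), or to None when
    that branch never answers. A 2xx is forwarded as soon as it arrives;
    otherwise the proxy waits for every client transaction to finish,
    and a silent branch only finishes at its 64*T1 timeout.
    """
    timeout_ms = 64 * t1_ms
    answered = [b for b in branches.values()
                if b is not None and b[1] < timeout_ms]
    successes = [arrival for status, arrival in answered
                 if 200 <= status < 300]
    if successes:
        return min(successes)
    if len(answered) < len(branches):
        return timeout_ms  # at least one branch ran to timeout
    return max(arrival for _, arrival in answered)


# One branch answers 401 quickly, the other never answers.
print(forwarding_time_ms({"b1": (401, 200), "b2": None}))  # 32000
```

   In the example the 401 is available after 200 ms but cannot be
   forwarded until 64*T1, by which time the upstream transaction has
   itself timed out.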

320	1.6 Mismatched timer values make winning the race harder

322	   There are many failure scenarios due to misconfiguration or
323	   misbehavior that the SIP specification does not discuss. One is
324	   placing two elements with different configured values for T1 and T2
325	   on the same network. Review of Figure 1 illustrates that the race
326	   failure is only made more likely in this misconfigured state (it may
327	   appear that shortening T1 at the element behaving as a UAS improves
328	   this particular situation, but remember that these elements may trade
329	   roles on the next request). Since the protocol provides no mechanism
330	   for discovering/negotiating a peer's timer values, exceptional care
331	   must be taken when deploying systems with non-defaults to ensure they
332	   will _never_ directly communicate with elements with default values.
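
   The exposure created by mismatched T1 values can be estimated with
   a one-line calculation, sketched here. The function name is an
   assumption, network latency is ignored, and the 250 ms value is an
   arbitrary example of a non-default setting.

```python
def mismatch_window_ms(uac_t1_ms, uas_t1_ms):
    """Length of the interval in which a UAS that answers 'in time' by its
    own clock (within 64 * its T1) is already too late for the UAC.

    Network latency is ignored, so this is a lower bound on the exposure.
    Returns 0 when the UAS's T1 is not larger than the UAC's.
    """
    return max(0, 64 * (uas_t1_ms - uac_t1_ms))


print(mismatch_window_ms(250, 500))  # 16000
print(mismatch_window_ms(500, 250))  # 0
```

   The second call shows the apparent benefit of a shorter T1 at the
   UAS. The first call is the same pair of elements after they trade
   roles.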

334	2. Acknowledgments

336	   This document captures many conversations about non-INVITE issues.
337	   Significant contributors include Ben Campbell, Gonzalo Camarillo,
338	   Steve Donovan, Rohan Mahy, Dan Petrie, Adam Roach, Jonathan
339	   Rosenberg, and Dean Willis.

341	References

343	   [1]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
344	        Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP:
345	        Session Initiation Protocol", RFC 3261, June 2002.

347	   [2]  Rosenberg, J. and H. Schulzrinne, "Session Initiation Protocol
348	        (SIP): Locating SIP Servers", RFC 3263, June 2002.

350	   [3]  Gulbrandsen, A., Vixie, P. and L. Esibov, "A DNS RR for
351	        specifying the location of services (DNS SRV)", RFC 2782,
352	        February 2000.

354	Author's Address

356	   Robert J. Sparks
357	   dynamicsoft
358	   5100 Tennyson Parkway
359	   Suite 1200
360	   Plano, TX  75024

362	   EMail: rsparks@dynamicsoft.com

364	Intellectual Property Statement

366	   The IETF takes no position regarding the validity or scope of any
367	   intellectual property or other rights that might be claimed to
368	   pertain to the implementation or use of the technology described in
369	   this document or the extent to which any license under such rights
370	   might or might not be available; neither does it represent that it
371	   has made any effort to identify any such rights. Information on the
372	   IETF's procedures with respect to rights in standards-track and
373	   standards-related documentation can be found in BCP-11. Copies of
374	   claims of rights made available for publication and any assurances of
375	   licenses to be made available, or the result of an attempt made to
376	   obtain a general license or permission for the use of such
377	   proprietary rights by implementors or users of this specification can
378	   be obtained from the IETF Secretariat.

380	   The IETF invites any interested party to bring to its attention any
381	   copyrights, patents or patent applications, or other proprietary
382	   rights which may cover technology that may be required to practice
383	   this standard. Please address the information to the IETF Executive
384	   Director.

386	Full Copyright Statement

388	   Copyright (C) The Internet Society (2004). All Rights Reserved.

390	   This document and translations of it may be copied and furnished to
391	   others, and derivative works that comment on or otherwise explain it
392	   or assist in its implementation may be prepared, copied, published
393	   and distributed, in whole or in part, without restriction of any
394	   kind, provided that the above copyright notice and this paragraph are
395	   included on all such copies and derivative works. However, this
396	   document itself may not be modified in any way, such as by removing
397	   the copyright notice or references to the Internet Society or other
398	   Internet organizations, except as needed for the purpose of
399	   developing Internet standards in which case the procedures for
400	   copyrights defined in the Internet Standards process must be
401	   followed, or as required to translate it into languages other than
402	   English.

404	   The limited permissions granted above are perpetual and will not be
405	   revoked by the Internet Society or its successors or assignees.

407	   This document and the information contained herein is provided on an
408	   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
409	   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
410	   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
411	   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
412	   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

414	Acknowledgment

416	   Funding for the RFC Editor function is currently provided by the
417	   Internet Society.