idnits 2.17.1 draft-lyon-itp-nodes-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-18) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. 
== Mismatching filename: the document gives the document name as 'draft-lyon-itp-nodes-03', but the file name used is 'draft-lyon-itp-nodes-04' == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 14 instances of lines with control characters in the document. Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 613 has weird spacing: '... K. Evans 4 Expires in 6 months J. Klein 5 Tandem Computers 6 November 21, 1997 8 Transaction Internet Protocol 9 Version 2.0 11 13 Status of this Memo 15 This document is an Internet-Draft. Internet-Drafts are working 16 documents of the Internet Engineering Task Force (IETF), its areas, 17 and its working groups. Note that other groups may also distribute 18 working documents as Internet-Drafts. 20 Internet-Drafts are draft documents valid for a maximum of six 21 months and may be updated, replaced, or obsoleted by other documents 22 at any time. It is inappropriate to use Internet-Drafts as reference 23 material or to cite them other than as "work in progress." 25 To learn the current status of any Internet-Draft, please check the 26 "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow 27 Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), 28 munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or 29 ftp.isi.edu (US West Coast). 
31 Abstract 33 In many applications where different nodes cooperate on some work, 34 there is a need to guarantee that the work happens atomically. That 35 is, each node must reach the same conclusion as to whether the work 36 is to be completed, even in the face of failures. This document 37 proposes a simple, easily-implemented protocol for achieving this 38 end. 40 Table of Contents 42 Status of this memo 1 43 Abstract 1 44 Table of Contents 2 45 1. Introduction 3 46 2. Example Usage 3 47 3. Transactions 4 48 4. Connections 4 49 5. Transaction Identifiers 5 50 6. Pushing vs. Pulling Transactions 5 51 7. Endpoint Identification 6 52 8. TIP Uniform Resource Locators 7 53 9. States of a Connection 8 54 10. Protocol Versioning 10 55 11. Commands and Responses 10 56 12. Command Pipelining 11 57 13. TIP Commands 11 58 14. Error Handling 16 59 15. Connection Failure and Recovery 17 60 16. Security Considerations 18 61 17. Significant changes from previous version 19 62 References 19 63 Authors' Addresses 20 64 Comments 20 65 App A. The TIP Multiplexing Protocol Version 2.0 21 67 1. Introduction 69 The standard method for achieving atomic commitment is the two-phase 70 commit protocol; see [1] for an introduction to atomic commitment and 71 two-phase commit protocols. 73 Numerous two-phase commit protocols have been implemented over the 74 years. However, none of them has become widely used in the Internet, 75 due mainly to their complexity. Most of that complexity comes from 76 the fact that the two-phase commit protocol is bundled together with 77 a specific program-to-program communication protocol, and that 78 protocol lives on top of a very large infrastructure. 80 This memo proposes a very simple two-phase commit protocol. It 81 achieves its simplicity by specifying only how different nodes agree 82 on the outcome of a transaction; it allows (even requires) that the 83 subject matter on which the nodes are agreeing be communicated via 84 other protocols. 
By doing so, we avoid all of the issues related to 85 application communication semantics and data representation 86 (to name just a few). Independent of the application communication 87 protocol a transaction manager may use the Transport Layer Security 88 protocol [3] to authenticate other transaction managers and encrypt 89 messages. 91 It is envisioned that this protocol will be used mainly for a 92 transaction manager on one Internet node to communicate with a 93 transaction manager on another node. While it is possible to use 94 this protocol for application programs and/or resource managers to 95 speak to transaction managers, this communication is usually 96 intra-node, and most transaction managers already have more-than- 97 adequate interfaces for the task. 99 While we do not expect this protocol to replace existing ones, we 100 do expect that it will be relatively easy for many existing 101 heterogeneous transaction managers to implement this protocol for 102 communication with each other. 104 Further supplemental information regarding the TIP protocol can be 105 found in [5]. 107 2. Example Usage 109 Today the electronic shopping basket is a common metaphor at many 110 electronic store-fronts. Customers browse through an electronic 111 catalog, select goods and place them into an electronic shopping 112 basket. HTTP servers [2] provide various means ranging from URL 113 encoding to context cookies to keep track of client context (e.g. 114 the shopping basket of a customer) and resume it on subsequent 115 customer requests. 117 Once a customer has finished shopping they may decide to commit 118 their selection and place the associated orders. Most orders may have 119 no relationship with each other except being executed as part of the 120 same shopping transaction; others may be dependent on each other 121 (for example, if made as part of a special offering). 
Irrespective of 122 these details a customer will expect that all orders have been 123 successfully placed upon receipt of a positive acknowledgment. 125 Today's electronic store-fronts must implement their own special 126 protocols to coordinate such placement of all orders. This 127 programming is especially complex when orders are placed through 128 multiple electronic store-fronts. This complexity limits the 129 potential utility of internet applications, and constrains growth. 131 The protocol described in this document intends to provide a standard 132 for internet servers to achieve agreement on a unit of shared work 133 (e.g. placement of orders in an electronic shopping basket). 134 The server (e.g. a CGI program) placing the orders may want to start 135 a transaction calling its local transaction manager, and ask 136 other servers participating in the work to join the transaction. 137 The server placing the orders passes a reference to the transaction 138 as user data on HTTP requests to the other servers. The other 139 servers call their transaction managers to start a local transaction 140 and ask them to join the remote transaction using the protocol 141 defined in this document. Once all orders have been placed, execution 142 of the two-phase-commit protocol is delegated 143 to the involved transaction managers. If the transaction commits, 144 all orders have been successfully placed and the customer gets a 145 positive acknowledgment. If the transaction aborts no orders will 146 be placed and the customer will be informed of the problem. 148 Transaction support greatly simplifies programming of these 149 applications as exception handling and failure recovery are delegated 150 to a special component. End users are also not left having to deal 151 with the consequences of only partial success. 153 While this example shows how the protocol can be used by HTTP 154 servers, applications may use the protocol when accessing a remote 155 database (e.g. 
via ODBC), or invoking remote services using other 156 already existing protocols (e.g. RPC). The protocol makes it easy for 157 applications in a heterogeneous network to participate in the same 158 transaction, even if using different communication protocols. 160 3. Transactions 162 "Transaction" is the term given to the programming model whereby 163 computational work performed has atomic semantics. That is, either 164 all work completes successfully and changes are made permanent (the 165 transaction commits), or if any work is unsuccessful, changes are 166 undone (the transaction aborts). The work comprising a transaction 167 (unit of work), is defined by the application. 169 4. Connections 171 The Transaction Internet Protocol (TIP) requires a reliable ordered 172 stream transport with low connection setup costs. In an Internet (IP) 173 environment, TIP operates over TCP or TLS, optionally using a 174 protocol to multiplex light-weight connections over the same TCP or 175 TLS connection. 177 Transaction managers which share transactions establish a TCP or TLS 178 connection. The protocol uses a different connection for each 179 simultaneous transaction shared between two transaction managers. 180 After a transaction has ended, the connection can be reused for 181 a different transaction. 183 Optionally, instead of associating a TCP or TLS connection with only 184 a single transaction, two transaction managers may agree on a 185 protocol to multiplex light-weight connections over the same TCP or 186 TLS connection, and associate each simultaneous transaction with a 187 separate light-weight connection. Using light-weight connections 188 reduces latency and resource consumption associated with executing 189 simultaneous transactions. Similar techniques as described here are 190 widely used by existing transaction processing systems. See Appendix 191 A for an example of one such protocol. 
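The rule of one connection per simultaneous transaction, with reuse once a transaction has ended, can be sketched as follows. This is only an illustration: the class name TipConnectionPool and the open_connection factory are hypothetical and are not part of the protocol, and the sketch ignores multiplexing and connection failure.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class TipConnectionPool:
    """Hands out one connection per active transaction and reuses idle ones."""

    def __init__(self, open_connection: Callable[[str], object]) -> None:
        # open_connection is a hypothetical factory, for example a function
        # that opens a TCP or TLS connection to the named transaction manager.
        self._open = open_connection
        self._idle: Dict[str, List[object]] = defaultdict(list)

    def acquire(self, peer: str) -> object:
        """Get a connection to peer for the exclusive use of one transaction."""
        if self._idle[peer]:
            # A connection whose earlier transaction has ended can be reused.
            return self._idle[peer].pop()
        return self._open(peer)

    def release(self, peer: str, connection: object) -> None:
        """Return a connection once its transaction has ended."""
        self._idle[peer].append(connection)
```

Two simultaneous transactions with the same peer each get their own connection; a third transaction started after one of them ends picks up the released connection instead of opening a new one.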
193 Note that although the TIP protocol itself is described only in terms 194 of TCP and TLS, there is nothing to preclude the use of TIP with 195 other transport protocols. However, it is up to the implementor to 196 ensure the chosen transport provides equivalent semantics to TCP, and 197 to map the TIP protocol appropriately. 199 The TIP protocol defines two URL schemes [4] which allow applications 200 and transaction managers to exchange references (i.e. TIP URLs) to 201 transaction managers and transactions. 203 5. Transaction Identifiers 205 Unfortunately, there is no single globally-accepted standard for the 206 format of a transaction identifier; there are various standard and 207 proprietary formats. Allowed formats for a TIP transaction 208 identifier are described below in the section "TIP Uniform Resource 209 Locators". A transaction manager may map its internal transaction 210 identifiers into this TIP format in any manner it sees fit. 211 Furthermore, each party in a superior/subordinate relationship gets 212 to assign its own identifier to the transaction; these identifiers 213 are exchanged when the relationship is first established. Thus, a 214 transaction manager gets to use its own format of transaction 215 identifier internally, but it must remember a foreign transaction 216 identifier for each superior/subordinate relationship in which it is 217 involved. 219 6. Pushing vs. Pulling Transactions 221 Suppose that some program on node "A" has created a transaction, and 222 wants some program on node "B" to do some work as part of the 223 transaction. There are two classical ways that he does this, 224 referred to as the "push" model and the "pull" model. 226 In the "push" model, the program on A first asks his transaction 227 manager to export the transaction to node B. A's transaction manager 228 sends a message to B's TM asking it to instantiate the transaction as 229 a subordinate of A, and return its name for the transaction. 
The 230 program on A then sends a message to its counterpart on B on the 231 order of "Do some work, and make it part of the transaction that your 232 transaction manager already knows of by the name ...". Because A's 233 TM knows that it sent the transaction to B's TM, A's TM knows to 234 involve B's TM in the two-phase commit process. 236 In the "pull" model, the program on A merely sends a message to B on 237 the order of "Do some work, and make it part of the transaction that 238 my TM knows by the name ...". The program on B asks its TM to enlist 239 in the transaction. At that time, B's TM will "pull" the transaction 240 over from A. As a result of this pull, A's TM knows to involve B's 241 TM in the two-phase commit process. 243 The protocol described here supports both the "push" and "pull" 244 models. 246 7. Endpoint Identification 248 In certain cases after connection failures, one of the parties of 249 a connection may have a responsibility to re-establish a new 250 connection to the other party in order to complete the 251 two-phase-commit protocol. If the party that initiated the original 252 connection needs to re-establish it, the job is easy: he merely 253 establishes a connection in the same way that he originally did it. 254 However, if the other party needs to re-establish the connection, 255 he needs to know how to contact the initiator of the original 256 connection. He gets this information in the following way: 258 After a TCP connection has been established the initiating party 259 issues an IDENTIFY command and supplies an endpoint identifier which 260 is used to re-establish the connection if needed. If the initiating 261 party does not supply an endpoint identifier on the IDENTIFY command, 262 he must not perform any action which would require a connection to be 263 re-established (e.g. perform recovery actions). 
265 An <endpoint identifier> as used in the IDENTIFY (and a few other) 266 commands has one of the following formats: 267 <dns name> 268 <ip address> 269 <dns name>:<port number> 270 <ip address>:<port number> 272 A <dns name> is a standard name, acceptable to the domain name 273 service. It must be sufficiently qualified to be useful to the 274 receiver of the command. 276 An <ip address> is an IP address, in the usual form: four decimal 277 numbers separated by period characters. 279 The <port number> is a decimal number specifying the port at which 280 the transaction manager is listening for requests to establish TCP 281 connections. Two standard transaction service port numbers are 282 defined: 3372 for TLS secured connections, and 3371 for unsecured 283 connections. If the port number is omitted from the endpoint 284 identifier, and if the current connection is TLS secured, then the 285 standard TLS secured transaction service port number is assumed; 286 otherwise the standard unsecured transaction service port number is 287 assumed. Likewise, if a port number is specified, then it must 288 represent a port with the same security capabilities as the current 289 connection (i.e. TLS or unsecured). 291 8. TIP Uniform Resource Locators 293 Transactions and transaction managers are resources associated 294 with the TIP protocol. Transaction managers and transactions are 295 located using TCP/IP endpoint identifiers. Once a TCP connection has 296 been established, TIP commands may be sent to operate on transactions 297 associated with the respective transaction managers. 299 Applications which want to pull a transaction from a remote node 300 must supply a reference to the remote transaction which allows 301 the local transaction manager (i.e. the transaction manager pulling 302 the transaction) to connect to the remote transaction 303 manager and identify the particular transaction. Applications 304 which want to push a transaction to a remote node must supply 305 a reference to the remote transaction manager (i.e. 
the transaction 306 manager to which the transaction is to be pushed), which allows the 307 local transaction manager to locate the remote transaction 308 manager. 310 The TIP protocol defines a URL scheme [4] which allows applications 311 and transaction managers to exchange references (i.e. TIP URLs) to 312 transaction managers and transactions. 314 A TIP URL takes the form: 316 TIP://<host>[:<port>]/<transaction string> or 317 TIPS://<host>[:<port>]/<transaction string> 319 where the TIP: form implies the underlying connection is based on 320 TCP; the TIPS: form implies the underlying connection is based on 321 TLS; <host> is an IP address or a DNS name as defined above; and 322 <port> is a valid TCP port number. <transaction string> may take one 323 of two forms (standard or non-standard): 325 i. "urn:" <NID> ":" <NSS> 327 A standard transaction identifier, conforming to the proposed 328 Internet Standard for Uniform Resource Names (URNs), as 329 specified by RFC2141; where <NID> is the Namespace Identifier, 330 and <NSS> is the Namespace Specific String. The Namespace ID 331 determines the syntactic interpretation of the Namespace 332 Specific String. The Namespace Specific String is a sequence of 333 characters representing a transaction identifier (as defined by 334 <NID>). The rules for the contents of these fields are 335 specified by [7] (valid characters, encoding, etc.). 337 This format of <transaction string> may be used to express 338 global transaction identifiers in terms of standard 339 representations. Examples for <NID> might be <iso> or <xopen>. 340 e.g. 342 TIP://123.123.123.123/urn:xopen:xid 344 Note that Namespace IDs require registration. See [8] for 345 details on how to do this. 347 ii. <transaction identifier> 349 A sequence of printable ASCII characters (octets with values in 350 the range 32 through 126 inclusive, excluding ":") 351 representing a transaction identifier. In this non-standard 352 case, it is the combination of <host> and <transaction identifier> 353 which ensures global uniqueness. e.g. 
355 TIP://123.123.123.123/transid1 357 Except as otherwise described above, the TIP URL scheme follows the 358 rules for reserved characters as defined in [4], and uses escape 359 sequences as defined in [4] Section 5. 361 Note that the TIP protocol itself does not use the TIP URL scheme. 362 This URL scheme is proposed as a standard way to pass transaction 363 identification information through other protocols. e.g. between 364 cooperating application processes. The URL may then be used to 365 communicate to the local transaction manager the information 366 necessary to associate the application with a particular TIP 367 transaction. e.g. to PULL the transaction from a remote transaction 368 manager. It is anticipated that each TIP implementation will provide 369 some set of APIs for this purpose. 371 To create a non-standard TIP URL from a transaction identifier, first 372 replace any reserved characters in the transaction identifier with 373 their equivalent escape sequences, then insert the appropriate host 374 endpoint identification. If the transaction identifier is one that 375 you created, insert your own endpoint identification. If the 376 transaction identifier is one that you received on a TIP connection 377 that you initiated, insert the identification of the party to which 378 you connected. If the transaction identifier is one that you received 379 on a TIP connection that you did not initiate, use the identification 380 that was received in the IDENTIFY command. 382 9. States of a Connection 384 At any instant, only one party on a connection is allowed to send 385 commands, while the other party is only allowed to respond to 386 commands that he receives. Throughout this document, the party that 387 is allowed to send commands is called "primary"; the other party is 388 called "secondary". Initially, the party that initiated the 389 connection is primary; however, a few commands cause the 390 roles to switch. 
A connection returns to its original polarity 391 whenever the Idle state is reached. 393 When multiplexing is being used, these rules apply independently to 394 each "virtual" connection, regardless of the polarity of the 395 underlying connection (which will be in the Multiplexing state). 397 Note that commands may be sent "out of band" by the secondary via the 398 use of pipelining. This does not affect the polarity of the 399 connection (i.e. the roles of primary and secondary do not switch). 400 See section 12 for details. 402 At any instant, a connection is in one of the following states. 403 From the point of view of the secondary party, the state changes when 404 he sends a reply; from the point of view of the primary party, the 405 state changes when he receives a reply. 407 Initial: The initial connection starts out in the Initial state. 408 Upon entry into this state, the party that initiated the 409 connection becomes primary, and the other party becomes secondary. 410 There is no transaction associated with the connection in this 411 state. From this state, the primary can send the IDENTIFY command. 413 Idle: In this state, the primary and the secondary have 414 agreed on a protocol version, and the primary has supplied an 415 endpoint identifier to the secondary party to reconnect after 416 a failure. There is no transaction associated with the 417 connection in this state. Upon entry to this state, the party 418 that initiated the connection becomes primary, and the other 419 party becomes secondary. From this state, the primary can send 420 any of the following commands: BEGIN, MULTIPLEX, PUSH, PULL, 421 QUERY and RECONNECT. 423 Begun: In this state, a connection is associated with an active 424 transaction, which can only be completed by a one-phase protocol. 425 A BEGUN response to a BEGIN command places a connection into 426 this state. Failure of a connection in Begun state implies 427 that the transaction will be aborted. 
From this state, the 428 primary can send an ABORT or COMMIT command. 430 Enlisted: In this state, the connection is associated with an active 431 transaction, which can be completed by a one-phase or two-phase 432 protocol. A PUSHED response to a PUSH command, or a PULLED 433 response to a PULL command, places the connection into this state. 434 Failure of the connection in Enlisted state implies that the 435 transaction will be aborted. From this state, the primary can 436 send an ABORT, COMMIT, or PREPARE command. 438 Prepared: In this state, a connection is associated with a 439 transaction that has been prepared. A PREPARED response to a 440 PREPARE command, or a RECONNECTED response to a RECONNECT 441 command, places a connection into this state. Unlike other 442 states, failure of a connection in this state does not cause 443 the transaction to automatically abort. From this state, the 444 primary can send an ABORT or COMMIT command. 446 Multiplexing: In this state, the connection is being used by a 447 multiplexing protocol, which provides its own set of connections. 448 In this state, no TIP commands are possible on the connection. 449 (Of course, TIP commands are possible on the connections 450 supplied by the multiplexing protocol.) The connection can 451 never leave this state. 453 Error: In this state, a protocol error has occurred, and the 454 connection is no longer useful. 456 10. Protocol Versioning 458 This document describes version 2 of the protocol. In order to 459 accommodate future versions, the primary party sends a message 460 indicating the lowest and the highest version number it understands. 461 The secondary responds with the highest version number it 462 understands. 464 After such an exchange, communication can occur using the smaller of 465 the highest version numbers (i.e., the highest version number that 466 both understand). This exchange is mandatory and occurs using the 467 IDENTIFY command (and IDENTIFIED response). 
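The version selection described above can be sketched as follows, seen from the primary's side. The function name negotiate_version and the use of plain integers for version numbers are assumptions made for illustration. The sketch also assumes that the secondary supports every version up to the highest one it reports.

```python
from typing import Optional


def negotiate_version(primary_lowest: int, primary_highest: int,
                      secondary_highest: int) -> Optional[int]:
    """Return the version both parties will use, or None if there is none."""
    if primary_lowest > primary_highest:
        raise ValueError("lowest version exceeds highest version")
    # Communication uses the smaller of the two highest version numbers.
    agreed = min(primary_highest, secondary_highest)
    # If that value is below the primary's lowest version, the ranges do
    # not overlap and no useful communication can occur.
    if agreed < primary_lowest:
        return None
    return agreed
```

For example, a primary that understands versions 1 through 2 talking to a secondary whose highest version is 3 would settle on version 2. A result of None corresponds to the case where the connection should simply be dropped.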
469 If the highest version supported by one party is considered obsolete 470 and no longer supported by the other party, no useful communication 471 can occur. In this case, the newer party should merely drop the 472 connection. 474 11. Commands and Responses 476 All commands and responses consist of one line of ASCII text, using 477 only octets with values in the range 32 through 126 inclusive, 478 followed by either a CR (an octet with value 13) or an LF (an octet 479 with value 10). Each line can be split up into one or more "words", 480 where successive words are separated by one or more space octets 481 (value 32). 483 Arbitrary numbers of spaces at the beginning and/or end of each line 484 are allowed, and ignored. 486 Lines that are empty, or consist entirely of spaces, are ignored. 487 (One implication of this is that you can terminate lines with both a 488 CR and an LF if desired; the LF will be treated as terminating an 489 empty line, and ignored.) 491 In all cases, the first word of each line indicates the 492 type of command or response; all defined commands and responses 493 consist of upper-case letters only. 495 For some commands and responses, subsequent words convey parameters 496 for the command or response; each command and response takes a fixed 497 number of parameters. 499 All words on a command or response line after (and including) the 500 first undefined word are totally ignored. These can be used to pass 501 human-readable information for debugging or other purposes. 503 12. Command Pipelining 505 In order to reduce communication latency and improve efficiency, it 506 is possible for multiple TIP "lines" (commands or responses) to be 507 pipelined (concatenated) together and sent as a single message. Lines 508 may also be sent "ahead" (by the secondary, for later processing by 509 the primary). Examples are an ABORT command immediately followed by a 510 BEGIN command, or a COMMITTED response immediately followed by a PULL 511 command. 
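A receiver that accepts pipelined lines has to split a single message into individual lines using the rules of section 11. The sketch below shows one way to do that. The function name split_tip_lines is hypothetical, and the sketch assumes the buffer holds only complete lines. It does not drop words after the first undefined word, since that requires knowledge of each command's parameters.

```python
from typing import List


def split_tip_lines(data: bytes) -> List[List[str]]:
    """Split a received buffer into TIP lines, each as a list of words."""
    lines: List[List[str]] = []
    # Either CR (13) or LF (10) terminates a line, so a CR LF pair yields
    # one real line followed by an empty line, which is ignored below.
    for raw in data.replace(b"\r", b"\n").split(b"\n"):
        if any(octet < 32 or octet > 126 for octet in raw):
            raise ValueError("octet outside the range 32..126")
        # Words are separated by one or more spaces; leading and trailing
        # spaces produce empty fragments, which are dropped.
        words = [w.decode("ascii") for w in raw.split(b" ") if w]
        if words:
            lines.append(words)
    return lines
```

Given a message that carries an ABORT command immediately followed by a BEGIN command, the function returns the two lines in the order received, ready to be held and processed one at a time as the connection state allows.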
513 The sending of pipelined lines is an implementation option. Likewise 514 which lines are pipelined together. Generally, it must be certain 515 that the pipelined line will be valid for the state of the connection 516 at the time it is processed by the receiver. It is the responsibility 517 of the sender to determine this. 519 All implementations must support the receipt of pipelined lines - the 520 rules for processing of which are described by the following 521 paragraph: 523 When the connection state is such that a line should be read (either 524 command or response), then that line (when received) is processed. No 525 more lines are read from the connection until processing again 526 reaches such a state. If a line is received on a connection when it's 527 not the turn of the other party to send, that line is _not_ rejected. 528 Instead, the line is held and processed when the connection state 529 again requires it. The receiving party must process lines and issue 530 responses in the order of lines received. If a line causes an error 531 the connection enters the Error state, and all subsequent lines on 532 the connection are discarded. 534 13. TIP Commands 536 Commands pertain either to connections or transactions. Commands 537 which pertain to connections are: IDENTIFY and MULTIPLEX. Commands 538 which pertain to transactions are: ABORT, BEGIN, COMMIT, PREPARE, 539 PULL, PUSH, QUERY, and RECONNECT. 541 Following is a list of all valid commands, and all possible responses 542 to each: 544 ABORT 546 This command is valid in the Begun, Enlisted, and Prepared states. 547 It informs the secondary that the current transaction of the 548 connection will abort. Possible responses are: 550 ABORTED 551 The transaction has aborted; the connection enters Idle 552 state. 554 ERROR 555 The command was issued in the wrong state, or was malformed. 556 The connection enters the Error state. 558 BEGIN 560 This command is valid only in the Idle state. 
It asks the 561 secondary to create a new transaction and associate it with the 562 connection. The newly created transaction will be completed with a 563 one-phase protocol. Possible responses are: 565 BEGUN 566 A new transaction has been successfully begun, and that 567 transaction is now the current transaction of the connection. 568 The connection enters Begun state. 570 NOTBEGUN 571 A new transaction could not be begun; the connection 572 remains in Idle state. 574 ERROR 575 The command was issued in the wrong state, or was malformed. 576 The connection enters the Error state. 578 COMMIT 580 This command is valid in the Begun, Enlisted or Prepared states. 581 In the Begun or Enlisted state, it asks the secondary to attempt 582 to commit the transaction; in the Prepared state, it informs the 583 secondary that the transaction has committed. Note that in the 584 Enlisted state this command represents a one-phase protocol, and 585 should only be done when the sender has 1) no local recoverable 586 resources involved in the transaction, and 2) only one subordinate 587 (the sender will not be involved in any transaction recovery 588 process). Possible responses are: 590 ABORTED 591 This response is possible only from the Begun and Enlisted 592 states. It indicates that some party has vetoed the commitment 593 of the transaction, so it has been aborted instead of 594 committing. The connection enters the Idle state. 596 COMMITTED 597 This response indicates that the transaction has been 598 committed, and that the primary no longer has any 599 responsibilities to the secondary with respect to the 600 transaction. The connection enters the Idle state. 602 ERROR 603 The command was issued in the wrong state, or was malformed. 604 The connection enters the Error state. 606 ERROR 608 This command is valid in any state; it informs the secondary that 609 a previous response was not recognized or was badly formed. A 610 secondary should not respond to this command. 
The connection 611 enters Error state. 613 IDENTIFY <lowest protocol version> 614 <highest protocol version> 615 <endpoint identifier> | "-" 617 This command is valid only in the Initial state. The primary party 618 informs the secondary party of the lowest and highest protocol 619 version supported (all versions between the lowest and highest 620 must be supported), and optionally of an IP address and a port 621 number at which the other party can re-establish a connection 622 if ever needed. If the primary party does not supply an endpoint 623 identifier, the secondary party will respond with ABORTED or 624 READONLY to any PREPARE commands. Possible responses are: 626 IDENTIFIED <protocol version> 627 The accepting party has saved the identification. The response 628 contains the highest protocol version supported by the 629 secondary party. All future communication is assumed to take 630 place using the smaller of the highest protocol version in the 631 IDENTIFY command and the version in the IDENTIFIED response. The connection 632 enters the Idle state. 634 ERROR 635 The command was issued in the wrong state, or was malformed. 636 This response also occurs if the accepting party does not 637 support any version of the protocol in the range supported 638 by the initiator. The connection enters the Error state. The 639 initiator should close the connection. 641 MULTIPLEX <protocol-identifier> 643 This command is only valid in the Idle state. The command 644 seeks agreement to use the connection for a multiplexing 645 protocol that will supply a large number of connections on 646 the existing connection. The primary suggests a particular 647 multiplexing protocol. The secondary party can either accept 648 or reject use of this protocol. 650 At present, the only defined protocol identifier is "TMP2.0", 651 which refers to the TIP Multiplexing Protocol, version 2.0. See 652 Appendix A for details of this protocol. Other protocol 653 identifiers may be defined in the future. 
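The secondary's choice between accepting and rejecting a suggested multiplexing protocol can be sketched as follows. The response names are those listed under "Possible responses to the MULTIPLEX command". The function name respond_to_multiplex, the state names as strings, and the assumption that the protocol identifier is the second word of the command line are illustrative only.

```python
from typing import List, Tuple

# The only protocol identifier defined at present.
SUPPORTED_MUX_PROTOCOLS = {"TMP2.0"}


def respond_to_multiplex(state: str, words: List[str]) -> Tuple[str, str]:
    """Return (response, new connection state) for a MULTIPLEX command."""
    # The command is valid only in the Idle state and needs an identifier.
    if state != "Idle" or len(words) < 2 or words[0] != "MULTIPLEX":
        return ("ERROR", "Error")
    if words[1] in SUPPORTED_MUX_PROTOCOLS:
        return ("MULTIPLEXING", "Multiplexing")
    # An unknown or refused protocol leaves the connection in Idle state.
    return ("CANTMULTIPLEX", "Idle")
```

A secondary that refuses multiplexing altogether could use an empty set of supported protocols, in which case every well-formed request is answered with CANTMULTIPLEX.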
655 If the MULTIPLEX command is accepted, the specified multiplexing 656 protocol will totally control the underlying connection. This 657 protocol will begin with the first byte after the line terminator 658 of the MULTIPLEX command (for data sent by the initiator), 659 and the first byte after the line terminator of the MULTIPLEXING 660 response (for data received by the initiator). This implies that 661 an implementation must not send both a CR and an LF octet after 662 either the MULTIPLEX command or the MULTIPLEXING response, lest 663 the LF octet be mistaken for the first byte of the multiplexing 664 protocol. 666 Note that when using TMP V2.0, a single TIP command (TMP 667 application message) must be wholly contained within a single TMP 668 packet (the TMP PUSH flag is not used by TIP). 670 Possible responses to the MULTIPLEX command are: 672 MULTIPLEXING 673 The secondary party agrees to use the specified multiplexing 674 protocol. The connection enters the Multiplexing state, and 675 all subsequent communication is as defined by that protocol. 676 All connections created by the multiplexing protocol start 677 out in the Idle state. 679 CANTMULTIPLEX 680 The secondary party cannot support (or refuses to use) the 681 specified multiplexing protocol. The connection remains in the 682 Idle state. 684 ERROR 685 The command was issued in the wrong state, or was malformed. 686 The connection enters the Error state. 688 PREPARE 690 This command is valid only in the Enlisted state; it requests 691 the secondary to prepare the transaction for commitment (phase 692 one of two-phase commit). Possible responses are: 694 PREPARED 695 The subordinate has prepared the transaction; the connection 696 enters the Prepared state. 698 ABORTED 699 The subordinate has vetoed committing the transaction. The 700 connection enters the Idle state. After this response, the 701 superior has no responsibilities to the subordinate with 702 respect to the transaction.
704 READONLY 705 The subordinate no longer cares whether the transaction 706 commits or aborts. The connection enters the Idle state. After 707 this response, the superior has no responsibilities to the 708 subordinate with respect to the transaction. 710 ERROR 711 The command was issued in the wrong state, or was malformed. 712 The connection enters the Error state. 714 PULL 715 717 This command is only valid in Idle state. This command seeks to 718 establish a superior/subordinate relationship in a transaction, 719 with the primary party of the connection as the subordinate (i.e., 720 he is pulling a transaction from the secondary party). Note that 721 the entire value of (as defined in the 722 section "TIP Uniform Resource Locators") must be specified as the 723 transaction identifier. Possible responses are: 725 PULLED 726 The relationship has been established. Upon receipt of this 727 response, the specified transaction becomes the current 728 transaction of the connection, and the connection enters 729 Enlisted state. Additionally, the roles of primary and 730 secondary become reversed. (That is, the superior becomes 731 the primary for the connection.) 733 NOTPULLED 734 The relationship has not been established (possibly, because 735 the secondary party no longer has the requested transaction). 736 The connection remains in Idle state. 738 ERROR 739 The command was issued in the wrong state, or was malformed. 740 The connection enters the Error state. 742 PUSH 744 This command is valid only in the Idle state. It seeks to 745 establish a superior/subordinate relationship in a transaction 746 with the primary as the superior. Note that the entire value of 747 (as defined in the section "TIP Uniform 748 Resource Locators") must be specified as the transaction 749 identifier. Possible responses are: 751 PUSHED 752 The relationship has been established, and the identifier by 753 which the subordinate knows the transaction is returned. 
The 754 transaction becomes the current transaction for the connection, 755 and the connection enters Enlisted state. 757 ALREADYPUSHED 758 The relationship has been established, and the identifier by 759 which the subordinate knows the transaction is returned. 760 However, the subordinate already knows about the transaction, 761 and is expecting the two-phase commit protocol to arrive via a 762 different connection. In this case, the connection remains in 763 the Idle state. 765 NOTPUSHED 766 The relationship could not be established. The connection 767 remains in the Idle state. 769 ERROR 770 The command was issued in the wrong state, or was malformed. 771 The connection enters Error state. 773 QUERY 775 This command is valid only in the Idle state. A subordinate uses 776 this command to determine whether a specific transaction still 777 exists at the superior. Possible responses are: 779 QUERIEDEXISTS 780 The transaction still exists. The connection remains in the 781 Idle state. 783 QUERIEDNOTFOUND 784 The transaction no longer exists. The connection remains in 785 the Idle state. 787 ERROR 788 The command was issued in the wrong state, or was malformed. 789 The connection enters Error state. 791 RECONNECT 793 This command is valid only in the Idle state. A superior uses the 794 command to re-establish a connection for a transaction, when the 795 previous connection was lost during Prepared state. Possible 796 responses are: 798 RECONNECTED 799 The subordinate accepts the reconnection. The connection enters 800 Prepared state. 802 NOTRECONNECTED 803 The subordinate no longer knows about the transaction. The 804 connection remains in Idle state. 806 ERROR 807 The command was issued in the wrong state, or was malformed. 808 The connection enters Error state. 810 14. Error Handling 812 If either party receives a line that it cannot understand it closes 813 the connection. 
If either party 814 receives an ERROR indication or an ERROR response (either a command or a response) on a connection, 815 the connection enters the Error state and no further communication 816 is possible on that connection. An implementation may decide to 817 close the connection. Closing of the connection is treated by the 818 other party as a communication failure. 820 Receipt of an ERROR indication or an ERROR response indicates that 821 the other party believes that you have not properly implemented the 822 protocol. 824 15. Connection Failure and Recovery 826 A connection failure may be caused by a communication failure, or by 827 any party closing the connection. It is assumed that TIP implementations 828 will use some private mechanism to detect TIP connection failure 829 (e.g. socket keepalive, or a timeout scheme). 831 Depending on the state of a connection, transaction managers will 832 need to take various actions when a connection fails. 834 If the connection fails in Initial or Idle state, the connection does 835 not refer to a transaction. No action is necessary. 837 If the connection fails in the Multiplexing state, all connections 838 provided by the multiplexing protocol are assumed to have failed. 839 Each of them will be treated independently. 841 If the connection fails in Begun or Enlisted state and COMMIT has 842 been sent, then transaction completion has been delegated to the 843 subordinate (the superior is not involved); the outcome of the 844 transaction is unknown to the superior (it is known at the 845 subordinate). The superior uses application-specific means to 846 determine the outcome of the transaction (note that transaction 847 integrity is not compromised in this case since the superior has no 848 recoverable resources involved in the transaction). If the connection 849 fails in Begun or Enlisted state and COMMIT has not been sent, the 850 transaction will be aborted.
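The cases described so far can be summarized as a small decision function. The Python sketch below is illustrative only: the names State, Action and on_connection_failure are hypothetical and are not defined by TIP, and a failure in Prepared state is simply routed to the recovery procedure described in the remainder of this section.

```python
from enum import Enum


class State(Enum):
    """TIP connection states named in this section (hypothetical enum)."""
    INITIAL = "Initial"
    IDLE = "Idle"
    MULTIPLEXING = "Multiplexing"
    BEGUN = "Begun"
    ENLISTED = "Enlisted"
    PREPARED = "Prepared"


class Action(Enum):
    """What a transaction manager does when a connection fails."""
    NONE = "no action necessary"
    FAIL_MULTIPLEXED = "treat every multiplexed connection as failed"
    ABORT = "abort the transaction"
    OUTCOME_UNKNOWN = "determine the outcome by application-specific means"
    RECOVER = "run Prepared-state recovery"


def on_connection_failure(state: State, commit_sent: bool = False) -> Action:
    """Map (state, COMMIT already sent?) to the action described in the text."""
    if state in (State.INITIAL, State.IDLE):
        # The connection does not refer to a transaction.
        return Action.NONE
    if state is State.MULTIPLEXING:
        # Each multiplexed connection is then handled independently.
        return Action.FAIL_MULTIPLEXED
    if state in (State.BEGUN, State.ENLISTED):
        # With COMMIT sent, completion was delegated to the subordinate.
        return Action.OUTCOME_UNKNOWN if commit_sent else Action.ABORT
    if state is State.PREPARED:
        return Action.RECOVER
    raise ValueError(f"no failure rule for state {state!r}")
```

For example, on_connection_failure(State.ENLISTED, commit_sent=True) yields Action.OUTCOME_UNKNOWN, while the same state without COMMIT sent yields Action.ABORT.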
852 If the connection fails in Prepared state, then the appropriate 853 action is different for the superior and subordinate in the 854 transaction. 856 If the superior determines that the transaction commits, then it 857 must eventually establish a new connection to the subordinate, and 858 send a RECONNECT command for the transaction. If it receives a 859 NOTRECONNECTED response, it need do nothing else. However, if it 860 receives a RECONNECTED response, it must send a COMMIT command and 861 receive a COMMITTED response. 863 If the superior determines that the transaction aborts, it is allowed 864 to (but not required to) establish a new connection and send a 865 RECONNECT command for the transaction. If it receives a RECONNECTED 866 response, it should send an ABORT command. 868 The above definition allows the superior to re-establish the 869 connection before it knows the outcome of the transaction, if it 870 finds that convenient. Once a RECONNECT command has succeeded, 871 the connection is back in Prepared state, and the superior can send a 872 COMMIT or ABORT command as appropriate when it knows the transaction 873 outcome. 875 Note that it is possible for a RECONNECT command to be received by 876 the subordinate before it is aware that the previous connection has 877 failed. In this case the subordinate treats the RECONNECT command as 878 a failure indication, cleans up any resources associated with the 879 previous connection, and associates the transaction state with the new 880 connection. 882 If a subordinate notices a connection failure in Prepared state, then 883 it should periodically attempt to create a new connection to the 884 superior and send a QUERY command for the transaction. It should 885 continue doing this until one of the following two events occurs: 887 1. It receives a QUERIEDNOTFOUND response from the superior. In this 888 case, the subordinate should abort the transaction. 890 2.
The superior, on some connection that it initiated, sends a 891 RECONNECT command for the transaction to the subordinate. In this 892 case, the subordinate can expect to learn the outcome of the 893 transaction on this new connection. If this new connection should 894 fail before the subordinate learns the outcome of the transaction, 895 it should again start sending QUERY commands. 897 Note that if a TIP system receives either a QUERY or a RECONNECT 898 command, and for some reason is unable to satisfy the request (e.g. 899 the necessary recovery information is not currently available), then 900 the connection should be dropped. 902 16. Security Considerations 904 As with all two-phase commit protocols, any security mechanisms 905 applied to the application communication protocol are liable to be 906 subverted unless corresponding mechanisms are applied to the 907 commitment protocol. For example, any authentication between the 908 parties using the application protocol must be supported by security 909 of the TIP exchanges to at least the same level of certainty. 911 In order to support secure channels, TIP can optionally run over TLS. 912 Like TCP, TLS creates channels that consist of a bi-directional pair 913 of byte streams. Unlike TCP, TLS offers optional client 914 authentication, optional server authentication, and optional 915 encryption. A TIP implementation that requires maximum security can 916 reasonably require all three of these. A TIP system requests that 917 others connect to it with TLS by generating transaction URLs using 918 the TIPS: URL scheme. 920 If a system does not protect itself through use of TLS, then 921 security implications fall into the following categories: 923 1. Someone PUSHED a new transaction to us that we don't want. 924 Depending on his correctness or intentions, he may or may not ever 925 complete it. Thus, an arbitrary computer may cause us to save a 926 little bit of state.
An implementation concerned about this will 927 probably drop the TCP connection if the other system does not 928 complete transactions in a timely manner. 930 The Transport Layer Security protocol [3] may be used by a 931 transaction manager to restrict access to trusted clients only. 933 2. Someone PULLED a transaction from us when we didn't want him to. 934 In this case, he will become involved in the atomic commitment 935 protocol. At worst, he may cause a transaction to abort that 936 otherwise would have committed. Since transaction managers 937 traditionally reserve the right to abort any transaction for any 938 reason they see fit, this does not represent a disaster to the 939 applications. However, if done frequently, it may represent a 940 denial-of-service attack. 942 Implementations concerned about this kind of attack can use the 943 Transport Layer Security protocol [3] to restrict access to 944 trusted partners (i.e. to control from which remote endpoints 945 TIP transactions will be accepted, and to verify that an endpoint 946 is genuine), and to encrypt TIP commands, thus preventing unauthorized 947 disclosure of transaction identifiers. 949 3. Someone violates the TIP commitment protocol (e.g. a COMMIT 950 command is injected on a TIP connection in place of an ABORT 951 command). This yields the possibility of data inconsistency. 953 Implementations concerned about this kind of attack can also use 954 the Transport Layer Security protocol [3] to restrict access to 955 only trusted partners and to encrypt TIP commands. 957 It is assumed that implementation-specific configuration information 958 will define whether a partner should be connected to using either a 959 mandatory TLS secured connection, or an unsecured connection (in 960 which case any security risk is accepted). "Optionally TLS secured" 961 is in effect unsecured (since there is no guarantee of a TLS secured 962 connection). 964 17.
Significant changes from previous version of this Internet-Draft 965 (): 967 Added TIPS: URL scheme (for TLS connections). 968 Otherwise minor clarifications. 970 References 972 [1] Gray, J. and A. Reuter (1993), Transaction Processing: Concepts 973 and Techniques. San Francisco, CA: Morgan Kaufmann Publishers. 974 (ISBN 1-55860-190-2). 976 [2] RFC2068 Standards Track "Hypertext Transfer Protocol -- 977 HTTP/1.1". 978 R. Fielding et al. 980 [3] Internet-Draft "The TLS Protocol Version 1.0". 981 T. Dierks et al. 983 [4] RFC1738 Standards Track "Uniform Resource Locators (URL)". 984 T. Berners-Lee et al. 986 [5] Internet-Draft "Transaction Internet Protocol - Requirements and 987 Supplemental Information". 988 K. Evans et al. 990 [6] Internet-Draft "Session Control Protocol V 2.0". 991 K. Evans et al. 993 [7] RFC2141 "URN Syntax". 994 R. Moats. 996 [8] Internet-Draft "Namespace Identifier Requirements for URN 997 Services". 998 P. Faltstrom et al. 1000 Authors' Addresses 1002 Jim Lyon Keith Evans 1003 Microsoft Corporation Tandem Computers, Inc. 1004 One Microsoft Way 5425 Stevens Creek Blvd 1005 Redmond, WA 98052-6399, USA Santa Clara, CA 95051-7200, USA 1007 Phone: +1 (206) 936 0867 Phone: +1 (408) 285 5314 1008 Fax: +1 (206) 936 7329 Fax: +1 (408) 285 5245 1009 Email: JimLyon@Microsoft.Com Email: Keith@Loc252.Tandem.Com 1011 Johannes Klein 1012 Tandem Computers Inc. 1013 10555 Ridgeview Court 1014 Cupertino, CA 95014-0789, USA 1016 Phone: +1 (408) 285 0453 1017 Fax: +1 (408) 285 9818 1018 Email: Klein_Johannes@Tandem.Com 1020 Comments 1022 Please send comments on this document to the authors at 1023 , , 1024 , or to the TIP mailing list at 1025 . You can subscribe to the TIP mailing list by 1026 sending mail to with the line "subscribe tip" 1027 somewhere in the body of the message. 1029 Appendix A. The TIP Multiplexing Protocol Version 2.0. 1031 This appendix describes version 2.0 of the TIP Multiplexing Protocol 1032 (TMP). 
TMP V2.0 is the same as the Session Control Protocol (SCP) 1033 version 2.0, as described by [6]. TMP is intended solely for use 1034 with TIP, and forms part of the TIP 1035 specification (although its implementation is optional), hence its 1036 inclusion in this document. TMP V2.0 is the only multiplexing 1037 protocol supported by TIP V2.0. The following text is a copy of [6] 1038 with no substantive changes; it is edited only as necessary to 1039 reflect the name change and for inclusion in this document. 1041 Abstract 1043 TMP provides a simple mechanism for creating multiple lightweight 1044 connections over a single TCP connection. Several such lightweight 1045 connections can be active simultaneously. TMP provides a byte 1046 oriented service, but allows message boundaries to be marked. 1048 A.1. Introduction 1050 There are several protocols in widespread use on the Internet which 1051 create a single TCP connection for each transaction. Unfortunately, 1052 because these transactions are short-lived, the cost of setting up 1053 and tearing down these TCP connections becomes significant, both in 1054 terms of resources used and in the delays associated with TCP's 1055 congestion control mechanisms. 1057 The TIP Multiplexing Protocol (TMP) is a simple protocol running on 1058 top of TCP that can be used to create multiple lightweight 1059 connections over a single transport connection. TMP therefore 1060 provides for more efficient use of TCP connections. Data from 1061 several different TMP connections can be interleaved, and both 1062 message boundaries and end of stream markers can be provided. 1064 Because TMP runs on top of a reliable, byte-ordered transport 1065 service, it can avoid most of the extra work TCP must go through in 1066 order to ensure reliability. For example, TMP connections do not 1067 need to be confirmed, so there is no need to wait for handshaking 1068 to complete before data can be sent.
1070 TMP is a useful multiplexing protocol when all messages are short and 1071 buffering is not a problem (as is the case for TIP). If you are 1072 designing a different protocol that needs multiplexing, TMP may or 1073 may not be appropriate. (Protocols with large messages can exceed the 1074 buffering capabilities of the receiver, and under certain conditions 1075 this can cause deadlock.) 1077 A.2. Protocol Model 1079 The basic protocol model is that of multiple lightweight 1080 connections operating over a reliable stream of bytes. The party 1081 which initiated the connection is referred to as the primary, and 1082 the party which accepted the connection is referred to as the 1083 secondary. 1085 Connections may be unidirectional or bi-directional; each end of a 1086 bi-directional connection may be closed separately. Connections may 1087 be closed normally, or reset to indicate an abortive release. 1088 Aborting a connection closes both data streams. 1090 Once a connection has been opened, applications can send messages 1091 over it, and signal the end of application level messages. 1092 Application messages are encapsulated in TMP packets and 1093 transferred over the byte stream. A single TIP command (TMP 1094 application message) must be wholly contained within a single TMP 1095 packet. 1097 A.3. TMP Packet Format 1099 A TMP packet consists of a 64 bit header followed by zero or more 1100 octets of data. The header contains three fields: a flag byte, the 1101 connection identifier, and the packet length. Both integers, the 1102 connection identifier and the packet length, must be sent in network 1103 byte order.

1105    FLAGS
1106  +--------+--------+--------+--------+
1107  |SFPR0000| Connection ID            |
1108  +--------+--------+--------+--------+
1109  |        | Length                   |
1110  +--------+--------+--------+--------+

1112 A.3.1. Flag Details

1114  +-------+-----------+-----------------------------------------+
1115  | Name  | Mask      | Description                             |
1116  +-------+-----------+-----------------------------------------+
1117  | SYN   | 1xxx|0000 | Open a new connection                   |
1118  | FIN   | x1xx|0000 | Close an existing connection            |
1119  | PUSH  | xx1x|0000 | Mark application level message boundary |
1120  | RESET | xxx1|0000 | Abort the connection                    |
1121  +-------+-----------+-----------------------------------------+

1123 A.4. Connection Identifiers 1125 Each TMP connection is identified by a 24 bit integer. TMP 1126 connections created by the party which initiated the underlying TCP 1127 connection must have even identifiers; those created by the other 1128 party must have odd identifiers. 1130 A.5. TMP Connection States 1132 TMP connections can exist in several different states: Closed, 1133 OpenWrite, OpenSynRead, OpenSynReset, ReadWrite, CloseWrite, 1134 and CloseRead. A connection can change its state in response to 1135 receiving a packet with the SYN, FIN, or RESET bits set, or in 1136 response to an API call by the application. The available API calls 1137 are open, close, and abort. 1139 The meaning of most states is obvious (e.g. OpenWrite means that a 1140 connection has been opened for writing). The meaning of the states 1141 OpenSynRead and OpenSynReset needs more explanation. 1143 In the OpenSynRead state a primary opened and immediately closed the 1144 output data stream of a connection, and is now waiting for a SYN 1145 response from the secondary to open the input data stream for 1146 reading. 1148 In the OpenSynReset state a primary opened and immediately aborted 1149 a connection, and is now waiting for a SYN response from the 1150 secondary to finally close the connection. 1152 A.6. Event Priorities and State Transitions 1154 The state table shown below describes the actions and state 1155 transitions that occur in response to a given event.
The events 1156 accepted by each state are listed in priority order with highest 1157 priority first. If multiple events are present in a message, those 1158 events matching the list are processed. If multiple events match, 1159 the event with the highest priority is accepted and processed 1160 first. Any remaining events are processed in the resultant 1161 successor state. 1163 For example, if a TMP connection at the secondary is in the Closed 1164 state, and the secondary receives a packet containing a SYN event, a 1165 FIN event and an input data event (i.e. DATA-IN), the secondary first 1166 accepts the SYN event (because it is the only match in Closed 1167 state). The secondary accepts the connection, sends a SYN event and 1168 enters the ReadWrite state. The SYN event is removed from the list 1169 of pending events. The remaining events are FIN and DATA-IN. In the 1170 ReadWrite state the secondary reads the input data (i.e. the DATA-IN 1171 event is processed first because it has higher priority than the 1172 FIN event). Once the data has been read and the DATA-IN event has 1173 been removed from the list of pending events, the FIN event is 1174 processed and the secondary enters the CloseWrite state. 1176 If the secondary receives a packet containing a SYN event, and is for 1177 some reason unable to accept the connection (e.g. insufficient 1178 resources), it should reject the request by sending a SYN event 1179 followed by a RESET event. Note that both events can be sent as part 1180 of the same TMP packet. 1182 If either party receives a TMP packet that it does not understand, or 1183 an event in an incorrect state, it closes the TCP connection. 

1185  +==============+=========+==========+==============+
1186  | Entry State  | Event   | Action   | Exit State   |
1187  +==============+=========+==========+==============+
1188  | Closed       | SYN     | SYN      | ReadWrite    |
1189  |              | OPEN    | SYN      | OpenWrite    |
1190  +--------------+---------+----------+--------------+
1191  | OpenWrite    | SYN     | Accept   | ReadWrite    |
1192  |              | WRITE   | DATA-OUT | OpenWrite    |
1193  |              | CLOSE   | FIN      | OpenSynRead  |
1194  |              | ABORT   | RESET    | OpenSynReset |
1195  +--------------+---------+----------+--------------+
1196  | OpenSynRead  | SYN     | Accept   | CloseRead    |
1197  +--------------+---------+----------+--------------+
1198  | OpenSynReset | SYN     | Accept   | Closed       |
1199  +--------------+---------+----------+--------------+
1200  | ReadWrite    | DATA-IN | Accept   | ReadWrite    |
1201  |              | FIN     | Accept   | CloseWrite   |
1202  |              | RESET   | Accept   | Closed       |
1203  |              | WRITE   | DATA-OUT | ReadWrite    |
1204  |              | CLOSE   | FIN      | CloseRead    |
1205  |              | ABORT   | RESET    | Closed       |
1206  +--------------+---------+----------+--------------+
1207  | CloseWrite   | RESET   | Accept   | Closed       |
1208  |              | WRITE   | DATA-OUT | CloseWrite   |
1209  |              | CLOSE   | FIN      | Closed       |
1210  |              | ABORT   | RESET    | Closed       |
1211  +--------------+---------+----------+--------------+
1212  | CloseRead    | DATA-IN | Accept   | CloseRead    |
1213  |              | FIN     | Accept   | Closed       |
1214  |              | RESET   | Accept   | Closed       |
1215  |              | ABORT   | RESET    | Closed       |
1216  +--------------+---------+----------+--------------+

1218  TMP Event Priorities and State Transitions
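
The table and the priority rule of section A.6 can be exercised with a short table-driven sketch. The Python below is an illustration under stated assumptions, not part of the protocol: the names TRANSITIONS, ProtocolError and process are hypothetical, events are modelled as plain strings, and "closing the TCP connection" on an event that no rule accepts is modelled by raising an exception.

```python
class ProtocolError(Exception):
    """Raised where the text says the TCP connection is closed."""


# entry state -> rules as (event, action, exit state), highest priority first,
# transcribed row by row from the state table above.
TRANSITIONS = {
    "Closed": [("SYN", "SYN", "ReadWrite"), ("OPEN", "SYN", "OpenWrite")],
    "OpenWrite": [("SYN", "Accept", "ReadWrite"),
                  ("WRITE", "DATA-OUT", "OpenWrite"),
                  ("CLOSE", "FIN", "OpenSynRead"),
                  ("ABORT", "RESET", "OpenSynReset")],
    "OpenSynRead": [("SYN", "Accept", "CloseRead")],
    "OpenSynReset": [("SYN", "Accept", "Closed")],
    "ReadWrite": [("DATA-IN", "Accept", "ReadWrite"),
                  ("FIN", "Accept", "CloseWrite"),
                  ("RESET", "Accept", "Closed"),
                  ("WRITE", "DATA-OUT", "ReadWrite"),
                  ("CLOSE", "FIN", "CloseRead"),
                  ("ABORT", "RESET", "Closed")],
    "CloseWrite": [("RESET", "Accept", "Closed"),
                   ("WRITE", "DATA-OUT", "CloseWrite"),
                   ("CLOSE", "FIN", "Closed"),
                   ("ABORT", "RESET", "Closed")],
    "CloseRead": [("DATA-IN", "Accept", "CloseRead"),
                  ("FIN", "Accept", "Closed"),
                  ("RESET", "Accept", "Closed"),
                  ("ABORT", "RESET", "Closed")],
}


def process(state, events):
    """Consume all pending events; return (final state, [(event, action)])."""
    pending = set(events)
    performed = []
    while pending:
        # Take the highest-priority rule of the current state that matches.
        for event, action, exit_state in TRANSITIONS[state]:
            if event in pending:
                pending.discard(event)
                performed.append((event, action))
                state = exit_state
                break
        else:
            # No pending event is acceptable in this state.
            raise ProtocolError(f"{sorted(pending)} not valid in {state}")
    return state, performed
```

Calling process("Closed", ["SYN", "FIN", "DATA-IN"]) reproduces the worked example in the text: SYN is accepted first, DATA-IN is then processed ahead of FIN in the ReadWrite state, and the connection ends in CloseWrite.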