idnits 2.17.1 draft-lyon-itp-nodes-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-26) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. 
== Mismatching filename: the document gives the document name as 'draft-lyon-itp-nodes-01', but the file name used is 'draft-lyon-itp-nodes-02' == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 17 instances of lines with control characters in the document. Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 572 has weird spacing: '... K. Evans 4 Expires in 6 months J. Klein 5 Tandem Computers 6 September 8th, 1997 8 Transaction Internet Protocol 9 Version 2.0 11 13 Status of this Memo 15 This document is an Internet-Draft. Internet-Drafts are working 16 documents of the Internet Engineering Task Force (IETF), its areas, 17 and its working groups. Note that other groups may also distribute 18 working documents as Internet-Drafts. 20 Internet-Drafts are draft documents valid for a maximum of six 21 months and may be updated, replaced, or obsoleted by other documents 22 at any time. It is inappropriate to use Internet-Drafts as reference 23 material or to cite them other than as "work in progress." 25 To learn the current status of any Internet-Draft, please check the 26 "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow 27 Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), 28 munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or 29 ftp.isi.edu (US West Coast). 
31 Abstract 33 In many applications where different nodes cooperate on some work, 34 there is a need to guarantee that the work happens atomically. That 35 is, each node must reach the same conclusion as to whether the work 36 is to be completed, even in the face of failures. This document 37 proposes a simple, easily-implemented protocol for achieving this 38 end. 40 Table of Contents 42 Status of this memo 1 43 Abstract 1 44 Table of Contents 2 45 1. Introduction 3 46 2. Example Usage 3 47 3. Transactions 4 48 4. Connections 4 49 5. Transaction Identifiers 5 50 6. Pushing vs. Pulling Transactions 5 51 7. Endpoint Identification 6 52 8. TIP Uniform Resource Locators 7 53 9. States of a Connection 8 54 10. Protocol Versioning 9 55 11. Commands and Responses 10 56 12. Command Pipelining 10 57 13. TIP Commands 10 58 14. Error Handling 16 59 15. Connection Failure and Recovery 16 60 16. Security Considerations 17 61 17. Significant changes from previous version 18 62 App A. The TIP Multiplexing Protocol Version 2.0 19 63 References 23 64 Authors' Addresses 23 65 Comments 23 67 1. Introduction 69 The standard method for achieving atomic commitment is the two-phase 70 commit protocol; see [1] for an introduction to atomic commitment and 71 two-phase commit protocols. 73 Numerous two-phase commit protocols have been implemented over the 74 years. However, none of them has become widely used in the Internet, 75 due mainly to their complexity. Most of that complexity comes from 76 the fact that the two-phase commit protocol is bundled together with 77 a specific program-to-program communication protocol, and that 78 protocol lives on top of a very large infrastructure. 80 This memo proposes a very simple two-phase commit protocol. It 81 achieves its simplicity by specifying only how different nodes agree 82 on the outcome of a transaction; it allows (even requires) that the 83 subject matter on which the nodes are agreeing be communicated via 84 other protocols. 
By doing so, we avoid all of the issues related to 85 application communication semantics and data representation 86 (to name just a few). Independent of the application communication 87 protocol a transaction manager may use the Transport Layer Security 88 protocol [3] to authenticate other transaction managers and encrypt 89 messages. 91 It is envisioned that this protocol will be used mainly for a 92 transaction manager on one Internet node to communicate with a 93 transaction manager on another node. While it is possible to use 94 this protocol for application programs and/or resource managers to 95 speak to transaction managers, this communication is usually 96 intra-node, and most transaction managers already have more-than- 97 adequate interfaces for the task. 99 While we do not expect this protocol to replace existing ones, we 100 do expect that it will be relatively easy for many existing 101 heterogeneous transaction managers to implement this protocol for 102 communication with each other. 104 Further supplemental information regarding the TIP protocol can be 105 found in [5]. 107 2. Example Usage 109 Today the electronic shopping basket is a common metaphor at many 110 electronic store-fronts. Customers browse through an electronic 111 catalog, select goods and place them into an electronic shopping 112 basket. HTTP servers [2] provide various means ranging from URL 113 encoding to context cookies to keep track of client context (e.g. 114 the shopping basket of a customer) and resume it on subsequent 115 customer requests. 117 Once a customer has finished shopping they may decide to commit 118 their selection and place the associated orders. Most orders may have 119 no relationship with each other except being executed as part of the 120 same shopping transaction; others may be dependent on each other 121 (for example, if made as part of a special offering). 
Irrespective of 122 these details a customer will expect that all orders have been 123 successfully placed upon receipt of a positive acknowledgment. 125 Today's electronic store-fronts must implement their own special 126 protocols to coordinate such placement of all orders. This 127 programming is especially complex when orders are placed through 128 multiple electronic store-fronts. This complexity limits the 129 potential utility of internet applications, and constrains growth. 131 The protocol described in this document intends to provide a standard 132 for internet servers to achieve agreement on a unit of shared work 133 (e.g. placement of orders in an electronic shopping basket). 134 The server (e.g. a CGI program) placing the orders may want to start 135 a transaction calling its local transaction manager, and ask 136 other servers participating in the work to join the transaction. 137 The server placing the orders passes a reference to the transaction 138 as user data on HTTP requests to the other servers. The other 139 servers call their transaction managers to start a local transaction 140 and ask them to join the remote transaction using the protocol 141 defined in this document. Once all orders have been placed, execution 142 of the two-phase-commit protocol is delegated 143 to the involved transaction managers. If the transaction commits, 144 all orders have been successfully placed and the customer gets a 145 positive acknowledgment. If the transaction aborts no orders will 146 be placed and the customer will be informed of the problem. 148 Transaction support greatly simplifies programming of these 149 applications as exception handling and failure recovery are delegated 150 to a special component. End users are also not left having to deal 151 with the consequences of only partial success. 153 While this example shows how the protocol can be used by HTTP 154 servers, applications may use the protocol when accessing a remote 155 database (e.g. 
via ODBC), or invoking remote services using other 156 already existing protocols (e.g. RPC). The protocol makes it easy for 157 applications in a heterogeneous network to participate in the same 158 transaction, even if using different communication protocols. 160 3. Transactions 162 "Transaction" is the term given to the programming model whereby 163 computational work performed has atomic semantics. That is, either 164 all work completes successfully and changes are made permanent (the 165 transaction commits), or if any work is unsuccessful, changes are 166 undone (the transaction aborts). The work comprising a transaction 167 (unit of work), is defined by the application. 169 4. Connections 171 The Transaction Internet Protocol (TIP) requires a reliable ordered 172 stream transport with low connection setup costs. In an Internet (IP) 173 environment, TIP operates over TCP, optionally using a protocol to 174 multiplex light-weight connections over the same TCP connection. 176 Transaction managers which share transactions establish a TCP 177 connection. The protocol uses a different connection for each 178 simultaneous transaction shared between two transaction managers. 179 After a transaction has ended, the connection can be reused for 180 a different transaction. 182 Optionally, instead of associating a TCP connection with only a 183 single transaction, two transaction managers may agree on a protocol 184 to multiplex light-weight connections over the same TCP connection, 185 and associate each simultaneous transaction with a separate light- 186 weight connection. Using light-weight connections reduces latency 187 and resource consumption associated with executing simultaneous 188 transactions. Similar techniques as described here are widely used 189 by existing transaction processing systems. See Appendix A for an 190 example of one such protocol. 
192 Note that although the TIP protocol itself is described only in terms 193 of TCP, there is nothing to preclude the use of TIP with other 194 transport protocols. However, it is up to the implementor to ensure 195 the chosen transport provides equivalent semantics to TCP, and to map 196 the TIP protocol appropriately. 198 5. Transaction Identifiers 200 Unfortunately, there is no single globally-accepted standard for the 201 format of a transaction identifier; there are various standard and 202 proprietary formats. Allowed formats for a TIP transaction 203 identifier are described below in the section "TIP Uniform Resource 204 Locators". A transaction manager may map its internal transaction 205 identifiers into this TIP format in any manner it sees fit. 206 Furthermore, each party in a superior/subordinate relationship gets 207 to assign its own identifier to the transaction; these identifiers 208 are exchanged when the relationship is first established. Thus, a 209 transaction manager gets to use its own format of transaction 210 identifier internally, but it must remember a foreign transaction 211 identifier for each superior/subordinate relationship in which it is 212 involved. 214 6. Pushing vs. Pulling Transactions 216 Suppose that some program on node "A" has created a transaction, and 217 wants some program on node "B" to do some work as part of the 218 transaction. There are two classical ways that he does this, 219 referred to as the "push" model and the "pull" model. 221 In the "push" model, the program on A first asks his transaction 222 manager to export the transaction to node B. A's transaction manager 223 sends a message to B's TM asking it to instantiate the transaction as 224 a subordinate of A, and return its name for the transaction. The 225 program on A then sends a message to its counterpart on B on the 226 order of "Do some work, and make it part of the transaction that your 227 transaction manager already knows of by the name ...". 
Because A's 228 TM knows that it sent the transaction to B's TM, A's TM knows to 229 involve B's TM in the two-phase commit process. 231 In the "pull" model, the program on A merely sends a message to B on 232 the order of "Do some work, and make it part of the transaction that 233 my TM knows by the name ...". The program on B asks its TM to enlist 234 in the transaction. At that time, B's TM will "pull" the transaction 235 over from A. As a result of this pull, A's TM knows to involve B's 236 TM in the two-phase commit process. 238 The protocol described here supports both the "push" and "pull" 239 models. 241 7. Endpoint Identification 243 In certain cases after connection failures, one of the parties of 244 a connection may have a responsibility to re-establish a new 245 connection to the other party in order to complete the 246 two-phase-commit protocol. If the party that initiated the original 247 connection needs to re-establish it, the job is easy: he merely 248 establishes a connection in the same way that he originally did it. 249 However, if the other party needs to re-establish the connection, 250 he needs to know how to contact the initiator of the original 251 connection. He gets this information in the following way: 253 After a TCP connection has been established the initiating party 254 issues an IDENTIFY command and supplies an endpoint identifier which 255 is used to re-establish the connection if needed. If the initiating 256 party does not supply an endpoint identifier on the IDENTIFY command, 257 he must not perform any action which would require a connection to be 258 re-established (e.g. perform recovery actions). 260 An <endpoint identifier> as used in the IDENTIFY (and a few other) 261 commands has one of the following formats: 262 <hostname> 263 <ip-address> 264 <hostname>:<port-number> 265 <ip-address>:<port-number> 267 A <hostname> is a standard name, acceptable to the domain name 268 service. It must be sufficiently qualified to be useful to the 269 receiver of the command. 
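The endpoint identifier formats above can be illustrated with a short sketch. It assumes an identifier is a host name or IP address, optionally followed by a colon and a decimal port; the names Endpoint and parse_endpoint are hypothetical and are not defined by TIP. The host is not validated against DNS or IP syntax.

```python
from typing import NamedTuple, Optional


class Endpoint(NamedTuple):
    host: str            # DNS name or dotted-decimal IP address, kept as text
    port: Optional[int]  # None when the identifier carries no port


def parse_endpoint(identifier: str) -> Endpoint:
    """Split 'host' or 'host:port' into its parts (illustrative helper)."""
    host, sep, port_text = identifier.strip(" ").partition(":")
    if not host:
        raise ValueError("missing host in endpoint identifier")
    if not sep:
        # No colon: the caller falls back to a standard service port.
        return Endpoint(host, None)
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError("port must be a decimal number")
    return Endpoint(host, int(port_text))
```

A caller that receives port None would substitute whichever standard transaction service port matches the security of the current connection.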
271 An <ip-address> is an IP address, in the usual form: four decimal 272 numbers separated by period characters. 274 The <port-number> is a decimal number specifying the port at which 275 the transaction manager is listening for requests to establish TCP 276 connections. Two standard transaction service port numbers are 277 defined: xxxx for TLS secured connections, and yyyy for unsecured 278 connections. If the port number is omitted from the endpoint 279 identifier, and if the current connection is TLS secured, then the 280 standard TLS secured transaction service port number is assumed; 281 otherwise the standard unsecured transaction service port number is 282 assumed. Likewise, if a port number is specified, then it must 283 represent a port with the same security capabilities as the current 284 connection (i.e. TLS or unsecured). 286 8. TIP Uniform Resource Locators 288 Transactions and transaction managers are resources associated 289 with the TIP protocol. Transaction managers and transactions are 290 located using TCP/IP endpoint identifiers. Once a TCP connection has 291 been established, TIP commands may be sent to operate on transactions 292 associated with the respective transaction managers. 294 Applications which want to pull a transaction from a remote node 295 must supply a reference to the remote transaction which allows 296 the local transaction manager (i.e. the transaction manager pulling 297 the transaction) to connect to the remote transaction 298 manager and identify the particular transaction. Applications 299 which want to push a transaction to a remote node must supply 300 a reference to the remote transaction manager (i.e. the transaction 301 manager to which the transaction is to be pushed), which allows the 302 local transaction manager to locate the remote transaction 303 manager. 305 The TIP protocol defines a URL scheme [4] which allows applications 306 and transaction managers to exchange references (i.e. 
TIP URLs) to 307 transaction managers and transactions. 309 A TIP URL takes the form: 311 TIP://<host>[:<port>]/<transaction string> 313 where <host> is an IP address or a DNS name as defined above; and 314 <port> is a valid TCP port number. <transaction string> may take one 315 of two forms (standard or non-standard): 317 i. "urn:" <NID> ":" <NSS> 319 A standard transaction identifier, conforming to the proposed 320 Internet Standard for Uniform Resource Names (URNs), as 321 specified by RFC2141; where <NID> is the Namespace Identifier, 322 and <NSS> is the Namespace Specific String. The Namespace ID 323 determines the syntactic interpretation of the Namespace 324 Specific String. The Namespace Specific String is a sequence of 325 characters representing a transaction identifier (as defined by 326 <NID>). The rules for the contents of these fields are 327 specified by [7] (valid characters, encoding, etc.). 329 This format of <transaction string> may be used to express 330 global transaction identifiers in terms of standard 331 representations. Examples for <NID> might be <iso> or <xopen>. 332 e.g. 334 TIP://123.123.123.123/urn:xopen:xid 336 ii. <transaction identifier> 338 A sequence of printable ASCII characters (octets with values in 339 the range 33 through 126 inclusive (excluding ":")) 340 representing a transaction identifier. In this non-standard 341 case, it is the combination of <host> and 342 <transaction identifier> which ensures global uniqueness. e.g. 344 TIP://123.123.123.123/transid1 346 Except as otherwise described above, the TIP URL scheme follows the 347 rules for reserved characters as defined in [4], and uses escape 348 sequences as defined in [4] Section 5. 350 Note that the TIP protocol itself does not use the TIP URL scheme. 351 This URL scheme is proposed as a standard way to pass transaction 352 identification information through other protocols, e.g. between 353 cooperating application processes. The URL may then be used to 354 communicate to the local transaction manager the information 355 necessary to associate the application with a particular TIP 356 transaction, e.g. 
to PULL the transaction from a remote transaction 357 manager. It is anticipated that each TIP implementation will provide 358 some set of APIs for this purpose. 360 To create a non-standard TIP URL from a transaction identifier, first 361 replace any reserved characters in the transaction identifier with 362 their equivalent escape sequences, then insert the appropriate host 363 endpoint identification. If the transaction identifier is one that 364 you created, insert your own endpoint identification. If the 365 transaction identifier is one that you received on a TIP connection 366 that you initiated, insert the identification of the party to which 367 you connected. If the transaction identifier is one that you received 368 on a TIP connection that you did not initiate, use the identification 369 that was received in the IDENTIFY command. 371 9. States of a Connection 373 At any instant, only one party on a connection is allowed to send 374 commands, while the other party is only allowed to respond to 375 commands that he receives. Throughout this document, the party that 376 is allowed to send commands is called "primary"; the other party is 377 called "secondary". Initially, the party that initiated the 378 connection is primary; however, a few commands cause the 379 roles to switch. A connection returns to its original polarity 380 whenever the Idle state is reached. These rules remain true for "virtual" 381 connections when multiplexing is being used. 383 At any instant, a connection is in one of the following states. 384 From the point of view of the secondary party, the state changes when 385 he sends a reply; from the point of view of the primary party, the 386 state changes when he receives a reply. 388 Initial: The initial connection starts out in the Initial state. 389 Upon entry into this state, the party that initiated the 390 connection becomes primary, and the other party becomes secondary. 
391 There is no transaction associated with the connection in this 392 state. From this state, the primary can send the IDENTIFY command. 394 Idle: In this state, the primary and the secondary have 395 agreed on a protocol version, and the primary supplied an 396 endpoint identifier to the secondary party to reconnect after 397 a failure. There is no transaction associated with the 398 connection in this state. Upon entry to this state, the party 399 that initiated the connection becomes primary, and the other 400 party becomes secondary. From this state, the primary can send 401 any of the following commands: BEGIN, MULTIPLEX, PUSH, PULL, 402 QUERY and RECONNECT. 404 Begun: In this state, a connection is associated with an active 405 transaction, which can only be completed by a one-phase protocol. 406 A BEGUN response to a BEGIN command places a connection into 407 this state. Failure of a connection in Begun state implies 408 that the transaction will be aborted. From this state, the 409 primary can send an ABORT, or COMMIT command. 411 Enlisted: In this state, the connection is associated with an active 412 transaction, which can be completed by a one-phase or, two-phase 413 protocol. A PUSHED response to a PUSH command, or a PULLED 414 response to a PULL command, places the connection into this state. 415 Failure of the connection in Enlisted state implies that the 416 transaction will be aborted. From this state, the primary can 417 send an ABORT, COMMIT, or PREPARE command. 419 Prepared: In this state, a connection is associated with a 420 transaction that has been prepared. A PREPARED response to a 421 PREPARE command, or a RECONNECTED response to a RECONNECT 422 command places a connection into this state. Unlike other 423 states, failure of a connection in this state does not cause 424 the transaction to automatically abort. From this state, the 425 primary can send an ABORT, or COMMIT command. 
427 Multiplexing: In this state, the connection is being used by a 428 multiplexing protocol, which provides its own set of connections. 429 In this state, no TIP commands are possible on the connection. 430 (Of course, TIP commands are possible on the connections 431 supplied by the multiplexing protocol.) The connection can 432 never leave this state. 434 Error: In this state, a protocol error has occurred, and the 435 connection is no longer useful. 437 10. Protocol Versioning 439 This document describes version 2 of the protocol. In order to 440 accommodate future versions, the primary party sends a message 441 indicating the lowest and the highest version number it understands. 442 The secondary responds with the highest version number it 443 understands. 445 After such an exchange, communication can occur using the smaller of 446 the highest version numbers (i.e., the highest version number that 447 both understand). This exchange is mandatory and occurs using the 448 IDENTIFY command (and IDENTIFIED response). 450 If the highest version supported by one party is considered obsolete 451 and no longer supported by the other party, no useful communication 452 can occur. In this case, the newer party should merely drop the 453 connection. 455 11. Commands and Responses 457 All commands and responses consist of one line of ASCII text, using 458 only octets with values in the range 32 through 127 inclusive, 459 followed by either a CR (an octet with value 13) or an LF (an octet 460 with value 10). Each line can be split up into one or more "words", 461 where successive words are separated by one or more space octets 462 (value 32). 464 Arbitrary numbers of spaces at the beginning and/or end of each line 465 are allowed, and ignored. 467 Lines that are empty, or consist entirely of spaces, are ignored. 
468 (One implication of this is that you can terminate lines with both a 469 CR and an LF if desired; the LF will be treated as terminating an 470 empty line, and ignored.) 472 In all cases, the first word of each line indicates the 473 type of command or response; all defined commands and responses 474 consist of upper-case letters only. 476 For some commands and responses, subsequent words convey parameters 477 for the command or response; each command and response takes a fixed 478 number of parameters. 480 All words on a command or response line after the last defined word 481 are totally ignored. These can be used to pass human-readable 482 information for debugging or other purposes. 484 12. Command Pipelining 486 The primary party of a connection is allowed to issue multiple 487 commands without having to wait for responses. This reduces 488 latency and allows the primary to react immediately to local state 489 changes. Examples are a PREPARE command immediately followed by 490 an ABORT command after the primary detected that a transaction must 491 be aborted, or a COMMIT response immediately followed by a PULL 492 command. The secondary must issue replies in the order of the 493 commands received. If a command causes an error the connection enters 494 the Error state and all subsequent commands on the connection are 495 discarded. 497 13. TIP Commands 499 Following is a list of all valid commands, and all possible 500 responses to each (for each command, whether it pertains to a 501 transaction, or a connection, is specified [thus]): 503 ABORT 505 This command is valid in the Begun, Enlisted, and Prepared states. 506 It informs the secondary that the current transaction of the 507 connection will abort. Possible responses are: 509 ABORTED 510 The transaction has aborted; the connection enters Idle 511 state, and the initiator of the connection becomes primary. 513 ERROR 514 The command was issued in the wrong state, or was malformed. 
515 The connection enters the Error state. 517 BEGIN 519 This command is valid only in the Idle state. It asks the 520 secondary to create a new transaction and associate it with the 521 connection. The newly created transaction will be completed with a 522 one-phase protocol. Possible responses are: 524 BEGUN 525 A new transaction has been successfully begun, and that 526 transaction is now the current transaction of the connection. 527 The connection enters Begun state. 529 NOTBEGUN 530 A new transaction could not be begun; the connection 531 remains in Idle state. 533 ERROR 534 The command was issued in the wrong state, or was malformed. 535 The connection enters the Error state. 537 COMMIT 539 This command is valid in the Begun, Enlisted or Prepared states. 540 In the Begun or Enlisted state, it asks the secondary to attempt 541 to commit the transaction; in the Prepared state, it informs the 542 secondary that the transaction has committed. Note that in the 543 Enlisted state this command represents a one-phase protocol, and 544 should only be done when the sender has 1) no local recoverable 545 resources involved in the transaction, and 2) only one subordinate 546 (the sender will not be involved in any transaction recovery 547 process). Possible responses are: 549 ABORTED 550 This response is possible only from the Begun and Enlisted 551 states. It indicates that some party has vetoed the commitment 552 of the transaction, so it has been aborted instead of 553 committing. The connection enters the Idle state. 555 COMMITTED 556 This response indicates that the transaction has been 557 committed, and that the primary no longer has any 558 responsibilities to the secondary with respect to the 559 transaction. The connection enters the Idle state. 561 ERROR 562 The command was issued in the wrong state, or was malformed. 563 The connection enters the Error state. 
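The one-phase condition stated for COMMIT in the Enlisted state can be sketched as a small decision function. This is only an illustration: the function name and its two counters are hypothetical, and a real transaction manager would derive them from its own bookkeeping.

```python
def first_completion_command(local_recoverable_resources: int,
                             subordinates: int) -> str:
    """Pick the command a superior sends first on an Enlisted connection.

    COMMIT (one-phase) is chosen only when the sender has no local
    recoverable resources and exactly one subordinate; otherwise the
    sender starts two-phase commit with PREPARE.
    """
    if local_recoverable_resources == 0 and subordinates == 1:
        return "COMMIT"
    return "PREPARE"
```

For example, a sender with one subordinate and a local recoverable resource gets "PREPARE", because it would have to take part in recovery if the connection failed.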
565 ERROR 567 This command is valid in any state; it informs the secondary that 568 a previous response was not recognized or was badly formed. A 569 secondary should not respond to this command. The connection 570 enters Error state. 572 IDENTIFY 573 574 | "-" 576 This command is valid only in the Initial state. The primary party 577 informs the secondary party of the lowest and highest protocol 578 version supported (all versions between the lowest and highest 579 must be supported), and optionally of an IP address and a port 580 number at which the other party can re-establish a connection 581 if ever needed. If the primary party does not supply an endpoint 582 identifier the secondary party will respond with ABORTED or 583 READONLY to any PREPARE commands. Possible responses are: 585 IDENTIFIED 586 The accepting party has saved the identification. The response 587 contains the highest protocol version supported by the 588 secondary party. All future communication is assumed to take 589 place using the smaller of the protocol versions in the 590 IDENTIFY command and the IDENTIFIED response. The connection 591 enters the Idle state. 593 ERROR 594 The command was issued in the wrong state, or was malformed. 595 This response also occurs if the accepting party does not 596 support any version of the protocol in the range supported 597 by the initiator. 598 The connection enters the Error state. The initiator should 599 close the connection. 601 MULTIPLEX 603 This command is only valid in the Idle state. The command 604 seeks agreement to use the connection for a multiplexing 605 protocol that will supply a large number of connections on 606 the existing connection. The primary suggests a particular 607 multiplexing protocol. The secondary party can either accept 608 or reject use of this protocol. 610 At the present, the only defined protocol identifier is "TMP2.0", 611 which refers to the TIP Multiplexing Protocol, version 2.0. 
See 612 Appendix A for details of this protocol. Other protocol 613 identifiers may be defined in the future. Note that when using TMP 614 V2.0, a single TIP command (TMP application message) must be 615 wholly contained within a single TMP packet. 617 If the MULTIPLEX command is accepted, the specified multiplexing 618 protocol will totally control the underlying connection. This 619 protocol will begin with the first byte after the line terminator 620 of the MULTIPLEX command (for data sent by the initiator), 621 and the first byte after the line terminator of the MULTIPLEXING 622 response (for data received by the initiator). This implies that 623 an implementation must not send both a CR and an LF octet after 624 either the MULTIPLEX command or the MULTIPLEXING response, lest 625 the LF octet be mistaken for the first byte of the multiplexing 626 protocol. 628 Possible responses to the MULTIPLEX command are: 630 MULTIPLEXING 631 The secondary party agrees to use the specified multiplexing 632 protocol. The connection enters the Multiplexing state, and 633 all subsequent communication is as defined by that protocol. 634 All connections created by the multiplexing protocol start 635 out in the Idle state. 637 CANTMULTIPLEX 638 The secondary party cannot support (or refuses to use) the 639 specified multiplexing protocol. The connection remains in the 640 Idle state. 642 ERROR 643 The command was issued in the wrong state, or was malformed. 644 The connection enters the Error state. 646 PREPARE 648 This command is valid only in the Enlisted state; it requests 649 the secondary to prepare the transaction for commitment (phase 650 one of two-phase commit). Possible responses are: 652 PREPARED 653 The subordinate has prepared the transaction; the connection 654 enters the Prepared state. 656 ABORTED 657 The subordinate has vetoed committing the transaction. The 658 connection enters the Idle state, and the connection 659 initiator becomes primary. 
After this response, the 660 superior has no responsibilities to the subordinate with 661 respect to the transaction. 663 READONLY 664 The subordinate no longer cares whether the transaction 665 commits or aborts. The connection enters the Idle state, and 666 the connection initiator becomes primary. After this 667 response, the superior has no responsibilities to the 668 subordinate with respect to the transaction. 670 ERROR 671 The command was issued in the wrong state, or was malformed. 672 The connection enters the Error state. 674 PULL 675 677 This command is only valid in Idle state. This command seeks to 678 establish a superior/subordinate relationship in a transaction, 679 with the primary party of the connection as the subordinate (i.e., 680 he is pulling a transaction from the secondary party). Note that 681 the entire value of (as defined in the 682 section "TIP Uniform Resource Locators") must be specified as the 683 transaction identifier. Possible responses are: 685 PULLED 686 The relationship has been established. Upon receipt of this 687 response, the specified transaction becomes the current 688 transaction of the connection, and the connection enters 689 Enlisted state. Additionally, the roles of primary and 690 secondary become reversed. (That is, the superior becomes 691 the primary for the connection.) 693 NOTPULLED 694 The relationship has not been established (possibly, because 695 the secondary party no longer has the requested transaction). 696 The connection remains in Idle state. 698 ERROR 699 The command was issued in the wrong state, or was malformed. 700 The connection enters the Error state. 702 PUSH 704 This command is valid only in the Idle state. It seeks to 705 establish a superior/subordinate relationship in a transaction 706 with the primary as the superior. Note that the entire value of 707 (as defined in the section "TIP Uniform 708 Resource Locators") must be specified as the transaction 709 identifier. 
Possible responses are: 711 PUSHED 712 The relationship has been established, and the identifier by 713 which the subordinate knows the transaction is returned. The 714 transaction becomes the current transaction for the connection, 715 and the connection enters Enlisted state. 717 ALREADYPUSHED 718 The relationship has been established, and the identifier by 720 which the subordinate knows the transaction is returned. 721 However, the subordinate already knows about the transaction, 722 and is expecting the two-phase commit protocol to arrive via a 723 different connection. In this case, the connection remains in 724 the Idle state. 726 NOTPUSHED 727 The relationship could not be established. The connection 728 remains in the Idle state. 730 ERROR 731 The command was issued in the wrong state, or was malformed. 732 The connection enters Error state. 734 QUERY 736 This command is valid only in the Idle state. A subordinate uses 737 this command to determine whether a specific transaction still 738 exists at the superior. Possible responses are: 740 QUERIEDEXISTS 741 The transaction still exists. The connection remains in 742 the Idle state. 744 QUERIEDNOTFOUND 745 The transaction no longer exists. The connection remains in the 746 Idle state. 748 ERROR 749 The command was issued in the wrong state, or was malformed. 750 The connection enters Error state. 752 RECONNECT 754 This command is valid only in the Idle state. A superior uses the 755 command to re-establish a connection for a transaction when the 756 previous connection was lost during Prepared state. Possible 757 responses are: 759 RECONNECTED 760 The subordinate accepts the reconnection. The connection enters 761 Prepared state. 763 NOTRECONNECTED 764 The subordinate no longer knows about the transaction. The 765 connection remains in Idle state. 767 ERROR 768 The command was issued in the wrong state, or was malformed. 769 The connection enters Error state.
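As an illustration only, the state rules given above for the commands from IDENTIFY through RECONNECT can be collected into a lookup table. The following Python sketch is not part of the protocol: the names VALID_STATE, NEXT_STATE and next_state are hypothetical, and the commands described elsewhere in the document (ABORT, BEGIN, COMMIT) are deliberately left out.

```python
# Hypothetical summary of the command descriptions above.
# Only IDENTIFY, MULTIPLEX, PREPARE, PULL, PUSH, QUERY, RECONNECT and the
# ERROR command are covered; ABORT, BEGIN and COMMIT are omitted.

# State in which each command is valid (None means "valid in any state").
VALID_STATE = {
    "ERROR": None,
    "IDENTIFY": "Initial",
    "MULTIPLEX": "Idle",
    "PREPARE": "Enlisted",
    "PULL": "Idle",
    "PUSH": "Idle",
    "QUERY": "Idle",
    "RECONNECT": "Idle",
}

# Connection state entered after each (command, response) pair.
NEXT_STATE = {
    ("IDENTIFY", "IDENTIFIED"): "Idle",
    ("MULTIPLEX", "MULTIPLEXING"): "Multiplexing",
    ("MULTIPLEX", "CANTMULTIPLEX"): "Idle",
    ("PREPARE", "PREPARED"): "Prepared",
    ("PREPARE", "ABORTED"): "Idle",
    ("PREPARE", "READONLY"): "Idle",
    ("PULL", "PULLED"): "Enlisted",
    ("PULL", "NOTPULLED"): "Idle",
    ("PUSH", "PUSHED"): "Enlisted",
    ("PUSH", "ALREADYPUSHED"): "Idle",
    ("PUSH", "NOTPUSHED"): "Idle",
    ("QUERY", "QUERIEDEXISTS"): "Idle",
    ("QUERY", "QUERIEDNOTFOUND"): "Idle",
    ("RECONNECT", "RECONNECTED"): "Prepared",
    ("RECONNECT", "NOTRECONNECTED"): "Idle",
}


def next_state(state, command, response=None):
    """Return the connection state after `command` receives `response`."""
    if command == "ERROR":
        # The ERROR command is valid in any state and gets no response.
        return "Error"
    if state != VALID_STATE[command]:
        # A command issued in the wrong state draws an ERROR response.
        return "Error"
    if response == "ERROR":
        return "Error"
    return NEXT_STATE[(command, response)]


if __name__ == "__main__":
    state = "Initial"
    for command, response in [("IDENTIFY", "IDENTIFIED"),
                              ("PUSH", "PUSHED"),
                              ("PREPARE", "PREPARED")]:
        state = next_state(state, command, response)
        print(command, response, "->", state)
```

The sketch tracks connection state only. It does not model the reversal of the primary and secondary roles after PULLED, ABORTED or READONLY, which a real implementation would also need to record.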
771 [Note: Commands which pertain to connections are: IDENTIFY, 772 MULTIPLEX. Commands which pertain to transactions are: ABORT, 773 BEGIN, COMMIT, PREPARE, PULL, PUSH, QUERY, RECONNECT.] 775 14. Error Handling 777 If either party receives a line (either a command or a response) 778 that it cannot understand, it closes the connection. If either party 779 receives an ERROR indication or an ERROR response on a connection, 780 the connection enters the Error state and no further communication 781 is possible on that connection. An implementation may decide to 782 close the connection. Closing of the connection is treated by the 783 other party as a communication failure. 785 Receipt of an ERROR indication or an ERROR response indicates that 786 the other party believes that you have not properly implemented the 787 protocol. 789 15. Connection Failure and Recovery 791 A connection failure may be caused by a communication failure, or by 792 any party closing the connection. Depending on the state of a 793 connection, transaction managers will need to take various actions 794 when a connection fails. 796 If the connection fails in Initial or Idle state, the connection does 797 not refer to a transaction. No action is necessary. 799 If the connection fails in the Multiplexing state, all connections 800 provided by the multiplexing protocol are assumed to have failed. 801 Each of them will be treated independently. 803 If the connection fails in Begun or Enlisted state, each party will 804 abort the transaction. 806 If the connection fails in Prepared state, then the appropriate 807 action is different for the superior and subordinate in the 808 transaction. 810 If the superior determines that the transaction commits, then it 811 must eventually establish a new connection to the subordinate, and 812 send a RECONNECT command for the transaction. If it receives a 813 NOTRECONNECTED response, it need do nothing else.
However, if it 814 receives a RECONNECTED response, it must send a COMMIT request and 815 receive a COMMITTED response. 817 If the superior determines that the transaction aborts, it is allowed 818 to (but not required to) establish a new connection and send a 819 RECONNECT command for the transaction. If it receives a RECONNECTED 820 response, it should send an ABORT command. 822 The above definition allows the superior to reestablish the 823 connection before it knows the outcome of the transaction, if it 824 finds that convenient. Having succeeded in a RECONNECT command, 825 the connection is back in Prepared state, and the superior can send a 826 COMMIT or ABORT command as appropriate when it knows the transaction 827 outcome. 829 If a subordinate notices a connection failure in Prepared state, then 830 it should periodically attempt to create a new connection to the 831 superior and send a QUERY command for the transaction. It should 832 continue doing this until one of the following two events occurs: 834 1. It receives a QUERIEDNOTFOUND response from the superior. In this 835 case, the subordinate should abort the transaction. 837 2. The superior, on some connection that it initiated, sends a 838 RECONNECT command for the transaction to the subordinate. In this 839 case, the subordinate can expect to learn the outcome of the 840 transaction on this new connection. If this new connection should 841 fail before the subordinate learns the outcome of the transaction, 842 it should again start sending QUERY commands. 844 Note that if a TIP system receives either a QUERY or a RECONNECT 845 command, and for some reason is unable to satisfy the request (e.g. 846 the necessary recovery information is not currently available), then 847 the connection should be dropped. 849 16. Security Considerations 851 If a system implements this protocol, it is in essence allowing any 852 other system to attempt to reach an atomic agreement about some piece 853 of work. 
However, since this protocol itself does not cause the work 854 to occur, the security implications are minimal. If a system does not 855 protect itself through use of another protocol such as the 856 Transport Layer Security protocol, then security implications fall 857 into the following categories: 859 1. Someone PUSHED a new transaction to us that we don't want. 860 Depending on his correctness or intentions, he may or may not ever 861 complete it. Thus, an arbitrary computer may cause us to save a 862 little bit of state. An implementation concerned about this will 863 probably drop the TCP connection if the other system does not 864 complete transactions in a timely manner. 866 The Transport Layer Security protocol [3] may be used by a 867 transaction manager to restrict access to trusted clients only. 869 2. Someone PULLED a transaction from us when we didn't want him to. 870 In this case, he will become involved in the atomic commitment 871 protocol. At worst, he may cause a transaction to abort that 872 otherwise would have committed. Since transaction managers 873 traditionally reserve the right to abort any transaction for any 874 reason they see fit, this does not represent a disaster to the 875 applications. However, if done frequently, it may represent a 876 denial-of-service attack. 878 Implementations concerned about this kind of attack can use the 879 Transport Layer Security protocol [3] to restrict access to 880 trusted partners (i.e. to control from which remote endpoints 881 TIP transactions will be accepted, and to verify that an endpoint 882 is genuine), and to encrypt TIP commands, thus preventing unauthorized 883 disclosure of transaction identifiers. 885 3. Someone violates the TIP commitment protocol (e.g. a COMMIT 886 command is injected on a TIP connection in place of an ABORT 887 command). This yields the possibility of data inconsistency.
889 Implementations concerned about this kind of attack can also use 890 the Transport Layer Security protocol [3] to restrict access to 891 only trusted partners and to encrypt TIP commands. 893 It is assumed that implementation-specific configuration information 894 will define whether a partner should be connected to using either a 895 mandatory TLS secured connection, or an unsecured connection (in 896 which case any security risk is accepted). "Optionally TLS secured" 897 is in effect unsecured (since there is no guarantee of a TLS secured 898 connection). 900 17. Significant changes from the previous version of this Internet-Draft 901 (): 903 The Session Control Protocol I-D has been incorporated as an addendum 904 to this I-D, and renamed the TIP Multiplexing Protocol. 905 Use of the URN scheme for standard transaction identifiers has been 906 added. 907 Numerous editorial changes to aid understanding and fix errors have 908 been made. 910 Appendix A. The TIP Multiplexing Protocol Version 2.0. 912 This appendix describes version 2.0 of the TIP Multiplexing Protocol 913 (TMP). TMP V2.0 is the same as the Session Control Protocol (SCP) 914 version 2.0, as described by [6]. TMP is intended solely for use 915 with the TIP protocol, and forms part of the TIP protocol 916 specification (although its implementation is optional), hence its 917 inclusion in this document. TMP V2.0 is the only multiplexing 918 protocol supported by TIP V2.0. The following text is a copy of [6] 919 with no substantive changes; it is edited only as necessary to 920 reflect the name change and for inclusion in this document. The only 921 protocol change is the removal of the PUSH flag (which is no longer 922 required given the rule that a single TIP command must be wholly 923 contained within a single TMP packet). 925 Abstract 927 TMP provides a simple mechanism for creating multiple lightweight 928 connections over a single TCP connection.
Several such lightweight 929 connections can be active simultaneously. TMP provides a byte 930 oriented service, but allows message boundaries to be marked. 932 A.1. Introduction 934 There are several protocols in widespread use on the Internet which 935 create a single TCP connection for each transaction. Unfortunately, 936 because these transactions are short lived, the cost of setting up 937 and tearing down these TCP connections becomes significant, both in 938 terms of resources used and in the delays associated with TCP's 939 congestion control mechanisms. 941 The TIP Multiplexing Protocol (TMP) is a simple protocol running on 942 top of TCP that can be used to create multiple lightweight 943 connections over a single transport connection. TMP therefore 944 provides for more efficient use of TCP connections. Data from 945 several different TMP connections can be interleaved, and both 946 message boundaries and end of stream markers can be provided. 948 Because TMP runs on top of a reliable byte ordered transport 949 service it can avoid most of the extra work TCP must go through in 950 order to ensure reliability. For example, TMP connections do not 951 need to be confirmed, so there is no need to wait for handshaking 952 to complete before data can be sent. 954 A.2. Protocol Model 956 The basic protocol model is that of multiple lightweight 957 connections operating over a reliable stream of bytes. The party 958 which initiated the connection is referred to as the primary, and 959 the party which accepted the connection is referred to as the 960 secondary. 962 Connections may be unidirectional or bi-directional; each end of a 963 bi-directional connection may be closed separately. Connections may 964 be closed normally, or reset to indicate an abortive release. 965 Aborting a connection closes both data streams. 967 Once a connection has been opened, applications can send messages 968 over it, and signal the end of application level messages. 
969 Application messages are encapsulated in TMP packets and 970 transferred over the byte stream. A single TIP command (TMP 971 application message) must be wholly contained within a single TMP 972 packet. 974 A.3. TMP Packet Format 976 A TMP packet consists of a 64 bit header followed by zero or more 977 octets of data. The header contains three fields: a flag byte, the 978 connection identifier, and the packet length. Both integers, the 979 connection identifier and the packet length, must be sent in network 980 byte order.
982   FLAGS
983   +--------+--------+--------+--------+
984   |SFPR0000|      Connection ID       |
985   +--------+--------+--------+--------+
986   |        |          Length          |
987   +--------+--------+--------+--------+
989 A.3.1. Flag Details
991   +-------+-----------+-----------------------------------------+
992   | Name  | Mask      | Description                             |
993   +-------+-----------+-----------------------------------------+
994   | SYN   | 1xxx|0000 | Open a new connection                   |
995   | FIN   | x1xx|0000 | Close an existing connection            |
996   | RESET | xxx1|0000 | Abort the connection                    |
997   +-------+-----------+-----------------------------------------+
999 A.4. Connection Identifiers 1001 Each TMP connection is identified by a 24 bit integer. TMP 1002 connections created by the party which initiated the underlying TCP 1003 connection must have even identifiers; those created by the other 1004 party must have odd identifiers. 1006 A.5. TMP Connection States 1008 TMP connections can exist in several different states: Closed, 1009 OpenWrite, OpenSynRead, OpenSynReset, ReadWrite, CloseWrite, 1010 and CloseRead. A connection can change its state in response to 1011 receiving a packet with the SYN, FIN, or RESET bits set, or in 1012 response to an API call by the application. The available API calls 1013 are open, close, and abort. 1015 The meaning of most states is obvious (e.g. OpenWrite means that a 1016 connection has been opened for writing).
The meanings of the states 1017 OpenSynRead and OpenSynReset need more explanation. 1019 In the OpenSynRead state, a primary has opened and immediately closed the 1020 output data stream of a connection, and is now waiting for a SYN 1021 response from the secondary to open the input data stream for 1022 reading. 1024 In the OpenSynReset state, a primary has opened and immediately aborted 1025 a connection, and is now waiting for a SYN response from the 1026 secondary to finally close the connection. 1028 A.6. Event Priorities and State Transitions 1030 The state table shown below describes the actions and state 1031 transitions that occur in response to a given event. The events 1032 accepted by each state are listed in priority order with highest 1033 priority first. If multiple events are present in a message, those 1034 events matching the list are processed. If multiple events match, 1035 the event with the highest priority is accepted and processed 1036 first. Any remaining events are processed in the resultant 1037 successor state. 1039 For example, if a TMP connection at the secondary is in the Closed 1040 state, and the secondary receives a packet containing a SYN event, a 1041 FIN event and an input data event (i.e. DATA-IN), the secondary first 1042 accepts the SYN event (because it is the only match in Closed 1043 state). The secondary accepts the connection, sends a SYN event and 1044 enters the ReadWrite state. The SYN event is removed from the list 1045 of pending events. The remaining events are FIN and DATA-IN. In the 1046 ReadWrite state the secondary reads the input data (i.e. the DATA-IN 1047 event is processed first because it has higher priority than the 1048 FIN event). Once the data has been read and the DATA-IN event has 1049 been removed from the list of pending events, the FIN event is 1050 processed and the secondary enters the CloseWrite state.
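The priority rule and the worked example above can be traced with a small Python sketch. It is an illustration under stated assumptions, not a reference implementation: the TRANSITIONS table is transcribed from the TMP state table in this section, the function name process is hypothetical, and raising an exception stands in for closing the TCP connection when an event arrives in a state that does not accept it.

```python
# Each state lists (event, action, exit state) in priority order,
# highest priority first, as transcribed from the TMP state table.
TRANSITIONS = {
    "Closed": [("SYN", "SYN", "ReadWrite"),
               ("OPEN", "SYN", "OpenWrite")],
    "OpenWrite": [("SYN", "Accept", "ReadWrite"),
                  ("WRITE", "DATA-OUT", "OpenWrite"),
                  ("CLOSE", "FIN", "OpenSynRead"),
                  ("ABORT", "RESET", "OpenSynReset")],
    "OpenSynRead": [("SYN", "Accept", "CloseRead")],
    "OpenSynReset": [("SYN", "Accept", "Closed")],
    "ReadWrite": [("DATA-IN", "Accept", "ReadWrite"),
                  ("FIN", "Accept", "CloseWrite"),
                  ("RESET", "Accept", "Closed"),
                  ("WRITE", "DATA-OUT", "ReadWrite"),
                  ("CLOSE", "FIN", "CloseRead"),
                  ("ABORT", "RESET", "Closed")],
    "CloseWrite": [("RESET", "Accept", "Closed"),
                   ("WRITE", "DATA-OUT", "CloseWrite"),
                   ("CLOSE", "FIN", "Closed"),
                   ("ABORT", "RESET", "Closed")],
    "CloseRead": [("DATA-IN", "Accept", "CloseRead"),
                  ("FIN", "Accept", "Closed"),
                  ("RESET", "Accept", "Closed"),
                  ("ABORT", "RESET", "Closed")],
}


def process(state, events):
    """Process all pending events, highest priority first in each state.

    Returns the final state and a trace of (entry state, event, action,
    exit state) tuples in the order the events were accepted.
    """
    pending = set(events)
    trace = []
    while pending:
        for event, action, exit_state in TRANSITIONS[state]:
            if event in pending:
                pending.remove(event)
                trace.append((state, event, action, exit_state))
                state = exit_state
                break
        else:
            # Assumption: an event with no match in the current state is
            # treated as an error, standing in for closing the connection.
            raise ValueError(f"no event in {sorted(pending)} is valid in {state}")
    return state, trace


if __name__ == "__main__":
    # The worked example: SYN, FIN and DATA-IN arrive in the Closed state.
    final, steps = process("Closed", ["SYN", "FIN", "DATA-IN"])
    for entry, event, action, exit_state in steps:
        print(f"{entry:12} {event:8} {action:8} -> {exit_state}")
    print("final state:", final)
```

Run on the example, the sketch accepts SYN in Closed, then DATA-IN and FIN in ReadWrite, and finishes in CloseWrite, matching the sequence described in the text.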
1052 If either party receives a TMP packet that it does not understand, 1053 or an event in an incorrect state, it closes the TCP connection.
1055   +==============+=========+==========+==============+
1056   | Entry State  | Event   | Action   | Exit State   |
1057   +==============+=========+==========+==============+
1058   | Closed       | SYN     | SYN      | ReadWrite    |
1059   |              | OPEN    | SYN      | OpenWrite    |
1060   +--------------+---------+----------+--------------+
1061   | OpenWrite    | SYN     | Accept   | ReadWrite    |
1062   |              | WRITE   | DATA-OUT | OpenWrite    |
1063   |              | CLOSE   | FIN      | OpenSynRead  |
1064   |              | ABORT   | RESET    | OpenSynReset |
1065   +--------------+---------+----------+--------------+
1066   | OpenSynRead  | SYN     | Accept   | CloseRead    |
1067   +--------------+---------+----------+--------------+
1068   | OpenSynReset | SYN     | Accept   | Closed       |
1069   +--------------+---------+----------+--------------+
1070   | ReadWrite    | DATA-IN | Accept   | ReadWrite    |
1071   |              | FIN     | Accept   | CloseWrite   |
1072   |              | RESET   | Accept   | Closed       |
1073   |              | WRITE   | DATA-OUT | ReadWrite    |
1074   |              | CLOSE   | FIN      | CloseRead    |
1075   |              | ABORT   | RESET    | Closed       |
1076   +--------------+---------+----------+--------------+
1077   | CloseWrite   | RESET   | Accept   | Closed       |
1078   |              | WRITE   | DATA-OUT | CloseWrite   |
1079   |              | CLOSE   | FIN      | Closed       |
1080   |              | ABORT   | RESET    | Closed       |
1081   +--------------+---------+----------+--------------+
1082   | CloseRead    | DATA-IN | Accept   | CloseRead    |
1083   |              | FIN     | Accept   | Closed       |
1084   |              | RESET   | Accept   | Closed       |
1085   |              | ABORT   | RESET    | Closed       |
1086   +--------------+---------+----------+--------------+
1088 TMP Event Priorities and State Transitions 1090 References 1092 [1] Gray, J. and A. Reuter (1993), Transaction Processing: Concepts 1093 and Techniques. San Francisco, CA: Morgan Kaufmann Publishers. 1094 (ISBN 1-55860-190-2). 1096 [2] RFC2068 Standards Track "Hypertext Transfer Protocol -- 1097 HTTP/1.1". 1098 R. Fielding et al. 1100 [3] Internet-Draft "The TLS Protocol Version 1.0". 1101 T. Dierks et al.
1103 [4] RFC1738 Standards Track "Uniform Resource Locators (URL)". 1104 T. Berners-Lee et al. 1106 [5] Internet-Draft "Transaction Internet Protocol - Requirements and 1107 Supplemental Information". 1108 K. Evans et al. 1110 [6] Internet-Draft "Session Control Protocol V 2.0". 1111 K. Evans et al. 1113 [7] RFC2141, "URN Syntax". 1114 R. Moats. 1116 Authors' Addresses
1118   Jim Lyon                         Keith Evans
1119   Microsoft Corporation            Tandem Computers, Inc.
1120   One Microsoft Way                5425 Stevens Creek Blvd
1121   Redmond, WA 98052-6399, USA      Santa Clara, CA 95051-7200, USA
1123   Phone: +1 (206) 936 0867         Phone: +1 (408) 285 5314
1124   Fax:   +1 (206) 936 7329         Fax:   +1 (408) 285 5245
1125   Email: JimLyon@Microsoft.Com     Email: Keith@Loc252.Tandem.Com
1127   Johannes Klein
1128   Tandem Computers Inc.
1129   10555 Ridgeview Court
1130   Cupertino, CA 95014-0789, USA
1132   Phone: +1 (408) 285 0453
1133   Fax:   +1 (408) 285 9818
1134   Email: Klein_Johannes@Tandem.Com
1136 Comments 1138 Please send comments on this document to the authors at 1139 , , 1140 , or to the TIP mailing list at 1141 . You can subscribe to the TIP mailing list by 1142 sending mail to with the line "subscribe tip" 1143 somewhere in the body of the message.