Internet Draft                                            M. Rajagopal
Expiration: August 22, 1996                                S. Sergeant
File: draft-ietf-st2-state-01.txt

Internet Stream Protocol Version 2 (ST2)
Protocol State Machines - Version ST2+

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas
and Working Groups. Note that other groups may also distribute
working documents as Internet-Drafts. Internet-Drafts are draft
documents valid for a maximum of six months. Internet-Drafts may be
updated, replaced, or obsoleted by other documents at any time. It is
not appropriate to use Internet-Drafts as reference material or cite
them other than as "work in progress". To learn the current status
of any Internet-Draft, please check the "1id-abstracts.txt" listing
contained in the Internet-Drafts Shadow directories on
ds.internic.net (US East Coast), nic.nordu.net (Europe), ftp.isi.edu
(US West Coast), or munnari.oz.au (Pacific Rim).

Abstract:

This memo contains a description of state machines for the revised
specification of the Internet STream Protocol Version 2 (ST2+)
described in RFC 1819. The state machines in this document are
descriptions of the ST2+ protocol states and message sequence
specifications for normal behavior. Exception processing issues are
defined and discussed for protocol compliance and implementation
options.

Editor's Note:

This memo is available both in ASCII format (file:
draft-ietf-ST-state-01.txt) and in PostScript (file:
draft-ietf-ST-state-01.ps). The PostScript version contains the
essential state diagrams and is absolutely required.
TABLE OF CONTENTS

1 Introduction

2 ST Agent Architecture
   2.1 ST Protocol Characteristics
   2.2 The Stream FSM Model
   2.3 ST Agent Roles in an Internetwork
   2.4 The ST Agent Model
   2.5 Origin, Next Hop, Previous Hop and Target Finite State
       Machines
   2.6 Stream Finite State Machines
      2.6.1 Externally Communicating FSMs
      2.6.2 Internally Communicating FSMs
   2.7 Queues between External Communicating FSMs
   2.8 Queues Inside an Agent

3 Stream Finite State Machines
   3.1 Assumptions
   3.2 State Machine Model Conventions
      3.2.1 Naming Conventions and Notations
      3.2.2 Transmissions and Receptions
      3.2.3 Predicates
   3.3 Normal Behavior versus Exception Processing
      3.3.1 Context not Represented in Stream FSMs
      3.3.2 Special Message Types
      3.3.3 Classes of Response Types
   3.4 Stream State Machines
      3.4.1 Origin State Machine (OSM)
      3.4.2 Next Hop State Machine (NHSM)
      3.4.3 Previous Hop State Machine (PHSM)
   3.5 The Target State Machine (TSM)

4 ST Agent FSMs
   4.1 Agent Database Context
   4.2 ST Dispatcher role for incoming Packet-switching,
   4.3 ST Dispatcher functions for outgoing Packet switching, timer
   4.4 Retry FSM - RFSM for datalink reliability of PDU
       transmissions
   4.5 Agent, Neighbor and Stream Supervision
      4.5.1 The MonitorFSM (MFSM) for Agent and Stream Supervision
      4.5.2 The Neighbor Detection Failure FSM for Neighbor
            Management
      4.5.3 Service Model Interactions

5 Exception Processing
   5.1 Additional Exception Processing
      5.1.1 ST Dispatcher detected inconsistencies Reason Codes:
      5.1.2 MonitorFSM issues with neighbor failure and stream
            recovery
      5.1.3 Retry and Timeout Failures Reason Codes:
      5.1.4 Routing issues Reason Codes:
      5.1.5 LRM issue Reason Codes:

6 APPENDIX
   6.1 Glossary
   6.2 ST Control Message Flow
      6.2.1 Message Type
      6.2.2 Response
      6.2.3 Possible causes for message
   6.3 Internetwork Complexities

1 Introduction

This section gives a brief overview of the ST protocol terms and the
protocol Finite State Machine (FSM) issues addressed in this document.
It is assumed that the reader is familiar with the ST2+ Protocol
Specification document listed in [1]. Unless otherwise stated, ST in
this document refers to the enhanced ST protocol (ST2+).

ST2+ is a connection-oriented internetworking protocol that operates
at the same layer as connectionless IP. An ST stream is defined as a
connection established between an Origin sending data to one or more
Targets. An ST Agent is a network node that participates in resource
reservation negotiations during stream setup along the path between
the Origin and Targets. The resource reservation request is based on
a Flow Specification sent by the Origin. The FlowSpec provides the
basis for the negotiated Quality of Service (QOS).

This QOS is only established, monitored and maintained by nodes with
ST Agent capabilities. Each hop in the ST stream routing tree is an
ST Agent. ST Agents that are one hop away from a given node are
called Previous-Hops in the upstream direction and Next-Hops in the
downstream direction. Previous-Hop and Next-Hop Agents are called ST
neighbors.

Data transfer in the ST stream is simplex in the downstream
direction. As such, a single Origin sending data to many Targets is
similar to a media broadcast model. However, each ST Agent may
simultaneously need to perform Origin, Previous-Hop, Next-Hop and
Target functions for a number of different streams. These streams may
be part of a conference (as in the telephone model) or Group of
related streams, such that resource reservation and routing issues
may be interrelated.
The streams may also be unrelated to each other,
but ranked by Precedence within an internetwork in the event that
limited or changing resources need to be reallocated. Origin
applications may request an automatic Recovery option in the event of
network failure or a Change to the QOS after the original setup.
Target applications may send a request to a stream ST Agent to allow
that Target to Join the stream, with or without Notifying the Origin.

Thus, an ST Agent may be required to support a complex web of
intersecting streams with competing QOS requirements and changing
resource allocations or members. The ST Service Model supports the ST
protocol and ST QOS features for routing, resource management and
packet-switching. This Protocol State Machine document addresses the
ST protocol in any ST Agent, regardless of the implementation
specifics of the ST Service Model.

Stream Control Message Protocol (SCMP) messages form a
request-response protocol where the particulars of the Flow
Specification, as well as other Protocol Data Unit (PDU) parameters,
are interpreted by the chosen QOS algorithms for routing, Local
Resource Management (LRM) and packet-switching. The ST2+
Specification explicitly defines all required and allowable functions
and sequences of SCMP message operations. The SCMP message types are:
ACCEPT, ACK, CHANGE, CONNECT, DISCONNECT, ERROR, HELLO, JOIN,
JOIN-REJECT, NOTIFY, REFUSE, STATUS, STATUS-RESPONSE.

An ST Agent will direct incoming SCMP messages to the appropriate
FSMs for each stream. An Origin State Machine (OSM) is associated
with every Origin ST application. A Next-Hop State Machine (NHSM) is
associated with every downstream ST neighbor. A Previous-Hop State
Machine (PHSM) is associated with every upstream ST neighbor. A
Target State Machine (TSM) is associated with every Target ST
application.
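
The association between streams, neighbors and state machines
described above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the names ScmpType,
FsmKind, FsmRegistry and lookup are hypothetical and do not come from
the ST2+ specification, and the FSM itself is reduced to a plain
record.

```python
from enum import Enum


class ScmpType(Enum):
    """The thirteen SCMP message types listed in the text."""
    ACCEPT = "ACCEPT"
    ACK = "ACK"
    CHANGE = "CHANGE"
    CONNECT = "CONNECT"
    DISCONNECT = "DISCONNECT"
    ERROR = "ERROR"
    HELLO = "HELLO"
    JOIN = "JOIN"
    JOIN_REJECT = "JOIN-REJECT"
    NOTIFY = "NOTIFY"
    REFUSE = "REFUSE"
    STATUS = "STATUS"
    STATUS_RESPONSE = "STATUS-RESPONSE"


class FsmKind(Enum):
    """The four stream state machine kinds."""
    OSM = "Origin"
    NHSM = "Next-Hop"
    PHSM = "Previous-Hop"
    TSM = "Target"


class FsmRegistry:
    """Hypothetical per-Agent table: one FSM per (stream, kind, peer).

    'peer' is the Origin or Target application for an OSM or TSM, and
    the downstream or upstream ST neighbor for an NHSM or PHSM.
    """

    def __init__(self):
        self._fsms = {}

    def lookup(self, stream_id, kind, peer):
        """Return the FSM record for this key, creating it on first use."""
        key = (stream_id, kind, peer)
        if key not in self._fsms:
            self._fsms[key] = {"stream": stream_id, "kind": kind, "peer": peer}
        return self._fsms[key]

    def __len__(self):
        return len(self._fsms)
```

An Agent with two downstream neighbors on the same stream would hold
two NHSM records for that stream, one per neighbor.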

The OSM, NHSM, PHSM and TSM have the same four fundamental states:
IDLE, ESTABLISHED, ADD and CHANGE. The basic transition from IDLE to
ESTABLISHED is through a CONNECT request with an ACCEPT response.
Additional CONNECT and JOIN requests may result in an ADD stream
state, while a CHANGE request would result in a CHANGE stream state.
DISCONNECT and REFUSE messages may remove one or more Targets while
the stream is in any state.

A Retry FSM is used to monitor the datalink reliability between ST
Agents. Each SCMP request-response sequence is defined with next hop
ACKnowledgement, ERROR, timeout and retry conditions. Such exception
processing is designed to resolve incomplete functions during times
of network or ST Agent failure.

A Monitor FSM is used to manage ST neighbor and stream Recovery
issues for all streams managed by the ST Agent. Each ST Agent
maintains state information describing the streams flowing through
it, and can actively gather and distribute such information.

If, for example, an Intermediate ST Agent fails, the neighboring
Agents can recognize this via HELLO messages that are periodically
exchanged between the ST Agents that share streams. STATUS packets
can be used to ask other ST Agents about a particular stream. These
agents then send back a STATUS-RESPONSE message. NOTIFY messages
serve to inform ST Agents of additional management information.

ST Reason Codes are used to inform other ST Agents of the source and
type of problem such that the correct response sequences will be
followed. These Reason Codes are inserted in the appropriate SCMP
PDUs and available to the ST Agent management functions. Thus an ST
Agent not only manages the normal request-response protocol between
the Origin and Targets of each stream, but also is actively involved
in the detection and distribution of error and QOS implications.
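
As a rough illustration of the four fundamental stream states named
earlier in this section, the following Python sketch models only the
transitions stated in the text. Two points are assumptions made for
the sketch and are not taken from this document: that an ACCEPT
returns the stream from ADD or CHANGE to ESTABLISHED, and that
removing the last Target returns the stream to IDLE. The class and
method names are hypothetical.

```python
from enum import Enum


class StreamState(Enum):
    IDLE = "IDLE"
    ESTABLISHED = "ESTABLISHED"
    ADD = "ADD"
    CHANGE = "CHANGE"


class StreamFsm:
    """Hypothetical simplified stream FSM; not one of the actual tables."""

    def __init__(self):
        self.state = StreamState.IDLE
        self.targets = set()    # Targets that have ACCEPTed
        self.pending = set()    # Targets with an outstanding CONNECT/JOIN

    def request(self, msg, targets=()):
        """Handle a CONNECT, JOIN or CHANGE request."""
        if msg in ("CONNECT", "JOIN"):
            self.pending |= set(targets)
            # Additional CONNECT/JOIN on an established stream: ADD state.
            if self.state is StreamState.ESTABLISHED:
                self.state = StreamState.ADD
        elif msg == "CHANGE" and self.state is StreamState.ESTABLISHED:
            self.state = StreamState.CHANGE
        else:
            raise ValueError(f"{msg} not modeled in state {self.state.name}")

    def accept(self, target=None):
        """Handle an ACCEPT response.

        Assumption: ACCEPT moves IDLE, ADD or CHANGE to ESTABLISHED.
        """
        if target is not None:
            self.pending.discard(target)
            self.targets.add(target)
        if self.targets:
            self.state = StreamState.ESTABLISHED

    def remove(self, msg, targets):
        """Handle DISCONNECT or REFUSE, which may arrive in any state."""
        if msg not in ("DISCONNECT", "REFUSE"):
            raise ValueError(f"{msg} does not remove Targets")
        self.targets -= set(targets)
        self.pending -= set(targets)
        # Assumption: a stream with no Targets left returns to IDLE.
        if not self.targets and not self.pending:
            self.state = StreamState.IDLE
```

The state tables in Section 3 remain the reference; this sketch omits
outputs, ACKs, timeouts and all exception processing.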

The ST Agent architecture and FSM models contained in this document
have been chosen to illustrate a method for an ST2+ protocol
implementation. There are many alternative techniques. Every effort
has been made to note the relevant tradeoffs between protocol
requirements and implementation choices. The atomic components of
this model may be rearranged to accommodate platform and
implementation issues. Every effort has been made to ensure that the
described state tables accurately reflect state transitions of the
ST2+ version of SCMP, and that the described state diagrams
accurately reflect the state transition tables. If there are
discrepancies, the tables take precedence over the diagrams and the
protocol specification takes precedence over the tables.

Section 2, ST Agent Architecture, describes the organization of the
ST Agent model. Section 3, Stream Finite State Machines, describes
the OSM, NHSM, PHSM and TSM. Section 4, Agent Finite State Machines,
describes the Monitor and Retry FSMs. Section 5, Exception Processing
Issues, details the Reason Codes by category.

2 ST Agent Architecture

This section describes the ST Agent Finite State Machines (FSMs). The
architectural descriptions are necessarily at a high level and are
meant to serve as a guide to the protocol implementer. The state
machine models are expected to provide the implementer with useful
information such as valid message sequences. The ST2+ Specification
provides the fully documented message detail.

2.1 ST Protocol Characteristics

The ST Agent FSM architecture is organized in a hierarchy of ST2
protocol characteristics. The characteristics are modeled as Agent
roles (i.e., Origin, Intermediate, Target, Previous Hop and Next
Hop), as well as protocol functions (e.g., individual SCMP
Request/Response patterns and reliability at both the PDU and Agent
level).

Figure 1. ST Agent Roles

Each ST Agent has an ST Dispatcher to filter incoming PDUs, intercept
semantic errors and direct valid PDUs to the appropriate queues and
FSMs. FIFO queues and a Message Separator/Combiner in each ST Agent
provide additional intra-Agent FSM communications.

Table 1 below shows the Request/Response patterns of each SCMP
message name by the Agent type generating the message. The message
type may be either a Request, a Response or a local control for the
Agent and PDU reliability functions. The message direction can be
downstream (towards a Target) or upstream (towards the Origin). The
Monitor FSM and the Retry FSM manage network, Agent and link
reliability status with the local control SCMP messages.

Table 1: Request/Response Patterns

2.2 The Stream FSM Model

Figure 2 below shows the relationship between the ST Agents and FSMs
in each stream.

Figure 2. Stream FSM Model

The Origin State Machine (OSM) provides communications between an
Origin application and one or more Next Hop State Machines (NHSM).

An Intermediate Agent's Previous Hop State Machine (PHSM) has
communications with an Origin Agent's Next Hop State Machine (NHSM)
through their common network link and the respective Agents' ST
Dispatchers. In the same fashion, an Intermediate Agent's Next Hop
State Machine (NHSM) has communications with the Previous Hop State
Machines (PHSM) of Intermediate and/or Target Agents.

The Target State Machine (TSM) provides communications between a
Target application and a Previous Hop State Machine (PHSM) in the
Target Agent.

Figure 3. Internetwork Diagram of ST Agent Roles

2.3 ST Agent Roles in an Internetwork

The internetwork diagram of ST Agents (Figure 3) indicates the Origin
(O), Intermediate (I) and Target (T) roles of each ST Agent in a
conference with 4 Origins.
The Intermediate Agents I1, I2, I3 and I4
are each neighbors of an ST Agent acting as both an Origin and a
Target for the other Origins. Each Origin is sending data in one
outgoing stream to three Targets, and simultaneously receiving data
on 3 incoming streams from the other Origins.

All ST Agents in this illustration have multiple ST neighbors,
streams and interfaces to manage. Each ST Agent may be required to
manage multiple FSMs for any one stream, as well as all competitive,
intersecting streams in the internetwork topology.

ST neighbor communications and SCMP exception processing can be used
to create a first line of defense for the ST stream FSMs. Normal
operations for stream FSMs may be protected and simplified. Network
errors and conflicting message sequences can be filtered out of the
fundamental stream state transitions.

2.4 The ST Agent Model

In Figure 4, an ST Agent is depicted with an ST Dispatcher sending
and receiving ST PDUs from interface queues (and PDU surrogates from
application interfaces), representing a high order ST message
management scheme. This dispatcher unpacks or forwards incoming PDUs,
and creates and forwards outgoing PDUs.

Figure 4. ST Agent Model

The forwarding of data or the forwarding of certain command sequences
that are not following a negotiated QOS path (i.e., JOIN and/or JOIN
flooding messages) requires a packet-forwarding mechanism, separate
from the stream operations that unpack, interpret or create PDUs. The
efficient packet switching of ST PDUs through Intermediate hops is
the main reason for this filtering priority.

A first level of filtering is designed to determine whether the
incoming PDU (or PDU surrogate) is data or one of the JOIN sequences
where the destination is, in fact, another ST Agent or an
application.
Such PDUs are then forwarded to the destination,
whether a resident Target or for replication to multiple Next-hop
Targets.

The ST PDU validation and delivery functions manage information about
the messaging success or failure, i.e., the Retry, timeout, ERROR,
or ACK status of messages. This information concerns datalink,
Agent and network reliability. Stream state transitions may occur as
a result.

Many Retry and Monitor FSM transitions may occur while any particular
stream state exists. The Monitor FSM interprets and sends messages
with information relevant to multiple streams, i.e., HELLO, STATUS,
STATUS-RESPONSE, NOTIFY.

Second-level filtering occurs when an ST Agent validates incoming
SCMP PDUs and sends the required ACKnowledge (or ERROR PDU, if there
are semantic errors) to the originator of the incoming PDU.
Conversely, incoming ACKnowledgement and ERROR PDUs trigger the Retry
FSM, where the timeout and retry values are updated or a signal is
generated to the appropriate stream FSM for specialized exception
processing.

All SCMP PDUs, except ACK, ERROR, HELLO, STATUS and STATUS-RESPONSE,
require an ACK from the next hop Agent upon receipt. An Agent or
datalink failure may be detected by either the Retry or the Monitor
FSM. A signal is then sent to the appropriate stream FSMs. The
Monitor FSM then manages a reCONNECT sequence for all streams that
have specified this Recovery option.

Once an ST Dispatcher has validated and filtered the PDUs, the stream
SCMP messages are separately queued into Requests (CONNECT, CHANGE,
JOIN) and Responses (ACCEPT, DISCONNECT, JOIN-REJECT, REFUSE).
Requests must wait for the completion of any preceding Requests for
the same stream, while Responses must be handled immediately without
regard to Request state transitions or queues.
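
The ACK rule and the Request/Response queueing rule above can be
sketched as follows. This is a minimal illustration, assuming one
in-progress Request per stream; the names needs_ack, StreamQueues,
deliver and complete are hypothetical and are not part of the ST2+
specification or of the ST Dispatcher described in Section 4.

```python
from collections import defaultdict, deque

# Message classes as listed in the text.
REQUESTS = frozenset({"CONNECT", "CHANGE", "JOIN"})
RESPONSES = frozenset({"ACCEPT", "DISCONNECT", "JOIN-REJECT", "REFUSE"})
NO_ACK = frozenset({"ACK", "ERROR", "HELLO", "STATUS", "STATUS-RESPONSE"})


def needs_ack(msg_type):
    """All SCMP PDUs except the five in NO_ACK require an ACK."""
    return msg_type not in NO_ACK


class StreamQueues:
    """Hypothetical per-stream Request serialization."""

    def __init__(self):
        self._waiting = defaultdict(deque)  # stream_id -> queued Requests
        self._active = {}                   # stream_id -> Request in progress

    def deliver(self, stream_id, msg_type):
        """Return the message if it may be processed now, else None."""
        if msg_type in RESPONSES:
            # Responses are handled immediately, regardless of Requests.
            return msg_type
        if msg_type in REQUESTS:
            if stream_id in self._active:
                # A preceding Request on this stream has not completed.
                self._waiting[stream_id].append(msg_type)
                return None
            self._active[stream_id] = msg_type
            return msg_type
        # Other message types are not stream Requests or Responses.
        raise ValueError(f"{msg_type} is not queued for stream FSMs")

    def complete(self, stream_id):
        """Finish the active Request; return the next queued one, if any."""
        self._active.pop(stream_id, None)
        if self._waiting[stream_id]:
            nxt = self._waiting[stream_id].popleft()
            self._active[stream_id] = nxt
            return nxt
        return None
```

Keeping Responses out of the Request queue reflects the rule that a
REFUSE or DISCONNECT must not be delayed behind a pending CONNECT or
CHANGE on the same stream.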

The Request and Response queues are directed to the Origin (OSM),
Next Hop (NHSM), Previous Hop (PHSM) and Target (TSM) state machines.
These state machines are referred to as the stream state machines,
rather than the Agent state machines.

The stream state machines are designed to focus on normal (typical)
behavior rather than all pathological cases. Error control and
recovery in the architecture (i.e., ST Dispatcher filters, Monitor
and Retry FSMs) provide a firewall against many problems in the
stream FSMs.

However, when the architecture of a particular Agent platform has ST
intra-Agent communications that are actually between multiple
processors, the Next-hop and Previous-hop FSM communications may
require the concept of multiple ST Agents within what might otherwise
appear to be one ST Agent. Thus the rationale for the filtering and
queueing of the SCMP messages, not just the particular method of
illustration, is very important to the ST Agent Model.

The convention of discussing issues in terms of individual stream
state transitions will be used throughout Section 3 (Stream FSMs) as
a way of simplifying the discussion. Section 4 (ST Agent FSMs) and
Section 5 (Exception Processing) will provide more detail about the
architecture, filtering and queues used with the stream FSMs, such
that network failures can be managed with competing and intersecting
streams.

2.5 Origin, Next Hop, Previous Hop and Target Finite State
    Machines

Communicating Finite State Machine (CFSM) models have been
extensively used in the trade to formally describe protocol behavior
[2]. Many variations of the basic CFSM model exist and our model is
also a variation of the basic model. Our model uses the basic CFSM
model with FIFO queues combined with predicates.
The model describes
the ST protocol behavior and consists of ST SCMP messages along with
a number of predicates. These predicates are not part of the formal
ST Protocol specifications but are useful mechanisms that simplify
the state machine specifications.

Origin, Intermediate and Target ST Agents in Figure 2 are all modeled
separately. Because a stream diverges in a tree-like graph, every
Intermediate ST Agent has to communicate with one upstream ST Agent
and one or more downstream Agents. An Intermediate Agent will
therefore have exactly one PHSM and one or more NHSMs for each
stream. Note that it is possible to have more than one NHSM per
physical interface, when that interface has more than one Agent on
the associated communications link.

The state machine model architecture at an Origin is similar to the
state machine architecture at an Intermediate (Fig. 2). The Origin
may have one or more NHSMs. There is no PHSM in this case. However,
in the place of the PHSM there is an Origin State Machine (OSM) which
interfaces with the application. An OSM is a special case of the
PHSM.

The Target is modeled with one PHSM (Fig. 2). There are no NHSMs in
this example. However, in the place of an NHSM there are one or more
Target State Machines (TSMs) that interface with the application. The
TSM is a special case of the NHSM.

Because the role of each ST Agent (Origin, Intermediate, or Target)
is different, the finite state machine models are not identical.
However, the model for communication between FSMs inside or outside
an Agent is uniform.

Consider the stream topology shown in Figure 2. The figure shows an
ST Origin (O) connected to 2 Intermediate Agents (I1 and I3). I1 is
also connected to I2 and a Target T2. I2 is connected to Target T1
and I3 is connected to Targets T3 and T4.

The Origin is modeled with one OSM and 2 NHSMs (one per next hop).
Each Target is modeled with one PHSM and one or more TSMs. I1 and I3
are both modeled with one PHSM and 2 NHSMs; I2, whose only next hop
is T1, is modeled with one PHSM and one NHSM.

2.6 Stream Finite State Machines

2.6.1 Externally Communicating FSMs

Communication between two ST Agents is External Communication and
always happens between an NHSM and a PHSM pair (see Figure 2). Note
that it is possible for a Target to be directly connected to an
Origin, in which case the Origin and Target Agents are direct
neighbors. It is also possible that one Target Agent is an
Intermediate Agent for another Target, in which case an Agent will
have a PHSM communicating with a TSM and one or more NHSMs.

2.6.2 Internally Communicating FSMs

The communicating entities inside an ST Agent differ for each Agent
type, i.e., Origin, Intermediate or Target. However, all FSMs inside
an Agent communicate via a Message Separator/Combiner box (MS/C). The
function of the MS/C box is described later in this section.

Internal Communication within the Origin occurs:

o  between the OSM and an Upper Layer module and

o  between the OSM and one or more NHSMs via a MS/C box (Note
   that the NHSMs themselves do not communicate with each other)

Internal Communication within a Target occurs:

o  between one or more TSMs and an Upper Layer module and

o  between the TSM and a PHSM via a MS/C box

Internal Communication within an Intermediate Agent occurs:

o  between a PHSM and one or more NHSMs via a MS/C box (Note
   that the NHSMs themselves do not communicate with each other)

2.7 Queues between External Communicating FSMs

For the purposes of modeling, assume that messages are filtered and
queued in FIFO queues for the case of external Communicating FSM
pairs, i.e., between any two ST Agents.
However, as indicated in the
previous discussion and diagrams of the ST Dispatcher and filtering
hierarchy, it is somewhat more complex in reality. The concept shown
below in Figure 5 allows the discussion of the inter-Agent and
intra-Agent state machines to focus on the stream FSM issues without
regard to message and neighbor management issues.

Figure 5. Implicit Queues between an External Communicating FSM pair

2.8 Queues Inside an Agent

Each Agent is modeled with at least 2 state machines for each stream.
These state machines also need to communicate just like the external
communicating FSM pairs described above. The queue model in this case
also requires filtering mechanisms. This model requires a message
Separating and Combining function shown as the Message
Separator/Combiner (MS/C) box in Fig. 6.

Figure 6 pictorially describes the multi-stage FIFO queue model for
an Agent. Implicit FIFO queues are assumed between the PHSM and the
MS/C, and also between the MS/C and one or more NHSMs. Use of such
FIFO queues eliminates the need for a separate synchronizing state
machine that would normally be required to synchronize the flows.

Figure 6. Queues between Internal Communicating FSMs inside an Agent

The Message Separator/Combiner box has several functions:

o  Performing a multicasting function by replicating an OSM or
   PHSM message and sending the copies to different NHSMs or TSMs

o  Combining messages coming from different TSMs or NHSMs and
   sending them to the appropriate OSM or PHSM

Designing the Agent to contain separate upstream and downstream state
machines (PHSM and NHSMs respectively) with FIFO queues, as shown in
Fig. 6, offers several benefits:

o  It simplifies the Agent design considerably by separating the
   neighbor upstream and downstream communications

o  Use of FIFO queues simplifies the Agent management since no
   other synchronization mechanisms need to be used to streamline
   messages flowing through the Agent.

3 Stream Finite State Machines

Each ST Agent must maintain state for each stream supported by that
Agent. There are many ways to represent the state that must be
maintained by Agents. This section presents the OSM, NHSM, PHSM and
TSM as a reference set of state machines.

Implementations may support machines based on this section or may
even support a completely different set of state machines. These
stream FSMs represent normal operations for the stream
request-response scenarios without regard to the functions performed
by the Retry and Monitor FSMs. The model assumes that a data engine
separate from the control engine exists.

This section represents stream state through four state machines. The
defined machines are:

o  The Origin State Machine, or OSM. It represents the state of
   a stream at the Origin Agent.

o  The Next-Hop State Machine, or NHSM. It represents the state
   of the stream for Targets reached via a particular next-hop.

o  The Previous-Hop State Machine, or PHSM. It represents the
   state of a stream at an Intermediate Agent or a Target Agent.
   The OSM is essentially a special case of the PHSM, where the
   delivery of SCMP to the Origin is via an API.

o  The Target State Machine, or TSM. It represents the state of
   a stream for a particular target application at the Target Agent.
   This state machine is essentially a special case of the NHSM,
   where there is only ever a single Target per TSM and delivery of
   SCMP to the Target is via an API.

A number of NHSMs related to the same stream could conceivably all
be running in parallel, one for each next hop. In some cases, where
there is a network-layer multipoint link (e.g., Ethernet), it is even
possible to have more than one NHSM associated with the same physical
interface.

A Message Separator/Combiner (MS/C) box separates all downstream
messages, modifying the TargetList and placing them in the respective
NHSM FIFO queues. The MS/C box also functions as a combiner of
messages flowing upstream. In this role it multiplexes all local
messages and places them in the PHSM FIFO queues. Note that the MS/C
relies on separate routing and LRM functions to determine the
appropriate separation since route and resource computation is not
part of the ST protocol. Full-duplex FIFO queues are assumed between
the MS/C box and PHSM, and also between the MS/C box and the NHSMs.

The multi-machine Agent model, with the aid of the FIFO queue buffers
and an MS/C box, breaks up the complexity that would result from a
single large model. The FIFO queues eliminate the need for a separate
synchronizing state machine while reducing the complexity. The MS/C
reduces the explicit next-hop identification modeling that would
otherwise be required.

The Intermediate Agent PHSM always communicates with a NHSM on the
upstream side and the NHSM always communicates with a PHSM on the
downstream side.
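
The separating and combining roles of the MS/C box can be sketched as
below. This is only an illustration: the class name MsC, its methods,
and the next_hop_for callback (standing in for the separate routing
and LRM functions) are hypothetical, and the queues are simplified to
one direction each rather than full-duplex.

```python
from collections import defaultdict, deque


class MsC:
    """Hypothetical Message Separator/Combiner for one stream."""

    def __init__(self, next_hop_for):
        # next_hop_for(target) -> next-hop identifier. Route computation
        # is outside the ST protocol, so it is supplied by the caller.
        self._next_hop_for = next_hop_for
        self.nhsm_queues = defaultdict(deque)  # next hop -> FIFO of messages
        self.phsm_queue = deque()              # single upstream FIFO

    def downstream(self, msg_type, target_list):
        """Separate one message into per-next-hop copies.

        Each copy carries only the Targets reached via that next hop.
        """
        per_hop = defaultdict(list)
        for target in target_list:
            per_hop[self._next_hop_for(target)].append(target)
        for hop, targets in per_hop.items():
            self.nhsm_queues[hop].append((msg_type, targets))

    def upstream(self, hop, msg_type):
        """Combine messages from all NHSMs into the PHSM FIFO queue."""
        self.phsm_queue.append((hop, msg_type))
```

For example, at an Agent whose next hops are I2 (towards T1) and T2
itself, a CONNECT for both Targets is split into two copies, and the
replies are multiplexed upstream in arrival order:

```python
routes = {"T1": "I2", "T2": "T2"}   # assumed routing result
msc = MsC(routes.__getitem__)
msc.downstream("CONNECT", ["T1", "T2"])
msc.upstream("I2", "ACCEPT")
msc.upstream("T2", "REFUSE")
```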

3.1 Assumptions

Some basic assumptions were made as part of the development of the
enclosed state machines. These include:

   o  All state machines exist as part of an ST Agent, and the Agent
      instantiates state machines as needed to represent state on a
      per-stream basis.

   o  The ST Agent implements logic that unpacks incoming SCMP
      packets, validates the contents, updates the Agent databases
      and routes the message signal to the appropriate stream and its
      associated state machine.

   o  Detection and handling of messages that are broken,
      duplicates, or not valid for a particular stream state does not
      affect stream state and is not represented in the state
      machines. The mechanisms to prevent such misleading signals to
      individual state machines are described in the Architecture,
      Agent FSM and Exception Processing Sections.

   o  All reliable delivery of intra- and inter-Agent SCMP messages
      is handled by the ST Agent independent of the described state
      machines, except in the case where stream state is dependent on
      the outcome of the message delivery.

   o  All communication within the same Agent should follow the
      same Request-Response paradigm as inter-Agent messages in order
      to be as reliable as SCMP communications. This assumes that all
      API communications and intra-Agent communications recreate the
      reliability available with the ACK, timeout and retry
      paradigms. It is an implementation-specific choice.

   o  The described state tables accurately reflect state
      transitions of the ST2+ version of SCMP. The described state
      diagrams accurately reflect the state transition tables for all
      states, input trigger events and state transitions. Output
      events are not shown in the diagrams, but are detailed in the
      tables. If there are discrepancies, the tables take precedence
      over the diagrams and the protocol specification takes
      precedence over the tables.

   o  API notations for the Origin and Target ST applications are
      shown to illustrate the OSM and TSM interactions. The actual
      definition of an ST application API is outside the scope of
      this document.

3.2 State Machine Model Conventions

3.2.1 Naming Conventions and Notations

All state names are in bold and start with a capital letter followed
by lower-case letters. All message names are in capitals, usually
prefixed with a + or - sign. All messages with special response
conditions have suffixes indicating the condition, i.e., _last, _all,
_change. Predicates are bold, lower-case strings.

Tables show states, events, output, and transitions. Diagrams show
states, events and transitions. Initial states are indicated by an
asterisk "*".

Messages that trigger events are preceded by a plus sign "+".

Outputs are preceded by a minus sign "-".

Transitions are represented by arrows in the diagrams and by ">>" in
the tables.

3.2.2 Transmissions and Receptions

In all the state machine models, the standard convention of prefixing
message transition labels with a + or - symbol is used to explicitly
indicate a reception and a transmission respectively. The prefixes
are not part of the message syntax. In addition, the tables show both
transmitted and received messages, but the diagrams show only
received messages. This simplifies the diagrams, but the tables must
be referenced for the message outputs.

3.2.3 Predicates

State transitions are sometimes dictated by conditions outside the
scope of the protocol specification. Predicates are mechanisms that
allow such transitions to occur. For example, terminating a protocol
session (a result of many conditions) should allow the Agent to
transition to either the initial state or some idle state.
This decision is, of course, Application-initiated, but a means
should nevertheless exist to allow transitioning to the correct
state. In the ST protocol there is no message which accomplishes
this.

Predicates allow a state machine to express conditions and control
not explicitly possible with the protocol messages. Generally
speaking, they add clarity to the state diagram while reducing the
complexity in terms of states. The addition of control predicates
allows user-defined changes of state. Predicates are meant to give
hints to the protocol implementer and are not part of the ST
protocol. A Glossary in the Appendix can be used to check the
explicit meaning of each message or predicate.

API predicates are used to illustrate the OSM and TSM interactions
with ST applications. A predicate with an api_ prefix shows an API
message coming into the FSM. A predicate with an _api suffix shows a
message being sent to the API from an FSM.

For example, an api_open and an api_close predicate are defined for
the OSM as a means to transfer control to and from the Init state.
The Origin application may open or maintain a stream in the Establd
state without any Targets being active or in the TargetList.

NHSM, PHSM and TSM state machines have corresponding nh_open, ph_open
and tsm_open predicate definitions to allow the Agent to bring the
state machine into the Establd state when the Agent is ready to
process the initial CONNECT. Unlike the OSM, these state machines
return to the Init state when all Targets have been deleted, so no
predicate is required to close the NHSM, PHSM or TSM.

Some triggers and events are combinations of implicit and explicit
message conditions. This is particularly true for the RetryTimeout
mechanisms, as well as the requirement that responses from all
Targets in a Request be complete before the Request state can
complete.
See Section 3.3 below.

No attempt has been made to illustrate the API interactions with
Routing and LRM functions. The results of these interactions affect
both how the TargetList is partitioned and what Reason Code has been
included in a DISCONNECT or REFUSE to indicate the source of a
Request failure. ST Agent management of such failures is discussed in
Section 4, ST Agent State Machines, and Section 5, Exception
Processing.

3.3 Normal Behavior versus Exception Processing

The stream FSMs describe the protocol under normal conditions. In
general, the architecture is designed to protect these stream FSMs
from error conditions handled in the Monitor and Retry FSMs. The SCMP
messages STATUS, STATUS-RESPONSE, NOTIFY and ERROR, as well as
detailed error handling, are discussed in both Section 4 and
Section 5. Otherwise, if a core message transition is not specified
from a state, it implicitly means that this message is not allowed
from that state.

3.3.1 Context not Represented in Stream FSMs

The OSM, NHSM, PHSM and TSM diagrams and tables in this section
cannot represent the complete context of the stream. There are
context issues that are handled by the ST Agent Dispatcher, the Retry
and Monitor FSMs and the ST Agent database implementation. These
stream FSMs define the atomic elements of stream setup, maintenance
and teardown.

The G-bit (all Targets), the S-bit (stream Recovery), the I-bit and
E-bit (CHANGE stream teardown risk) involve combinations of FSM and
stream database interactions. The implementor must consider the best
way to manage these conditions with the other elements of the ST
Service Model.

Stream Recovery by the Monitor FSM is modeled such that the
reconnection heuristics are outside of the basic CONNECT
functionality in the stream FSMs.
The Monitor FSM initiates stream teardown, and then initiates
reCONNECT sequences. The individual stream FSMs are not directly
concerned with the Recovery option.

MTU size limitations may cause multiple SCMP PDUs for the same
transaction or an SCMP propagation failure. This type of problem is
managed by the Dispatcher and Retry FSM filtering.

Another issue not specifically addressed in this section is the
partitioning and management of the TargetList according to the NHSM
and ST Agent neighbor associated with each Target or set of Targets.

3.3.2 Special Message Types

In addition to the described predicates for API transactions and
state transitions, there are signals from the Retry FSM for ACK
failures, a signal for the timer expiration for the End-to-End
Response to a Request and special conditions for a REFUSE Response to
a CHANGE.

The Retry FSM issues a RetryTimeout signal when no ACK has been
received for a Request after the configured number of retries has
been attempted. This signal is an implicit REFUSE to the appropriate
PHSM or NHSM.

In the following FSM explanations, a RetryTimeout is signaled only to
the NHSM and the PHSM. The NHSM and PHSM provide the inter-Agent
communications for the stream FSMs. A RetryTimeout is generated by
the Retry FSM and forwarded to the appropriate PHSM or NHSM, and that
FSM then generates the appropriate DISCONNECT and REFUSE messages as
intra-Agent communications. For example, an OSM would receive a
REFUSE with Reason Code 41 RetransTimeout as the result of an NHSM
receiving a RetryTimeout.

Once an Origin (or Agent acting as an Origin) receives an ACK to a
Request in the Retry FSM, the End-to-End Response timer is set for
the maximum time to wait for Responses to this Request.
If this End-to-End timer expires before a Response has been received,
the E2ETimeout becomes an implicit REFUSE for all Targets that have
not yet Responded. The Retry FSM communicates this failure to the OSM
(or PHSM, in the case of an Agent acting as Origin) as an E2ETimeout.
The OSM issues the appropriate messages to the API and NHSM.

If a CHANGE request is made with the I-bit set, the LRM may risk
losing the existing resources in order to allocate the requested
resources. If the I-bit is not set, the application does not want to
risk losing the current resources for the sake of a CHANGE. Thus when
a REFUSE to a CHANGE is received and the E-bit is zero, it means the
REFUSE will result in stream teardown. This is the normal result of a
REFUSE. However, if the E-bit is set, it is a REFUSE_CHANGE,
indicating only that the CHANGE could not be completed, but that the
stream still has the original QOS resources.

3.3.3 Classes of Response Types

The ST2+ protocol requires that all Responses be received from all
Targets in a TargetList before the Request state transition may be
completed and any other Request may be processed. The protocol,
however, allows immediate processing of all DISCONNECT and REFUSE
messages whether or not they are initiated by the current Request.
These requirements result in the need to differentiate three classes
of Responses.

The first class is a Response that does not have any significance for
state change: such a Response is neither the last one required to
complete the TargetList of the current Request nor a deletion of the
last of all Targets associated with that stream's FSM. Completion of
the Responses for the current TargetList is the second class. Removal
of the last of all Targets for that stream's FSM is the third class.
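
The three classes above can be illustrated with a small
classification function. This is a sketch under stated assumptions,
not a definitive rule set: the function name, its arguments and the
use of sets to track Targets are hypothetical, and giving class 3
precedence when a single message satisfies both the class 2 and class
3 conditions is a choice made for this sketch that the text above
does not specify.

```python
# Message types that delete a Target. REFUSE_CHANGE is deliberately
# excluded because it leaves the stream in place.
REMOVING = {"DISCONNECT", "REFUSE"}


def classify_response(msg_type, target, pending, active):
    """Return the Response class (1, 2 or 3) for one Target.

    pending: Targets of the current Request that have not yet answered.
    active:  all Targets currently associated with this stream's FSM.
    Both sets are assumed to be updated by the caller afterwards.
    """
    if msg_type in REMOVING and active == {target}:
        return 3  # removes the last active Target of this FSM
    if pending == {target}:
        return 2  # completes the TargetList of the current Request
    return 1      # no significance for state change
```

For example, a DISCONNECT for a Target that is not in the current
TargetList while other Targets remain active falls into class 1.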

All Responses in the second and third class are defined by predicates
that identify the message type with a suffix for either last
(class 2) or all (class 3). All API references are illustrative and
are not intended to fully define the application interface.

Class 1 Responses:

api_accept, api_disconnect, api_refuse, ACCEPT, DISCONNECT, REFUSE,
REFUSE_CHANGE and RetryTimeout indicate that an individual TargetList
member has signaled a Response. The api_disconnect, api_refuse,
DISCONNECT and REFUSE messages may also be a Request to delete a
Target (whether or not it is in the current TargetList of an Add or
Change state transaction). Individual and/or Global Target deletion
may occur at any time, but any Global (G-bit set) Target Response or
deletion Request falls into one of the latter two classes.

Class 2 Responses:

api_accept_last, api_disconnect_last, api_refuse_last, ACCEPT_LAST,
DISCONNECT_LAST, REFUSE_LAST, REFUSE_CHANGE_LAST, RetryTimeout_last
and E2E_Timeout_last are only relevant to the current stream Request
and refer to the completion of the Request state by occurring as the
Response that incidentally completes the TargetList.

Class 3 Responses:

api_disconnect_all, api_refuse_all, DISCONNECT_ALL and REFUSE_ALL
refer to the Requests or Responses that remove the last active Target
from that FSM for that stream.

These classes delineate the asynchronous Request/Response activity
that may occur. Network conditions may result in interruptions of any
stream FSM operation.

The OSM, NHSM, PHSM and TSM diagrams and tables in this section
define stream state as it relates to atomic setup and teardown
functions. Every attempt has been made to delineate the atomic SCMP
request-response specifications such that implementors may reorganize
the Agent architecture to address implementation-specific issues.

3.4 Stream State Machines

3.4.1 Origin State Machine (OSM)

The Origin State Machine (OSM) communicates with one or more NHSMs.
The OSM also talks to the Upper Layer module via primitives. This OSM
to Upper Layer Interface is outside the scope of this document, but
examples of API predicates are illustrated in the diagrams and
tables. All ST Dispatcher and MS/C Box diagrams have indicated that
API messages could be included. The actual mechanism used for API
communications should be decided by implementation factors.

The OSM consists of a small number of states: Init, Establd, Add and
Change.

Init: The initial state is called Init. An api_open predicate moves
the control to the Establd state. An api_close is required to return
the stream to the Init state.

Establd: The Establd state is the stable state from which all
api_connect, JOIN and api_change requests may cause a transition to
the Add or Change states. All Requests that occur while a stream is
in either an Add or Change state will be queued up until the stream
returns to the Establd state. Data transfer may occur to established
Targets. The removal of Targets from previous operations or current
operations may occur in the Establd, Add or Change states with an
api_disconnect or a REFUSE.

It is possible for an Application at the Origin to add new Targets to
an existing stream any time after the stream has been established. A
JOIN message received by an OSM indicates that the Origin Agent
happens to be the first Agent for that stream in the path between the
JOIN originator and the Origin.

JOIN messages from potential Targets require the authorization
process to determine if the JOIN will be allowed. The OSM then issues
either a JOIN-REJECT message or a CONNECT message.
If this validation is complete and the stream JOIN option allows
authorization to be completed, the ST Agent at the Origin transitions
to the Add state and then issues a CONNECT message that contains the
SID, the FlowSpec, and the TargetList specifying the new Target, and
awaits an ACCEPT or REFUSE response.

If this is not the case, a JOIN-REJECT message is sent to the Target
with the appropriate ReasonCode (e.g., JoinAuthFailure,
DuplicateTarget or RouteLoop). Issuing a JOIN-REJECT brings the OSM
back to the Establd state.

Add: Once in the Establd state the API may issue an api_connect. A
transition to Add will create a CONNECT message that is placed in the
FIFO queue between the OSM and the MS/C box. The CONNECT message
contains the SID, an updated FlowSpec, and a TargetList. The MS/C box
will then make copies of the CONNECT message, partition the
TargetList parameter and place the copies in the NHSM queues. The
splitting (or separating) information is derived from the
implementation's routing and LRM functions.

Once in the Add state the OSM waits to get ACCEPT or REFUSE
responses. The stream will not transition back to the Establd state
until all Targets have responded. A REFUSE may be generated by the
local Routing or LRM functions, NHSM or Monitor FSMs as well as any
Agent in the path between the Origin and the Target. Normal
operations in the OSM treat all types of REFUSE responses to a
CONNECT in the same manner. The Monitor FSM will manage the Recovery
reCONNECT analysis and may also be expanded to include other ST
Service Model functions.

The OSM will record the status of each response from each Target. As
each ACCEPT is received, the OSM updates its database and records the
status of each Target and the resources that were successfully
allocated along the path to it, as specified in the FlowSpec
contained in the ACCEPT message.
The Application may then use the information to either adopt or
terminate the portion of the stream to each Target. When either an
ACCEPT or REFUSE from all Targets has been received at the Origin,
the stream state returns to Establd and any additional queued up
requests may then be processed.

Figure 7. Origin State Machine (OSM)

Table 2: OSM

Once an ACCEPT is received by the OSM, the path to the Target is
considered to be established and the ST Agent is allowed to forward
the data along this path. When a REFUSE reaches the OSM, the OSM
notifies the Application that the Target is no longer part of the
stream. If there are no remaining Targets, the Application may wish
to terminate the stream or keep the stream active to allow stream
joining.

To ensure that all Targets receive the data with the desired quality
of service, an Application should send the data only after the whole
stream has been established. Depending on the local API, an
Application may not be prevented from sending data before the
completion of all stream Targets.

For each new Target in the TargetList, processing is much the same as
for the original CONNECT. The CONNECT is acknowledged, propagated,
and network resources are reserved. However, it may be possible to
route to the new Targets using previously allocated paths or an
existing multicast group. In that case, additional resources do not
need to be reserved but more next-hops might have to be added to an
existing multicast group. These issues are managed by the
implementation of the ST Service Model and stream state transitions
remain the same. Intermediate or Target ST Agents that are not
already nodes in the stream behave as in the case of stream setup.

The OSM may issue a DISCONNECT when an api_disconnect is received.
This message may be processed in any state.
The OSM then records this fact and appropriately updates its
database.

A REFUSE message may arrive at the OSM asynchronously at any time.
This message is sent as a result of an Intermediate Agent failure or
a Target leaving a stream.

Change: The Application at the Origin may wish to change the FlowSpec
of an established stream. To do so, it informs the ST Agent at the
Origin of the new FlowSpec and of the list of Targets associated with
the change with an api_change. The Origin then issues one CHANGE
message with the new FlowSpec per next-hop and sends it to the
relevant next-hop Agents. The control flow to the Change state is
very similar to the control flow to the Add state from the Establd
state. Depending on the CHANGE options selected and the resources
available in each of the stream paths, the CHANGE may result in
either a simple refusal of any change or the disconnect of the entire
stream. A REFUSE response to a CHANGE request with the E-bit set to
zero means that the stream has been torn down for that Target. A
REFUSE_CHANGE is a REFUSE with the E-bit set to 1, indicating that
the CHANGE has been refused but the prior stream resources are
unchanged.

3.4.2 Next Hop State Machine (NHSM)

The NHSM is pictorially shown in Figure 8. This model is common to
the Origin as well as an Intermediate Agent. The NHSM consists of the
same fundamental states as the OSM: Init, Establd, Add and Change.

Init: The state machine for each next hop enters its Init state at
Agent start-up time. An asterisk indicates that this is the initial
state. A nexthop_open predicate moves control to the Establd state
when the next hop associated with an NHSM is required by Targets in a
stream.

Establd: Once in the Establd state a number of things can happen.
Targets may be added by the Origin or Targets may request to join the
stream.
However, the processing of a JOIN request is always handled by either
an OSM or a PHSM. Within each ST Agent, the ST Dispatcher examines
incoming JOIN requests and determines whether the stream referenced
is a stream that that Agent supports. If not, the JOIN is forwarded
on towards the Origin. Once a JOIN request reaches an Agent that can
process the JOIN, the ST Dispatcher ACKs the JOIN and queues it up to
the resident OSM or PHSM. The NHSM only sees the resultant CONNECT
when stream authorization has completed successfully and the OSM or
PHSM has issued a CONNECT through the MS/C Box.

As previously described for the OSM, an ST Agent can handle only one
stream Add or Change at a time. If such a stream operation is already
underway, further requests are queued and handled when the previous
operation has been completed. Either a DISCONNECT or REFUSE for all
Targets transfers control from the Establd state to the Init state.

Add: A CONNECT that has been propagated from the NHSM Add state to
the next-hop Agent PHSM requires a Response in the form of an ACK. If
an ACK is not received, the timeout and retry mechanisms of the Retry
FSM will invoke a RetryTimeout signal. Every PDU has a unique
reference number, so that all ACKs may be matched to the appropriate
Request or Response.

The CONNECT message contains the SID, an updated FlowSpec, and a
TargetList. In general, the FlowSpec and TargetList depend on both
the next-hop and the intervening network. Each TargetList is a subset
of the original TargetList, identifying the Targets that are to be
reached through the next-hop to which the CONNECT message is being
sent. If the TargetList would cause a CONNECT message larger than the
MTU size to be generated, the CONNECT message is partitioned.

The ACK, if it is received, does not need to be reported to the NHSM.
However, if the ACK is not received and the retries are exhausted, a
RetryTimeout signal will be reported to the NHSM and interpreted as a
REFUSE. The NHSM will record all Target Responses until the last
Target in the TargetList has sent an ACCEPT or REFUSE (or an implicit
REFUSE due to Retry exhaustion). An Origin DISCONNECT may terminate
this process when the End-to-End Response timer is exceeded. A
DISCONNECT or REFUSE signal may be due to the failure of a next hop
or previous hop.

If an Application at a Target does not wish to participate in the
stream, it sends a REFUSE message back to the Origin with a
ReasonCode (ApplDisconnect). When an NHSM receives a REFUSE message
with ReasonCode (ApplDisconnect), the acknowledgement has already
been sent by the ST Dispatcher as an ACK to the next-hop. The Agent
considers which resources are to be released, deletes the Target
entry from the internal database, and propagates the REFUSE message
back to the OSM or PHSM.

If, after deleting the specified Target, the next-hop has no
remaining Targets, then those resources associated with that next-hop
agent may be released. Note that network resources may not actually
be released if network multicasting is being used since they may
still be required for traffic to other next-hops in the multicast
group.

Change: The Application at the Origin may wish to change the FlowSpec
of an established stream. To do so, it informs the OSM of the new
FlowSpec and of the list of Targets relative to the change. The OSM
then issues one CHANGE message with the new FlowSpec per next-hop and
sends it with the correct TargetList. The MS/C box then places copies
(as required) of this in the NHSM queues. This takes the control to
the Change state from the Establd state. CHANGE messages are
structured and processed similarly to CONNECT messages.
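
The NHSM behavior described so far can be summarized as a small
transition table. The following sketch is illustrative only and does
not replace Table 3, which remains authoritative. The transitions
listed are inferred from the prose of this section, the nh_open
predicate name follows Section 3.2.3, and the class name, attribute
names and the choice to leave the state unchanged for events without
a listed transition (such as class 1 Responses) are assumptions of
this sketch.

```python
from collections import deque

# (current state, event) -> next state, inferred from the prose only.
TRANSITIONS = {
    ("Init", "nh_open"): "Establd",
    ("Establd", "CONNECT"): "Add",
    ("Establd", "CHANGE"): "Change",
    ("Add", "ACCEPT_LAST"): "Establd",
    ("Add", "REFUSE_LAST"): "Establd",
    ("Add", "RetryTimeout_last"): "Establd",
    ("Change", "ACCEPT_LAST"): "Establd",
    ("Change", "REFUSE_CHANGE_LAST"): "Establd",
    ("Establd", "DISCONNECT_ALL"): "Init",
    ("Establd", "REFUSE_ALL"): "Init",
}
REQUESTS = {"CONNECT", "CHANGE"}


class Nhsm:
    """Hypothetical NHSM holding one state and a queue of deferred Requests."""

    def __init__(self):
        self.state = "Init"
        self.deferred = deque()

    def handle(self, event):
        # Only one Add or Change may be in progress: queue later Requests.
        if event in REQUESTS and self.state in ("Add", "Change"):
            self.deferred.append(event)
            return
        # Events without a listed transition leave the state unchanged.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        # On return to Establd, start the next queued Request, if any.
        if self.state == "Establd" and self.deferred:
            self.handle(self.deferred.popleft())
```

A CHANGE that arrives while a CONNECT is still outstanding is held in
the deferred queue and is started only when the last Response to the
CONNECT returns the machine to Establd.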

A next-hop Agent that is an Intermediate Agent and receives a CHANGE
message similarly determines whether it can implement the new
FlowSpec along the path to each of its next-hop agents, and if so, it
propagates the CHANGE messages along the established paths. If this
process succeeds, the CHANGE messages will eventually reach the
Targets, which will each respond with an ACCEPT (or REFUSE) message
that is propagated back to the OSM.

Figure 8. Next Hop State Machine (NHSM)

At this point the Application decides whether all replies have been
received. If the change to the FlowSpec is in a direction that makes
fewer demands of the involved networks, then the change has a high
probability of success along the path of the established stream. Each
ST Agent receiving the CHANGE message makes the necessary request
changes to the network resource allocations, and if successful,
propagates the CHANGE message along the established paths. If the
change cannot be made and the E-bit indicates that the stream should
be torn down, then the ST Agent must recover using DISCONNECT and
REFUSE messages as in the case of a network failure. Note that a
failure to change the resources requested for specific Targets should
not cause other Targets in the stream to be deleted. A REFUSE
response to a CHANGE request with the E-bit set to zero means that
the stream has been torn down for that Target. A REFUSE_CHANGE is a
REFUSE with the E-bit set to 1, and the stream is unchanged.

Table 3: NHSM

The Application at the Origin may specify a set of Targets that are
to be removed from the stream with an appropriate ReasonCode
(ApplDisconnect). The Targets are partitioned into multiple
DISCONNECT messages based on the next-hop route towards the
individual Targets. If the TargetList is too long to fit into one
DISCONNECT message, it is partitioned.
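
The two levels of partitioning just described, first by next-hop and
then by message size, might look as follows. This is a minimal
sketch: the function name and the next_hop_of mapping are
hypothetical, and max_targets is a simplified stand-in for the real
size limit, which depends on the MTU and on the encoded length of
each Target.

```python
def partition_targets(targets, next_hop_of, max_targets):
    """Return a list of (next_hop, target_chunk) pairs, one pair per
    DISCONNECT message to be sent."""
    # First level: group Targets by the next-hop that reaches them.
    groups = {}
    for target in targets:
        groups.setdefault(next_hop_of[target], []).append(target)

    # Second level: split any group that is too long for one message.
    messages = []
    for hop, group in groups.items():
        for start in range(0, len(group), max_targets):
            messages.append((hop, group[start:start + max_targets]))
    return messages


# Hypothetical example: five Targets behind two next-hops.
hops = {"a": "h1", "b": "h1", "c": "h1", "d": "h2", "e": "h2"}
result = partition_targets(["a", "b", "c", "d", "e"], hops, max_targets=2)
```

The same grouping applies to CONNECT and CHANGE messages, whose
TargetLists are partitioned in the same manner.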

If, after deleting the specified Targets, any next-hop has no
remaining Targets, then those resources associated with that next-hop
agent may be released. Note that the network resources may not
actually be released if network multicasting is being used since they
may still be required for traffic to other next-hops in the multicast
group.

When the DISCONNECT reaches a Target, the Target Agent sends an ACK
to the upstream NHSM and notifies the Application (at the Target)
that it is no longer part of the stream, together with the reason.
The ST Agent at the Target deletes the stream from its database after
performing any necessary management and accounting functions. Note
that the stream is not deleted if the ST Target Agent is also an
Intermediate Agent for the stream and there are remaining downstream
Targets.

Data Forwarding: Once the Application or OSM determines that the
stream is established, data may be transferred to the Targets. An
Application is not guaranteed that the data reaches its destinations:
ST is unreliable and it does not make any attempt to recover from
packet loss, e.g., due to the underlying network. When the data does
reach its destination, it does so according to the negotiated quality
of service. An ST Agent forwards the data only along already
established paths to Targets.

Since a path is considered to be established when the ST next-hop
agent on the path sends an ACCEPT message, it implies that the Target
and all other intermediate ST Agents on the path to the Target are
ready to handle the incoming data packets. In no case will an ST
Agent forward data to a next-hop Agent that has not explicitly
accepted the stream.
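
The forwarding rule above can be sketched as a per-stream entry that
records which next-hops have accepted the stream. This is an
illustration under stated assumptions: the StreamEntry class, its
method names, the dictionary keyed by SID and the send callback are
hypothetical and are not defined by the ST protocol.

```python
class StreamEntry:
    """Hypothetical per-stream database entry."""

    def __init__(self):
        self.accepted_hops = set()  # next-hops that have sent an ACCEPT

    def on_accept(self, next_hop):
        self.accepted_hops.add(next_hop)

    def on_last_target_removed(self, next_hop):
        self.accepted_hops.discard(next_hop)


def forward(database, sid, payload, send):
    """Send payload to every accepted next-hop of stream sid and
    return the number of copies sent."""
    entry = database.get(sid)
    if entry is None:
        return 0  # unknown stream: nothing is forwarded
    for hop in sorted(entry.accepted_hops):
        send(hop, sid, payload)
    return len(entry.accepted_hops)


# Hypothetical example: hopB has been sent a CONNECT but has not
# accepted, so it never receives data.
entry = StreamEntry()
entry.on_accept("hopA")
database = {7: entry}
sent = []
count = forward(database, 7, b"data", lambda hop, sid, p: sent.append(hop))
```

A next-hop is added to the set only on ACCEPT, so data is never sent
to a next-hop that has merely been sent a CONNECT.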

At the end of the connection setup phase, the Origin, each Target,
and each Intermediate ST Agent has a database entry that allows it to
forward the data packets from the Origin to the Targets and to
recover from failures of the Intermediate Agents or networks. The
database should be optimized to make the packet forwarding task most
efficient. The time-critical operation is an Intermediate Agent
receiving a packet from the previous-hop Agent and forwarding it to
the next-hop Agents. The database entry must also contain the
FlowSpec, utilization information, the address of the Origin and
previous-hop, and the addresses of the Targets and next-hops, so it
can perform enforcement and recover from failures. An ST Agent
receives data packets encapsulated by an ST header. A data packet
received by an ST Agent contains the SID. This SID was selected at
the Origin so that it is globally unique and thus can be used as an
index into the database, to obtain quickly the necessary replication
and forwarding information.

The forwarding information will be network and implementation
specific, but must identify the next-hop Agents. It is suggested that
the cached information for a next-hop Agent include the local network
address of the next-hop. If the data packet must be forwarded to
multiple next-hops across a single network that supports multicast,
the database may specify the next-hops by a (local network) multicast
address. If the network does not support multicast, or the next-hops
are on different networks, multiple copies of the data packet must be
sent.

No data fragmentation is supported during the data transfer phase.
The Application is expected to segment its PDUs according to the
minimum MTU over all paths in the stream.
The Application receives information on the MTUs relative to the
paths to the Targets as part of the FlowSpec contained in the ACCEPT
message. The minimum MTU over all paths has to be calculated from the
MTUs relative to the single paths. If the Application at the Origin
sends a data packet that is too large, the ST Agent at the Origin
generates an error and does not forward the data.

3.4.3 Previous Hop State Machine (PHSM)

The Previous Hop State Machine Model is common to a Target or
Intermediate Agent. A PHSM communicates upstream with an NHSM and
downstream with one or more NHSMs and/or a TSM via an MS/C box. When
a CONNECT message is received, the Intermediate ST Agent invokes the
routing function, reserves resources via the Local Resource Manager,
and then propagates the CONNECT messages to its next-hops. For the
most part the Intermediate Agent behaves like a relay. In cases where
the Intermediate Agent is not able to successfully send out a CONNECT
message to a downstream PHSM, a REFUSE message from the PHSM is sent
to the upstream NHSM.

The PHSM consists of a small number of states: Init, Establd, Add and
Change.

Init: The ST Agent initially takes control from the Init state to the
Establd state via the phsm_open predicate. A DISCONNECT or REFUSE of
all Targets in a stream will take the stream from the Establd state
to a terminating state, which is also the Init state.

Establd: Once in the Establd state, Targets may be added or changed
by the Origin or Targets may request to join the stream. The
processing of a JOIN request is always handled by either an OSM or a
PHSM. Within each ST Agent, the ST Dispatcher examines incoming JOIN
requests and determines whether the stream referenced is a stream
that that Agent supports. If not, the JOIN is forwarded on towards
the Origin.
Once a JOIN request reaches an Agent that can process the JOIN, the
ST Dispatcher ACKs the JOIN and queues it up to the resident OSM or
PHSM. When stream authorization has completed successfully, the PHSM
issues a CONNECT through the MS/C Box to either a NHSM or a TSM.

As previously described in the OSM, an ST Agent can handle only one
stream Add or Change at a time. If such a stream operation is already
underway, further requests are queued and handled when the previous
operation has been completed. Either a DISCONNECT or REFUSE for all
Targets transfers control from the Establd state to the Init state.

Add: Once in the Establd state the previous hop may relay a CONNECT
message. A transition to Add will create a CONNECT message that is
placed in the FIFO queue between the PHSM and the MS/C box. The
CONNECT message contains the SID, an updated FlowSpec, and a
TargetList. The MS/C box will then make a copy of the CONNECT
message, partition the TargetList parameter and place the copies in
the NHSM and/or TSM queues. The splitting (or separating) information
is derived from the implementation's routing and LRM functions.

Once in the Add state the PHSM waits to get ACCEPT or REFUSE
responses. The stream will not transition back to the Establd state
until all Targets have responded. The expiration of the retry timer
and count (if the next hop is not ACKing the request) or the
expiration of the end-to-end timer will be interpreted as an implicit
REFUSE.

Change: The Application at the Origin may wish to change the FlowSpec
of an established stream. To do so, it informs the ST Agent at the
Origin of the new FlowSpec and of the list of Targets relative to the
change, and this message will be propagated through the NHSMs to the
PHSMs and TSMs. The control flow to the Change state is very similar
to the previous FSM discussions.

Figure 9. Previous Hop State Machine (PHSM)

Table 4: PHSM

3.5 The Target State Machine (TSM)

The Target State Machine (TSM) is a high level state machine which
communicates with a PHSM, or with an OSM if residing in the same
Agent as the Origin. The TSM also talks to the Upper Layer module via
primitives. The TSM consists of a small number of states: Init,
Establd, Add and Change.

Init: The ST Agent initially takes control from the Init state to the
Establd state via the tsm_open predicate. A Target Application may
request to join an existing stream. It has to collect information on
the stream including the stream ID (SID) and the IP address of the
stream's Origin. This can be done out-of-band, e.g., via regular IP.
The information is then passed to the local ST Agent together with
the FlowSpec. The Application directs the TSM to generate a JOIN
message containing the Application's request to join the stream and
sends it to the PHSM, which in turn sends it upstream toward the
stream Origin.

An ST Agent receiving a JOIN message for which that Agent has a
matching stream responds with an ACK. The ACK message must identify
the JOIN message to which it corresponds by including the Reference
number indicated by the Reference field of the JOIN message. If the
ST Agent is not traversed by the stream that has to be joined, it
propagates the JOIN message toward the stream's Origin. Eventually,
an ST Agent traversed by the stream or the stream's Origin itself is
reached. In any case, the TSM will eventually receive a JOIN-REJECT
or CONNECT response. This is shown as transitions to the Establd
state and the Add state respectively.

Add: The TSM may receive a CONNECT message at any time. The ST Agent
reserves local resources and inquires from the specified Application
process whether or not it is willing to accept the connection.
In particular, the Application must be presented with parameters from
the CONNECT, such as the SID, FlowSpec, Options, and Group, to be
used as a basis for its decision. The Application is identified by a
combination of the NextPcol field and the SAP field included in the
corresponding (usually single remaining) target of the TargetList.
The contents of the SAP field may specify the port or other local
identifier for use by the protocol layer above the host ST layer.
Subsequently received data packets will carry the SID, which can be
mapped into this information and be used for their delivery.

The TSM responds with an ACCEPT or REFUSE, according to the decision
of the Upper Layer module.

Change: The TSM may receive a CHANGE message at any time while it is
in the Establd state. This always happens after a CONNECT. The TSM
again responds with an ACCEPT or REFUSE after informing the Upper
Layer Protocol.

The TSM may at any time want to terminate its membership in the
stream. This is handled by the TSM sending out a REFUSE message. On
the other hand, it is possible for an Origin or Intermediate Agent to
disconnect the Target from the stream. This is accomplished by the
Agent or Origin sending a DISCONNECT message.

Figure 10. Target State Machine (TSM)

Table 5: TSM

4 ST Agent FSMs

This section describes the Retry FSM and the Monitor FSM for the
datalink and ST Agent neighbor reliability functions. The OSM, NHSM,
PHSM and TSM have been shown to model the stream specific
Request-Response pattern. The Retry FSM models the datalink
reliability provided by the ACK mechanisms with the associated timer
and retry count. The Monitor FSM models the ST Agent reliability
provided by the neighbor HELLO mechanisms, the STATUS and
STATUS-RESPONSE messages, as well as the stream Recovery timer and
retry count.

The ST Agent FSMs are the unifying aspect of the total FSM model
architecture. These models are dependent on how the SCMP messages
traverse the ST Agents, and impact Agent databases and FSMs.

4.1 Agent Database Context

ST Agent stream database entries are initiated by the first CONNECT
for that Stream ID. The information initially correlated to each
Stream ID entry includes:

   ST Neighbor Previous Hop and Next Hops

   FlowSpec, Group, MulticastAddress, Origin, TargetList,
   ACK and Response timers

   Stream Options for NoRecovery (S-bit) and Join
   Authorization Level (J-bit, N-bit)

   Routing results for each Target's Next Hop

   LRM results for each Next Hop's resource allocation

Each Agent database is modified when the CONNECT Responses indicate
some variation specified by a downstream Agent or Target Response.
Subsequent Requests can also modify the database and include
additional CONNECT, JOIN and CHANGE requests. Origin, Network or
Agent Recovery and LRM initiated stream teardown can occur in the
form of explicit DISCONNECT, REFUSE and Recovery initiated CONNECT
messages or implicit conditions detected through the HELLO, STATUS,
STATUS-RESPONSE and NOTIFY messages.

The database context is then augmented with the history of Reason
Codes and prior stream characteristics. Transient state
characteristics can also include the G-bit (Global stream
TargetList), I-bit (CHANGE risking teardown of old resources), E-bit
(CHANGE REFUSE without teardown), R-bit (Restarted Agent) or ACK and
Response retry values.

While all control messages may have an indirect effect on stream
state and databases, only ACCEPT, CHANGE, CONNECT, DISCONNECT,
JOIN-REJECT, JOIN and REFUSE directly affect each Agent's definition
of each stream.
ACK, ERROR, HELLO, NOTIFY, STATUS and STATUS-RESPONSE are the control
messages that are primarily used to maintain ST Agent databases for
datalink, neighbor and network management functions.

As the ST PDUs traverse the network, each Agent presumably has a
platform specific interface-to-packet-switching function that must
intercept the ST packets for ST functions. The ST Dispatcher
represents ST PDU validation, filtering and packet switching. The ST
Dispatcher in this model is organized as the Agent packet-switcher,
rather than as a per-interface or per-next-hop packet-switcher. This
function may be reorganized as a distributed function if the Agent
platform architecture requires such distribution.

4.2 ST Dispatcher role for incoming Packet-switching,
    ACKnowledgement and PDU validation

An ST Dispatcher can validate an ST PDU for ST header and PDU syntax
and semantic validity, and then rapidly switch Data packets, i.e., to
a local Target application SAP or to the appropriate next hop
interface for remote Targets.

When the PDU syntax is in error, an ERROR PDU with the corresponding
Reason Code and the offending PDU contents is returned to the
SenderIpAddress (instead of an ACK for those SCMP messages that
require an ACK). The incoming PDU in error is then discarded and does
not directly impact any FSM state. The following Reason Codes detail
the inconsistencies that could be reported in an ERROR Response:

    2  ErrorUnknown     An error not contained in this list has
                        been detected.

    8  AuthentFailed    The authentication function failed.

   13  CksumBadCtl      Control PDU has a bad message checksum.

   14  CksumBadST       PDU has a bad ST Header checksum.

   23  InvalidSender    Control PDU has an invalid SenderIPAddress
                        field.

   24  InvalidTotByt    Control PDU has an invalid TotalBytes field.

 * 26  LnkRefUnknown    Control PDU contains an unknown
                        LnkReference.

   31  OpCodeUnknown    Control PDU has an invalid OpCode field.

   32  PCodeUnknown     Control PDU has a parameter with an invalid
                        PCode.

   33  ParmValueBad     Control PDU contains an invalid parameter
                        value.

   35  ProtocolUnknown  Control PDU contains an unknown next-higher
                        layer protocol identifier.

   37  RefUnknown       Control PDU contains an unknown Reference.

   45  SAPUnknown       Control PDU contains an unknown next-higher
                        layer SAP (port).

 * 46  SIDUnknown       Control PDU contains an unknown SID.

   48  STVer3Bad        A received PDU is not ST Version 3.

   54  TruncatedCtl     Control PDU is shorter than expected.

   55  TruncatedPDU     A received ST PDU is shorter than the ST
                        Header indicates.

In some cases, RFC1819 specifically requires that an error in a PDU
result in an ACK and then a response with the error code. An example
of this is when a CONNECT or CHANGE request with an unknown SID
results in an ACK followed by a REFUSE with Reason Code 46. In any
event, the ST Dispatcher function is to direct only valid PDUs to the
individual FSM logic.

Figure 11. ST Dispatcher Input

The next level of PDU analysis involves Agent and stream consistency.
The PDU is examined for content consistency with both Agent and
stream database information. The following inconsistencies may be
detected as a result:

    3  AccessDenied     Access denied.

    4  AckUnexpected    An unexpected ACK was received.

   15  DuplicateIgn     Control PDU is a duplicate and is being
                        acknowledged.

   16  DuplicateTarget  Control PDU contains a duplicate target, or
                        an attempt to add an existing target.

   49  StreamExists     A stream with the given SID already exists.

   51  TargetExists     A CONNECT was received that specified an
                        existing target.

   52  TargetUnknown    A target is not a member of the specified
                        stream.

   53  TargetMissing    A target parameter was expected and is not
                        included, or is empty.

Most SCMP PDUs (except ACK, ERROR, HELLO, STATUS and STATUS-RESPONSE)
will trigger an ACK to the ST neighbor that sent the PDU.

CONNECT, CHANGE and JOIN Requests will be directed to the appropriate
stream PHSM.

Incoming Responses are first correlated with any corresponding
Request Reference so that the appropriate next hop or Response timer
may be terminated. Then ACCEPT and REFUSE messages are queued up to
the appropriate stream NHSM, while DISCONNECT and JOIN-REJECT
messages are queued up to the appropriate stream PHSM.

ACK and ERROR messages are correlated with a PDU Reference,
terminating the appropriate timers, and then queued up to the stream
Retry FSM.

HELLO, STATUS and STATUS-RESPONSE messages are correlated with a PDU
Reference so that the appropriate timers may be terminated and then
queued up to the Monitor FSM.

4.3 ST Dispatcher functions for outgoing Packet switching, timer
    and retry settings

Figure 12.

The ST Dispatcher also has the role of packaging and forwarding
outgoing PDUs to the appropriate interfaces. The outgoing PDU must be
given its own PDU Reference number and any correlated PDU Reference
number, as well as the semantics and context of the PDU database
entries. This Agent architecture model assumes that the Agent and
stream databases are the intra-Agent repository of all activities,
such that the ST Dispatcher can efficiently create and distribute the
PDUs.
However, it is entirely possible that the accumulated contents of a
PDU exceed an outgoing MTU restriction, in which case the PDU would
be truncated and one of the following Reason Codes reported:

    6  UserDataSize     UserData parameter too large to permit a
                        message to fit into a network's MTU.

   36  RecordRouteSize  RecordRoute parameter is too long to permit
                        message to fit a network's MTU.

4.4 Retry FSM (RFSM) for datalink reliability of PDU transmissions

The following table provides a quick reference for ST Recovery and
Retry implications across ST Agent FSMs. Each SCMP message type that
requires an ACK has configured values for the ACK timer and Retry
count. Each Request that requires an End-to-End Response has a
configured value for the End-to-End Response timer. End-to-End
Response timers are set by the Retry FSM when an ACK signals that the
Request has successfully gone out to the network. The Recovery
Option, STATUS and HELLO messages are managed by the Monitor FSM and
follow a different paradigm.

Table 6: Table of local control and End-to-End retry parameters

The Retry FSM conditions are explicit when an ST Agent neighbor ACK
terminates the neighbor ACK timer and retry count for that
transaction. Any End-to-End Response will terminate the End-to-End
Response timer. Implicit conditions occur when any of the timer or
retry count values have been exhausted. The general paradigm is that
an implicit REFUSE is generated for unsatisfied downstream Requests
and an implicit DISCONNECT is generated for unsatisfied upstream
Requests. The secondary consequence of a timeout is that explicit
REFUSE and DISCONNECT messages may also be issued.

Each table entry has its own variation of this basic paradigm. In
addition, the ST specification indicates many secondary and tertiary
implications for SCMP message failures.
As a particular example, once any ST Agent has completed a downstream
Request-Response scenario, an upstream propagation problem may or may
not cause the stream to be torn down. The I-bit (risk teardown) in
CHANGE processing and the S-bit (Recovery) are examples of causes for
the secondary and tertiary implications.

Figure 13. Retry State Machine (RFSM)

Figure 14.

The Retry FSM has three states: Init, Ack-wait and Resp-wait. The
general paradigm for the Retry FSM is to move from the Init state to
the Ack-wait state whenever a PDU requiring an ACK is sent. The
STATUS message does not require an ACK, but the required
STATUS-RESPONSE performs the same function as an ACK.

The Retry FSM waits for the resultant ACKs, Responses and/or
timeouts. PDUs requiring ACKs cycle through resends for the
appropriate NAccept, NChange, NConnect, NDisconnect, NJoin,
NJoinReject, NNotify and NRefuse configured counts.

Since all ACKs are correlated by PDU Reference numbers, packets may
be correlated to the outstanding Retry FSM by the same mechanism.
Either an ACK or a RetryTimeout that is correlated to an ACCEPT,
DISCONNECT, JOIN-REJECT, NOTIFY or REFUSE results in the Retry FSM
transitioning to Init. Such PDUs have no End-to-End response
requirements and generally have no secondary error processing when it
can be assumed that the neighbor Agent and/or link layer reliability
is gone. An ACCEPT is an exception. The failure of an ACCEPT is an
implicit REFUSE upstream and DISCONNECT downstream, since this ACCEPT
was an End-to-End Response that has now failed to completely traverse
the stream Agents.

An ACK on a CHANGE, CONNECT or JOIN causes the respective ToChange,
ToConnect or ToJoin End-to-End timers to be set, and a state
transition to Resp-wait.

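The Retry FSM transitions described in this section can be sketched
as follows. This is a minimal illustration, not a normative
definition: the class and method names (RetryFSM, send, on_ack,
on_ack_timeout, on_response, on_e2e_timeout) and the "signals" list
are hypothetical, a single max_retries value stands in for the
per-message NAccept ... NRefuse counts, real timers are replaced by
explicit timeout calls, and the STATUS/STATUS-RESPONSE case is
omitted.

```python
from enum import Enum


class State(Enum):
    INIT = "Init"
    ACK_WAIT = "Ack-wait"
    RESP_WAIT = "Resp-wait"


# Requests that also expect an End-to-End Response after the ACK.
NEEDS_RESPONSE = frozenset({"CHANGE", "CONNECT", "JOIN"})
# Messages that only need the neighbor ACK.
ACK_ONLY = frozenset({"ACCEPT", "DISCONNECT", "JOIN-REJECT",
                      "NOTIFY", "REFUSE"})


class RetryFSM:
    """Hypothetical sketch of one Retry FSM transaction."""

    def __init__(self, max_retries=3):
        self.state = State.INIT
        self.msg = None
        self.max_retries = max_retries  # stands in for NConnect etc.
        self.retries_left = 0
        self.signals = []  # (outcome, message) pairs for the stream FSM

    def send(self, msg):
        """A PDU requiring an ACK is sent: Init -> Ack-wait."""
        if self.state is not State.INIT:
            raise RuntimeError("a transaction is already outstanding")
        if msg not in NEEDS_RESPONSE | ACK_ONLY:
            raise ValueError("message does not require an ACK: " + msg)
        self.msg = msg
        self.retries_left = self.max_retries
        self.state = State.ACK_WAIT

    def on_ack(self):
        """Neighbor ACK: start the End-to-End wait or finish."""
        if self.state is not State.ACK_WAIT:
            return
        if self.msg in NEEDS_RESPONSE:
            self.state = State.RESP_WAIT  # End-to-End timer would be set
        else:
            self._finish("ACK")

    def on_ack_timeout(self):
        """ACK timer expired: resend, or give up with RetryTimeout."""
        if self.state is not State.ACK_WAIT:
            return
        if self.retries_left > 0:
            self.retries_left -= 1
            self.signals.append(("resend", self.msg))
        else:
            self._finish("RetryTimeout")

    def on_response(self, response):
        """End-to-End Response received: Resp-wait -> Init."""
        if self.state is State.RESP_WAIT:
            self._finish(response)

    def on_e2e_timeout(self):
        """End-to-End timer expired: Resp-wait -> Init."""
        if self.state is State.RESP_WAIT:
            self._finish("E2ETimeout")

    def _finish(self, outcome):
        # Replicate the outcome to the stream FSM and return to Init.
        self.signals.append((outcome, self.msg))
        self.state = State.INIT
        self.msg = None
```

In this sketch a RetryTimeout on an ACCEPT is only reported through
"signals"; turning it into the implicit REFUSE upstream and
DISCONNECT downstream is left to the stream FSM that consumes the
signal.
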
A legitimate Response, or an E2ETimeout on CHANGE, CONNECT or JOIN,
causes the transition to the Init state with the signal to be
replicated to the appropriate stream FSM.

4.5 Agent, Neighbor and Stream Supervision

4.5.1 The MonitorFSM (MFSM) for Agent and Stream Supervision

Each ST Agent must monitor its own status, network conditions and
neighbor Agent status, and supervise the Recovery of streams whenever
required and possible during a network failure. This MFSM is intended
to be a general approach to these issues, rather than a fully
specified FSM, since the particular network, platform and
implementation architecture will determine detailed FSM
considerations.

What this MFSM model does suggest is that the MFSM provides a
superstructure for the management of the Neighbor Detection Failure
FSM (NDFSM), as well as any Agent NOTIFY, STATUS or STATUS-RESPONSE
implications. The Service Model management (including application
issues, routing and LRM or other Agent implementation specific
issues), as well as datalink statistics analysis (e.g., broken or
dropped PDUs or accumulated routing errors) may also be incorporated
into this FSM.

At the very least, stream Recovery requires careful analysis of the
possible recursions in Agent, neighbor failure detection, routing and
LRM conditions. The ST2+ specification defines parameters for a
configured number of times that Recovery should be attempted
(NRetryRoute), the configured time to wait for each Response
(ToRetryRoute) and variations in the exception processing.

Figure 15. MFSM

During the course of a stream setup, the CONNECT contains a Recovery
Timeout, as specified by the Origin. The resultant ACCEPTs contain
the Agent's "supportable" Recovery Timeout such that the stream
Recovery Timeout becomes the smallest Recovery Timeout for all
Targets.
The HELLO timer must be smaller than the smallest Recovery Timeout
for all streams between these Agents, but an Agent may have various
HELLO timers between different Agents, such that the management
function of such timers should also fall into the MFSM. A Round Trip
Time (RTT) estimation function is available with STATUS and
STATUS-RESPONSE messages to aid in this area.

The MFSM relies on the Neighbor Detection Failure FSM (NDFSM) as the
primary notification vehicle for stream and neighbor management.
During the initial stream setup of any stream NHSM and PHSM, the MFSM
is signalled to begin monitoring of the FSM neighbor Agents involved
in the stream. The sending of HELLOs is begun once an ACCEPT is
forwarded upstream. The receiving of HELLOs is acceptable as soon as
an ACCEPT is received. HELLOs are terminated once an ACK is sent or
received for the DISCONNECT or REFUSE associated with the last of all
streams and Targets for that neighbor. This requires signalling and
coordination with the ST Dispatcher, Retry FSM and database context,
especially when the Restarted bit is active for either the local
Agent or a neighbor.

Agent network "inspection and repair" functions might also exist in
the MFSM to extend the mechanisms of the NDFSM before attempting
Recovery and/or stream teardown.

Group management for Bandwidth-sharing, Fate-sharing, Path-sharing
and Subnet resource-sharing can be initiated by any ST Agent, and it
may be advisable to incorporate optimization algorithms in the MFSM
to interact with Routing and LRM functions, thus allowing the MFSM to
monitor and gauge the impact on the stream Recovery analysis.

4.5.2 The Neighbor Detection Failure FSM for Neighbor Management

This FSM has a more atomic focus in that ST neighbor HELLOs are
maintained and monitored only while there are one or more shared
streams active.
When the neighbor HELLOs and subsequent STATUS inquiry fail, or the
neighbor R-bit has been set, the neighbor is considered down and the
streams involved in that neighbor relationship must be examined for
Recovery conditions.

Figure 16. NDFSM

Table 7: NDFSM

4.5.3 Service Model Interactions

Figure 17. MS/C Box Communications inside an Agent

The optimization of route and LRM functions can affect the selection
from multiple path routes to a Target on initial CONNECTs, as well as
CHANGE and Recovery procedures. This document's model follows a
sequential process of integrating the route and LRM services with the
MS/C Box for the atomic stream FSMs.

Additional algorithms may be used in the MFSM, such that algorithms
for Options and Group factors may be optimized in relation to the
stream Recovery decisions.

5 Exception Processing

Various types of exception processing conditions have been referenced
in the preceding sections. Not all have been spelled out in detail.
The general paradigms fall into several categories and all of this
document's models are based on a suggested approach. The secondary
and tertiary conditions of some aspects of exception processing are
especially subject to implementation preferences.

The first topic of discussion might be the category of SCMP datalink
reliability as generally characterized in the Retry FSM. This
document favors maintaining a coordinating Retry FSM over
incorporating the Retry states in each of the OSM, NHSM, PHSM and
TSM, which is naturally an alternative.

ERROR message generation for PDU semantics problems is discussed in
Section 4.2 as an ST Dispatcher function. A special case occurs in
PDU construction when the MTU size is exceeded, i.e.:

    6  UserDataSize     UserData parameter too large to permit a
                        message to fit into a network's MTU.

   36  RecordRouteSize  RecordRoute parameter is too long to permit
                        message to fit a network's MTU.

However, all of the analysis and potential REFUSE message or signal
generation still seems best suited to the ST Dispatcher.

5.1 Additional Exception Processing

5.1.1 ST Dispatcher detected inconsistencies Reason Codes:

The following errors can also be detected by the ST Dispatcher with
the careful analysis of all Agent and stream database values:

    3  AccessDenied     Access denied.

    4  AckUnexpected    An unexpected ACK was received.

   15  DuplicateIgn     Control PDU is a duplicate and is being
                        acknowledged.

   16  DuplicateTarget  Control PDU contains a duplicate target, or
                        an attempt to add an existing target.

   49  StreamExists     A stream with the given SID already exists.

   51  TargetExists     A CONNECT was received that specified an
                        existing target.

   52  TargetUnknown    A target is not a member of the specified
                        stream.

   53  TargetMissing    A target parameter was expected and is not
                        included, or is empty.

This means that the atomic FSMs do not have to incorporate this
logic, and this approach simplifies the atomic FSM paradigms.

5.1.2 MonitorFSM issues with neighbor failure and stream recovery
      Reason Codes:

The details of these specific instances can also be intertwined with
Retry, Routing and LRM failures.

   12  CantRecover      Unable to recover failed stream.

   22  IntfcFailure     A network interface failure has been
                        detected.

   27  NetworkFailure   A network failure has been detected.

   39  RestartLocal     The local ST agent has recently restarted.

   40  RestartRemote    The remote ST agent has recently restarted.

   47  STAgentFailure   An ST agent failure has been detected.

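The grouping of Reason Codes by the component that detects them can
be captured in a small lookup. The sketch below is illustrative
only: the code numbers and names are those listed in Sections 4.2,
5.1.1 and 5.1.2 of this document, while the category labels and the
function name classify are assumptions made for this example.

```python
# Reason Codes reported in an ERROR Response after the ST
# Dispatcher's PDU validation (Section 4.2).
DISPATCHER_SYNTAX = {
    2: "ErrorUnknown", 8: "AuthentFailed", 13: "CksumBadCtl",
    14: "CksumBadST", 23: "InvalidSender", 24: "InvalidTotByt",
    26: "LnkRefUnknown", 31: "OpCodeUnknown", 32: "PCodeUnknown",
    33: "ParmValueBad", 35: "ProtocolUnknown", 37: "RefUnknown",
    45: "SAPUnknown", 46: "SIDUnknown", 48: "STVer3Bad",
    54: "TruncatedCtl", 55: "TruncatedPDU",
}

# Reason Codes found by the Agent and stream consistency checks
# (Section 5.1.1).
DISPATCHER_CONSISTENCY = {
    3: "AccessDenied", 4: "AckUnexpected", 15: "DuplicateIgn",
    16: "DuplicateTarget", 49: "StreamExists", 51: "TargetExists",
    52: "TargetUnknown", 53: "TargetMissing",
}

# Reason Codes tied to neighbor failure and stream recovery
# (Section 5.1.2).
MONITOR = {
    12: "CantRecover", 22: "IntfcFailure", 27: "NetworkFailure",
    39: "RestartLocal", 40: "RestartRemote", 47: "STAgentFailure",
}

# Category labels are hypothetical names chosen for this sketch.
_CATEGORIES = (
    ("dispatcher-syntax", DISPATCHER_SYNTAX),
    ("dispatcher-consistency", DISPATCHER_CONSISTENCY),
    ("monitor", MONITOR),
)


def classify(code):
    """Return (category, name) for a Reason Code, or None if the
    code is not in one of the three groups above."""
    for category, table in _CATEGORIES:
        if code in table:
            return category, table[code]
    return None
```

Codes from the Retry, Routing and LRM groups are deliberately left
out of this sketch, so classify returns None for them.
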
5.1.3 Retry and Timeout Failures Reason Codes:

   38  ResponseTimeout  Control message has been acknowledged but
                        not answered by an appropriate control
                        message.

   41  RetransTimeout   An acknowledgment has not been received
                        after several retransmissions.

5.1.4 Routing issues Reason Codes:

Routing issues initiate special exception processing requirements.
Some of these have been addressed in the ST2+ specification, but each
implementation should also consider the network and platform
architecture.

    9  BadMcastAddress  IP Multicast address is unacceptable in
                        CONNECT.

   28  NoRouteToAgent   Cannot find a route to an ST agent.

   29  NoRouteToHost    Cannot find a route to a host.

   30  NoRouteToNet     Cannot find a route to a network.

   34  PathConvergence  Two branches of the stream join during the
                        CONNECT setup.

   42  RouteBack        Route to next-hop through same interface as
                        previous-hop and is not previous-hop.

   43  RouteInconsist   A routing inconsistency has been detected.

   44  RouteLoop        A routing loop has been detected.

5.1.5 LRM issue Reason Codes:

Optimization of routing and LRM issues can also initiate special
exception processing requirements. Some of these have been addressed
in the ST2+ specification, but each implementation should also
consider the network and platform architecture.

   10  CantGetResrc     Unable to acquire (additional) resources.

   11  CantRelResrc     Unable to release excess resources.

   17  FlowSpecMismatch FlowSpec in request does not match existing
                        FlowSpec.

   18  FlowSpecError    An error occurred while processing the
                        FlowSpec.

   19  FlowVerUnknown   Control PDU has a FlowSpec Version Number
                        that is not supported.

   20  GroupUnknown     Control PDU contains an unknown Group Name.

   21  InconsistGroup   An inconsistency has been detected with the
                        streams forming a group.

   50  StreamPreempted  The stream has been preempted by one with a
                        higher precedence.

6 APPENDIX

6.1 Glossary

All stream FSMs have the following 4 states in common:

Init: The stream has no active Targets.

Establd: The stream is established and may or may not have Target
members.

Add: The stream is currently adding Targets as the result of a
Connect or Join initiated Connect.

Change: The stream is currently attempting to Change according to a
new FlowSpec.

The predicates, API interactions and combination conditions include
the following:

api_close - the Origin API explicitly terminates a stream, since a
stream with no Targets at the Origin may remain Established.

api_open - the Origin API explicitly establishes a stream to initiate
all database setup functions whether or not any Targets are initially
specified.

api_connect - the Origin API adds Targets.

api_change - the Origin API initiates a CHANGE to the FlowSpec.

api_disconnect - the Origin API initiates a DISCONNECT to Targets.

accept_api - the OSM propagates an ACCEPT received from either a TSM
or a NHSM to the Origin API.

notify_api - the OSM propagates a NOTIFY received from either a TSM
or a NHSM to the Origin API.

refuse_api - the OSM propagates a REFUSE received from either a TSM
or a NHSM to the Origin API.

nexthop_open - the first time each unique NHSM is invoked for each
unique stream in an Agent, the Agent explicitly establishes a NHSM
database and Establd state.

prevhop_open - the first time the PHSM is invoked for each unique
stream in an Agent, the Agent explicitly establishes a PHSM database
and Establd state.

api_join - the Target API initiates a JOIN request.

api_refuse - the Target API initiates a REFUSE to the TSM.

api_refuse_change - the Target API initiates a REFUSE of a CHANGE
request to the TSM.

connect_api - the TSM propagates a CONNECT to the Target API.

change_api - the TSM propagates a CHANGE to the Target API.

join_reject_api - the TSM propagates a JOIN_REJECT to the Target API.

disconnect_api - the TSM propagates a DISCONNECT to the Target API.

JOIN_AUTH - a PHSM or OSM JOIN is authorized.

JOIN_NOT_AUTH - a PHSM or OSM JOIN is not authorized.

RetryTimeout - an FSM receives an implicit REFUSE response to a
CONNECT or CHANGE request to one Target in the TargetList by
exceeding the ACK retry and timeout values (i.e., ToChange/NChange,
ToConnect/NConnect timers and retry counts) for that particular
transaction.

The Add and Change states cannot transition back to the Establd state
until all Targets have given implicit or explicit responses.

ACCEPT_LAST - the last Target in the TargetList for a CONNECT or
CHANGE has responded with an ACCEPT.

DISC_LAST - the last Target in the TargetList for a CONNECT or CHANGE
has responded with a DISCONNECT.

REFUSE_LAST - the last Target in the TargetList for a CONNECT or
CHANGE has responded with a REFUSE.

RetryTimeout_last - the last Target in the TargetList for a CONNECT
or CHANGE has responded with an implicit REFUSE by exceeding the ACK
retry and timeout values (i.e., ToChange/NChange, ToConnect/NConnect
timers and retry counts).

E2E_Timeout_last - the last Target in the TargetList for a CONNECT or
CHANGE has responded with an implicit REFUSE by exceeding the
End-to-End timeout value (i.e., ToChangeResp, ToConnectResp timers).

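The "_LAST" predicates above amount to bookkeeping over the Targets
still outstanding for one CONNECT or CHANGE. A minimal sketch of
that bookkeeping follows. The class name PendingRequest, the record
method and the outcome strings used as inputs are hypothetical; only
the predicate names it returns are taken from this glossary.

```python
# Maps the outcome recorded for a Target to the predicate that
# fires when that Target was the last one outstanding.
LAST_PREDICATE = {
    "ACCEPT": "ACCEPT_LAST",
    "DISCONNECT": "DISC_LAST",
    "REFUSE": "REFUSE_LAST",
    "RetryTimeout": "RetryTimeout_last",
    "E2E_Timeout": "E2E_Timeout_last",
}


class PendingRequest:
    """Tracks the Targets that have not yet answered one CONNECT
    or CHANGE (hypothetical helper, not part of the ST2+ spec)."""

    def __init__(self, targets):
        self.pending = set(targets)
        if not self.pending:
            raise ValueError("TargetList must not be empty")

    @property
    def done(self):
        """True once every Target has responded, i.e. the Add or
        Change state may transition back to Establd."""
        return not self.pending

    def record(self, target, outcome):
        """Record an explicit or implicit response for one Target.

        Returns the matching "_LAST" predicate name if this was the
        last outstanding Target, otherwise None.
        """
        if outcome not in LAST_PREDICATE:
            raise ValueError("unknown outcome: " + outcome)
        if target not in self.pending:
            raise KeyError(target)
        self.pending.remove(target)
        return LAST_PREDICATE[outcome] if self.done else None
```

With two Targets, recording an ACCEPT for the first returns None,
and recording a RetryTimeout for the second returns
"RetryTimeout_last", at which point the stream may return to the
Establd state.
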
Except in the case of the OSM (which must be explicitly closed by the
Origin API), the Establd, Add and Change states transition back to
the Init state when all Targets in the unique FSM TargetList have
given implicit or explicit stream teardown instructions.

DISC_ALL - a DISCONNECT has been received for the last Target in the
entire TargetList for a stream FSM (as opposed to the TargetList for
a particular CONNECT or CHANGE request).

REFUSE_ALL - a REFUSE has been received for the last Target in the
entire TargetList for a stream FSM (as opposed to the TargetList for
a particular CONNECT or CHANGE request).

6.2 ST Control Message Flow

Control Message Types

ST control messages are generally of the Request-Response type.
Table 1 summarizes these control messages alphabetically. The table
has three major columns:

   o  Message type

   o  Response

   o  Possible causes for message

6.2.1 Message Type

Under the Message Type column, each control message is categorized as
either a:

   -  Request message

   -  Response message

It is possible for a message to be more than one type depending on
the usage, although this is not apparent from this table.

6.2.2 Response

The Response to each control message is given in the next major
column under Response. Note that the Response to a message can be
interpreted to mean either:

   1. a Response to another control message

   2. a Response to indicate the condition of receipt of the
      message, driven primarily by the error control function

The second interpretation of Response includes positive
acknowledgments and negative acknowledgments (error response). Thus,
this major column has the following categories:

   -  Error Response

   -  Mandatory Response

   -  Other response following mandatory response.

An X or an entry in the table indicates classification of a message
under a particular category shown under each major column.

6.2.3 Possible causes for message

Finally, a control message might have been sent in response to
another control message. This is shown in the last column. Several
different control messages may independently be the cause of the
message in question, so an entry does not necessarily identify the
only cause. A blank entry in this column means that the message was
not invoked by another message.

For example, an ACCEPT message is a Response Type message to either a
CONNECT or a CHANGE message. It will be acknowledged (Mandatory
response) with an ACK. It may be answered with an ERROR in case of
error conditions. The state diagrams illustrate this sequencing more
completely. Note that the sequencing of messages gives the protocol
its semantics.

Table 8: Message Types: Requests, Responses and Others

6.3 Internetwork Complexities

The following internetwork diagram of ST Agents indicates the Origin
(O), Intermediate (I) and Target (T) roles of each ST Agent in
relation to two ST conferences. Conference 1 (C1) has 4 participants,
all of which are sending and receiving data from each other.
Intermediate Agents 1, 2, 3 and 4 each have an ST application with an
Origin sending data to all other members, and simultaneously
receiving data as a Target for each of the other members.

Figure 18.

Figure 19.

Each ST Agent participating in C1, as illustrated, has all four
streams to manage, representing a fully meshed stream conference with
Targets and Origins communicating along the same paths.

There are a number of other possible routes for each stream in C1.
The above paths through Agents I6, I9 and I10 were chosen to
illustrate a simple routing scheme for such a conference. Agents I8
and I12 could just as easily have been involved if there were no
routing metrics to consider other than the number of hops. However,
it is also possible that the resources available at any Agent or
interface are not actually equal, such that Agents I7 and I12
become involved in branches of some of the streams. In such
alternative routing and resource circumstances, some of the
Intermediate Agents might only maintain one stream in the conference.
However, in this illustration, Agent I9 happens to have an ST
neighbor for every stream and the need to manage multiple targets for
each stream.

Conference 2 (C2) has 3 participants and is also a fully meshed set
of streams for each member of the conference. All ST Agents in the C2
illustration also have multiple ST neighbors, streams and interfaces
to manage.

Figure 20.

In addition, since both conferences are being conducted
simultaneously, several Agents are managing streams from both
conferences, which may be Grouped for Resource or Fatesharing
characteristics. The dynamics of such internetwork topology and
resource issues can make stream management complex.

ACKNOWLEDGEMENTS and AUTHORS:

Many individuals have contributed to the work described in this memo.

We thank the participants in the ST Working Group for their input,
review, and constructive comments.

We would also like to thank Luca Delgrossi and Louis Berger for
allowing us to adopt the text from their [1] document.

We would like to acknowledge inputs from Mark Pullen and his graduate
students, Tim O'Malley, Eric Crowley, Muneyoshi Suzuki and many
others.

Murali Rajagopal EMail: murali@fbcs.com, Phone: 714-764-2952

Sharon Sergeant EMail: sergeant@xylogics.com, Phone: 617-893-6142

LIST OF REFERENCES:

[1] L. Delgrossi and L. Berger: Internet STream Protocol Version 2
(ST2) - Protocol Specification - Version ST2+, RFC 1819, August 1995.

[2] D. Brand and P. Zafiropulo: On Communicating Finite-State
Machines, Journal of the ACM, Vol. 30, No. 2, April 1983.