idnits 2.17.1

draft-ietf-ion-scsp-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-19) according to
     https://trustee.ietf.org/license-info :

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

          This Internet-Draft is submitted in full conformance with the
          provisions of BCP 78 and BCP 79.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

          Copyright (c) 2024 IETF Trust and the persons identified as the
          document authors. All rights reserved.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

          This document is subject to BCP 78 and the IETF Trust's Legal
          Provisions Relating to IETF Documents
          (https://trustee.ietf.org/license-info) in effect on the date of
          publication of this document. Please review these documents
          carefully, as they describe your rights and restrictions with
          respect to this document. Code Components extracted from this
          document must include Simplified BSD License text as described in
          Section 4.e of the Trust Legal Provisions and are provided without
          warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 1802 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** There are 32 instances of too long lines in the document, the longest one
     being 7 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords.

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The exact meaning of the all-uppercase expression 'MAY NOT' is not
     defined in RFC 2119. If it is intended as a requirements expression, it
     should be rewritten using one of the combinations defined in RFC 2119;
     otherwise it should not be all-uppercase.

  == The expression 'MAY NOT', while looking like RFC 2119 requirements text,
     is not defined in RFC 2119, and should not be used. Consider using 'MUST
     NOT' instead (if that is what you mean).

     Found 'MAY NOT' in this paragraph:

     The SCSP Vendor-Private Extension is carried in SCSP packets to convey
     vendor-private information between an LS and a DCS in the same SG and is
     thus of limited use. If a finer granularity (e.g., CSA record level) is
     desired then then given client/server protocol specific SCSP document
     MUST define such a mechanism.
     Obviously, however, such a protocol specific mechanism might look exactly
     like this extension. The Vendor Private Extension MAY NOT appear more
     than once in an SCSP packet for a given Vendor ID value.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you can
     ignore this comment. If not, you may need to add the pre-RFC5378
     disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 1998) is 9440 days in the past. Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: '5' is defined on line 1694, but no explicit reference
     was found in the text

  == Unused Reference: '6' is defined on line 1697, but no explicit reference
     was found in the text

  ** Obsolete normative reference: RFC 1577 (ref. '1') (Obsoleted by RFC 2225)

  == Outdated reference: A later version (-14) exists of
     draft-ietf-rolc-nhrp-12

  ** Obsolete normative reference: RFC 1583 (ref. '3') (Obsoleted by RFC 2178)

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  ** Obsolete normative reference: RFC 1700 (ref. '7') (Obsoleted by RFC 3232)

  == Outdated reference: A later version (-04) exists of
     draft-ietf-ion-scsp-nhrp-02

  -- Possible downref: Non-RFC (?) normative reference: ref. '9'

  ** Downref: Normative reference to an Informational RFC: RFC 2104 (ref.
     '11')


     Summary: 14 errors (**), 0 flaws (~~), 8 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1    Internetworking Over NBMA                               James V. Luciani
2    INTERNET-DRAFT                                            (Bay Networks)
3                                                          Grenville Armitage
4                                                                  (Bellcore)
5                                                                Joel Halpern
6                                                                 (Newbridge)
7                                                          Naganand Doraswamy
8                                                              (Bay Networks)
9                                                           Expires June 1998

11                 Server Cache Synchronization Protocol (SCSP)

13   Status of this Memo

15      This document is an Internet-Draft. Internet-Drafts are working
16      documents of the Internet Engineering Task Force (IETF), its areas,
17      and its working groups. Note that other groups may also distribute
18      working documents as Internet-Drafts.

20      Internet-Drafts are draft documents valid for a maximum of six months
21      and may be updated, replaced, or obsoleted by other documents at any
22      time. It is inappropriate to use Internet-Drafts as reference
23      material or to cite them other than as ``work in progress.''

25      To learn the current status of any Internet-Draft, please check the
26      ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
27      Directories on ds.internic.net (US East Coast), nic.nordu.net
28      (Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific
29      Rim).

31   Abstract

33      This document describes the Server Cache Synchronization Protocol
34      (SCSP) and is written in terms of SCSP's use within Non Broadcast
35      Multiple Access (NBMA) networks; although, a somewhat straight
36      forward usage is applicable to BMA networks. SCSP attempts to solve
37      the generalized cache synchronization/cache-replication problem for
38      distributed protocol entities. However, in this document, SCSP is
39      couched in terms of the client/server paradigm in which distributed
40      server entities, which are bound to a Server Group (SG) through some
41      means, wish to synchronize the contents (or a portion thereof) of
42      their caches which contain information about the state of clients
43      being served.

45   1. Introduction

47      The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
48      SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
49      document, are to be interpreted as described in [10].

51      It is perhaps an obvious goal for any protocol to not limit itself to
52      a single point of failure such as having a single server in a
53      client/server paradigm. Even when there are redundant servers, there
54      still remains the problem of cache synchronization; i.e., when one
55      server becomes aware of a change in state of cache information then
56      that server must propagate the knowledge of the change in state to
57      all servers which are actively mirroring that state information.
58      Further, this must be done in a timely fashion without putting undue
59      resource strains on the servers. Assuming that the state information
60      kept in the server cache is the state of clients of the server, then
61      in order to minimize the burden placed upon the client it is also
62      highly desirable that clients need not have complete knowledge of all
63      servers which they may use. However, any mechanism for
64      synchronization should not preclude a client from having access to
65      several (or all) servers. Of course, any solution must be reasonably
66      scalable, capable of using some auto-configuration service, and lend
67      itself to a wide range of authentication methodologies.

69      This document describes the Server Cache Synchronization Protocol
70      (SCSP). SCSP solves the generalized server synchronization/cache-
71      replication problem while addressing the issues described above.
72      SCSP synchronizes caches (or a portion of the caches) of a set of
73      server entities of a particular protocol which are bound to a Server
74      Group (SG) through some means (e.g., all NHRP servers belonging to a
75      Logical IP Subnet (LIS)[1]). The client/server protocol which a
76      particular server uses is identified by a Protocol ID (PID). SGs are
77      identified by an ID which, not surprisingly, is called an SGID. Note,
78      therefore, that the combination PID/SGID identifies both the
79      client/server protocol for which the servers of the SG are being
80      synchronized as well as the instance of that protocol. This implies
81      that multiple instances of the same protocol may be in operation at
82      the same time and have their servers synchronized independently of
83      each other. An example of types of information that must be
84      synchronized can be seen in NHRP[2] using IP where the information
85      includes the registered clients' IP to NBMA mappings in the SG LIS.

87      The simplest way to understand SCSP is to understand that the
88      algorithm used here is quite similar to that used in OSPF[3]. In
89      fact, if the reader wishes to understand more details of the
90      tradeoffs and reliability aspects of SCSP, they should refer to the
91      Hello, Database Synchronization, and Flooding Procedures in OSPF [3].

93      As described later, the protocol goes through three phases. The
94      first, very brief phase is the hello phase where two devices
95      determine that they can talk to each other. Following that is
96      database synchronization. The operation of SCSP assumes that up to
97      the point when new information is received, two entities have the
98      same data available. The database synchronization phase ensures
99      this.

101     In database synchronization, the two neighbors exchange summary
102     information about each entry in their database. Summaries are used
103     since the database itself is potentially quite large. Based on these
104     summaries, the neighbors can determine if there is information that
105     each needs from the other. If so, that is requested and provided.
106     Therefore, at the end of this phase of operation, the two neighbors
107     have the same data in their databases.

109     After that, the entities enter and remain in flooding state. In
110     flooding state, any new information that is learned is sent to all
111     neighbors, except the one (if any) that the information was learned
112     from. This causes all new information in the system to propagate to
113     all nodes, thus restoring the state that everyone knows the same
114     thing. Flooding is done reliably on each link, so no pattern of low
115     rate packet loss will cause a disruption. (Obviously, a sufficiently
116     high rate of packet loss will cause the entire neighbor relationship
117     to come down, but if the link does not work, then that is what one
118     wants.)

120     Because the database synchronization procedure is run whenever a link
121     comes up, the system robustly ensures that all participating nodes
122     have all available information. It properly recovers from
123     partitions, and copes with other failures.

125     The SCSP specification is not useful as a stand alone protocol. It
126     must be coupled with the use of an SCSP Protocol Specific
127     specification which defines how a given protocol would make use of
128     the synchronization primitives supplied by SCSP. Such specifications
129     are given in separate documents; e.g., [8][9].

131  2. Overview

133     SCSP places no topological requirements upon the SG. Obviously,
134     however, the resultant graph must span the set of servers to be
135     synchronized. SCSP borrows its cache distribution mechanism from the
136     link state protocols [3,4]. However, unlike those technologies,
137     there is no mandatory Shortest Path First (SPF) calculation, and SCSP
138     imposes no additional memory requirements above and beyond that which
139     is required to save the cached information which would exist
140     regardless of the synchronization technology.

142     In order to give a frame of reference for the following discussion,
143     the terms Local Server (LS), Directly Connected Server (DCS), and
144     Remote Server (RS) are introduced. The LS is the server under
145     scrutiny; i.e., all statements are made from the perspective of the
146     LS when discussing the SCSP protocol. The DCS is a server which is
147     directly connected to the LS; e.g., there exists a VC between the LS
148     and DCS. Thus, every server is a DCS from the point of view of every
149     other server which connects to it directly, and every server is an LS
150     which has zero or more DCSs directly connected to it. From the
151     perspective of an LS, an RS is a server, separate from the LS, which
152     is not directly connected to the LS (i.e., an RS is always two or
153     more hops away from an LS whereas a DCS is always one hop away from
154     an LS).

156     SCSP contains three sub protocols: the "Hello" protocol, the "Cache
157     Alignment" protocol, and the "Cache State Update" protocol. The
158     "Hello" protocol is used to ascertain whether a DCS is operational
159     and whether the connection between the LS and DCS is bidirectional,
160     unidirectional, or non-functional. The "Cache Alignment" (CA)
161     protocol allows an LS to synchronize its entire cache with the
162     caches of its DCSs. The "Cache State Update" (CSU) protocol is
163     used to update the state of cache entries in servers for a given SG.
164     Sections 2.1, 2.2, and 2.3 contain a more in-depth explanation of the
165     Hello, CA, and CSU protocols and the messages they use.

167     SCSP based synchronization is performed on a per protocol instance
168     basis. That is, a separate instance of SCSP is run for each instance
169     of the given protocol running in a given box. The protocol is
170     identified in SCSP via a Protocol ID and the instance of the protocol
171     is identified by a Server Group ID (SGID). Thus the PID/SGID pair
172     uniquely identifies an instance of SCSP. In general, this is not an
173     issue since it is seldom the case that many instances of a given
174     protocol (which is distributed and needs cache synchronization) are
175     running within the same physical box. However, when this is the
176     case, there is a mechanism called the Family ID (described briefly in
177     the Hello Protocol) which enables a substantial reduction in
178     maintenance traffic at little real cost in terms of control. The use
179     of the Family ID mechanism, when appropriate for a given protocol
180     which is using SCSP, will be fully defined in the given SCSP protocol
181     specific specification.

183                          +---------------+
184                          |               |
185                 +------->|     DOWN      |<-------+
186                 |        |               |        |
187                 |        +---------------+        |
188                 |            |       ^            |
189                 |            |       |            |
190                 |            |       |            |
191                 |            |       |            |
192                 |            @       |            |
193                 |        +---------------+        |
194                 |        |               |        |
195                 |        |    WAITING    |        |
196                 |     +--|               |--+     |
197                 |     |  +---------------+  |     |
198                 |     |    ^           ^    |     |
199                 |     |    |           |    |     |
200                 |     @    |           |    @     |
201               +---------------+     +---------------+
202               | BIDIRECTIONAL |---->| UNIDIRECTIONAL|
203               |               |     |               |
204               |  CONNECTION   |<----|  CONNECTION   |
205               +---------------+     +---------------+

207               Figure 1: Hello Finite State Machine (HFSM)

209  2.1 Hello Protocol

211     "Hello" messages are used to ascertain whether a DCS is operational
212     and whether the connections between the LS and DCS are bidirectional,
213     unidirectional, or non-functional. In order to do this, every LS MUST
214     periodically send a Hello message to its DCSs.

216     An LS must be configured with a list of NBMA addresses which
217     represent the addresses of peer servers in an SG to which the LS
218     wishes to have a direct connection for the purpose of running SCSP;
219     that is, these addresses are the addresses of would-be DCSs. The
220     mechanism for the configuration of an LS with these NBMA addresses is
221     beyond the scope of this document; although one possible mechanism
222     would be an autoconfiguration server.

224     An LS has a Hello Finite State Machine (HFSM) associated with each of
225     its DCSs (see Figure 1) for a given SG, and the HFSM monitors the
226     state of the connectivity between the servers.

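        The HFSM of Figure 1 can be pictured as a small per-DCS state
        machine. The Python sketch below is only an illustration and is
        not part of the protocol. Its transitions follow the rules that
        Section 2.1 gives for connectivity changes, received Hello
        messages, the HelloInterval*DeadFactor timeout, and abnormal
        events. The class and method names are hypothetical, IDs are
        modelled as byte strings, and ignoring a Hello while in the Down
        State is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class HelloState(Enum):
    DOWN = "Down"
    WAITING = "Waiting"
    UNIDIRECTIONAL = "Unidirectional Connection"
    BIDIRECTIONAL = "Bidirectional Connection"


@dataclass
class HelloFsm:
    """Hello state an LS keeps for one DCS (all names are illustrative)."""

    lsid: bytes                          # this server's own ID
    state: HelloState = HelloState.DOWN
    dcsid: Optional[bytes] = None        # Sender ID saved from the last Hello

    def nbma_connected(self) -> None:
        # Down -> Waiting once NBMA level connectivity is established.
        if self.state is HelloState.DOWN:
            self.state = HelloState.WAITING

    def nbma_lost(self) -> None:
        # Loss of NBMA connectivity returns the HFSM to Down.
        self.state = HelloState.DOWN

    def hello_received(self, sender_id: bytes,
                       receiver_ids: Iterable[bytes]) -> None:
        # Assumption: a Hello cannot arrive without NBMA connectivity,
        # so it is ignored in the Down State.
        if self.state is HelloState.DOWN:
            return
        self.dcsid = sender_id           # save the DCS's ID (DCSID)
        if self.lsid in receiver_ids:
            self.state = HelloState.BIDIRECTIONAL
        else:
            self.state = HelloState.UNIDIRECTIONAL

    def dead_interval_expired(self, any_hello_heard: bool) -> None:
        # No Hello naming this LS arrived within
        # HelloInterval*DeadFactor seconds: the DCS is stalled.
        if self.state is HelloState.DOWN:
            return
        if any_hello_heard:
            self.state = HelloState.UNIDIRECTIONAL
        else:
            self.state = HelloState.WAITING
            self.dcsid = None            # DCSID leaves the Receiver ID list

    def abnormal_event(self) -> None:
        # For example, a malformed SCSP message was received.
        if self.state is not HelloState.DOWN:
            self.state = HelloState.WAITING
```

        One such object would exist per DCS and per PID/SGID. A caller
        would drive it from its NBMA signalling, its Hello receive path,
        and its timers.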
228     The HFSM starts in the "Down" State and transitions to the "Waiting"
229     State after NBMA level connectivity has been established. Once in
230     the Waiting State, the LS starts sending Hello messages to the DCS.
231     The Hello message includes: a Sender ID which is set to the LS's ID
232     (LSID), zero or more Receiver IDs which identify the DCSs from which
233     the LS has recently heard a Hello message (as described below), and a
234     HelloInterval and DeadFactor which will be described below. At this
235     point, the DCS may or may not already be sending its own Hello
236     messages to the LS.

238     When the LS receives a Hello message from one of its DCSs, the LS
239     checks to see if its LSID is in one of the Receiver ID fields of that
240     message which it just received, and the LS saves the Sender ID from
241     that Hello message. If the LSID is in one of the Receiver ID fields
242     then the LS transitions the HFSM to the Bidirectional Connection
243     State; otherwise it transitions the HFSM into the Unidirectional
244     Connection State. The Sender ID which was saved is the DCS's ID
245     (DCSID). At some point before the next time that the LS sends its
246     own Hello message to the DCS, the LS will check the saved DCSID
247     against a list of Receiver IDs which the LS uses when sending the
248     LS's own Hello messages. If the DCSID is not found in the list of
249     Receiver IDs then it is added to that list before the LS sends its
250     Hello message.

252     Hello messages also contain a HelloInterval and a DeadFactor. The
253     HelloInterval advertises the time (in seconds) between sending of
254     consecutive Hello messages by the server which is sending the
255     "current" Hello message. That is, if the time between reception of
256     Hello messages from a DCS exceeds the HelloInterval advertised by
257     that DCS then the next Hello message is to be considered late by the
258     LS. If the LS does not receive a Hello message, which contains the
259     LS's LSID in one of the Receiver ID fields, within the interval
260     HelloInterval*DeadFactor seconds (where DeadFactor was advertised by
261     the DCS in a previous Hello message) then the LS MUST consider the
262     DCS to be stalled. At that point one of two things will happen:
263     if any Hello messages have been received during the last
264     HelloInterval*DeadFactor seconds then the LS should transition the
265     HFSM for that DCS to the Unidirectional Connection State; otherwise,
266     the LS should transition the HFSM for that DCS to the Waiting State
267     and remove the DCSID from the Receiver ID list.

269     Note that the Hello Protocol is on a per PID/SGID basis. Thus, for
270     example, if there are two servers (one in SG A and the other in SG B)
271     associated with an NBMA address X and another two servers (also one
272     in SG A and the other in SG B) associated with NBMA address Y and
273     there is a suitable point-to-point VC between the NBMA addresses then
274     there are two HFSMs running on each side of the VC (one per
275     PID/SGID).

277     Hello messages contain a list of Receiver IDs instead of a single
278     Receiver ID in order to make use of point to multipoint connections.
279     While there is an HFSM per DCS, an LS MUST send only a single Hello
280     message to its DCSs attached as leaves of a point to multipoint
281     connection. The LS does this by including DCSIDs in the list of
282     Receiver IDs when the LS sends its next Hello message. Only the
283     DCSIDs from non-stalled DCSs from which the LS has heard a Hello
284     message are included.

286     Any abnormal event, such as receiving a malformed SCSP message,
287     causes the HFSM to transition to the Waiting State; however, a loss
288     of NBMA connectivity causes the HFSM to transition to the Down State.
289     Until the HFSM is in the Bidirectional Connection State, if any
290     properly formed SCSP messages other than Hello messages are received
291     then those messages MUST be ignored (this is for the case where, for
292     example, there is a point to multipoint connection involved).

294                       +------------+
295                       |            |
296                  +--->|    DOWN    |
297                  |    |            |
298                  |    +------------+
299                  |          |
300                  ^          |
301                  |          @
302                  |    +------------+
303                  |    |Master/Slave|
304                  |-<--|            |<---+
305                  |    |Negotiation |    |
306                  |    +------------+    |
307                  |          |           |
308                  ^          |           ^
309                  |          @           |
310                  |    +------------+    |
311                  |    |   Cache    |    |
312                  |-<--|            |-->-|
313                  |    | Summarize  |    |
314                  |    +------------+    |
315                  |          |           |
316                  ^          |           ^
317                  |          @           |
318                  |    +------------+    |
319                  |    |   Update   |    |
320                  |-<--|            |-->-|
321                  |    |   Cache    |    |
322                  |    +------------+    |
323                  |          |           |
324                  ^          |           ^
325                  |          @           |
326                  |    +------------+    |
327                  |    |            |    |
328                  +-<--|  Aligned   |-->-+
329                       |            |
330                       +------------+

332            Figure 2: Cache Alignment Finite State Machine

334  2.2 Cache Alignment Protocol

336     "Cache Alignment" (CA) messages are used by an LS to synchronize its
337     cache with the cache of each of its DCSs. That is, CA
338     messages allow a booting LS to synchronize with each of its DCSs. A
339     CA message contains a CA header followed by zero or more Cache State
340     Advertisement Summary records (CSAS records).

342     An LS has a Cache Alignment Finite State Machine (CAFSM) associated
343     (see Figure 2) with each of its DCSs on a per PID/SGID basis, and the
344     CAFSM monitors the state of the cache alignment between the servers.
345     The CAFSM starts in the Down State. The CAFSM is associated with an
346     HFSM, and when that HFSM reaches the Bidirectional State, the CAFSM
347     transitions to the Master/Slave Negotiation State. The Master/Slave
348     Negotiation State causes either the LS or DCS to take on the role of
349     master over the cache alignment process. In a sense, the master
350     server sets the tempo for the cache alignment.

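        The coupling between the two state machines, and the role
        decision taken in the Master/Slave Negotiation State, can be
        sketched as follows. This is a minimal Python illustration, not
        a normative part of SCSP. The function names are hypothetical.
        The comparison rule follows Section 2.2.1, and the sketch
        assumes that IDs are equal-length byte strings compared as
        unsigned big-endian integers.

```python
from enum import Enum
from typing import Optional


class CaState(Enum):
    DOWN = "Down"
    MS_NEGOTIATION = "Master/Slave Negotiation"
    CACHE_SUMMARIZE = "Cache Summarize"
    UPDATE_CACHE = "Update Cache"
    ALIGNED = "Aligned"


def follow_hfsm(ca_state: CaState, hfsm_bidirectional: bool) -> CaState:
    """Apply the HFSM's current condition to the associated CAFSM."""
    if not hfsm_bidirectional:
        # Any time the HFSM is not Bidirectional, the CAFSM is Down.
        return CaState.DOWN
    if ca_state is CaState.DOWN:
        # The HFSM has just reached Bidirectional: start negotiating.
        return CaState.MS_NEGOTIATION
    return ca_state


def negotiate_role(lsid: bytes, sender_id: bytes, m: bool, i: bool,
                   o: bool, csas_count: int) -> Optional[str]:
    """Role the LS takes on receiving a CA message while negotiating.

    Returns "slave", "master", or None when the packet must be ignored.
    """
    if m and i and o and csas_count == 0 and sender_id > lsid:
        return "slave"     # the DCS has the larger ID and is initializing
    if not m and not i and sender_id < lsid:
        return "master"    # the DCS has already accepted the slave role
    return None
```

        In either successful case the CAFSM would move on to the Cache
        Summarize State. The sequence number handling and the CA reply
        that accompany each role are left out of the sketch.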
352     When the LS's CAFSM reaches the Master/Slave Negotiation State, the
353     LS will send a CA message to the DCS associated with the CAFSM. The
354     format of CA messages is described in Section B.2.1. The first CA
355     message which the LS sends includes no CSAS records and a CA header
356     which contains the LSID in the Sender ID field, the DCSID in the
357     Receiver ID field, a CA sequence number, and three bits. These three
358     bits are the M (Master/Slave) bit, the I (Initialization of master)
359     bit, and the O (More) bit. In the first CA message sent by the LS to
360     a particular DCS, the M, O, and I bits are set to one. If the LS
361     does not receive a CA message from the DCS in CAReXmtInterval seconds
362     then it resends the CA message it just sent. The LS continues to do
363     this until the CAFSM transitions to the Cache Summarize State or
364     until the HFSM transitions out of the Bidirectional State. Any time
365     the HFSM transitions out of the Bidirectional State, the CAFSM
366     transitions to the Down State.

368  2.2.1 Master/Slave Negotiation State

370     When the LS receives a CA message from the DCS while in the
371     Master/Slave Negotiation State, the role the LS plays in the exchange
372     depends on packet processing as follows:

374     1) If the CA from the DCS has the M, I, and O bits set to one and there
375        are no CSAS records in the CA message and the Sender ID as specified
376        in the DCS's CA message is larger than the LSID then
377       a) The timer counting down the CAReXmtInterval is stopped.
378       b) The CAFSM corresponding to that DCS transitions to the
379          Cache Summarize State and the LS takes on the role of slave.
380       c) The LS adopts the CA sequence number it received in the CA message
381          as its own CA sequence number.
382       d) The LS sends a CA message to the DCS which is formatted as follows:
383          the M and I bits are set to zero, the Sender ID field is set to the
384          LSID, the Receiver ID field is set to the DCSID, and the
385          CA sequence number is set to the CA sequence number that appeared
386          in the DCS's CA message. If there are CSAS records to be sent
387          (i.e., if the LS's cache is not empty), and if all of them will not
388          fit into this CA message then the O bit is set to one and the initial
389          set of CSAS records is included in the CA message; otherwise the O bit
390          is set to zero and if any CSAS Records need to be sent then those
391          records are included in the CA message.

393     2) If the CA message from the DCS has the M and I bits off and the
394        Sender ID as specified in the DCS's CA message is smaller than
395        the LSID then
396       a) The timer counting down the CAReXmtInterval is stopped.
397       b) The CAFSM corresponding to that DCS transitions to the
398          Cache Summarize State and the LS takes on the role of master.
399       c) The LS must process the received CA message.
400          An explanation of CA message processing is given below.
401       d) The LS sends a CA message to the DCS which is formatted as follows:
402          the M bit is set to one, I bit is set to zero, the Sender ID
403          field is set to the LSID, the Receiver ID field is set to the DCSID,
404          and the LS's current CA sequence number is incremented by one and
405          placed in the CA message. If there are any CSAS records to be sent
406          from the LS to the DCS (i.e., if the LS's cache is not empty) then
407          the O bit is set to one and the initial set of CSAS records is
408          included in the CA message that the LS is sending to the DCS.

410     3) Otherwise, the packet must be ignored.

412  2.2.2 The Cache Summarize State

414     At any given time, the master or slave has at most one outstanding
415     CA message. Once the LS's CAFSM has transitioned to the Cache
416     Summarize State the sequence of exchanges of CA messages occurs as
417     follows.

419     1) If the LS receives a CA message with the M bit set incorrectly
420        (e.g., the M bit is set in the CA of the DCS and the LS is master)
421        or if the I bit is set then the CAFSM transitions back to the
422        Master/Slave Negotiation State.

424     2) If the LS is master and the LS receives a CA message with a
425        CA sequence number which is one less than the LS's current
426        CA sequence number then the message is a duplicate and the message
427        MUST be discarded.

429     3) If the LS is master and the LS receives a CA message with a
430        CA sequence number which is equal to the LS's current CA sequence
431        number then the CA message MUST be processed. An explanation of
432        "CA message processing" is given below. As a result of having
433        received the CA message from the DCS the following will occur:
434       a) The timer counting down the CAReXmtInterval is stopped.
435       b) The LS must process any CSAS records in the received CA message.
436       c) Increment the LS's CA sequence number by one.
437       d) The cache exchange continues as follows:
438          1) If the LS has no more CSAS records to send and the received CA
439             message has the O bit off then the CAFSM transitions to the
440             Update Cache State.
441          2) If the LS has no more CSAS records to send and the received CA
442             message has the O bit on then the LS sends back a CA message
443             (with new CA sequence number) which contains no CSAS records and
444             with the O bit off. Reset the timer counting down the
445             CAReXmtInterval.
446          3) If the LS has more CSAS records to send then the LS sends the
447             next CA message with the LS's next set of CSAS records. If the LS
448             is sending its last set of CSAS records then the O bit is set
449             off otherwise the O bit is set on. Reset the timer counting
450             down the CAReXmtInterval.

452     4) If the LS is slave and the LS receives a CA message with a
453        CA sequence number which is equal to the LS's current
454        CA sequence number then the CA message is a duplicate and the
455        LS MUST resend the CA message which it had just sent to the DCS.

457     5) If the LS is slave and the LS receives a CA message with a
458        CA sequence number which is one more than the LS's current
459        CA sequence number then the message is valid and MUST be
460        processed. An explanation of "CA message processing" is given
461        below. As a result of having received the CA message from the
462        DCS the following will occur:

464       a) The LS must process any CSAS records in the received CA message.
465       b) Set the LS's CA sequence number to the CA sequence number in the CA
466          message.
467       c) The cache exchange continues as follows:
468          1) If the LS had just sent a CA message with the O bit off and the
469             received CA message has the O bit off then the CAFSM transitions
470             to the Update Cache State and the LS sends a CA message with no
471             CSAS records and with the O bit off.
472          2) If the LS still has CSAS records to send then the LS MUST send
473             a CA message with CSAS records in it.
474             a) If the message being sent from the LS to the DCS does not contain
475                the last CSAS records that the LS needs to send then the CA
476                message is sent with the O bit on.
477             b) If the message being sent from the LS to the DCS does contain
478                the last CSAS records that the LS needs to send and
479                the CA message just received from the DCS had the O bit off then
480                the CA message is sent with the O bit off, and the LS transitions
481                the CAFSM to the Update Cache State.
482             c) If the message being sent from the LS to the DCS does contain
483                the last CSAS records that the LS needs to send and
484                the CA message just received from the DCS had the O bit on
485                then the CA message is sent with the O bit off and the alignment
486                process continues.

488 6) If the LS is slave and the LS receives a CA message with a 489 CA sequence number that is neither equal to nor one more than 490 the current LS's CA sequence number then an error has occurred 491 and the CAFSM transitions to the Master/Slave Negotiation State. 493 Note that if the LS was slave during the CA process then the LS upon 494 transitioning the CAFSM to the Update Cache state MUST keep a copy of 495 the last CA message it sent and the LS SHOULD set a timer equal to 496 CAReXmtInterval. If either the timer expires or the LS receives a CSU 497 Solicit (CSUS) message (CSUS messages are described in Section 2.2.3) 498 from the DCS then the LS releases the copy of the CA message. The 499 reason for this is that if the DCS (which is master) loses the last 500 CA message sent by the LS then the DCS will resend its previous CA 501 message with the last CA Sequence number used. If that were to occur 502 the LS would need to resend its last sent CA message as well. 504 2.2.2.1 "CA message processing": 506 The LS makes a list of those cache entries which are more "up to 507 date" in the DCS than the LS's own cache. This list is called the 508 CSA Request List (CRL). See Section 2.4 for a description of what it 509 means for a CSA (Client State Advertisement) record or CSAS record to 510 be more "up to date" than an LS's cache entry. 512 2.2.3 The Update Cache State 514 If the CRL of the associated CAFSM of the LS is empty upon transition 515 into the Update Cache State then the CAFSM immediately transitions 516 into the Aligned State. 518 If the CRL is not empty upon transition into the Update Cache State 519 then the LS solicits the DCS to send the CSA records corresponding to 520 the summaries (i.e., CSAS records) which the LS holds in its CRL. The 521 solicited CSA records will contain the entirety of the cached 522 information held in the DCS's cache for the given cache entry. 
   The LS solicits the relevant CSA records by forming CSU Solicit
   (CSUS) messages from the CRL. See Section B.2.4 for the description
   of the CSUS message format. The LS then sends the CSUS messages to
   the DCS. The DCS responds to the CSUS message by sending to the LS
   one or more CSU Request messages containing the entirety of newer
   cached information identified in the CSUS message. Upon receiving
   the CSU Request the LS will send one or more CSU Replies as
   described in Section 2.3. Note that the LS may have at most one CSUS
   message outstanding at any given time.

   Just before the first CSUS message is sent from an LS to the DCS
   associated with the CAFSM, a timer is set to CSUSReXmtInterval
   seconds. If not all of the CSA records corresponding to the CSAS
   records in the CSUS message have been received by the time that the
   timer expires then a new CSUS message will be created which contains
   all the CSAS records for which no appropriate CSA record has been
   received plus additional CSAS records not covered in the previous
   CSUS message. The new CSUS message is then sent to the DCS. If, at
   some point before the timer expires, all CSA record updates have been
   received for all the CSAS records included in the previously sent
   CSUS message then the timer is stopped. Once the timer is stopped,
   if there are additional CSAS records that were not covered in the
   previous CSUS message but were in the CRL then the timer is reset and
   a new CSUS message is created which contains only those CSAS records
   from the CRL which have not yet been sent to the DCS. This process
   continues until all the CSA records corresponding to the CSAS records
   that were in the CRL have been received by the LS. When the LS has a
   completely updated cache then the LS transitions the CAFSM associated
   with the DCS to the Aligned State.

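   The solicitation loop above can be sketched as follows. This is a
   minimal illustration under stated assumptions, not part of the
   protocol definition: the class CsusSolicitor, its method names, the
   per-message limit batch_size, and the use of a (Cache Key,
   Originator ID) pair to identify a CSAS record are all hypothetical.
   Starting and stopping the CSUSReXmtInterval timer is left to the
   caller.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical identifier for a CSAS record: (Cache Key, Originator ID).
CsasId = Tuple[bytes, bytes]


@dataclass
class CsusSolicitor:
    """Tracks which CRL entries have been solicited from one DCS."""
    crl: List[CsasId]            # entries not yet placed in any CSUS
    batch_size: int = 2          # assumed limit of records per CSUS
    outstanding: List[CsasId] = field(default_factory=list)

    def next_csus(self) -> List[CsasId]:
        """Build the record list for the next CSUS message.

        Called for the first CSUS, when the timer expires, or when the
        previous CSUS has been fully answered. Records that are still
        unanswered are carried over ahead of entries not yet sent.
        """
        pending = self.outstanding + self.crl
        self.outstanding = pending[:self.batch_size]
        self.crl = pending[self.batch_size:]
        return list(self.outstanding)

    def on_csa_received(self, csas_id: CsasId) -> bool:
        """Note a received CSA; return True if the timer can be stopped."""
        if csas_id in self.outstanding:
            self.outstanding.remove(csas_id)
        return not self.outstanding

    def aligned(self) -> bool:
        """True once every CRL entry has been received."""
        return not self.outstanding and not self.crl
```

   Only one CSUS is modeled as outstanding at a time, which matches the
   "at most one CSUS message outstanding" rule; a caller would move the
   CAFSM to the Aligned State once aligned() returns True.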
   If an LS receives a CSUS message or a CA message with a Receiver ID
   which is not the LS's LSID then the message must be discarded and
   ignored. This is necessary since an LS may be a leaf of a point to
   multipoint connection with other servers in the SG.

2.2.4 The Aligned State

   While in the Aligned State, an LS will perform the Cache State Update
   Protocol as described in Section 2.3.

   Note that an LS may receive a CSUS message while in the Aligned State
   and the LS MUST respond to the CSUS message with the appropriate CSU
   Request message in a similar fashion to the method previously
   described in Section 2.2.3.

2.3 Cache State Update Protocol

   "Cache State Update" (CSU) messages are used to dynamically update
   the state of cache entries in servers on a given PID/SGID basis. CSU
   messages contain zero or more "Cache State Advertisement" (CSA)
   records each of which contains its own snapshot of the state of a
   particular cache entry. An LS may send/receive a CSU to/from a DCS
   only when the corresponding CAFSM is in either the Aligned State or
   the Update Cache State.

   There are two types of CSU messages: CSU Requests and CSU Replies.
   See Sections B.2.2 and B.2.3 respectively for message formats. A CSU
   Request message is sent from an LS to one or more DCSs for one of two
   reasons: either the LS has received a CSUS message and MUST respond
   only to the DCS which originated the CSUS message, or the LS has
   become aware of a change of state of a cache entry. An LS becomes
   aware of a change of state of a cache entry either through receiving
   a CSU Request from one of its DCSs or as a result of a change of
   state being observed in a cached entry originated by the LS. In the
   former case, the LS will send a CSU Request to each of its DCSs
   except the DCS from which the LS became aware of the change in state.
   In the latter case, the LS will send a CSU Request to each of its
   DCSs. The change in state of a particular cache entry is noted in a
   CSA record which is then appended to the end of the CSU Request
   message mandatory part. In this way, state changes are propagated
   throughout the SG.

   Examples of such changes in state are as follows:

     1) a server receives a request from a client to add an entry to its
        cache,
     2) a server receives a request from a client to remove an entry from
        its cache,
     3) a cache entry has timed out in the server's cache, has been
        refreshed in the server's cache, or has been administratively
        modified.

   When an LS receives a CSU Request from one of its DCSs, the LS
   acknowledges one or more CSA Records which were contained in the CSU
   Request by sending a CSU Reply. The CSU Reply contains one or more
   CSAS records which correspond to those CSA records which are being
   acknowledged. Thus, for example, if a CSA record is dropped (or
   delayed in processing) by the LS because there are insufficient
   resources to process it then a corresponding CSAS record is not
   included in the CSU Reply to the DCS.

   Note that an LS may send multiple CSU Request messages before
   receiving a CSU Reply acknowledging any of the CSA Records contained
   in the CSU Requests. Note also that a CSU Reply may contain
   acknowledgments for CSA Records from multiple CSU Requests. Thus,
   the terms "request" and "reply" may be a bit confusing.

   Note that a CSA Record contains a CSAS Record followed by
   client/server protocol specific information contained in a cache
   entry (see Section B.2.0.2 for CSAS record format information and
   Section B.2.2.1 for CSA record format information).
   When a CSA record is considered by the LS to represent cached
   information which is more "up to date" (see Section 2.4) than the
   cached information contained within the cache of the LS then two
   things happen: 1) the LS's cache is updated with the more up to date
   information, and 2) the LS sends a CSU Request containing the CSA
   Record to each of its DCSs except the one from which the CSA Record
   arrived. In this way, state changes are propagated within the
   PID/SGID. Of course, at some point, the LS will also acknowledge the
   reception of the CSA Record by sending the appropriate DCS a CSU
   Reply message containing the corresponding CSAS Record.

   When an LS sends a new CSU Request, the LS keeps track of the
   outstanding CSA records in that CSU Request and to which DCSs the LS
   sent the CSU Request. For each DCS to which the CSU Request was
   sent, a timer set to CSUReXmtInterval seconds is started just prior
   to sending the CSU Request. This timer is associated with the CSA
   Records contained in that CSU Request such that if that timer expires
   prior to having all CSA Records acknowledged from that DCS then (and
   only then) a CSU Request is re-sent by the LS to that DCS. However,
   the re-sent CSU Request only contains those CSA Records which have
   not yet been acknowledged. If all CSA Records associated with a
   timer become acknowledged then the timer is stopped. Note that the
   re-sent CSA Records follow the same time-out and retransmit rules as
   if they were new. Retransmission will occur a configured number of
   times for a given CSA Record and if acknowledgment fails to occur
   then an "abnormal event" has occurred, at which point the HFSM
   associated with the DCS is transitioned to the Waiting State.

   A CSA Record instance is said to be on a "DCS retransmit queue" when
   it is associated with the previously mentioned timer.
   Only the most up-to-date CSA Record is permitted to be queued to a
   given DCS retransmit queue. Thus, if a less up-to-date CSA Record is
   queued to the DCS retransmit queue when a newer CSA Record instance
   is about to be queued to that DCS retransmit queue then the older CSA
   Record instance is dequeued and disassociated from its timer
   immediately prior to enqueuing the newer instance of the CSA Record.

   When an LS receives a CSU Reply from one of its DCSs then the LS
   checks each CSAS record in the CSU Reply against the CSAS Record
   portion of the CSA Records which are queued to the DCS retransmit
   queue.

     1) If there exists an exact match between the CSAS record portion
        of the CSA record and a CSAS Record in the CSU Reply then
        that CSA Record is considered to be acknowledged and is thus
        dequeued from the DCS retransmit queue and is
        disassociated from its timer.

     2) If there exists a match between the CSAS record portion
        of the CSA record and a CSAS Record in the CSU Reply except
        for the CSA Sequence number then
        a) If the CSA Record queued to the DCS retransmit queue has a
           CSA Sequence Number which is greater than the
           CSA Sequence Number in the CSAS Record of the CSU Reply then
           the CSAS Record in the CSU Reply is ignored.
        b) If the CSA Record queued to the DCS retransmit queue has a
           CSA Sequence Number which is less than the
           CSA Sequence Number in the CSAS Record of the CSU Reply then
           the CSA Record which is queued to the DCS retransmit queue is
           dequeued and the CSA Record is disassociated from its timer.
           Further, a CSUS Message is sent to that DCS which sent the
           more up-to-date CSAS Record. All normal CSUS processing
           occurs as if the CSUS were sent as part of the CA protocol.

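   The CSU Reply checks in cases 1), 2a) and 2b) above can be sketched
   as follows. This is an illustrative sketch only: the Csas class, the
   function process_csu_reply, and the modeling of a DCS retransmit
   queue as a dictionary keyed by (Cache Key, Originator ID) are
   assumptions made for the example. An "exact match" is approximated
   here as equal key and equal CSA Sequence Number, and stopping the
   associated timer is left to the caller.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Csas:
    """Hypothetical in-memory form of the compared CSAS fields."""
    cache_key: bytes
    originator_id: bytes
    seq: int  # CSA Sequence Number


Key = Tuple[bytes, bytes]


def process_csu_reply(queue: Dict[Key, Csas],
                      reply_records: List[Csas]) -> List[Csas]:
    """Apply a CSU Reply to one DCS retransmit queue (modified in place).

    Returns the CSAS records that should be solicited from the DCS
    with a CSUS message because the DCS holds a newer instance.
    """
    to_solicit = []
    for acked in reply_records:
        key = (acked.cache_key, acked.originator_id)
        queued = queue.get(key)
        if queued is None:
            continue                    # nothing queued for this entry
        if queued.seq == acked.seq:
            del queue[key]              # case 1: acknowledged
        elif queued.seq > acked.seq:
            continue                    # case 2a: stale ack, ignored
        else:
            del queue[key]              # case 2b: DCS has newer data
            to_solicit.append(acked)
    return to_solicit
```

   A caller would place the returned records in a CSUS message to the
   DCS that sent the CSU Reply.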
   When an LS receives a CSU Request message containing a CSA Record
   whose CSA Sequence Number is smaller than the CSA Sequence Number of
   the cached CSA then the LS MUST acknowledge the CSA record in the
   CSU Request but it MUST do so by sending a CSU Reply message
   containing the CSAS Record portion of the CSA Record stored in the
   cache and not the CSAS Record portion of the CSA Record contained in
   the CSU Request.

   An LS responds to CSUS messages from its DCSs by sending CSU Request
   messages containing the appropriate CSA records to the DCS. If an LS
   receives a CSUS message containing a CSAS record for an entry which
   is no longer in its database (e.g., the entry timed out and was
   discarded after the Cache Alignment exchange completed but before the
   entry was requested through a CSUS message), then the LS will respond
   by copying the CSAS Record from the CSUS message into a CSU Request
   message and the LS will set the N bit signifying that this record is
   a NULL record since the cache entry no longer exists in the LS's
   cache. Note that in this case, the "CSA Record" included in the CSU
   Request to signify the NULL cache entry is literally only a CSAS
   Record since no client/server protocol specific information exists
   for the cache entry.

   If an LS receives a CSA Record in a CSU Request from a DCS for which
   the LS has an identical CSA record posted to the corresponding DCS's
   DCS retransmit queue then the CSA Record on the DCS retransmit queue
   is considered to be implicitly acknowledged. Thus, the CSA Record is
   dequeued from the DCS retransmit queue and is disassociated from its
   timer. The CSA Record sent by the DCS MUST still be acknowledged by
   the LS in a CSU Reply, however.
   This is useful in the case of point to multipoint connections where
   the rule that "when an LS receives a CSA record from a DCS, that LS
   floods the CSA Record to every DCS except the DCS from which it was
   received" might be broken.

   If an LS receives a CSU with a Receiver ID which is not equal to the
   LSID and is not set to all 0xFFs then the CSU must be discarded and
   ignored. This is necessary since the LS may be a leaf of a point to
   multipoint connection with other servers in the LS's SG.

   An LS MAY send a CSU Request to the all 0xFFs Receiver ID when the LS
   is a root of a point to multipoint connection with a set of its DCSs.
   If an LS receives a CSU Request with the all 0xFFs Receiver ID then
   it MUST use the Sender ID in the CSU Request as the Receiver ID of
   the CSU Reply (i.e., it MUST unicast its response to the sender of
   the request) when responding. If the LS wishes to send a CSU Request
   to the all 0xFFs Receiver ID then it MUST create a time-out and
   retransmit timer for each of the DCSs which are leaves of the point
   to multipoint connection prior to sending the CSU Request. If in
   this case, the time-out and retransmit timer expires for a given DCS
   prior to acknowledgment of a given CSA Record then the LS MUST use
   the specific DCSID as the Receiver ID rather than the all 0xFFs
   Receiver ID. Similarly, if it is necessary to re-send a CSA Record
   then the LS MUST specify the specific DCSID as the Receiver ID rather
   than the all 0xFFs Receiver ID.

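   The Receiver ID rules above can be sketched as follows. The function
   names are hypothetical, and the sketch assumes that the all 0xFFs
   Receiver ID has the same length as the DCSID it replaces, which the
   text above does not state.

```python
def is_all_ones(receiver_id: bytes) -> bool:
    """True if every octet of the Receiver ID is 0xFF."""
    return len(receiver_id) > 0 and all(o == 0xFF for o in receiver_id)


def accept_csu(receiver_id: bytes, lsid: bytes) -> bool:
    """A CSU is kept only if addressed to this LS or to all 0xFFs."""
    return receiver_id == lsid or is_all_ones(receiver_id)


def csu_request_receiver_id(dcsid: bytes, multipoint_root: bool,
                            retransmission: bool) -> bytes:
    """Choose the Receiver ID for a CSU Request sent toward one DCS.

    The all 0xFFs ID is only used for a first transmission from the
    root of a point to multipoint connection; a retransmission is
    always addressed to the specific DCSID.
    """
    if multipoint_root and not retransmission:
        return b"\xff" * len(dcsid)  # assumed: same length as the DCSID
    return dcsid


def csu_reply_receiver_id(request_sender_id: bytes) -> bytes:
    """A CSU Reply is unicast to the sender of the CSU Request."""
    return request_sender_id
```

   Use of the all 0xFFs Receiver ID is a MAY; an implementation that
   never uses it would simply always return the DCSID.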
   Note that if a set of servers are in a full mesh of point to
   multipoint connections, and one server of that mesh sends a CSU
   Request into that full mesh, and the sending server sends the CSA
   Records in the CSU Request to the all 0xFFs Receiver ID then it would
   not be necessary for every other server in the mesh to source their
   own CSU Request containing those CSA Records into the mesh in order
   to properly flood the CSA Records. This is because every server in
   the mesh would have heard the CSU Request and would have processed
   the included CSA Records as appropriate. Thus, a server in a full
   mesh could consider the mesh to be a single logical port and so the
   rule that "when an LS receives a CSA record from a DCS, that LS
   floods the CSA Record to every DCS except the DCS from which it was
   received" is not broken. A receiving server in the full mesh would
   still need to acknowledge the CSA records with CSU Reply messages
   which contain the LSID of the replying server as the Sender ID and
   the ID of the server which sent the CSU Request as the Receiver ID
   field. In the time out and retransmit case, the Receiver ID of the
   CSU Request would be set to the specific DCSID which did not
   acknowledge the CSA Record (as opposed to the all 0xFFs Receiver ID).
   Since a full mesh emulates a broadcast medium for the servers
   attached to the full mesh, use of SCSP on a broadcast medium might
   use this technique as well. Further discussion of this use of a full
   mesh or use of a broadcast medium is left to the client/server
   protocol specific documents.

2.4 The meaning of "More Up To Date"/"Newness"

   During the cache alignment process and during normal CSU processing,
   a CSAS Record is compared against the contents of an LS's cache entry
   to decide whether the information contained in the record is more "up
   to date" than the corresponding cache entry of the LS.

   There are three pieces of information which are used in determining
   whether a record contains information which is more "up to date" than
   the information contained in the cache entry of an LS which is
   processing the record: 1) the Cache Key, 2) the Originator which is
   described by an Originator ID (OID), and 3) the CSA Sequence number.
   See Section B.2.0.2 for more information on these fields.

   Given these three pieces of information, a CSAS record (be it part of
   a CSA Record or be it stand-alone) is considered to be more "up to
   date" than the information contained in the cache of an LS if all of
   the following are true:
     1) The Cache Key in the CSAS Record matches the stored Cache Key
        in the LS's cache entry,
     2) The OID in the CSAS Record matches the stored OID
        in the LS's cache entry,
     3) The CSA Sequence Number in the CSAS Record is greater than
        the CSA Sequence Number in the LS's cache entry.

Discussion and conclusions

   While the above text is couched in terms of synchronizing the
   knowledge of the state of a client within the cache of servers
   contained in an SG, this solution generalizes easily to any number of
   database synchronization problems (e.g., LECS synchronization).

   SCSP defines a generic flooding protocol. There are a number of
   related issues relative to cache maintenance and topology maintenance
   which are more appropriately defined in the client/server protocol
   specific documents; for example, it might be desirable to define a
   generic cache entry time-out mechanism for a given protocol or to
   advertise adjacency information between servers so that one could
   obtain a topo-map of the servers in an SG. When mechanisms like these
   are desirable, they will be defined in the client/server protocol
   specific documents.

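   As an illustration of the three-part "more up to date" test given in
   Section 2.4, a minimal sketch follows. The Csas class and the
   function name are hypothetical, and the CSA Sequence Number is
   modeled as a plain signed integer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Csas:
    """Hypothetical in-memory form of the fields used by Section 2.4."""
    cache_key: bytes
    originator_id: bytes
    seq: int  # CSA Sequence Number (signed)


def is_more_up_to_date(record: Csas, cached: Csas) -> bool:
    """True only if all three conditions of Section 2.4 hold."""
    return (record.cache_key == cached.cache_key
            and record.originator_id == cached.originator_id
            and record.seq > cached.seq)
```

   A record for a different Cache Key or OID is never "more up to date"
   than the cached entry under this test, whatever its sequence number.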
Appendix A: Terminology and Definitions

   CA Message - Cache Alignment Message
     These messages allow an LS to synchronize its entire cache with
     the cache of one of its DCSs.

   CAFSM - Cache Alignment Finite State Machine
     The CAFSM monitors the state of the cache alignment between an LS
     and a particular DCS. There exists one CAFSM per DCS as seen from
     an LS.

   CSA Record - Cache State Advertisement Record
     A CSA is a record within a CSU message which identifies an update
     to the status of a "particular" cache entry.

   CSAS Record - Cache State Advertisement Summary Record
     A CSAS contains a summary of the information in a CSA. A server
     will send CSAS records describing its cache entries to another
     server during the cache alignment process. CSAS records are also
     included in CSUS messages when an LS wants to request the entire
     CSA from the DCS. The LS is requesting the CSA from the DCS
     because the LS believes that the DCS has a more recent view of the
     state of the cache entry in question.

   CSU Message - Cache State Update message
     This is a message sent from an LS to its DCSs when the LS becomes
     aware of a change in state of a cache entry.

   CSUS Message - Cache State Update Solicit Message
     This message is sent by an LS to its DCS after the LS and DCS have
     exchanged CA messages. The CSUS message contains one or more CSAS
     records which represent solicitations for entire CSA records (as
     opposed to just the summary information held in the CSAS).

   DCS - Directly Connected Server
     The DCS is a server which is directly connected to the LS; e.g.,
     there exists a VC between the LS and DCS. This term, along with the
     terms LS and RS, is used to give a frame of reference when talking
     about servers and their synchronization. Unless explicitly stated
     to the contrary, there is no implied difference in functionality
     between a DCS, LS, and RS.

   HFSM - Hello Finite State Machine
     An LS has an HFSM associated with each of its DCSs. The HFSM
     monitors the state of the connectivity between the LS and a
     particular DCS.

   LS - Local Server
     The LS is the server under scrutiny; i.e., all statements are made
     from the perspective of the LS. This term, along with the terms
     DCS and RS, is used to give a frame of reference when talking about
     servers and their synchronization. Unless explicitly stated to the
     contrary, there is no implied difference in functionality between a
     DCS, LS, and RS.

   LSID - Local Server ID
     The LSID is a unique token that identifies an LS. This value might
     be taken from the protocol address of the LS.

   PID - Protocol ID
     This field contains an identifier which identifies the
     client/server protocol which is making use of SCSP for the given
     message. The assignment of Protocol IDs for this field is given
     over to IANA as described in Section C.

   RS - Remote Server
     From the perspective of an LS, an RS is a server, separate from the
     LS, which is not directly connected to the LS (i.e., an RS is
     always two or more hops away from an LS whereas a DCS is always one
     hop away from an LS). Unless otherwise stated an RS refers to a
     server in the SG. This term, along with the terms LS and DCS, is
     used to give a frame of reference when talking about servers and
     their synchronization. Unless explicitly stated to the contrary,
     there is no implied difference in functionality between a DCS, LS,
     and RS.

   SG - Server Group
     The SCSP synchronizes caches (or a portion of the caches) of a set
     of server entities which are bound to an SG through some means
     (e.g., all servers belonging to a Logical IP Subnet (LIS)[1]).
     Thus an SG is just a grouping of servers around some commonality.

   SGID - Server Group ID
     This ID is a 16 bit identification field that uniquely identifies
     the instance of a given client/server protocol for which the
     servers of the SG are being synchronized. This implies that
     multiple instances of the same protocol may be in operation at the
     same time and have their servers synchronized independently of each
     other.

Appendix B: SCSP Message Formats

   This section of the appendix includes the message formats for SCSP.
   SCSP messages are LLC/SNAP encapsulated with LLC=0xAA-AA-03,
   OUI=0x00-00-5e, and PID=0x00-05.

   SCSP has 3 parts to every packet: the fixed part, the mandatory part,
   and the extensions part. The fixed part of the message exists in
   every packet and is shown below. The mandatory part is specific to
   the particular message type (i.e., CA, CSU Request/Reply, Hello,
   CSUS), and it includes (among other packet elements) a Mandatory
   Common Part and zero or more records each of which contains
   information pertinent to the state of a particular cache entry
   (except in the case of a Hello message) whose information is being
   synchronized within an SG. The extensions part contains the set of
   extensions for the SCSP message.

   In the following message formats, the fields marked as "unused" MUST
   be set to zero upon transmission of such a message and ignored upon
   receipt of such a message.

B.1 Fixed Part

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Version    |   Type Code   |          Packet Size          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |      Start Of Extensions      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Version
     This is the version of SCSP being used. The current version is 1.

   Type Code
     This is the code for the message type (e.g., Hello (5), CSU
     Request (2), CSU Reply (3), CSUS (4), CA (1)).

   Packet Size
     The total length of the SCSP packet, in octets (excluding link
     layer and/or other protocol encapsulation).

   Checksum
     The standard IP checksum over the entire SCSP packet starting at
     the fixed header. If the packet is an odd number of bytes in
     length then this calculation is performed as if a byte set to 0x00
     is appended to the end of the packet.

   Start Of Extensions
     This field is coded as zero when no extensions are present in the
     message. If extensions are present then this field will be coded
     with the offset from the top of the fixed header to the beginning
     of the first extension.

B.2.0 Mandatory Part

   The mandatory part of the SCSP packet contains the operation specific
   information for a given message type (e.g., SCSP Cache State Update
   Request/Reply, etc.), and it includes (among other packet elements) a
   Mandatory Common Part (described in Section B.2.0.1) and zero or more
   records each of which contains information pertinent to the state of
   a particular cache entry (except in the case of a Hello message)
   whose information is being synchronized within an SG. These records
   may, depending on the message type, be either Cache State
   Advertisement Summary (CSAS) Records (described in Section B.2.0.2)
   or Cache State Advertisement (CSA) Records (described in Section
   B.2.2.1). CSA Records contain a summary of a cache entry's
   information (i.e., a CSAS Record) plus some additional client/server
   protocol specific information. The mandatory common part format and
   CSAS Record format are shown immediately below, prior to showing
   their use in SCSP messages, in order to prevent replication within
   the message descriptions.

B.2.0.1 Mandatory Common Part

   Sections B.2.1 through B.2.5 have a substantial overlap in format.
   This overlapping format is called the mandatory common part and its
   format is shown below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Protocol ID          |        Server Group ID        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            unused             |             Flags             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Sender ID Len | Recvr ID Len  |       Number of Records       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Sender ID (variable length)                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Receiver ID (variable length)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Protocol ID
     This field contains an identifier which identifies the
     client/server protocol which is making use of SCSP for the given
     message. The assignment of Protocol IDs for this field is given
     over to IANA as described in Section C. Protocols with current
     documents have the following defined values:
       1 - ATMARP
       2 - NHRP
       3 - MARS
       4 - DHCP
       5 - LNNI

   Server Group ID
     This ID uniquely identifies the instance of a given
     client/server protocol for which servers are being synchronized.

   Flags
     The Flags field is message specific, and its use will be described
     in the specific message format sections below.

   Sender ID Len
     This field holds the length in octets of the Sender ID.

   Recvr ID Len
     This field holds the length in octets of the Receiver ID.

   Number of Records
     This field contains the number of additional records associated
     with the given message. The exact format of these records is
     specific to the message and will be described for each message type
     in the sections below.

   Sender ID
     This is an identifier assigned to the server which is sending the
     given message. One possible assignment might be the protocol
     address of the sending server.

   Receiver ID
     This is an identifier assigned to the server which is to receive
     the given message. One possible assignment might be the protocol
     address of the server which is to receive the given message.

B.2.0.2 Cache State Advertisement Summary Record (CSAS record)

   CSAS records contain a summary of information contained in a cache
   entry of a given client/server database which is being synchronized
   through the use of SCSP. The summary includes enough information for
   SCSP to look into the client/server database for the appropriate
   database cache entry and then compare the "newness" of the summary
   against the "newness" of the cached entry.

   Note that CSAS records do not contain a Server Group ID (SGID) nor do
   they contain a Protocol ID. These IDs are necessary to identify
   the protocol, and the instance of that protocol, to which the
   summary is applicable. These IDs are present in the mandatory common
   part of each message.

   Note also that the values of the Hop Count and Record Length fields
   of a CSAS Record are dependent on whether the CSAS record exists as a
   "stand-alone" record or whether the CSAS record is "embedded" in a
   CSA Record. This is further described below.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Hop Count           |         Record Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Cache Key Len |  Orig ID Len  |N|           unused            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      CSA Sequence Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Cache Key ...                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Originator ID ...                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Hop Count
     This field represents the number of hops that the record may take
     before being dropped. Thus, at each server that the record
     traverses, the Hop Count is decremented. This field is set to 1
     when the CSAS record is a "stand-alone" record (i.e., it is not
     embedded within a CSA record) since summaries do not go beyond one
     hop during the cache alignment process. If a CSAS record is
     "embedded" within a CSA record then the Hop Count is set to an
     administratively defined value which is almost certainly greater
     than or equal to the cardinality of the SG minus one. Note
     that an exception to the previous rule occurs when the CSA Record
     is carried within a CSU Request which was sent in response to a
     solicitation (i.e., in response to a CSAS Record which was sent in
     a CSUS message); in which case, the Hop Count SHOULD be set to 1.

   Record Length
     If the CSAS record is a "stand-alone" record then this value is
     12+"Cache Key Len"+"Orig ID Len" in bytes; otherwise, this value
     is set to 12+"Cache Key Len"+"Orig ID Len"+ sizeof("Client/Server
     Protocol Specific Part for cache entry"). The size of the
     Client/Server Protocol Specific Part may be obtained from the
     client/server protocol specific document for the given Protocol ID.

   Cache Key Len
     Length of the Cache Key field in bytes.

   Orig ID Len
     Length of the Originator ID field in bytes.

   N
     The "N" bit signifies that this CSAS Record is actually a Null
     record. This bit is only used in a CSAS Record contained in a CSU
     Request/Reply which is sent in response to a CSUS message.
     It is possible that an LS may receive a solicitation for a CSA
     record when the cache entry represented by the solicited CSA Record
     no longer exists in the LS's cache (see Section 2.3 for details).
     In this case, the LS copies the CSAS Record directly from the CSUS
     message into the CSU Request, and the LS sets the N bit signifying
     that the cache entry does not exist any longer. The DCS which
     solicited the CSA record which no longer exists will still respond
     with a CSU Reply. This bit is usually set to zero.

   CSA Sequence Number
     This field contains a sequence number that identifies the "newness"
     of a CSA record instance being summarized. A "larger" sequence
     number means a more recent advertisement. Thus, if the state of
     part (or all) of a cache entry needs to be updated then the CSA
     record advertising the new state MUST contain a CSA Sequence Number
     which is larger than the one corresponding to the previous
     advertisement. This number is assigned by the originator of the
     CSA record. The CSA Sequence Number may be assigned by the
     originating server or by the client which caused its server to
     advertise its existence.

     The CSA Sequence Number is a signed 32 bit number. Within the CSA
     Sequence Number space, the number -2^31 (0x80000000) is reserved.
     Thus, the usable portion of the CSA Sequence Number space for a
     given Cache Key is between the numbers -2^31+1 (0x80000001) and
     2^31-1 (0x7fffffff). An LS uses -2^31+1 the first time it
     originates a CSA Record for a cache entry that it created. Each
     time the cache entry is modified in some manner and when that
     modification needs to be synchronized with the other servers in the
     SG, the LS increments the CSA Sequence number associated with the
     given Cache Key and uses that new CSA Sequence Number when
     advertising the update.
If it is ever the case that a given CSA 1133 Sequence Number has reached 2^31-2 and the associated cache entry 1134 has been modified such that an update must be sent to the rest of 1135 the servers in the SG then the given cache entry MUST first be 1136 purged from the SG by the LS by sending a CSA Record which causes 1137 the cache entry to be removed from other servers and this CSA 1138 Record carries a CSA Sequence Number of 2^31-1. The exact packet 1139 format and mechanism by which a cache entry is purged is defined in 1140 the appropriate protocol specific document. After the purging CSA 1141 Record has been acknowledged by each DCS, an LS will then send a 1142 new CSA Record carrying the updated information, and this new CSA 1143 Record will carry a CSA Sequence Number of -2^31+1. 1145 After a restart occurs and after the restarting LS's CAFSM has 1146 achieved the Aligned state, if an update to an existing cache entry 1147 needs to be synchronized or a new cache entry needs to be 1148 synchronized then the ensuing CSA Record MUST contain a CSA 1149 Sequence Number which is unique within the SG for the given OID and 1150 Cache Key. The RECOMMENDED method of obtaining this number (unless 1151 explicitly stated to the contrary in the protocol specific 1152 document) is to set the CSA Sequence Number in the CSA Record to 1153 the CSA Sequence Number associated with the existing cache entry 1154 (if an out of date cache entry already exists and zero if not) plus 1155 a configured constant. Note that the protocol specific document 1156 may require that all cache entries containing the OID of the 1157 restarting LS be purged prior to updating the cache entries; in 1158 this case, the updating CSA Record will still contain a CSA 1159 Sequence Number set to the CSA Sequence Number associated with the 1160 previously existing cache entry plus a configured constant. 
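The sequence number rules above can be sketched in Python as follows. This is a minimal illustration, not part of the protocol definition: the function names (plan_update, seq_after_restart, encode_seq) are hypothetical, and the on-the-wire encoding is assumed to be a 32-bit two's complement value in network byte order, which matches the hexadecimal values quoted above. Waiting for each DCS to acknowledge the purging CSA Record before sending the new one is left to the caller.

```python
import struct

SEQ_RESERVED = -2**31      # 0x80000000, reserved and never used
SEQ_FIRST = -2**31 + 1     # 0x80000001, first CSA Record for a cache entry
SEQ_LAST = 2**31 - 1       # 0x7fffffff, carried by the purging CSA Record


def plan_update(current):
    """Return the (kind, sequence number) records needed for one update.

    `current` is the CSA Sequence Number last advertised for this Cache
    Key, or None if the cache entry has never been advertised.
    """
    if current is None:
        return [("update", SEQ_FIRST)]
    if current >= SEQ_LAST - 1:
        # The number has reached 2^31-2: purge with 2^31-1, then restart.
        # The caller must wait for the purge to be acknowledged by each DCS.
        return [("purge", SEQ_LAST), ("update", SEQ_FIRST)]
    return [("update", current + 1)]


def seq_after_restart(existing, constant):
    """RECOMMENDED restart value: existing number (or zero) plus a constant.

    Behaviour when the sum leaves the usable range is not covered here.
    """
    base = existing if existing is not None else 0
    return base + constant


def encode_seq(seq):
    """Encode a usable sequence number (assumed network byte order)."""
    if not SEQ_FIRST <= seq <= SEQ_LAST:
        raise ValueError("CSA Sequence Number outside the usable range")
    return struct.pack("!i", seq)
```

For example, plan_update(2**31 - 2) yields a purge record carrying 2^31-1 followed by an update record carrying -2^31+1.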
1162 Cache Key 1163 This is a database lookup key that uniquely identifies a piece of 1164 data which the originator of a CSA Record wishes to synchronize 1165 with its peers for a given "Protocol ID/Server Group ID" pair. 1166 This key will generally be a small opaque byte string which SCSP 1167 will associate with a given piece of data in a cache. Thus, for 1168 example, an originator might assign a particular 4 byte string to 1169 the binding of an IP address with that of an ATM address. 1170 Generally speaking, the originating server of a CSA record is 1171 responsible for generating a Cache Key for every element of data 1172 that the given server originates and which the server wishes to 1173 synchronize with its peers in the SG. 1175 Originator ID 1176 This field contains an ID administratively assigned to the server 1177 which is the originator of CSA Records. 1179 B.2.1 Cache Alignment (CA) 1181 The Cache Alignment (CA) message allows an LS to synchronize its 1182 entire cache with that of the cache of its DCSs within a server 1183 group. The CA message type code is 1. The CA message mandatory part 1184 format is as follows: 1186 0 1 2 3 1187 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1188 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1189 | CA Sequence Number | 1190 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1191 | Mandatory Common Part | 1192 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1193 | CSAS Record | 1194 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1195 ....... 1196 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1197 | CSAS Record | 1198 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1200 CA Sequence Number 1201 A value which provides a unique identifier to aid in the sequencing 1202 of the cache alignment process. A "larger" sequence number means a 1203 more recent CA message.
The slave server always copies the 1204 sequence number from the master server's previous CA message into 1205 its current CA message which it is sending and the slave 1206 acknowledges the master's CA message. Since the initial CA process 1207 is lock-step, if the slave does not receive the same sequence 1208 number which it previously received then the information in the 1209 slave's previous CA message is implicitly acknowledged. Note that 1210 there is a separate CA Sequence Number space associated with each 1211 CAFSM. 1213 Whenever it is necessary to (re)start cache alignment and the CAFSM 1214 enters the Master/Slave Negotiation state, the CA Sequence Number 1215 should be set to a value not previously seen by the DCS. One 1216 possible scheme is to use the machine's time of day counter. 1218 Mandatory Common Part 1219 The mandatory common part is described in detail in Section 1220 B.2.0.1. There are two fields in the mandatory common part whose 1221 codings are specific to a given message type. These fields are the 1222 "Number of Records" field and the "Flags" field. 1224 Number of Records 1225 The Number of Records field of the mandatory common part for the 1226 CA message gives the number of CSAS Records appended to the CA 1227 message mandatory part. 1229 Flags 1230 The Flags field of the mandatory common part for the CA message 1231 has the following format: 1233 0 1 1234 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 1235 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1236 |M|I|O| unused | 1237 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1239 M 1240 This bit is part of the negotiation process for the cache 1241 alignment. When this bit is set then the sender of the CA 1242 message is indicating that it wishes to lead the alignment 1243 process. This bit is the "Master/Slave bit". 1245 I 1246 When set, this bit indicates that the sender of the CA message 1247 believes that it is in a state where it is negotiating for the 1248 status of master or slave.
This bit is the "Initialization 1249 bit". 1251 O 1252 This bit indicates that the sender of the CA message has more 1253 CSAS records to send. This implies that the cache alignment 1254 process must continue. This bit is the "mOre bit" despite its 1255 dubious name. 1257 All other fields of the mandatory common part are coded as 1258 described in Section B.2.0.1. 1260 CSAS record 1261 The CA message appends CSAS records to the end of its mandatory 1262 part. These CSAS records are NOT embedded in CSA records. See 1263 Section B.2.0.2 for details on CSAS records. 1265 B.2.2 Cache State Update Request (CSU Request) 1267 The Cache State Update Request (CSU Request) message is used to 1268 update the state of cache entries in servers which are directly 1269 connected to the server sending the message. A CSU Request message 1270 is sent from one server (the LS) to a directly connected server (the 1271 DCS) when the LS observes changes in the state of one or more cache 1272 entries. An LS observes such a change in state by either receiving a 1273 CSU Request which causes an update to the LS's database or by 1274 observing a change of state of a cached entry originated by the LS. 1275 The change in state of a cache entry is noted in a CSU Request by 1276 appending a "Cache State Advertisement" (CSA) record to the end of 1277 the mandatory part of the CSU Request as shown below. 1279 The CSU Request message type code is 2. The CSU Request message 1280 mandatory part format is as follows: 1282 0 1 2 3 1283 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1284 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1285 | Mandatory Common Part | 1286 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1287 | CSA Record | 1288 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1289 .......
1290 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1291 | CSA Record | 1292 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1294 Mandatory Common Part 1295 The mandatory common part is described in detail in Section 1296 B.2.0.1. There are two fields in the mandatory common part whose 1297 codings are specific to a given message type. These fields are the 1298 "Number of Records" field and the "Flags" field. 1300 Number of Records 1301 The Number of Records field of the mandatory common part for the 1302 CSU Request message gives the number of CSA Records appended to 1303 the CSU Request message mandatory part. 1305 Flags 1306 Currently, there are no flags defined for the Flags field of the 1307 mandatory common part for the CSU Request message. 1309 All other fields of the mandatory common part are coded as 1310 described in Section B.2.0.1. 1312 CSA Record 1313 See Section B.2.2.1. 1315 B.2.2.1 Cache State Advertisement Record (CSA record) 1317 CSA records contain the information necessary to relate the current 1318 state of a cache entry in an SG to the servers being synchronized. 1319 CSA records contain a CSAS Record header and a client/server protocol 1320 specific part. The CSAS Record includes enough information for SCSP 1321 to look into the client/server database for the appropriate database 1322 cache entry and then compare the "newness" of the summary against the 1323 "newness" of the cached entry. If the information contained in the 1324 CSA is newer than the cached entry of the receiving server then 1325 the cached entry is updated accordingly with the contents of the CSA 1326 Record. The client/server protocol specific part of the CSA Record 1327 is documented separately for each such protocol. Examples of the 1328 protocol specific parts for NHRP and ATMARP are shown in [8] and [9] 1329 respectively. 1331 The amount of information carried by a specific CSA record may exceed 1332 the size of a link layer PDU.
Hence, such CSA records MUST be 1333 fragmented across a number of CSU Request messages. The method by 1334 which this is done is client/server protocol specific and is 1335 documented in the appropriate protocol specific document. 1337 The content of a CSA record is as follows: 1339 0 1 2 3 1340 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1341 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1342 | CSAS Record | 1343 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1344 | Client/Server Protocol Specific Part for cache entry ... | 1345 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1347 CSAS Record 1348 See Section B.2.0.2 for rules and format for filling out a CSAS 1349 Record when it is "embedded" in a CSA Record. 1351 Client/Server Protocol Specific Part for cache entry 1352 This field contains the fields which are specific to the protocol 1353 specific portion of SCSP processing. The particular set of fields 1354 is defined in separate documents for each protocol user of SCSP. 1355 The Protocol ID, which identifies which protocol is using SCSP in 1356 the given packet, is located in the mandatory part of the message. 1358 B.2.3 Cache State Update Reply (CSU Reply) 1360 The Cache State Update Reply (CSU Reply) message is sent from a DCS 1361 to an LS to acknowledge one or more CSA records which were received 1362 in a CSU Request. Reception of a CSA record in a CSU Request is 1363 acknowledged by including a CSAS record in the CSU Reply which 1364 corresponds to the CSA record being acknowledged. The CSU Reply 1365 message is the same in format as the CSU Request message except for 1366 the following: the type code is 3, only CSAS Records (rather than CSA 1367 records) are returned, and only those CSAS Records for which CSA 1368 Records are being acknowledged are returned.
This implies that a 1369 given LS sending a CSU Request may not receive an acknowledgment in a 1370 single CSU Reply for all the CSA Records included in the CSU Request. 1372 B.2.4 Cache State Update Solicit Message (CSUS message) 1373 This message allows one server (LS) to solicit the entirety of CSA 1374 record data stored in the cache of a directly connected server (DCS). 1375 The DCS responds with CSU Request messages containing the appropriate 1376 CSA records. The CSUS message type code is 4. The CSUS message 1377 format is the same as that of the CSU Reply message. CSUS messages 1378 solicit CSU Requests from only one server (the one identified by the 1379 Receiver ID in the Mandatory Part of the message). 1381 B.2.5 Hello: 1383 The Hello message is used to check connectivity between the sending 1384 server (the LS) and one of its directly connected neighbor servers 1385 (the DCSs). The Hello message type code is 5. The Hello message 1386 mandatory part format is as follows: 1388 0 1 2 3 1389 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1390 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1391 | HelloInterval | DeadFactor | 1392 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1393 | unused | Family ID | 1394 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1395 | Mandatory Common Part | 1396 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1397 | Additional Receiver ID Record | 1398 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1399 ......... 1400 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1401 | Additional Receiver ID Record | 1402 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1404 HelloInterval 1405 The hello interval advertises the time between sending of 1406 consecutive Hello Messages. 
If the LS does not receive a Hello 1407 message from the DCS (which contains the LSID as a Receiver ID) 1408 within the HelloInterval advertised by the DCS then the DCS's Hello 1409 is considered to be late. Also, the LS MUST send its own Hello 1410 message to a DCS within the HelloInterval which it advertised to 1411 the DCS in the LS's previous Hello message to that DCS (otherwise 1412 the DCS would consider the LS's Hello to be late). 1414 DeadFactor 1415 This is a multiplier to the HelloInterval. If an LS does not 1416 receive a Hello message which contains the LS's LSID as a Receiver 1417 ID within the interval HelloInterval*DeadFactor from a given DCS, 1418 which advertised the HelloInterval and DeadFactor in a previous 1419 Hello message, then the LS MUST consider the DCS to be stalled; at 1420 this point, one of two things MUST happen: 1) if the LS has 1421 received any Hello messages from the DCS during this time then the 1422 LS transitions the corresponding HFSM to the Unidirectional State; 1423 otherwise, 2) the LS transitions the corresponding HFSM to the 1424 Waiting State. 1426 Family ID 1427 This is an opaque bit string which is used to refer to an aggregate 1428 of Protocol ID/SGID pairs. Only a single HFSM is run for all 1429 Protocol ID/SGID pairs assigned to a Family ID. Thus, there is a 1430 one to many mapping between the single HFSM and the CAFSMs 1431 corresponding to each of the Protocol ID/SGID pairs. This might 1432 have the net effect of substantially reducing HFSM maintenance 1433 traffic. See the protocol specific SCSP documents for further 1434 details. 1436 Mandatory Common Part 1437 The mandatory common part is described in detail in Section 1438 B.2.0.1. There are two fields in the mandatory common part whose 1439 codings are specific to a given message type. These fields are the 1440 "Number of Records" field and the "Flags" field. 
1442 Number of Records 1443 The Number of Records field of the mandatory common part for the 1444 Hello message contains the number of "Additional Receiver ID" 1445 records which are included in the Hello. Additional Receiver ID 1446 records contain a length field and a Receiver ID field. Note 1447 that the count in "Number of Records" does NOT include the 1448 Receiver ID which is included in the Mandatory Common Part. 1450 Flags 1451 Currently, there are no flags defined for the Flags field of the 1452 mandatory common part for the Hello message. 1454 All other fields of the mandatory common part are coded as 1455 described in Section B.2.0.1. 1457 Additional Receiver ID Record 1458 This record contains a length field followed by a Receiver ID. 1459 Since it is conceivable that the length of a given Receiver ID may 1460 vary even within an SG, each additional Receiver ID heard (beyond 1461 the first one) will have both its length in bytes and value encoded 1462 in an "Additional Receiver ID Record". Receiver IDs are IDs of a 1463 DCS from which the LS has heard a recent Hello (i.e., within 1464 DeadFactor*HelloInterval as advertised by the DCS in a previous 1465 Hello message). 1467 The format for this record is as follows: 1469 0 1 2 3 1470 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1471 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1472 | Rec ID Len | Receiver ID | 1473 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1475 If the LS has not heard from any DCS then the LS sets the Hello 1476 message fields as follows: Recvr ID Len is set to zero and no storage 1477 is allocated for the Receiver ID in the Common Mandatory Part, 1478 "Number of Records" is set to zero, and no storage is allocated for 1479 "Additional Receiver ID Records". 
1481 If the LS has heard from exactly one DCS then the LS sets the Hello 1482 message fields as follows: the Receiver ID of the DCS which was heard 1483 and the length of that Receiver ID are encoded in the Common 1484 Mandatory Part, "Number of Records" is set to zero, and no storage is 1485 allocated for "Additional Receiver ID Records". 1487 If the LS has heard from two or more DCSs then the LS sets the Hello 1488 message fields as follows: the Receiver ID of the first DCS which was 1489 heard and the length of that Receiver ID are encoded in the Common 1490 Mandatory Part, "Number of Records" is set to the number of 1491 "Additional" DCSs heard, and for each additional DCS an "Additional 1492 Receiver ID Record" is formed and appended to the end of the Hello 1493 message. 1495 B.3 Extensions Part 1497 The Extensions Part, if present, carries one or more extensions in 1498 {Type, Length, Value} triplets. 1500 Extensions have the following format: 1502 0 1 2 3 1503 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1504 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1505 | Type | Length | 1506 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1507 | Value... | 1508 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1510 Type 1511 The extension type code (see below). 1513 Length 1514 The length in octets of the value (not including the Type and 1515 Length fields; a null extension will have only an extension header 1516 and a length of zero). 1518 When extensions exist, the extensions part is terminated by the End 1519 of Extensions extension, having Type = 0 and Length = 0. 1521 Extensions may occur in any order but any particular extension type 1522 may occur only once in an SCSP packet. An LS MUST NOT change the 1523 order of extensions. 
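The {Type, Length, Value} layout described above can be walked with a short loop. The Python sketch below is illustrative only and rests on assumptions that are not stated in this section: Type and Length are each taken to be 16 bits wide in network byte order (as the 32-bit header row in the diagram suggests), and values are taken to be unpadded. The function names parse_extensions and build_extensions are hypothetical. The sketch keeps extensions in the order received; checking that an extension type occurs only once is left out.

```python
import struct

END_OF_EXTENSIONS = 0  # Type = 0, Length = 0 terminates the Extensions Part


def parse_extensions(buf):
    """Split an Extensions Part into an ordered list of (type, value)."""
    extensions = []
    offset = 0
    while True:
        if offset + 4 > len(buf):
            raise ValueError("Extensions Part lacks End Of Extensions")
        ext_type, length = struct.unpack_from("!HH", buf, offset)
        offset += 4
        if ext_type == END_OF_EXTENSIONS and length == 0:
            return extensions
        if offset + length > len(buf):
            raise ValueError("extension value runs past end of buffer")
        # Length counts only the value, not the Type and Length fields.
        extensions.append((ext_type, bytes(buf[offset:offset + length])))
        offset += length


def build_extensions(extensions):
    """Serialize (type, value) pairs in the given order, then terminate."""
    out = bytearray()
    for ext_type, value in extensions:
        out += struct.pack("!HH", ext_type, len(value)) + value
    out += struct.pack("!HH", END_OF_EXTENSIONS, 0)
    return bytes(out)
```

A null extension (a header with a Length of zero) round-trips as a pair whose value is an empty byte string.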
1525 B.3.0 The End Of Extensions 1527 Type = 0 1528 Length = 0 1530 When extensions exist, the extensions part is terminated by the End 1531 Of Extensions extension. 1533 B.3.1 SCSP Authentication Extension 1535 Type = 1 Length = variable 1537 The SCSP Authentication Extension is carried in SCSP packets to 1538 convey the authentication information between an LS and a DCS in the 1539 same SG. 1541 Authentication is done pairwise on an LS to DCS basis; i.e., the 1542 authentication extension is generated at each LS. If a received 1543 packet fails the authentication test then an "abnormal event" has 1544 occurred. The packet is discarded and this event is logged. 1546 The presence or absence of authentication is a local matter. 1548 B.3.1.1 Header Format 1550 The authentication header has the following format: 1552 0 1 2 3 1553 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1554 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1555 | Security Parameter Index (SPI) | 1556 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1557 | | 1558 +-+-+-+-+-+-+-+-+-+-+ Authentication Data... -+-+-+-+-+-+-+-+-+-+ 1559 | | 1560 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1562 Security Parameter Index (SPI) can be thought of as an index into a 1563 table that maintains the keys and other information such as hash 1564 algorithm. LS and DCS communicate either off-line using manual keying 1565 or online using a key management protocol to populate this table. The 1566 receiving SCSP entity always allocates the SPI and the parameters 1567 associated with it. 1569 The authentication data field contains the MAC (Message 1570 Authentication Code) calculated over the entire SCSP payload. The 1571 length of this field is dependent on the hash algorithm used to 1572 calculate the MAC. 1574 B.3.1.2 Supported Hash Algorithms 1576 The default hash algorithm to be supported is HMAC-MD5-128 [11]. 
HMAC 1577 is safer than normal keyed hashes. Other hash algorithms MAY be 1578 supported by def. 1580 IANA will assign the numbers to identify the algorithm being used as 1581 described in Section C. 1583 B.3.1.3 SPI and Security Parameters Negotiation 1585 SPIs can be negotiated either manually or using an Internet Key 1586 Management protocol. Manual keying MUST be supported. The following 1587 parameters are associated with the tuple (SPI, DCS ID): lifetime, 1588 Algorithm, Key. Lifetime indicates the duration in seconds for which 1589 the key is valid. In case of manual keying, this duration can be 1590 infinite. Also, in order to better support manual keying, there may 1591 be multiple (SPI, DCS ID) tuples active at the same time (DCS ID being the same). 1593 Any Internet standard key management protocol MAY be used to 1594 negotiate the SPI and parameters. 1596 B.3.1.4 Message Processing 1598 At the time of adding the authentication extension header, the LS looks 1599 up in a table to fetch the SPI and the security parameters based on 1600 the DCS ID. If there are no entries in the table and if there is 1601 support for key management, the LS initiates the key management 1602 protocol to fetch the necessary parameters. The LS then zeroes 1603 the authentication data field and calculates the 1604 MAC on the sending end. The result replaces the zeroed 1605 authentication data field. If key management is not supported and 1606 authentication is mandatory, the packet is dropped and this 1607 information is logged. 1609 When receiving traffic, an LS fetches the parameters based on the SPI 1610 and its ID. The received authentication data is extracted and the field is zeroed 1611 before the hash is calculated. The LS computes the hash on the entire payload 1612 and if the hash does not match, then an "abnormal event" has 1613 occurred.
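The send and receive steps above can be sketched with Python's standard hmac module. This is a hedged illustration rather than a normative procedure: the function names sign and verify are hypothetical, the caller is assumed to know the byte offset of the authentication data field (auth_offset), and the key is assumed to have already been selected from the SPI table. HMAC-MD5 is used because it is the default named in B.3.1.2.

```python
import hashlib
import hmac

MAC_LEN = 16  # HMAC-MD5-128 produces a 16-byte MAC


def sign(packet, auth_offset, key):
    """Zero the authentication data field, compute the MAC, and insert it."""
    buf = bytearray(packet)
    buf[auth_offset:auth_offset + MAC_LEN] = bytes(MAC_LEN)
    mac = hmac.new(key, bytes(buf), hashlib.md5).digest()
    buf[auth_offset:auth_offset + MAC_LEN] = mac
    return bytes(buf)


def verify(packet, auth_offset, key):
    """Extract the received MAC, zero the field, recompute, and compare."""
    buf = bytearray(packet)
    received = bytes(buf[auth_offset:auth_offset + MAC_LEN])
    buf[auth_offset:auth_offset + MAC_LEN] = bytes(MAC_LEN)
    expected = hmac.new(key, bytes(buf), hashlib.md5).digest()
    # Constant-time comparison; a False result is the "abnormal event".
    return hmac.compare_digest(received, expected)
```

A receiver for which verify returns False would discard the packet and log the event, as described in B.3.1.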
1615 B.3.1.5 Security Considerations 1617 It is important that the keys chosen are strong as the security of 1618 the entire system depends on the keys being chosen properly and the 1619 correct implementation of the algorithms. 1621 SCSP has a peer to peer trust model. It is recommended to use an 1622 Internet standard key management protocol to negotiate the keys 1623 between the neighbors. Transmitting the keys in clear text, if other 1624 methods of negotiation are used, compromises the security completely. 1626 Data integrity covers the entire SCSP payload. This guarantees that 1627 the message was not modified and the source is authenticated as well. 1628 If the authentication extension is not used or if the security is 1629 compromised, then SCSP servers are liable to spoofing attacks, 1630 active attacks, and passive attacks. 1632 There is no mechanism to encrypt the messages. It is assumed that a 1633 standard layer 3 confidentiality mechanism will be used to encrypt 1634 and decrypt messages. As integrity is calculated on an SCSP message 1635 and not on each record, there is an implied trust between all the 1636 servers in a domain. It is recommended to use the security extension 1637 between all the servers in a domain and not just a subset of servers. 1639 B.3.2 SCSP Vendor-Private Extension 1641 Type = 2 1642 Length = variable 1644 The SCSP Vendor-Private Extension is carried in SCSP packets to 1645 convey vendor-private information between an LS and a DCS in the same 1646 SG and is thus of limited use. If a finer granularity (e.g., CSA 1647 record level) is desired then the given client/server protocol 1648 specific SCSP document MUST define such a mechanism. Obviously, 1649 however, such a protocol specific mechanism might look exactly like 1650 this extension. The Vendor Private Extension MUST NOT appear more 1651 than once in an SCSP packet for a given Vendor ID value.
1653 0 1 2 3 1654 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1655 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1656 | Vendor ID | Data.... | 1657 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1659 Vendor ID 1660 802 Vendor ID as assigned by the IEEE [7]. 1662 Data 1663 The remaining octets after the Vendor ID in the payload are 1664 vendor-dependent data. 1666 If the receiver does not handle this extension, or does not match the 1667 Vendor ID in the extension then the extension may be completely 1668 ignored by the receiver. 1670 C. IANA Considerations 1671 IANA will take advice from James Luciani (see author information 1672 below for contact information), who is the Area Director appointed 1673 designated subject matter expert, in order to assign numbers from the 1674 various number spaces described herein. In the event that the Area 1675 Director appointed designated subject matter expert is unavailable, 1676 the relevant IESG Area Director will appoint another expert. Any and 1677 all requests for value assignment within a given number space will be 1678 accepted when the usage of the value assignment is documented. Possible 1679 forms of documentation include, but are not limited to, RFCs or the 1680 product of another cooperative standards body (e.g., the MPOA and 1681 LANE subworking group of the ATM Forum). 1683 References 1685 [1] "Classical IP and ARP over ATM", Laubach, RFC 1577. 1687 [2] "NBMA Next Hop Resolution Protocol (NHRP)", Luciani, Katz, Piscitello, 1688 Cole, draft-ietf-rolc-nhrp-12.txt. 1690 [3] "OSPF Version 2", Moy, RFC1583. 1692 [4] "P-NNI V1", Dykeman, Goguen, 1996. 1694 [5] "Support for Multicast over UNI 3.0/3.1 based ATM Networks.", 1695 Armitage, RFC2022. 1697 [6] "LAN Emulation over ATM Version 2 - LNNI specification", Keene, 1698 btd-lane-lnni-02.08. 1700 [7] "Assigned Numbers", Reynolds, Postel, RFC 1700.
1702 [8] "A Distributed NHRP Service Using SCSP", Luciani, 1703 draft-ietf-ion-scsp-nhrp-02.txt. 1705 [9] "A Distributed ATMARP Service Using SCSP", Luciani, 1706 Work In Progress. 1708 [10] "Key words for use in RFCs to Indicate Requirement Levels", 1709 S. Bradner, RFC 2119. 1711 [11] "HMAC: Keyed Hashing for Message Authentication", Krawczyk, Bellare, 1712 Canetti, RFC 2104. 1714 Acknowledgments 1716 This I-D is a distillation of issues raised during private 1717 discussions, on the IP-ATM mailing list, and during the Dallas IETF 1718 (12/95). Thanks to all who have contributed but particular thanks to 1719 the following people (in no particular order): Barbara Fox of Harris and 1720 Jeffries; Colin Verrilli of IBM; Raj Nair, and Matthew Doar of Ascom 1721 Nexion; Andy Malis of Cascade; Andre Fredette of Bay Networks; James 1722 Watt of Newbridge; and Carl Marcinik of Fore. 1724 Author's Address 1726 James V. Luciani 1727 Bay Networks, Inc. 1728 3 Federal Street, BL3-03 1729 Billerica, MA 01821 1730 phone: +1-978-916-4734 1731 email: luciani@baynetworks.com 1733 Grenville Armitage 1734 Bell Labs Lucent Technologies 1735 101 Crawfords Corner Road 1736 Holmdel, NJ 07733 1737 Email: gja@lucent.com 1738 Ph. +1 201 829 2635 1740 Joel M. Halpern 1741 Newbridge Networks Corp. 1742 593 Herndon Parkway 1743 Herndon, VA 22070-5241 1744 Phone: +1-703-708-5954 1745 Email: jhalpern@Newbridge.COM 1747 Naganand Doraswamy 1748 Bay Networks, Inc. 1749 3 Federal St, BL3-03 1750 Billerica, MA 01821 1751 phone: +1-978-916-1323 1752 Email: naganand@baynetworks.com