Internet Draft                                               Danny Cohen
                                                                 Myricom
Expires in six months                                         Craig Lund
                                                       Mercury Computers
                              Tony Skjellum, Thom McMahon, Robert George
                                            Mississippi State University
                                                               June 1998

           The Router-to-Router (RRP) PacketWay Protocol for
         High-Performance Interconnection of Computer Clusters

Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   To view the entire list of current Internet-Drafts, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern
   Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific
   Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).

Table of Contents

   Introduction
   Notations
   PacketWay and IP
   Node Attributes
   RRP Messages
   Structure of RRP Messages
   RRP Records
   Example
   Glossary
   Acronyms and Abbreviations
   Editor's Address

Introduction

   The PacketWay family of protocols is introduced in "The End-to-End
   (EEP) PacketWay Protocol for High-Performance Interconnection of
   Computer Clusters".  This document defines the Router-to-Router
   Protocol (RRP), the basic messages used by routers to exchange
   routing information with endpoints and each other.

   In the PacketWay model a router is a set of cooperating hosts on two
   (or more) networks.  These hosts, each a full-fledged host on its
   SAN, are called "half-routers" (HRs).  RRP defines, via message
   structure and behavior, the interactions between HRs as well as the
   interactions between HRs and nodes.

   RRP does not define the lower level protocols that deliver its
   messages.  RRP also does not define the connection between the HRs
   within the router; these are left for mutual agreements among the
   implementors of each HR.

   However, the intra-router communication among these hosts is a
   "public" issue, handled according to the RRP which defines only the
   Network-level [Level-3], and not the lower levels of this
   communication.  All RRP messages are carried via EEP packets with the
   "Packet-Type" field of the EEP header set to "RRP".

   This document does not define how source routes are initially
   constructed.  It is expected that static tables may be manually
   maintained for simple or very stable systems.  Dynamic
   table-maintenance protocols will likely be outlined in a future
   document.

Notations

   8B      means "8-byte" (64 bits).

   0x      indicates hexadecimal values, e.g., 0x0100 is 2^8=256
           (decimal).

   0b      indicates binary values, e.g., 0b0100 is 4 (decimal).

   xxxx    indicates a field that is discarded without any checking
           (e.g., padding).

   [exp]   in equations, is the integral part, rounded down, of `exp`
           (e.g., [23/8]=2).

   No length field includes itself in its count, so a length may be 0.
   Lengths are specified either (a) by byte count, implying that some
   padding bytes may follow to fill 8B-words, or (b) by 8B-word count
   and PL, the number of trailing padding bytes (with PL between 0 and
   7).

PacketWay and IP

   The architecture of PacketWay is very similar to that of the IP
   family (in fact it borrows heavily from IP), but it emphasizes
   performance rather than the generality and scalability that were
   selected for IP.

   Like IP, PacketWay is based on an End-to-End protocol (EEP) that
   assumes that if an address (or equivalent specification of the
   destination) is placed in the appropriate field in the packet header,
   then the packet will arrive at that destination.  Neither IP nor EEP
   specifies how this happens.

   Routers are responsible for transferring packets from their source
   networks to their destination networks (possibly via other networks).
   The communication among the routers (such as the entire family of the
   GGPs [Gateway/Gateway Protocols], as they were originally called) is
   NOT a part of IP (as defined originally in RFC-791 and MIL-STD-1777).
   Similarly, it is not a part of EEP.

   Like the IP family, PacketWay defines separately its Router-to-Router
   Protocol (RRP), in a device- and network-independent manner.

   However, the model of routers in PacketWay is slightly different from
   the original model in the IP family.  IP routers (or gateways as they
   were called then) are monolithic devices, provided by their vendors.
   Each IP-router is a bona-fide host on two (or more) networks.  The
   communication among these intra-router hosts is an internal "private"
   issue, handled by each vendor as it sees fit, not subject to
   published standards.

Node Attributes

   Each node must have a Physical Address.  Optionally it may also have
   Name, Capabilities, and Logical-Addresses:

   Physical Address    23 bits, flat, unique in this PacketWay.

   Name                flat, globally unique (e.g., IP address),
                       arbitrary length

   Capabilities        regular GP node, router, PacketWay-server, NFS,
                       paging server, M/C server, SRVLOC-server, DSP,
                       printer, etc.

                       Some capabilities may need additional parameters
                       (e.g., SAN-ID for routers, and resolution+colors
                       for printers).  Their parameters are
                       capability-specific.  The capabilities are
                       defined in the PacketWay Enumeration document.

   Logical-Addresses   a set of (logical) addresses to which this node
                       requests to listen.  Logical addresses designate
                       multicast and broadcast groups.

                       The control of the Logical-Addresses (a la IGMP)
                       is not defined in this document.  This will be
                       designed by the applications that use it (e.g.,
                       PacketWay-Multicast).

                       The management of logical addresses (e.g., JOIN
                       and LEAVE) is not defined here.

RRP Messages

   RRP messages are PacketWay messages with PT="RRP" and TE="RRP-Type"
   in their EEP-header, followed by zero or more RRP-records according
   to their RRP-type and completed by the TAIL which is the EI field of
   the EEP packet.  The RRP-records are defined in the next sections.

   The RRP-records constitute the Data Block (DB) of the
   PacketWay-message.  They must be in Big-Endian order, with e=0 in the
   EEP-header.
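
   As an illustration only, the framing described above (PT="RRP", the
   RRP-type carried in TE, and a Data Block made of RRP-records) might
   be modeled in Python as follows.  The class and field names are
   hypothetical and are not defined by this document.  The sketch
   assumes that every record occupies a whole number of 8B-words, and
   it does not attempt to encode the EEP header or the TAIL, which are
   specified in the EEP document.

```python
from dataclasses import dataclass, field
from typing import List

WORD = 8  # PacketWay counts lengths in 8-byte (8B) words


@dataclass
class RrpMessage:
    """Hypothetical in-memory view of one RRP message (not a wire format)."""
    rrp_type: str                       # value carried in the TE field, e.g. "HRTO"
    records: List[bytes] = field(default_factory=list)  # already-encoded RRP-records
    packet_type: str = "RRP"            # PT field of the EEP header

    def data_block(self) -> bytes:
        """Concatenate the records into the Data Block (DB)."""
        for rec in self.records:
            # Assumption of this sketch: a record is never empty and is
            # always padded to a whole number of 8B-words.
            if not rec or len(rec) % WORD:
                raise ValueError("each record must be a whole number of 8B-words")
        return b"".join(self.records)

    def data_length_words(self) -> int:
        """Length of the Data Block in 8B-words; zero when there are no records."""
        return len(self.data_block()) // WORD
```

   Under these assumptions, a message carrying two one-word records has
   a Data Block of 2 8B-words, and a message with no records has a Data
   Block of length 0.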

   We use "[XXX]" to indicate the RRP-message XXX, and "<YYY>" to
   indicate the RRP-record YYY.  XXX is the RRP-Type, carried in the
   Type Extension (TE) field of the EEP header (with Packet-Type of
   "RRP"), and YYY is the RTyp field, carried in the first byte of that
   RRP-record.

   Following are the 7 RRP messages, with their RRP-type, and the
   related error messages.  The column S->D (Source to Destination)
   shows who sends such messages to whom, where N is for Node, H is for
   HR, and A is for Any.

   RRP-
   Type       S->D    Description
   --------   ------  -----------------------------------------------
   [GVL2]     N->H    Please give me L2-routes to node (address)
                      Replies to [GVL2]: [L2SR], [RDRC], or [ERR/UNK].

   [L2SR]     H->N    Here are L2-routes to node (address)

   [HRTO]     N->H    Which HR should I use for node (address)?
                      Replies to [HRTO]: [RDRC] or [ERR/UNK].

   [RDRC]     H->N    Re-direct to node (address) via an HR on same SAN

   [TELL]     N->H    Please tell me about node (address, name, capa's)
                      The reply to [TELL] is [INFO], or [ERR/UNK].

   [INFO]     A->A    Info about node (address, name, capabilities, LAs)

   [WRU?]     A->A    Who/what-Are-You? (Tell me all about yourself)
                      The reply to [WRU?] is [INFO] about the replier.

   RRP also uses the following error messages:

   [ERR/UNK]       Destination Unknown (address)
   [ERR/HRDOWN]    HR Down
   [ERR/LKDOWN]    Link Down
   [ERR/GENERAL]   General error message

Structure of RRP Messages

   [GVL2]   Please give me L2-routes from you to node (address)

            PH   (with [PT/TE]=[RRP/GVL2])
                 (address of the node for which [L2SR] is
                 requested)

   [L2SR]   Here are L2-routes from me to node (address)

            PH   (with [PT/TE]=[RRP/L2SR])
                 (address of the node for which the following
                 is provided)
                 (Source Route/Quality record)
                 (optional) MTU records for the above

            This message may have several (Source-Route record, MTU
            record) pairs, one such pair for each source route.

   [HRTO]   Which HR should I use for node (address)

            PH   (with [PT/TE]=[RRP/HRTO])
                 (address of the node for which initial HR
                 is requested)

   [RDRC]   Re-direct to destination node (address) via an HR
            (address), on the same SAN.

            PH   (with [PT/TE]=[RRP/RDRC])
                 (address of the destination node)
                 (address of the HR to be used for that
                 destination)

            The above addresses are expected to be physical
            (but they may be otherwise).

   [TELL]   Please tell me about node
            (address | name | capabilities)

            PH   (with [PT/TE]=[RRP/TELL])
                 (address of that node)

            or

            PH   (with [PT/TE]=[RRP/TELL])
                 (name of that node)

            or

            PH   (with [PT/TE]=[RRP/TELL])
                 (capabilities for which nodes are requested)

            This message may have several Node-Capability records,
            one for each capability.

            [TELL] identifies a node by an address and/or a name
            and/or capabilities.  If more than one attribute is
            specified (e.g., an address and name(s)), any node that
            meets any of them should be considered (like an implied
            OR).

   [INFO]   Info about node(s) (address, name, capabilities)

            PH   (with [PT/TE]=[RRP/INFO])
                 (address of that node)
                 (name of that node)
                 (capabilities for which nodes are requested)
                 (Logical-Addresses for the requested node)

            This message may have several Node-Capability records,
            one for each capability.  For nodes without a name,
            capabilities, or any logical addresses, these records
            are omitted.

            [INFO] provides all the known information about all
            the nodes that match the [TELL].  The Node-Address
            records are the separators between the nodes.

   [WRU?]   Who/what-Are-You?

            PH   (with [PT/TE]=[RRP/WRU?] and [DD]=0x7FFFFE)

   [ERR/UNK]   Destination Unknown (address)

            PH   (with [PT/TE]=[ERROR/UNK])
                 (XXXX of the Destination node for which the
                 requested information is not available),
                 where <XXXX> is the record or records identifying
                 the node(s) about which this message is sent

   [ERR/HRDOWN]   HR Down (or Router-Down)

            PH   (with [PT/TE]=[ERROR/HRDOWN])
                 (address of the HR that is down)
                 (the other address of the router that is down)

   [ERR/LINKDOWN]   Link Down

            PH   (with [PT/TE]=[ERROR/LINKDOWN])
                 (address of one end of the link that is down)
                 (address of the other end of the link that is
                 down)

   [ERR/GENERAL]   General Error (i.e., none of the above)

            PH   (with [PT/TE]=[ERROR/GENERAL])
            XX   (The entire message that caused the error:
                 PH+OH+DB+TAIL)

RRP Records

   Each RRP-record starts with an 8B-word header as shown below.  Its
   first byte identifies the record type (RTyp).  The second byte is the
   Pad-Count byte (PL) indicating the number of padding bytes.  The
   third and the fourth bytes (RL) are the length (in 8B-words) of the
   record, excluding the record header, hence it may be zero.  The rest
   of the header bytes depend on the record type (RTyp).

+--------+--------+--------+--------+--------+--------+--------+--------+
|  RTyp  |   PL   |       RL        |........|........|........|........|
+--------+--------+--------+--------+--------+--------+--------+--------+

   Some records that have an arbitrary length are "right justified" by
   having PL padding bytes before the data (Padding Before Data [PBD]).
   Some records that have an arbitrary length are "left justified" by
   having PL bytes after the data (Padding After Data [PAD]).  In either
   case the total number of data bytes is: (8*RL+4-PL).

   Following are the RRP-records.  These records are the building blocks
   used to construct RRP-messages.  In the following, "xxxx" indicates
   bytes that are discarded, such as for padding.
   It is recommended to set them to all-0.

===> Node-Address Record  [PAD]

   This record specifies either a single address (with AT=1) or a range
   of addresses (with AT=2 followed by AT=3, or with AT=4 followed by
   AT=5).  AT is the "Address-Type".  In the diagrams below, the first
   byte of each record is its record type (RTyp).

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
|  RTyp  |  PL=0  |      RL=0       |  AT=1  |    PacketWay-Address     |
+--------+--------+--------+--------+--------+--------+--------+--------+

   or:

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
|  RTyp  |  PL=4  |      RL=1       |  AT=2  |  Min-PacketWay-Address   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  AT=3  |  Max-PacketWay-Address   |  xxxx  |  xxxx  |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+

   or:

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
|  RTyp  |  PL=4  |      RL=1       |  AT=4  | PacketWay-Address-Value  |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  AT=5  |  PacketWay-Address-Mask  |  xxxx  |  xxxx  |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+

   The address-mask follows the address-value.

   The above addresses may be physical or logical.

   The address X is specified by a Node-Address record if:

      if AT=1:     X == PacketWay-Address

      if AT=2,3:   Min-PacketWay-Address <= X <= Max-PacketWay-Address

      if AT=4,5:   (PacketWay-Address-Mask & X) == PacketWay-Address-Value

   A Node-Address record defines only one PacketWay-address (or one
   range), unlike a Logical-Addresses record (see below), which may
   specify multiple addresses and multiple address-ranges.

   If the Node-Address record is followed by other records that describe
   the same node (such as Node-Name, Node-Capability, Logical-Addresses,
   Source-Route, and MTU records) then the RL of the Node-Address record
   also covers all these records.  All these records apply to all the
   addresses specified in this Node-Address record.
   Needless to say that is not expected to appear
   within a record that specifies more than one address.

   Hence, if a Node-Address record with AT=1 has RL>1, or if a
   Node-Address record with AT>1 has RL>2, then this Node-Address record
   includes additional records (such as those listed above) about the
   specified address(es).

   The enumeration is guaranteed not to have overlap between the AT and
   the RTyp codes.

===> Node-Name Record  [PAD]

   (e.g., a name with 7 bytes B1..B7)

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
|  RTyp  |  PL=5  |      RL=1       |   B1   |   B2   |   B3   |   B4   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|   B5   |   B6   |   B7   |  xxxx  |  xxxx  |  xxxx  |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+

   The number of bytes in the name is 8*RL+4-PL (here 8*1+4-5 = 7).

===> Node-Capability Record  [PAD]

   (e.g., 9 parameter bytes)

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
|  RTyp  |  PL=2  |      RL=1       | CC=Cx  |   P1   |   P2   |   P3   |
+--------+--------+--------+--------+--------+--------+--------+--------+
|   P4   |   P5   |   P6   |   P7   |   P8   |   P9   |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+

   Byte#4 is the Capability Code, CC, followed by as many parameter
   bytes as needed (9 in the above example).

   The capability codes are listed in the PacketWay Enumeration
   document.

   The number of bytes used by the parameters is 8*RL+3-PL.
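
   The record layouts above can be exercised with a short Python
   sketch.  It is not part of the protocol definition: the function
   names are hypothetical, the RTyp code used in the example is made up
   (the real codes are in the PacketWay Enumeration document), and
   nested records covered by the RL of a Node-Address record are not
   handled.  The sketch only splits the first four header bytes, applies
   the 8*RL+4-PL rule, and evaluates the three address-matching rules.

```python
import struct


def parse_header(record: bytes):
    """Return (rtyp, pl, rl) from the first four bytes of an RRP-record."""
    if len(record) < 8 or len(record) % 8:
        raise ValueError("a record is a whole number of 8B-words")
    # RTyp and PL are one byte each; RL is a 16-bit Big-Endian word count.
    return struct.unpack(">BBH", record[:4])


def data_bytes(pl: int, rl: int) -> int:
    """Number of data bytes when the data area starts at header byte 4."""
    return 8 * rl + 4 - pl


def name_bytes(record: bytes) -> bytes:
    """Extract the name from a Node-Name record (padding after the data)."""
    _, pl, rl = parse_header(record)
    return record[4:4 + data_bytes(pl, rl)]


def address_matches(record: bytes, x: int) -> bool:
    """Apply the AT=1, AT=2,3 and AT=4,5 rules of a Node-Address record to X."""
    parse_header(record)                      # basic length sanity check
    at = record[4]
    first = int.from_bytes(record[5:8], "big")
    if at == 1:                               # single address
        return x == first
    if len(record) < 16:
        raise ValueError("range and mask forms need a second 8B-word")
    at2 = record[8]
    second = int.from_bytes(record[9:12], "big")
    if (at, at2) == (2, 3):                   # Min <= X <= Max
        return first <= x <= second
    if (at, at2) == (4, 5):                   # (Mask & X) == Value
        return (second & x) == first
    raise ValueError("unsupported Address-Type combination")


# Example record: the 5-byte name "Super" with PL=7 and RL=1.
# The RTyp value 0x02 is a placeholder, not a real enumeration code.
name_record = bytes([0x02, 7, 0, 1]) + b"Supe" + b"r" + bytes(7)
```

   With this record, data_bytes(7, 1) is 5 and name_bytes(name_record)
   yields the five bytes of "Super"; the seven trailing bytes are
   padding.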

===> Logical-Addresses Record  [PAD]

   (e.g., 2 logical addresses and a range of logical addresses)

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
|  RTyp  |  PL=4  |      RL=2       |  AT=1  |1110  Logical-Address-#1  |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  AT=2  |1110 Min-Logical-Address  |  AT=3  |1110 Max-Logical-Address  |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  AT=1  |1110  Logical-Address-#2  |  xxxx  |  xxxx  |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+

   Whereas a Node-Address record defines only one PacketWay-address (or
   one range), a Logical-Addresses record may specify multiple addresses
   (each with AT=1) and multiple ranges (each with a pair of AT=2,3 or
   AT=4,5).

===> Source-Route Record  [PBD], with Q for that route.

   (e.g., an SR combined of 2 L2RHs, one with 13 bytes and one with 4
   bytes)

   This record carries one, or more, L2RHs (2 in the following example,
   one with SR of 13B, followed by an SR of 4B).

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
|  RTyp  |  PL=2  |      RL=3       |  xxxx  |  xxxx  |        Q        |
+--------+--------+--------+--------+--------+--------+--------+--------+
|vv000000|10 L=13B|  SR01  |  SR02  |  SR03  |  SR04  |  SR05  |  SR06  |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  SR07  |  SR08  |  SR09  |  SR10  |  SR11  |  SR12  |  SR13  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+
|vv000000|10 L=4B |  SR01  |  SR02  |  SR03  |  SR04  |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+

   Q (the Route Quality) is an unsigned 16-bit integer.  The units are
   not defined here.  It is assumed that it is monotonic with all-0
   being the best and all-1 the worst.  If there is an MTU record for
   that SR it should follow this Source-Route record.
   However, the RL of the Source-Route record does not include the RL of
   the MTU record.

===> MTU Record  [PBD]

    0        1        2        3        4        5        6        7
+--------+--------+--------+--------+--------+--------+--------+--------+
|  RTyp  |  PL=0  |      RL=0       |         MTU (in 8B-words)         |
+--------+--------+--------+--------+--------+--------+--------+--------+

   The MTU record provides the MTU for the SR defined before (by a
   Source-Route record).

   The value of 0 means indefinite MTU (i.e., any length is OK).

Example

   In the following PacketWay network used for this example, 3 SANs are
   interconnected via 2 routers, Router-A (RTRA) between SAN1 and SAN3,
   and RTRB between SAN1 and SAN2.

 +-------+          +--0--+   SAN1   +--0--+          +--0--+
 | Node1 +----------3 SW0 1----------3 SW1 1----------3 SW2 1  MTU=16KB
 +-------+          +--2--+          +--2--+          +--2--+
                       |                                 |
           RTRA1  ***********       +---+---+       *********** RTRB1
                  * RouterA *       | Node2 |       * RouterB *
           RTRA3  ***********       +---+---+       *********** RTRB2
                       |                |                |
 +-------+   SAN3   +--0--+          +--0--+   SAN2   +--0--+
 | Node3 +----------3 SW3 1          3 SW4 1----------3 SW5 1  MTU=8KB
 +-------+          +--2--+          +--2--+          +--2--+

   In this example Node1 on SAN1 (with MTU=16KB) is looking for Node2
   which is on SAN2 (with MTU=8KB).  It first asks its default router
   (RTRA1) for an L2RH to Node2.  RTRA1 redirects Node1 to RTRB1
   regarding Node2.

   Node1 asks RTRA1 (by [HRTO], in message M1) which router to use for
   Node2.  RTRA1 suggests (using [RDRC], M2) to use RouterB.  Node1 uses
   L3-forwarding ([WRU?], M3), via Router-B, to verify that RTRB can
   indeed get to Node2, by asking Node2 for information about itself.
   Node2 provides this information ([INFO], M4) which Node1 likes.
   Node1 asks RouterB ([GVL2], M5) for L2RH(s) to Node2.  RouterB
   provides ([L2SR], M6) the requested L2RH with its MTU of 1,024
   8B-words (8KB).

   Finally, Node1 sends data (by M7) to Node2 using L2-forwarding.
   Similarly, Node2 may ask its default router which HR to use for Node1
   and for L2RH(s) to Node1.

   The sequence of messages (M1 thru M7) is shown below.

   (M1) Node1 sends [HRTO] to its default router RTRA1 asking which HR
   to use for Node2.

    0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
|    <---- The L2-header needed to get from Node1 to RouterA1 ---->     |
| It may be any number of bytes.  In this example it's 9 bytes:230000000|
+--------+--------+--------+--------+--------+--------+--------+--------+
|00  P   |0          RTRA1          |     "HRTO"      |     "R R P"     |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=0|PL=0| Data-Length=1 (8B-words) |0|  RZ  |0          Node1          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|  RTyp  |  PL=0  |      RL=0       |  AT=1  |0          Node2          |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+

   (M2) RTRA1 uses [RDRC] to re-direct to Node2 via RouterB.

    0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
|    <---- The L2-header needed to get from RouterA1 to Node1 ---->     |
| It may be any number of bytes.  In this example it's 9 bytes:330000000|
+--------+--------+--------+--------+--------+--------+--------+--------+
|00  P   |0          Node1          |     "RDRC"      |     "R R P"     |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=0|PL=0| Data-Length=2 (8B-words) |0|  RZ  |0          RTRA1          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|  RTyp  |  PL=0  |      RL=0       |  AT=1  |0          Node2          |
+--------+--------+--------+--------+--------+--------+--------+--------+
|  RTyp  |  PL=0  |      RL=0       |  AT=1  |0          RTRB1          |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+

   Node1 knows how to get to RouterB over its SAN.

   (M3) Node1 uses [WRU?] (still using L3-forwarding via RouterB) to
   verify the capabilities of Node2, and that RTRB can indeed get to
   it.  This is done by asking Node2 for information about itself.

    0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
|    <---- The L2-header needed to get from Node1 to RouterB1 ---->     |
| It may be any number of bytes.  Here it is 11 bytes: 11230000000      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|00  P   |0          Node2          |     "WRU?"      |     "R R P"     |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=0|PL=0| Data-Length=0 (8B-words) |0|  RZ  |0          Node1          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+

   (M4) Node2 uses [INFO] (via RouterB2, also using L3-forwarding) to
   provide information about itself to Node1.  This info includes its
   PacketWay-address and its name ("Super").
If Node2 had implemented 577 also Level-C of the RRP it would also provide a record about its 578 capabilities (as shown in this example with 2 capabilities (with 579 codes of 5 and 7). 581 0 1 2 3 4 5 6 7 582 +-----------------------------------------------------------------------+ 583 | <---- The L2-header needed to get from Node2 to RouterB2 ----> | 584 | It may be any number of bytes. Here it is 10 bytes: 1030000000 | 585 +--------+--------+--------+--------+--------+--------+--------+--------+ 586 |00 P |0 Node1 | "INFO" | "R R P" | 587 +---+----+--------+--------+--------+-+------+--------+--------+--------+ 588 |E=0|PL=0| Data-Length=5 (8B-words) |0| RZ |0 Node2 | 589 +---+----+--------+--------+--------+-+------+--------+--------+--------+ 590 | | PL=0 | RL=4 | AT=1 |0 Node2 | 591 +--------+--------+--------+--------+--------+--------+--------+--------+ 592 | | PL=7 | RL=1 | "S" | "u" | "p" | "e" | 593 +--------+--------+--------+--------+--------+--------+--------+--------+ 594 | "r" | xxxx | xxxx | xxxx | xxxx | xxxx | xxxx | xxxx | 595 +--------+--------+--------+--------+--------+--------+--------+--------+ 596 | | PL=1 | RL=0 | CC=7 | 4 | 8 | xxxx | 597 +--------+--------+--------+--------+--------+--------+--------+--------+ 598 | | PL=3 | RL=0 | CC=5 | xxxx | xxxx | xxxx | 599 +--------+--------+--------+--------+--------+--------+--------+--------+ 600 | 64 zero bits, unless any error was indicated along the path | 601 +--------+--------+--------+--------+--------+--------+--------+--------+ 603 By receiving this message Node1 knows that RTRB could indeed be used 604 for communication with Node2. 606 (M5) Node1 uses [GVL2] to ask RouterB for L2RH(s) from RouterB to 607 Node2. 609 0 1 2 3 4 5 6 7 610 +-----------------------------------------------------------------------+ 611 | <---- The L2-header needed to get from Node1 to RouterB1 ----> | 612 | It may be any number of bytes. 
Here it is 11 bytes: 11230000000 |
+--------+--------+--------+--------+--------+--------+--------+--------+
|00   P  |0          RTRB1          |     "GVL2"      |     "R R P"     |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=0|PL=0| Data-Length=1 (8B-words) |0|  RZ  |0          Node1          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|        |  PL=0  |      RL=0       |  AT=1  |0          Node2          |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+
(M6) RouterB uses [L2SR] to provide Node1 with an L2RH from RTRB2 to
Node2, with its Q and MTU. This L2RH is {3,0,3,0,0,0,0,0,0,0} from
RouterB to Node2, and the MTU is 1,024 (meaning 8KB).

    0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
|    <---- The L2-header needed to get from RouterB1 to Node1 ---->     |
| It may be any number of bytes. Here it is 11 bytes: 33330000000       |
+--------+--------+--------+--------+--------+--------+--------+--------+
|00   P  |0          Node1          |     "L2SR"      |     "R R P"     |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=0|PL=0| Data-Length=4 (8B-words) |0|  RZ  |0          RTRA1          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|        |  PL=0  |      RL=3       |  AT=1  |0          Node2          |
+--------+--------+--------+--------+--------+--------+--------+--------+
|        |  PL=2  |      RL=1       |  xxxx  |  xxxx  |        Q        |
+--------+--------+--------+--------+--------+--------+--------+--------+
|vv000000|10 L=4B |    3   |    0   |    3   |    0   |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+
|        |  PL=1  |      RL=0       |      MTU=1,024 (in 8B-words)      |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+

The MTU in the above is the lesser of the MTUs of both
networks.

The RL (record-length) of the last MTUR-record is NOT included in
the RL of the preceding SRQR-record, but is included in the RL of
the preceding ADDR-record (the ADDR-record contains both the
SRQR-record and the MTUR-record). The RL=3 of the ADDR includes
2 words of SRQR and 1 word of MTUR.

(M7) Finally, Node1 sends data to Node2 using L2-forwarding.

    0        1        2        3        4        5        6        7
+-----------------------------------------------------------------------+
|    <---- The L2-header needed to get from Node1 to RouterB1 ---->     |
| It may be any number of bytes. Here it is 11 bytes: 11230000000       |
+--------+--------+--------+--------+--------+--------+--------+--------+
|vv000000|10 L=4B |    3   |    0   |    3   |    0   |  xxxx  |  xxxx  |
+--------+--------+--------+--------+--------+--------+--------+--------+
|00   P  |0          Node2          |Sensor.SubType=? |    "Sensor"     |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|E=3|PL=0| Data-Length=? (8B-words) |0|  RZ  |0          Node1          |
+---+----+--------+--------+--------+-+------+--------+--------+--------+
|                                                                       |
| <------------------- The sensor data goes here ---------------------> |
|                                                                       |
+--------+--------+--------+--------+--------+--------+--------+--------+
|      64 zero bits, unless any error was indicated along the path      |
+--------+--------+--------+--------+--------+--------+--------+--------+
E=3 (0b0011) indicates that all the data is 64-bit, Big-Endian order.

All the messages shown in this appendix start with local L2 routing
bytes needed to get across either SAN1 or SAN2 (indicated with "The
L2-header needed to get from ... to ..."), which are not L2RHs. The
difference is that these bytes are in front of the packet, exposed to
the local switches, whereas the L2RHs are only exposed to PacketWay
entities.

These local L2 routing bytes are the actual bytes required by the
SANs and are likely to be consumed as the message traverses the SAN,
unlike the L2RHs, which remain intact until converted to actual
routing bytes.

The L2RHs start with 0bvv00000010 followed by the number of routing
bytes in that L2RH, and possibly also by several bytes of padding.

Glossary

Address           A unique designation of a node (actually an
                  interface to that node) or a SAN.

Buddy-HR          HRs are "buddies" if they are on the same SAN.

Cut-Through       See Wormhole.

Destination       The node for which a packet is intended.

Dynamic-Routing   Routing according to dynamic information (i.e.,
                  acquired at run time, rather than pre-set).

Endianness        The property of being Big-Endian or Little-Endian
                  (transmission order, etc.)

Ethertype         A 16-bit value designating the type of Level-3
                  packets carried by a Level-2 communication
                  system.

HR                Half-Router, the part of a router that handles
                  one network only.

L2-Forwarding     Forwarding based on Level-2 (i.e., data-link
                  layer of the ISORM) information, e.g., the native
                  technique of each SAN or LAN. Also called
                  "source routing."

L3-Forwarding     Forwarding based on end-to-end (Level-3, i.e.,
                  network layer of the ISORM) addresses. Also
                  called "destination routing."

Map               The topology of a network.

Mapper            A node on a SAN/LAN that has the map and an RT
                  for that network. It is expected that the mapper
                  dynamically updates the map and the RT.

Multi-homed       A node with more than one network interface,
                  where each interface has its own address.

Node              Whatever can send and receive packets (e.g., a
                  computer, an MPP, a software process, etc.)

Node              A C-struct (or equivalent) containing values for
                  some attributes of a node.

Planned           Transfer of information, occurs after an initial
                  phase in which the sender decides which Level-2
                  route to use for that transfer.

RCVF              The "Received From" set includes all the physical
                  addresses through which an RT was disseminated,
                  starting with the address of the mapper that
                  created that RT.

Redirect          A message that tells nodes which HR should be
                  used in order to get to a certain remote address.

Router            The inter-SAN communication device.

Security          A relationship between 2 (or more) nodes that
                  defines how the nodes utilize security services
                  to communicate securely.

Source            The node that created a packet.

Source-Route      A Level-2 route that is chosen for a packet by
                  its source.

Symbol            Data preceding the EEP header of a PacketWay
                  message, interleaving with the L2RHs.

Twin-HR           Two HRs are twins if they both are parts of the
                  same inter-SAN router.

Wormhole-routing  (aka cut-thru routing) Forwarding packets out of
                  switches as soon as possible, without storing
                  the entire packet in the switch (unlike
                  store-and-forward).

Zero-copy         A TCP system that copies data directly between
                  the user area and the network device, bypassing
                  OS copies.

Acronyms and Abbreviations

0bNNNN   The binary number NNNN (e.g., 0b0100 is 4-decimal)

0xNNNN   The hexadecimal number NNNN (e.g., 0x0100 is 256-decimal)

8B       8-byte (64-bit) entity

ADDR     The Address-record of RRP

AT       Address Type

BER      Bit Error Rate

CAPA     The CAPAbility-record of RRP

CSR      Common Source-Route

DA       Destination Address

DB       Data Block

DL       Data Length (in 8B words)

DT       Destination-Type

EEP      End-to-End Protocol

EI       Error Indication

GVL2     An RRP message, requesting L2 route to a given destination

GVRT     An RRP message asking an HR to give its routing tables

HR       Half Router

HRTO     An RRP message asking which HR to use for a given destination

INFO     An RRP message providing information about nodes

L2       Level-2 of the ISO Reference Model (Link)

L2RH     Level-2 Routing Header

L2SR     Source Route

L3       Level-3 of the ISO Reference Model (Network)

LADR     The Logical-addresses-record of RRP

LSbit    Least Significant bit

LSbyte   Least Significant byte

MSbit    Most Significant bit

MSbyte   Most Significant byte

MTU      Maximum Transmission Unit

MTUR     The MTU-record of RRP

NAME     The name-record of RRP

OH       Optional Header field

OH-TYPE  The Type of an Optional Header field

Q        Quality (of a path)

RCVF     Received-From list, or the Received-From record of RRP

RDRC     A re-direct message of RRP

RRP      Router-to-Router Protocol

RTBL     An RRP message providing a Routing Table

SRQR     The Source-Route-and-Q-record of RRP

TELL     RRP message requesting INFO about a partially specified node

UNK      Unknown

WRU?     An RRP message asking its recipient to identify itself

Editor's Address

Anthony Skjellum
Computer Science Department
Box 9637
Mississippi State University
Mississippi State, MS 39762-9637

Phone: 601-325-8435
Fax:   601-325-8997
Email: tony@cs.msstate.edu

--------