INTERNET-DRAFT          EXPIRES OCTOBER 1997          INTERNET-DRAFT

Network Working Group                                    K. Murakami
INTERNET-DRAFT                                           M. Maruyama
Category: Informational                             NTT Laboratories
                                                            May 1997

         A MAPOS version 1 Extension - Switch-Switch Protocol
                  <draft-rfced-info-mitsuru-01.txt>

Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress".

   To learn the current status of any Internet-Draft, please check the
   "1id-abstract.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

Authors' Note

   This memo documents a MAPOS (Multiple Access Protocol over
   SONET/SDH) version 1 extension, the Switch-Switch Protocol, which
   provides dynamic routing for unicast, broadcast, and multicast.
   This document is NOT the product of an IETF working group nor is it
   a standards track document.
   It has not necessarily benefited from the widespread and in-depth
   community review that standards track documents receive.

Abstract

   This document describes SSP (Switch-Switch Protocol), an extension
   to MAPOS version 1. MAPOS is a multiple access protocol for
   transmission of network-protocol packets, encapsulated in High-Level
   Data Link Control (HDLC) frames, over SONET/SDH. In a MAPOS network,
   a SONET switch provides the multiple access capability to end nodes.
   SSP is a protocol of the Distance Vector family and provides unicast
   and broadcast/multicast routing in an environment with multiple
   SONET switches.

1. Introduction

   This document describes an extension to MAPOS version 1, the
   Switch-Switch Protocol (SSP), for routing both unicast and
   broadcast/multicast frames. MAPOS [1], Multiple Access Protocol over
   SONET (Synchronous Optical Network) / SDH (Synchronous Digital
   Hierarchy) [2][3][4][5], is a link layer protocol for transmission
   of HDLC frames over SONET/SDH. A SONET switch provides the multiple
   access capability to each node. SSP is a dynamic routing protocol
   designed for an environment where a MAPOS network segment spans
   multiple switches. It is a protocol of the Distance Vector family
   and provides both unicast and broadcast/multicast routing. First,
   this document outlines SSP. Next, it explains the unicast and
   broadcast/multicast routing algorithms. Then, it describes the SSP
   protocol in detail.

2. Constraints in Designing SSP

   SSP is a unified routing protocol supporting both unicast and
   broadcast/multicast. The former is based on the Distance Vector
   algorithm [6][7] and the latter on the spanning tree algorithm [8].
   In MAPOS version 1, a segment is assumed to contain only a small
   number of switches. Thus, unlike DVMRP (Distance Vector Multicast
   Routing Protocol) [8], TRPB (Truncated Reverse Path Broadcasting) is
   not supported for simplicity.
   This means that multicast frames are treated just the same as
   broadcast frames and are delivered to every node.

   In MAPOS version 1, there are two constraints on the design of the
   broadcast/multicast routing algorithm:

   (1) there is no source address field in MAPOS HDLC frames

   (2) there is no TTL (Time To Live) field in MAPOS HDLC frames to
       prevent forwarding loops.

   To cope with the first issue, the VRPB (Virtual Reverse Path
   Broadcast) algorithm is introduced. In VRPB, all broadcast and
   multicast frames are assumed to be generated by a node under a
   specific switch called the VSS (Virtual Source Switch). The VSS is
   the switch which has the smallest switch number in a MAPOS network.
   Each switch independently determines its place in the spanning tree
   rooted at the VSS. Whenever a switch receives a broadcast/multicast
   frame, it forwards the frame to all upstream and downstream switches
   except for the one which sent the frame to the local switch.

   To cope with the second issue, the forward delay timer is
   introduced. Even if a switch finds a new VSS, it suspends forwarding
   for a time period. This timer ensures that all the switches have
   consistent routing information and that they are synchronized after
   a topology change.

3. Unicast Routing in SSP

   This section describes the address structure of MAPOS version 1 and
   the SSP unicast routing based on it.

3.1 Address Structure of MAPOS version 1

   In a multiple switch environment, a node address consists of the
   switch number and the port number to which the node is connected. As
   shown in Figure 1, the address length is 8 bits and the LSB is
   always 1, which indicates the end of the address field. An MSB of 0
   indicates a unicast address. The switch and the port number fields
   are variable-length. In this document, a unicast address is
   represented as "0 <switch number> <port number>". Note that a port
   number includes the EA bit.

   An MSB of 1 indicates multicast or broadcast. In the case of
   broadcast, the address field contains all 1s (0xff in hex). In the
   case of multicast, the remaining bits indicate a group address. The
   switch number field is variable-length. A multicast address is
   represented as "1 <group address>".

         Switch Number (variable length)
          |
          |      +--- Port Number
          |      |
          V      V
        |<->|<------->|
      +-------------+-+
      | | | | | | | | |
      | |           |1|
      +-+-----------+-+
       ^             ^
       |             |
       |             +------- EA bit (always 1)
       |
       1 : broadcast, multicast
       0 : unicast

                       Figure 1 Address Format

   Figure 2 shows an example of a SONET LAN that consists of three
   switches. In this configuration, two bits of a node address are used
   to indicate the switch number. Node N1 is connected to port 0x03
   (000011 in binary) of switch S2, which is numbered 0x2. Thus, the
   node address is 01000011 in binary. Node N4 has the address 01101001
   in binary since the connected switch number is 0x3 and the port
   number is 0x09.

                    01000011
                    +------+
                    | node |
                    |  N1  |
                    +------+
   01000101            |0x03               |0x03         00101001
   +------+        +---+----+          +---+----+        +------+
   | node +--------+ SONET  +----------+ SONET  +--------+ node |
   |  N2  |    0x05| Switch |0x09  0x05| Switch |0x09    |  N3  |
   +------+        |   S2   |          |   S1   |        +------+
                   | (0x2)  |          | (0x1)  |
                   +---+----+          +---+----+
                       |0x07               |0x07
                       |                   |
                       |                   |0x03         01101001
                       |               +---+----+        +------+
                       +---------------+ SONET  +--------+ node |
                                   0x05| Switch |0x09    |  N4  |
                                       |   S3   |        +------+
                                       | (0x3)  |
                                       +---+----+
                                           |0x07

              Figure 2 Multiple SONET Switch Environment

3.2 Forwarding Unicast Frames

   Unicast frames are forwarded along the shortest path. For example, a
   frame from node N4 destined to N1 is forwarded by switches S3 and
   S2. These SONET switches forward an HDLC frame based on the
   destination switch number contained in the destination address.
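
   The address arithmetic of Section 3.1 can be checked with a short
   Python sketch. The function names and the switch_bits parameter (the
   number of bits used for the switch number, 2 in the Figure 2
   example) are illustrative assumptions, not part of MAPOS or SSP.

```python
def make_unicast_address(switch: int, port: int, switch_bits: int) -> int:
    """Pack a switch number and a port number into an 8-bit unicast address.

    Assumed layout (read from Figure 1): bit 7 is 0 for unicast, the next
    `switch_bits` bits hold the switch number, and the remaining low bits
    hold the port number, whose least significant bit is the EA bit (1).
    """
    if not 0 < switch_bits < 7:
        raise ValueError("switch_bits must be between 1 and 6")
    port_bits = 7 - switch_bits
    if not 0 <= switch < (1 << switch_bits):
        raise ValueError("switch number does not fit in switch_bits")
    if not 0 <= port < (1 << port_bits) or port & 1 == 0:
        raise ValueError("port must fit in the port field and have EA bit 1")
    return (switch << port_bits) | port


def split_address(address: int, switch_bits: int):
    """Return (is_group, switch, port) for an 8-bit address."""
    port_bits = 7 - switch_bits
    is_group = bool(address & 0x80)          # MSB 1: broadcast or multicast
    switch = (address >> port_bits) & ((1 << switch_bits) - 1)
    port = address & ((1 << port_bits) - 1)  # includes the EA bit
    return is_group, switch, port


# Node N1: switch 0x2, port 0x03, two switch-number bits.
n1 = make_unicast_address(0x2, 0x03, 2)
print(format(n1, "08b"))                     # 01000011
```

   With two switch-number bits, the same function reproduces the other
   node addresses shown in Figure 2.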

   Each switch keeps a routing table with entries for possible
   destination switches. An entry contains the subnet mask, the next
   hop to the adjacent switch along the shortest path to the
   destination, the metric measuring the total distance to the
   destination, and other parameters associated with the entry such as
   timers. For example, the routing table in switch S1 will be as shown
   in Table 1. The metric value 1 means that the destination switch is
   an adjacent switch. The value 16 means that it is unreachable.
   Although the values between 17 and 31 also mean unreachable, they
   are special values utilized for split horizon with poisoned reverse
   [8].

   +-------------+-----------+----------+--------+------------+
   | destination |  subnet   | next hop | metric |   other    |
   |   switch    |   mask    |   port   |        | parameters |
   +-------------+-----------+----------+--------+------------+
   |  01000000   | 11100000  | 00000101 |   1    |            |
   +-------------+-----------+----------+--------+------------+
   |  01100000   | 11100000  | 00000111 |   1    |            |
   +-------------+-----------+----------+--------+------------+

                Table 1 An Example of a Routing Table

   When a switch receives a unicast frame, it extracts the switch
   number from the destination address. If it equals the local switch
   number, the frame is sent to the local node through the port
   specified in the destination address. Otherwise, the switch searches
   its routing table for a matching destination switch number by
   masking the destination address with the corresponding subnet mask.
   If a matching entry is found, the frame is sent to an adjacent
   switch through the next hop port in the entry. Otherwise, it is
   silently discarded or sent to the control processor for error
   processing.

3.4 Protocol Overview

   This subsection describes an overview of the unicast routing
   protocol and its algorithm.
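
   As a point of reference for what follows, the forwarding procedure
   of Section 3.2 can be sketched in Python. This is a minimal
   illustration under stated assumptions: the Route class, the
   forward_unicast function, and the rule that entries with a metric of
   16 or more are skipped are hypothetical, not taken from the
   specification. The table contents are those of Table 1 for switch
   S1.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

UNREACHABLE = 16  # metric value meaning "unreachable"


@dataclass
class Route:
    destination: int    # destination switch address, e.g. 0b01000000
    mask: int           # subnet mask, e.g. 0b11100000
    next_hop_port: int  # local port leading to the adjacent switch
    metric: int


def forward_unicast(dest: int, local_switch: int, local_mask: int,
                    table: List[Route]) -> Tuple[str, Optional[int]]:
    """Decide what to do with a unicast frame addressed to `dest`.

    Returns ("local", port), ("forward", next_hop_port) or
    ("discard", None).
    """
    # Frame for a node attached to this switch: deliver on the port
    # number carried in the low bits of the destination address.
    if dest & local_mask == local_switch:
        return ("local", dest & ~local_mask & 0xFF)
    # Otherwise mask the destination and look for a matching entry.
    for route in table:
        # Assumption: unreachable entries are not used for forwarding.
        if route.metric >= UNREACHABLE:
            continue
        if dest & route.mask == route.destination:
            return ("forward", route.next_hop_port)
    return ("discard", None)


# Routing table of switch S1 from Table 1.
S1_TABLE = [
    Route(0b01000000, 0b11100000, 0b00000101, 1),  # to S2 via port 0x05
    Route(0b01100000, 0b11100000, 0b00000111, 1),  # to S3 via port 0x07
]
S1_ADDRESS, S1_MASK = 0b00100000, 0b11100000

print(forward_unicast(0b01000011, S1_ADDRESS, S1_MASK, S1_TABLE))
```

   For a frame addressed to N1 (01000011), the sketch selects port 0x05,
   the next hop toward S2.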

3.4.1 Route Exchange

   SSP is a distance vector protocol that establishes and maintains the
   routing table. In SSP, each switch sends a routing update message to
   every adjacent switch every FULL_UPDATE_TIME (10 seconds by
   default). The update message is a copy of the routing table, that
   is, the routes.

   When a switch receives an update message from an adjacent switch
   through a port, it adds the cost associated with the port, usually
   1, to every metric value in the message. The result is a set of new
   metrics from the receiving switch to the destination switches. Next,
   it compares the new metrics with those of the corresponding entries
   in the existing routing table. A smaller metric means a better
   route. Thus, if the new metric is smaller than the existing one, the
   entry is updated with the new metric and next hop. The next hop is
   the port from which the update message was received. Otherwise, the
   entry is left unchanged, with one exception: if the existing next
   hop is the same as the new one, the metric is updated regardless of
   its value. If no corresponding route is found, a new route entry is
   created.

3.4.2 Route Expiration

   Assume a route to R is advertised by a neighboring switch S. If no
   update message has been received from switch S for the period
   FULL_UPDATE_TIME * 3 (30 seconds by default), or the route is
   advertised with metric 16 by switch S, the route to R is marked as
   unreachable by setting its metric to 16. In other words, the route
   to R continues to be advertised for up to 30 seconds even if it is
   not refreshed.

   To process this, each routing table entry has an EXPIRATION_TIMER
   (30 seconds by default, that is, FULL_UPDATE_TIME * 3). If another
   switch advertises a route to R, it replaces the unreachable route.
   Even if a route is marked unreachable, the entry is kept in the
   routing table for the period of FULL_UPDATE_TIME * 3.
   This enables the switch to notify its neighbors of the unreachable
   route by sending update messages with metric 16. To process this,
   each routing table entry has a garbage collection timer GC_TIMER (30
   seconds by default). The entry is deleted on expiration of the
   timer. Figure 3 shows this transition.

        The Last Update           Expiration     Garbage Collection
               |                       |                       |
   Routing     V   T       T       T   V   T       T       T   V
   Table       +-------+-------+-------+-------+-------+-------X
   Entry              metric < 16      |      metric = 16      |

               ----------------------->|---------------------->|
                   EXPIRATION_TIMER            GC_TIMER
                                                Stop Advertising
                                                               |
   Advertised                                                  V
   Metric      -- metric < 16 ---------+-- metric = 16 --------X

       T: FULL_UPDATE_TIME

                     Figure 3. Route Expiration

3.4.3 Slow Convergence Prevention

   To prevent slow convergence of routing information, two techniques
   are employed: split horizon with poisoned reverse, and triggered
   update.

      Sn <------------- S3 <- S2 <- S1

                (i) Before Outage

                           ->
      Sn <--    X    -- S3 <- S2 <- S1

                (ii) After Outage

             Figure 4 An Example of Slow Convergence

   Figure 4 shows an example of slow convergence [6]. In (i), three
   switches, S1, S2, and S3, are assumed to have a route to Sn. In
   (ii), the connection to Sn has disappeared because of an outage, but
   S2 continues to advertise the route since there is no means for S2
   to detect the outage immediately and it still has the route to Sn in
   its routing table. Thus, S3 mistakenly concludes that S2 has the
   best route to Sn and that S2 is the next hop. This results in a
   transitive loop between S2 and S3. S2 and S3 increment the metric of
   the route to Sn every time they advertise the route, and the loop
   continues until the metric reaches 16. To suppress the slow
   convergence problem, split horizon with poisoned reverse is used.

   In split horizon with poisoned reverse, a route is advertised as
   unreachable to the next hop.
   The advertised metric is the received metric value plus 16. For
   example, in Figure 4, S2 advertises the route to Sn with an
   unreachable metric only to S3. Thus, S3 never considers S2 to be the
   next hop to Sn. This ensures fast convergence when a route
   disappears.

   Another technique, triggered update, forces a switch to send an
   immediate update instead of waiting for the next periodic update
   when it detects a local port failure, or when it receives a message
   saying that a route has become unreachable or that its metric has
   increased. This makes the convergence faster.

4. Broadcast/multicast Routing in SSP

   This section explains the VRPB algorithm and outlines the
   broadcast/multicast routing protocol.

4.1 Virtual Reverse Path Broadcast/Multicast Algorithm

   SSP provides broadcast/multicast routing based on a spanning tree
   algorithm. As described in Section 2, the routing is based on the
   VRPB (Virtual Reverse Path Broadcast) algorithm. In VRPB, each
   switch assumes that all broadcast and multicast frames are generated
   by a specific switch, the VSS (Virtual Source Switch). Thus, unlike
   DVMRP, a MAPOS network has only one spanning tree at any given time.

   The frames are forwarded along the reverse path by computing the
   shortest path from the VSS to all possible recipients. The VSS is
   the switch which has the lowest switch number in the network.
   Because the routing table contains all the unicast destination
   addresses including the switch numbers, each switch can identify the
   VSS independently by searching for the smallest switch number in its
   unicast routing table.

   In Figure 2, switch S1 is the VSS. Each switch determines its place
   in the spanning tree, relative to the VSS, and which of its ports
   are on the shortest path tree. Thus, the spanning tree is as shown
   in Figure 5. Except for the VSS, each switch has one upstream port
   and zero or more downstream ports.
   The VSS has no upstream port, since it is the root of the spanning
   tree. In Figure 2, switch S2's upstream port is port 0x09 and it has
   no downstream port.

                    S1 (VSS)
                   /  \
                  /    \
                 /      \
               S2        S3

                 Figure 5 VRPB Spanning Tree

   When a switch receives a broadcast/multicast frame, it forwards the
   frame to its upstream switch, its downstream switches, and its
   directly connected nodes. However, it does not forward the frame to
   the switch which sent the frame to it. For that purpose, a
   bit-mapped broadcast/multicast routing table may be employed. The
   broadcast/multicast routing process marks all the bits corresponding
   to the ports to which frames should be forwarded. The forwarding
   process refers to the table and broadcasts a frame to all the ports
   whose corresponding bit is marked.

4.2 Forwarding Broadcast/multicast Frames

   When a switch forwards a broadcast/multicast frame, (1) it first
   determines the VSS by referring to its unicast routing table. Then,
   (2) it refers to its broadcast/multicast routing table corresponding
   to the VSS. A cache may be used to reduce the search overhead. (3)
   Based on the routing table, the switch forwards the frame.

   Figure 6 shows an example of S2's broadcast/multicast routing table
   for the VSS S1. It is a bit map table and each bit corresponds to a
   port. The value 1 indicates that frames should be forwarded to a
   node or a switch through the port. If no bit is marked, the frame is
   silently discarded. In the example of Figure 6, port 0x09 is the
   upstream port to its VSS, that is, S1. The other ports, 0x05 and
   0x03, are the paths to nodes N2 and N1, respectively.
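
   The three forwarding steps above can be sketched in Python. The
   names find_vss and egress_ports, the integer bitmap encoding (bit i
   stands for port number 2*i + 1), and the filtering of unreachable
   routes are assumptions made for illustration; the specification only
   requires one bit per port.

```python
UNREACHABLE = 16


def find_vss(routes, local_switch):
    """Return the lowest switch number known to this switch.

    `routes` maps a destination switch number to its unicast metric.
    Assumption: the local switch is a candidate too, and routes whose
    metric is 16 or more are not considered.
    """
    reachable = [sw for sw, metric in routes.items() if metric < UNREACHABLE]
    return min(reachable + [local_switch])


def egress_ports(bitmap, ingress_port):
    """List the ports a broadcast/multicast frame is copied to.

    Every marked port is used except the one the frame arrived on.
    Bit i of `bitmap` stands for port number 2*i + 1 (0x01, 0x03, ...).
    """
    ports = []
    for index in range(8):
        port = 2 * index + 1
        if bitmap & (1 << index) and port != ingress_port:
            ports.append(port)
    return ports


# S2's table for VSS S1: ports 0x09, 0x05 and 0x03 are marked.
S2_BITMAP = 0b00010110

print(find_vss({1: 1, 3: 1}, local_switch=2))        # 1
print([hex(p) for p in egress_ports(S2_BITMAP, 0x05)])
```

   For a frame arriving from N2 on port 0x05, the sketch selects ports
   0x03 (node N1) and 0x09 (upstream, toward S1). An empty result
   corresponds to the frame being silently discarded.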

        0F  0D  0B  09  07  05  03  01     --- port number
      +---+---+---+---+---+---+---+---+
      | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 |    --- 1: forward
      +---+---+---+---+---+---+---+---+        0: inhibit

          Figure 6 Broadcast/Multicast Routing Table of S2

4.3 Forwarding Path Examples

   Assume that a broadcast frame is generated by N2 in Figure 2. The
   frame is received by S2.

   Then, S2 passes it to all the connected nodes except for the source
   N2, that is, only to N1. At the same time, it also forwards the
   frame to all its upstream and downstream switches. Since S2 has no
   downstream switch, S2 forwards the frame to S1 through its upstream
   port 0x09.

   S1 is the VSS and it passes the frame to all the local nodes, that
   is, only to N3. Since it has no upstream switch and S2 is the switch
   which sent the frame to S1, the frame is forwarded only to the
   downstream switch S3.

   S3 passes the frame to its local node, N4. Since S3 has only an
   upstream port and the frame was received through that port, S3 does
   not forward the frame to any switch.

   The resulting path is shown in Figure 7. Although this is not the
   optimal path, VRPB at least ensures that broadcast/multicast frames
   are delivered to all the nodes without a loop. Figures 8 and 9 show
   the forwarding paths for frames generated by a node under S1 and S3,
   respectively.

                         +-> N3
                         |
      N2 -> S2 +-> S1 +-> S3 -> N4
               |
               +-> N1

                Figure 7 Forwarding Path from N2

                         +-> N1
                         |
      N3 -> S1 +-> S2 +-> N2
               |
               +-> S3 --> N4

                Figure 8 Forwarding Path from N3

                      +-> N3
                      |
      N4 -> S3 +-> S1 +-> S2 +-> N1
                             |
                             +-> N2

                Figure 9 Forwarding Path from N4

4.4 Suppressing Routing Loop

   To suppress transitive routing loops, forward delay is employed. A
   switch suspends broadcast/multicast forwarding for a period after a
   new VSS is found in the routing table.
   This prevents transitive routing loops by waiting for all the
   switches to have the same routing information and become
   synchronized. In addition to controlling the sending of frames by
   forward delay, another mechanism is employed to prevent transitive
   routing loops by controlling the reception of frames. That is,
   broadcast/multicast frames received through ports other than the
   upstream and downstream ports are discarded.

4.5 Upstream Switch Discovery

   The upstream port is determined by the shortest reverse path to the
   VSS. It is identified by referring to the next hop port of the route
   to the VSS in the local unicast routing table. When a new next hop
   to the VSS is discovered, the bit corresponding to the old next hop
   port is cleared, and the bit corresponding to the new one is marked
   as the upstream port in the broadcast/multicast routing table.

4.6 Downstream Switch Discovery

   To determine the downstream ports, split horizon with poisoned
   reverse is employed. When a switch receives, through a port, a route
   whose metric has been poisoned by the split horizon processing
   described in Section 3.4.3, the port is considered to be a
   downstream port. In Figure 2, S1 is the VSS and the route
   information is sent back from S2 to S1 with an unreachable metric
   based on split horizon with poisoned reverse. Thus, S1 knows that S2
   is one of its downstream switches.

4.7 Downstream Port Expiration

   When a poison reversed packet is newly received from a port, the
   local switch knows that a new downstream switch has appeared. Then,
   it marks the bit corresponding to the port and starts
   FORWARD_DELAY_TIMER (30 seconds by default, that is,
   FULL_UPDATE_TIME * 3) for the port. The forwarding of
   broadcast/multicast frames to the port is prohibited until the timer
   expires.
   Every time the local switch receives a poison reversed packet
   through a port, it initializes PORT_EXPIRATION_TIMER (30 seconds by
   default, that is, FULL_UPDATE_TIME * 3) corresponding to the port. A
   continuous loss of poison reversed packets or a failure of the
   downstream port results in expiration of PORT_EXPIRATION_TIMER, and
   the corresponding bit is cleared.

             First Update                 Last Update
                   |                           |
                   V T   T   T   T   T   T   T V
                   +---+---+---+---+---+---+---+---+---+---+---+---+---
   A bit in
   the routing       0   0   0   1   1   1   1   1   1   1   0   0   0
   table                       ^                           ^
                   <---------->|               <---------->|
                         ^       route up            ^       route down
                         |                           |
                   FORWARD_DELAY              PORT_EXPIRATION

       T: FULL_UPDATE_TIME

                     Figure 10. Port Expiration

   When a downstream switch discovers another best path to the VSS or a
   new VSS, it stops split horizon with poisoned reverse and sends
   ordinary update messages. Whenever the local switch receives an
   ordinary update message from its downstream switch, it SHOULD
   immediately clear the corresponding bit in the routing table and
   stop forwarding broadcast/multicast frames to that port.

4.8 Node Discovery

   When an NSP [9] packet requesting a node address is received from a
   port, the local switch considers that a new node has been connected,
   and marks the corresponding bit in the broadcast/multicast routing
   table. When the local switch detects that the port went down as
   described in [9], it clears the corresponding bit.

4.9 Invalidating The Broadcast/multicast Routing Table

   When a new VSS is discovered or when the VSS becomes unreachable,
   the entire broadcast/multicast routing table is invalidated. That
   is, a change of the upstream port affects the entire
   broadcast/multicast routing. However, a change of a downstream port
   does not affect forwarding to other downstream ports, the upstream
   port, or nodes.

5. Detailed Protocol Operation

   This section explains the SSP packet format and protocol processing
   in detail.

5.1 Packet Format

   This subsection describes the packet format and the encapsulation of
   packets in HDLC frames.

5.1.1 Packet Format and Its Encapsulation

   The SSP packet format is based on RIP [6] and its successor, RIP2
   [7]. Figure 11 shows the packet format. An SSP packet is
   encapsulated in the information field of a MAPOS HDLC frame. The
   HDLC protocol field of SSP is 0xFE05 in hex as defined by the "MAPOS
   Version 1 Assigned Numbers" [10]. The packet is sent encapsulated in
   a unicast frame with the destination address 0000 0001, which
   indicates the control processor of an adjacent switch.

(MSB)                                                       (LSB)
 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ------
|    Command    |    Version    |            unused             | header
+---------------+---------------+-------------------------------+ ------
|   Address Family Identifier   |             All 0             |
+-------------------------------+-------------------------------+
|                         HDLC Address                          | an SSP
+---------------------------------------------------------------+ route
|                          Subnet Mask                          | entry
+---------------------------------------------------------------+
|                             All 0                             |
+---------------------------------------------------------------+
|                            Metric                             |
+-------------------------------+-------------------------------+ ------
|   Address Family Identifier   |             All 0             |

                    Figure 11 SSP packet format

   The maximum packet size is 512 octets. The first four octets are the
   SSP header. The remainder of the message is composed of 1 - 25 route
   entries. Each entry is 20 octets long.

5.1.2 SSP Header

   The SSP header consists of a command field and a version field. The
   command field is one octet long and holds one of the following
   values:

   1 - request     A request to send all or part of the SSP routing
                   table.

   2 - response    A message containing all, or a part of, the sender's
                   SSP routing table. This message may be sent in
                   response to a request, or it may be an update
                   message generated by the sender.

   The Version field indicates the version of SSP being used. The
   current version number is 1.

5.1.3 SSP Route Entries

   Each entry has an address family identifier. It indicates an
   attribute of the entry. The SSP routing protocol uses 2 as its
   identifier by default. The identifier 0 indicates unspecified. This
   value is used when a switch requests other switches to send the
   entire SSP routing table. A recipient of the message SHOULD ignore
   all entries with an unknown value.

   The HDLC address is a destination address. It may be a switch
   address or a node address. The subsequent subnet mask is applied to
   the HDLC address to yield the switch number portion. The field is 4
   octets long and the address is placed in the least significant
   position.

   Metric indicates the distance to the destination node, that is, how
   many switches a message must go through en route to the destination
   node. The metric field must contain a value between 1 and 31. The
   metric of 16 indicates that the destination is not reachable and is
   ignored by recipients. The values between 17 and 31 are utilized for
   split horizon with poisoned reverse and also mean unreachable. The
   metric 0 indicates the local switch itself.

5.2 Routing Table

   Every switch has an SSP routing table. The table is a collection of
   route entries, one for every destination. An entry consists of the
   following information:

   (1) destination : A unicast destination address.

   (2) subnet mask : A mask to extract the switch address by applying
   bitwise AND with the destination address.

   (3) next hop port : The local port number connected to the adjacent
   switch along the path to the destination.

   (4) metric : Distance to the destination node. The metric of an
   adjacent switch is 1 and that of the local switch is 0.

   (5) timers for unicast routing : Timers associated with unicast
   routing such as EXPIRATION_TIMER and GC_TIMER.

   (6) flags : Various flags associated with the route, such as a route
   change flag to indicate that the route has changed recently or has
   timed out.

   (7) bit map routing table for broadcast/multicast : Each bit
   corresponding to a port to an upstream or a downstream switch of the
   spanning tree is marked, in addition to the ports to end nodes.
   Broadcast/multicast frames are forwarded only through those ports
   with their corresponding bit set. Since only one spanning tree
   exists at a time in a network, each route entry does not necessarily
   have to have this field.

   (8) timers for broadcast/multicast routing : Timers associated with
   broadcast/multicast routing such as FORWARD_DELAY_TIMER and
   PORT_EXPIRATION_TIMER. These timers are prepared for each bit of the
   broadcast/multicast routing table.

5.3 Sending Routing Messages

5.3.1 Packet Construction

   Because of split horizon with poisoned reverse, a routing message
   differs depending on the adjacent switch to which the message is
   being sent. The upstream switch of a route, that is, its next hop,
   receives a message which contains the corresponding route with a
   metric between 17 and 31. Switches that are not the upstream switch
   of any route all receive the same message. Here, we assume that a
   packet for a routing message is constructed for an adjacent switch
   which is connected through the local port N.

   First, set the version field to 1, the current SSP version. Then,
   set the command to "response". Set the fields which are supposed to
   be zero to zero. Next, start filling in entries.

   To fill in the entries, perform the following for each route.
The 658 destination HDLC address, netmask, and its metric are put into the 659 entry in the packet. Routes must be included in the packet even if 660 their metrics are unreachable(16). If the next hop port is N, 16 is 661 added to the metric for split horizon with poisoned reverse. 663 Recall that the maximum packet size is 512 bytes. When there is no 664 more space in a packet, send the current message and start a new one. 665 If a triggered update is being generated, only entries whose route 666 change flags are set need be included. 668 5.3.2 Sending update 669 Sending update may be triggered in any of the following ways; 671 (1) Initial Update 673 When a switch first comes up, it SHOULD send to all adjacent 674 switches a request asking for their entire routing tables. The 675 destination address is 00000001. When a port comes on-line, the 676 request packet is sent to the port. The packet, requesting the 677 entire routing table, MUST have at least an entry with the address 678 family identifier 0 meaning unspecified. 680 When a switch receives a request packet, it first checks the version 681 number of the SSP header. If it is not 1, the packet is silently 682 discarded. Otherwise, the address family identifier is examined. If 683 the value is 0, the entire SSP routing table is returned in one or 684 more response packets destined to 00000001. Otherwise, the request 685 is silently discarded. Although the original RIP specification 686 defines the partial routing table request, SSP routing protocol 687 omits it for the sake of simplicity. 689 (2) Periodic Update 691 Every switch participating in the routing process sends an update 692 message (response message) to all its neighbor switches once every 693 FULL_UPDATE_TIME (10 seconds). For the periodic update, a response 694 packet(s) is used. The destination address is always 00000001. An 695 update message contains the entire SSP routing table. The maximum 696 packet size is 512byte. 
Thus, an update message may require several 697 packets to be packed. 699 (3) Triggered Update 701 When a route in the unicast routing table is changed or a local port 702 goes down, the switch advertises a triggered update packet without 703 waiting for the full update time. The difference between triggered 704 update and the other update is that triggered updates do not have to 705 include the entire routing table. Only changed entries should be 706 included. Triggered update may be suppressed if a regular periodic 707 update is due. 709 Note that when a route is advertised as unreachable (metric 16) by 710 an adjacent switch, update process is triggered as well as 711 expiration of the route in the local switch. 713 (4) On Termination 715 When a switch goes down, it is desirable to advertise all the routes 716 with metric 16, that is, unreachable. 718 5.4 Receiving Routing Messages 719 When a switch receives an update, it first checks the version number. 720 If it is not 1, the update packet is silently discarded. Otherwise, 721 it processes the entries in it one by one. 723 For each entry, the address family identifier is checked. If it is 724 not 2, the entry is ignored. Otherwise, the metric is checked. The 725 value should be between 0 and 31. An entry with illegal metric is 726 ignored. Next, the HDLC address and the subnet mask is checked. An 727 entry with an invalid address such as broadcast is ignored. If the 728 entry passed all these validation checks, it is processed according 729 to the following steps; 731 Step 1 - Process Poisoned Reverse 733 If the metric value is between 0 and 16, it is an unicast 734 information. Go ahead to Step 2. 736 If the metric value is between 17 and 31, it indicates poisoned 737 reverse, that the local switch has been chosen as the next hop for 738 the route. 
However, if the corresponding entry is not included in 739 the current routing table or the message is from a port connected to 740 its upstream switch, the message is illegal -- ignore it and return 741 to Step 1 to process the next entry. Otherwise, 743 (1) Initialize the PORT_EXPIRATION_TIMER corresponding to the 744 downstream port. 745 (2) Operate the FORWARD_DELAY_TIMER as follows; 746 (2-1) If the broadcast/multicast forwarding was already enabled, 747 go to (3). 748 (2-2) If the FORWARD_DELAY_TIMER corresponding to the 749 downstream port was already started, increment the 750 timer. If the timer expires, mark the bit in the 751 broadcast/multicast routing table corresponding to the 752 port and stop the timer. 753 (2-2) Otherwise, start the FORWARD_DELAY_TIMER. 754 (3) Return to Step 1 to process the next entry. 756 Step 2 - Process Unicast Routing Information 758 First, add the cost associated with the link, usually 1, to the 759 metric. If the result is greater than 16, 16 is used. Then, look up 760 the unicast routing table for the corresponding entry. There are two 761 cases. 763 Case 1 no corresponding entry is found 765 If the new metric is 16, return to step 1 to process the next entry. 766 Otherwise, 767 (1) Create a new route entry in the routing table 768 (2) Initialize EXPIRATION_TIMER and GC_TIMER 769 (3) The port corresponding to the new route is the next_hop port 770 for the route. Thus, mark the bit in the broadcast/multicast 771 routing table corresponding to the new next_hop port and start 772 FORWARD_DELAY_TIMER. If this new route is for the switch with 773 the minimum switch number, select it as the VSS and use its 774 broadcast/multicast routing table. (See NOTE 1.) 775 (4) Set the route change flag and invoke triggered update process 776 (5) Return to step 1 to process the next entry. 778 [NOTE 1] 779 There are two implementations; 780 (1) Prepare a spanning tree for each route and use 781 only one corresponding to the current VSS. 
In 782 this case, each unicast route entry has a 783 broadcast/unicast routing table. 784 (2) Prepare only one spanning tree corresponding to the 785 current VSS. In this case, a switch has only one 786 broadcast/multicast routing table. 787 In this document, the former is assumed. 789 Case 2. A corresponding entry is found 791 In this case, the update message is processed differently according 792 to the new metric value. 794 (a) new_metric < 16 & new_metric > current_metric 796 (1)If and only if the update is from the same port(next_hop 797 port) as the existing one, 798 (1-1) Update the entry 799 (1-2) Initialize EXPIRATION_TIMER and GC_TIMER 801 (2) If the corresponding bit to the port, which the update 802 message is received, is marked in the broadcast/multicast 803 routing table, clear the bit. 804 (3) Return to Step 1 and process the next entry. 806 (b) new_metric < 16 & new_metric < current_metric 808 (1) Update the entry and clear the bit in the 809 broadcast/multicast routing table corresponding to the old 810 next_hop port. 811 (2) Initialize EXPIRATION_TIMER, GC_TIMER, and PORT_EXPIRATION_TIMER 812 for the new next_hop port. 814 (3) Mark a bit in the broadcast/multicast routing table 815 corresponding to the new next_hop port and start 816 FORWARD_DELAY_TIMER. 817 (4) Set the route change flag and invoke triggered update with 818 poisoned reverse for the new next_hop. 819 (5) Return to Step 1 to process the next entry. 821 (c) new_metric < 16 & new_metric = current_metric 823 If a new route with the same metric value as the existing 824 routing table entry is received, use the old one as follows; 826 (1) If the new next hop is equal to the current one, initialize 827 EXPIRATION_TIMER and GC_TIMER. Otherwise, ignore this update. 828 (2) If the bit corresponding to the port, from which the update 829 message was received, is marked in the broadcast/multicast 830 routing table, clear the bit. 831 (3) Return to Step 1 to process the next entry. 

833       (d) new_metric = 16 & new next_hop = current next_hop

835         If the current metric is not equal to 16, this is new
836         unreachability information. Then,
837         (1) Update the entry and clear the bit in the
838             broadcast/multicast routing table corresponding to the old
839             next_hop port.
840         (2) If this route is for the current VSS, select a new VSS from
841             the valid routing table entries. Valid means that the
842             destination is reachable.
843         (3) Set the route change flag and invoke triggered update
844             process to notify the unreachable route.
845         Otherwise,
846             do nothing and return to Step 1 to process the next entry.

848       (e) new_metric = 16 & new next_hop /= current next_hop

850         (1) If the bit corresponding to the port from which the
851             update message was received is marked in the
852             broadcast/multicast routing table, clear the bit.
853         (2) Return to Step 1 to process the next entry.

855  5.5 Timers

857     The timer routine increments the following timers and executes the
858     process associated with each timer when it expires.

860     (1) EXPIRATION_TIMER and GC_TIMER

862     The EXPIRATION_TIMERs and GC_TIMERs of each entry in the unicast
863     routing table are incremented every FULL_UPDATE_TIME (10 seconds by
864     default). When an EXPIRATION_TIMER expires, the metric is changed to
865     unreachable (16), the update process is triggered, and GC_TIMER is
866     started. When a GC_TIMER expires, the entry is deleted from the
867     local routing table. EXPIRATION_TIMER and GC_TIMER are cleared every
868     time a switch receives a routing update.

870     (2) FORWARD_DELAY_TIMER

872     FORWARD_DELAY_TIMER is completely handled in the receive process and
873     has no relation to the timer routine.

875     (3) PORT_EXPIRATION_TIMER

877     PORT_EXPIRATION_TIMERs associated with each bit in the
878     broadcast/multicast routing table are incremented every
879     FULL_UPDATE_TIME (10 seconds by default). When the timer expires,
880     the corresponding downstream switch is considered to be down and the
881     corresponding bit in the broadcast/multicast routing table is
882     cleared. This timer is cleared by the receive process every time a
883     poisoned reverse packet is received from the corresponding switch.

885  6. Further Considerations on Implementation

887  6.1 Port State

889     A switch assumes that every port is connected to a switch initially.
890     Thus, it sends update packets to every port. When a node is connected
891     to a port, the switch recognizes it by receiving an NSP request
892     packet, and stops sending SSP packets to the port. Whenever a switch
893     detects a connection failure such as loss of signal or
894     out-of-synchronization, it should clear the internal state table
895     corresponding to the port.

897  6.2 Half-Way Connection Problem

899     A port consists of two channels, transmit and receive. Although it is
900     easy for a node or a switch to detect a receive channel failure, a
901     transmit channel failure may not be detected, causing a half-way
902     connection. This results in a black hole.

904     Thus, whenever a switch receives an SSP update packet from a port, it
905     SHOULD check the status of the corresponding transmit channel.
906     SONET/SDH has a feedback mechanism for that purpose. The status of
907     the local transmit channel received at the remote end can be sent
908     back utilizing the overhead part, FEBE (Far End Block Error) and
909     FERF (Far End Receive Failure), of the corresponding receive channel.
910     If the signals indicate that the transmit channel has a problem, the
911     SSP packet received from the remote end should be silently discarded.
912     However, some SONET/SDH services do not provide path overhead
913     transparency.

915     Although SONET/SDH APS (Automatic Protection Switching) can be
916     utilized to switch service from a failed line to a spare line, the
917     function is out of scope of this protocol.

919  7. Security Considerations

921     Security issues are not discussed in this memo.

923  References

925     [1]   K. Murakami and M. Maruyama, "MAPOS - Multiple Access Protocol
926           over SONET/SDH Version 1," May 1997.

928     [2]   CCITT Recommendation G.707: Synchronous Digital Hierarchy Bit
929           Rates, 1990.

931     [3]   CCITT Recommendation G.708: Network Node Interface for Synchronous
932           Digital Hierarchy, 1990.

934     [4]   CCITT Recommendation G.709: Synchronous Multiplexing Structure,
935           1990.

937     [5]   American National Standard for Telecommunications - Digital
938           Hierarchy - Optical Interface Rates and Formats Specification,
939           ANSI T1.105-1991.

941     [6]   C. Hedrick, "Routing Information Protocol," STD 34, RFC 1058,
942           Rutgers University, June 1988.

944     [7]   G. Malkin, "RIP Version 2 - Carrying Additional Information,"
945           RFC 1723, Xylogics, Inc., November 1994.

947     [8]   T. Pusateri, "Distance Vector Multicast Routing Protocol,"
948           draft-ietf-idmr-dvmrp-v3-03, September 1996.

950     [9]   K. Murakami and M. Maruyama, "A MAPOS version 1 Extension -
951           Node Switch Protocol," May 1997.

953     [10]  M. Maruyama and K. Murakami, "MAPOS Version 1 Assigned Numbers,"
954           May 1997.

956  Acknowledgements

958     The authors would like to acknowledge the contributions and
959     thoughtful suggestions of John P. Mullaney, Clark Bremer, Masayuki
960     Kobayashi, Paul Francis, Toshiaki Yoshida, Takahiro Sajima, and
961     Satoru Yagi.

963  Authors' Address
964       Ken Murakami
965       NTT Software Laboratories
966       3-9-11, Midori-cho
967       Musashino-shi
968       Tokyo 180, Japan
969       E-mail: murakami@ntt-20.ecl.net

971       Mitsuru Maruyama
972       NTT Software Laboratories
973       3-9-11, Midori-cho
974       Musashino-shi
975       Tokyo 180, Japan
976       E-mail: mitsuru@ntt-20.ecl.net