Network Working Group                                      J. Hadi Salim
Internet-Draft                                         Mojatatu Networks
Intended status: Standards Track                                K. Ogawa
Expires: May 27, 2010                                    NTT Corporation
                                                       November 23, 2009


      SCTP based TML (Transport Mapping Layer) for ForCES protocol
                      draft-ietf-forces-sctptml-07

Abstract

   This document defines the SCTP based TML (Transport Mapping Layer)
   for the ForCES protocol. It explains the rationale for choosing
   SCTP (Stream Control Transmission Protocol) and also describes how
   this TML addresses all the requirements defined by the ForCES
   protocol.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 27, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Definitions
   2.  Introduction
   3.  Protocol Framework Overview
     3.1.  The PL
     3.2.  The TML
       3.2.1.  TML and PL Interfaces
       3.2.2.  TML Parameterization
   4.  SCTP TML overview
     4.1.  Rationale for using SCTP for TML
     4.2.  Meeting TML requirements
       4.2.1.  SCTP TML Channels
       4.2.2.  Satisfying TML Requirements
   5.  SCTP TML Channel Work
   6.  IANA Considerations
   7.  Security Considerations
     7.1.  IPsec Usage
       7.1.1.  SAD and SPD setup
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Suggested SCTP TML Channel Work Implementation
     A.1.  SCTP TML Channel Initialization
     A.2.  Channel work scheduling
       A.2.1.  FE Channel work scheduling
       A.2.2.  CE Channel work scheduling
     A.3.  SCTP TML Channel Termination
     A.4.  SCTP TML NE level channel scheduling
   Appendix B.  Suggested Service Interface
     B.1.  TML Boot-strapping
     B.2.  TML Shutdown
     B.3.  TML Sending and Receiving
   Authors' Addresses

1.  Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

   The following definitions are taken from [RFC3654] and [RFC3746]:

   Logical Functional Block (LFB) -- A template that represents a
   fine-grained, logically separate aspect of FE processing.

   ForCES Protocol -- The protocol used at the Fp reference point in the
   ForCES Framework in [RFC3746].

   ForCES Protocol Layer (ForCES PL) -- A layer in the ForCES
   architecture that embodies the ForCES protocol and the state transfer
   mechanisms as defined in [I-D.ietf-forces-protocol].

   ForCES Protocol Transport Mapping Layer (ForCES TML) -- A layer in
   the ForCES protocol architecture that specifically addresses the
   protocol message transportation issues, such as how the protocol
   messages are mapped to different transport media (like SCTP, IP, TCP,
   UDP, ATM, Ethernet, etc.), and how to achieve and implement
   reliability, security, etc.

2.  Introduction

   The ForCES (Forwarding and Control Element Separation) working group
   in the IETF defines the architecture and protocol for separation of
   Control Elements (CE) and Forwarding Elements (FE) in Network
   Elements (NE) such as routers. [RFC3654] and [RFC3746] respectively
   define architectural and protocol requirements for the communication
   between CE and FE. The ForCES protocol layer specification
   [I-D.ietf-forces-protocol] describes the protocol semantics and
   workings. The ForCES protocol layer operates on top of an
   inter-connect hiding layer known as the TML. The relationship is
   illustrated in Figure 1.

   This document defines the SCTP based TML for the ForCES protocol
   layer.
   It also addresses all the requirements for the TML, including
   security, reliability, etc., as defined in
   [I-D.ietf-forces-protocol].

3.  Protocol Framework Overview

   The reader is referred to the Framework document [RFC3746], and in
   particular sections 3 and 4, for an architectural overview and
   explanation of where and how the ForCES protocol fits in.

   There is some content overlap between the ForCES protocol
   specification [I-D.ietf-forces-protocol] and this section (Section 3)
   in order to provide basic context to the reader of this document.

   The ForCES protocol layering consists of two pieces: the PL and the
   TML. This is depicted in Figure 1.

          +----------------------------------------------+
          |                    CE PL                     |
          +----------------------------------------------+
          |                    CE TML                    |
          +----------------------------------------------+
                                 ^
                                 |
                      ForCES PL  |messages
                                 |
                                 v
          +----------------------------------------------+
          |                    FE TML                    |
          +----------------------------------------------+
          |                    FE PL                     |
          +----------------------------------------------+

    Figure 1: Message exchange between CE and FE to establish an NE
                               association

   The PL is in charge of the ForCES protocol. Its semantics and
   message layout are defined in [I-D.ietf-forces-protocol]. The TML is
   necessary to connect two ForCES end-points as shown in Figure 1.

   Both the PL and TML are standardized by the IETF. While only one PL
   is defined, different TMLs are expected to be standardized. The TML
   at each of the nodes (CE and FE) is expected to be of the same
   definition in order to inter-operate.

   When transmitting from a ForCES end-point, the PL delivers its
   messages to the TML. The TML then delivers the PL message to the
   destination TML(s).

   On reception of a message, the TML delivers the message to its
   destination PL (as described in the ForCES header).

3.1.  The PL

   The PL is common to all implementations of ForCES and is standardized
   by the IETF [I-D.ietf-forces-protocol]. The PL is responsible for
   associating an FE or CE to an NE. It is also responsible for tearing
   down such associations.

   An FE may use the PL to asynchronously send packets to the CE. The
   FE may redirect via the PL (from outside the NE) various control
   protocol packets (e.g., OSPF) to the CE. Additionally, the FE
   delivers various events that the CE has subscribed to via the PL
   [I-D.ietf-forces-model].

   The CE and FE may interact synchronously via the PL. The CE issues
   status requests to the FE and receives responses via the PL. The CE
   also configures the components of the associated FE's LFBs using the
   PL [I-D.ietf-forces-model].

3.2.  The TML

   The TML is responsible for transport of the PL messages.
   [I-D.ietf-forces-protocol] section 5 defines the requirements that
   need to be met by a TML specification. The SCTP TML specified in
   this document meets all the requirements specified in
   [I-D.ietf-forces-protocol] section 5. Section 4.2.2 describes how
   the TML requirements are met.

3.2.1.  TML and PL Interfaces

   There are two interfaces to the PL and TML. The specification of
   these interfaces is out of scope for this document, but the
   interfaces are introduced to show how they fit into the architecture
   and to summarize the functions they provide. The first interface is
   between the PL and TML and the other is the CE Manager (CEM)/FE
   Manager (FEM) [RFC3746] interface to both the PL and TML. Both
   interfaces are shown in Figure 2.

                             +----------------------------+
                             |  +----------------------+  |
                             |  |                      |  |
            +---------+      |  |          PL          |  |
            |         |      |  +----------------------+  |
            |FEM/CEM  |<---->|             ^              |
            |         |      |             |              |
            +---------+      |             |TML API       |
                             |             |              |
                             |             V              |
                             |  +----------------------+  |
                             |  |                      |  |
                             |  |         TML          |  |
                             |  |                      |  |
                             |  +----------------------+  |
                             +----------------------------+

                     Figure 2: The TML-PL interface

   The CEM/FEM [RFC3746] interface is responsible for bootstrapping and
   parameterization of the TML. In its most basic form, the CEM/FEM
   interface takes the form of a simple static config file that is read
   on startup in the pre-association phase.

   Appendix B discusses the service interfaces in more detail.

3.2.2.  TML Parameterization

   It is expected that a configuration reference point, such as the FEM
   or the CEM, can be used to configure the TML.

   Some of the configured parameters may include:

   o  PL ID

   o  Connection Type and associated data. For example, if a TML uses
      IP/SCTP, then parameters such as SCTP ports and IP addresses need
      to be configured.

   o  Number of transport connections

   o  Connection Capability, such as bandwidth, etc.

   o  Allowed/Supported Connection QoS policy (or Congestion Control
      Policy)

4.  SCTP TML overview

   SCTP [RFC4960] is an end-to-end transport protocol that is equivalent
   to TCP, UDP, or DCCP in many aspects. With a few exceptions, SCTP
   can do most of what UDP, TCP, or DCCP can achieve. SCTP can also do
   most of what a combination of the other transport protocols can
   achieve (e.g., TCP and DCCP or TCP and UDP).

   Like TCP, it provides ordered, reliable, connection-oriented,
   flow-controlled, congestion-controlled data exchange. Unlike TCP, it
   does not provide byte streaming and instead provides message
   boundaries.

   Like UDP, it can provide unreliable, unordered data exchange.
   Unlike UDP, it does not provide multicast support.

   Like DCCP, it can provide unreliable, ordered, congestion-controlled,
   connection-oriented data exchange.

   SCTP also provides other services, which none of the three transport
   protocols mentioned above provide, that we found attractive. These
   include:

   o  Multi-homing

   o  Runtime IP address binding

   o  A range of reliability shades with congestion control

   o  Built-in heartbeats

   o  Multi-streaming

   o  Message boundaries with reliability

   o  Improved SYN DoS protection

   o  Simpler transport events

   o  Simplified replicasting

4.1.  Rationale for using SCTP for TML

   SCTP has all the features required to provide a robust TML. As a
   transport that is all-encompassing, it negates the need for having
   multiple transport protocols in order to satisfy the TML requirements
   ([I-D.ietf-forces-protocol] section 5). As a result, it allows for
   simpler coding and therefore reduces many of the interoperability
   concerns.

   SCTP is also very mature and widely used, making it a good choice for
   ubiquitous deployment.

4.2.  Meeting TML requirements

                   PL
                   +----------------------+
                   |                      |
                   +-----------+----------+
                               |   TML API
                   TML         |
                   +-----------+----------+
                   |           |          |
                   |    +------+------+   |
                   |    |  TML core   |   |
                   |    +-+----+----+-+   |
                   |      |    |    |     |
                   |    SCTP socket API   |
                   |      |    |    |     |
                   |      |    |    |     |
                   |    +-+----+----+-+   |
                   |    |    SCTP     |   |
                   |    +------+------+   |
                   |           |          |
                   |           |          |
                   |    +------+------+   |
                   |    |     IP      |   |
                   |    +-------------+   |
                   +----------------------+

                    Figure 3: The TML-SCTP interface

   Figure 3 details the interfacing between the PL and SCTP TML and the
   internals of the SCTP TML. The core of the TML interacts on its
   north-bound interface with the PL (utilizing the TML API). On the
   south-bound interface, the TML core interfaces to the SCTP layer
   utilizing the standard socket interface [I-D.ietf-tsvwg-sctpsocket].
   There are three SCTP socket connections opened between any two PL
   endpoints (whether FE or CE).

4.2.1.  SCTP TML Channels

                  +--------------------+
                  |                    |
                  |      TML core      |
                  |                    |
                  +-+-------+--------+-+
                    |       |        |
                    |   Med prio,    |
                    | Semi-reliable  |
                    |    channel     |
                    |       |    Low prio,
                    |       |    Unreliable
                    |       |     channel
                    |       |        |
                    ^       ^        ^
                    |       |        |
                    Y       Y        Y
          High prio,|       |        |
          reliable  |       |        |
          channel   |       |        |
                    Y       Y        Y
                  +-+-------+--------+-+
                  |                    |
                  |        SCTP        |
                  |                    |
                  +--------------------+

                    Figure 4: The TML-SCTP channels

   Figure 4 details further the interfacing between the TML core and
   SCTP layers. There are 3 channels used to separate and prioritize
   the different types of ForCES traffic. Each channel constitutes a
   socket interface. It should be noted that all SCTP channels are
   congestion aware (and for that reason that detail is left out of the
   description of the 3 channels). SCTP ports 6704, 6705, and 6706 are
   used for the higher, medium, and lower priority channels
   respectively. SCTP Payload Protocol ID (PPID) values of 21, 22, and
   23 are used for the higher, medium, and lower priority channels
   respectively.

4.2.1.1.  Justifying Choice of 3 Sockets

   SCTP allows up to 64K streams to be sent over a single socket
   interface. The authors initially envisioned using a single socket
   for all three channels (mapping a channel to an SCTP stream). This
   would simplify programming of the TML and conserve SCTP ports.

   Further analysis revealed head-of-line blocking issues with this
   initial approach. Lower priority packets not needing reliable
   delivery could block higher priority packets (needing reliable
   delivery) under congestion for an indeterminate period of time
   (depending on how many lower priority packets are pending).
   For this reason, we elected to map each of the three channels to a
   different SCTP socket (instead of a different stream within a single
   socket).

4.2.1.2.  Higher Priority, Reliable channel

   The higher priority (HP) channel uses a standard SCTP reliable socket
   on port 6704. SCTP PPID 21 is used for all messages on the HP
   channel. The HP channel is used for CE solicited messages and their
   responses:

   1.  ForCES configuration messages flowing from CE to FE and responses
       from the FE to CE.

   2.  ForCES query messages flowing from CE to FE and responses from
       the FE to the CE.

   PL priorities 4-7 MUST be used for all PL messages using this
   channel. The following PL messages MUST use the HP channel for
   transport:

   o  Association Setup (default priority: 7)

   o  Association Setup Response (default priority: 7)

   o  Association Teardown (default priority: 7)

   o  Config (default priority: 4)

   o  Config Response (default priority: 4)

   o  Query (default priority: 4)

   o  Query Response (default priority: 4)

   If a PL message received on the HP channel has a priority outside
   the specified range (4-7), or a PPID or PL message type other than
   those specified above, then the PL message MUST be dropped.

   Although an implementation may choose different values from the
   defined range (4-7), it is RECOMMENDED that default priorities be
   used. A response to a ForCES message MUST contain the same priority
   as the request. For example, a config sent by the CE with priority 5
   MUST have a config-response from the FE with priority 5.

4.2.1.3.  Medium Priority, Semi-Reliable channel

   The medium priority (MP) channel uses SCTP-PR on port 6705. SCTP
   PPID 22 MUST be used for all messages on the MP channel. Time limits
   on how long a message is valid are set on each outgoing message.
   This channel is used for events from the FE to the CE that are
   obsoleted over time.
   Events that are cumulative in nature and are recoverable by the CE
   (by issuing a query to the FE) can tolerate loss and therefore should
   use this channel. For example, a generated event that carries the
   value of a monotonically incrementing counter is well suited to this
   channel.

   PL priority 3 MUST be used for PL messages on this channel. The
   following PL messages MUST use the MP channel for transport:

   o  Event Notification (default priority: 3)

   If a PL message received on the MP channel has a priority, PPID, or
   PL message type other than those specified above, then the PL message
   MUST be dropped.

4.2.1.4.  Lower Priority, Unreliable channel

   The lower priority (LP) channel uses SCTP port 6706. SCTP PPID 23 is
   used for all messages on the LP channel. The LP channel also MUST
   use SCTP-PR with lower timeout values than the MP channel. The
   reason an unreliable channel is used for redirect messages is to
   allow the control protocol at both the CE and its peer-endpoint to
   take charge of the end-to-end semantics of that control protocol's
   operations. For example:

   1.  Some control protocols are reliable in nature; making this
       channel reliable would therefore introduce an extra layer of
       reliability, which could be harmful. Any end-to-end
       retransmissions will instead originate from the remote endpoint.

   2.  Some control protocols may prefer obsolescence of messages over
       retransmissions; making this channel reliable contradicts that
       preference.

   Given that ForCES PL heartbeats are traffic sensitive, sending them
   over the LP channel also makes sense. If the other end is not
   processing other channels, it will eventually get heartbeats; and if
   it is busy processing other channels, heartbeats will be obsoleted
   locally over time (and it does not matter if they did not make it).

   PL priorities 1-2 MUST be used for PL messages on this channel.
   PL messages that MUST use the LP channel for transport are:

   o  Packet Redirect (default priority: 2)

   o  Heartbeats (default priority: 1)

   If a PL message received on the LP channel has a priority outside
   the specified range (1-2), or a PPID or PL message type other than
   those specified above, then the PL message MUST be dropped.

4.2.1.5.  Scheduling of The 3 Channels

   The TML core uses strict-priority, work-conserving scheduling when
   both sending and receiving PL messages, as shown in Figure 5.

   This means that the HP messages are always processed first until
   there are no more left. The LP channel is processed only if the
   channels with a higher priority than itself have no more messages
   left to process. This means that under congestion, a higher
   priority channel with sufficient messages to occupy the available
   bandwidth would starve lower priority channel(s).

   The design intent of the SCTP TML is to tie processing
   prioritization, as described in Section 4.2.1.1, and transport
   congestion control together to provide implicit node congestion
   control. This is further detailed in Appendix A.2.

          SCTP channel              +----------+
         Work available             |   DONE   +---<--<--+
                |                   +---+------+         |
                Y                                        ^
                |         +-->--+         +-->---+       |
        +-->-->-+         |     |         |      |       |
        |       |         |     |         |      |       ^
        |       ^         ^     Y         ^      Y       |
        ^      / \        |     |         |      |       |
        |     /   \       |     ^         |      ^       ^
        |    / Is  \      |    / \        |     / \      |
        |   / there \     |   /Is \       |    /Is \     |
        ^  / HP work \    ^  /there\      ^   /there\    ^
        |  \    ?    /    | /MP work\     |  /LP work\   |
        |   \       /     | \   ?   /     |  \   ?   /   |
        |    \     /      |  \     /      |   \     /    ^
        |     \   /       ^   \   /       ^    \   /     |
        |      \ /        |    \ /        |     \ /      |
        ^       Y-->-->-->+     Y-->-->-->+      Y->->->-+
        |       |    NO         |    NO          |   NO
        |       |               |                |
        |       Y               Y                Y
        |       | YES           | YES            | YES
        ^       |               |                |
        |       Y               Y                Y
        |  +----+------+    +---+-------+   +----+------+
        |  |- process  |    |- process  |   |- process  |
        |  |  HP work  |    |  MP work  |   |  LP work  |
        |  +------+----+    +-----+-----+   +-----+-----+
        |         |               |               |
        ^         Y               Y               Y
        |         |               |               |
        |         Y               Y               Y
        +--<--<---+--<--<----<----+-----<---<-----+

             Figure 5: SCTP TML Strict Priority Scheduling

4.2.1.6.  SCTP TML Parameterization

   The following is a list of parameters needed for booting the TML. It
   is expected these parameters will be extracted via the FEM/CEM
   interface for each PL ID.

   1.  The IP address(es) or resolvable DNS hostname(s) of the CE/FE.

   2.  Whether to use IPsec or not. If IPsec is used, how to
       parameterize the different required ciphers, keys, etc., as
       described in Section 7.1.

   3.  The HP SCTP port, as discussed in Section 4.2.1.2. The default
       HP port value is 6704 (Section 6).

   4.  The MP SCTP port, as discussed in Section 4.2.1.3. The default
       MP port value is 6705 (Section 6).

   5.  The LP SCTP port, as discussed in Section 4.2.1.4. The default
       LP port value is 6706 (Section 6).

4.2.2.  Satisfying TML Requirements

   [I-D.ietf-forces-protocol] section 5 lists requirements that a TML
   needs to meet. This section describes how the SCTP TML satisfies
   those requirements.

4.2.2.1.  Satisfying Reliability Requirement

   As mentioned earlier, SCTP offers a range of reliability shades.
   Therefore, this requirement is met.

4.2.2.2.  Satisfying Congestion Control Requirement

   Congestion control is built into SCTP. Therefore, this requirement
   is met.

4.2.2.3.  Satisfying Timeliness and Prioritization Requirement

   By using 3 sockets in conjunction with the partial-reliability
   feature, both timeliness and prioritization requirements are
   addressed.

4.2.2.4.  Satisfying Addressing Requirement

   There are no extra headers required for SCTP to fulfil this
   requirement. SCTP can be told to replicast packets to multiple
   destinations. The TML implementation will need to translate PL
   addresses to a variety of unicast IP addresses in order to emulate
   multicast and broadcast PL addresses.

4.2.2.5.  Satisfying High Availability Requirement

   Transport link resiliency is one of SCTP's strongest points. Failure
   detection and recovery are built in, as mentioned earlier.

   o  The SCTP multi-homing feature is used to provide path diversity.
      Should one of the peer IP addresses become unreachable, the
      other(s) are used without needing lower layer convergence
      (routing, for example) or even the TML becoming aware.

   o  SCTP heartbeats and data transmission thresholds are used on a
      per-peer-IP-address basis to detect reachability faults. The
      faults could be a result of an unreachable address or peer, which
      may be caused by a variety of reasons, like interface, network, or
      endpoint failures. The cause of the fault is noted.

   o  With the ADDIP feature, one can migrate IP addresses to other
      nodes at runtime. This is not unlike the use of VRRP [RFC3768].
      This feature is used in addition to multi-homing in a planned
      migration of activity from one FE/CE to another. In such a case,
      part of the provisioning recipe at the CE for replacing an FE
      involves migrating activity of one FE to another.

4.2.2.6.  Satisfying Node Overload Prevention Requirement

   The architecture of this TML defines three separate channels, one per
   socket, to be used within any FE-CE setup.
   The scheduling design for processing the TML channels
   (Section 4.2.1.5) is strict priority. A fundamental goal of the
   strict prioritization is to ensure that more important processing
   work always gets node resources over less important work.

   When a ForCES node CPU is overwhelmed because the incoming packet
   rate is higher than it can keep up with, the channel queues grow and
   transport congestion subsequently follows. By virtue of using SCTP,
   the congestion is propagated back to the source of the incoming
   packets and eventually alleviated.

   The HP channel work gets prioritized at the expense of the MP
   channel, which in turn gets prioritized over the LP channel. The
   preferential scheduling only kicks in when there is node overload,
   regardless of whether there is transport congestion. As a result of
   the preferential work treatment, the ForCES node achieves a robust
   steady processing capacity. Refer to Appendix A.2 for details on
   scheduling.

   As an example of how overload prevention works, consider a scenario
   where an overwhelming number of redirected packets (from outside the
   NE) coming into the NE may overload the FE while it has outstanding
   config work from the CE. In such a case, the FE, while it is busy
   processing config requests from the CE, essentially ignores the
   redirect packets on the LP channel. If enough redirect packets
   accumulate, they are dropped either because the LP channel threshold
   is exceeded or because they are obsoleted. If, on the other hand,
   the FE has successfully processed the higher priority channels and
   their related work, then it can proceed and process the LP channel.
   As demonstrated in this case, the TML implicitly ties transport
   congestion and node overload together.

4.2.2.7.  Satisfying Encapsulation Requirement

   The SCTP TML sets SCTP PPIDs to identify channels used as described
   in Section 4.2.1.1.

5.  SCTP TML Channel Work

   There are two levels of TML channel work within an NE when a ForCES
   node (CE or FE) is connected to multiple other ForCES nodes:

   1.  NE-level I/O work, where a ForCES node (CE or FE) needs to choose
       which of the peer nodes to process.

   2.  Node-level I/O work, where a ForCES node handles the three SCTP
       TML channels separately for each single ForCES endpoint.

   NE-level scheduling definition is left up to the implementation and
   is considered out of scope for this document. Appendix A.4 briefly
   discusses some constraints that an implementer needs to consider.

   This document provides suggestions on SCTP channel work
   implementation in Appendix A.

   The FE SHOULD establish channel connections to the CE in order of
   increasing priority, i.e., the LP socket first, followed by the MP
   socket, and ending with the HP socket. The CE, however, MUST NOT
   assume that there is ordering of socket connections from any FE.

6.  IANA Considerations

   Following the policies outlined in "Guidelines for Writing an IANA
   Considerations Section in RFCs" [RFC5226], the following name spaces
   are defined in ForCES SCTP TML.

   o  SCTP port 6704 for the HP channel, 6705 for the MP channel, and
      6706 for the LP channel.

   o  SCTP Payload Protocol ID (PPID) 21 for the HP channel, 22 for the
      MP channel, and 23 for the LP channel.

   XXX [Note to IANA]: Port allocations (SCTP 6700-6702) were made in
   August 2009. We have been asked by IESG to change these as
   prescribed above.

7.  Security Considerations

   The SCTP TML provides the following security services to the PL:

   o  A mechanism to authenticate ForCES CEs and FEs at transport level
      in order to prevent the participation of unauthorized CEs and
      unauthorized FEs in the control and data path processing of a
      ForCES NE.
   o  A mechanism to ensure message authentication of PL data and
      headers transferred from the CE to FE (and vice versa) in order to
      prevent the injection of incorrect data into PL messages.

   o  A mechanism to ensure the confidentiality of PL data and headers
      transferred from the CE to FE (and vice versa), in order to
      prevent disclosure of PL information transported via the TML.

   Security choices provided by the TML are made by the operator and
   take effect during the pre-association phase of the ForCES protocol.
   An operator may choose to use all, some, or none of the security
   services provided by the TML in a CE-FE connection.

   When operating in a secured environment, or for other operational
   reasons (in some cases, performance issues), the operator may turn
   off all the security functions between the CE and FE.

   IP Security Protocol (IPsec) [RFC4301] is used to provide the needed
   security mechanisms.

   IPsec is an IP-level security scheme that is transparent to
   higher-layer applications and can therefore provide security for any
   transport layer protocol.  This gives IPsec the advantage that it
   can be used to secure everything between the CE and FE without
   expecting the TML implementation to be aware of the details.

   The IPsec architecture is designed to provide the message integrity
   and message confidentiality outlined in the TML security
   requirements [I-D.ietf-forces-protocol].  Mutual authentication and
   key exchange are provided by the Internet Key Exchange (IKE)
   protocol [RFC2409].

7.1.  IPsec Usage

   A ForCES FE or CE MUST support the following:

   o  Internet Key Exchange (IKE) [RFC2409] with certificates for
      endpoint authentication.

   o  Transport Mode Encapsulating Security Payload (ESP) [RFC4303].

   o  HMAC-SHA1-96 [RFC2404] for message integrity protection.

   o  AES-CBC with 128-bit keys [RFC3602] for message confidentiality.

   o  Replay protection [RFC4301].
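
   As an illustration only, the mandatory-to-support list above can be
   treated as a checklist that an implementation audits its IPsec
   capabilities against.  The following Python sketch assumes a
   hypothetical dictionary layout and hypothetical feature names; this
   document defines no such data structure or interface.

```python
# Sketch only: the feature names and the dict layout are hypothetical,
# chosen for illustration; the draft defines no such data structure.

# Mandatory-to-support items from the list above, keyed by an
# illustrative feature name.
MANDATORY = {
    "key_exchange": "IKE with certificates",  # [RFC2409]
    "esp_mode": "transport",                  # [RFC4303]
    "integrity": "HMAC-SHA1-96",              # [RFC2404]
    "encryption": "AES-CBC-128",              # [RFC3602]
    "replay_protection": "enabled",           # [RFC4301]
}


def missing_mandatory(supported):
    """Return the sorted feature names whose mandatory option is absent.

    `supported` maps a feature name to the set of options a node offers.
    Extra options are allowed; only the absence of a mandatory option
    is reported.
    """
    return sorted(
        feature
        for feature, required in MANDATORY.items()
        if required not in supported.get(feature, set())
    )


# A node that offers an extra cipher but lacks replay protection.
node = {
    "key_exchange": {"IKE with certificates"},
    "esp_mode": {"transport"},
    "integrity": {"HMAC-SHA1-96"},
    "encryption": {"AES-CBC-128", "AES-CBC-256"},
}
print(missing_mandatory(node))  # prints ['replay_protection']
```

   In this sketch, additional options never cause a failure; the check
   only reports mandatory items that a node does not offer at all.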
   It is expected to be possible for the CE or FE to be operationally
   configured to negotiate other cipher suites and even to use manual
   keying.

7.1.1.  SAD and SPD setup

   To minimize the operational configuration, it is recommended that
   only the IANA-issued SCTP protocol number (132) be used as a selector
   in the Security Policy Database (SPD) for ForCES.  In such a case,
   only a single SPD entry and a single Security Association Database
   (SAD) entry are needed.

   It should be straightforward to extend such a policy to alternatively
   use the 3 SCTP TML port numbers as SPD selectors.  As implied above,
   however, this choice requires an increased number of SPD entries.

   In scenarios where multiple IP addresses are used within a single
   association, and there is a desire to configure different policies on
   a per-IP-address basis, it is recommended to follow [RFC3554].

8.  Acknowledgements

   The authors would like to thank Joel Halpern, Michael Tuxen, Randy
   Stewart, Evangelos Haleplidis, Chuanhuang Li, Lars Eggert, Avshalom
   Houri, Adrian Farrel, Juergen Quittek, Magnus Westerlund, and Pasi
   Eronen for engaging us in discussions that have made this document
   better.

   Ross Callon was an excellent manager who persevered in providing us
   guidance, and Joel Halpern was an excellent document shepherd without
   whom this document would have taken longer to publish.

9.  References

9.1.  Normative References

   [I-D.ietf-forces-protocol]
              Dong, L., Doria, A., Gopal, R., Haas, R., Salim, J.,
              Khosravi, H., and W. Wang, "ForCES Protocol
              Specification", draft-ietf-forces-protocol-22 (work in
              progress), March 2009.

   [RFC2404]  Madson, C. and R. Glenn, "The Use of HMAC-SHA-1-96 within
              ESP and AH", RFC 2404, November 1998.

   [RFC2409]  Harkins, D. and D. Carrel, "The Internet Key Exchange
              (IKE)", RFC 2409, November 1998.

   [RFC3554]  Bellovin, S., Ioannidis, J., Keromytis, A., and R.
              Stewart, "On the Use of Stream Control Transmission
              Protocol (SCTP) with IPsec", RFC 3554, July 2003.

   [RFC3602]  Frankel, S., Glenn, R., and S. Kelly, "The AES-CBC Cipher
              Algorithm and Its Use with IPsec", RFC 3602,
              September 2003.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
              Internet Protocol", RFC 4301, December 2005.

   [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)",
              RFC 4303, December 2005.

   [RFC4960]  Stewart, R., "Stream Control Transmission Protocol",
              RFC 4960, September 2007.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

9.2.  Informative References

   [I-D.ietf-forces-model]
              Halpern, J. and J. Salim, "ForCES Forwarding Element
              Model", draft-ietf-forces-model-16 (work in progress),
              October 2008.

   [I-D.ietf-tsvwg-sctpsocket]
              Stewart, R., Poon, K., Tuexen, M., Yasevich, V., and P.
              Lei, "Sockets API Extensions for Stream Control
              Transmission Protocol (SCTP)",
              draft-ietf-tsvwg-sctpsocket-19 (work in progress),
              February 2009.

   [RFC3654]  Khosravi, H. and T. Anderson, "Requirements for Separation
              of IP Control and Forwarding", RFC 3654, November 2003.

   [RFC3746]  Yang, L., Dantu, R., Anderson, T., and R. Gopal,
              "Forwarding and Control Element Separation (ForCES)
              Framework", RFC 3746, April 2004.

   [RFC3768]  Hinden, R., "Virtual Router Redundancy Protocol (VRRP)",
              RFC 3768, April 2004.

Appendix A.  Suggested SCTP TML Channel Work Implementation

   As mentioned in Section 5, there are two levels of TML channel work
   within an NE when a ForCES node (CE or FE) is connected to multiple
   other ForCES nodes:

   1.  NE-level I/O work where a ForCES node (CE or FE) needs to choose
       which of the peer nodes to process.

   2.
       Node-level I/O work where a ForCES node handles the three SCTP
       TML channels separately for each single ForCES endpoint.

   NE-level scheduling definition is left up to the implementation and
   is considered out of scope for this document.  Appendix A.4 briefly
   discusses some constraints that an implementer needs to consider.

   This document, and in particular Appendix A.1, Appendix A.2, and
   Appendix A.3, discusses details of node-level I/O work.

A.1.  SCTP TML Channel Initialization

   As discussed in Section 5, the FE SHOULD do socket connections to the
   CE in the order of incrementing priorities, i.e., LP socket first,
   followed by MP, and ending with the HP socket connection.  The CE,
   however, MUST NOT assume that there is any ordering of socket
   connections from any FE.  Appendix B.1 has more details on the
   expected initialization of SCTP channel work.

A.2.  Channel work scheduling

   This section provides high-level details of the scheduling view of
   the SCTP TML core (Section 4.2.1).  A practical scheduler
   implementation takes care of many small details (such as timers,
   work quanta, etc.) not described in this document.  It is left to
   the implementer to take care of those details.

   The CE(s) and FE(s) are coupled together in the principles of the
   scheduling scheme described here to tie together node overload with
   transport congestion.  The design intent is to provide the highest
   possible robust work throughput for the NE under any network or
   processing congestion.

A.2.1.  FE Channel work scheduling

   The FE scheduling needs to process I/O in the following priority
   order:

   1.  The HP channel I/O in the following priority order:

       1.  Transmitting back to the CE any outstanding result of
           executed work via the HP channel transmit path.

       2.  Taking new incoming work from the CE which creates ForCES
           work to be executed by the FE.

   2.
       ForCES events that result in the transmission of unsolicited
       ForCES packets to the CE via the MP channel.

   3.  Incoming Redirect work in the form of control packets that come
       from the CE via the LP channel.  After redirect processing,
       these packets get sent out on an external (to the NE) interface.

   4.  Incoming Redirect work in the form of control packets that come
       from other NEs via external (to the NE) interfaces.  After some
       processing, such packets are sent to the CE.

   It is worth emphasizing again at this point that the SCTP TML
   processes the channel work in strict priority.  For example, as long
   as there are messages to send to the CE on the HP channel, they will
   be processed first until there are no more left, before processing
   the next priority work (which is to read new messages on the HP
   channel incoming from the CE).

A.2.2.  CE Channel work scheduling

   The CE scheduling, in priority order, needs to deal with:

   1.  The HP channel I/O in the following priority order:

       1.  Processing incoming responses to requests for work it made
           to the FE(s).

       2.  Transmitting any outstanding HP work it needs the FE(s) to
           complete.

   2.  Incoming ForCES events from the FE(s) via the MP channel.

   3.  Outgoing Redirect work in the form of control packets that get
       sent from the CE via the LP channel, destined to an external
       (to the NE) interface on the FE(s).

   4.  Incoming Redirect work in the form of control packets that come
       from other NEs via external (to the NE) interfaces on the FE(s).

   It is worth repeating for emphasis that the SCTP TML processes the
   channel work in strict priority.  For example, if there are messages
   incoming from an FE on the HP channel, they will be processed first
   until there are no more left, before processing the next priority
   work, which is to transmit any outstanding HP channel messages going
   to the FE.

A.3.
SCTP TML Channel Termination

   Appendix B.2 describes a controlled disassociation of the FE from the
   NE.

   It is also possible for connectivity to be lost between the FE and CE
   on one or more sockets.  In cases where SCTP multi-homing features
   are used for path availability, the disconnection of a socket will
   only occur if all paths are unreachable; otherwise, SCTP will ensure
   reachability.  In the situation of a total connectivity loss of even
   one SCTP socket, the FE and CE SHOULD assume a state equivalent to
   ForCES Association Teardown being issued and follow the sequence
   described in Appendix B.2.

   A CE could also disconnect sockets to an FE to indicate an "emergency
   teardown".  The "emergency teardown" may be necessary in cases where
   a CE needs to disconnect an FE but knows that the FE is busy
   processing a lot of outstanding commands (some of which the FE has
   not yet got around to processing).  By virtue of the CE closing the
   connections, the FE will immediately be asynchronously notified and
   will not have to process any outstanding commands from the CE.

A.4.  SCTP TML NE level channel scheduling

   In handling NE-level I/O work, an implementation needs to worry about
   being both fair and robust across peer ForCES nodes.

   Fairness is desired so that each peer node makes progress across the
   NE.  For the sake of illustration, consider two FEs connected to a
   CE: one FE has a few HP messages that need to be processed by the CE,
   whereas the other may have an effectively unlimited number of HP
   messages.  The scheduling scheme may decide to use a quota scheduling
   system to ensure that the second FE does not hog the CE cycles.

   Robustness is desired so that the NE does not succumb to a DoS attack
   from hostile entities and always achieves a maximum stable workload
   processing level.  For the sake of illustration, consider again two
   FEs connected to a CE.
   Consider FE1 as having a large number of HP and MP messages and FE2
   as having a large number of MP and LP messages.  The scheduling
   scheme needs to ensure that while FE1 always gets its messages
   processed, at some point FE2 messages are also allowed to be
   processed.  A promotion- and preemption-based scheduling scheme could
   be used by the CE to resolve this issue.

Appendix B.  Suggested Service Interface

   This section outlines a high-level service interface between the
   FEM/CEM and TML, the PL and TML, and between local and remote TMLs.
   The intent of this interface discussion is to provide general
   guidelines.  The implementer is expected to take care of the details
   and may even follow a different approach if needed.

   The theory of operation for the PL-TML service is as follows:

   1.  The PL starts up and bootstraps the TML.  The end result of a
       successful TML bootstrap is that the CE TML and the FE TML
       connect to each other at the transport level.

   2.  Transmission and reception of the PL messages commence after a
       successful TML bootstrap.  The PL uses the send and receive
       PL-TML interfaces to communicate with its peers.  The TML is
       agnostic to the nature of the messages being sent or received.
       The first message exchanges that happen are to establish the
       ForCES association.  Subsequent messages may be unsolicited
       events from the FE PL, control message redirects from/to the CE
       to/from the FE, or configuration from the CE to the FE and the
       corresponding responses flowing from the FE to the CE.

   3.  The PL does a shutdown of the TML after terminating the ForCES
       association.

B.1.  TML Boot-strapping

   Figure 6 illustrates a flow for the TML bootstrapped by the PL.

   When the PL starts up (possibly after some internal initialization),
   it boots up the TML.  The TML first interacts with the FEM/CEM and
   acquires the necessary TML parameterization (Section 4.2.1.6).
   Next, the TML uses the information it retrieved from the FEM/CEM
   interface to initialize itself.

   The TML on the FE proceeds to connect the 3 channels to the CE.  The
   socket interface is used for each of the channels.  The TML continues
   to retry the connections to the CE until all 3 channels are
   connected.  It is advisable that the number of connection retry
   attempts and the time between retries also be configurable via the
   FEM.  On failure to connect one or more channels, and after the
   configured retry threshold is exceeded, the TML will return an
   appropriate failure indicator to the PL.  On success (as shown in
   Figure 6), a success indication is presented to the PL.

   FE PL       FE TML          FEM  CEM        CE TML              CE PL
     |            |             |    |            |                    |
     |            |             |    |            |      Bootup        |
     |            |             |    |            |<-------------------|
     |  Bootup    |             |    |            |                    |
     |----------->|             |    |get CEM info|                    |
     |            |get FEM info |    |<-----------|                    |
     |            |------------>|    ~            ~                    |
     |            ~             ~    |----------->|                    |
     |            |<------------|                 |                    |
     |            |                               |-initialize TML     |
     |            |                               |-create the 3 chans.|
     |            |                               | to listen to FEs   |
     |            |                               |                    |
     |            |-initialize TML                |Bootup success      |
     |            |-create the 3 chans. locally   |------------------->|
     |            |-connect 3 chans. remotely     |                    |
     |            |------------------------------>|                    |
     |            ~                               ~ - FE TML connected ~
     |            ~                               ~ - FE TML info init ~
     |            | channels connected            |                    |
     |            |<------------------------------|                    |
     | Bootup     |                               |                    |
     | succeeded  |                               |                    |
     |<-----------|                               |                    |
     |            |                               |                    |

                   Figure 6: SCTP TML Bootstrapping

   On the CE, things are slightly different.  After initializing from
   the CEM, the TML on the CE side proceeds to initialize the 3 channels
   to listen for remote connections from the FEs.  The success or
   failure indication is passed on to the CE PL (in the same manner as
   was done in the FE).

   Post boot-up, the CE TML waits for connections from the FEs.
   Upon a successful connection by an FE, the CE TML keeps track of the
   transport-level details of the FE.  Note that, at this stage, only
   the transport-level connection has been established; ForCES-level
   association follows using the send/receive PL-TML interfaces (refer
   to Appendix B.3 and Figure 8).

B.2.  TML Shutdown

   Figure 7 shows an example of an FE shutting down the TML.  It is
   assumed at this point that the ForCES Association Teardown has been
   issued by the CE.  It should also be noted that different
   implementations may have different procedures for cleaning up state,
   etc.

   When the FE PL issues a shutdown to its TML for a specific PL ID, the
   TML releases all the channel connections to the CE.  This is achieved
   by closing the sockets used to communicate with the CE.  This results
   in the stack sending an SCTP shutdown, which is received on the CE.

   FE PL       FE TML                    CE TML                CE PL
     |            |                         |                    |
     |  Shutdown  |                         |                    |
     |----------->|                         |                    |
     |            |-disconnect 3 chans.     |                    |
     |            |-SCTP level shutdown     |                    |
     |            |------------------------>|                    |
     |            |                         |                    |
     |            |                         |TML detects shutdown|
     |            |                         |-FE TML info cleanup|
     |            |                         |-optionally tell PL |
     |            |                         |------------------->|
     |            |                         |                    |
     |            |- clean up any state of  |                    |
     |            |-channels disconnected   |                    |
     |            |<------------------------|                    |
     |            |-SCTP shutdown ACK       |                    |
     |            |                         |                    |
     |  Shutdown  |                         |                    |
     | succeeded  |                         |                    |
     |<-----------|                         |                    |
     |            |                         |                    |

                      Figure 7: FE Shutting down

   On the CE side, a TML disconnection would result in possible cleanup
   of the FE state.  Optionally, depending on the implementation, there
   may be a need to inform the PL about the TML disconnection.  The SCTP
   stack on the CE sends an acknowledgement to the FE TML in response to
   the earlier SCTP shutdown.

B.3.  TML Sending and Receiving

   The TML should be agnostic to the content of the PL messages, or
   their operations.
   The PL should provide enough information to the TML for it to assign
   an appropriate priority and loss behavior to the message.  Figure 8
   shows an example of a message exchange originated at the FE and sent
   to the CE (such as a ForCES association message), which illustrates
   all the necessary service interfaces for sending and receiving.

   When the FE PL sends a message to the TML, the TML is expected to
   pick one of the HP/MP/LP channels and send out the ForCES message.

   FE PL       FE TML         CE TML                  CE PL
     |            |              |                      |
     |PL send     |              |                      |
     |----------->|              |                      |
     |            |              |                      |
     |            |              |                      |
     |            |-pick channel |                      |
     |            |-TML Send     |                      |
     |            |------------->|                      |
     |            |              |                      |
     |            |              |-TML Receive on chan. |
     |            |              |- mux to PL/PL recv   |
     |            |              |--------------------->|
     |            |              |                      ~
     |            |              |                      ~ PL Process
     |            |              |                      ~
     |            |              |             PL send  |
     |            |              |<---------------------|
     |            |              |-pick chan to send on |
     |            |              |-TML send             |
     |            |<-------------|                      |
     |            |-TML Receive  |                      |
     |            |-mux to PL    |                      |
     | PL Recv    |              |                      |
     |<-----------|              |                      |
     |            |              |                      |

                     Figure 8: Send and Recv Flow

   When the CE TML receives the ForCES message on the channel it was
   sent on, it demultiplexes the message to the CE PL.

   The CE PL, after some processing (in this example, dealing with the
   FE's association), sends the response to the TML.  As in the case of
   the FE PL, the CE TML picks the channel to send on before sending.

   The processing of the ForCES message upon arriving at the FE TML and
   its delivery to the FE PL are similar to the CE-side equivalent
   shown in Figure 8.

Authors' Addresses

   Jamal Hadi Salim
   Mojatatu Networks
   Ottawa, Ontario
   Canada

   Email: hadi@mojatatu.com


   Kentaro Ogawa
   NTT Corporation
   3-9-11 Midori-cho
   Musashino-shi, Tokyo  180-8585
   Japan

   Email: ogawa.kentaro@lab.ntt.co.jp