IP Storage Working Group                              Charles Monia
INTERNET DRAFT                                        Rod Mullendore
Expires October 2001                                  Josh Tseng
                                                      Nishan Systems
                                                      Franco Travostino
                                                      Victor Firoiu
                                                      Nortel Networks
                                                      David Robinson
                                                      Sun Microsystems
                                                      Wayland Jeong
                                                      Troika Networks
                                                      Rory Bolt
                                                      Quantum/ATL
                                                      Paul Rutherford
                                                      ADIC
                                                      Mark Edwards
                                                      Eurologic
                                                      February 2001

   iFCP - A Protocol for Internet Fibre Channel Storage Networking

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [1].

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Comments

Comments should be sent to the ips mailing list (ips@ece.cmu.edu) or to the author(s).

Monia, et al.                  Standards Track                       1
iFCP                                                        April 2001

Status of this Memo
Comments
1.       Abstract
2.       About This Document
2.1      Conventions used in this document
2.2      Purpose of this document
3.       iFCP Introduction
3.1      Definitions
3.2      The iFCP Network Model
3.3      The N_PORT Addressing Model
3.3.1    Operation in Address Transparent Mode
3.3.2    Operation in Address Translation Mode
3.4      iFCP Layered Services
3.4.1    Application Layer
3.4.2    FC-4 Layer (FCP)
3.4.3    FC-2 Layer
3.4.4    iFCP Layer
4.       iFCP Protocol
4.1      Overview
4.1.1    iFCP Transport Services
4.1.2    iFCP Support for Link Services
4.2      Mandatory FC-2 Functionality
4.3      FC-2 Functionality Not Supported
4.4      Optional FC-2 Functionality
5.       Encapsulation of Fibre Channel Frames
6.       TCP Stream Transport of iFCP Frames
6.1      TCP Session Model
6.2      TCP Port Numbers
7.       Link Services
7.1      Augmented Link Service Messages
7.2      Link Service Augmentation
7.3      Augmented Link Services
7.3.1    Abort Exchange (ABTX)
7.3.2    Discover Address (ADISC)
7.3.3    FC Address Resolution Protocol Reply
7.3.4    FC Address Resolution Protocol Request
7.3.5    Logout (LOGO)
7.3.6    Port Login (PLOGI)
7.3.7    Read Exchange Concise
7.3.8    Read Exchange Concise Accept
7.3.9    Read Exchange Status Block (RES)
7.3.10   Read Exchange Status Block Accept
7.3.11   Read Link Error Status (RLS)
7.3.12   Read Sequence Status Block (RSS)
7.3.13   Reinstate Recovery Qualifier (RRQ)
7.3.14   Request Sequence Initiative (RSI)
7.3.15   Third Party Process Logout (TPRLO)
8.       TCP Link Service Messages
8.1      Network Connection Interfaces (NINTF)
8.2      Connection Bind (CBIND)
8.3      Unbind Connection (UNBIND)
8.4      TCP Message (TCPMSG)
9.       Error Detection and Recovery Procedures for iFCP
9.1      Overview
9.2      Timer Definitions
9.2.1    Error_Detect_Timeout (E_D_TOV)
9.2.2    Resource Allocation Timeout (R_A_TOV)
9.2.3    Resource Recovery Timer (RR_TOV)
9.3      TCP Error Recovery Issues
9.4      iFCP Protocol Error
10.      Fabric Services Supported by an iFCP implementation
11.      Security
11.1     Overview
11.2     Physical Security
11.3     Controlling Access
11.4     Authentication and Encryption
11.5     Storage Firewalls
12.      Quality of Service Considerations
12.1     Minimal requirements
12.2     High-assurance
13.      References
13.1     Relevant SCSI (T10) Specifications
13.2     Relevant Fibre Channel (T11) Specifications
13.3     Relevant RFC Documents
13.4     Other Reference Documents
14.      Author's Addresses
A.       iFCP Support for Fibre Channel Link Services
A.1      Basic Link Services
A.2      Link Services Processed Transparently
A.3      Augmented Link Services
B.       Performance of The Multi-Connection iFCP Session Model
B.1      Relationship of Throughput to Packet Losses
B.2      Background
Full Copyright Statement

1. Abstract

This document specifies an architecture and gateway-to-gateway protocol for the implementation of Fibre Channel fabric functionality on a network in which TCP/IP switching and routing elements replace Fibre Channel components. The protocol enables the attachment of existing Fibre Channel storage products to an IP network by supporting the fabric services required by such devices.

2. About This Document

2.1 Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [2].

All frame formats are in big-endian network byte order.

2.2 Purpose of this document

This is a standards-track document that specifies a protocol for the implementation of Fibre Channel transport services on a TCP/IP network. Some portions of this document contain material from standards controlled by NCITS T10 and T11. This material is included here for informational purposes only. The authoritative information is given in the appropriate NCITS standards document.

The authoritative portions of this document specify the protocol for mapping standards-compliant Fibre Channel storage and adapter implementations to TCP/IP. This mapping includes sections of this document which describe the "iFCP Protocol" (see section 4).

3. iFCP Introduction

iFCP is a gateway-to-gateway protocol that provides Fibre Channel fabric services to FCP-based Fibre Channel devices over a TCP/IP network. iFCP uses TCP to provide congestion control, error detection and recovery.
iFCP's primary objective is to allow interconnection and networking of existing Fibre Channel devices at wire speeds over an IP network.

The protocol and method of frame translation described in this document permit the transparent attachment of Fibre Channel storage devices to an IP-based fabric by means of lightweight gateways.

The protocol achieves this transparency through an address translation process that allows normal frame traffic to pass through the gateway directly, with provisions for intercepting and emulating the fabric services required by an FCP device.

3.1 Definitions

Terms needed to clarify the concepts presented in this document are presented here.

Address-translation mode - A mode of gateway operation in which the scope of N_PORT fabric addresses for locally attached devices is local to the iFCP gateway.

Address-transparent mode - A mode of gateway operation in which the N_PORT fabric addresses of all Fibre Channel devices are unique within the logical fabric to which the gateway belongs.

Gateway Region - The portion of the storage network accessed through an iFCP gateway. Devices in the region consist of all Fibre Channel devices directly attached to the gateway.

Logical Fabric - A collection of iFCP gateways configured to interoperate in address-transparent mode.

Fibre Channel Network - A native Fibre Channel fabric and all attached Fibre Channel devices.

Fabric - The part of a Fibre Channel network that provides the transport services defined in the FC-FS specification. A fabric may be implemented in the IP framework by means of the architecture and protocols discussed in this document.

FC-2 - The Fibre Channel transport services layer described in the FC-FS specification.

FCP Portal - An IP-addressable entity representing the point at which a logical or physical iFCP device is attached to the IP network.
N_PORT - An iFCP or Fibre Channel entity representing the interface to Fibre Channel device functionality. This interface implements the Fibre Channel N_PORT semantics specified in the FC-FS standard [FC-FS].

N_PORT fabric address - The address of an N_PORT within the Fibre Channel fabric.

N_PORT Network Address - The address of an N_PORT in the IP fabric. This address consists of the IP address of the FCP Portal and the N_PORT ID of the directly attached Fibre Channel device.

F_PORT - The interface used by an N_PORT to access Fibre Channel fabric and fabric services functionality.

iFCP - The protocol discussed in this document.

Logical FCP Device - The abstraction representing a single Fibre Channel device as it appears on an iFCP network.

iSNS - The protocol by which storage name services are implemented. Resolution of Fibre Channel network object names is provided by an iSNS name server.

N_PORT Session - An association created when two N_PORTs have executed a PLOGI operation. It comprises the N_PORTs and the TCP connection that carries traffic between them.

iFCP Frame - The frame inserted into the TCP stream, which contains the Fibre Channel frame and iFCP header.

Port Login (PLOGI) - The Fibre Channel Extended Link Service (ELS) that establishes an N_PORT login session through the exchange of identification and operation parameters between an originating N_PORT and a responding N_PORT.

DOMAIN_ID - The value contained in the high-order byte of a 24-bit N_PORT Fibre Channel address.

3.2 The iFCP Network Model

The following diagram shows a Fibre Channel fabric with attached devices. These are connected to the fabric through N_PORT and F_PORT interfaces, whose behavior is specified in [FGS].

Within the Fibre Channel device domain, fabric-addressable entities consist of other N_PORTs and devices internal to the fabric that perform the fabric services defined in [FGS].
In this case, the N_PORT Fibre Channel addresses are 24-bit quantities that are unique within the scope of the FC fabric. N_PORTs that perform fabric services are assigned well-known addresses starting at the top end of the 24-bit Fibre Channel address space.

                 Fibre Channel Network

           +--------+        +--------+
           |   FC   |        |   FC   |
           | Device |        | Device |
           |........|        |........|     Fibre Channel
           | N_PORT |<------>| N_PORT |     Device Domain
           +---+----+        +----+---+           ^
               |                  |               |
           +---+----+        +----+---+           |
           | F_PORT |        | F_PORT |           |
 ==========+========+========+========+==============
           |         Fabric &         |           |
           |      Fabric Services     |           v
           |                          |     Fibre Channel
           +--------------------------+     Fabric Domain

            An iFCP Network with iFCP Gateways

   Fibre Channel Devices            Fibre Channel Devices

   +--------+  +--------+           +--------+  +--------+
   |   FC   |  |   FC   |           |   FC   |  |   FC   |
   | Device |  | Device |   Fibre   | Device |  | Device |  Fibre
   |........|  |........|  Channel  |........|  |........|  Channel
   | N_PORT |  | N_PORT |<--------->| N_PORT |  | N_PORT |  Device
   +---+----+  +---+----+  Traffic  +----+---+  +----+---+  Domain
       |           |                     |           |        ^
   +---+----+  +---+----+           +----+---+  +----+---+    |
   | F_PORT |  | F_PORT |           | F_PORT |  | F_PORT |    |
  =+========+==+========+===========+========+==+========+==========
   |     iFCP Layer     |<--------->|     iFCP Layer     |    |
   |....................|     ^     |....................|    |
   |     FCP Portal     |     |     |     FCP Portal     |    v
   +--------+-----------+     |     +----------+---------+    IP
            |              Control             |            Fabric
            |               Data               |
            |                                  |
            |                                  |
            |<------Encapsulated Frames------->|
            |      +------------------+        |
            |      |                  |        |
            +------+    IP Network    +--------+
                   |                  |
                   +------------------+

The above diagram shows the simplest implementation of an equivalent iFCP fabric. Two gateway regions are shown. Each consists of Fibre Channel devices directly connected to the iFCP fabric through F_PORTs implemented as part of the edge switch or gateway.

Looking into the F_PORT on the Fibre Channel side of the gateway, the network appears as a Fibre Channel fabric.
Here, the gateway presents remote N_PORTs as directly attached devices. Conversely, on the IP network side, the gateway presents each locally connected N_PORT as a logical Fibre Channel device.

An important property of this gateway architecture is that the fabric configuration and topology within the gateway region are opaque to the IP network. That is, the topology in the Fibre Channel domain, whether it is loop-based or switch-based, is hidden from the IP network and from other gateways. Consequently, support for such FC fabric topologies becomes a gateway implementation option. In such cases, the gateway incorporates whatever functionality is required to distil and present locally attached N_PORTs (or NL_PORTs) as logical iFCP devices.

N_PORT to N_PORT communications that traverse a TCP/IP network require the intervention of the iFCP layer. This consists of the following operations:

a) Execution of the frame addressing and mapping functions described in section 3.3.

b) Execution of fabric-supplied link services addressed to one of the well-known Fibre Channel N_PORT addresses.

c) Encapsulation of Fibre Channel frames for injection into the TCP/IP network and de-encapsulation of Fibre Channel frames received from the TCP/IP network.

d) Establishment of an N_PORT login session in response to a PLOGI directed to a remote device.

The following sections discuss the frame addressing mechanism and the way in which it is used to achieve communications transparency between N_PORTs.

3.3 The N_PORT Addressing Model

This section discusses the role of the N_PORT addressing model in the routing of frames between locally and remotely attached N_PORTs.

In the case of a remote N_PORT, where the frame traffic must traverse the IP network, the gateway must perform this routing transparently with respect to the locally attached N_PORT.
To provide such transparency, the gateway maintains an association between the Fibre Channel address of a remote N_PORT, as seen by a locally attached device, and the corresponding address of the remote device on the IP network. To establish this association, the iFCP gateway assigns and manages Fibre Channel N_PORT fabric addresses as described in the following sections.

The fabric address of an N_PORT device is a 24-bit value having the following format defined by the Fibre Channel specification [FCS]:

   Bit  23         16 15          8 7         0
        +-----------+------------+----------+
        | Domain ID |  Area ID   | Port ID  |
        +-----------+------------+----------+

             Fibre Channel Address Format

Such addresses are volatile and subject to change based on modifications in the fabric configuration.

In a Fibre Channel fabric, each switch element has a unique Domain ID assigned by a master switch. The value of the Domain ID ranges from 1 to 239 (0xEF). Each switch in turn controls a block of 65,536 (64K) addresses divided into Area and Port IDs. N_PORTs logging into the fabric receive a unique fabric address consisting of the switch's Domain ID concatenated with switch-assigned Area and Port IDs.

These N_PORT addresses are carried in the Fibre Channel frame as shown in the following diagram.

        Bit  31    24 23                                 0
             +--------+-----------------------------------+
      Word 0 |        | Destination N_PORT Address (D_ID) |
             +--------+-----------------------------------+
      Word 1 |        |   Source N_PORT Address (S_ID)    |
             +--------+-----------------------------------+
           . |                                            |
           . |            Control information             |
           . |                and Payload                 |
    Word 527 +--------------------------------------------+
       (Max)

          Fibre Channel Address Fields within a Frame

The D_ID and S_ID fields represent the fabric addresses of the destination and source N_PORTs respectively.

In an iFCP storage fabric, the iFCP gateway replaces the FC switch element as the device responsible for N_PORT address assignment and frame routing.
Unlike an FC switch, however, an iFCP gateway must route frames between N_PORTs within the gateway region or to external devices attached to remote gateways on the IP network.

In order to be FC-compatible, the gateway must route such frames using only the embedded 24-bit address. By exploiting its control of address allocation and its access to frame traffic entering or leaving the gateway region, the gateway is able to achieve the necessary transparency.

The gateway may allocate device addresses in one of two ways:

a) Gateway local - A mode of address assignment in which the gateway locally assigns values for all N_PORT device addresses, including those of remote devices. The address of a remote device is represented by a gateway-assigned N_PORT alias. The scope of all such addresses is restricted to the gateway-controlled region. A gateway using local addressing is said to be operating in address-translation mode.

b) Fabric global - A mode of address assignment in which several gateways collaborate to form a "logical fabric". Each gateway in control of a region is responsible for obtaining and distributing unique Domain IDs from the address assignment authority as described in section 3.3.1.1. Consequently, within the scope of the logical fabric, the address of each N_PORT is unique. For that reason, gateway-assigned aliases are not required to represent remote N_PORTs. A gateway using fabric global addressing is said to be operating in address-transparent mode.

The choice of addressing mode involves the tradeoffs between scalability and transparency discussed below.

The scalability constraints are a byproduct of the Fibre Channel address allocation policy described above. As noted, an IP fabric using this address allocation scheme is limited to a combined total of 239 gateways and Fibre Channel switch elements.
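
As an illustration of the address layout and of the limit it imposes, the following Python sketch splits a 24-bit fabric address into its Domain, Area and Port IDs and derives the number of available Domain IDs. The names FcAddress, unpack_fc_address, pack_fc_address and max_switch_elements are hypothetical and are not defined by this specification; only the field widths and the Domain ID range of 1 to 239 come from the text above.

```python
from typing import NamedTuple

# Domain ID range stated in the text: 1 through 239 (0xEF).
DOMAIN_MIN = 0x01
DOMAIN_MAX = 0xEF

# Each Domain ID covers an 8-bit Area ID and an 8-bit Port ID.
ADDRESSES_PER_DOMAIN = 1 << 16


class FcAddress(NamedTuple):
    """Hypothetical container for the three fields of a 24-bit address."""
    domain: int
    area: int
    port: int


def unpack_fc_address(addr: int) -> FcAddress:
    """Split a 24-bit N_PORT fabric address into its three 8-bit fields."""
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("N_PORT fabric address must fit in 24 bits")
    return FcAddress((addr >> 16) & 0xFF, (addr >> 8) & 0xFF, addr & 0xFF)


def pack_fc_address(fields: FcAddress) -> int:
    """Combine Domain, Area and Port IDs into a 24-bit fabric address."""
    for value in fields:
        if not 0 <= value <= 0xFF:
            raise ValueError("each address field must fit in 8 bits")
    return (fields.domain << 16) | (fields.area << 8) | fields.port


def max_switch_elements() -> int:
    """Number of distinct Domain IDs, hence gateways plus switch elements."""
    return DOMAIN_MAX - DOMAIN_MIN + 1
```

For example, unpack_fc_address(0x0A1B2C) yields Domain ID 0x0A, Area ID 0x1B and Port ID 0x2C, and max_switch_elements() returns 239, the combined limit on gateways and switch elements.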
As the system expands, an IP fabric may consist of many switch elements distributed throughout the enterprise, each of which controls a small number of devices. In this case, the limitation in switch count may become a barrier to extending and fully integrating the storage network.

Gateway local addressing avoids this limitation by decoupling N_PORT fabric addresses from the constraints of Fibre Channel address space management. Consequently, a virtually unlimited number of iFCP gateways, Fibre Channel devices and switch elements may be internetworked. This mode of address allocation also simplifies management of the IP storage fabric configuration by eliminating the need for a centralized address-assignment authority.

A consequence of gateway local addressing is that the 24-bit N_PORT address is no longer unique across the storage network. As a result, when processing frame traffic to or from remote N_PORTs, the gateway must intervene to translate the 24-bit N_PORT addresses between the sending and receiving gateways. These address operations involve:

a) Translating the N_PORT IDs in the frame header and

b) Translating N_PORT IDs carried in the payload of certain basic or extended link service messages.

The process of N_PORT ID translation for the frame header is described in section 3.3.2. The processing for link services with frame addresses in the payload is described in section 7.1.

The details of the address-transparent and address-translation operational modes are discussed in the following sections.

3.3.1 Operation in Address Transparent Mode

The use of fabric global address assignments is an alternative where address transparency is considered more important than connectivity. In addition to the scalability limits discussed above, the following considerations and requirements pertain to this mode of operation:

a) The dependency on the services of a central address assignment authority, such as iSNS, may increase.
   If connectivity with the server is lost, new DOMAIN_ID values cannot be automatically allocated as gateways and Fibre Channel switch elements are added to the logical fabric. As a result, new gateways and switch elements cannot be automatically added to the IP fabric. It remains possible to add and manage such additional components manually.

b) Multiple iFCP gateways set up with independently administered address servers must be completely torn down and slaved under a single iSNS name server before they can be configured into the same logical fabric. In contrast, operation in gateway local mode requires only that the independent iSNS servers import client attributes from other iSNS servers before clients under different iSNS authorities can be made to interoperate.

c) iFCP gateways in transparent mode will not interoperate with iFCP gateways that are not in transparent mode.

d) When interoperating with locally attached Fibre Channel fabrics, the iFCP gateway MUST assume control of DOMAIN_ID assignments in accordance with the appropriate Fibre Channel standard or specification. As described in section 3.3.1.1, DOMAIN_ID values assigned to FC switches in attached fabrics must be issued by the iSNS server or manually assigned.

e) When operating in address-transparent mode, no Fibre Channel address translation SHALL take place, and no link service messages SHALL be augmented with additional information by the iFCP layer.

The process for establishing the TCP/IP context associated with an N_PORT login session in this mode is similar to that specified for address-translation mode (section 3.3.2).

3.3.1.1 Transparent Mode Domain ID Management

As described above, each gateway and Fibre Channel switch in a logical fabric must have a unique Domain ID.
In a gateway region containing Fibre Channel switch elements, each element obtains a Domain ID by querying a master switch element as described in [FC-SW]; in this case, the master switch is the iFCP gateway itself. The gateway in turn may obtain Domain IDs on demand from a central address allocation authority, such as an iSNS name server, or manually from a pre-assigned block of IDs. In that sense, the address authority (e.g., iSNS) assumes the role of master switch for the logical fabric.

3.3.1.2 Incompatibility with Address Translation Mode

iFCP gateways in address-transparent mode shall not originate or accept frames that do not have bit ??? ("iFCP TRANSPARENT MODE") set to one in the /TBD/ field of the encapsulation header. The iFCP gateway shall immediately terminate any N_PORT sessions with the iFCP gateway from which it receives such frames.

3.3.2 Operation in Address Translation Mode

This section summarizes the process for modifying FC frame addresses embedded in the frame header.

As described above, the iFCP gateway is responsible for assigning Fibre Channel N_PORT addresses to locally and remotely attached N_PORTs. For remotely attached N_PORTs, the gateway assigns an N_PORT alias used in place of the N_PORT address assigned by the remote gateway.

To perform this function and enable the appropriate routing, the gateway builds and maintains a table that maps N_PORT aliases to the appropriate TCP/IP connection and N_PORT ID of all external N_PORTs.

The gateway opportunistically builds the store of N_PORT addresses and TCP/IP connections for remotely attached devices in the IP fabric by:

a) Intercepting name service requests issued by locally attached N_PORTs as described below or,

b) Intercepting incoming N_PORT login requests from external Fibre Channel devices and outgoing N_PORT login requests directed to remote N_PORTs. Such requests are used to establish the N_PORT login session as described in section 6.1.
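
The table described above can be pictured with the following minimal Python sketch. It is an illustration under stated assumptions, not a definition of gateway behavior: the names RemotePort, TranslationTable, alias_for and resolve are hypothetical, the integer conn_id stands in for whatever TCP/IP connection context an implementation keeps, and the sequential alias allocation starting at first_alias is an arbitrary policy chosen for brevity. A reverse index is kept as well, so that the alias can be recovered from the connection and N_PORT ID of a received frame.

```python
from typing import Dict, NamedTuple


class RemotePort(NamedTuple):
    """Hypothetical key identifying an external N_PORT."""
    conn_id: int     # stand-in for the TCP/IP connection context
    n_port_id: int   # 24-bit N_PORT ID assigned by the remote gateway


class TranslationTable:
    """Maps gateway-assigned N_PORT aliases to external N_PORTs and back."""

    def __init__(self, first_alias: int) -> None:
        # Assumption: aliases are handed out sequentially from first_alias.
        self._next_alias = first_alias
        self._by_alias: Dict[int, RemotePort] = {}
        self._by_remote: Dict[RemotePort, int] = {}

    def alias_for(self, remote: RemotePort) -> int:
        """Return the alias for an external N_PORT, creating it if needed."""
        alias = self._by_remote.get(remote)
        if alias is None:
            if self._next_alias > 0xFFFFFF:
                raise OverflowError("no 24-bit aliases left")
            alias = self._next_alias
            self._next_alias += 1
            self._by_alias[alias] = remote
            self._by_remote[remote] = alias
        return alias

    def resolve(self, alias: int) -> RemotePort:
        """Return the connection and remote N_PORT ID behind an alias."""
        return self._by_alias[alias]
```

In this sketch, alias_for would be called when a name service response or a login request reveals an external N_PORT, and resolve when a locally originated frame addressed to an alias must be routed. Calling alias_for twice with the same connection and N_PORT ID returns the same alias.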
In response to name server requests, the iSNS server returns the IP address and N_PORT ID pair of the remote device. The IP address is mapped to the connection context. After saving the context and N_PORT ID, the iFCP layer creates the 24-bit N_PORT alias that is returned to the local N_PORT as the Fibre Channel address of the external device.

3.3.2.1 Translation Table Maintenance

The contents of the gateway's address translation tables are updated opportunistically, in response to the name service queries and PLOGI requests described previously. There is no need to invalidate entries in response to changes in the fabric configuration, since any potentially stale entries caused by such events are self-correcting as described below.

Once a fabric has achieved steady-state operation, any event that causes a change in the Fibre Channel address of a device also causes the device to terminate all N_PORT sessions. In the process of resuming operation, the status of the device, including its new address, is reflected in the name server's database. The new state of the device is advertised using the appropriate state change notifications. These, in turn, trigger the series of port login operations described below.

For inbound PLOGI requests, the iFCP gateway simply updates the translation table, generates the N_PORT alias and forwards the request to the local N_PORT for processing as described above.

For outbound requests, a fabric-attached Fibre Channel device usually precedes the PLOGI with a name server query to obtain the device's new N_PORT address. At this point, the iFCP gateway intercepts such a request, performs the necessary iSNS query, creates the translation table entry and returns the assigned N_PORT alias to the requester.

After issuing the PLOGI, the N_PORT verifies that it has logged in with the expected device by checking the device name returned in the PLOGI response.
An N_PORT that attempts to execute a PLOGI without first querying the name server is still required to confirm the device name as described above.

3.3.2.2 Frame Address Translation

For outbound frames, the table of external N_PORT network addresses is referenced to map the Destination N_PORT alias and Source N_PORT ID to a TCP connection identifier and the N_PORT ID assigned by the remote gateway. The translation process for outbound frames is shown below.

               Raw Fibre Channel Frame

   +--------+-----------------------------------+    +--------------+
   |        |     Destination N_PORT Alias      |--->|  Lookup TCP  |
   +--------+-----------------------------------+    |  connection  |
   |        |         Source N_PORT ID          |--->| and N_PORT ID|
   +--------+-----------------------------------+    +------+-------+
   |                                            |           | TCP
   |            Control information             |           | Conn
   |                and Payload                 |           |  &
   +--------------------------------------------+           | N_PORT
                                                            |  ID
                                                            |
     After Address Translation and TCP/IP Encapsulation     |
                                                            |
   +--------------------------------------------+   Conn    |
   |             iFCP Encapsulation             |<----------+
   |                   Header                   |  Context  |
   +========+===================================+           |
   |        |       Destination N_PORT ID       |<----------+
   +--------+-----------------------------------+
   |        |         Source N_PORT ID          |
   +--------+-----------------------------------+
   |                                            |
   |            Control information             |
   |                and Payload                 |
   +--------------------------------------------+

For inbound frames, the store regenerates the N_PORT alias from the TCP connection context and N_PORT ID contained in the encapsulated FC frame. The translation process for inbound frames is shown below.

           Network Format of Inbound Frame

   +--------------------------------------------+ Conn.  +--------+
   |         iFCP Encapsulation Header          |------->| N_PORT |
   |                                            |Context | Alias  |
   +========+===================================+        | Lookup |
   |        |       Destination N_PORT ID       |        |        |
   +--------+-----------------------------------+        |        |
   |        |         Source N_PORT ID          |------->|        |
   +--------+-----------------------------------+        +----+---+
   |                                            |             |N_PORT
   |            Control information             |             |Alias
   |                and Payload                 |             |
   +--------------------------------------------+             |
                                                              |
    Frame after Address Translation and De-encapsulation      |
                                                              |
   +--------+-----------------------------------+             |
   |        |       Destination N_PORT ID       |             |
   +--------+-----------------------------------+             |
   |        |        Source N_PORT Alias        |<------------+
   +--------+-----------------------------------+
   |                                            |
   |            Control information             |
   |                and Payload                 |
   +--------------------------------------------+

3.3.2.3 Incompatibility with Address Transparent Mode

iFCP gateways in address-translation mode shall not originate or accept frames that have bit ??? ("iFCP TRANSPARENT MODE") set to one in the /TBD/ field of the encapsulation header. The iFCP gateway shall immediately terminate any N_PORT sessions with the iFCP gateway from which it receives such frames.

3.4 iFCP Layered Services

The following diagram shows the functional layers for host devices that support FCP. As shown, iFCP provides a set of layered services that transparently provide the transport services required by FCP devices. Using the iFCP framework, any existing host FCP implementation will execute with no modifications required.

The iFCP protocol layer consists of the data transport services and iFCP-specific Link Services. This layer provides transport services specific to Fibre Channel devices as specified in [FC-PH], [FC-PH-2], and [FC-PH-3].

This is illustrated in the following diagram, which shows the IP Fabric consisting of the TCP/IP network and the iFCP Layer.
The IP Fabric provides the transport services for FCP, and is a direct replacement for the transport services provided by a Fibre Channel fabric. Meanwhile, the components in the Fibre Channel Device Domain remain unchanged.

   +---------------------------------------+ - - - - - - -
   |     Storage & Backup Applications     |
   +---------------------------------------+
   |           Operating System            |  Application
   +--------------------+                  |     Layer
   |        SCSI        |                  |
   +--------------------+                  | - - - - - - -
   |        FCP         |                  |  FC-4 Layer
   +------------+-------+------------------+ - - - - - - -
   |            |      Link Services       |
   |            +--------------------------+  FC-2 Layer       ^
   |                                       |                   |
   |       N_PORT - F_PORT Interface       |             Fibre Channel
   |                                       |             Device Domain
 <=============================================================>
   |                                       |               IP Fabric
   |      iFCP Data Transport Service      |                   |
   |                                       |                   v
   |                       +---------------+
   |                       | iFCP Specific |  iFCP Layer
   |                       | Link Services |
   +-----------------------+---------------+ - - - - - -
   |                                       |
   |                  TCP                  |  Transport
   |                                       |    Layer
   +---------------------------------------+ - - - - - -
   |                                       |
   |                  IP                   |   Network
   |                                       |    Layer
   +---------------------------------------+ - - - - - -
   |                                       |
   |          Physical Transport           |  Link Layer
   |                                       |
   +---------------------------------------+ - - - - - -

In the figure shown above, each layer leverages the services of the layer below it.

3.4.1 Application Layer

This includes the operating system, Storage and Backup applications, and the SCSI driver. This layer interfaces with FCP and Link Services in the FC-2 and FC-4 layers.

3.4.2 FC-4 Layer (FCP)

FCP is the Fibre Channel FC-4 layer application protocol used to communicate with devices implementing the SCSI-3 command set and architectural model. FCP divides each SCSI I/O operation into a series of information units to be transferred between the initiator and target.

3.4.3 FC-2 Layer

The FC-2 Layer provides the facilities for Link Services and transfer of Fibre Channel information units as described below.
3.4.3.1 Link Service Messages

Fibre Channel defines a series of link services in the Fibre Channel Physical and Signaling Interface specifications (FC-PH, FC-PH-2, FC-PH-3). These Link Service messages provide a set of defined functions that allow a Fibre Channel port to send control information or to request that another port perform a specific function. Some Link Service messages reference services provided internally within the Fibre Channel fabric.

3.4.3.2 N_PORT Interface

This interface provides access to Fibre Channel device functionality. The N_PORT interface is responsible for segmenting information units into Fibre Channel frames and reassembling them from received frames.

3.4.3.3 F_PORT Interface

This is the interface through which the N_PORT accesses the Fibre Channel fabric.

3.4.4 iFCP Layer

The iFCP layer provides three essential services for FCP-based storage products:

a) Transport of Fibre Channel frames and Link Service messages between N_PORTs.

b) Support for special Link Service messages needed by iFCP to manage the transmission of storage data on an IP network.

c) Augmentation of some Link Service messages with additional data needed in the iFCP environment.

The iFCP layer maps Fibre Channel frames to a predetermined TCP connection for transport. Many Link Service messages can likewise be transported over a TCP connection without modification.

4. iFCP Protocol

4.1 Overview

4.1.1 iFCP Transport Services

The iFCP transport services map the Fibre Channel frames comprising each FCP IU and Link Service message to a predetermined TCP connection for transport across an IP network.

When receiving FCP-based storage data from the network, the iFCP layer delivers each resulting frame to the appropriate N_PORT via the F_PORT. The iFCP layer never interprets the contents of the frame payload.
For incoming iFCP frames with control data, iFCP interprets the augmented information, modifies the frame content accordingly, and may forward the resulting frame to the N_PORT for further processing. For outbound Fibre Channel frames that require control data, the iFCP layer creates the augmented information based on frame content, modifies the frame content, then transmits the resulting Fibre Channel frame with augmented data through the appropriate TCP connection.

4.1.2 iFCP Support for Link Services

Some Link Service messages reference constructs that are specific to the Fibre Channel fabric environment and have no meaning in an IP fabric. When iFCP encounters such a message, it augments the information in the payload by placing additional information in the iFCP header. The receiving iFCP layer references the augmented information to reconstruct the original Link Service message. The reconstructed frames are then forwarded to the receiving N_PORT for further processing. Section 7.1 describes augmented Link Services in detail.

4.2 Mandatory FC-2 Functionality

[To be specified]

4.3 FC-2 Functionality Not Supported

[To be specified]

4.4 Optional FC-2 Functionality

[To be specified]

5. Encapsulation of Fibre Channel Frames

[Editor's note: This section will be based on the FCIP/iFCP common encapsulation specification.]

6. TCP Stream Transport of iFCP Frames

TCP connections MAY be established between FCP Portals that have discovered each other through a naming service or through manual configuration. If a TCP connection is not maintained between the FCP Portals, then a change in the status of remote N_PORTs must be discovered through a central name server authority.

Multiple TCP connections may exist between pairs of FCP Portals. Such connections are either "bound" or "unbound". An unbound connection is a TCP connection that is not actively supporting an N_PORT login session.
Pre-existing TCP connections between FCP Portals remain unbound and uncommitted until a CBIND message (see section 8.2) has been transmitted through them. When the iFCP layer detects a Port Login (PLOGI) message creating a login session between a pair of N_PORTs, it will select an existing unbound TCP connection or establish a new TCP connection, and send the CBIND message down that TCP connection. This allocates the TCP connection to that PLOGI login session. A TCP connection may not be bound to more than one N_PORT login session.

6.1 TCP Session Model

iFCP uses a single TCP connection to transport all Fibre Channel frames between each unique pair of N_PORTs. A TCP connection may be used by one and only one N_PORT login session.

6.2 TCP Port Numbers

An FCP Portal uses a single port number to receive TCP connection requests for iFCP over TCP. All TCP connections established between FCP Portals must be directed to the registered well-known port number assigned by the IANA. An FCP Portal may use any TCP port number consistent with its implementation of the TCP/IP stack to initiate a TCP connection, but each port number must be unique.

7. Link Services

The link services provide a set of functions that allow a port to send control information or request another port to perform a specific function. Each Link Service message (request and reply) is carried by a Fibre Channel sequence, and can be segmented into multiple frames.

The iFCP Layer is responsible for transporting Link Service messages across the IP fabric. This includes mapping Link Service messages appropriately from the domain of the Fibre Channel transport to that of the IP network. This process may involve manipulation of field values as the Link Service message passes between the IP and Fibre Channel fabrics. It may also require the inclusion of augmented data by the iFCP layer in order to make the Link Service message significant in the IP fabric domain.
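As a non-normative illustration of the connection binding rules in section 6, the following sketch models the selection of an unbound TCP connection when a PLOGI is detected. It is only one possible arrangement. The names PortalPair, TcpConnection, open_connection, and on_plogi, and the use of an N_PORT ID pair as the session key, are assumptions made for this example; the CBIND exchange itself is represented by a caller-supplied callback.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical session key: (source N_PORT ID, destination N_PORT ID).
Session = Tuple[int, int]


@dataclass
class TcpConnection:
    conn_id: int
    session: Optional[Session] = None  # None means the connection is unbound

    @property
    def bound(self) -> bool:
        return self.session is not None


@dataclass
class PortalPair:
    """TCP connections between one pair of FCP Portals (illustrative model)."""
    send_cbind: Callable[[TcpConnection, Session], None]
    connections: List[TcpConnection] = field(default_factory=list)

    def open_connection(self) -> TcpConnection:
        # Stands in for establishing a real TCP connection; it starts unbound.
        conn = TcpConnection(conn_id=len(self.connections))
        self.connections.append(conn)
        return conn

    def on_plogi(self, session: Session) -> TcpConnection:
        # Select an existing unbound connection, or establish a new one.
        conn = next((c for c in self.connections if not c.bound), None)
        if conn is None:
            conn = self.open_connection()
        # The CBIND message is sent down the selected connection ...
        self.send_cbind(conn, session)
        # ... which allocates it to exactly one N_PORT login session.
        conn.session = session
        return conn
```

Because a bound connection is never returned by the selection step, no connection in this model can carry more than one login session.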
Each link service or extended link service is processed according to one of the following policies: a) Transparent – The link service message and reply MUST be transported to the receiving N_PORT by the iFCP gateway without altering the message payload. The link service message and reply are not processed by the iFCP implementation. b) Augmented - Designates an extended link service reply or request containing fibre channel addresses in the payload or requiring other special processing by the iFCP implementation. The processing for augmented link services is described in this section. c) Rejected – When issued by a directly attached N_PORT, the specified link service request MUST be rejected by the iFCP implementation. The implementation MUST respond to the issuing N_PORT as specified in this document. This section describes the processing for augmented link services, including the manner in which augmentation data is transmitted over the IP network. Appendix A enumerates all link services and the iFCP processing policy that applies to each. 7.1 Augmented Link Service Messages Augmentation applies to the extended link service requests defined in this section. 
Such requests are transmitted in a fibre channel frame having the following format: Monia Standards Track 20 iFCP April 2001 Word +----------+------------------------------------------------+ 0| R_CTL | D_ID | | [22] | [Destination of extended link Service request] | +----------+------------------------------------------------+ 1| CS_CTL | S_ID | | | [Source of extended link service request] | +----------+------------------------------------------------+ 2| TYPE | F_CTL | +----------+------------------+-----------------------------+ 3| SEQ_ID | DF_CTL | SEQ_CNT | +----------+-----------+------+-----------------------------+ 4| OX_ID | RX_ID | +-----------------------------+-----------------------------+ 5| Parameter | | [ 00 00 00 00 ] | +-----------------------------------------------------------+ 6| LS_COMMAND | | [Extended Link Service Command Code] | +-----------------------------------------------------------+ 7| | .| Additional Service Request Parameters | .| ( if any ) | n| | +-----------------------------------------------------------+ Format of ELS Frame 7.2 Link Service Augmentation Augmented data includes information required by the receiving gateway to convert an N_PORT address in the payload to an N_PORT alias in the receiving gateway’s address space. The following rules define the manner in which such augmentation data is packaged and referenced. For an N_PORT address field, the gateway originating the frame MUST set the value in the payload to identify the data to be converted as follows: 0x00 00 00 – The receiving gateway MUST reference the augmentation data to set the field contents as described below. The augmentation information is the 64-bit world wide identifier of the N_PORT as set forth in the fibre channel specification. 0x00 00 01 – The gateway receiving the frame MUST replace the contents of the field with the N_PORT alias of the frame originator. 
0x00 00 02 - The gateway receiving the frame MUST replace the contents of the field with the N_PORT ID of the destination N_PORT.

Since Fibre Channel addressing rules prohibit the assignment of fabric addresses with a domain ID of 0, these codes will never correspond to valid N_PORT fabric IDs.

When the augmentation data is a 64-bit world wide unique N_PORT identifier, the receiving gateway SHALL obtain the information needed to fill in the ELS field by converting the N_PORT world wide identifier to a gateway IP address and N_PORT ID. This information MUST be obtained through a name server query. If the N_PORT is locally attached, the gateway MUST fill in the field with the N_PORT ID. If the N_PORT is remotely attached, the gateway MUST assign an N_PORT alias and fill in the field with it. If an N_PORT alias has already been assigned, it MAY be reused.

In the event that the sending gateway cannot obtain the world wide identifier of an N_PORT, or a receiving gateway cannot obtain the IP address and N_PORT ID, the gateway detecting the error SHALL terminate the request with an LS_RJT message as described in [FCS]. The Reason Code SHALL be set to 0x07 (protocol error) and the Reason Explanation SHALL be set to 0x1F (Invalid N_PORT identifier).

[Editor's note: Such errors, when detected by the receiving gateway, may be indicative of a serious problem requiring a more drastic response. Therefore, this section should be regarded as tentative.]

Augmented data is sent with the ELS request or ACC frames in one of the following ways:

a) By appending the necessary data to the end of the ELS frame.

b) By extending the sequence with additional frames.

In the first case, a new frame SHALL be created whose length includes the augmented data. The procedure for extending the ELS sequence with additional frames is /TBS/.
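The address field rules above can be sketched as follows. This is a non-normative illustration, not a definitive implementation. The class ReceivingGateway, the LsReject exception, the name server callback signature, and the starting alias value are all hypothetical. Passing through a field value that is not one of the three special codes is likewise an assumption, since the text does not address that case.

```python
from typing import Callable, Dict, Optional, Tuple

# Special values of an N_PORT address field, as listed in this section.
FROM_AUGMENTATION = 0x000000   # resolve from the 64-bit world wide identifier
ORIGINATOR_ALIAS = 0x000001    # use the N_PORT alias of the frame originator
DESTINATION_ID = 0x000002      # use the N_PORT ID of the destination N_PORT

REASON_PROTOCOL_ERROR = 0x07
EXPLANATION_INVALID_N_PORT = 0x1F


class LsReject(Exception):
    """Signals that the request is to be terminated with LS_RJT."""

    def __init__(self, reason: int, explanation: int) -> None:
        super().__init__(f"LS_RJT reason=0x{reason:02X} explanation=0x{explanation:02X}")
        self.reason = reason
        self.explanation = explanation


class ReceivingGateway:
    def __init__(self, local_ip: str,
                 name_server: Callable[[int], Optional[Tuple[str, int]]]) -> None:
        self.local_ip = local_ip
        # Hypothetical query: world wide identifier -> (gateway IP, N_PORT ID) or None.
        self.name_server = name_server
        self.aliases: Dict[Tuple[str, int], int] = {}
        self._next_alias = 0x010001  # arbitrary starting alias for this example

    def alias_for(self, gateway_ip: str, n_port_id: int) -> int:
        # An alias that has already been assigned is reused.
        key = (gateway_ip, n_port_id)
        if key not in self.aliases:
            self.aliases[key] = self._next_alias
            self._next_alias += 1
        return self.aliases[key]

    def resolve_field(self, value: int, originator_alias: int,
                      destination_id: int, wwn: Optional[int] = None) -> int:
        if value == ORIGINATOR_ALIAS:
            return originator_alias
        if value == DESTINATION_ID:
            return destination_id
        if value != FROM_AUGMENTATION:
            return value  # assumption: ordinary addresses pass through unchanged
        entry = self.name_server(wwn) if wwn is not None else None
        if entry is None:
            raise LsReject(REASON_PROTOCOL_ERROR, EXPLANATION_INVALID_N_PORT)
        gateway_ip, n_port_id = entry
        if gateway_ip == self.local_ip:
            return n_port_id                          # locally attached N_PORT
        return self.alias_for(gateway_ip, n_port_id)  # remotely attached N_PORT
```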
After applying the augmented data, the receiving gateway SHALL forward the resulting ELS to the destination N_PORT with the augmented information removed. When the ACC response must be augmented, the receiving gateway must act as a proxy for the originator, retaining the state needed to process the response from the N_PORT to which the request was directed.

7.3 Augmented Link Services

The following Link Service Messages must receive special processing or be augmented with additional control data. When the iFCP header encapsulates one of these Extended Link Service messages in the iFCP payload, the AUGMENTATION PRESENT bit must be set in the iFCP FLAGS field as specified in section /TBS/, and the augmentation data must be appended as described in the following sections. An ELS response frame containing augmented data must be similarly formatted.

Link Service Message                     LS_COMMAND     Mnemonic
--------------------                     ----------     --------
Abort Exchange                           0x06 00 00 00  ABTX
Discover Address                         0x52 00 00 00  ADISC
FC Address Resolution Protocol Reply     0x55 00 00 00  FARP-REPLY
FC Address Resolution Protocol Request   0x54 00 00 00  FARP-REQ
Logout                                   0x05 00 00 00  LOGO
Port Login                               0x03 00 00 00  PLOGI
Read Exchange Concise                    0x13 00 00 00  REC
Read Exchange Status Block               0x08 00 00 00  RES
Read Link Error Status Block             0x0F 00 00 00  RLS
Read Sequence Status Block               0x09 00 00 00  RSS
Reinstate Recovery Qualifier             0x12 00 00 00  RRQ
Request Sequence Initiative              0x0A 00 00 00  RSI
Third Party Process Logout               0x24 00 00 00  TPRLO

7.3.1 Abort Exchange (ABTX)

ELS Format:

+------+------------+------------+-----------+----------+
| Word | Bits 31-24 | Bits 23-16 | Bits 15-8 | Bits 7-0 |
+------+------------+------------+-----------+----------+
| 0    | Cmd = 0x06 | 0x00       | 0x00      | 0x00     |
+------+------------+------------+-----------+----------+
| 1    | RRQ Status | Exchange Originator S_ID          |
+------+------------+------------+-----------+----------+
| 2    | OX_ID of tgt exchange   | RX_ID of tgt exchange|
+------+------------+------------+-----------+----------+
| 3-10 | Optional association header (32 bytes)         |
+======+============+============+===========+==========+

The originating iFCP gateway SHALL set the contents of the exchange originator S_ID to 0x000001 as specified in section 7.2.

7.3.2 Discover Address (ADISC)

ELS Format:

+------+------------+------------+-----------+----------+
| Word | Bits 31-24 | Bits 23-16 | Bits 15-8 | Bits 7-0 |
+------+------------+------------+-----------+----------+
| 0    | Cmd = 0x52 | 0x00       | 0x00      | 0x00     |
+------+------------+------------+-----------+----------+
| 1    | Reserved   | Hard address of initiator         |
+------+------------+------------+-----------+----------+
| 2-3  | Port Name of Originator                        |
+------+------------+------------+-----------+----------+
| 4-5  | Node Name of Originator                        |
+------+------------+------------+-----------+----------+
| 6    | Rsvd       | N_PORT ID of Originator           |
+======+============+============+===========+==========+

The originating iFCP gateway SHALL set the contents of the originator N_PORT ID to 0x000001 as specified in section 7.2. The originating gateway SHALL NOT modify the hard address of the initiator. The gateway processing the ACC response MUST set the Hard Address field to 0.
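A minimal sketch of the originating gateway's handling of an ADISC request, assuming the seven-word payload layout shown above and big-endian word order, might look as follows. The function name prepare_outbound_adisc is hypothetical, and preserving the reserved byte of word 6 is an assumption of this example.

```python
import struct

# Words 0, 1, 2-3 (port name), 4-5 (node name), 6; 28 bytes in total.
ADISC = struct.Struct(">II8s8sI")
ADISC_COMMAND = 0x52000000
ORIGINATOR_ALIAS_CODE = 0x000001  # code defined in section 7.2


def prepare_outbound_adisc(payload: bytes) -> bytes:
    """Set the originator N_PORT ID to 0x000001; leave everything else alone."""
    if len(payload) != ADISC.size:
        raise ValueError("unexpected ADISC payload length")
    command, hard_address, port_name, node_name, word6 = ADISC.unpack(payload)
    if command != ADISC_COMMAND:
        raise ValueError("not an ADISC request")
    # Keep the reserved high byte of word 6, replace the 24-bit N_PORT ID.
    word6 = (word6 & 0xFF000000) | ORIGINATOR_ALIAS_CODE
    # The hard address of the initiator (word 1) is not modified.
    return ADISC.pack(command, hard_address, port_name, node_name, word6)
```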
7.3.3 FC Address Resolution Protocol Reply /TBS/ 7.3.4 FC Address Resolution Protocol Request /TBS/ 7.3.5 Logout (LOGO) ELS Format: +------+------------+------------+-----------+----------+ | Word | Bits 31–24 | Bits 23–16 | Bits 15–8 | Bits 7-0 | +------+------------+------------+-----------+----------+ | 0 | Cmd = 0x5 | 0x00 | 0x00 | 0x00 | +------+------------+------------+-----------+----------+ | 1 | Rsvd | N_PORT I/D being logged out | +------+------------+------------+-----------+----------+ | 2-3 | Port name of the LOGO originator (8 bytes) | +======+============+============+===========+==========+ The originating iFCP gateway shall set the N_PORT I/D to 0. The receiving gateway SHALL fill in the contents of this field using the N_PORT name of the LOGO originator in words 2 and 3. Monia Standards Track 24 iFCP April 2001 7.3.6 Port Login (PLOGI) PLOGI provides the mechanism for establishing a login session between two N_PORTs. The PLOGI request carries information identifying the originating N_PORT, including specification of its capabilities and limitations. If the destination N_PORT accepts the login request, it sends an accept (an ACC frame with PLOGI payload), specifying its capabilities and limitations. This exchange establishes the operating environment for the two N_PORTs. The following figure is duplicated from FC-PH, and shows the PLOGI message format for both request and accept (ACC) response. A port will reject a PLOGI request by transmitting an LS_RJT message, which contains no payload. 
Byte
Offset
      +----------------------------------+
  0   | LS_COMMAND                       | 4 Bytes
      +----------------------------------+
  4   | COMMON SERVICE PARAMETERS        | 16 Bytes
      +----------------------------------+
 20   | PORT NAME                        | 8 Bytes
      +----------------------------------+
 28   | NODE NAME                        | 8 Bytes
      +----------------------------------+
 36   | CLASS 1 SERVICE PARAMETERS       | 16 Bytes
      +----------------------------------+
 52   | CLASS 2 SERVICE PARAMETERS       | 16 Bytes
      +----------------------------------+
 68   | CLASS 3 SERVICE PARAMETERS       | 16 Bytes
      +----------------------------------+
 84   | CLASS 4 SERVICE PARAMETERS       | 16 Bytes
      +----------------------------------+
100   | VENDOR VERSION LEVEL             | 16 Bytes
      +----------------------------------+
      Total Length = 116 bytes

Details on the above fields, including common and class-based service parameters, can be found in [FC-PH]. The above PLOGI message is transported by the iFCP layer without modification.

[Editor's note: The service parameter details that apply to an iFCP environment are /TBS/.]

7.3.7 Read Exchange Concise (REC)

ELS Format:

+------+------------+------------+-----------+----------+
| Word | Bits 31-24 | Bits 23-16 | Bits 15-8 | Bits 7-0 |
+------+------------+------------+-----------+----------+
| 0    | Cmd = 0x13 | 0x00       | 0x00      | 0x00     |
+------+------------+------------+-----------+----------+
| 1    | Rsvd       | Exchange Originator S_ID          |
+------+------------+------------+-----------+----------+
| 2    | OX_ID                   | RX_ID                |
+======+============+============+===========+==========+
| 3-4  |Port name of the exchange originator (8 bytes)  |
+======+============+============+===========+==========+

The originating gateway SHALL set the Exchange Originator S_ID field to 0 and augment the ELS by appending the port name of the exchange originator. The receiving gateway SHALL fill in the S_ID using the port name of the originator.
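The REC processing just described can be sketched as a pair of functions, one for each gateway. This is a non-normative illustration that assumes big-endian word order and the append-to-frame form of augmentation. The function names and the resolver callback, which stands in for the port name conversion of section 7.2, are hypothetical.

```python
import struct
from typing import Callable, Optional

REC_COMMAND = 0x13000000
REC_BASE = struct.Struct(">III")  # words 0-2 of the REC request
PORT_NAME_LEN = 8


def augment_rec(payload: bytes, originator_port_name: bytes) -> bytes:
    """Originating gateway: zero the S_ID and append the originator port name."""
    if len(originator_port_name) != PORT_NAME_LEN:
        raise ValueError("port name must be 8 bytes")
    command, word1, exchange_ids = REC_BASE.unpack(payload[:REC_BASE.size])
    if command != REC_COMMAND:
        raise ValueError("not a REC request")
    word1 &= 0xFF000000  # Exchange Originator S_ID = 0
    return REC_BASE.pack(command, word1, exchange_ids) + originator_port_name


def restore_rec(augmented: bytes,
                resolve: Callable[[bytes], Optional[int]]) -> bytes:
    """Receiving gateway: fill in the S_ID and remove the augmentation data."""
    base = augmented[:REC_BASE.size]
    port_name = augmented[REC_BASE.size:REC_BASE.size + PORT_NAME_LEN]
    if len(port_name) != PORT_NAME_LEN:
        raise ValueError("augmentation data missing")
    command, word1, exchange_ids = REC_BASE.unpack(base)
    s_id = resolve(port_name)  # hypothetical lookup: port name -> 24-bit address
    if s_id is None:
        raise LookupError("port name could not be resolved")
    word1 = (word1 & 0xFF000000) | (s_id & 0xFFFFFF)
    return REC_BASE.pack(command, word1, exchange_ids)
```

The frame forwarded to the destination N_PORT is 12 bytes long again, since the appended port name is removed once it has been applied.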
7.3.8 Read Exchange Concise Accept Format of ACC Response: +------+------------+------------+-----------+----------+ | Word | Bits 31–24 | Bits 23–16 | Bits 15–8 | Bits 7-0 | +------+------------+------------+-----------+----------+ | 0 | Acc = 0x02 | 0x00 | 0x00 | 0x00 | +------+------------+------------+-----------+----------+ | 1 | OX_ID | RX_ID | +------+------------+------------+-----------+----------+ | 2 | Rsvd | Exchange Originator N_PORT ID | +------+------------+------------+-----------+----------+ | 3 | Rsvd | Exchange Responder N_PORT ID | +------+------------+------------+-----------+----------+ | 4 | Data Transfer Count | +------+------------+------------+-----------+----------+ | 5 | Exchange Status | +======+============+============+===========+==========+ | 6-7 |Port name of the exchange originator (8 bytes) | +======+============+============+===========+==========+ | 8-9 |Port name of the exchange responder (8 bytes) | +======+============+============+===========+==========+ The iFCP gateway originating the ACC response SHALL set the Exchange Originator and Exchange Responder N_PORT IDs to 0 and SHALL augment the ELS by appending the port names of the originator and responder as shown above. 
7.3.9 Read Exchange Status Block (RES)

ELS Format:

+------+------------+------------+-----------+----------+
| Word | Bits 31-24 | Bits 23-16 | Bits 15-8 | Bits 7-0 |
+------+------------+------------+-----------+----------+
| 0    | Cmd = 0x08 | 0x00       | 0x00      | 0x00     |
+------+------------+------------+-----------+----------+
| 1    | Rsvd       | Exchange Originator S_ID          |
+------+------------+------------+-----------+----------+
| 2    | OX_ID                   | RX_ID                |
+------+------------+------------+-----------+----------+
| 3-10 | Association header (may be optionally req'd)   |
+======+============+============+===========+==========+
| 11-12| Port name of the exchange originator (8 bytes) |
+======+============+============+===========+==========+

The originating iFCP gateway SHALL set the S_ID field to 0 and augment the ELS frame by appending the port name as shown above. The receiving gateway SHALL reference the appended port name to fill in the exchange originator S_ID field as described in section 7.2.
7.3.10 Read Exchange Status Block Accept Format of ELS Accept Response: +------+------------+------------+-----------+----------+ | Word | Bits 31–24 | Bits 23–16 | Bits 15–8 | Bits 7-0 | +------+------------+------------+-----------+----------+ | 0 | Acc = 0x02 | 0x00 | 0x00 | 0x00 | +------+------------+------------+-----------+----------+ | 1 | OX_ID | RX_ID | +------+------------+------------+-----------+----------+ | 2 | Rsvd | Exchange Originator N_PORT ID | +------+------------+------------+-----------+----------+ | 3 | Rsvd | Exchange Responder N_PORT ID | +------+------------+------------+-----------+----------+ | 4 | Exchange Status Bits | +------+------------+------------+-----------+----------+ | 5 | Reserved | +------+------------+------------+-----------+----------+ | 6–n | Service Parameters and Sequence Statuses | | | as described in [FCS] | +======+============+============+===========+==========+ |n+1- | Port name of the exchange originator (8 bytes) | |n+8 | | +======+============+============+===========+==========+ |n+9- | Port name of the exchange responder (8 bytes) | |n+16 | | +======+============+============+===========+==========+ The N_PORT I/Ds of the originator and responder are set to 0. The augmented data needed to format the ELS ACC response is Monia Standards Track 27 iFCP April 2001 appended to the end of the variable length ACC data as shown above. 
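Because the ACC data is variable in length, a receiving gateway would locate the augmentation data by counting back from the end of the response. The sketch below illustrates this under two assumptions that the text does not confirm: that each appended port name occupies exactly 8 bytes, and that words are big-endian. The function name and the resolver callback are hypothetical.

```python
import struct
from typing import Callable

PORT_NAME_LEN = 8
FIXED_WORDS_LEN = 24  # words 0-5 of the ACC precede the variable-length data


def restore_res_acc(augmented: bytes, resolve: Callable[[bytes], int]) -> bytes:
    """Fill in words 2 and 3 from the trailing port names, then strip them."""
    tail_len = 2 * PORT_NAME_LEN
    if len(augmented) < FIXED_WORDS_LEN + tail_len:
        raise ValueError("response too short to carry augmentation data")
    body = bytearray(augmented[:-tail_len])
    originator_name = augmented[-tail_len:-PORT_NAME_LEN]
    responder_name = augmented[-PORT_NAME_LEN:]
    # Word 2: exchange originator N_PORT ID; word 3: exchange responder N_PORT ID.
    struct.pack_into(">I", body, 8, resolve(originator_name) & 0xFFFFFF)
    struct.pack_into(">I", body, 12, resolve(responder_name) & 0xFFFFFF)
    return bytes(body)
```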
7.3.11 Read Link Error Status (RLS) ELS Format: +------+------------+------------+-----------+----------+ | Word | Bits 31–24 | Bits 23–16 | Bits 15–8 | Bits 7-0 | +------+------------+------------+-----------+----------+ | 0 | Cmd = 0x0F | 0x00 | 0x00 | 0x00 | +------+------------+------------+-----------+----------+ | 1 | Rsvd | N_PORT Identifier | +======+============+============+===========+==========+ | 2-9 | Port name of the N_PORT (8 bytes) | +======+============+============+===========+==========+ The originating gateway MUST set the N_PORT identifier to 0 and augment the ELS by appending the port name as shown above. The receiving gateway MUST fill in the N_PORT Identifier as described in section 7.2. 7.3.12 Read Sequence Status Block (RSS) ELS Format: +------+------------+------------+-----------+----------+ | Word | Bits 31–24 | Bits 23–16 | Bits 15–8 | Bits 7-0 | +------+------------+------------+-----------+----------+ | 0 | Cmd = 0x09 | 0x00 | 0x00 | 0x00 | +------+------------+------------+-----------+----------+ | 1 | SEQ_ID | Exchange Originator S_ID | +------+------------+------------+-----------+----------+ | 2 | OX_ID | RX_ID | +======+============+============+===========+==========+ | 3-4 |Port name of the exchange originator (8 bytes) | +======+============+============+===========+==========+ The originating gateway MUST set the N_PORT identifier to 0 and augment the ELS by appending the port name as shown above. The receiving gateway MUST fill in the N_PORT Identifier as described in section 7.2. 
7.3.13 Reinstate Recovery Qualifier (RRQ) ELS Format: Monia Standards Track 28 iFCP April 2001 +------+------------+------------+-----------+----------+ | Word | Bits 31–24 | Bits 23–16 | Bits 15–8 | Bits 7-0 | +------+------------+------------+-----------+----------+ | 0 | Cmd = 0x12 | 0x00 | 0x00 | 0x00 | +------+------------+------------+-----------+----------+ | 1 | Rsvd | Exchange Originator S_ID | +------+------------+------------+-----------+----------+ | 2 | OX_ID | RX_ID | +------+------------+------------+-----------+----------+ | 3-10 | Association header (may be optionally req’d) | +======+============+============+===========+==========+ The originating iFCP gateway SHALL set the S_ID field to 1. The receiving gateway SHALL fill in the exchange originator S_ID field with the N_PORT alias as described in section 7.2. 7.3.14 Request Sequence Initiative (RSI) ELS Format: +------+------------+------------+-----------+----------+ | Word | Bits 31–24 | Bits 23–16 | Bits 15–8 | Bits 7-0 | +------+------------+------------+-----------+----------+ | 0 | Cmd = 0x0A | 0x00 | 0x00 | 0x00 | +------+------------+------------+-----------+----------+ | 1 | Rsvd | Exchange Originator S_ID | +------+------------+------------+-----------+----------+ | 2 | OX_ID | RX_ID | +------+------------+------------+-----------+----------+ | 3-10 | Association header (may be optionally req’d) | +======+============+============+===========+==========+ The originating iFCP gateway SHALL set the S_ID field to 1. The receiving gateway SHALL fill in the exchange originator S_ID field with the N_PORT alias as described in section 7.2. 7.3.15 Third Party Process Logout (TPRLO) TPRLO provides a mechanism for an N_PORT (third party) to remove one or more login sessions that exists between the destination N_PORT and other N_PORTs specified in the command. 
This command includes one or more TPRLO LOGOUT PARAMETER PAGEs, each of which when combined with the destination N_PORT identifies a SCSI login session which shall be terminated by the command. Monia Standards Track 29 iFCP April 2001 Byte Offset +----------------------------------+ 0 | LS_COMMAND | 1 Byte +----------------------------------+ 1 | PAGE LENGTH (0x10) | 1 Byte +----------------------------------+ 2 | PAYLOAD LENGTH (0x14) | 2 Bytes +----------------------------------+ 4 | TPRLO LOGOUT PARAMETER PAGE 1 | 2-4 Bytes +----------------------------------+ | . . . . | M Bytes +----------------------------------+ | TPRLO LOGOUT PARAMETER PAGE N | 2-4 Bytes +----------------------------------+ Total Length = Variable Each TPRLO LOGOUT PARAMETER PAGE identifies a remote N_PORT which when combined with the destination N_PORT identifies a SCSI session to be terminated. The TPRLO LOGOUT PARAMETER PAGE is of the following format: Byte Offset +----------------------------------+ 0 | TYPE CODE | 1 Byte +----------------------------------+ 1 | TYPE CODE EXTENSION | 1 Byte +----------------------------------+ 2 | TPRLO FLAGS | 2 Bytes +----------------------------------+ 4 | ORIG PROCESS ASSOC (if present) | 4 Bytes +----------------------------------+ 8 | RESP PROCESS ASSOC (if present) | 4 Bytes +----------------------------------+ 12 | RESERVED | 1 Byte +----------------------------------+ 13 | THIRD PARTY ORIGINATOR N_PORT ID | 3 Bytes +----------------------------------+ When the iFCP header contains a TPRLO message (including the ACC response), iFCP augmented data field will contain the PORT_NAME(s) (WWPN) identifying the N_PORT described by the equivalent TPRLO LOGOUT PARAMETER PAGE(s). If more than one TPRLO LOGOUT PARAMETER PAGE is contained in the Link Service message, the corresponding PORT_NAME shall also be included. PORT_NAMEs shall be listed in the same order as the equivalent TPRLO LOGOUT PARAMETER PAGEs in the original Link Service message. 
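The in-order pairing of PORT_NAMEs with TPRLO LOGOUT PARAMETER PAGEs can be sketched as follows. Since the format for passing augmentation data is not yet specified, this example simply takes the port names as a separate list, which is an assumption. The function name and resolver callback are hypothetical, and the step of writing an N_PORT ID into each page anticipates the receive-side processing described in the remainder of this section.

```python
import struct
from typing import Callable, Sequence

HEADER = struct.Struct(">BBH")     # LS_COMMAND, PAGE LENGTH, PAYLOAD LENGTH
PAGE = struct.Struct(">BBHIIB3s")  # one 16-byte TPRLO LOGOUT PARAMETER PAGE
TPRLO_COMMAND = 0x24


def fill_third_party_ids(payload: bytes, port_names: Sequence[bytes],
                         resolve: Callable[[bytes], int]) -> bytes:
    """Write an N_PORT ID into each page, pairing pages and names in order."""
    command, page_length, payload_length = HEADER.unpack_from(payload, 0)
    if command != TPRLO_COMMAND or page_length != PAGE.size:
        raise ValueError("not a TPRLO payload with 16-byte pages")
    page_count = (payload_length - HEADER.size) // page_length
    if page_count != len(port_names):
        raise ValueError("one PORT_NAME is required per parameter page")
    out = bytearray(payload[:payload_length])
    for index, name in enumerate(port_names):
        offset = HEADER.size + index * page_length
        fields = list(PAGE.unpack_from(out, offset))
        # Last field: THIRD PARTY ORIGINATOR N_PORT ID (3 bytes, big-endian).
        fields[6] = (resolve(name) & 0xFFFFFF).to_bytes(3, "big")
        PAGE.pack_into(out, offset, *fields)
    return bytes(out)
```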
[The format for passing augmentation data is /TBS/]

Additionally, the THIRD PARTY ORIGINATOR N_PORT ID field in each TPRLO LOGOUT PARAMETER PAGE shall be cleared when it is sent by the originating gateway. This applies to both the original Link Service message and the ACC response. When the iFCP layer receives a TPRLO message, it shall use the corresponding PORT_NAME from the augmentation data to fill in the THIRD PARTY ORIGINATOR N_PORT ID in the original Link Service message before forwarding it to the upper Fibre Channel layers.

Additional information on TPRLO can be found in [FC-PH-2].

8. TCP Link Service Messages

TCP Link Service Messages are used to manage TCP connections. They are passed between peer FCP Portals, and are only processed within the iFCP layer. The response to the TCP Link Service Message (if any) will echo the original request. The LS_COMMAND value for the response remains the same as that used for the request. Additionally, the ABTS request shall never be generated for any TCP Link Service Message.

[Editor's note: Since these messages are never passed to the Fibre Channel device, the use of the FC ELS format is not required. However, leveraging the format may benefit a gateway implementation. Depending on the tradeoffs, therefore, the format may be modified to eliminate use of the ELS as a message template.]

The Link Service frame carrying a TCP ELS message is identified by the TCP ELS bit being set in the iFCP FLAGS field of the iFCP header. Additionally, the TYPE field is 0x01 and the R_CTL field is 0x22 for the request and 0x23 for the reply.

The following lists the TCP Link Service messages and their corresponding LS_COMMAND values.
Request                         LS_COMMAND  Short Name  iFCP Support
-------                         ----------  ----------  ------------
Control Connection Bind         0xE0        CBIND       REQUIRED
Unbind Connection               0xE4        UNBIND      REQUIRED
TCP Message                     0xE8        TCPMSG      REQUIRED
Network Connection Interfaces   0xED        NINTF       REQUIRED

8.1 Network Connection Interfaces (NINTF)

NINTF allows an FCP Portal to request information on other network interfaces that may be used to establish connections with the responding gateway implementation. This extended link service will return the number of network interfaces available, and an interface descriptor record for a single interface. Since each NINTF request returns information on one interface, multiple NINTF requests are required to obtain information on more than one interface.

The following shows the format of the NINTF request message.

Byte
Offset
      +----------------------------------+
  0   | LS_COMMAND (0xED000000)          | 4 Bytes
      +----------------------------------+
  4   | USER INFO                        | 4 Bytes
      +----------------------------------+
  8   | INTERFACE KEY                    | 2 Bytes
      +----------------------------------+
 10   | RESERVED                         | 2 Bytes
      +----------------------------------+
      Total Length = 12

USER INFO - Contains any data desired by the requester. The value will be echoed by the recipient.

INTERFACE KEY - Contains an index to the interface about which the NINTF message is querying. Each interface at the destination shall be sequentially numbered beginning with 1. If the number of interfaces supported by the message recipient is unknown, then this field shall contain 0. In the NINTF response, the recipient will indicate the number of interfaces supported. The sender can reference each of these interfaces in subsequent NINTF messages by setting INTERFACE KEY to a value from 1 up to the highest-numbered interface.

The following shows the format of the NINTF response.
Byte    MSb                            LSb
Offset   7   6   5   4   3   2   1   0
      +----------------------------------+
  0   | LS_COMMAND (0xED000000)          | 4 Bytes
      +----------------------------------+
  4   | USER INFO                        | 4 Bytes
      +----------------------------------+
  8   | RESERVED                         | 3 Bytes
      +----------------------------------+
 11   | INTERFACES AVAILABLE (A)         | 1 Byte
      +----------------------------------+
 12   | INTERFACE RECORDS                | X Bytes
      +----------------------------------+
      Total Length = X + 12

USER INFO - This 4-byte field has the same value as the USER INFO field in the NINTF request. The recipient echoes this value back to the sender and does not perform any operation using this field.

INTERFACES AVAILABLE (A) - This parameter specifies the number of additional network interfaces that may be used to establish TCP connections. The value stored in this field also specifies the number (A) of network interface records that are present at the end of the message.

INTERFACE RECORDS - This field contains A interface records describing each of the network interfaces.

An interface record consists of 5 parameters as shown below.

Byte    MSb                            LSb
Offset   7   6   5   4   3   2   1   0
      +----------------------------------+
  0   | RECORD LENGTH                    | 1 Byte
      +----------------------------------+
  1   | IP ADDRESS TYPE                  | 1 Byte
      +----------------------------------+
  2   | INTERFACE HANDLE                 | 2 Bytes
      +----------------------------------+
  4   | RESERVED                         | 4 Bytes
      +----------------------------------+
  8   | INTERFACE SPEED                  | 4 Bytes
      +----------------------------------+
 12   | IP ADDRESS                       | X-12 Bytes
      +----------------------------------+
      Total Length = X

RECORD LENGTH - Defines the total length, in bytes, of the INTERFACE RECORD, including the RECORD LENGTH field. This value shall be a multiple of 4 bytes.

IP ADDRESS TYPE - Defines the type of address in the IP ADDRESS field. 0x01 indicates IPv4; 0x02 indicates IPv6.

INTERFACE HANDLE - This 16-bit field contains an identifying number (i.e., handle) assigned to the interface by the destination N_PORT.
INTERFACE SPEED - This parameter specifies the data rate of the interface in bits per second. The value in this field is the data rate divided by 1024. For example, a value of 1024 indicates a data rate of 1048576 bits per second.

IP ADDRESS - This field contains the IP address of the network interface for which information is being returned. If the address type is N bytes long and the field is larger than N, the address shall be in the first N bytes of the field with the remainder of the field set to 0.

8.2 Connection Bind (CBIND)

The CBIND message binds an N_PORT login session to a specific TCP connection. In the CBIND request message, the source and destination N_PORTs are identified by the N_PORT network address (iFCP portal address and N_PORT ID). The following shows the format of the CBIND request.

Byte     MSb                             LSb
Offset    7   6   5   4   3   2   1   0
         +----------------------------------+
  0      |     LS_COMMAND (0xE0000000)      | 4 Bytes
         +----------------------------------+
  4      |            USER INFO             | 4 Bytes
         +----------------------------------+
  8      |         SOURCE PORT NAME         | 8 Bytes
         +----------------------------------+
                  Total Length = 16

USER INFO - Contains any data desired by the requester. This value is echoed back by the recipient.

SOURCE PORT NAME - Contains the originating N_PORT's World Wide Port Name (WWPN). The FCP Portal uses this to verify that there is no pre-existing N_PORT session between the source and destination N_PORTs. [The response to this error condition will be handled in a future release of this specification]

The following shows the format of the CBIND response.
Byte     MSb                             LSb
Offset    7   6   5   4   3   2   1   0
         +----------------------------------+
  0      |     LS_COMMAND (0xE0000000)      | 4 Bytes
         +----------------------------------+
  4      |            USER INFO             | 4 Bytes
         +----------------------------------+
  8      |      DESTINATION PORT NAME       | 8 Bytes
         +----------------------------------+
 16      |             RESERVED             | 2 Bytes
         +----------------------------------+
 18      |           CBIND STATUS           | 2 Bytes
         +----------------------------------+
 20      |             RESERVED             | 2 Bytes
         +----------------------------------+
 22      |        CONNECTION HANDLE         | 4 Bytes
         +----------------------------------+
                  Total Length = 26

USER INFO - Contains the same value received in the USER INFO field of the CBIND request message.

DESTINATION PORT NAME - Contains the destination N_PORT's World Wide Port Name (WWPN).

CBIND STATUS - Indicates success or failure of the CBIND request. CBIND values are shown below.

Value    Description
-----    -----------
0        Successful - No other status
1 - 15   Reserved
16       Failed - Unspecified Reason
17       Failed - No such device
18       Failed - N_PORT session already exists
19       Failed - Lack of resources
Others   Reserved

CONNECTION HANDLE (CHANDLE) - Contains a value assigned by the FCP Portal to identify the control connection.

8.3 Unbind Connection (UNBIND)

UNBIND is used to release a bound TCP connection and return it to the pool of unbound TCP connections. This message is transmitted in the connection that is to be unbound. The following is the format of the UNBIND request message.
Byte     MSb                             LSb
Offset    7   6   5   4   3   2   1   0
         +----------------------------------+
  0      |     LS_COMMAND (0xE4000000)      | 4 Bytes
         +----------------------------------+
  4      |            USER INFO             | 4 Bytes
         +----------------------------------+
  8      |        CONNECTION HANDLE         | 4 Bytes
         +----------------------------------+
 12      |             RESERVED             | 8 Bytes
         +----------------------------------+
                  Total Length = 20

CONNECTION HANDLE (CHANDLE) - Contains a value assigned by the FCP Portal to identify the connection.

The following shows the format of the UNBIND response message.

Byte     MSb                             LSb
Offset    7   6   5   4   3   2   1   0
         +----------------------------------+
  0      |     LS_COMMAND (0xE4000000)      | 4 Bytes
         +----------------------------------+
  4      |            USER INFO             | 4 Bytes
         +----------------------------------+
  8      |        CONNECTION HANDLE         | 4 Bytes
         +----------------------------------+
 12      |             RESERVED             | 10 Bytes
         +----------------------------------+
 22      |          UNBIND STATUS           | 2 Bytes
         +----------------------------------+
 24      |             RESERVED             | 2 Bytes
         +----------------------------------+
                  Total Length = 26

UNBIND STATUS - Indicates the success or failure of the UNBIND request.

Value    Description
-----    -----------
0        Successful - No other status
1 - 15   Reserved
16       Failed - Unspecified Reason
17       Failed - No such device
18       Failed - Connection ID Invalid
Others   Reserved

CONNECTION HANDLE (CHANDLE) - Contains a value assigned by the FCP Portal to identify the unbound connection.

8.4 TCP Message (TCPMSG)

TCPMSG sends an error message to another iFCP port. TCPMSG differs from other messages in that there is no reply to TCPMSG (it is both the first and last sequence in an exchange). The primary purpose for TCPMSG is to generate a message informing an iFCP port that a fatal FCP/TCP protocol error was detected, and all connections established with the iFCP port are being closed. TCPMSG can also be used to send "Informative" or "Warning" messages that may be used for debugging or diagnostic purposes.
The format of the TCPMSG request message follows.

Byte     MSb                             LSb
Offset    7   6   5   4   3   2   1   0
         +----------------------------------+
  0      |     LS_COMMAND (0xEE000000)      | 4 Bytes
         +----------------------------------+
  4      |             RESERVED             | 4 Bytes
         +----------------------------------+
  8      |            ERROR CODE            | 2 Bytes
         +----------------------------------+
 10      |           TCPMSG FLAGS           | 1 Byte
         +----------------------------------+
 11      |          MSG LENGTH (L)          | 1 Byte
         +----------------------------------+
 12      |               MSG                | L Bytes
         +----------------------------------+
                  Total Length = L + 12

ERROR CODE - Specifies one of the predefined error messages shown in the following table. This field is valid only if the FATAL bit in the TCPMSG FLAGS field is 1 and MSG LENGTH is 0.

Value    Description
-----    -----------
0x0001   Loss of Synchronization on Connection
Others   Reserved

TCPMSG FLAGS - This field contains 3 bit flags that specify how the recipient should interpret the received message.

Bit Field  Flag         Description
---------  ----         -----------
7:3        RESERVED
2          INFORMATIVE  Indicates the message is informative, usually
                        for debugging purposes. These messages may be
                        discarded.
1          WARNING      Indicates the message is a warning. Processing
                        of warning messages is required and
                        implementation-specific.
0          FATAL        Indicates that a fatal protocol error has been
                        detected. Sender is terminating the login
                        sessions with the recipient and closing all
                        TCP connections. The recipient shall
                        implicitly logout the sender of the message
                        and close TCP connections to the sender.

A WARNING or INFORMATIVE message shall not cause the recipient to alter the operating environment. When more than one TCPMSG FLAG bit is set, the message shall be considered Fatal. When no flags are set, the message shall be discarded.

MSG LENGTH - Specifies the length in bytes of the MSG field. The length must be a multiple of 4 and can be a value between 0 and 128. A value of 0 indicates the MSG field is not present.

9.
Error Detection and Recovery Procedures for iFCP

9.1 Overview

[FCP-2], [FC-PH], and [FC-PH-2] define error detection and recovery procedures. These Fibre Channel-defined mechanisms continue to be available in the iFCP environment.

9.2 Timer Definitions

9.2.1 Error_Detect_Timeout (E_D_TOV)

E_D_TOV is "a reasonable timeout value for detection of a response to a time event". The default value of 10 seconds specified by FC-PH is also used as the iFCP default value.

E_D_TOV is the maximum time allowed between the transmission of consecutive data frames within a sequence. For Class 2 service, E_D_TOV specifies the maximum time interval between transmission of a frame and receipt of the ACK for that frame.

[The policy for setting E_D_TOV for an IP fabric is TBS]

9.2.2 Resource Allocation Timeout (R_A_TOV)

R_A_TOV is defined in FC-PH-2 as "the maximum transit time within a fabric to guarantee that a lost frame will never emerge from the fabric". A value of 2 x R_A_TOV is the minimum time that the originator of an ELS request or FC-4 ELS request shall wait for the response to that request.

[The policy for setting R_A_TOV for an IP fabric is TBS]

9.2.3 Resource Recovery Timer (RR_TOV)

[The content of this section is TBD]

9.3 TCP Error Recovery Issues

A failed TCP connection will result in a dropped N_PORT session.

[The remainder of this section is TBD]

9.4 iFCP Protocol Error

iFCP protocol errors between FCP Portals shall be considered fatal errors resulting in the termination of the login sessions and closing of the TCP sessions. An iFCP protocol error occurs when Fibre Channel frames are sent on the wrong TCP connection. One example of a protocol error is receiving an FCP_CMND IU on the data connection.

If an iFCP port detects an iFCP/TCP protocol error on a connection, the port shall transmit a TCPMSG message on the control connection (if one exists) with the appropriate error code.
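As an illustration of the step just described, the following minimal sketch builds a TCPMSG payload using the field layout of section 8.4 and classifies received TCPMSG FLAGS according to the rules stated there. It is not a normative encoding. The function and constant names are hypothetical, big-endian (network) byte order is assumed, reserved flag bits are assumed to be ignored on receipt, and the LS_COMMAND word is supplied by the caller because section 8 shows 0xE8 in the command table and 0xEE000000 in the TCPMSG format diagram.

```python
import struct

# Flag bit positions from the TCPMSG FLAGS table (bit 0 is the LSb).
FATAL = 0x01
WARNING = 0x02
INFORMATIVE = 0x04

# The only predefined error code listed in section 8.4.
ERR_LOSS_OF_SYNC = 0x0001


def build_tcpmsg(ls_command, error_code, flags, msg=b""):
    """Return the TCPMSG payload: a 12-byte header followed by MSG.

    Assumption: all multi-byte fields are big-endian.
    """
    if len(msg) % 4 != 0 or len(msg) > 128:
        raise ValueError("MSG LENGTH must be a multiple of 4 in 0..128")
    # LS_COMMAND (4), RESERVED (4), ERROR CODE (2), FLAGS (1), MSG LENGTH (1)
    header = struct.pack(">IIHBB", ls_command, 0, error_code, flags, len(msg))
    return header + msg


def classify_flags(flags):
    """Apply the interpretation rules for the three TCPMSG flag bits."""
    bits = flags & 0x07              # assumption: reserved bits 7:3 are ignored
    if bits == 0:
        return "discard"             # no flags set
    if bits & FATAL or bin(bits).count("1") > 1:
        return "fatal"               # FATAL set, or more than one flag set
    return "warning" if bits & WARNING else "informative"


# Example: report loss of synchronization as a fatal error with no MSG text.
payload = build_tcpmsg(0xEE000000, ERR_LOSS_OF_SYNC, FATAL)
```

Keeping MSG empty in the fatal case matches the condition under which the ERROR CODE field is valid (FATAL set and MSG LENGTH equal to 0).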
The FCP_Portal shall then implicitly log out and close all TCP connections established with the iFCP port, and ignore all data received on these TCP connections until they are reopened.

[The information returned to the N_PORT upon occurrence of an iFCP protocol error will be specified in the next revision of this specification]

10. Fabric Services Supported by an iFCP implementation

An iFCP gateway implementation MUST support the following fabric services:

N_PORT ID Value  Description            Section
---------------  -----------            -------
0xFF FF FE       F_PORT Server          /TBS/
0xFF FF FD       Fabric Controller      /TBS/
0xFF FF FC       Directory/Name Server  /TBS/

11. Security

11.1 Overview

As with any other IP-based network, an iFCP storage network has security issues which must be addressed with the appropriate security policies and enforcement resources. There are various levels of security paradigms which, when applied appropriately to an iFCP network, can provide sufficient levels of security, including data integrity, authentication, and privacy, depending on user needs.

11.2 Physical Security

Most existing SCSI and Fibre Channel interconnections are deployed in private, physically isolated environments where hostile entities are not provided access to the SCSI and Fibre Channel interconnects. This is the most basic security mechanism, and may be a sufficient model in some cases for an iFCP network.

11.3 Controlling Access

A second level of security is the use of zoning. Zoning specifies which devices are allowed to communicate, and is similar in concept to VLAN (Virtual Local Area Network) technology. Zoning information is maintained in a Name Server.

11.4 Authentication and Encryption

Where additional levels of data integrity and privacy are required for iFCP, existing IPSec specifications can be applied to iFCP.
Because IPSec is a layer-3 technology and has no knowledge of TCP, UDP, or higher-level protocols such as iFCP and FCP, it can be applied transparently to iFCP. The following IETF documents describe the operational framework and automatic keying mechanisms for IPSec.

RFC2401  Security Architecture for the Internet Protocol
RFC2402  IP Authentication Header
RFC2406  IP Encapsulating Security Payload
RFC2407  The Internet IP Security Domain of Interpretation for ISAKMP
RFC2408  Internet Security Association and Key Management Protocol (ISAKMP)
RFC2409  The Internet Key Exchange (IKE)

11.5 Storage Firewalls

Firewalls are a common and proven methodology for securing access to IP-based networks, and they can be appropriate for use in IP-based storage networks as well. A firewall is a choke point through which all traffic must pass in order to move between two separate networks. Since all iFCP traffic uses a well-known IANA-assigned TCP port number, it can easily be recognized and inspected.

Access to storage resources can be secured by setting up a single gateway through which all outside non-secured traffic must pass in order to access resources in the storage network. Such a firewall can be a proxy host operating at the session or application layer, requiring authentication before allowing traffic to pass. It can also be a stateful inspection gateway which understands the iFCP protocol, and can passively inspect and discover security threats as they transit the gateway. A third option is to use a standard router access control list to filter authorized traffic based upon static parameters such as IP addresses and TCP port numbers.

12. Quality of Service Considerations

12.1 Minimal requirements

Conforming iFCP protocol implementations SHALL correctly communicate gateway-to-gateway even across one or more intervening best-effort IP regions.
The timings with which such gateway-to-gateway communication is performed, however, will greatly depend upon BER, packet losses, latency, and jitter experienced throughout the best-effort IP regions. The higher these parameters, the higher will be the gap measured between iFCP observed behaviors and baseline iFCP behaviors (i.e., as produced by two iFCP gateways directly attached to one another).

12.2 High-assurance

It is expected that many iFCP deployments will benefit from a high degree of assurance on the behaviors of the intervening IP regions, with resulting high-assurance on the overall end-to-end path, as directly experienced by Fibre Channel applications. Such assurance on the IP behaviors stems from the intervening IP regions supporting standard Quality-of-Service (QoS) techniques, fully complementary to iFCP, such as:

a) Congestion avoidance by over-provisioning of the network
b) Integrated Services [IntServ] QoS
c) Differentiated Services [DiffServ] QoS
d) Multi-Protocol Label Switching [MPLS]

In the most general definition, two iFCP gateways are separated by one or more independently managed IP regions, some of which implement some of the QoS solutions mentioned above. The IP regions with these QoS solutions are said to support Service Level Agreements (SLAs). Such agreements finalize requirements on network parameters such as bandwidth, loss, latency, jitter, and burst length. The requirements may be expressed in absolute or relative terms, and apply to a unidirectional flow of packets. Depending on the QoS techniques available, the dynamic stipulation of a SLA may require the iFCP gateway to interact with network ancillary functions such as admission control and bandwidth brokers (with RSVP or other signalling protocols that an IP region may accept).
Because Fibre Channel Class 2 and Class 3 do not support fractional bandwidth guarantees, and because iFCP is committed to supporting current Fibre Channel semantics, it is impossible for an iFCP gateway to autonomously infer bandwidth requirements from streaming Fibre Channel traffic. Rather, the requirements on bandwidth or other network parameters need to be injected out-of-band into an iFCP gateway (or the node that will actually negotiate the SLA on the gateway's behalf) through mechanisms outside the scope of this specification (e.g., through a management interface into the iFCP gateway).

The administrator of an iFCP gateway MAY thus stipulate a Service Level Agreement with the local IP region for one, several, or all of an iFCP gateway's TCP sessions used by iFCP. Alternately, this responsibility may be delegated to a node downstream.

Should an iFCP implementation support multiple tuples over the same TCP connection, and should such a connection be subject to a SLA, then all these tuples will share in the same SLA and the resulting treatment by the network. For finer granularity of QoS behaviors, iFCP implementations MAY elect to dedicate a distinct TCP connection to each active tuple. This is the way an individual tuple can enjoy a customized SLA.

To render the best emulation of Fibre Channel possible over IP, it is anticipated that typical SLAs will specify a fixed amount of bandwidth, null losses, and, to a lesser degree of relevance, low latency and low jitter. For example, an IP region using DiffServ QoS may support SLAs of this nature by applying EF DSCPs to the iFCP traffic. For the same SLA, another IP region might as well use a different DSCP or different QoS techniques altogether. The way different QoS techniques are re-mapped at the edge of different intervening IP regions is beyond the scope of this specification.
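For the DiffServ example above, the following sketch shows one way a sender could request EF treatment on a TCP connection that carries iFCP traffic. It is an assumption of this sketch that marking is done at the gateway's own socket; an IP region may equally mark or re-mark packets at its edge. The helper names are hypothetical, the EF codepoint is the standard value 101110 (binary), and the IP_TOS socket option applies to IPv4 and its effect is platform dependent.

```python
import socket

EF_DSCP = 0b101110  # Expedited Forwarding codepoint (decimal 46)


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP in the upper six bits of the IPv4 TOS octet."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value")
    return dscp << 2


def mark_socket(sock, dscp=EF_DSCP):
    """Ask the local IP stack to mark packets sent on this socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
```

Because the marking is applied per socket, a gateway that dedicates one TCP connection to each tuple could give individual tuples different codepoints, in line with the per-tuple SLA granularity discussed above.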
[T11/00-603V0] describes a proposal to add fractional bandwidth guarantees to Class 2 and 3 (migrating it from Class 4). In such a proposal, the bandwidth parameters would surface in the FLOGI request and accept, and the PLOGI request and accept. In this case, it will become possible for an iFCP gateway to trap this information and autonomously remap it onto the SLA negotiation mechanism required by the local IP region, without resorting to out-of-band QoS management. Such an in-band QoS mechanism would result in true end-to-end provisioning of network resources. Forthcoming revisions of this iFCP specification will build upon this new opportunity.

13. References

13.1 Relevant SCSI (T10) Specifications

The following documents are available from: Global Engineering, 15 Inverness Way East, Englewood, CO 80112-5704. Telephone (800) 854-7179 or (303) 792-2181, Fax: (303) 792-2192

[SAM]    SCSI-3 Architecture Model (SAM), ANSI X3.270-1996
[SAM-2]  SCSI Architecture Model-2 (SAM-2), Project 1157-D, revision 11
[SPC]    SCSI Primary Commands (SPC), ANSI X3.301-1997
[SPC-2]  SCSI Primary Commands-2 (SPC-2), Project 1236-D, revision 16
[FCP]    Fibre Channel Protocol for SCSI (FCP), ANSI X3.269-1996
[FCP-2]  Fibre Channel Protocol for SCSI, Second Revision (FCP-2), Project 1144D, revision 04

13.2 Relevant Fibre Channel (T11) Specifications

The following documents are available from: Global Engineering, 15 Inverness Way East, Englewood, CO 80112-5704.
Telephone (800) 854-7179 or (303) 792-2181, Fax: (303) 792-2192

[FC-PH]    Fibre Channel Physical and Signaling Interface (FC-PH) Rev 4.3, ANSI X3.230:1994
[FC-PH-2]  Fibre Channel Physical and Signaling Interface (FC-PH-2) Rev 7.4, ANSI X3.297:1997
[FC-PH-3]  Fibre Channel Physical and Signaling Interface (FC-PH-3) Rev 9.4, ANSI X3.303:1998
[FC-FG]    Fibre Channel Generic Requirements (FC-FG) Rev 3.5, ANSI X3.289:1996
[FC-GS-2]  Fibre Channel Generic Services (FC-GS-2) Rev 5.2, ANSI NCITS 288
[FC-AL]    Fibre Channel Arbitrated Loop (FC-AL) Rev 4.5, ANSI X3.272:1996
[FC-AL-2]  Fibre Channel Arbitrated Loop (FC-AL-2) Rev 7.0, NCITS 32:1999
[FC-PLDA]  Fibre Channel Private Loop SCSI Direct Attachment (FC-PLDA), NCITS TR-19:1998
[FC-FLA]   Fibre Channel Fabric Loop Attachment (FC-FLA), NCITS TR-20:1998
[FC-TAPE]  Fibre Channel Tape and Tape Medium Changers (FC-TAPE), NCITS TR-24:1999

13.3 Relevant RFC Documents

[RFC768]   User Datagram Protocol
[RFC791]   Internet Protocol, DARPA Internet Program Protocol Specification
[RFC1146]  TCP Alternate Checksum Options
[RFC2401]  Security Architecture for Internet Protocol
[RFC2402]  IP Authentication Header
[RFC2406]  Encapsulating Security Protocol (ESP)
[RFC2407]  The Internet IP Security Domain for ISAKMP
[RFC2408]  Internet Security Association and Key Management Protocol (ISAKMP)
[RFC2409]  The Internet Key Exchange (IKE)
[RFC2460]  Internet Protocol, Version 6 (IPv6) Specification

13.4 Other Reference Documents

Fibre Channel, Gigabit Communications and I/O for Computer Networks, Alan F. Benner, McGraw-Hill, ISBN 0-07-005669-2

The Fibre Channel Consultant, A Comprehensive Introduction, Robert W. Kembel, Northwest Learning Associates, ISBN 0-931836-82-6

The Fibre Channel Consultant, Arbitrated Loop, Robert W. Kembel, Connectivity Solutions, a division of Northwest Learning Associates, ISBN 0-931836-84-0

14.
Authors' Addresses

Charles Monia
Rod Mullendore
Josh Tseng
Nishan Systems
3850 North First Street
San Jose, CA 95134
Phone: 408-519-3986
Email: cmonia@nishansystems.com

Franco Travostino
Victor Firoiu
Nortel Networks
Director, Content Internetworking Lab
3 Federal Street
Billerica, MA 01821
Phone: 978-288-7708
Email: travos@nortelnetworks.com

David Robinson
Sun Microsystems
Senior Staff Engineer
M/S UNWK02-107
901 San Antonio Road
Palo Alto, CA 94303-4900
Phone: 510-574-9226
Email: david.robinson@ebay.sun.com

Wayland Jeong
Troika Networks
Vice President, Hardware Engineering
2829 Townsgate Road Suite 200
Westlake Village, CA 91361
Phone: 805-370-2614
Email: wayland@troikanetworks.com

Rory Bolt
Quantum/ATL
Director, System Design
101 Innovation Drive
Irvine, CA 92612
Phone: 949-856-7760
Email: rbolt@atlp.com

Paul Rutherford
ADIC
Vice President, Technology & Software
1143 Willows Road N.E.
P.O. Box 97057
Redmond, WA 98073-9757
Phone: 425-881-8004
Email: paul.rutherford@adic.com

Mark Edwards
Senior Systems Architect
Eurologic Development, Ltd.
4th Floor, Howard House
Queens Ave, UK. BS8 1SD
Phone: +44 (0)117 930 9600
Email: medwards@eurologic.com

Appendix A

A. iFCP Support for Fibre Channel Link Services

For reference purposes, this appendix enumerates all the Fibre Channel link services and the manner in which each shall be processed by an iFCP implementation. The iFCP processing policies are defined in section 7.

A.1 Basic Link Services

The basic link services are shown in the following table.
Basic Link Services

Name    Description        iFCP Policy
----    -----------        -----------
ABTS    Abort Sequence     Transparent
BA_ACC  Basic Accept       Transparent
BA_RJT  Basic Reject       Transparent
NOP     No Operation       Transparent
PRMT    Preempted          Rejected (Applies to Class 1 only)
RMC     Remove Connection  Rejected (Applies to Class 1 only)

A.2 Link Services Processed Transparently

The following link service requests and responses MUST be processed transparently as defined in section 7.

ELSs Processed Transparently

Name        Description
----        -----------
ACC         Accept
ADVC        Advise Credit
CSR         Clock Synchronization Request
CSU         Clock Synchronization Update
ECHO        Echo
ESTC        Estimate Credit
ESTS        Establish Streaming
FACT        Fabric Activate Alias_ID
FAN         Fabric Address Notification
FARP-REPLY  Fibre Channel Address Resolution Protocol Reply
FARP-REQ    Fibre Channel Address Resolution Protocol Request
FDACT       Fabric Deactivate Alias_ID
FDISC       Discover F_Port Service Parameters
FLOGI       F_Port Login
GAID        Get Alias_ID
LCLM        Login Control List Management
LINIT       Loop Initialize
LIRR        Link Incident Record Registration
LPC         Loop Port Control
LS_RJT      Link Service Reject
LSTS        Loop Status
NACT        N_Port Activate Alias_ID
NDACT       N_Port Deactivate Alias_ID
PDISC       Discover N_Port Service Parameters
PRLI        Process Login
PRLO        Process Logout
QoSR        Quality of Service Request
RCS         Read Connection Status
RLIR        Registered Link Incident Report
RNC         Report Node Capability
RNFT        Report Node FC-4 Types
RNID        Request Node Identification Data
RPL         Read Port List
RPS         Read Port Status Block
RPSC        Report Port Speed Capabilities
RSCN        Registered State Change Notification
RTIN        Request Topology Information
RTV         Read Timeout Value
RVCS        Read Virtual Circuit Status
SBRP        Set Bit-error Reporting Parameters
SCL         Scan Remote Loop
SCN         State Change Notification
SCR         State Change Registration
TEST        Test
TPLS        Test Process Login State

A.3 Augmented Link Services

The following extended link services are augmented with additional data and processed by the
iFCP implementation as described in the referenced section listed in the table.

Augmented Link Services

Name        Description                                        Section
----        -----------                                        -------
ABTX        Abort Exchange                                     7.3.1
ADISC       Discover Address                                   7.3.2
FARP-REPLY  Fibre Channel Address Resolution Protocol Reply    7.3.3
FARP-REQ    Fibre Channel Address Resolution Protocol Request  7.3.4
LOGO        N_PORT Logout                                      7.3.5
PLOGI       Port Login                                         7.3.6
REC         Read Exchange Concise                              7.3.7
RES         Read Exchange Status Block                         7.3.9
RLS         Read Link Error Status Block                       7.3.11
RRQ         Reinstate Recovery Qualifier                       7.3.13
RSI         Request Sequence Initiative                        7.3.14
RSS         Read Sequence Status Block                         7.3.12
TPRLO       Third Party Process Logout                         7.3.15

Appendix B

B. Performance of The Multi-Connection iFCP Session Model

This appendix provides a quantitative analysis of the claim that N TCP connections carrying the traffic of all the sessions active between gateways provide significantly higher aggregate average throughput than a single TCP connection carrying the same sessions. The analysis shows that the difference is proportional to the square of the number of TCP sessions, N.

This analysis is based on three fundamental assumptions: (i) all the available bandwidth in a link is available to iFCP traffic, (ii) the sender always has data ready to send (as is most likely the case with a backup application), and (iii) the maximum window size at the two TCP ends (i.e., the iFCP gateways) is set to the link nominal capacity multiplied by the round-trip time (so as to have the highest chances of saturating the link yet without unduly raising buffering requirements at the end nodes). The N^2 factor that emerges from this analysis is essentially due to the way TCP congestion control reacts to packet losses.

B.1 Relationship of Throughput to Packet Losses

There are several reasons for packet losses: network congestion, link errors and network errors.
Network congestion is pervasive in current IP networks, where the only way to control congestion is through dropping packets. Techniques for loss prevention, such as traffic engineering, admission control and bandwidth reservation, are not widely deployed and hence are not a factor in the behavior of existing networks.

Even in a perfectly engineered network, link errors occur. Assuming a link error rate equal to that specified for Fibre Channel (10^-12) and a 10Gb/s link, there is one error every 100 seconds.

Network errors also occur with significant frequency in IP networks. Jonathan Stone and Craig Partridge recently reported in Sigcomm 2000 that between one packet in 1100 and one packet in 32000 has errors that get past the link CRC and are detected by the TCP/IP checksum.

TCP throughput is impacted by each packet loss. Following TCP's congestion control algorithm (supported by the Tahoe, Reno, New-Reno, and SACK implementations), each packet loss results in the TCP sender's congestion window being reduced to half of its current value, and therefore (assuming constant Round Trip Time), TCP's throughput is halved. After that, the window increases by roughly one packet every two Round Trip Times (assuming the widely-used Delayed-Acknowledgement algorithm). The temporary decrease in TCP's rate translates into a missed opportunity to transmit a given amount of data.

As we show in the following Background section, for N storage connections sharing an IP "pipe" of rate E, the amount of data missing the opportunity to be transmitted due to a packet loss is:

   D(N) = E^2/(N^2)*RTT^2/(256*M)

where RTT = Round Trip Time, M = packet size.

For example, for a set of N=100 connections totaling E=10Gb/s, RTT=10ms, M=1500B, the data not transmitted in time due to a packet loss is D(N)=2.6MB.
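The figures in this example can be checked numerically. The sketch below is illustrative only: the function names are not part of this specification, and it simply evaluates D(N) as given above together with the recovery interval I = RTT^2*C/(8*M) derived in section B.2, taking C = E/N as the rate of each connection.

```python
def missed_bytes(rate_bps, rtt_s, pkt_bytes, n_conn=1):
    """D(N) = E^2/(N^2) * RTT^2 / (256*M), in bytes."""
    per_conn_rate = rate_bps / n_conn
    return per_conn_rate ** 2 * rtt_s ** 2 / (256 * pkt_bytes)


def recovery_time(rate_bps, rtt_s, pkt_bytes, n_conn=1):
    """I(N) = RTT^2 * (E/N) / (8*M), in seconds."""
    return rtt_s ** 2 * (rate_bps / n_conn) / (8 * pkt_bytes)


# Example values from the text: E = 10Gb/s, RTT = 10ms, M = 1500 bytes.
E, RTT, M = 10e9, 0.010, 1500
for n in (100, 1):
    d = missed_bytes(E, RTT, M, n)
    i = recovery_time(E, RTT, M, n)
    print(f"N={n}: D={d / 1e6:.1f} MB, I={i:.3f} s")
```

For N=100 this evaluates to roughly 2.6MB and 0.83 seconds. For a single connection, D grows by a factor of N^2 and I by a factor of N.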
For the same set transported over one TCP session, the data not sent in time is D(1)=26GB, a 10,000-fold increase. The time interval for TCP to recover its sending rate to its initial value after a packet loss is I(N)=0.833 seconds in the case of N TCP connections, and I(1)=83.3 seconds in the case of a single TCP connection.

Observe that in the latter case, the time to recover its rate, I(1)=83.3s, is of the same order of magnitude as the time between two packet losses due exclusively to a link Bit Error Rate of 10^-12. In other words, a packet loss occurs almost immediately after TCP has recovered its rate. This means that a single TCP connection delivers on average about 3/4 of the required 10Gb/s rate, since 1/4 of the rate is lost during the time the TCP rate is increasing linearly from 1/2 to full rate. (More precisely, the effective rate is 8.27Gb/s because 1/4 of the rate is lost during 83.3s, and the time between two errors is now 120.825s due to a decreased sending rate). By comparison, N TCP connections deliver approximately 9.99979Gb/s (i.e., lost 1/4 of one TCP full rate of 100Mb/s during 0.833s out of a 100s interval).

If the impact of TCP checksum errors is also considered, the TCP sending rate is limited to an average of (8M/RTT)*sqrt(3/(4p)), where p is the probability of packet loss (see [1] for details). For M=1500, RTT=10ms and p=1/32000, TCP throughput is about 186Mb/s. For p=1/1100, maximum TCP throughput is 34.4Mb/s. Therefore, to fill a 10Gb/s line, about 54 simultaneous TCP flows are required (in the case where p=1/32000) or 291 TCP flows (in the case where p=1/1100).

Practically, for these reasons the iFCP protocol supports combinations of M tuples using N TCP connections, with M, N >= 1, and with an individual tuple using at most one TCP connection (thus M >= N).

B.2 Background.
For a TCP session to sustain a rate of C bits/second, the TCP's maximum congestion window W (measured in number of packets) has to be at least

   W0 = RTT*C/(8*M)

where RTT = Round Trip Time in seconds, M = packet size in Bytes.

The following analysis assumes W=W0. Later, the problems with the alternative W>W0 are discussed.

The time needed by the TCP sender to recover from a single packet loss and have its sending rate reach the previous C value is

   I = 2*RTT*W/2 = RTT*W = RTT^2*C/(8*M).

The total amount of data (in Bytes) missing the opportunity to be transmitted in this time interval I is:

   D = C/8*I/4 = C^2*RTT^2/(256*M)

Consider a set of tuples sharing an IP "pipe" of rate E to be transported in N TCP sessions. Assuming all connections are processed equally, each TCP session sends at a rate of E/N. One packet loss impacts only one TCP session, and thus, the total amount of data missing the opportunity to be transmitted due to a packet loss is

   D(N) = E^2/(N^2)*RTT^2/(256*M).

On the other hand, if the same set of tuples sharing an IP "pipe" of rate E is transported in one TCP session only, the total amount of data losing the opportunity to be transmitted due to a packet loss is

   D(1) = E^2*RTT^2/(256*M) = D(N)*N^2.

The impact of packet losses on the single-TCP solution can be reduced by configuring the maximum congestion window to be larger than the bandwidth*delay product, W>W0. But in this case, only W0 packets can be in transit on the line, while the rest (up to the current window size) need to be stored in a queue at the line's ingress. In order to provide full line rate utilization assuming periodic losses, the maximum congestion window should be at least 2*W0, due to TCP's congestion

Full Copyright Statement

"Copyright (C) The Internet Society (date). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."

1 Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

2 Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.