NFSv4                                                     D. Noveck, Ed.
Internet-Draft                                                    NetApp
Intended status: Informational                                 P. Shivam
Expires: February 22, 2019                                           IBM
                                                                C. Lever
                                                                B. Baker
                                                                  ORACLE
                                                         August 21, 2018


 NFSv4 Migration and Trunking: Implementation and Specification Issues
                  draft-ietf-nfsv4-migration-issues-16

Abstract

   This document discusses a range of implementation and specification
   issues concerning features related to the use of location-related
   attributes in NFSv4.  These include migration, which transfers
   responsibility for a file system from one server to another, and
   trunking which deals with the discovery and control of the set of
   server endpoints to use to access a file system.  The focus of the
   discussion, which relates to multiple minor versions, is on defining
   the appropriate clarifications and corrections for existing
   specifications.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 22, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Language
     2.1.  Requirements Language
     2.2.  Use of Normative Terms
     2.3.  Terminology Used in this Document
   3.  Issues that Apply to Multiple Minor Versions and their
       Resolution
     3.1.  Issue Summary
     3.2.  Resolution of Multi-Version Issues
       3.2.1.  Providing Trunking Discovery
       3.2.2.  Addressing Changes in Trunking Configuration
       3.2.3.  Interaction of Trunking and Migration
       3.2.4.  Dealing with Multiple Connection Types
   4.  NFSv4.0 Issues
     4.1.  Core NFSv4.0 Migration Issues
     4.2.  Resolution of Core Migration Protocol Difficulties in
           NFSv4.0
     4.3.  Additional NFSv4.0 Issues
     4.4.  Resolution of Additional NFSv4.0 Issues
       4.4.1.  Resolution of NFSv4.0 Issues with Multiple Connection
               Types
   5.  Issues for NFSv4.1 and Beyond
     5.1.  Trunking-focused Issues to Address for NFSv4.1
       5.1.1.  Handling of Additional Information in
               fs_locations_info
       5.1.2.  Addressing Server_owner Changes in NFSv4.1
     5.2.  Migration-related Issues to Address for NFSv4.1
       5.2.1.  Addressing State Merger in NFSv4.1
       5.2.2.  Addressing pNFS Relationship with Migration
       5.2.3.  Addressing Confirmation Status of Migrated
               Client IDs in NFSv4.1
       5.2.4.  Addressing Session Migration in NFSv4.1
       5.2.5.  Dealing with Multiple Connection Types in NFSv4.1
     5.3.  Possible Resolutions for NFSv4.1 Protocol Issues
       5.3.1.  Client ID Confirmation Issues
       5.3.2.  Dealing with Multiple Location Entries
       5.3.3.  Migration and pNFS
     5.4.  Defining Server Responsibilities for NFSv4.1 Transitions
       5.4.1.  Server Responsibilities in Effecting Transparent
               State Migration
       5.4.2.  Synchronizing Session Transfer
       5.4.3.  Server Issues Dealing with RECLAIM_COMPLETE
     5.5.  Defining Client Responsibilities for NFSv4.1 Transitions
       5.5.1.  Client Recovery from Migration Events
       5.5.2.  The Migration Discovery Process
       5.5.3.  Overview of Client Response to NFS4ERR_MOVED
       5.5.4.  Obtaining Access to Sessions and State after
               Migration
       5.5.5.  Obtaining Access to Sessions and State after Network
               Address Transfer
     5.6.  Resolution of NFSv4.1 Issues
     5.7.  Potential Protocol Extensions
   6.  Evolution of Issue Handling
     6.1.  History of this Document
     6.2.  Further Work Needed
   7.  Security Considerations
   8.  IANA Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Acknowledgments
   Authors' Addresses

1.  Introduction

   This is an informational document that discusses a number of related
   issues in multiple minor versions of NFSv4 (Network File System
   Version 4).

   Many of these relate to the migration feature of NFSv4, which
   provides for moving responsibility for a single filesystem from one
   server to another, without disruption to clients.  A number of
   problems in the specification of this feature in NFSv4.0 were
   resolved by the publication of [RFC7931], which added trunking
   detection to NFSv4.0.  However, NFSv4.0 remains without an
   appropriate discussion of trunking discovery, which has many
   important connections with migration.  As a result, NFSv4.0 requires
   clarification of how the client is to respond to changes in the
   trunking arrangements to use, both when migration occurs and when it
   does not.

   In addition, there are specification issues to be resolved with
   regard to the NFSv4.1 version of these features, which are discussed
   in this document.

   All of the issues discussed relate to the handling and interpretation
   of the location-related attributes fs_locations and fs_locations_info
   and to the proper client and server handling of changes in the values
   of these attributes.

   These issues are all related to the protocol features for effecting
   file system migration, or to trunking discovery, but it is not
   possible to treat each of these features in isolation.
   These features are inherently linked because migration needs to deal
   with the possibility of multiple server addresses in location
   attributes and because location attributes, which provide
   trunking-related information, may change, which might or might not
   involve migration.

2.  Language

2.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.2.  Use of Normative Terms

   This document, which deals with existing issues/problems in
   standards-track documents, is in the informational category, and
   while the facts it reports may have normative implications, any such
   normative significance is left for the readers to determine.  For
   example, we may report that the existing definition of migration for
   NFSv4.1 does not properly describe how migrating state is to be
   merged with existing state for the destination server.  While it is
   to be expected that client and server implementers will judge this to
   be a situation that it would be appropriate to resolve, the judgment
   as to how pressing this issue should be considered is a judgment for
   the reader, and eventually the nfsv4 working group to make.

   We do explore possible ways in which such issues can be dealt with,
   with minimal negative effects, given that the working group has
   decided to address these issues, but the choice of exactly how to
   address these is best given effect in one or more standards-track
   documents and/or errata.

   In the context of this informational document, these normative
   keywords will generally occur in the context of a quotation, most
   often direct but sometimes indirect.  The context will make it clear
   whether the quotation is from:

   o  The base definition of the NFSv4.0 protocol [RFC7530].

   o  The document updating the handling of migration in NFSv4.0
      [RFC7931].

   o  The current definition of the NFSv4.1 protocol [RFC5661].

   An additional possibility is that these terms may appear in a
   proposed or possible text to serve as a replacement for some part of
   a current protocol specification.  Sometimes, a number of possible
   alternative texts may be listed and benefits and detriments of each
   examined in turn.

2.3.  Terminology Used in this Document

   In this document the phrase "client ID" always refers to the 64-bit
   shorthand identifier assigned by the server (a clientid4) and never
   to the structure which the client uses to identify itself to the
   server (called an nfs_client_id4 or client_owner in NFSv4.0 and
   NFSv4.1 respectively).  The opaque identifier within those structures
   is referred to as a "client id string".

   Regarding the discussion of potential network endpoints, we use the
   following terminology:

   o  The phrase "connection type" denotes the use of an existing or
      potential connection to support NFSv4 layered on top of the RPC
      stream transport as described in [RFC5531] or on top of RPC-over-
      RDMA as described in [RFC8166].  Establishing a connection of a
      particular type requires that the client and server support that
      connection type given the particular client and server network
      addresses used.

   o  Each connection is established between a client and a specific
      server endpoint.  Two endpoints are considered distinct if they
      differ in either network address or connection type.  Multiple
      connections may be established to the same endpoint or to
      different endpoints.

   o  The phrase "network endpoint specification" refers to the
      combination of a network address and a connection type.
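
   The endpoint terminology above can be illustrated with a minimal
   sketch.  The names ConnectionType, EndpointSpec, and distinct are
   hypothetical and chosen only for illustration; they do not appear in
   any NFSv4 specification.  The sketch only shows that two endpoints
   sharing a network address are still distinct when their connection
   types differ.

```python
from dataclasses import dataclass
from enum import Enum


class ConnectionType(Enum):
    # Hypothetical labels for the two connection types named in the text.
    RPC_STREAM = "rpc-stream"    # RPC stream transport [RFC5531]
    RPC_RDMA = "rpc-over-rdma"   # RPC-over-RDMA [RFC8166]


@dataclass(frozen=True)
class EndpointSpec:
    """A network endpoint specification: address plus connection type."""
    address: str
    conn_type: ConnectionType


def distinct(a: EndpointSpec, b: EndpointSpec) -> bool:
    # Endpoints are distinct if they differ in either network address
    # or connection type; dataclass equality compares both fields.
    return a != b


# Same address (documentation range), different connection types.
tcp = EndpointSpec("192.0.2.10", ConnectionType.RPC_STREAM)
rdma = EndpointSpec("192.0.2.10", ConnectionType.RPC_RDMA)
```

   Because the dataclass is frozen, endpoint specifications are
   hashable and can be collected in a set, so that multiple connections
   to the same endpoint map to a single key.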

   Regarding trunking of connections to server network endpoints, we use
   the following terminology:

   o  Trunking detection refers to ways of deciding whether two specific
      network endpoints are connected to the same NFSv4 server.  The
      means available to make this determination depend on the protocol
      version, and, in some cases, on the client implementation.

   o  Two network endpoints connected to the same server are said to be
      server-trunkable.

   o  Two network endpoints connected to the same server such that
      connections to those endpoints can be used to support a single
      common session are referred to as session-trunkable.  Note that
      NFSv4.1 allows two endpoints to be server-trunkable without being
      session-trunkable, while in NFSv4.0 no addresses are
      session-trunkable, since there are no sessions.

   o  Trunking discovery is a process by which a client using one
      network endpoint can obtain the network addresses for endpoints
      that are trunkable (either server-trunkable or session-trunkable)
      with it.

   Regarding terminology relating to attributes used in trunking
   discovery and other multi-server namespace features:

   o  Location attributes include the fs_locations and fs_locations_info
      attributes.

   o  Location entries are the individual file system locations in the
      location attributes.

   o  Location elements are derived from location entries.  If a
      location entry specifies an IP address, there is only a single
      corresponding location element.  Location entries that contain a
      host name are resolved using DNS, and may result in one or more
      location elements.  All location elements consist of a location
      address, which is the IP address of an interface to a server, and
      an fs name, which is the location of the file system within the
      server's pseudo-fs.  The fs name is empty if the server has no
      pseudo-fs and only a single exported file system at the root
      filehandle.

   o  Two location elements are said to be server-trunkable if they
      specify the same fs name and their location addresses are
      server-trunkable.

   o  Two location elements are said to be session-trunkable if they
      specify the same fs name and their location addresses are
      session-trunkable.

   Each set of server-trunkable location elements defines the available
   access paths to a particular file system.  When there are multiple
   such file systems, each of these, which contains the same data, is a
   replica of the others.  Logically, such replication is symmetric,
   since the fs currently in use and an alternate fs are replicas of
   each other.  Often, in other documents, the term "replica" is not
   applied to the fs currently in use, despite the fact that the
   replication relation is inherently symmetric.

3.  Issues that Apply to Multiple Minor Versions and their Resolution

   Although there is a common set of issues that need to be addressed,
   the differences between NFSv4.0 and NFSv4.1 mean that the detailed
   handling of these issues will be significantly different in each
   protocol.

   In order to accommodate this situation, this section will deal with
   the commonalities across protocol minor versions while the specifics
   appropriate to each minor version are dealt with in Sections 4 and 5
   respectively.

3.1.  Issue Summary

   Many of these issues arise from a lack of clarity regarding the
   meaning of and proper handling for location attributes that specify
   more than a single server address.  Such situations can arise as a
   result of multiple entries in the same attribute or because a single
   entry has a server name which, when processed by DNS, is mapped to
   multiple server addresses.
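
   The two sources of multiple server addresses described above can be
   sketched as follows.  This is a minimal illustration, not a
   definitive implementation: LocationEntry, LocationElement, and
   expand are hypothetical names, and the resolve callable is an
   assumed stand-in for a DNS lookup, injected so that the sketch does
   not depend on any network access.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Callable, List


@dataclass(frozen=True)
class LocationEntry:
    server: str    # an IP address literal or a host name
    fs_name: str   # location of the file system within the pseudo-fs


@dataclass(frozen=True)
class LocationElement:
    address: str
    fs_name: str


def is_ip_literal(text: str) -> bool:
    # ipaddress.ip_address raises ValueError for anything that is not
    # an IPv4 or IPv6 literal, e.g. a host name.
    try:
        ip_address(text)
        return True
    except ValueError:
        return False


def expand(entries: List[LocationEntry],
           resolve: Callable[[str], List[str]]) -> List[LocationElement]:
    """Derive location elements from location entries.

    An entry holding an IP address yields one element; an entry holding
    a host name yields one element per address returned by `resolve`
    (a hypothetical stand-in for DNS).  Duplicates are dropped while
    the original order is preserved.
    """
    elements: List[LocationElement] = []
    for entry in entries:
        if is_ip_literal(entry.server):
            addresses = [entry.server]
        else:
            addresses = resolve(entry.server)
        for address in addresses:
            element = LocationElement(address, entry.fs_name)
            if element not in elements:
                elements.append(element)
    return elements
```

   Passing the resolver in as a parameter also makes it easy to
   substitute a DNSSEC-validating lookup where one is available.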

   Another set of issues arises from the fact that many of the
   facilities that must deal with multiple network addresses assume
   there is only a single connection type shared by all of the
   addresses.  It is necessary to deal with a mixture of connection
   types.

   Both [RFC7530] and [RFC5661] indicate that multiple addresses may be
   present and that these addresses may be different paths to the same
   server as well as different copies of the same data.  However, the
   following issues have, for both protocols, interfered with the
   recognition of the existing location attributes as a way of providing
   a trunking discovery function:

   o  There is no discussion of the use of these attributes when a file
      system is first accessed, giving the impression that they are only
      to be used as a way of overcoming access difficulties.

   o  The treatment of migration (and in the case of NFSv4.1 of file
      system transitions in general) is written as if only a single
      server address will be accessed.

   o  Although location attributes can contain the addresses of
      migration targets and of additional replicas as well, the issues
      that arise when both of these are specified are not clearly
      discussed.

   In addition, there are factors regarding trunking that relate to
   specific protocol versions and documents:

   o  In NFSv4.0, as described solely by [RFC7530], trunking is treated
      as a problem to be avoided, making the whole matter moot.

   o  In NFSv4.0, as described by [RFC7530] together with [RFC7931], the
      situation is different.  There is a means of trunking detection
      suggested in [RFC7931] but it is a suggestion only valid when the
      client chooses to use the uniform client id string model.

   o  For NFSv4.1, as described by [RFC5661], there is a standard method
      of trunking detection, which can be relied upon.

   The issues that need to be addressed for both versions are:

   o  Provision of a trunking discovery facility to allow a client to
      find out about other addresses that may be used to access the
      current server.  Such a facility needs to allow the set of
      addresses used to change as necessary.

   o  Discussion of how the appropriate connection type for a given
      client-server connection is to be arrived at and of trunking
      issues between endpoints of multiple connection types using the
      same network address.

   o  Better integration of migration with trunking changes, including
      situations in which the set of endpoints usable to access the same
      server changes (without migration) and those in which there is a
      shift to a different server, but trunking of endpoints on either
      the source or destination is involved.

   Note that although these issues need to be addressed for both
   protocols, the resolutions need not be the same and the protocol
   facilities within each protocol may limit the completeness of the
   resolutions provided.

3.2.  Resolution of Multi-Version Issues

   Although the specifics of addressing these issues will be different
   for different versions, there are some common aspects discussed in
   the subsections below:

   o  The trunking discovery function is to be addressed in
      substantially the same way in both versions, as explained in
      Section 3.2.1.  The only version-related differences are the
      inclusion of the fs_locations_info attribute in NFSv4.1 and the
      potential addition of further per-endpoint information within
      extensions to be defined for use in later versions of NFSv4.

   o  Handling of changes to the set of addresses to be used also needs
      to be addressed in substantially the same way in both versions.
      See Section 3.2.2 for further discussion.

   o  The interaction of trunking and migration is discussed in general
      terms in Section 3.2.3.
      However, the specifics of the NFSv4.1
      client's response to NFS4ERR_MOVED are discussed in Sections
      5.5.3, 5.5.4, and 5.5.5.

3.2.1.  Providing Trunking Discovery

   A client can discover a set of network addresses to use to access a
   file system using an NFSv4 server in a number of ways:

   o  If the client is accessing a server using its name, that name can
      be mapped to a set of IP addresses using DNS and if multiple
      addresses are available, those addresses can generally be used
      together to access the server.

   o  A client connected to a server without knowledge of its name can
      obtain the value of a location attribute (e.g. fs_locations).
      Where an entry within that attribute specifies a server name, DNS
      can be used to obtain one or more network addresses corresponding
      to that name.  In cases in which one of those is the address being
      used, the others corresponding to that name can also be used to
      access the server.

   o  A client can obtain the value of a location attribute (e.g.
      fs_locations) and use location entries that specify network
      addresses.  When there is a means of trunking detection available
      (see below), all of the addresses that are determined to
      correspond to the same server can be used to access that server.

   Note that the last two of these are usable in situations in which
   NFS4ERR_MOVED was returned.  Such a return does not necessarily mean
   that migration has occurred, since there may be a shift in the set of
   network addresses to be used without changing to a different server.
   See Section 3.2.3 for further discussion.

   Which of the above means of providing trunking information is
   appropriate to use in a given environment will depend on security
   considerations, the possible need for the server to direct different
   clients to different sets of addresses, and the availability of
   trunking detection facilities on the clients.
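
   The third discovery method above, in which addresses taken from a
   location attribute are filtered by trunking detection, can be
   sketched as follows.  This is an illustration under stated
   assumptions: usable_addresses is a hypothetical name, and the
   same_server callable is an assumed stand-in for whatever trunking
   detection the protocol version and client implementation provide.

```python
from typing import Callable, Iterable, List


def usable_addresses(current: str,
                     candidates: Iterable[str],
                     same_server: Callable[[str, str], bool]) -> List[str]:
    """Return the addresses usable to access the current server.

    `current` is the address already in use.  `candidates` are network
    addresses taken from location entries.  `same_server` is a
    hypothetical trunking-detection predicate; an address is kept only
    if it is determined to correspond to the same server as `current`.
    """
    usable = [current]
    for address in candidates:
        if address not in usable and same_server(current, address):
            usable.append(address)
    return usable
```

   A client without any trunking detection facility would have no
   suitable predicate to supply, which is why directly presented
   network addresses are less useful to such clients.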

   With regard to security, the possibility that requests to determine
   the set of network addresses corresponding to a given server might be
   interfered with or have their responses corrupted needs to be taken
   into account.  As a result, when use of DNSSEC is not available, it
   might be advisable not to present server names in location attributes
   and instead to present the network addresses directly, eliminating
   the need to use DNS to effect this translation.  Fetching of location
   attributes should be done with integrity protection.

   In many cases, the server will provide all the network addresses to
   be used to access a given server, allowing the client to select the
   address or set of addresses most suited to its purposes.  However, in
   some situations, the server will want to direct clients to use
   specific sets of network addresses to effect load balancing, to meet
   quality-of-service goals, or to optimize use of clustered servers by
   directing traffic to the cluster element most able to handle it
   efficiently.  In such environments, presentation of network addresses
   directly in the location attribute can help give the server the
   necessary control over the paths to be used when accessing particular
   file systems.  When such techniques are used, servers typically
   present their own network addresses in the location attribute while
   adding the names of other servers, such as those used to access
   replicas.

   Trunking detection allows the client to determine whether two network
   addresses can be used to access the same server.  The availability of
   trunking detection depends on the protocol version, and, in some
   cases, on client implementation choices:

   o  For NFSv4.0, a means by which it can be determined if two network
      addresses correspond to the same server is suggested in [RFC7931].
      However, it is optional and only available to clients using the
      uniform client id string approach.

   o  For NFSv4.1, the client can compare the server_owner returned in
      the response to EXCHANGE_ID to determine if two network addresses
      correspond to the same server.

   As a result, direct presentation of network addresses in location
   entries may be problematic for NFSv4.0, since some clients might not
   have the trunking detection facilities that allow them to take
   advantage of this information.  For further discussion of issues
   related to NFSv4.0, see Section 4.4.

3.2.2.  Addressing Changes in Trunking Configuration

   When the client is capable of finding out a set of network addresses
   to use in accessing a server, it is always possible for that set to
   change.

   Such a change sometimes means that a network address previously used
   to access a server becomes invalid for that purpose.  This requires a
   way of notifying the client and a way for the client to adapt to this
   change by using a new set of network addresses to access the server.
   This will involve recovery much like that for migration although the
   same server and file system are used throughout.

3.2.3.  Interaction of Trunking and Migration

   When the set of network addresses designated by a location attribute
   changes, NFS4ERR_MOVED may or may not result, and in some of the
   cases in which it is returned migration will occur, while in others
   there is a shift in the network addresses used to access a particular
   file system with no migration.

   o  When the list of network addresses is a superset of that
      previously in effect, there is no need for migration or any other
      sort of client adjustment.  Nevertheless, the client is free to
      use an additional address if it provides another path to the same
      server.  If, on the other hand, it does not do so, the client may
      treat it as it does a separate replica, to be used if the current
      server addresses become unavailable.

   o  When the list of network addresses is a subset of that previously
      in effect, immediate action is not needed if the removed address
      was not being used.  The client should avoid using it in the
      future, whether the address is for another replica or a potential
      additional path to the server being used.

   o  When an address being removed is one of a number of paths to the
      current server, the client can cease to use it immediately, or it
      can continue to use it until NFS4ERR_MOVED is received.  This is
      not considered a migration event, unless it is the last available
      path to the server that has become unusable.

   When migration does occur, multiple addresses may be in use on the
   server prior to migration and multiple addresses may be available
   for use on the destination server.

   With regard to the server in use, it may be that return of
   NFS4ERR_MOVED indicates that a particular network address is no
   longer to be used, without implying that migration of the file system
   to a different server is needed.  In light of this possibility,
   clients are best off not concluding that migration has occurred until
   concluding that all the network addresses known to be associated with
   the server are not usable.

   It should be noted that the need to defer this determination is not
   absolute.  If a client is not aware of all network addresses for any
   reason, it may conclude that migration has occurred when it has not
   and treat a switch to a different server address as if it were a
   migration event.  This is generally harmless since the use of the
   same server via a new address will appear as a successful Transparent
   State Migration.

   While significant harm will not arise from this misapprehension, it
   can give rise to disconcerting situations.
   For example, if a lock
   has been revoked during the address shift, it will appear to the
   client as if the lock has been lost during migration, normally
   calling for it to be recoverable via an fs-specific grace period
   associated with the migration event.

   With regard to the destination server, it is desirable for the client
   to be aware of all the valid network addresses that can be used to
   access the destination server.  However, there is no need for this to
   be done immediately.  Implementations can process the additional
   location elements in parallel with normal use of the first valid
   location entry found to access the destination.

   Because a location attribute may include entries relating to the
   current server, the migration destination and possible replicas to
   use, scanning for available network addresses could potentially be a
   long process.  The following list of helpful practices, here
   presented as suggestions, could become RECOMMENDATIONs or
   REQUIREMENTs in future standards-track documents.

   o  Servers are well advised to place location entries that represent
      addresses usable with the current server or a migration target
      before those associated with replicas.

   o  A client can cease scanning for trunkable location entries once it
      encounters one whose fs name differs from the current fs name.

   o  A client can cease scanning for trunkable location entries once it
      encounters a location element whose address is not server-
      trunkable with the one it is using.

3.2.4.  Dealing with Multiple Connection Types

   Because of the use of RPC-over-RDMA [RFC8166] as an underlying
   transport for NFSv4, as described in [RFC8267], a client may have
   multiple connection types to the same server network address.  This
   gives rise to a number of issues with regard to NFSv4 multi-server
   namespace features.

   o  In the case of migration or referral, the client is directed to
      one or more server network addresses and faces the problem of
      selecting the appropriate connection type.

   o  When trunking multiple connections, the client might be directed
      to use the same server network address with a different set of
      potential connection types, leaving the client to choose the
      connection type to be used when a set with multiple connection
      types is provided.

   Although the situation is similar for both protocol versions,
   differences in the attributes supported may result in important
   differences in how connection types are selected.

   o  In the case of NFSv4.0, the fs_locations attribute has no ability
      to indicate valid connection types.  Only the network address is
      provided, either directly in the location entry or as a result of
      a server name being mapped to a set of network addresses.  As a
      result, the client may have to attempt connection with multiple
      connection types, making its own selection of the subset to be
      used when more than one connection type is available.

   o  NFSv4.1 has some facilities to aid in the selection of connection
      types.  The entries within the fs_locations_info attribute may
      indicate the availability of RDMA connection support using the
      FSLI4TF_RDMA flag.  In addition, for RDMA implementations that
      allow conversion (i.e. step-up) between non-RDMA and RDMA modes
      within the scope of a single connection, the
      CREATE_SESSION4_FLAG_CONN_RDMA flag may be used as part of
      detecting whether RDMA support is present.  When that flag is not
      present, step-up is not supported, but the client may use the
      FSLI4TF_RDMA flag to determine if RDMA support is available, and
      establish a new connection to use to obtain RDMA support.
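
   The version-dependent selection described above can be sketched as
   follows.  This is a minimal illustration under stated assumptions:
   candidate_types and select_type are hypothetical names, the
   rdma_advertised parameter models the FSLI4TF_RDMA indication as a
   plain boolean rather than a bit in the attribute encoding, the
   try_connect callable is an assumed stand-in for an actual connection
   attempt, and preferring RDMA over the stream transport is a policy
   choice of the sketch, not something the text requires.

```python
from typing import Callable, List, Optional

RDMA = "rdma"      # RPC-over-RDMA connection type
STREAM = "stream"  # RPC stream transport connection type


def candidate_types(minor_version: int,
                    rdma_advertised: Optional[bool],
                    client_supports_rdma: bool) -> List[str]:
    """List connection types worth attempting, most preferred first."""
    if not client_supports_rdma:
        return [STREAM]
    if minor_version == 0:
        # fs_locations cannot indicate valid connection types, so the
        # client has to attempt more than one.
        return [RDMA, STREAM]
    # NFSv4.1: rely on the indication carried in fs_locations_info,
    # modeled here as a boolean (None or False means not advertised).
    return [RDMA, STREAM] if rdma_advertised else [STREAM]


def select_type(candidates: List[str],
                try_connect: Callable[[str], bool]) -> Optional[str]:
    """Return the first connection type that connects, or None."""
    for conn_type in candidates:
        if try_connect(conn_type):
            return conn_type
    return None
```

   For example, an NFSv4.0 client that supports RDMA would probe both
   connection types, while an NFSv4.1 client could skip the RDMA
   attempt entirely when the location entry does not advertise it.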
In addition to the selection of an appropriate connection type to use when multiple connection types are available, the simultaneous availability of multiple connection types raises issues related to trunking, in the same way as the availability of multiple network addresses connected to the same server. These issues, including the relationship of such trunking to migration, could potentially be dealt with differently within NFSv4.0 and NFSv4.1, although similar treatment is desirable. The treatment of these issues is discussed in Sections 4.4.1 and 5.2.5, respectively.

Note that the handling of trunking for NFSv4.0 and for an NFSv4.1 metadata server differs from that for an NFSv4.1 data server. In the latter case, specification of trunking patterns, including the connection type of endpoints, is under the control of the metadata server and the client simply uses the information presented by the metadata server to guide selection of the endpoints to be accessed.

One potential difference between the versions that needs to be resolved concerns the issue of the trunking of multiple connections directed to endpoints that share a network address while differing as to connection type. While NFSv4.1 is specified in [RFC5661] as requiring that such connections be trunkable, neither [RFC7530] nor [RFC7931] contains a corresponding statement.

4. NFSv4.0 Issues

4.1. Core NFSv4.0 Migration Issues

Many of the problems seen with Transparent State Migration derived from the inability of NFSv4.0 servers to determine whether two client IDs, issued on different servers, corresponded to the same client. This difficulty derived in turn from the common practice, recommended by [RFC7530], in which each client presented different client identification strings to different servers, rather than presenting the same identification string to all servers.
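The difference between presenting a different identification string to each server and presenting the same string to all servers can be illustrated with a short Python sketch. The string layouts and function names here are hypothetical and chosen only for illustration; real clients construct these strings in implementation-specific ways.

```python
def per_server_client_string(client_name, server_addr):
    """Hypothetical layout: the string varies with the server address."""
    return f"{client_name}/{server_addr}"


def common_client_string(client_name, server_addr):
    """Hypothetical layout: the same string is sent to every server."""
    return client_name


def servers_can_match(string_a, string_b):
    """A server comparing two client strings can only test equality."""
    return string_a == string_b


# Strings one client would present to two different servers.
a = per_server_client_string("client.example.net", "192.0.2.1")
b = per_server_client_string("client.example.net", "192.0.2.2")
c = common_client_string("client.example.net", "192.0.2.1")
d = common_client_string("client.example.net", "192.0.2.2")
```

With per-server strings, `servers_can_match(a, b)` is false, so a destination server receiving migrated state has no way to tell that it belongs to a client it already knows. With a common string, `servers_can_match(c, d)` is true.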
This practice, later referred to as the "non-uniform" client id string approach, derived from concern that, since NFSv4.0 provided no means to determine whether two IP addresses correspond to the same server, a single client connected to both might be confused by the fact that state changes made via one IP address might unexpectedly affect the state maintained with respect to the second IP address, thought of as a separate server.

To avoid this unexpected behavior, clients used the non-uniform client id string approach. By doing so, a client connected to two different servers (or to two IP addresses connected to the same server) appeared to be two different clients. Since the server is under the impression that two different clients are involved, state changes made on each distinct IP address cannot be reflected on another.

However, by doing things in that way, state migrated from server to server cannot be referred to the actual client that generated it, leading to confusion.

In addition to this core problem, the following issues with regard to Transparent State Migration needed to be addressed:

o Clarification regarding the ability to merge state from different leases even though their expiration times might not be precisely synchronized.

o Clarifying the treatment of client IDs since it is not always clear when clientid4 and when nfs_client_id4 was intended.

o Clarifying the logic of returning NFS4ERR_LEASE_MOVED.

o Clarifying the handling of NFS4ERR_CLID_INUSE.

4.2. Resolution of Core Migration Protocol Difficulties in NFSv4.0

The client string identification issue was addressed in [RFC7931] as follows:

o Defining both the uniform and non-uniform client id string approaches as valid choices but indicating that the latter posed difficulties for Transparent State Migration.
o Providing a way for clients using the uniform approach to determine whether two IP addresses are connected to the same server.

o Allowing clients using the uniform approach to avoid negative consequences due to otherwise unexpected behavior since behavior that is a consequence of known trunking relationships is not unexpected.

o As a result, servers migrating state can be aware of the fact that the same client is associated with two different items of state even when that state was originally created on two different servers.

Since all of the other issues noted in Section 4.1 were also addressed by [RFC7931], publication of that document updating [RFC7530] addressed all issues with Transparent State Migration in NFSv4.0 known at that time.

4.3. Additional NFSv4.0 Issues

In light of the fact that a large set of migration-specific issues were addressed by the publication of [RFC7931], the remaining issues derive from those mentioned in Section 3.1. These include:

o Introducing facilities for trunking discovery.

o Clarifying the handling of multiple connection types, including issues related to the trunking of multiple connections of different types to the same network address.

o Clarifying the relationship between migration and trunking, including trunking among multiple server endpoints sharing a server network address.

4.4. Resolution of Additional NFSv4.0 Issues

One possible approach to addressing these issues would entail publication of an additional standards-track document updating [RFC7530].

Fortunately, it appears that all of the material to be updated is in Section 8 of that document, whether it concerns the provision of trunking discovery or the interaction of trunking and migration. It also appears that none of the material to be updated is in sections updated by [RFC7931].
A review of the existing Section 8 of [RFC7530] shows the following sections as requiring significant attention:

o The existing Section 8.1 requires a considerable expansion to explain the various uses of the fs_locations attribute and the possible interactions among them.

o The existing Section 8.4 may require substantial re-organization to reflect the facts that fs_locations has multiple functions and may be referenced on multiple occasions.

o The existing Section 8.5 follows the previous approach for NFSv4.0 in assuming that trunking simply cannot and should not happen. For example, the last paragraph says:

     If a single location entry designates multiple server IP addresses, the client should choose a single one to use. When two server addresses are designated by a single location entry and they correspond to different servers, this normally indicates some sort of misconfiguration, and so the client should avoid using such location entries when alternatives are available. When they are not, clients should pick one of the IP addresses and use it, without using others that are not directed to the same server.

  In addition, no part of the existing Section 8 mentions the possibility of multiple connection types, which completes the exclusion of the possibility of multiple trunked server endpoints from the existing description of NFSv4.0.

  As written, this section seems to foreclose any use of trunking in connection with migration. In retrospect, it appears that this section should have been revised as part of [RFC7931], but since that was not done then, the issue needs to be addressed now.

Overall, it appears that, in addition to the revision of Sections 8.1 and 8.5, Section 8.4 needs to be reorganized.
One possible approach is to divide the material into sub-sections as follows:

o A replacement introductory subsection describing all the uses of location information.

o A new subsection describing trunking discovery and detection, based on use of the existing entries within the fs_locations attribute.

o A new subsection describing the handling of multiple connection types. For a discussion of issues to be addressed, see Section 4.4.1.

o A replacement subsection dealing with replication and trunking.

o A replacement subsection dealing with migration.

o A new subsection dealing with the interaction of trunking, replication, and migration.

4.4.1. Resolution of NFSv4.0 Issues with Multiple Connection Types

The existence of multiple connection types raises issues regarding how the connection type to be used is determined by the client. Such issues need to be addressed when a new server is accessed and also when NFS4ERR_MOVED is returned and a server endpoint is to be selected to access the current file system.

The absence of explicit support for multiple connection types within NFSv4.0 means that the client has a great deal of freedom in making this determination, although some implementation guidance could be provided. A client could attempt to establish a connection of each connection type and use the connection type (or types) that it chooses. To make this an efficient process, servers which do not provide support for a particular connection type should promptly indicate that non-support. It should be the case that all server endpoints sharing a particular network address are to be considered trunkable, even though currently neither [RFC7530] nor [RFC7931] explicitly states that.

The approach mentioned above should, in general, be usable in the cases of migration and referral, as well as for initial mount.
Clients might well treat these situations differently, for example by using the type of the current connection as the initial type to try in the migration case, while not doing so in other cases.

Situations in which NFS4ERR_MOVED is returned without requiring any shift in target network address require special attention, so that a shift in the network endpoint to be used can be indicated even if there is no corresponding shift in network address. In the absence of multiple connection types, receiving NFS4ERR_MOVED when accessing one file system serves as an indication that that address is not to be used to access that file system subsequently, making it necessary to use other network addresses to access the file system, after migration or a shift in trunking patterns without migration.

Since NFSv4.0 does not provide any way for the server to specify the use of particular connection types, it might seem that there is no way for the server to direct such a shift. However, when NFS4ERR_MOVED is returned and the network address on which it was returned is still present in the location entries returned, a client may reasonably conclude that:

o The endpoint from which NFS4ERR_MOVED was returned is not to be used to access the file system in question.

o Other endpoints using the same network address but different connection types could be used to access the filesystem.

This gives the client a set of server endpoints to test for access to the filesystem. In cases in which there is already a connection established to that endpoint, file system access can be tested using a PUTFH within the target file system followed by a GETFH, which will either succeed or return NFS4ERR_MOVED depending on whether the endpoint used can validly access the file system. In other cases, a connection will need to be established before such a test can be performed.

5.
Issues for NFSv4.1 and Beyond

5.1. Trunking-focused Issues to Address for NFSv4.1

While the addition of trunking discovery will be addressed in the same way for both protocols, there are a number of issues for which the specifics of NFSv4.1 require special treatment:

o Because there is information in the fs_locations_info attribute that goes beyond the location entries, it needs to be clear how such information is to be handled, particularly when it applies to multiple access paths which might or might not be paths to the same replica. See Section 5.1.1 for a discussion of related issues.

o Since NFSv4.1 provides a way to determine whether two addresses are connected to the same server, the possibility that the information on which that determination is based will change needs to be addressed, as is discussed in Section 5.1.2.

5.1.1. Handling of Additional Information in fs_locations_info

The more extensive structure of the fs_locations_info attribute, as compared with fs_locations, means that a number of areas may need clarification when fs_locations_info is used in connection with trunking discovery:

o The function of fields within the attribute, outside the location entries (e.g. fli_valid_for), may need to be clarified, as it applies to trunking discovery.

o Since much of the information in the fs_locations4_server structure applies to a particular replica, rather than a specific access path, it needs to be clear where such information is replica-specific or access-path-specific and how any inconsistencies are to be dealt with.

5.1.2. Addressing Server_owner Changes in NFSv4.1

This issue is addressed in [RFC5661], although it does not provide a clear description of the needed handling.

Section 2.10.5 of [RFC5661] states the following:
   The client should be prepared for the possibility that eir_server_owner values may be different on subsequent EXCHANGE_ID requests made to the same network address, as a result of various sorts of reconfiguration events. When this happens and the changes result in the invalidation of previously valid forms of trunking, the client should cease to use those forms, either by dropping connections or by adding sessions. For a discussion of lock reclaim as it relates to such reconfiguration events, see Section 8.4.2.1.

While this paragraph is literally true in that such reconfiguration events can happen and clients have to deal with them, it is confusing in that it can be read as suggesting that clients have to deal with them without disruption, which in general is impossible.

A clearer alternative would be:

   It is always possible that, as a result of various sorts of reconfiguration events, eir_server_scope and eir_server_owner values may be different on subsequent EXCHANGE_ID requests made to the same network address.

   In most cases such reconfiguration events will be disruptive and indicate that an IP address formerly connected to one server is now connected to an entirely different one.

   Some guidelines on client handling of such situations follow:

   o When eir_server_scope changes, the client has no assurance that any IDs it obtained previously (e.g. file handles) can be validly used on the new server, and, even if the new server accepts them, there is no assurance that this is not due to accident. Thus it is best to treat all such state as lost/stale, although a client may assume that the probability of inadvertent acceptance is low and treat this situation as within the next case.

   o When eir_server_scope remains the same and eir_server_owner.so_major_id changes, the client can use filehandles it has and attempt reclaims.
     It may find that these are now stale but, if NFS4ERR_STALE is not received, it can proceed to reclaim its opens.

   o When eir_server_scope and eir_server_owner.so_major_id remain the same, the client has to use the now-current values of eir_server_owner.so_minor_id in deciding on appropriate forms of trunking.

5.2. Migration-related Issues to Address for NFSv4.1

Because NFSv4.1 embraces the uniform client-string approach, as advised by Section 2.4 of [RFC5661], addressing migration issues is simpler, in that a shift in client id string models is not required. Instead, NFSv4.1 returns information in the EXCHANGE_ID response to enable trunking relationships to be determined by the client.

Despite this simplification, there are substantial issues that need to be dealt with:

o The other necessary part of addressing migration issues, providing for the server's merger of leases that relate to the same client, is not currently addressed by [RFC5661] and changes need to be made to make it clear that state needs to be appropriately merged as part of migration, to avoid multiple client IDs between a client-server pair.

o The current discussion (in [RFC5661]) of the possibility of server_owner changes is incomplete and confusing.

o As with NFSv4.0, the interaction of trunking with migration and other aspects of multi-server namespace needs to be clarified.

Addressing migration in NFSv4.1 will also require adaptation of the approaches used in [RFC7931] to the NFSv4.1 environment, including:

o The use of EXCHANGE_ID needs to be accommodated, including issues associated with the expected confirmation status of client IDs transferred by Transparent State Migration.

o The use of sessions needs to be addressed, including discussion of the proper use of the status bits returned by the SEQUENCE operation.
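Returning briefly to the server_owner guidelines proposed in Section 5.1.2, the three cases listed there can be sketched as a classification over the values returned by successive EXCHANGE_ID requests to the same network address. This Python sketch is illustrative only: the type and function names are hypothetical, and it takes the conservative option for the first case (treating all state as lost or stale) rather than the optimistic alternative that the proposed text also permits.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class ServerIdentity:
    """Values taken from one EXCHANGE_ID response (hypothetical container)."""
    scope: bytes      # eir_server_scope
    major_id: bytes   # eir_server_owner.so_major_id
    minor_id: int     # eir_server_owner.so_minor_id


class Handling(Enum):
    TREAT_STATE_AS_LOST = 1               # scope changed
    RETRY_FILEHANDLES_AND_RECLAIM = 2     # same scope, major ID changed
    USE_CURRENT_MINOR_ID_FOR_TRUNKING = 3 # scope and major ID unchanged


def classify_change(old, new):
    """Map old/new server identity to the guideline that applies."""
    if new.scope != old.scope:
        # Previously obtained IDs carry no assurance of validity.
        return Handling.TREAT_STATE_AS_LOST
    if new.major_id != old.major_id:
        # Filehandles may still work; reclaim unless NFS4ERR_STALE is seen.
        return Handling.RETRY_FILEHANDLES_AND_RECLAIM
    # Trunking decisions must be based on the now-current so_minor_id.
    return Handling.USE_CURRENT_MINOR_ID_FOR_TRUNKING
```

A client could run such a check whenever it repeats EXCHANGE_ID against an address it already uses, and then adjust its connections and sessions according to the result.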
In addition, there are a number of new features within NFSv4.1 whose relationship with migration needs to be clarified. Some examples:

o There needs to be some clarification of how migration, and particularly Transparent State Migration, should interact with pNFS layouts.

o There are a number of issues related to the migration of sessions that need to be addressed.

Discussion of how to resolve these issues will appear in the sections below.

5.2.1. Addressing State Merger in NFSv4.1

The existing treatment of state transfer in [RFC5661] has similar problems to that in [RFC7530] in that it assumes that the state for multiple filesystems formerly on different servers will not be merged so that it appears under a single common client ID. We've already seen the reasons that this is a problem with regard to NFSv4.0.

Although we don't have the problems stemming from the non-uniform client-string approach, there are a number of complexities in the existing treatment of state management in the section entitled "Lock State and File System Transitions" in [RFC5661] that make this non-trivial to address:

o Migration is currently treated together with other sorts of filesystem transitions, including transitioning between replicas without any NFS4ERR_MOVED errors.

o There is separate handling and discussion of the cases of matching and non-matching server scopes.

o In the case of matching server scopes, the text calls for an unrealistic degree of transparency, suggesting that the source and destination servers need to cooperate in stateid and client ID assignment.

o In the case of non-matching server scopes, the text does not mention the possibility of the transparent migration of state at all, resulting in a functional regression from NFSv4.0.

5.2.2.
Addressing pNFS Relationship with Migration

This is made difficult because, within the pNFS framework, migration might mean any of several things:

o Transfer of the MDS, leaving DS's as they are.

  This would be minimally disruptive to those using layouts but would require the pNFS control protocol being used to support the DS being directed to a new MDS.

o Transfer of a DS, leaving everything else in place.

  Such a transfer can be handled without using migration at all. The server can recall/revoke layouts, and issue new ones, as appropriate.

o Transfer of the filesystem to a new filesystem with both MDS and DS's moving.

  In such a transfer, an entirely different set of DS's will be at the target location. There may even be no pNFS support on the destination filesystem at all.

Migration needs to support both the first and last of these models.

5.2.3. Addressing Confirmation Status of Migrated Client IDs in NFSv4.1

When a client ID is transferred between systems as a part of migration, it has never been clear whether it should be considered confirmed or unconfirmed on the target server. In the case in which an associated session is transferred together with the client ID, it is clear that the transferred client ID needs to be considered confirmed, as the existence of an associated session is incompatible with an unconfirmed client ID.

The case in which a client ID is transferred without an associated session is less clear-cut, particularly since the treatment of EXCHANGE_ID in [RFC5661] assumes that CREATE_SESSION is the only means by which a client ID may be confirmed. While this assumption is valid in the absence of Transparent State Migration, implementation of migration means that if this assumption is maintained, it is not clear how migrated client IDs can be accommodated.
If this assumption were maintained, we would have to choose between the following two alternatives regarding whether the client ID is to be reported as confirmed when EXCHANGE_ID is used to register an already-known client_owner with the server:

o Report the client ID as unconfirmed, because of the lack of an associated session. This makes it simpler for the client to determine whether there is an associated session transferred at the same time. However, it is inconsistent with the fact that there are stateids which have been transferred with the client ID.

o Report the client ID as confirmed, because it was confirmed on the source server and the transfer is not considered to have affected that. Given the current description of EXCHANGE_ID in [RFC5661], some modification in the treatment of client ID confirmation is called for. In particular, provision would have to be made to enable the client to determine the client ID slot sequence id to be used.

Although the first approach makes it simpler for the client to determine whether there is an associated session transferred at the same time, it makes it more difficult to determine whether Transparent State Migration has occurred. See Section 5.2.4.

In any case, adjustments will be required to deal with the fact that [RFC5661] currently assumes that a client ID can only be confirmed by issuing a CREATE_SESSION. In order to properly deal with the status of migrated client IDs, we have to distinguish among:

o The confirmation status as reported by EXCHANGE_ID.

o Whether the client ID is considered confirmed as that term is used in the many other cases in which the confirmation status of a client ID affects how requests are handled.

o How the client is to determine the initial sequence id to be used when doing operations such as CREATE_SESSION.
In [RFC5661] as it currently stands, all of these are tied together, and it is not obvious how migrated client IDs could be accommodated in this structure or what changes are necessary to make this possible. For more discussion of this issue, see Section 5.3.1.

5.2.4. Addressing Session Migration in NFSv4.1

Some issues that need to be addressed regard the migration of sessions, in addition to client IDs and stateids:

o It needs to be made clearer how the client can deal with the possibility that sessions might or might not be transferred as part of Transparent State Migration.

o Rules need to be clarified regarding possible transfer of sessions when either the source session is being used to access other file systems on the source server or there is already a session connecting the client to the destination server.

o There needs to be more detail regarding how the protocol avoids situations in which the same session is subject to concurrent changes on two different servers at the same time.

5.2.5. Dealing with Multiple Connection Types in NFSv4.1

The existence of multiple connection types raises issues regarding how the connection type to be used is determined by the client. Such issues need to be addressed when a new server is accessed and also when NFS4ERR_MOVED is returned and a server endpoint is to be selected to access the current file system.

The limited support for multiple connection types within NFSv4.1 means that a client can make this determination by first establishing a non-RDMA connection and then using the FSLI4TF_RDMA flag in the fs_locations_info attribute for the root file system to determine if an RDMA connection should be established. Such a connection can then, at the client's option, replace or remain trunked with the original connection.
The approach mentioned above should, in general, be usable in the cases of migration and referral, as well as for initial mount.

Situations in which NFS4ERR_MOVED is returned without requiring any shift in target network address require special attention, so that a shift in the network endpoint to be used can be indicated even if there is no corresponding shift in network address. In the absence of multiple connection types, receiving NFS4ERR_MOVED when accessing one file system serves as an indication that that address is not to be used to access that file system subsequently, making it necessary to use other network addresses to access the file system, after migration or a shift in trunking patterns without migration.

Since NFSv4.1 has only limited facilities for the server to specify the use of particular connection types, there are difficulties in directing such a shift. When NFS4ERR_MOVED is returned and the network address on which it was returned is still present in the location entries returned, a client may reasonably conclude that:

o The usability of the associated RDMA endpoint can be determined based on the status of the FSLI4TF_RDMA flag in the fs_locations_info attribute for the file system being accessed.

o The endpoint returning NFS4ERR_MOVED is not to be used to access the file system in question.

o Other endpoints using the same network address but different connection types could be used to access the filesystem.

This generally allows the client to determine the set of server endpoints to be used to access the filesystem.
In cases in which there is some ambiguity, file system access can be tested by establishing a connection, if not already present, and then using a PUTFH within the target file system followed by a GETFH, which will either succeed or return NFS4ERR_MOVED depending on whether the endpoint used can validly access the file system.

5.3. Possible Resolutions for NFSv4.1 Protocol Issues

The subsections below explore some ways of clarifying the protocol to address the issues discussed in Section 5.2.

5.3.1. Client ID Confirmation Issues

As mentioned previously, [RFC5661] makes no provision for client IDs that are confirmed other than through the use of CREATE_SESSION. For example, Section 18.35 of [RFC5661] states:

   The client uses the EXCHANGE_ID operation to register a particular client owner with the server. The client ID returned from this operation will be necessary for requests that create state on the server and will serve as a parent object to sessions created by the client. In order to confirm the client ID it must first be used, along with the returned eir_sequenceid, as arguments to CREATE_SESSION. If the flag EXCHGID4_FLAG_CONFIRMED_R is set in the result, eir_flags, then eir_sequenceid MUST be ignored, as it has no relevancy.

In deciding how to address the status of migrated client IDs in the case of Transparent State Migration, we should avoid giving undue weight to the last sentence of the above simply because it is stated in the form of a normative requirement. We should instead focus on the reasons such terms (i.e. those defined by [RFC2119]) are to be used: to state interoperability constraints. In this case, the "MUST" applies to a conclusion based on the premise that a CREATE_SESSION must have been done to assure that the client ID is reliably known to the server.
In that light, let us consider a possible replacement that treats confirmation by means of CREATE_SESSION as one of a number of possible means and avoids some of the undesirable consequences of adherence to the current approach, originally conceived without taking state migration into account.

   The client uses the EXCHANGE_ID operation to register a particular client_owner with the server. However, when the client_owner has already been registered by other means (e.g. Transparent State Migration), the client may still use EXCHANGE_ID to obtain the client ID assigned previously.

   The client ID returned from this operation will be associated with the connection on which the EXCHANGE_ID is received and will serve as a parent object for sessions created by the client on this connection or to which the connection is bound. As a result of using those sessions to make requests involving the creation of state, that state will become associated with the client ID returned.

   In situations in which the registration of the client_owner has not occurred previously, the client ID must first be used, along with the returned eir_sequenceid, in creating an associated session using CREATE_SESSION.

   If the flag EXCHGID4_FLAG_CONFIRMED_R is set in the result, eir_flags, then it is an indication that the registration of the client_owner has already occurred and that a further CREATE_SESSION is not needed to confirm it. Of course, subsequent CREATE_SESSION operations may be needed for other reasons.

   The value eir_sequenceid is used to establish an initial sequence value associated with the client ID returned. In cases in which a CREATE_SESSION has already been done, there is no need for this value, since sequencing of such requests has already been established, and the client will ignore it.

5.3.2.
Dealing with Multiple Location Entries

The possibility that more than one server address may be present in location attributes requires further clarification. This is particularly the case, given the potential role of trunking for NFSv4.1, whose connection to migration needs to be clarified.

The description of the location attributes in [RFC5661], while it indicates that multiple address entries in these attributes may be used to indicate alternate paths to the file system, does so mainly in the context of replication and does so without mentioning trunking. The discussion of migration does not discuss the possibility of multiple location entries or trunking, which we will explore here.

We will cover cases in which multiple addresses appear directly in the attributes as well as those in which the multiple addresses result because a single location entry is expanded into multiple location elements using addresses provided by DNS.

When the set of valid location elements by which a file system may be accessed changes, migration need not be involved. Some cases to consider:

o When the set of location elements expands, migration is not involved. In the case in which the additional elements are not trunkable with ones previously being used, the new elements serve as additional access locations, available in case of the failure of server addresses being used. When additional elements are trunkable with those currently being used, the client may use the additional addresses just as it might have if they had been available when use of the file system began.

  There is no current mechanism by which the client can be notified of a change in the set of available locations for an fs.
Given that the 1287 client has at least one IP address available to access the 1288 file system in question, periodic polling is an adequate mechanism 1289 for the client to find additional server addresses to use to 1290 access the file system. 1292 o When the set of location elements contracts but none of the 1293 elements no longer usable were in fact being used by the client, 1294 then no migration is involved and no change in network addresses 1295 is needed. Only if the client were to start using one of the 1296 unavailable elements would the client be notified (via 1297 NFS4ERR_MOVED) of the need to not use those elements and to use 1298 others provided by a location attribute. 1300 When a specific server address being used becomes unavailable to 1301 service a particular file system, NFS4ERR_MOVED will be returned, and 1302 the client will respond based on the available locations. Whether 1303 continuity of locking state will be available depends on a number of 1304 factors: 1306 o If there are still elements in use trunkable with the element that 1307 has become unavailable, there will still be a continuity of 1308 locking state, even though Transparent State Migration per se has 1309 not occurred. If the in-use addresses are session-trunkable with 1310 the address becoming unavailable, only one connection is lost and 1311 all existing sessions will remain available. If, on the other 1312 hand, the in-use addresses are only clientid-trunkable with the 1313 address becoming unavailable, a session can be lost. However, 1314 that session can be made available on those other nodes, just as 1315 it would have been if Transparent State Migration were in 1316 effect, even though no migration has occurred.
1318 o Otherwise, if there are available addresses trunkable with the one 1319 that has become unavailable, the client has access to existing 1320 locking state once it establishes a connection with the new 1321 addresses, using a new or existing session depending on the type 1322 of trunking in effect. This is also similar to the case in which 1323 Transparent State Migration has occurred, even though there is no 1324 migration, with the state remaining on the existing server. 1326 Note that this case, as well as the previous one, can be expected 1327 in the case in which the server seeks to direct traffic with 1328 regard to particular file systems to choose addresses, in the 1329 interest of load balancing, to adjust to hardware availability 1330 constraints, or for other reasons. 1332 o In other cases, migration has occurred and the client can 1333 determine whether Transparent State Migration occurred and whether 1334 any locking state was lost during the transfer. 1336 Whether migration has occurred or not, the client can use the 1337 procedure described in Section 5.5.3 to recover access to existing 1338 locking state and, in some cases, sessions. 1340 One should note the following differences between migration with 1341 Transparent State Migration and the similar cases in which there is a 1342 continuity of locking state with no change in the server. 1344 o When locks are lost (as indicated when using them or via the 1345 SEQ4_STATUS flags) and migration has not been done, they are not 1346 to be reclaimed, except when SEQ4_STATUS_RESTART_RECLAIM_NEEDED is 1347 set. Instead such losses are treated as lock revocations and 1348 acknowledged using FREE_STATEID. 1350 o When migration has not been done, there is no need for a 1351 RECLAIM_COMPLETE (with rca_one_fs set to TRUE). 1353 5.3.3. Migration and pNFS 1355 When pNFS is involved, the protocol is capable of supporting: 1357 o Migration of the MDS, leaving DS's in place. 
1359 o Migration of the file system as a whole, including the MDS and 1360 associated DS's. 1362 o Replacement of one DS by another. 1364 o Migration of a pNFS file system to one in which pNFS is not used. 1366 o Migration of a file system not using pNFS to one in which layouts 1367 are available. 1369 Migration of the MDS function is directly supported by Transparent 1370 State Migration. Layout state will normally be transparently 1371 transferred, just as other state is. As a result, Transparent State 1372 Migration provides a framework in which, given appropriate inter-MDS 1373 data transfer, one MDS can be substituted for another. 1375 Migration of the file system function as a whole can be accomplished 1376 by recalling all layouts as part of the initial phase of the 1377 migration process. As a result, IO will be done through the MDS 1378 during the migration process, and new layouts can be granted once the 1379 client is interacting with the new MDS. An MDS can also effect this 1380 sort of transition by revoking all layouts as part of Transparent 1381 State Migration, as long as the client is notified about the loss of 1382 state. 1384 In order to allow migration to a file system on which pNFS is not 1385 supported, clients need to be prepared for a situation in which 1386 layouts are not available or supported on the destination file system 1387 and so direct IO requests to the destination server, rather than 1388 depending on layouts being available. 1390 Replacement of one DS by another is not addressed by migration as 1391 such but can be effected by an MDS recalling layouts for the DS to be 1392 replaced and issuing new ones to be served by the successor DS. 1394 Migration may transfer a file system from a server which does not 1395 support pNFS to one which does. 
In order to properly adapt to this 1396 situation, clients which support pNFS, but function adequately in its 1397 absence, should check for pNFS support when a file system is migrated 1398 and be prepared to use pNFS when support is available. 1400 5.4. Defining Server Responsibilities for NFSv4.1 Transitions 1402 The subsections below discuss server responsibilities in providing 1403 for the propagation of locking state when a file system is migrated. 1405 Sections 5.4.1 and 5.4.2 discuss the responsibilities of source and 1406 destination servers in effecting the necessary transfer of 1407 information to support Transparent State Migration. 1409 Section 5.4.3 discusses issues relating to the handling of state 1410 recovery using client-directed reclaim of existing locks, used when 1411 Transparent State Migration is not available. 1413 5.4.1. Server Responsibilities in Effecting Transparent State Migration 1415 The basic responsibility of the source server in effecting 1416 Transparent State Migration is to make available to the destination 1417 server a description of each piece of locking state associated with 1418 the file system being migrated. In addition to the client id string and 1419 verifier, the source server needs to provide, for each stateid: 1421 o The stateid including the current sequence value. 1423 o The associated client ID. 1425 o The handle of the associated file. 1427 o The type of the lock, such as open, byte-range lock, delegation, 1428 or layout. 1430 o For locks such as opens and byte-range locks, there will be 1431 information about the owner(s) of the lock. 1433 o For recallable/revocable lock types, the current recall status 1434 needs to be included. 1436 o For each lock type there will be type-specific information, such 1437 as share and deny modes for opens and type and byte ranges for 1438 byte-range locks and layouts.
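The per-stateid information listed above can be pictured as a single record passed from source to destination. The following is only a minimal sketch in Python. The names MigratedLockRecord, LockType, and validate are hypothetical and do not come from any specification, and the actual inter-server transfer format is outside the scope of the protocol. The only detail taken from the protocol is that a stateid consists of a sequence value plus a 12-byte opaque field.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class LockType(Enum):
    # Lock types named in the list above (illustrative enumeration).
    OPEN = "open"
    BYTE_RANGE = "byte-range lock"
    DELEGATION = "delegation"
    LAYOUT = "layout"


@dataclass(frozen=True)
class Stateid:
    seqid: int    # current sequence value
    other: bytes  # 12-byte opaque portion of the stateid


@dataclass
class MigratedLockRecord:
    # Hypothetical record; field names are assumptions for illustration.
    stateid: Stateid
    client_id: int
    file_handle: bytes
    lock_type: LockType
    owners: list = field(default_factory=list)         # open/lock owners
    recall_status: Optional[str] = None                # recallable types only
    type_specific: dict = field(default_factory=dict)  # modes, byte ranges

    def validate(self) -> list:
        """Return a list of problems; an empty list means nothing is missing."""
        problems = []
        if len(self.stateid.other) != 12:
            problems.append("stateid 'other' field must be 12 bytes")
        if self.lock_type in (LockType.OPEN, LockType.BYTE_RANGE) and not self.owners:
            problems.append("owner information missing")
        if (self.lock_type in (LockType.DELEGATION, LockType.LAYOUT)
                and self.recall_status is None):
            problems.append("recall status missing")
        return problems
```

A destination-side implementation might run a check of this kind on each record before accepting it, so that an incomplete description is treated as a lock that was not transferred rather than being silently installed.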
1440 A further server responsibility concerns locks that are revoked or 1441 otherwise lost during the process of file system migration. Because 1442 locks that appear to be lost during the process of migration will be 1443 reclaimed by the client, the servers have to take steps to ensure 1444 that locks revoked soon before or soon after migration are not 1445 inadvertently allowed to be reclaimed in situations in which the 1446 continuity of lock possession cannot be assured. 1448 o For locks lost on the source but whose loss has not yet been 1449 acknowledged by the client (by using FREE_STATEID), the 1450 destination must be aware of this loss so that it can deny a 1451 request to reclaim them. 1453 o For locks lost on the destination after the state transfer but 1454 before the client's RECLAIM_COMPLETE is done, the destination 1455 server should note these and not allow them to be reclaimed. 1457 An additional responsibility of the cooperating servers concerns 1458 situations in which a stateid cannot be transferred transparently 1459 because it conflicts with an existing stateid held by the client and 1460 associated with a different file system. In this case there are two 1461 valid choices: 1463 o Treat the transfer, as in NFSv4.0, as one without Transparent 1464 State Migration. In this case, conflicting locks cannot be 1465 granted until the client does a RECLAIM_COMPLETE (with rca_one_fs 1466 set to TRUE), after reclaiming the locks it had, with the 1467 exception of reclaims denied because they were attempts to reclaim 1468 locks that had been lost. 1470 o Implement Transparent State Migration, except for the lock with 1471 the conflicting stateid. In this case, the client will be aware 1472 of a lost lock (through the SEQ4_STATUS flags) and be allowed to 1473 reclaim it, after which it does the appropriate RECLAIM_COMPLETE. 1475 5.4.2. 
Synchronizing Session Transfer 1477 When transferring state between the source and destination, the 1478 issues discussed in Section 7.2 of [RFC7931] must still be attended 1479 to. In this case, the use of NFS4ERR_DELAY is still necessary in 1480 NFSv4.1, as it was in NFSv4.0, to prevent locking state from changing 1481 while it is being transferred. 1483 There are a number of important differences in the NFSv4.1 context: 1485 o The absence of RELEASE_LOCKOWNER means that the one case in which 1486 an operation could not be deferred by use of NFS4ERR_DELAY no 1487 longer exists. 1489 o Sequencing of operations is no longer done using owner-based 1490 operation sequence numbers. Instead, sequencing is session- 1491 based. 1493 As a result, when sessions are not transferred, the techniques 1494 discussed in [RFC7931] are adequate and will not be further 1495 discussed. 1497 When sessions are transferred, there are a number of issues that pose 1498 challenges, since: 1500 o A single session may be used to access multiple file systems, not 1501 all of which are being transferred. 1503 o Requests made on a session may, even if rejected, affect the state 1504 of the session by advancing the sequence number associated with 1505 the slot used. 1507 As a result, when the file system state might otherwise be considered 1508 unmodifiable, the client might have any number of in-flight requests, 1509 each of which is capable of changing session state. These requests may be of a 1510 number of types: 1512 1. Those requests that were processed on the migrating file system, 1513 before migration began. 1515 2. Those requests which got the error NFS4ERR_DELAY because the file 1516 system being accessed was in the process of being migrated. 1518 3. Those requests which got the error NFS4ERR_MOVED because the file 1519 system being accessed had been migrated. 1521 4. Those requests that accessed the migrating file system, in order 1522 to obtain location or status information. 1524 5.
Those requests that did not reference the migrating file system. 1526 It should be noted that the history of any particular slot is likely 1527 to include a number of these request classes. In the case in which a 1528 session which is migrated is used by filesystems other than the one 1529 migrated, requests of class 5 may be common and be the last request 1530 processed, for many slots. 1532 Since session state can change even after the locking state has been 1533 fixed as part of the migration process, the session state known to 1534 the client could be different from that on the destination server, 1535 which necessarily reflects the session state on the source server, at 1536 an earlier time. In deciding how to deal with this situation, it is 1537 helpful to distinguish between two sorts of behavioral consequences 1538 of the choice of initial sequence ID values. 1540 o The error NFS4ERR_SEQ_MISORDERED is returned when the sequence ID 1541 in a request is neither equal to the last one seen for the current 1542 slot nor the next greater one. 1544 In view of the difficulty of arriving at a mutually acceptable 1545 value for the correct last sequence value at the point of 1546 migration, it may be necessary for the server to show some degree 1547 of forbearance, when the sequence ID is one that would be 1548 considered unacceptable if session migration were not involved. 1550 o Returning the cached reply for a previously executed request when 1551 the sequence ID in the request matches the last value recorded for 1552 the slot. 1554 In the cases in which an error is returned and there is no 1555 possibility of any non-idempotent operation having been executed, 1556 it may not be necessary to adhere to this as strictly as might be 1557 proper if session migration were not involved. 
For example, the 1558 fact that the error NFS4ERR_DELAY was returned may not assist the 1559 client in any material way, while the fact that NFS4ERR_MOVED was 1560 returned by the source server may not be relevant when the request 1561 was reissued, directed to the destination server. 1563 One part of adapting to these sorts of issues would restrict 1564 enforcement of normal slot sequence enforcement semantics until the 1565 client itself, by issuing a request using a particular slot on the 1566 destination server, established the new starting sequence for that 1567 slot on the migrated session. 1569 An important issue is that the specification needs to take note of 1570 all potential COMPOUNDs, even if they might be unlikely in practice. 1571 For example, a COMPOUND is allowed to access multiple file systems 1572 and might perform non-idempotent operations in some of them before 1573 accessing a file system being migrated. Also, a COMPOUND may return 1574 considerable data in the response, before being rejected with 1575 NFS4ERR_DELAY or NFS4ERR_MOVED, and may in addition be marked as 1576 sa_cachethis. 1578 Some possibilities that need to be considered to address the issues: 1580 o Do not enforce any sequencing semantics for a particular slot 1581 until the client has established the starting sequence for that 1582 slot on the destination server. 1584 o For each slot, do not return a cached reply returning 1585 NFS4ERR_DELAY or NFS4ERR_MOVED until the client has established 1586 the starting sequence for that slot on the destination server. 1588 o Until the client has established the starting sequence for a 1589 particular slot on the destination server, do not report 1590 NFS4ERR_SEQ_MISORDERED or return a cached reply returning 1591 NFS4ERR_DELAY or NFS4ERR_MOVED, where the reply consists solely of 1592 a series of operations where the response is NFS4_OK until the 1593 final error. 1595 5.4.3. 
Server Issues Dealing with RECLAIM_COMPLETE 1597 When Transparent State Migration is not available, servers can 1598 provide a grace period limited to a single file system, giving 1599 clients the opportunity to reestablish their locks, originally held 1600 on the source server, on the destination server, using the same 1601 reclaim options normally used to recover from a server restart. 1603 As part of that process, clients need to signal the end of their 1604 contribution to the lock recovery process for a particular file 1605 system transition by using the RECLAIM_COMPLETE operation described 1606 in [RFC5661], specifying an rca_one_fs value of TRUE. 1608 Since the publication of that document there have been a number of 1609 developments regarding the handling of this form of RECLAIM_COMPLETE 1610 that create issues that need to be addressed: 1612 o The treatment of RECLAIM_COMPLETE in [RFC5661] was not as explicit 1613 about the purpose of rca_one_fs as it might have been, leading to 1614 some implementor confusion. 1616 o Some clients, most likely those intending to access a single file 1617 system, have issued RECLAIM_COMPLETE operations specifying an 1618 rca_one_fs value of TRUE even where no file system migration event 1619 has occurred. In so doing, such clients have essentially ignored 1620 the REQUIREMENT that a client perform a RECLAIM_COMPLETE operation 1621 specifying an rca_one_fs value of FALSE before obtaining any new 1622 locks. 1624 o Servers, in supporting such clients, may have accepted such 1625 RECLAIM_COMPLETE operations and treated them as if they were done 1626 with an rca_one_fs value of FALSE. 1628 These developments, while troubling, do not raise any substantive 1629 difficulty if the servers do not support fs migration. However, to 1630 enable file system migration to be implemented, some work must be 1631 done to make rca_one_fs useful, while maintaining necessary 1632 compatibility with existing implementations.
1634 o A new treatment of RECLAIM_COMPLETE is needed to eliminate the 1635 ambiguities within the one contained in [RFC5661] and to make it 1636 clear that rca_one_fs is not to be misused as it has been. 1638 o While servers "SHOULD NOT" accept erroneous uses of 1639 RECLAIM_COMPLETE with rca_one_fs TRUE (i.e. those not connected to 1640 file system migration), the consequences of doing so need to be 1641 elucidated. As a result, servers which have no possibility of 1642 being the destination in cases of fs migration will continue to be 1643 able to do what they have been doing, while it is likely that 1644 others will have to cease accepting such RECLAIM_COMPLETE 1645 operations in place of those with rca_one_fs set to FALSE. 1647 5.5. Defining Client Responsibilities for NFSv4.1 Transitions 1649 The subsections below discuss the responsibilities of the client in 1650 dealing with transition to a new server (migration) and to use of new 1651 network addresses in accessing existing servers. 1653 5.5.1. Client Recovery from Migration Events 1655 When a file system is migrated, there are a number of migration-related 1656 status indications with which clients need to deal: 1658 o If an attempt is made to use or return a filehandle within a file 1659 system that has been migrated away from the server on which it was 1660 previously available, the error NFS4ERR_MOVED is returned. 1662 This condition continues on subsequent attempts to access the file 1663 system in question. The only way the client can avoid the error 1664 is to cease accessing the file system in question at its old server 1665 location and access it instead on the server to which it has been 1666 migrated.
1668 o Whenever a SEQUENCE operation is sent by a client to a server 1669 which generated state held on that client which is associated with 1670 a file system that has been migrated away from the server on which 1671 it was previously available, the status bit 1672 SEQ4_STATUS_LEASE_MOVED is set in the response. 1674 This condition continues until the client acknowledges the 1675 notification by fetching a location attribute for the migrated 1676 file system. When there are multiple migrated file systems, a 1677 location attribute for each such migrated file system needs to be 1678 fetched, in order to clear the condition. Even after the 1679 condition is cleared, the client needs to respond by using the 1680 location information to access the destination server to ensure 1681 that leases are not needlessly expired. 1683 Unlike the case of NFSv4.0 in which the corresponding conditions are 1684 distinct errors and thus mutually exclusive, in NFSv4.1 the client 1685 can, and often will, receive both indications on the same request. 1686 As a result, implementations need to address the question of how to 1687 co-ordinate the necessary recovery actions when both indications 1688 arrive simultaneously. It should be noted that when the server 1689 decides whether SEQ4_STATUS_LEASE_MOVED is to be set, it has no way 1690 of knowing which file system will be referenced or whether 1691 NFS4ERR_MOVED will be returned. 1693 While it is true that, when only a single migrated file system is 1694 involved, a single set of actions will clear both indications, the 1695 possibility of multiple migrated file systems calls for an approach 1696 in which there are separate recovery actions for each indication. 
In 1697 general, the response to neither indication can be subsumed within 1698 the other since: 1700 o If the client were to respond only to the MOVED indication, there 1701 would be no effective client response to a situation in which a 1702 file system was not being actively accessed at the time migration 1703 occurred. As a result, leases on the destination server might be 1704 needlessly expired. 1706 o If the client were to respond only to the LEASE_MOVED indication, 1707 recovery for migrated file systems in active use could be deferred 1708 in order to accomplish recovery for others not being actively 1709 accessed. The consequences of this choice can pose particular 1710 problems when there are a large number of file systems supported 1711 by a particular server, or when it happens that some servers, 1712 after receiving migrated file systems have periods of 1713 unavailability, such as occur as a result of server reboot. This 1714 can result in recovery for actively accessed migrated file systems 1715 being unnecessarily delayed for long periods of time. 1717 Similar considerations apply to other arrangements in which one of 1718 the indications, while not ignored per se, is subsumed within a 1719 single recovery process focused on recovery for the other indication. 1721 Although clients are free to decide on their own approaches to 1722 recovery, we will explore below an approach with the following 1723 characteristics: 1725 o All instances of the MOVED indication, whether they involve 1726 migration or not, should be dealt with promptly, either by doing 1727 the necessary recovery directly, providing that it be done 1728 asynchronously, or ensuring that it is already under way. 1730 o All instances of the LEASE_MOVED indication should be dealt with 1731 asynchronously, in a migration discovery thread whose job is to 1732 clear that indication by fetching the appropriate location 1733 attribute. 
Because this thread will only be fetching a location 1734 attribute and the fs_status attribute for the file systems 1735 referenced by the client, it cannot receive MOVED indications. 1736 Some useful guidance regarding possible implementation of a 1737 migration discovery thread can be found in Section 5.5.2. 1739 o When a migration discovery thread happens upon a migrated file 1740 system (i.e. not present and not a referral), the thread is likely 1741 to have cleared one (out of an unknown number) of file systems 1742 whose migration needs to be responded to. The discovery thread 1743 needs to schedule the appropriate migration recovery (as described 1744 in Section 5.5.3). This is necessary to ensure that migrated file 1745 systems will be referenced on the destination server in order to 1746 avoid unnecessary lease expiration. 1748 For many of the migrated file systems discovered in this way, the 1749 client has not received any MOVED indication. In such cases, 1750 lease recovery needs to be scheduled but it should not interfere 1751 with continuation of the migration discovery function. 1753 o When a migration discovery thread receives a LEASE_MOVED 1754 indication, it takes no special action but continues its normal 1755 operation. On the other hand, if a LEASE_MOVED indication is not 1756 received, it indicates that the thread has completed its work 1757 successfully. 1759 5.5.2. The Migration Discovery Process 1761 As noted above, LEASE_MOVED indications are best dealt with in a 1762 migration discovery thread. Because of this structure, 1764 o No action needs to be taken for such indications received by the 1765 migration discovery threads, since continuation of that thread's 1766 work will address the issue. 
1768 o For such indications received in other contexts, the generally 1769 appropriate response is to initiate or otherwise provide for the 1770 execution of a migration discovery thread for file systems 1771 associated with the server IP address returning the indication. 1773 o In all cases in which the appropriate migration discovery thread 1774 is running, nothing further needs to be done to respond to 1775 LEASE_MOVED indications. 1777 This leaves a potential difficulty in situations in which the 1778 migration discovery thread is near to completion but is still 1779 operating. One should not ignore a LEASE_MOVED indication if the 1780 discovery thread is not able to respond to a migrated file system 1781 without additional aid. A further difficulty in addressing such 1782 situations is that a LEASE_MOVED indication may reflect the server's 1783 state at the time the SEQUENCE operation was processed, which may be 1784 different from that in effect at the time the response is received. 1786 A useful approach to this issue involves the use of separate 1787 externally-visible discovery thread states representing non- 1788 operation, normal operation, and completion/verification of migration 1789 discovery processing. 1791 Within that framework, discovery thread processing would proceed as 1792 follows: 1794 o While in the normal-operation state, the thread would fetch, for 1795 successive file systems known to the client on the server being 1796 worked on, a location attribute plus the fs_status attribute. 1798 o If the fs_status attribute indicates that the file system is a 1799 migrated one (i.e. fss_absent is true and fss_type != 1800 STATUS4_REFERRAL), then it is likely that the fetch of the 1801 location attribute has cleared one of the file systems contributing 1802 to the LEASE_MOVED indication.
1804 o In cases in which that happened, the thread cannot know whether 1805 the LEASE_MOVED indication has been cleared and so it enters the 1806 completion/verification state and proceeds to issue a COMPOUND to 1807 see if the LEASE_MOVED indication has been cleared. 1809 o When the discovery thread is in the completion/verification state, 1810 if others get a LEASE_MOVED indication they note this fact and it 1811 is used when the request completes, as described below. 1813 When the request used in the completion/verification state completes: 1815 o If a LEASE_MOVED indication is returned, the discovery thread 1816 resumes its normal work. 1818 o Otherwise, if there is any record that other requests saw a 1819 LEASE_MOVED indication, that record is cleared and the 1820 verification request retried. The discovery thread remains in 1821 completion/verification state. 1823 o If there has been no LEASE_MOVED indication, the work of the 1824 discovery thread is considered completed and it enters the non- 1825 operating state. 1827 5.5.3. Overview of Client Response to NFS4ERR_MOVED 1829 This section outlines a way in which a client that receives 1830 NFS4ERR_MOVED can respond by using a new server or network address if 1831 one is available. As part of that process, it will determine: 1833 o Whether the NFS4ERR_MOVED indicates migration has occurred, or 1834 whether it indicates another sort of file system transition as 1835 discussed in Section 5.3.2. 1837 o In the case of migration, whether Transparent State Migration has 1838 occurred. 1840 o Whether any state has been lost during the process of Transparent 1841 State Migration. 1843 o Whether sessions have been transferred as part of Transparent 1844 State Migration. 1846 During the first phase of this process, the client proceeds to 1847 examine location entries to find the initial network address it will 1848 use to continue access to the file system or its replacement. 
For 1849 each location entry that the client examines, the process consists of 1850 five steps: 1852 1. Performing an EXCHANGE_ID directed at the location address. 1853 This operation is used to register the client_owner with the 1854 server, to obtain a client ID to be used subsequently to 1855 communicate with it, to obtain that client ID's confirmation 1856 status, and to determine server_owner and scope for the purpose 1857 of determining if the entry is trunkable with that previously 1858 being used to access the file system (i.e. that it represents 1859 another path to the same file system and can share locking state 1860 with it). 1862 2. Making an initial determination of whether migration has 1863 occurred. The initial determination will be based on whether the 1864 EXCHANGE_ID results indicate that the current location element is 1865 server-trunkable with that used to access the file system when 1866 access was terminated by receiving NFS4ERR_MOVED. If it is, then 1867 migration has not occurred and the transition is dealt with, at 1868 least initially, as one involving continued access to the same 1869 file system on the same server through a new network address. 1871 3. Obtaining access to existing session state or creating new 1872 sessions. How this is done depends on the initial determination 1873 of whether migration has occurred and can be done as described in 1874 Section 5.5.4 in the case of migration or as described in 1875 Section 5.5.5 in the case of a network address transfer without 1876 migration. 1878 4. Verification of the trunking relationship assumed in step 2 as 1879 discussed in Section 2.10.5.1 of [RFC5661]. Although this step 1880 will generally confirm the initial determination, it is possible 1881 for verification to fail with the result that an initial 1882 determination that a network address shift (without migration) 1883 has occurred may be invalidated and migration determined to have 1884 occurred.
There is no need to redo step 3 above, since it will 1885 be possible to continue use of the session established already. 1887 5. Obtaining access to existing locking state and/or reobtaining it. 1888 How this is done depends on the final determination of whether 1889 migration has occurred and can be done as described in 1890 Section 5.5.4 in the case of migration or as described in 1891 Section 5.5.5 in the case of a network address transfer without 1892 migration. 1894 Once the initial address has been determined, clients are free to 1895 apply an abbreviated process to find additional addresses trunkable 1896 with it (clients may seek session-trunkable or server-trunkable 1897 addresses depending on whether they support clientid trunking). 1898 During this later phase of the process, further location entries are 1899 examined using the abbreviated procedure specified below: 1901 1. Before the EXCHANGE_ID, the fs_name field is examined and if it 1902 does not match that currently being used, the entry is ignored. 1903 Otherwise, one proceeds as specified by step 1 above. 1905 2. In the case that the network address is session-trunkable with 1906 one used previously, a BIND_CONN_TO_SESSION is used to access that 1907 session using the new network address. Otherwise, or if the bind 1908 operation fails, a CREATE_SESSION is done. 1910 3. The verification procedure referred to in step 4 above is used. 1911 However, if it fails, the entry is ignored and the next available 1912 entry is used. 1914 5.5.4. Obtaining Access to Sessions and State after Migration 1916 In the event that migration has occurred, the determination of 1917 whether Transparent State Migration has occurred is driven by the 1918 client ID returned by the EXCHANGE_ID and the reported confirmation 1919 status. 1921 o If the client ID is an unconfirmed client ID not previously known 1922 to the client, then Transparent State Migration has not occurred.
1924 o If the client ID is a confirmed client ID previously known to the 1925 client, then any transferred state would have been merged with an 1926 existing client ID representing the client to the destination 1927 server. In this state merger case, Transparent State Migration 1928 might or might not have occurred and a determination as to whether 1929 it has occurred is deferred until sessions are established and we 1930 are ready to begin state recovery. 1932 o If the client ID is a confirmed client ID not previously known to 1933 the client, then the client can conclude that the client ID was 1934 transferred as part of Transparent State Migration. In this 1935 transferred client ID case, Transparent State Migration has 1936 occurred although some state may have been lost. 1938 Once the client ID has been obtained, it is necessary to obtain 1939 access to sessions to continue communication with the new server. In 1940 any of the cases in which Transparent State Migration has occurred, 1941 it is possible that a session was transferred as well. To deal with 1942 that possibility, clients can, after doing the EXCHANGE_ID, issue a 1943 BIND_CONN_TO_SESSION to connect the transferred session to a 1944 connection to the new server. If that fails, it is an indication 1945 that the session was not transferred and that a new session needs to 1946 be created to take its place. 1948 In some situations, it is possible for a BIND_CONN_TO_SESSION to 1949 succeed without session migration having occurred. If state merger 1950 has taken place then the associated client ID may have already had a 1951 set of existing sessions, with it being possible that the sessionid 1952 of a given session is the same as one that might have been migrated. 1953 In that event, a BIND_CONN_TO_SESSION might succeed, even though 1954 there could have been no migration of the session with that 1955 sessionid. 
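The three client ID cases, together with the attempt to bind to a possibly transferred session, can be sketched as follows. This is an illustration only, written in Python. The names MigrationOutcome, classify_client_id, and obtain_session are hypothetical, and the two callables stand in for whatever mechanism a client uses to issue BIND_CONN_TO_SESSION and CREATE_SESSION. The combination of an unconfirmed but previously known client ID is not addressed by the text above, so the sketch simply rejects it.

```python
from enum import Enum


class MigrationOutcome(Enum):
    NOT_TRANSPARENT = "Transparent State Migration has not occurred"
    STATE_MERGER = "state merger; determination deferred"
    CLIENT_ID_TRANSFERRED = "client ID transferred; some state may be lost"


def classify_client_id(confirmed: bool, previously_known: bool) -> MigrationOutcome:
    """Map the EXCHANGE_ID results onto the three cases in the text."""
    if not confirmed and not previously_known:
        return MigrationOutcome.NOT_TRANSPARENT
    if confirmed and previously_known:
        return MigrationOutcome.STATE_MERGER
    if confirmed and not previously_known:
        return MigrationOutcome.CLIENT_ID_TRANSFERRED
    # Assumption: this combination is not covered by the text.
    raise ValueError("unconfirmed but previously known client ID")


def obtain_session(bind_conn_to_session, create_session, old_session_id):
    """Try to bind to the old session; create a new one if that fails.

    bind_conn_to_session(session_id) -> bool and create_session() -> id
    are hypothetical callables supplied by the caller.
    Returns (session_id, created_new).
    """
    if bind_conn_to_session(old_session_id):
        return old_session_id, False
    return create_session(), True
```

Note that, as the last paragraph above points out, a successful bind in the state merger case does not by itself show that the session was migrated, so a caller should not treat the second element of the result as proof of session transfer.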

   Once the client has determined the initial migration status, and
   determined that there was a shift to a new server, it needs to
   re-establish its lock state, if possible. To enable this to happen
   without loss of the guarantees normally provided by locking, the
   destination server needs to implement a per-fs grace period in all
   cases in which lock state was lost, including those in which
   Transparent State Migration was not implemented.

   Clients need to deal with the following cases:

   o  In the state merger case, it is possible that the server has not
      attempted Transparent State Migration, in which case state may
      have been lost without it being reflected in the SEQ4_STATUS bits.
      To determine whether this has happened, the client can use
      TEST_STATEID to check whether the stateids created on the source
      server are still accessible on the destination server. Once a
      single stateid is found to have been successfully transferred, the
      client can conclude that Transparent State Migration was begun and
      any failure to transport all of the stateids will be reflected in
      the SEQ4_STATUS bits. Otherwise, Transparent State Migration has
      not occurred.

   o  In a case in which Transparent State Migration has not occurred,
      the client can use the per-fs grace period provided by the
      destination server to reclaim locks that were held on the source
      server.

   o  In a case in which Transparent State Migration has occurred, and
      no lock state was lost (as shown by SEQ4_STATUS flags), no lock
      reclaim is necessary.

   o  In a case in which Transparent State Migration has occurred, and
      some lock state was lost (as shown by SEQ4_STATUS flags), existing
      stateids need to be checked for validity using TEST_STATEID, and
      reclaim used to re-establish any that were not transferred.
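
   The cases listed above can be combined into a single planning step,
   sketched below. This is a hedged illustration rather than a
   definitive client algorithm: RecoveryPlan and plan_lock_recovery are
   hypothetical names, stateids are represented as plain strings, and
   the set of stateids found valid is assumed to come from TEST_STATEID
   calls the client has already made (it is empty if none were made).

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class RecoveryPlan:
    tsm_occurred: bool      # did Transparent State Migration occur?
    to_reclaim: List[str]   # stateids whose locks must be reclaimed


def plan_lock_recovery(tsm_occurred: Optional[bool],
                       seq4_lost_state: bool,
                       held: List[str],
                       valid: Set[str]) -> RecoveryPlan:
    """Decide which locks to reclaim on the destination server.

    tsm_occurred is None in the state merger case, where the client
    must infer the answer from TEST_STATEID results.
    """
    if tsm_occurred is None:
        # State merger: one stateid that is still valid shows that
        # Transparent State Migration was at least begun.
        tsm_occurred = any(s in valid for s in held)

    if not tsm_occurred:
        # Nothing was transferred: reclaim everything within the
        # per-fs grace period.
        return RecoveryPlan(False, list(held))

    if not seq4_lost_state:
        # State was transferred and SEQ4_STATUS reports no loss.
        return RecoveryPlan(True, [])

    # Some state was lost: reclaim only what TEST_STATEID rejected.
    return RecoveryPlan(True, [s for s in held if s not in valid])
```

   For instance, in the state merger case with held stateids "a" and
   "b", of which only "a" tests as valid and with lost state reported,
   the plan is to reclaim "b" alone.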

   For all of the cases above, RECLAIM_COMPLETE with an rca_one_fs value
   of true should be done before normal use of the file system,
   including obtaining new locks for the file system. This applies even
   if no locks were lost and needed to be reclaimed.

5.5.5.  Obtaining Access to Sessions and State after Network Address
        Transfer

   The case in which there is a transfer to a new network address
   without migration is similar to that described in Section 5.5.4 in
   that there is a need to obtain access to needed sessions and locking
   state. However, the details are simpler and will vary depending on
   the type of trunking between the address receiving NFS4ERR_MOVED and
   that to which the transfer is to be made.

   To make a session available for use, a BIND_CONN_TO_SESSION should be
   used to obtain access to the session previously in use. Only if this
   fails should a CREATE_SESSION be done. While this procedure mirrors
   that in Section 5.5.4, there is an important difference in that
   preservation of the session is not purely optional but depends on the
   type of trunking.

   Access to appropriate locking state should need no actions beyond
   access to the session. However, the SEQ4_STATUS bits should be
   checked for lost locking state, including the need to reclaim locks
   after a server reboot.

5.6.  Resolution of NFSv4.1 Issues

   One possibility is that addressing all of the NFSv4.1 issues would
   entail publication of a standards-track document updating [RFC5661].

   Such a document would have three major elements:

   o  A considerable expansion of the existing Section 11.4 explaining
      the various uses of the location attribute and the possible
      interactions among these various uses. This, like the
      corresponding replacement section for NFSv4.0, would be based on
      our Section 3.2 above.
      Information regarding the specifics of trunking discovery might
      appear in this section, in a new sub-section. As part of this
      revision, the existing Section 11.4.2 would need to be revised to
      explain all the possible results of NFS4ERR_MOVED including
      migration and a possible transparent transition in which the
      network address changes but the server does not.

   o  A revision of the existing Section 18.35 (dealing with
      EXCHANGE_ID) addressing the issues discussed in Section 5.3.1.

   o  A major replacement of the existing Section 11.7, entitled
      "Effecting File System Transitions", as discussed below.

   In addition, there is a set of smaller changes necessary:

   o  Update the existing Section 2.10.5 to clarify the proper response
      to server_owner changes, as described in our Section 5.1.2.

   o  Replacement of the existing Section 15.1.2.4 to reflect the fact
      that NFS4ERR_MOVED can occur when a file system is now accessible
      at a different network address. A possible replacement text might
      be:

         The file system that contains the current filehandle object is
         not accessible using the network address which has been used.
         It may have been relocated, migrated to another server, be
         accessible using another network address on the current server,
         or it may have never been present. The client may obtain the
         new file system location by obtaining the "fs_locations" or
         "fs_locations_info" attribute for the current filehandle. For
         further discussion, refer to Section 11.4.2.

   The replacement for the existing Section 11.7 would maintain most
   sections essentially as they are, only making minor changes to
   include server-trunking in the discussion.
   However, some cases involving more significant changes to existing
   sub-sections, and potential new sub-sections, are listed below:

   o  The existing Section 11.7.1 needs to be modified to refer
      explicitly to the previous discussion of trunking discovery.

      In addition, the term "multi-home single-server namespace", used
      nowhere else in [RFC5661], poses difficulties. From the
      description given it appears that the case being referred to is
      one in which two network addresses return server_owners with the
      same major_id and different minor_id values, making the network
      addresses server-trunkable without being session-trunkable.

      A better approach would be to refer to "server-trunking" as used
      elsewhere in this document and use the replacement for the
      existing Section 18.35 to identify clientid trunking as the means
      to adapt to network addresses which are server-trunkable without
      being session-trunkable and session trunking as the means to adapt
      to network addresses which are session-trunkable.

   o  The existing Section 11.7.2 needs to be better connected to
      trunking discovery. By calling these "transparent" transitions,
      it obscures the fact that some (or all) of the "transitions" it is
      discussing are not in fact transitions between servers or file
      systems but merely changes in the set of communication paths in
      use.

   o  The existing Section 11.7.2.1 needs to address more clearly the
      case of server-trunkable addresses which are not session-
      trunkable. As it is, it mentions the related concept of
      clustering, but only deals explicitly with the case in which two
      distinct servers share access to one or more file systems and does
      not mention the case in which the network addresses can be used to
      access a shared stateid space without being session-trunkable.

   o  The existing Section 11.7.2.2, while correct, needs to be part of
      a general re-organization since the characteristics it lists as
      necessary for a transparent transition will be of use in other
      contexts, particularly as they apply to Transparent State
      Migration as well. It makes sense to move these to a new
      sub-section within the equivalent of the existing Section 11.7.

   o  The existing Section 11.7.7 needs a major rework to deal with its
      basic assumption, that existing state can only be made available
      on the destination server if the source and destination
      co-operate in state management and maintain a common client id
      space. It is not clear how this can be done, other than for
      servers working together so as to provide clientid trunking, a
      case that is probably considered as a "transparent transition".
      The section needs to be modified to allow something along the
      lines of NFSv4.0-style Transparent State Migration with the
      details provided by a later section (see below).

      A related issue concerns the sentence, "In the case of migration,
      the servers involved in the migration of a file system SHOULD
      transfer all server state from the original to the new server."
      It is unclear why this is a "SHOULD", as the rest of the paragraph
      essentially tells the client that it needs to be prepared for the
      server not to do this. The equivalent is a "should" in [RFC7931],
      and there is no reason to add to confusion by making it a "SHOULD"
      in NFSv4.1. Also, there is no mention of the need to provide a
      fs-specific grace period in the cases in which Transparent State
      Migration is not made available.

   o  Adding a new section (at the level of the existing Section 11.7.7)
      about state transfer during migration.
      Although the phrase "Transparent State Migration" is well
      established in the context of NFSv4.0, the word "transparent"
      could cause confusion given the existing use of the phrase
      "transparent transitions". A possible title for the new section
      is "State Transfer during Migration".

   The new section would present the NFSv4.1-equivalent of Transparent
   State Migration as described in [RFC7931]. This would address the
   issues presented in Section 5.2 along the lines suggested in Sections
   5.3, 5.4, and 5.5.

5.7.  Potential Protocol Extensions

   There are a number of possibilities to provide additional facilities
   related to issues discussed in this document using the protocol
   extension mechanisms described in [RFC8178]. These facilities relate
   to the handling of multiple connection types.

   The possibility of additional connection types was not addressed in
   NFSv4.0, either in [RFC3530] or [RFC7530]. While the use of multiple
   connection types is allowed, facilities to determine the connection
   type to be used are sub-optimal and are expected to remain so.

   In the case of NFSv4.1, there are facilities to aid in the
   determination of connection types that can be used. However, such
   facilities are limited to the two connection types already defined
   and may have weaknesses in dealing with changes in the set of
   connection types to be used and in selecting connections to be used,
   particularly in clustered server environments, in which the set of
   potential trunked server endpoints can be large.

   In light of this situation, it appears that a number of potential
   extensions to NFSv4 might be considered, as provided for by
   [RFC8178]. Such extensions could take the form of additional
   OPTIONAL attributes.
   While these attributes would be part of NFSv4.2, the fact that there
   is no change in the set of REQUIRED features between NFSv4.1 and
   NFSv4.2 means that the upgrade path for clients and servers can be
   made relatively simple.

   The additional attributes sketched out below would provide a more
   complete way of addressing the possibility of trunking of a large set
   of server endpoints, of multiple connection types:

   o  A new fs-scoped attribute, fs_location_endpoints, could provide
      potential locations of a file system by using location entries
      specifying each potential endpoint, rather than specifying, as do
      fs_locations and fs_locations_info, the network address applicable
      to all potential endpoints.

   o  A new server-scoped attribute, server_endpoints, could provide a
      set of trunkable endpoints to be used to access the current
      server, together with additional performance-related information
      useful for endpoint selection.

   A fuller elaboration of these proposals would require the writing of
   one or more standards-track documents, assuming sufficient interest
   in proceeding along this route. Any such work would be separate from
   other work suggested to resolve existing protocol issues and will not
   be mentioned in Section 6.2.

6.  Evolution of Issue Handling

6.1.  History of this Document

   The contents of successive versions of this document have changed
   because new issues have been discovered, because there have been
   changes in our understanding of how these features should interact,
   and because some of the issues have been adequately addressed with
   regard to certain protocol versions.

   As a result, it may be helpful to understand the history of these
   issues, which is complicated because multiple NFSv4 protocols have
   been involved.

   This history can be summarized as follows:

   o  Initially, the focus was on the difficulties seen in NFSv4.0
      implementations of Transparent State Migration, and on identifying
      possible corrections to [RFC7530] that might address these issues.

      At this point, treatment of NFSv4.1 was minimal.

   o  As examination of the issues continued, it became clear that the
      use of the non-uniform client string model was a critical element
      of the problem and further work proceeded on that basis.

      During the period, treatment of NFSv4.1 was expanded but the fact
      that NFSv4.1 had existing facilities for trunking detection was
      taken as an indication that the problems would not be difficult to
      address.

   o  As work proceeded on a standards-track document addressing the
      NFSv4.0 issues, material that proposed changes to address the
      issues became less relevant, since the effective vehicle for
      addressing these issues became the standards-track document
      eventually published as [RFC7931].

      During this period, and subsequently, treatment of NFSv4.1
      remained essentially unchanged.

   o  With the publication of [RFC7931], material regarding fixes for
      NFSv4.0 became vestigial but the material was retained for a
      while together with a shift from proposing those changes to
      reporting that they had been made.

   o  Later, in response to experiences testing existing NFSv4.1
      implementations of migration, the focus of the document shifted
      decisively to NFSv4.1. As part of the analysis of migration
      within NFSv4.1, it was realized that issues related to the
      appearance of multiple addresses were fundamental to clearly
      describing how migration would work and that changes in the set of
      such addresses might or might not involve migration.

      At this point, discussion of NFSv4.0 issues was further limited.
      The issues seen were noted but the discussion of the resolution
      was limited to explaining that they had been addressed by the
      publication of [RFC7931].

   o  Finally, based on the results of work to provide NFSv4 with
      trunking discovery facilities, a decision was made that this work
      was most appropriately dealt with together with migration, for
      reasons noted previously.

      Since the trunking discovery facilities apply to all NFSv4 minor
      versions, work was needed to define those for NFSv4.0 as well,
      together with the necessary interactions with migration.

   Although there is a need for further working group discussion and
   review, it appears that the issues to be dealt with have been
   identified and that most work to address these issues needs to take
   place as part of the construction of one or more standards-track
   documents. See Section 6.2 for further information about possible
   approaches to providing the necessary documents.

6.2.  Further Work Needed

   The following table classifies issues in this area and indicates
   which are currently adequately addressed and where the protocol
   specifications need further correction or clarification. Where the
   topic is adequately addressed, a reference is given to the RFC
   providing support for the issue. In other cases, an area name
   (explained below) is given.

   +-------+-----------+----------+-----------+----------+-------------+
   | Vers. | Trunking  | Trunking | State     | Multiple | Interaction |
   |       | Detection | Disc.    | Migration | Conn.    | of Trunking |
   |       |           |          |           | Types    | and         |
   |       |           |          |           |          | Migration   |
   +-------+-----------+----------+-----------+----------+-------------+
   | v4.0  | [RFC7931] | TrDisc-0 | [RFC7931] | Mct-0    | Int-0       |
   | v4.1+ | [RFC5661] | TrDisc-1 | SM-1      | Mct-1    | Int-1       |
   +-------+-----------+----------+-----------+----------+-------------+

   The following table explains the work that needs to be done
   corresponding to each area name above.

   +----------+--------------------------------------------------------+
   | Area     | Description                                            |
   | Name     |                                                        |
   +----------+--------------------------------------------------------+
   | TrDisc-0 | Although it is possible for there to be multiple       |
   |          | location entries for a given file system, the          |
   |          | possibility of using these to enable trunking          |
   |          | discovery was not addressed in [RFC7530], most likely  |
   |          | because trunking was considered a problem to be        |
   |          | avoided (rather than a helpful feature) at that time.  |
   |          | This situation could have been addressed by the        |
   |          | publication of [RFC7931] but unfortunately that did    |
   |          | not happen.                                            |
   +----------+--------------------------------------------------------+
   | TrDisc-1 | Despite the fact that [RFC5661] provides a means of    |
   |          | trunking detection, trunking discovery was not         |
   |          | addressed. This problem was compounded by confusion    |
   |          | regarding multiple file system replicas arising from   |
   |          | the fact that multiple network addresses connected to  |
   |          | the same server were treated as if they were referring |
   |          | to distinct sets of replicas.                          |
   +----------+--------------------------------------------------------+
   | SM-1     | Unlike [RFC7530], which mishandled Transparent State   |
   |          | Migration because of confusion arising from the lack   |
   |          | of appropriate trunking support, [RFC5661] simply      |
   |          | neglected to provide any description of this feature.  |
   |          | It appears likely that confusion between the needs of  |
   |          | migration and those of dealing with shifts in          |
   |          | responsibility for clustered file system access had a  |
   |          | significant role in allowing this issue to be ignored. |
   |          | Rectifying this situation along the lines of [RFC7931] |
   |          | is complicated by the need to rewrite significant      |
   |          | pieces of the section about multi-server namespace to  |
   |          | address this confusion. Beyond this, the necessary     |
   |          | treatment will need to reflect changes required by the |
   |          | use of the sessions model and related changes in       |
   |          | NFSv4.1 and also address migration-related issues      |
   |          | raised by optional features such as pNFS and the       |
   |          | fs_locations_info attribute. In addition to            |
   |          | correcting the handling of Transparent State           |
   |          | Migration, work also needs to be done to address       |
   |          | migration-related issues in the handling of            |
   |          | RECLAIM_COMPLETE.                                      |
   +----------+--------------------------------------------------------+
   | Mct-0    | Even though protocol support for multiple connection   |
   |          | types is quite limited in NFSv4.0, there still are     |
   |          | multiple connection types specified and implemented.   |
   |          | As a result, some guidance has to be given to allow    |
   |          | interoperable implementations to be developed, and     |
   |          | used, without extensive user configuration effort.     |
   |          | This should include some treatment of situations in    |
   |          | which the set of connection types to be used to access |
   |          | a given file system changes, requiring appropriate     |
   |          | recovery from an NFS4ERR_MOVED error.                  |
   +----------+--------------------------------------------------------+
   | Mct-1    | Even though protocol support for multiple connection   |
   |          | types is more limited than one might like, there are   |
   |          | helpful facilities that can be used to simplify the    |
   |          | process of determining the connection type(s) to be    |
   |          | used. The proper use of the available facilities       |
   |          | needs to be clarified including examination of cases   |
   |          | in which the set of connection types to be used to     |
   |          | access a given file system changes, requiring          |
   |          | appropriate recovery from an NFS4ERR_MOVED error.      |
   +----------+--------------------------------------------------------+
   | Int-0    | The need to provide trunking-related information puts  |
   |          | additional focus on the issue of dealing with changes  |
   |          | in the value of location-related attributes. This      |
   |          | applies when trunking configurations change and at     |
   |          | other times as well. In addition, the existence of     |
   |          | multiple network addresses connected to the same       |
   |          | server requires clarification when migration and       |
   |          | replication features are used.                         |
   +----------+--------------------------------------------------------+
   | Int-1    | This requires similar handling to the case above.      |
   |          | However, further work is made necessary by the fact    |
   |          | that shifts between different sets of network          |
   |          | addresses are erroneously treated as instances of      |
   |          | migration in [RFC5661].                                |
   +----------+--------------------------------------------------------+

   There are a number of possible ways of packaging the necessary
   changes into RFCs.
   Some of these are impractical for various reasons:

   o  While it is possible to treat each area in its own RFC, writing
      seven RFCs would increase the work required and delay needed
      corrections to both versions. Further, it would result in a
      situation in which someone needing to understand the
      specification of NFS version 4.0 or 4.1 would need to be familiar
      with a large set of RFCs.

   o  One could have a document addressing all of the areas above. Such
      a document would update both [RFC7530] and [RFC5661]. That would
      result in a confusing document given how different the v4.0 and
      v4.1 protocols are, since most readers will want a clear
      description of one or the other.

   o  It is also possible to produce separate documents addressing
      TrDisc-*, SM-1, Mct-*, and Int-*. This would be subject to many
      of the difficulties of the two approaches above.

   The alternative, of organizing the changes by minor version, is being
   actively pursued by work on the following Standards Track working
   group documents:

   o  [I-D.ietf-nfsv4-mv0-trunking-update] addresses the issues within
      TrDisc-0 and Int-0 by providing updates to [RFC7530], the vast
      majority of which are within Section 8 of that document. Work to
      include Mct-0 needs to be added.

   o  [I-D.ietf-nfsv4-mv1-msns-update] addresses the issues within
      TrDisc-1, SM-1, and Int-1 by providing updates to [RFC5661], the
      vast majority of which are within Section 11 of that document.
      Work to include Mct-1 is underway.

   These two documents will require additional review and discussion
   before proceeding to publication as Proposed Standards, updating
   [RFC7530] and [RFC5661] respectively.

   If the working group decides to continue along this path, it may be
   desirable to consolidate the changes currently specified in these
   documents.
   Currently, these documents replace individual sub-sections of
   Section 8 (of [RFC7530]) or Section 11 (of [RFC5661]). While this
   is helpful in explaining what is changing and why, things might be
   different when the eventual RFC is published. At that point, it
   could be judged more important to have simply understood
   specifications of NFS versions 4.0 and 4.1. At that point, a full
   replacement of the affected section might be more desirable as the
   basis of the RFC to be published. Alternatively, that consolidation
   might be delayed and done later as part of publication of rfc7530bis
   and rfc5661bis documents.

7.  Security Considerations

   In general, the Security Considerations sections of existing
   specifications for NFS versions 4.0 and 4.1 provide recommendations
   for appropriate handling of requests obtaining location-related
   information. In particular, it is recommended that integrity
   protection be used when fetching location-related attributes:

   o  With regard to NFSv4.0, this is done in Section 8.6 of [RFC7931]
      which updates the Security Considerations section of [RFC7530].

   o  With regard to NFSv4.1, this is done in the Security
      Considerations section of [RFC5661].

   Despite this, however, there is a need for further changes in the
   Security Considerations with regard to both minor versions dealt with
   here. The following issues need to be addressed:

   o  Because of the potential use of DNS to convert server names to a
      set of server network addresses, such translations are subject to
      the same sorts of potential interference with trunking discovery
      that would occur when trunking discovery is provided using network
      addresses returned in the location-related attributes.

      To address this issue, specifications for both minor versions need
      to mention the issue and indicate that use of DNSSEC [RFC4033] is
      appropriate.
      When it is not available, the server should allow use of DNS for
      trunking discovery to be avoided by presenting network addresses
      in the location-related attributes, with these values subject to
      RPCSEC_GSS integrity protection.

   o  Use of RPCSEC_GSS ([RFC2203], [RFC5403], [RFC7861]) with
      integrity protection is RECOMMENDED and implementations are
      REQUIRED to provide support. However, the possibility that a
      particular client may be unable to use RPCSEC_GSS when accessing a
      particular server cannot be excluded. As a result, it is
      necessary to discuss how such situations affect trunking
      discovery, referral, replication, and migration.

   o  In the case of replication, referral, and migration, it is
      necessary to discuss how RPCSEC_GSS mutual authentication on the
      destination can be used to make sure that the network addresses
      provided by trunking discovery have not been interfered with and
      correspond to the server names provided by the location attributes
      on the server to which the client was directed.

8.  IANA Considerations

   This document does not require actions by IANA.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2203]  Eisler, M., Chiu, A., and L. Ling, "RPCSEC_GSS Protocol
              Specification", RFC 2203, DOI 10.17487/RFC2203, September
              1997.

   [RFC4033]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "DNS Security Introduction and Requirements",
              RFC 4033, DOI 10.17487/RFC4033, March 2005.

   [RFC5403]  Eisler, M., "RPCSEC_GSS Version 2", RFC 5403,
              DOI 10.17487/RFC5403, February 2009.

   [RFC5531]  Thurlow, R., "RPC: Remote Procedure Call Protocol
              Specification Version 2", RFC 5531, DOI 10.17487/RFC5531,
              May 2009.

   [RFC5661]  Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
              "Network File System (NFS) Version 4 Minor Version 1
              Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010.

   [RFC7530]  Haynes, T., Ed. and D. Noveck, Ed., "Network File System
              (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530,
              March 2015.

   [RFC7861]  Adamson, A. and N. Williams, "Remote Procedure Call (RPC)
              Security Version 3", RFC 7861, DOI 10.17487/RFC7861,
              November 2016.

   [RFC7931]  Noveck, D., Ed., Shivam, P., Lever, C., and B. Baker,
              "NFSv4.0 Migration: Specification Update", RFC 7931,
              DOI 10.17487/RFC7931, July 2016.

   [RFC8166]  Lever, C., Ed., Simpson, W., and T. Talpey, "Remote Direct
              Memory Access Transport for Remote Procedure Call Version
              1", RFC 8166, DOI 10.17487/RFC8166, June 2017.

   [RFC8178]  Noveck, D., "Rules for NFSv4 Extensions and Minor
              Versions", RFC 8178, DOI 10.17487/RFC8178, July 2017.

   [RFC8267]  Lever, C., "Network File System (NFS) Upper-Layer Binding
              to RPC-over-RDMA Version 1", RFC 8267,
              DOI 10.17487/RFC8267, October 2017.

9.2.  Informative References

   [I-D.ietf-nfsv4-mv0-trunking-update]
              Lever, C. and D. Noveck, "NFS version 4.0 Trunking
              Update", draft-ietf-nfsv4-mv0-trunking-update-01 (work in
              progress), July 2018.

   [I-D.ietf-nfsv4-mv1-msns-update]
              Noveck, D. and C. Lever, "NFSv4.1 Update for Multi-Server
              Namespace", draft-ietf-nfsv4-mv1-msns-update-01 (work in
              progress), June 2018.

   [RFC3530]  Shepler, S., Callaghan, B., Robinson, D., Thurlow, R.,
              Beame, C., Eisler, M., and D. Noveck, "Network File System
              (NFS) version 4 Protocol", RFC 3530, DOI 10.17487/RFC3530,
              April 2003.

Acknowledgments

   The editor and authors of this document gratefully acknowledge the
   contributions of Trond Myklebust of Primary Data, Robert Thurlow of
   Oracle, and Andy Adamson of NetApp. We also thank Tom Haynes of
   Primary Data and Spencer Shepler of Microsoft for their guidance and
   suggestions.

   Rick Macklem provided an analysis of the current description of
   RECLAIM_COMPLETE and information about its implementation for which
   we are grateful.

   Special thanks go to members of the Oracle Solaris NFS team,
   especially Rick Mesta and James Wahlig who were then part of that
   team, for their work implementing an NFSv4.0 migration prototype and
   identifying many of the issues documented here. Also, the work of
   Xuan Qi for Oracle using NFSv4.1 client and server prototypes was
   helpful.

Authors' Addresses

   David Noveck (editor)
   NetApp
   1601 Trapelo Road
   Waltham, MA 02451
   US

   Phone: +1 781 572 8038
   Email: davenoveck@gmail.com

   Piyush Shivam
   IBM Corporation
   11501 Burnet Road
   Austin, TX 78758
   US

   Email: piyush.shivam@ibm.com

   Charles Lever
   Oracle Corporation
   1015 Granger Avenue
   Ann Arbor, MI 48104
   US

   Phone: +1 248 614 5091
   Email: chuck.lever@oracle.com

   Bill Baker
   Oracle Corporation
   5300 Riata Park Ct.
   Austin, TX 78727
   US

   Phone: +1 512 401 1081
   Email: bill.baker@oracle.com