NFSv4                                                     D. Noveck, Ed.
Internet-Draft
Intended status: Informational                                 P. Shivam
Expires: August 12, 2017                                        C. Lever
                                                                B. Baker
                                                                  ORACLE
                                                        February 8, 2017


   NFSv4 migration: Implementation Experience and Specification Issues
                  draft-ietf-nfsv4-migration-issues-11

Abstract

   The migration feature of NFSv4 provides for moving responsibility for
   a single filesystem from one server to another, without disruption to
   clients.  Implementation experience has shown problems in
   specification for this feature in RFC7530.
   This document explains the choices made to address these issues by
   updating the NFSv4.0 specification in RFC7931 and those to be made
   with regard to the NFSv4.1 specification, in order to properly
   address migration.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 12, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions
   3.  NFSv4.0 Implementation Experience
     3.1.  Implementation Issues
       3.1.1.  Failure to Free Migrated State on Client Reboot
       3.1.2.  Server Reboots Resulting in a Confused Lease Situation
       3.1.3.  Client Complexity Issues
     3.2.  Sources of Protocol Difficulties
       3.2.1.  Issues with nfs_client_id4 Generation and Use
       3.2.2.  Issues with Lease Proliferation
   4.  Resolution of NFSv4.0 Protocol Difficulties
     4.1.  Changes Regarding nfs_client_id4 Client-string
     4.2.  Changes Regarding Merged (vs. Synchronized) Leases
     4.3.  Other Changes to Migration-state Sections
       4.3.1.  Changes Regarding Client ID Migration
       4.3.2.  Changes Regarding Callback Re-establishment
       4.3.3.  NFS4ERR_LEASE_MOVED Rework
     4.4.  Changes to Other Sections
       4.4.1.  Need for Additional Changes
       4.4.2.  Callback Update
       4.4.3.  clientid4 Handling
       4.4.4.  Handling of NFS4ERR_CLID_INUSE
   5.  Issues for NFSv4.1
     5.1.  Addressing state merger in NFSv4.1
     5.2.  Addressing pNFS relationship with migration
     5.3.  Addressing server owner changes in NFSv4.1
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgements
   9.  Normative References
   Authors' Addresses

1.  Introduction

   This document is in the informational category, and while the facts
   it reports may have normative implications, any such normative
   significance reflects the readers' preferences.  For example, we may
   report that the reboot of a client with migrated state results in
   state not being promptly cleared and that this will prevent granting
   of conflicting lock requests at least for the lease time, which is a
   fact.  While it is to be expected that client and server implementers
   will judge this to be a situation that is best avoided, the judgment
   as to how pressing this issue should be considered is one for the
   reader, and eventually the nfsv4 working group, to make.

   We do explore possible ways in which such issues can be avoided, with
   minimal negative effects, given that the working group has decided to
   address these issues, but the choice of exactly how to address these
   is best given effect in one or more standards-track documents and/or
   errata.

   This document focuses on NFSv4.0, since that is where the majority of
   implementation experience has been.  Nevertheless, there is
   discussion of the implications of the NFSv4.0 experience for
   migration in NFSv4.1, as well as discussion of other issues with
   regard to the treatment of migration in NFSv4.1.

2.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   In the context of this informational document, these normative
   keywords will always occur in the context of a quotation, most often
   direct but sometimes indirect.  The context will make it clear
   whether the quotation is from:

   o  The previously current definitive definition of the NFSv4.0
      protocol [RFC7530].

   o  The current definitive definition of the NFSv4.1 protocol
      [RFC5661].

   o  A proposed or possible text to serve as a replacement for the
      current or previous definitive document text.  Sometimes, a number
      of possible alternative texts may be listed and benefits and
      detriments of each examined in turn.

3.  NFSv4.0 Implementation Experience

3.1.  Implementation Issues

   Note that the examples below reflect current experience which arises
   from clients implementing the recommendation to use different
   nfs_client_id4 id strings for different server addresses, i.e. using
   what is later referred to herein as the "non-uniform client-string
   approach."

   This is simply because that is the experience implementers have had.
   The reader should not assume that in all cases, this practice is the
   source of the difficulty.  It may be so in some cases but clearly it
   is not in all cases.

3.1.1.  Failure to Free Migrated State on Client Reboot

   The following sort of situation has proved troublesome:

   o  A client C establishes a clientid4 C1 with server ABC specifying
      an nfs_client_id4 with id string value "C-ABC" and boot verifier
      0x111.

   o  The client begins to access files in filesystem F on server ABC,
      resulting in generating stateids S1, S2, etc. under the lease for
      clientid C1.  It may also access files on other filesystems on the
      same server.

   o  The filesystem is migrated from server ABC to server XYZ.  When
      transparent state migration is in effect, stateids S1 and S2 and
      clientid4 C1 are now available for use by client C at server XYZ.

   o  Client C reboots and attempts to access data on server XYZ,
      whether in filesystem F or another.  It does a SETCLIENTID with an
      nfs_client_id4 with id string value "C-XYZ" and boot verifier
      0x112.  There is thus no occasion to free stateids S1 and S2 since
      they are associated with a different client name and so lease
      expiration is the only way that they can be gotten rid of.
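
   The scenario above can be modelled in a few lines.  What follows is
   only a toy sketch: the ToyServer class, its methods, and the Lease
   record are hypothetical names invented for illustration, and the
   reboot detection is reduced to "same id string, different boot
   verifier".  It is not an implementation of SETCLIENTID.

```python
from dataclasses import dataclass, field


@dataclass
class Lease:
    """One confirmed client record as a server might keep it (hypothetical)."""
    id_string: str
    verifier: int
    stateids: set = field(default_factory=set)


class ToyServer:
    """Toy model of reboot detection at SETCLIENTID; not the real protocol."""

    def __init__(self):
        self.leases = {}  # id string -> Lease

    def setclientid(self, id_string, verifier):
        old = self.leases.get(id_string)
        if old is not None and old.verifier != verifier:
            # Same id string, new boot verifier: the client rebooted,
            # so its previous leased state can be discarded at once.
            del self.leases[id_string]
            old = None
        if old is None:
            self.leases[id_string] = Lease(id_string, verifier)
        return self.leases[id_string]

    def receive_migrated(self, lease):
        # State transferred along with a migrated filesystem.
        self.leases[lease.id_string] = lease

    def all_stateids(self):
        return set().union(*(l.stateids for l in self.leases.values()))


def stale_state_after_reboot(old_id, new_id):
    """Return the stateids still held on XYZ after client C reboots."""
    xyz = ToyServer()
    xyz.receive_migrated(Lease(old_id, 0x111, {"S1", "S2"}))
    xyz.setclientid(new_id, 0x112)  # first contact after the reboot
    return xyz.all_stateids()


# Different id strings per server: S1 and S2 linger until lease expiry.
non_uniform = stale_state_after_reboot("C-ABC", "C-XYZ")
# One id string for all servers: the reboot is detected and state freed.
uniform = stale_state_after_reboot("C", "C")
print(sorted(non_uniform), sorted(uniform))
```

   In this model the server frees S1 and S2 only when the rebooted
   client presents the id string under which the migrated state was
   recorded.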

   Note here that while it seems clear to us in this example that C-XYZ
   and C-ABC are from the same client, the server has no way to
   determine the structure of the "opaque" id string.  In the protocol,
   it really is treated as opaque.  Only the client knows which
   nfs_client_id4 values designate the same client on a different
   server.

3.1.2.  Server Reboots Resulting in a Confused Lease Situation

   Further problems arise from scenarios like the following.

   o  Client C talks to server ABC using an nfs_client_id4 id string
      such as "C-ABC" and a boot verifier v1.  As a result, a lease with
      clientid4 c.i is established: {v1, "C-ABC", c.i}.

   o  fs_a1 migrates from server ABC to server XYZ along with its state.
      Now server XYZ also has a lease: {v1, "C-ABC", c.i}.

   o  Server ABC reboots.

   o  Client C talks to server ABC using an nfs_client_id4 id string
      such as "C-ABC" and a boot verifier v1.  As a result, a lease with
      clientid4 c.j is established: {v1, "C-ABC", c.j}.

   o  fs_a2 migrates from server ABC to server XYZ.  Now server XYZ also
      has a lease: {v1, "C-ABC", c.j}.

   o  Now server XYZ has two leases that match {v1, "C-ABC", *}, when
      the protocol clearly assumes there can be only one.

   Note that if the client used "C" (rather than "C-ABC") as the
   nfs_client_id4 id string, the exact same situation would arise.

   One of the first cases in which this sort of situation has resulted
   in difficulties is in connection with doing a SETCLIENTID for
   callback update.

   The SETCLIENTID for callback update only includes the nfs_client_id4,
   assuming there can only be one such with a given nfs_client_id4
   value.  If there were multiple, confirmed client records with
   identical nfs_client_id4 id string values, there would be no way to
   map the callback update request to the correct client record.  Apart
   from the migration handling specified in [RFC7530], such a situation
   cannot arise.
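
   The ambiguity can be shown with a small sketch that replays the
   bullet list above against a plain list of lease triples.  The Lease
   tuple and the find_records helper are hypothetical names chosen for
   illustration; a real server's record lookup will look different.

```python
from collections import namedtuple

# (boot verifier, id string, clientid4), as in the {v1, "C-ABC", c.i} triples.
Lease = namedtuple("Lease", "verifier id_string clientid4")


def find_records(leases, verifier, id_string):
    """Return every confirmed record matching an nfs_client_id4.

    A callback-update SETCLIENTID carries only the nfs_client_id4, so
    this lookup is all a server has to go on.
    """
    return [l for l in leases
            if l.verifier == verifier and l.id_string == id_string]


xyz_leases = []

# fs_a1 migrates from ABC to XYZ together with lease c.i.
xyz_leases.append(Lease("v1", "C-ABC", "c.i"))

# ABC reboots; the client (which did not reboot) re-establishes its
# lease with the same verifier and id string and is assigned c.j.
# fs_a2 then migrates to XYZ together with that lease.
xyz_leases.append(Lease("v1", "C-ABC", "c.j"))

matches = find_records(xyz_leases, "v1", "C-ABC")
ambiguous = len(matches) > 1
print([m.clientid4 for m in matches], ambiguous)
```

   With two matching records, the lookup cannot say which clientid4 a
   callback update is meant for.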

3.1.3.  Client Complexity Issues

   Consider the following situation:

   o  There is a set of clients C1 through Cn accessing servers S1
      through Sm.  Each server manages some significant number of
      filesystems with the filesystem count L being significantly
      greater than m.

   o  Each client Cx will access a subset of the servers and so will
      have up to m clientids, which we will call Cxy for server Sy.

   o  Now assume that for load-balancing or other operational reasons,
      numbers of filesystems are migrated among the servers.  As a
      result, each client-server pair will have up to m clientids and
      each client will have up to m**2 clientids.  If we add the
      possibility of server reboot, the only bound on a client's
      clientid count is L.

   Now, instead of a clientid4 identifying a client-server pair, we have
   many more entities for the client to deal with.  In addition, it
   isn't clear how new state is to be incorporated in this structure.

   The limitations of the migrated state (inability to be freed on
   reboot) would argue against adding more such state but trying to
   avoid that would run into its own difficulties.  For example, a
   single lockowner string presented under two different clientids would
   appear as two different entities.

   Thus we have to choose between:

   o  Indefinite prolongation of foreign clientids even after all
      transferred state is gone.

   o  Having multiple requests for the same lockowner-string-named
      entity carried on in parallel by separate identically named
      lockowners under different clientid4's.

   o  Adding serialization at the lock-owner string level, in addition
      to that at the lockowner level.

   In any case, we have gone (in adding migration as it was described)
   from a situation in which

   o  Each client has a single clientid4/lease for each server it talks
      to.

   o  Each client has a single nfs_client_id4 for each server it talks
      to.

   o  Every stateid can be mapped to an associated lease based on the
      server it was obtained from.

   to one in which

   o  Each client may have multiple clientid4's for a single server.

   o  For each stateid, the client must separately record the clientid4
      that it is assigned to, or it must manage separate "state blobs"
      for each fsid and map those to clientid4's.

   o  Before doing an operation that can result in a stateid, the client
      must either find a "state blob" based on fsid or create a new one,
      possibly with a new clientid4.

   o  There may be multiple clientid4's all connected to the same server
      and using the same nfs_client_id4.

   This sort of additional client complexity is troublesome and needs to
   be eliminated.

3.2.  Sources of Protocol Difficulties

3.2.1.  Issues with nfs_client_id4 Generation and Use

   In [RFC7530], the section entitled "Client ID" says:

      The second field, id is a variable length string that uniquely
      defines the client.

   There are two possible interpretations of the phrase "uniquely
   defines" in the above:

   o  The relation between strings and clients is a function from such
      strings to clients so that each string designates a single client.

   o  The relation between strings and clients is a bijection between
      such strings and clients so that each string designates a single
      client and each client is named by a single string.

   The first interpretation would make these client-strings like phone
   numbers (a single person can have several) while the second would
   make them like social security numbers.

   Debate about the possible meanings of "uniquely defines" in this
   context is quite possible but not very helpful.  The following points
   should be noted though:

   o  The second interpretation is more consistent with the way
      "uniquely defines" is used elsewhere in the spec.

   o  The spec as now written intends the first interpretation (or is
      internally inconsistent).  In fact, it recommends, although
      non-normatively, that a single client have at least as many
      client-strings as server addresses that it interacts with.  It
      says, in the third bullet point regarding construction of the
      string (which we shall henceforth refer to as client-string-BP3):

         The string should be different for each server network address
         that the client accesses, rather than common to all server
         network addresses.

   o  If internode interactions are limited to those between a client
      and its servers, there is no occasion for servers to be concerned
      with the question of whether two client-strings designate the same
      client, so that there is no occasion for the difference in
      interpretation to matter.

   o  When transparent migration of client state occurs between two
      servers, it becomes important to determine whether state on two
      different servers is for the same client, and the difference in
      interpretation then matters a great deal.

   Given the need for the server to be aware of client identity with
   regard to migrated state, either client-string construction rules
   will have to change or there will be a need to get around current
   issues, or perhaps a combination of these two will be required.
   Later sections will examine the options and propose a solution.

   One consideration that may indicate that this cannot remain exactly
   as it has been derives from the fact that the current explanation for
   this behavior is not correct.  In [RFC7530], the section entitled
   "Client ID" says:

      The reason is that it may not be possible for the client to tell
      if the same server is listening on multiple network addresses.
      If the client issues SETCLIENTID with the same id string to each
      network address of such a server, the server will think it is the
      same client, and each successive SETCLIENTID will cause the server
      to begin the process of removing the client's previous leased
      state.

   In point of fact, a "SETCLIENTID with the same id string" sent to
   multiple network addresses will be treated as all from the same
   client but will not "cause the server to begin the process of
   removing the client's previous leased state" unless the server
   believes it is a different instance of the same client, i.e. if the
   id string is the same and there is a different boot verifier.  If the
   client does not reboot, the verifier should not change.  If it does
   reboot, the verifier will change, and it is appropriate that the
   server "begin the process of removing the client's previous leased
   state."

   The situation of multiple SETCLIENTID requests received by a server
   on multiple network addresses is exactly the same, from the protocol
   design point of view, as when multiple (i.e. duplicate) SETCLIENTID
   requests are received by the server on a single network address.  The
   same protocol mechanisms that prevent erroneous state deletion in the
   latter case prevent it in the former case.  There is no reason for
   special handling of the multiple-network-appearance case, in this
   regard.

3.2.2.  Issues with Lease Proliferation

   Lease proliferation is often felt to be a consequence of the
   client-string construction issues, and it is certainly the case that
   the two are closely connected in that non-uniform client-strings make
   it impossible for the server to appropriately combine leases from the
   same client.

   However, even where the server could combine leases from the same
   client, it needs to be clear how and when it will do so, so that the
   client will be prepared.
   These issues will have to be addressed at various places in the
   protocol specification.

   This could be enough only if we are prepared to do away with the
   "should" recommending non-uniform client-strings and replace it with
   a "should not" or even a "SHOULD NOT".  Current client implementation
   patterns make this an unpalatable choice for use as a general
   solution, but it is reasonable to "RECOMMEND" this choice for a
   well-defined subset of clients.  One alternative would be to create a
   way for the server to infer from client behavior which leases are
   held by the same client and use this information to do appropriate
   lease mergers.  Prototyping and detailed specification work have
   shown that this could be done but the resulting complexity is such
   that a better choice is to "RECOMMEND" use of the uniform
   client-string approach for clients supporting the migration feature.

   Because of the discussion of client-string construction in [RFC7530],
   most existing clients implement the non-uniform client-string
   approach.  As a result, existing servers may not have been tested
   with clients implementing uniform client-strings.  As a consequence,
   care must be taken to preserve interoperability between clients
   capable of using uniform client-strings (UCS-capable clients) and
   servers that don't tolerate uniform client-strings for one reason or
   another.

4.  Resolution of NFSv4.0 Protocol Difficulties

   This section lists the changes that were necessary to resolve the
   difficulties mentioned above.  Such changes, along with other
   clarifications found to be desirable during drafting and review, are
   contained in [RFC7931].

4.1.  Changes Regarding nfs_client_id4 Client-string

   It was decided to replace client-string-BP3 with the following text:

      The string MAY be different for each server network address that
      the client accesses, rather than common to all server network
      addresses.
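
   The practical difference between the two approaches that this text
   permits can be sketched as follows.  The two helper functions and
   the "name/address" string layout are assumptions made purely for
   illustration; the protocol treats the id string as opaque and does
   not prescribe any particular construction.

```python
def non_uniform_client_string(client_name, server_addr):
    """One id string per server network address (client-string-BP3)."""
    return f"{client_name}/{server_addr}"


def uniform_client_string(client_name):
    """One id string for every server address the client contacts."""
    return client_name


# Documentation-range addresses standing in for three server addresses.
servers = ["192.0.2.10", "192.0.2.20", "198.51.100.7"]

non_uniform = {non_uniform_client_string("client-c", s) for s in servers}
uniform = {uniform_client_string("client-c") for _ in servers}

# A destination server can match migrated state to an existing lease
# only when the id strings are identical.
print(len(non_uniform), len(uniform))
```

   Under the non-uniform approach the three servers see three unrelated
   strings, while under the uniform approach they all see the same one.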

   In addition, given the importance of the issue of client identity and
   the fact that both client-string approaches are to be considered
   valid, a greatly expanded treatment of client identity was desirable.
   It had the following major elements.

   o  Fully describing the consequences of making the string different
      for each network address (the non-uniform client-string approach)
      and of making it the same for all network addresses (the uniform
      client-string approach).

   o  Giving helpful guidance about the factors that might affect client
      implementation choice between these approaches.

   o  Describing the compatibility issues that might cause servers to be
      incompatible with the uniform approach and giving guidance about
      dealing with these.

   o  Describing how a client using the uniform approach might use
      server behavior to determine server address trunking patterns.

   o  Presenting a clearer and more complete set of recommendations to
      guide client-string construction.

4.2.  Changes Regarding Merged (vs. Synchronized) Leases

   In [RFC7530], the section entitled "Migration and State" says:

      As part of the transfer of information between servers, leases
      would be transferred as well.  The leases being transferred to the
      new server will typically have a different expiration time from
      those for the same client, previously on the old server.  To
      maintain the property that all leases on a given server for a
      given client expire at the same time, the server should advance
      the expiration time to the later of the leases being transferred
      or the leases already present.  This allows the client to maintain
      lease renewal of both classes without special effort:

   There are a number of problems with this and any resolution of our
   difficulties must address them somehow.

   o  [RFC7530] recommends that the client make it essentially
      impossible to determine when two leases are from "the same
      client".

   o  It is not appropriate to speak of "maintain[ing] the property that
      all leases on a given server for a given client expire at the same
      time", since this is not a property that holds even in the absence
      of migration.  A server listening on multiple network addresses
      may have the same client appear as multiple clients with no way to
      recognize the client as the same.

   o  Even if the client identity issue could be resolved, advancing the
      lease time at the point of migration would not maintain the
      desired synchronization property.  The leases would be
      synchronized until one of them was renewed, after which they would
      be unsynchronized again.

   To avoid client complexity, we need to have no more than one lease
   between a single client and a single server.  This requires merger of
   leases since there is no real help from synchronizing them at a
   single instant.

   For the uniform approach, the destination server would simply merge
   leases as part of state transfer, since two leases with the same
   nfs_client_id4 values must be for the same client.

   We have made the following decisions as far as proposed normative
   statements regarding state merger.  They reflect the facts that we
   want to allow full migration support in the simplest way possible and
   that we can't say MUST since we have older clients and servers to
   deal with.

   o  Clients MAY use the uniform client-string approach and are
      well-advised to do so if they are concerned about getting good
      migration support.

   o  Servers SHOULD provide automatic lease merger during state
      migration so that clients using the uniform id approach get the
      support automatically.
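
   A destination server's automatic lease merger might look roughly
   like the sketch below.  Everything here is an assumption made for
   illustration: the Lease record, the receive_migrated_lease function,
   the table keyed by (id string, boot verifier), and in particular the
   choice to keep the later of the two expiration times, which simply
   mirrors the wording quoted from [RFC7530] above.

```python
from dataclasses import dataclass, field


@dataclass
class Lease:
    id_string: str
    verifier: int
    clientid4: int
    expires: float
    stateids: set = field(default_factory=set)


def receive_migrated_lease(existing, incoming):
    """Merge an incoming lease into the destination server's table.

    `existing` maps (id_string, verifier) -> Lease.  Returns the lease
    that holds the transferred state afterwards.
    """
    key = (incoming.id_string, incoming.verifier)
    target = existing.get(key)
    if target is None:
        # No lease for this client yet: adopt the transferred one.
        existing[key] = incoming
        return incoming
    # Same nfs_client_id4, hence same client: fold the state into the
    # existing lease.  The incoming clientid4 ceases to exist here.
    target.stateids |= incoming.stateids
    target.expires = max(target.expires, incoming.expires)
    return target


# The client (uniform id string "C") already has lease 7 on the
# destination; lease 3 arrives with a migrated filesystem.
table = {("C", 0x111): Lease("C", 0x111, clientid4=7, expires=100.0,
                             stateids={"S9"})}
moved = Lease("C", 0x111, clientid4=3, expires=130.0, stateids={"S1", "S2"})

merged = receive_migrated_lease(table, moved)
print(len(table), merged.clientid4, sorted(merged.stateids), merged.expires)
```

   After the merge there is one lease for the client-server pair, and a
   single renewal covers both the old and the transferred state.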

   If servers obey the SHOULD and clients choose to adopt the uniform id
   approach, having more than a single lease for a given client-server
   pair will be a transient situation, cleaned up as part of adapting to
   use of migrated state.

   Since clients and servers will be a mixture of old and new, and
   because nothing is a MUST, we have to ensure that no combination will
   show worse behavior than is exhibited by current (i.e. old) clients
   and servers.

4.3.  Other Changes to Migration-state Sections

4.3.1.  Changes Regarding Client ID Migration

   In [RFC7530], the section entitled "Migration and State" says:

      In the case of migration, the servers involved in the migration of
      a filesystem SHOULD transfer all server state from the original to
      the new server.  This must be done in a way that is transparent to
      the client.  This state transfer will ease the client's transition
      when a filesystem migration occurs.  If the servers are successful
      in transferring all state, the client will continue to use
      stateids assigned by the original server.  Therefore the new
      server must recognize these stateids as valid.  This holds true
      for the client ID as well.  Since responsibility for an entire
      filesystem is transferred with a migration event, there is no
      possibility that conflicts will arise on the new server as a
      result of the transfer of locks.

   This poses some difficulties, mostly because the part about "client
   ID" is not clear:

   o  It isn't clear what part of the paragraph the "this" in the
      statement "this holds true ..." is meant to signify.

   o  The phrase "the client ID" is ambiguous, possibly indicating the
      clientid4 and possibly indicating the nfs_client_id4.

   o  If the text means to suggest that the same clientid4 must be used,
      the logic is not clear since the issue is not the same as for
      stateids of which there might be many.
      Adapting to the change of a single clientid, as might happen as a
      part of lease migration, is relatively easy for the client.

   We have decided that it is best to address this issue as follows:

   o  Make it clear that both clientid4 and nfs_client_id4 (including
      both id string and boot verifier) are to be transferred.

   o  Indicate that the initial transfer will result in the same
      clientid4 after transfer but that this is not guaranteed, since
      there may be a conflict with an existing clientid4 on the
      destination server and because lease merger can result in a change
      of the clientid4.

4.3.2.  Changes Regarding Callback Re-establishment

   In [RFC7530], the section entitled "Migration and State" says:

      A client SHOULD re-establish new callback information with the new
      server as soon as possible, according to sequences described in
      sections "Operation 35: SETCLIENTID - Negotiate Client ID" and
      "Operation 36: SETCLIENTID_CONFIRM - Confirm Client ID".  This
      ensures that server operations are not blocked by the inability to
      recall delegations.

   The above will need to be fixed to reflect the possibility of merging
   of leases.

4.3.3.  NFS4ERR_LEASE_MOVED Rework

   In [RFC7530], the section entitled "Notification of Migrated Lease"
   says:

      Upon receiving the NFS4ERR_LEASE_MOVED error, a client that
      supports filesystem migration MUST probe all filesystems from that
      server on which it holds open state.  Once the client has
      successfully probed all those filesystems which are migrated, the
      server MUST resume normal handling of stateful requests from that
      client.

   There is a lack of clarity that is prompted by ambiguity about what
   exactly probing is and what the interlock between client and server
   must be.
   This has led to some worry about the scalability of the probing
   process, and although the time required does scale linearly with the
   number of filesystems that the client may have state for with respect
   to a given server, the actual process can be done efficiently.

   To address these issues, the text above had to be rewritten to be
   more clear and to give suggestions about how to do the required
   scanning efficiently.

4.4.  Changes to Other Sections

4.4.1.  Need for Additional Changes

   There are a number of cases in which certain sections, not
   specifically related to migration, require additional clarification.
   This is generally because text that is clear in a context in which
   leases and clientids are created in one place and live there forever
   may need further refinement in the more dynamic environment that
   arises as part of migration.

   Some examples:

   o  Some people are under the impression that updating callback
      endpoint information for an existing client, as used during
      migration, may cause the destination server to free existing
      state.  There need to be additions to clarify the situation.

   o  The handling of the sets of clientid4's maintained by each server
      needs to be clarified.  In particular, the issue of how the client
      adapts to the presumably independent and uncoordinated clientid4
      sets needs to be clearly addressed.

   o  Statements regarding handling of invalid clientid4's need to be
      clarified and/or refined in light of the possibilities that arise
      due to lease motion and merger.

   o  Confusion and lack of clarity about NFS4ERR_CLID_INUSE need to be
      resolved.

4.4.2.  Callback Update

   Some changes are necessary to reduce confusion about the process of
   callback information update and in particular to make it clear that
   no state is freed as a result:

   o  Make it clear that after migration there are confirmed entries for
      transferred clientid4/nfs_client_id4 pairs.

   o  Be explicit in the sections headed "otherwise," in the
      descriptions of SETCLIENTID and SETCLIENTID_CONFIRM, that these
      don't apply in the cases we are concerned about.

4.4.3.  clientid4 Handling

   To address both of the clientid4-related issues mentioned in
   Section 4.4.1, it was necessary to replace the last three paragraphs
   of the section entitled "Client ID" with the following:

      Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has
      successfully completed, the client uses the shorthand client
      identifier, of type clientid4, instead of the longer and less
      compact nfs_client_id4 structure.  This shorthand client
      identifier (a client ID) is assigned by the server and should be
      chosen so that it will not conflict with a client ID previously
      assigned by the same server.  This applies across server restarts
      or reboots.

      Distinct servers MAY assign clientid4's independently, and will
      generally do so.  Therefore, a client has to be prepared to deal
      with multiple instances of the same clientid4 value received on
      distinct IP addresses, denoting separate entities.  When trunking
      of server IP addresses is not a consideration, a client should
      keep track of (IP-address, clientid4) pairs, so that each pair is
      distinct.  In the face of possible trunking of server IP
      addresses, the client will use the receipt of the same clientid4
      from multiple IP-addresses, as an indication that the two
      IP-addresses may be trunked and proceed to determine, from the
      observed server behavior whether the two addresses are in fact
      trunked.

      When a clientid4 is presented to a server and that clientid4 is
      not recognized, the server will reject the request with the error
      NFS4ERR_STALE_CLIENTID.
      This can occur for a number of reasons:

      *  A server reboot causing loss of the server's knowledge of the
         client.

      *  Client error sending an incorrect clientid4 or a valid
         clientid4 to the wrong server.

      *  Loss of lease state due to lease expiration.

      *  Client or server error causing the server to believe that the
         client has rebooted (i.e. receiving a SETCLIENTID with an
         nfs_client_id4 which has a matching id string and a
         non-matching boot verifier).

      *  Migration of all state under the associated lease causes its
         non-existence to be recognized on the source server.

      *  Merger of state under the associated lease with another lease
         under a different clientid causes the clientid4 serving as the
         source of the merge to cease being recognized on its server.

      In the event of a server reboot, or loss of lease state due to
      lease expiration, the client must obtain a new clientid4 by use of
      the SETCLIENTID operation and then proceed to any other necessary
      recovery for the server reboot case (See the section entitled
      "Server Failure and Recovery").  In cases of server or client
      error resulting in this error, use of SETCLIENTID to establish a
      new lease is desirable as well.

      In the last two cases, different recovery procedures are required.
      Note that in cases in which there is any uncertainty about which
      sort of handling is applicable, the distinguishing characteristic
      is that in reboot-like cases, the clientid4 and all associated
      stateids cease to exist while in migration-related cases, the
      clientid4 ceases to exist while the stateids are still valid.
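
   As an illustration only, the distinguishing characteristic described
   above might be sketched on the client side as follows.  The names
   RecoveryKind, classify_stale_clientid, and first_recovery_step are
   hypothetical and are not taken from any NFSv4 specification or
   implementation.  How the client learns whether its stateids are
   still valid is assumed to be decided elsewhere, and the label
   returned for the migration-related case is a placeholder, since the
   text above says only that different recovery procedures are
   required.

```python
from enum import Enum


class RecoveryKind(Enum):
    """Hypothetical labels for the two sorts of handling."""
    REBOOT_LIKE = "reboot-like"
    MIGRATION_RELATED = "migration-related"


def classify_stale_clientid(stateids_still_valid: bool) -> RecoveryKind:
    """Classify an NFS4ERR_STALE_CLIENTID event.

    Assumption: the caller already knows whether stateids derived from
    the stale clientid4 are still accepted.
    """
    if stateids_still_valid:
        # The clientid4 ceased to exist but the stateids did not.
        return RecoveryKind.MIGRATION_RELATED
    # The clientid4 and all associated stateids ceased to exist.
    return RecoveryKind.REBOOT_LIKE


def first_recovery_step(kind: RecoveryKind) -> str:
    """Return a label for the first recovery action (illustrative only)."""
    if kind is RecoveryKind.REBOOT_LIKE:
        # A new clientid4 is obtained by use of SETCLIENTID.
        return "SETCLIENTID"
    # Placeholder: the migration-related procedure is not modeled here.
    return "MIGRATION_RECOVERY"
```

   For example, first_recovery_step(classify_stale_clientid(False))
   selects the SETCLIENTID path, matching the reboot-like case.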

      The client must also employ the SETCLIENTID operation when it
      receives an NFS4ERR_STALE_STATEID error using a stateid derived
      from its current clientid4, since this indicates a situation, such
      as server reboot, which has invalidated the existing clientid4 and
      associated stateids (see the section entitled "lock-owner" for
      details).

      See the detailed descriptions of SETCLIENTID and
      SETCLIENTID_CONFIRM for a complete specification of the
      operations.

4.4.4.  Handling of NFS4ERR_CLID_INUSE

   It appears to be the intention that only a single principal be used
   for client establishment between any client-server pair.  However:

   o  There is no explicit statement to this effect.

   o  The error that indicates a principal conflict has a name which
      does not clarify this issue: NFS4ERR_CLID_INUSE.

   o  The definition of the error is also not very helpful: "The
      SETCLIENTID operation has found that a client id is already in use
      by another client".

   As a result, servers exist which reject a SETCLIENTID simply because
   there already exists a clientid for the same client, established
   using a different IP address.  Although this is generally understood
   to be erroneous, such servers still exist and the spec should make
   the correct behavior clear.

   Although the error name cannot be changed, the following changes
   should be made to avoid confusion:

   o  The definition of the error should be changed to read as follows:

         The SETCLIENTID operation has found that the specified
         nfs_client_id4 was previously presented with a different
         principal and that client instance currently holds an active
         lease.  A server MAY return this error if the same principal is
         used but a change in authentication flavor gives good reason to
         reject the new SETCLIENTID operation as not bona fide.

   o  In the description of SETCLIENTID, the phrase "then the server
      returns a NFS4ERR_CLID_INUSE error" should be expanded to read
      "then the server returns a NFS4ERR_CLID_INUSE error, since use of
      a single client with multiple principals is not allowed."

5.  Issues for NFSv4.1

   Because NFSv4.1 embraces the uniform client-string approach, as
   advised by Section 2.4 of [RFC5661], addressing migration issues is
   simpler than it is for NFSv4.0.

   Nevertheless, there are some issues that will have to be addressed.
   Some examples:

   o  The other necessary part of addressing migration issues, providing
      for the server's merger of leases that relate to the same client,
      is not currently addressed by NFSv4.1.  Changes need to be made
      to make it clear that state needs to be appropriately merged as
      part of migration, to avoid multiple clientids between a
      client-server pair.

   o  There needs to be some clarification of how migration, and
      particularly transparent state migration, should interact with
      pNFS layouts.

   o  The current discussion (in [RFC5661]) of the possibility of
      server_owner changes is incomplete and confusing.

   Discussion of how to resolve these issues will appear in the sections
   below.

5.1.  Addressing state merger in NFSv4.1

   The existing treatment of state transfer in [RFC5661] has similar
   problems to that in [RFC7530] in that it assumes that the state for
   multiple filesystems on different servers will not be merged so
   that it appears under a single common clientid.  We've already seen
   the reasons that this is a problem, with regard to NFSv4.0.
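
   The kind of merger at issue can be pictured with a minimal sketch,
   offered under stated assumptions rather than as a description of any
   server.  The Lease and DestinationServer classes, the
   accept_migrated_state method, and the use of a client owner string
   as the lookup key are all hypothetical.  Real lease state (stateids,
   sessions, callback information) is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Lease:
    """Hypothetical lease record held by a destination server."""
    clientid: int
    # Maps a filesystem identifier to the state items held on it.
    state_by_fs: Dict[str, List[str]] = field(default_factory=dict)


class DestinationServer:
    """Illustrative model: at most one lease per client owner string."""

    def __init__(self) -> None:
        self._leases: Dict[str, Lease] = {}
        self._next_clientid = 1

    def accept_migrated_state(self, client_owner: str, fsid: str,
                              state_items: List[str]) -> int:
        """Attach incoming state to the client's lease; return its clientid."""
        lease = self._leases.get(client_owner)
        if lease is None:
            # First state seen for this client: create a new lease.
            lease = Lease(clientid=self._next_clientid)
            self._next_clientid += 1
            self._leases[client_owner] = lease
        # Merge under the existing lease instead of creating a second
        # clientid for the same client-server pair.
        lease.state_by_fs.setdefault(fsid, []).extend(state_items)
        return lease.clientid
```

   In this model, state arriving for two different filesystems on
   behalf of the same client owner ends up under one clientid, while a
   different client owner receives a separate one.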

   Although we don't have the problems stemming from the non-uniform
   client-string approach, there are a number of complexities in the
   existing treatment of state management in the section entitled "Lock
   State and File System Transitions" in [RFC5661] that make the
   state-merger issue non-trivial to address:

   o  Migration is currently treated together with other sorts of
      filesystem transitions including transitioning between replicas
      without any NFS4ERR_MOVED errors.

   o  There is separate handling and discussion of the cases of matching
      and non-matching server scopes.

   o  In the case of matching server scopes, the text calls for an
      impossible degree of transparency.

   o  In the case of non-matching server scopes, the text does not
      mention transparent state migration at all, resulting in a
      functional regression from NFSv4.0.

5.2.  Addressing pNFS relationship with migration

   Clarifying the relationship between pNFS and migration is made
   difficult because, within the pNFS framework, migration might mean
   any of several things:

   o  Transfer of the metadata server (MDS), leaving the data servers
      (DS's) alone.

      This would be minimally disruptive to those using layouts but
      would require the pNFS control protocol to support the DS being
      directed to a new MDS.

   o  Transfer of a DS, leaving everything else in place.

      Such a transfer can be handled without using migration at all.
      The server can recall/revoke layouts, as appropriate.

   o  Transfer of the filesystem to a new filesystem with both MDS and
      DS's moving.

      In such a transfer, an entirely different set of DS's will be at
      the target location.  There may even be no pNFS support on the
      destination filesystem at all.

   Migration needs to support both the first and last of these models.

5.3.  Addressing server owner changes in NFSv4.1

   Section 2.10.5 of [RFC5661] states the following:

      The client should be prepared for the possibility that
      eir_server_owner values may be different on subsequent EXCHANGE_ID
      requests made to the same network address, as a result of various
      sorts of reconfiguration events.  When this happens and the
      changes result in the invalidation of previously valid forms of
      trunking, the client should cease to use those forms, either by
      dropping connections or by adding sessions.  For a discussion of
      lock reclaim as it relates to such reconfiguration events, see
      Section 8.4.2.1.

   While this paragraph is literally true in that such reconfiguration
   events can happen and clients have to deal with them, it is confusing
   in that it can be read as suggesting that clients have to deal with
   them without disruption, which in general is impossible.

   A clearer alternative would be:

      It is always possible that, as a result of various sorts of
      reconfiguration events, eir_server_scope and eir_server_owner
      values may be different on subsequent EXCHANGE_ID requests made to
      the same network address.

      In most cases such reconfiguration events will be disruptive and
      indicate that an IP address formerly connected to one server is
      now connected to an entirely different one.

      Some guidelines on client handling of such situations follow:

      *  When eir_server_scope changes, the client has no assurance that
         any id's it obtained previously (e.g. file handles) can be
         validly used on the new server, and, even if the new server
         accepts them, there is no assurance that this is not due to
         accident.  Thus it is best to treat all such state as
         lost/stale although a client may assume that the probability
         of inadvertent acceptance is low and treat this situation as
         within the next case.

      *  When eir_server_scope remains the same and
         eir_server_owner.so_major_id changes, the client can use
         filehandles it has and attempt reclaims.
         It may find that these are now stale but if NFS4ERR_STALE is
         not received, it can proceed to reclaim its opens.

      *  When eir_server_scope and eir_server_owner.so_major_id remain
         the same, the client has to use the now-current values of
         eir_server_owner.so_minor_id in deciding on appropriate forms
         of trunking.

6.  Security Considerations

   With regard to NFSv4.0, the Security Considerations section of
   [RFC7530] encourages clients to protect the integrity of the SECINFO
   operation and any GETATTR operation for the fs_locations attribute.
   A needed change is to include the operations
   SETCLIENTID/SETCLIENTID_CONFIRM as among those for which integrity
   protection is recommended.  A migration recovery event can use any
   or all of these operations.

   With regard to NFSv4.1, the Security Considerations section of
   [RFC5661] takes proper care of migration-related issues.  No change
   is needed.

7.  IANA Considerations

   This document does not require actions by IANA.

8.  Acknowledgements

   The editor and authors of this document gratefully acknowledge the
   contributions of Trond Myklebust of NetApp and Robert Thurlow of
   Oracle.  We also thank Tom Haynes of NetApp and Spencer Shepler of
   Microsoft for their guidance and suggestions.

   Special thanks go to members of the Oracle Solaris NFS team,
   especially Rick Mesta and James Wahlig, for their work implementing
   an NFSv4.0 migration prototype and identifying many of the issues
   documented here.

9.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5661]  Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
              "Network File System (NFS) Version 4 Minor Version 1
              Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010.

   [RFC7530]  Haynes, T., Ed. and D.
              Noveck, Ed., "Network File System (NFS) Version 4
              Protocol", RFC 7530, DOI 10.17487/RFC7530, March 2015.

   [RFC7931]  Noveck, D., Ed., Shivam, P., Lever, C., and B. Baker,
              "NFSv4.0 Migration: Specification Update", RFC 7931,
              DOI 10.17487/RFC7931, July 2016.

Authors' Addresses

   David Noveck (editor)
   26 Locust Avenue
   Lexington, MA  02421
   US

   Phone: +1 781 572 8038
   Email: davenoveck@gmail.com

   Piyush Shivam
   Oracle Corporation
   5300 Riata Park Ct.
   Austin, TX  78727
   US

   Phone: +1 512 401 1019
   Email: piyush.shivam@oracle.com

   Charles Lever
   Oracle Corporation
   1015 Granger Avenue
   Ann Arbor, MI  48104
   US

   Phone: +1 248 614 5091
   Email: chuck.lever@oracle.com

   Bill Baker
   Oracle Corporation
   5300 Riata Park Ct.
   Austin, TX  78727
   US

   Phone: +1 512 401 1081
   Email: bill.baker@oracle.com