NFSv4                                                     D. Noveck, Ed.
Internet-Draft                                                       HPE
Intended status: Informational                                 P. Shivam
Expires: August 22, 2016                                        C. Lever
                                                                B. Baker
                                                                  ORACLE
                                                       February 19, 2016

 NFSv4 migration: Implementation experience and spec issues to resolve
                  draft-ietf-nfsv4-migration-issues-09

Abstract

   The migration feature of NFSv4 provides for moving responsibility for
   a single filesystem from one server to another, without disruption to
   clients. Recent implementation experience has shown problems in the
   existing specification for this feature.
   This document discusses the issues which have arisen and explores the
   options available for curing the issues. It also explains the choices
   made regarding updating the NFSv4.0 specification and those to be
   made with regard to the NFSv4.1 specification, in order to properly
   address migration.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 22, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions
   3.  NFSv4.0 Implementation Experience
     3.1.  Implementation issues
       3.1.1.  Failure to free migrated state on client reboot
       3.1.2.  Server reboots resulting in a confused lease situation
       3.1.3.  Client complexity issues
     3.2.  Sources of Protocol difficulties
       3.2.1.  Issues with nfs_client_id4 generation and use
       3.2.2.  Issues with lease proliferation
   4.  Issues to be resolved in NFSv4.0
     4.1.  Possible changes to nfs_client_id4 client-string
     4.2.  Possible changes to handle differing nfs_client_id4 string
           values
     4.3.  Possible changes to add a new operation
     4.4.  Other issues within migration-state sections
     4.5.  Issues within other sections
   5.  Proposed resolution of NFSv4.0 protocol difficulties
     5.1.  Proposed changes: nfs_client_id4 client-string
     5.2.  Proposed changes: merged (vs. synchronized) leases
     5.3.  Other proposed changes to migration-state sections
       5.3.1.  Proposed changes: Client ID migration
       5.3.2.  Proposed changes: Callback re-establishment
       5.3.3.  Proposed changes: NFS4ERR_LEASE_MOVED rework
     5.4.  Proposed changes to other sections
       5.4.1.  Proposed changes: callback update
       5.4.2.  Proposed changes: clientid4 handling
       5.4.3.  Proposed changes: NFS4ERR_CLID_INUSE
   6.  Issues for NFSv4.1
     6.1.  Addressing state merger in NFSv4.1
     6.2.  Addressing pNFS relationship with migration
     6.3.  Addressing server owner changes in NFSv4.1
   7.  Security Considerations
   8.  IANA Considerations
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document is in the informational category, and while the facts
   it reports may have normative implications, any such normative
   significance reflects the readers' preferences. For example, we may
   report that the reboot of a client with migrated state results in
   state not being promptly cleared and that this will prevent granting
   of conflicting lock requests at least for the lease time, which is a
   fact. While it is to be expected that client and server implementers
   will judge this to be a situation that is best avoided, the judgment
   as to how pressing this issue should be considered is a judgment for
   the reader, and eventually the nfsv4 working group to make.

   We do explore possible ways in which such issues can be avoided, with
   minimal negative effects, given that the working group has decided to
   address these issues, but the choice of exactly how to address these
   is best given effect in one or more standards-track documents and/or
   errata.

   This document focuses on NFSv4.0, since that is where the majority of
   implementation experience has been. Nevertheless, there is
   discussion of the implications of the NFSv4.0 experience for
   migration in NFSv4.1, as well as discussion of other issues with
   regard to the treatment of migration in NFSv4.1.

2.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   In the context of this informational document, these normative
   keywords will always occur in the context of a quotation, most often
   direct but sometimes indirect. The context will make it clear
   whether the quotation is from:

   o  The current definitive definition of the NFSv4.0 protocol
      [RFC7530].

   o  The current definitive definition of the NFSv4.1 protocol
      [RFC5661].

   o  A proposed or possible text to serve as a replacement for the
      current definitive document text. Sometimes, a number of possible
      alternative texts may be listed and benefits and detriments of
      each examined in turn.

3.  NFSv4.0 Implementation Experience

3.1.  Implementation issues

   Note that the examples below reflect current experience which arises
   from clients implementing the recommendation to use different
   nfs_client_id4 id strings for different server addresses, i.e. using
   what is later referred to herein as the "non-uniform client-string
   approach."

   This is simply because that is the experience implementers have had.
   The reader should not assume that in all cases, this practice is the
   source of the difficulty. It may be so in some cases but clearly it
   is not in all cases.

3.1.1.  Failure to free migrated state on client reboot

   The following sort of situation has proved troublesome:

   o  A client C establishes a clientid4 C1 with server ABC specifying
      an nfs_client_id4 with id string value "C-ABC" and boot verifier
      0x111.

   o  The client begins to access files in filesystem F on server ABC,
      resulting in generating stateids S1, S2, etc. under the lease for
      clientid C1. It may also access files on other filesystems on the
      same server.

   o  The filesystem is migrated from server ABC to server XYZ. When
      transparent state migration is in effect, stateids S1 and S2 and
      clientid4 C1 are now available for use by client C at server XYZ.

   o  Client C reboots and attempts to access data on server XYZ,
      whether in filesystem F or another. It does a SETCLIENTID with an
      nfs_client_id4 with id string value "C-XYZ" and boot verifier
      0x112. There is thus no occasion to free stateids S1 and S2 since
      they are associated with a different client name and so lease
      expiration is the only way that they can be gotten rid of.

   Note here that while it seems clear to us in this example that C-XYZ
   and C-ABC are from the same client, the server has no way to
   determine the structure of the "opaque" id string. In the protocol,
   it really is treated as opaque. Only the client knows which
   nfs_client_id4 values designate the same client on a different
   server.

3.1.2.  Server reboots resulting in a confused lease situation

   Further problems arise from scenarios like the following.

   o  Client C talks to server ABC using an nfs_client_id4 id string
      such as "C-ABC" and a boot verifier v1. As a result, a lease with
      clientid4 c.i is established: {v1, "C-ABC", c.i}.

   o  fs_a1 migrates from server ABC to server XYZ along with its state.
      Now server XYZ also has a lease: {v1, "C-ABC", c.i}.

   o  Server ABC reboots.

   o  Client C talks to server ABC using an nfs_client_id4 id string
      such as "C-ABC" and a boot verifier v1. As a result, a lease with
      clientid4 c.j is established: {v1, "C-ABC", c.j}.

   o  fs_a2 migrates from server ABC to server XYZ. Now server XYZ also
      has a lease: {v1, "C-ABC", c.j}.

   o  Now server XYZ has two leases that match {v1, "C-ABC", *}, when
      the protocol clearly assumes there can be only one.
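
   The sequence of events listed above can be traced with a small
   model. The following Python sketch is purely illustrative and is
   not protocol code: the Lease and Server classes, their method
   names, and the "ABC.0"/"ABC.1" clientid4 labels (standing in for
   c.i and c.j) are assumptions made for this example only. It models
   a lease as the triple {verifier, id string, clientid4} and shows
   the destination server ending up with two leases for the same
   verifier and id string.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class Lease:
    verifier: str    # client boot verifier
    id_string: str   # nfs_client_id4 id string, opaque to the server
    clientid: str    # stand-in for the clientid4 chosen by the issuing server


@dataclass
class Server:
    name: str
    leases: list = field(default_factory=list)
    # Hypothetical counter so that a clientid4 is never reissued after a reboot.
    _seq: count = field(default_factory=count, repr=False)

    def setclientid(self, verifier, id_string):
        """Establish a lease and return it with a newly assigned clientid4."""
        lease = Lease(verifier, id_string, f"{self.name}.{next(self._seq)}")
        self.leases.append(lease)
        return lease

    def reboot(self):
        """Drop all lease records held by this server."""
        self.leases.clear()

    def receive_migrated(self, lease):
        """Accept a lease transferred along with a migrated filesystem."""
        self.leases.append(lease)

    def matching(self, verifier, id_string):
        """Return every lease matching {verifier, id_string, *}."""
        return [l for l in self.leases
                if (l.verifier, l.id_string) == (verifier, id_string)]


abc, xyz = Server("ABC"), Server("XYZ")
first = abc.setclientid("v1", "C-ABC")    # {v1, "C-ABC", c.i}
xyz.receive_migrated(first)               # fs_a1 migrates with its state
abc.reboot()                              # server ABC reboots
second = abc.setclientid("v1", "C-ABC")   # {v1, "C-ABC", c.j}
xyz.receive_migrated(second)              # fs_a2 migrates with its state

dupes = xyz.matching("v1", "C-ABC")
print(len(dupes), [l.clientid for l in dupes])
```

   In this model, a lookup by verifier and id string on server XYZ
   returns two records, so a request that carries only the
   nfs_client_id4 cannot select one of them unambiguously.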

   Note that if the client used "C" (rather than "C-ABC") as the
   nfs_client_id4 id string, the exact same situation would arise.

   One of the first cases in which this sort of situation has resulted
   in difficulties is in connection with doing a SETCLIENTID for
   callback update.

   The SETCLIENTID for callback update only includes the nfs_client_id4,
   assuming there can only be one such with a given nfs_client_id4
   value. If there were multiple, confirmed client records with
   identical nfs_client_id4 id string values, there would be no way to
   map the callback update request to the correct client record. Apart
   from the migration handling specified in [RFC7530], such a situation
   cannot arise.

   One possible accommodation for this particular issue that has been
   used is to add a RENEW operation along with SETCLIENTID (on a
   callback update) to disambiguate the client.

   When the client updates the callback info to the destination, the
   client would, by convention, send a compound like this:

      { RENEW clientid4, SETCLIENTID nfs_client_id4,verf,cb }

   The presence of the clientid4 in the compound would allow the server
   to differentiate among the various leases that it knows of, all with
   the same nfs_client_id4 value.

   While this would be a reasonable patch for an isolated protocol
   weakness, interoperable clients and servers would require that the
   protocol truly be updated to allow such a situation, specifically
   that of multiple clientid4's with the same nfs_client_id4 value. The
   protocol is currently designed and implemented assuming this cannot
   happen. We need to either prevent the situation from happening, or
   fully adapt to the possibilities which can arise. See Section 4 for
   a discussion of such issues.

3.1.3.  Client complexity issues

   Consider the following situation:

   o  There is a set of clients C1 through Cn accessing servers S1
      through Sm.
      Each server manages some significant number of
      filesystems with the filesystem count L being significantly
      greater than m.

   o  Each client Cx will access a subset of the servers and so will
      have up to m clientids, which we will call Cxy for server Sy.

   o  Now assume that for load-balancing or other operational reasons,
      numbers of filesystems are migrated among the servers. As a
      result, each client-server pair will have up to m clientids and
      each client will have up to m**2 clientids. If we add the
      possibility of server reboot, the only bound on a client's
      clientid count is L.

   Now, instead of a clientid4 identifying a client-server pair, we have
   many more entities for the client to deal with. In addition, it
   isn't clear how new state is to be incorporated in this structure.

   The limitations of the migrated state (inability to be freed on
   reboot) would argue against adding more such state but trying to
   avoid that would run into its own difficulties. For example, a
   single lockowner string presented under two different clientids would
   appear as two different entities.

   Thus we have to choose between:

   o  indefinite prolongation of foreign clientids even after all
      transferred state is gone.

   o  having multiple requests for the same lockowner-string-named
      entity carried on in parallel by separate identically named
      lockowners under different clientid4's.

   o  adding serialization at the lock-owner string level, in addition
      to that at the lockowner level.

   In any case, we have gone (in adding migration as it was described)
   from a situation in which

   o  Each client has a single clientid4/lease for each server it talks
      to.

   o  Each client has a single nfs_client_id4 for each server it talks
      to.

   o  Every stateid can be mapped to an associated lease based on the
      server it was obtained from.

   to one in which

   o  Each client may have multiple clientid4's for a single server.

   o  For each stateid, the client must separately record the clientid4
      that it is assigned to, or it must manage separate "state blobs"
      for each fsid and map those to clientid4's.

   o  Before doing an operation that can result in a stateid, the client
      must either find a "state blob" based on fsid or create a new one,
      possibly with a new clientid4.

   o  There may be multiple clientid4's all connected to the same server
      and using the same nfs_client_id4.

   This sort of additional client complexity is troublesome and needs to
   be eliminated.

3.2.  Sources of Protocol difficulties

3.2.1.  Issues with nfs_client_id4 generation and use

   In the current definitive definition of the NFSv4.0 protocol,
   [RFC7530], the section entitled "Client ID" says:

      The second field, id is a variable length string that uniquely
      defines the client.

   There are two possible interpretations of the phrase "uniquely
   defines" in the above:

   o  The relation between strings and clients is a function from such
      strings to clients so that each string designates a single client.

   o  The relation between strings and clients is a bijection between
      such strings and clients so that each string designates a single
      client and each client is named by a single string.

   The first interpretation would make these client-strings like phone
   numbers (a single person can have several) while the second would
   make them like social security numbers.

   Debate about the possible meanings of "uniquely defines" in this
   context is quite possible but not very helpful. The following points
   should be noted though:

   o  The second interpretation is more consistent with the way
      "uniquely defines" is used elsewhere in the spec.

   o  The spec as now written intends the first interpretation (or is
      internally inconsistent).
      In fact, it recommends, although non-normatively, that a single
      client have at least as many client-strings as server addresses
      that it interacts with. It says, in the third bullet point
      regarding construction of the string (which we shall henceforth
      refer to as client-string-BP3):

         The string should be different for each server network address
         that the client accesses, rather than common to all server
         network addresses.

   o  If internode interactions are limited to those between a client
      and its servers, there is no occasion for servers to be concerned
      with the question of whether two client-strings designate the same
      client, so that there is no occasion for the difference in
      interpretation to matter.

   o  When transparent migration of client state occurs between two
      servers, it becomes important to determine when state on two
      different servers is for the same client or not, and this
      distinction becomes very important.

   Given the need for the server to be aware of client identity with
   regard to migrated state, either client-string construction rules
   will have to change or there will be a need to get around current
   issues, or perhaps a combination of these two will be required.
   Later sections will examine the options and propose a solution.

   One consideration that may indicate that this cannot remain exactly
   as it is today has to do with the fact that the current explanation
   for this behavior is not correct. In the current definitive
   definition of the NFSv4.0 protocol [RFC7530], the section entitled
   "Client ID" says:

      The reason is that it may not be possible for the client to tell
      if the same server is listening on multiple network addresses.
      If the client issues SETCLIENTID with the same id string to each
      network address of such a server, the server will think it is the
      same client, and each successive SETCLIENTID will cause the server
      to begin the process of removing the client's previous leased
      state.

   In point of fact, a "SETCLIENTID with the same id string" sent to
   multiple network addresses will be treated as all from the same
   client but will not "cause the server to begin the process of
   removing the client's previous leased state" unless the server
   believes it is a different instance of the same client, i.e. if the
   id string is the same and there is a different boot verifier. If the
   client does not reboot, the verifier should not change. If it does
   reboot, the verifier will change, and it is appropriate that the
   server "begin the process of removing the client's previous leased
   state".

   The situation of multiple SETCLIENTID requests received by a server
   on multiple network addresses is exactly the same, from the protocol
   design point of view, as when multiple (i.e. duplicate) SETCLIENTID
   requests are received by the server on a single network address. The
   same protocol mechanisms that prevent erroneous state deletion in the
   latter case prevent it in the former case. There is no reason for
   special handling of the multiple-network-appearance case, in this
   regard.

3.2.2.  Issues with lease proliferation

   It is often felt that lease proliferation is a consequence of the
   client-string construction issues, and it is certainly the case that
   the two are closely connected in that non-uniform client-strings make
   it impossible for the server to appropriately combine leases from the
   same client.

   However, even where the server could combine leases from the same
   client, it needs to be clear how and when it will do so, so that the
   client will be prepared.
   These issues will have to be addressed at various places in the
   spec.

   This could be enough only if we are prepared to do away with the
   "should" recommending non-uniform client-strings and replace it with
   a "should not" or even a "SHOULD NOT". Current client implementation
   patterns make this an unpalatable choice for use as a general
   solution, but it is reasonable to "RECOMMEND" this choice for a
   well-defined subset of clients. One alternative would be to create a
   way for the server to infer from client behavior which leases are
   held by the same client and use this information to do appropriate
   lease mergers. Prototyping and detailed specification work has shown
   that this could be done but the resulting complexity is such that a
   better choice is to "RECOMMEND" use of the uniform client-string
   approach for clients supporting the migration feature.

   Because of the discussion of client-string construction in [RFC7530],
   most existing clients implement the non-uniform client-string
   approach. As a result, existing servers may not have been tested
   with clients implementing uniform client-strings. As a consequence,
   care must be taken to preserve interoperability between clients
   capable of using uniform client-strings (UCS-capable clients) and
   servers that don't tolerate uniform client-strings for one reason or
   another.

4.  Issues to be resolved in NFSv4.0

4.1.  Possible changes to nfs_client_id4 client-string

   The fact that the reason given in client-string-BP3 is not valid
   makes the existing "should" insupportable. We can't do either of the
   following:

   o  Keep a reason we know is invalid.

   o  Keep saying "should" without giving a reason.

   What are often presented as reasons that motivate use of the
   non-uniform approach always turn out to be cases in which, if the
   uniform approach were used, the server will treat a client which
   accesses that server via two different IP addresses as part of a
   single client, as it in fact is.
   This may be disconcerting to a client unaware that the two IP
   addresses connect to the same server. This is not a reason to use
   the non-uniform approach but is better thought of as an illustration
   of the fact that those using the uniform approach need to be aware of
   the possibility of server trunking and its potential effect on server
   behavior.

   If it is possible to reliably infer the existence of trunking of
   server IP addresses from observed server behavior, use of the uniform
   approach would be more desirable, although compatibility issues would
   have to be dealt with.

   An alternative to having the client infer the existence of trunking
   of server IP addresses is to make this information available to the
   client directly. See Section 4.3 for details.

   It is always possible that a valid new reason will be found, but so
   far none has been proposed. Given the history, the burden of proof
   should be on those asserting the validity of a proposed new reason.

   So we will assume for now that the "should" will have to go. The
   question is what to replace it with.

   o  We can't say "MUST NOT", despite the problems this raises for
      migration since this is pretty late in the day for such a change.
      Many currently operating clients obey the existing "should".
      Similar considerations would apply for "SHOULD NOT" or "should
      not".

   o  Dropping client-string-BP3 entirely is a possibility but, given
      the context and history, it would just be a confusing version of
      "SHOULD NOT".

   o  Using "MAY" would clearly specify that both ways of doing this are
      valid choices for clients and that servers will have to deal with
      clients that make either choice.

   o  This might be modified by a "SHOULD" (or even a "MUST") for
      particular groups of clients.

   o  There will have to be some text explaining why a client might make
      either choice but, except for the particular cases referred to
      above, we will have to make sure that it is truly descriptive, and
      not slanted in either direction.

4.2.  Possible changes to handle differing nfs_client_id4 string values

   Given the difficulties caused by having different nfs_client_id4
   client-string values for the same client, we have two choices:

   o  Deprecate the existing treatment and basically say the client is
      on its own doing migration, if it follows it.

   o  Introduce a way of having the client provide client identity
      information to the server, if it can be done compatibly while
      staying within the bounds of v4.0.

4.3.  Possible changes to add a new operation

   It might be possible to return server-identity information to the
   client, just as is done in NFSv4.1 by the response to the EXCHANGE_ID
   operation. This could be done by a SETCLIENTID_PLUS optional
   operation, which acts like SETCLIENTID, except that it returns server
   identity information. Such information could be used by clients,
   making it possible for them to be aware of server trunking
   relationships, rather than having to infer them from server behavior.

   It has been generally thought that protocol extensions such as this
   are not appropriate in bis documents and other documents updating
   NFSv4 protocol definition RFCs. However, [NFSv4-vers] discusses
   means by which protocol extensions, similar to those allowed between
   minor versions, can be used to correct protocol mistakes.

   A decision to adopt this approach would require waiting for
   [NFSv4-vers] to become a Proposed Standard. In view of the time
   necessary for that to happen, this approach is not expected to be
   adopted in an RFC updating [RFC7530], such as [migr-v4.0-update].
   Still, it is worth keeping in mind, if implementers have difficulties
   inferring trunking relationships using the techniques discussed
   there.

4.4.  Other issues within migration-state sections

   There are a number of issues where the existing text is unclear
   and/or wrong and needs to be fixed in some way.

   o  Lack of clarity in the discussion of moving clientids (as well as
      stateids) as part of moving state for migration.

   o  The discussion of synchronized leases is wrong in that there is no
      way to determine (in the current spec) when leases are for the
      same client and also wrong in suggesting a benefit from leases
      synchronized at the point of transfer. What is needed is merger
      of leases, which is necessary to keep client complexity
      requirements from getting out of hand.

   o  Lack of clarity in the discussion of LEASE_MOVED handling,
      including failure to fully address situations in which transparent
      state migration did not occur.

4.5.  Issues within other sections

   There are a number of cases in which certain sections, not
   specifically related to migration, require additional clarification.
   This is generally because text that is clear in a context in which
   leases and clientids are created in one place and live there forever
   may need further refinement in the more dynamic environment that
   arises as part of migration.

   Some examples:

   o  Some people are under the impression that updating callback
      endpoint information for an existing client, as used during
      migration, may cause the destination server to free existing
      state. There need to be additions to clarify the situation.

   o  The handling of the sets of clientid4's maintained by each server
      needs to be clarified.
      In particular, the issue of how the client adapts to the
      presumably independent and uncoordinated clientid4 sets needs to
      be clearly addressed.

   o  Statements regarding handling of invalid clientid4's need to be
      clarified and/or refined in light of the possibilities that arise
      due to lease motion and merger.

   o  Confusion and lack of clarity about NFS4ERR_CLID_INUSE.

5.  Proposed resolution of NFSv4.0 protocol difficulties

   This section lists the changes which we believe are necessary to
   resolve the difficulties mentioned above. Such changes, along with
   other clarifications found to be desirable during drafting and
   review, are contained in [migr-v4.0-update].

5.1.  Proposed changes: nfs_client_id4 client-string

   We propose replacing client-string-BP3 with the following text:

      The string MAY be different for each server network address that
      the client accesses, rather than common to all server network
      addresses.

   In addition, given the importance of the issue of client identity and
   the fact that both client-string approaches are to be considered
   valid, a greatly expanded treatment of client identity is desirable.
   It should have the following major elements.

   o  It should fully describe the consequences of making the string
      different for each network address (the non-uniform client-string
      approach) and of making it the same for all network addresses (the
      uniform client-string approach).

   o  It should give helpful guidance about the factors that might
      affect client implementation choice between these approaches.

   o  It should describe the compatibility issues that might cause
      servers to be incompatible with the uniform approach and give
      guidance about dealing with these.

   o  It should describe how a client using the uniform approach might
      use server behavior to determine server address trunking patterns.

   o  It should present a clearer and more complete set of
      recommendations to guide client-string construction.

5.2.  Proposed changes: merged (vs. synchronized) leases

   In the current definitive definition of the NFSv4.0 protocol,
   [RFC7530], the section entitled "Migration and State" says:

      As part of the transfer of information between servers, leases
      would be transferred as well. The leases being transferred to the
      new server will typically have a different expiration time from
      those for the same client, previously on the old server. To
      maintain the property that all leases on a given server for a
      given client expire at the same time, the server should advance
      the expiration time to the later of the leases being transferred
      or the leases already present. This allows the client to maintain
      lease renewal of both classes without special effort:

   There are a number of problems with this and any resolution of our
   difficulties must address them somehow.

   o  The current v4.0 spec recommends that the client make it
      essentially impossible to determine when two leases are from "the
      same client".

   o  It is not appropriate to speak of "maintain[ing] the property that
      all leases on a given server for a given client expire at the same
      time", since this is not a property that holds even in the absence
      of migration. A server listening on multiple network addresses
      may have the same client appear as multiple clients with no way to
      recognize the client as the same.

   o  Even if the client identity issue could be resolved, advancing the
      lease time at the point of migration would not maintain the
      desired synchronization property. The leases would be
      synchronized until one of them was renewed, after which they would
      be unsynchronized again.

   To avoid client complexity, we need to have no more than one lease
   between a single client and a single server.
   This requires merging the leases, since synchronizing them at a
   single instant provides no lasting benefit.

   For the uniform approach, the destination server would simply merge
   leases as part of state transfer, since two leases with the same
   nfs_client_id4 values must be for the same client.

   We have made the following decisions regarding proposed normative
   statements for state merger.  They reflect the facts that we want to
   allow full migration support in the simplest way possible and that we
   can't say MUST since we have older clients and servers to deal with.

   o  Clients MAY use the uniform client-string approach and are well-
      advised to do so if they are concerned about getting good
      migration support.

   o  Servers SHOULD provide automatic lease merger during state
      migration so that clients using the uniform id approach get the
      support automatically.

   If servers obey the SHOULD and clients choose to adopt the uniform id
   approach, having more than a single lease for a given client-server
   pair will be a transient situation, cleaned up as part of adapting to
   use of migrated state.

   Since clients and servers will be a mixture of old and new, and
   because nothing is a MUST, we have to ensure that no combination will
   show worse behavior than is exhibited by current (i.e., old) clients
   and servers.

5.3.  Other proposed changes to migration-state sections

5.3.1.  Proposed changes: Client ID migration

   In the current definitive definition of the NFSv4.0 protocol
   [RFC7530], the section entitled "Migration and State" says:

      In the case of migration, the servers involved in the migration of
      a filesystem SHOULD transfer all server state from the original to
      the new server.  This must be done in a way that is transparent to
      the client.  This state transfer will ease the client's transition
      when a filesystem migration occurs.
      If the servers are successful in transferring all state, the
      client will continue to use stateids assigned by the original
      server.  Therefore the new server must recognize these stateids as
      valid.  This holds true for the client ID as well.  Since
      responsibility for an entire filesystem is transferred with a
      migration event, there is no possibility that conflicts will arise
      on the new server as a result of the transfer of locks.

   This poses some difficulties, mostly because the part about "client
   ID" is not clear:

   o  It isn't clear what part of the paragraph the "this" in the
      statement "this holds true ..." is meant to signify.

   o  The phrase "the client ID" is ambiguous, possibly indicating the
      clientid4 and possibly indicating the nfs_client_id4.

   o  If the text means to suggest that the same clientid4 must be used,
      the logic is not clear, since the issue is not the same as for
      stateids, of which there might be many.  Adapting to the change of
      a single clientid, as might happen as a part of lease migration,
      is relatively easy for the client.

   We have decided that it is best to address this issue as follows:

   o  Make it clear that both clientid4 and nfs_client_id4 (including
      both id string and boot verifier) are to be transferred.

   o  Indicate that the initial transfer will normally result in the
      same clientid4 after transfer, but that this is not guaranteed,
      since there may be a conflict with an existing clientid4 on the
      destination server and because lease merger can result in a change
      of the clientid4.

5.3.2.
Proposed changes: Callback re-establishment

   In the current definitive definition of the NFSv4.0 protocol
   [RFC7530], the section entitled "Migration and State" says:

      A client SHOULD re-establish new callback information with the new
      server as soon as possible, according to sequences described in
      sections "Operation 35: SETCLIENTID - Negotiate Client ID" and
      "Operation 36: SETCLIENTID_CONFIRM - Confirm Client ID".  This
      ensures that server operations are not blocked by the inability to
      recall delegations.

   The above will need to be fixed to reflect the possibility of lease
   merger.

5.3.3.  Proposed changes: NFS4ERR_LEASE_MOVED rework

   In the current definitive definition of the NFSv4.0 protocol
   [RFC7530], the section entitled "Notification of Migrated Lease"
   says:

      Upon receiving the NFS4ERR_LEASE_MOVED error, a client that
      supports filesystem migration MUST probe all filesystems from that
      server on which it holds open state.  Once the client has
      successfully probed all those filesystems which are migrated, the
      server MUST resume normal handling of stateful requests from that
      client.

   This text is unclear about what exactly probing is and what the
   interlock between client and server must be.  This has led to some
   worry about the scalability of the probing process.  Although the
   time required does scale linearly with the number of filesystems for
   which the client may hold state on a given server, the actual process
   can be done efficiently.

   To address these issues, we propose rewriting the above to be clearer
   and to give suggestions about how to do the required scanning
   efficiently.

5.4.  Proposed changes to other sections

5.4.1.
Proposed changes: callback update

   Some changes are necessary to reduce confusion about the process of
   callback information update and in particular to make it clear that
   no state is freed as a result:

   o  Make it clear that after migration there are confirmed entries for
      transferred clientid4/nfs_client_id4 pairs.

   o  Be explicit in the sections headed "otherwise," in the
      descriptions of SETCLIENTID and SETCLIENTID_CONFIRM, that these
      don't apply in the cases we are concerned about.

5.4.2.  Proposed changes: clientid4 handling

   To address both of the clientid4-related issues mentioned in
   Section 4.5, we propose replacing the last three paragraphs of the
   section entitled "Client ID" with the following:

      Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has
      successfully completed, the client uses the shorthand client
      identifier, of type clientid4, instead of the longer and less
      compact nfs_client_id4 structure.  This shorthand client
      identifier (a client ID) is assigned by the server and should be
      chosen so that it will not conflict with a client ID previously
      assigned by the same server.  This applies across server restarts
      or reboots.

      Distinct servers MAY assign clientid4's independently, and will
      generally do so.  Therefore, a client has to be prepared to deal
      with multiple instances of the same clientid4 value received on
      distinct IP addresses, denoting separate entities.  When trunking
      of server IP addresses is not a consideration, a client should
      keep track of (IP-address, clientid4) pairs, so that each pair is
      distinct.  In the face of possible trunking of server IP
      addresses, the client will use the receipt of the same clientid4
      from multiple IP-addresses as an indication that the two IP-
      addresses may be trunked and proceed to determine, from the
      observed server behavior, whether the two addresses are in fact
      trunked.
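
   As an illustration only (it is not part of the proposed replacement
   text), the bookkeeping described in the preceding paragraph might be
   sketched as follows.  The class and method names are hypothetical,
   the addresses are documentation examples, and the sketch only flags
   addresses that returned the same clientid4 value; confirming actual
   trunking from observed server behavior is left out.

```python
from collections import defaultdict


class ClientIdTable:
    """Hypothetical client-side record of clientid4 values per address."""

    def __init__(self):
        # server address -> clientid4 confirmed on that address
        self._by_address = {}
        # clientid4 value -> set of server addresses that returned it
        self._by_clientid = defaultdict(set)

    def record(self, address, clientid):
        """Store a confirmed clientid4 and return possible trunking peers.

        The returned addresses merely handed out the same clientid4
        value; whether they are really trunked must still be verified.
        """
        old = self._by_address.get(address)
        if old is not None and old != clientid:
            # The address now maps to a different clientid4.
            self._by_clientid[old].discard(address)
        candidates = sorted(self._by_clientid[clientid] - {address})
        self._by_address[address] = clientid
        self._by_clientid[clientid].add(address)
        return candidates

    def lookup(self, address):
        """Return the clientid4 to present on this address, or None."""
        return self._by_address.get(address)


table = ClientIdTable()
first = table.record("192.0.2.1", 0x1001)
second = table.record("192.0.2.2", 0x1001)    # same value, other address
third = table.record("198.51.100.7", 0x2002)
```

   Keying the primary map by address keeps each (IP-address, clientid4)
   pair distinct even when unrelated servers happen to assign the same
   clientid4 value.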

      When a clientid4 is presented to a server and that clientid4 is
      not recognized, the server will reject the request with the error
      NFS4ERR_STALE_CLIENTID.  This can occur for a number of reasons:

      *  A server reboot causing loss of the server's knowledge of the
         client.

      *  Client error sending an incorrect clientid4 or a valid
         clientid4 to the wrong server.

      *  Loss of lease state due to lease expiration.

      *  Client or server error causing the server to believe that the
         client has rebooted (i.e., receiving a SETCLIENTID with an
         nfs_client_id4 which has a matching id string and a non-
         matching boot verifier).

      *  Migration of all state under the associated lease causes its
         non-existence to be recognized on the source server.

      *  Merger of state under the associated lease with another lease
         under a different clientid causes the clientid4 serving as the
         source of the merge to cease being recognized on its server.

      In the event of a server reboot, or loss of lease state due to
      lease expiration, the client must obtain a new clientid4 by use of
      the SETCLIENTID operation and then proceed to any other necessary
      recovery for the server reboot case (see the section entitled
      "Server Failure and Recovery").  In cases of server or client
      error resulting in this error, use of SETCLIENTID to establish a
      new lease is desirable as well.

      In the last two cases listed above (migration and merger),
      different recovery procedures are required.  Note that in cases in
      which there is any uncertainty about which sort of handling is
      applicable, the distinguishing characteristic is that in reboot-
      like cases, the clientid4 and all associated stateids cease to
      exist, while in migration-related cases, the clientid4 ceases to
      exist while the stateids are still valid.
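
   As an illustration only (it is not part of the proposed replacement
   text), the case analysis above can be written down as a small
   sketch.  The enumeration names and both helper functions are
   hypothetical, and the sketch assumes the client has some way of
   learning whether stateids under the lease are still accepted.

```python
from enum import Enum, auto


class Cause(Enum):
    """Reasons listed above for receiving NFS4ERR_STALE_CLIENTID."""
    SERVER_REBOOT = auto()
    CLIENT_SENT_WRONG_ID = auto()
    LEASE_EXPIRED = auto()
    FALSE_REBOOT_DETECTED = auto()
    LEASE_MIGRATED = auto()
    LEASE_MERGED = auto()


# The last two causes call for different recovery procedures.
MIGRATION_RELATED = {Cause.LEASE_MIGRATED, Cause.LEASE_MERGED}


def needs_new_clientid(cause):
    """True when SETCLIENTID should be used to establish a new lease."""
    return cause not in MIGRATION_RELATED


def infer_case(stateids_still_valid):
    """Apply the distinguishing characteristic when the cause is unknown.

    Assumption: the caller has already determined, by some probe that is
    outside this sketch, whether stateids under the lease are accepted.
    """
    return "migration-related" if stateids_still_valid else "reboot-like"


reboot_action = needs_new_clientid(Cause.SERVER_REBOOT)
merge_action = needs_new_clientid(Cause.LEASE_MERGED)
unknown_case = infer_case(stateids_still_valid=True)
```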

      The client must also employ the SETCLIENTID operation when it
      receives a NFS4ERR_STALE_STATEID error using a stateid derived
      from its current clientid4, since this indicates a situation, such
      as server reboot, which has invalidated the existing clientid4 and
      associated stateids (see the section entitled "lock-owner" for
      details).

      See the detailed descriptions of SETCLIENTID and
      SETCLIENTID_CONFIRM for a complete specification of the
      operations.

5.4.3.  Proposed changes: NFS4ERR_CLID_INUSE

   It appears to be the intention that only a single principal be used
   for client establishment between any client-server pair.  However:

   o  There is no explicit statement to this effect.

   o  The error that indicates a principal conflict has a name which
      does not clarify this issue: NFS4ERR_CLID_INUSE.

   o  The definition of the error is also not very helpful: "The
      SETCLIENTID operation has found that a client id is already in use
      by another client".

   As a result, servers exist which reject a SETCLIENTID simply because
   there already exists a clientid for the same client, established
   using a different IP address.  Although this is generally understood
   to be erroneous, such servers still exist and the spec should make
   the correct behavior clear.

   Although the error name cannot be changed, the following changes
   should be made to avoid confusion:

   o  The definition of the error should be changed to read as follows:

         The SETCLIENTID operation has found that the specified
         nfs_client_id4 was previously presented with a different
         principal and that client instance currently holds an active
         lease.  A server MAY return this error if the same principal is
         used but a change in authentication flavor gives good reason to
         reject the new SETCLIENTID operation as not bona fide.

   o  In the description of SETCLIENTID, the phrase "then the server
      returns a NFS4ERR_CLID_INUSE error" should be expanded to read
      "then the server returns a NFS4ERR_CLID_INUSE error, since use of
      a single client with multiple principals is not allowed."

6.  Issues for NFSv4.1

   Because NFSv4.1 embraces the uniform client-string approach, as
   advised by Section 2.4 of [RFC5661], addressing migration issues is
   simpler.

   Nevertheless, there are some issues that will have to be addressed.
   Some examples:

   o  The other necessary part of addressing migration issues, providing
      for the server's merger of leases that relate to the same client,
      is not currently addressed by NFSv4.1, and changes need to be made
      to make it clear that state needs to be appropriately merged as
      part of migration, to avoid multiple clientids between a client-
      server pair.

   o  There needs to be some clarification of how migration, and
      particularly transparent state migration, should interact with
      pNFS layouts.

   o  The current discussion (in [RFC5661]) of the possibility of
      server_owner changes is incomplete and confusing.

   Discussion of how to resolve these issues will appear in the sections
   below.

6.1.  Addressing state merger in NFSv4.1

   The existing treatment of state transfer in [RFC5661] has similar
   problems to that in [RFC7530] in that it assumes that the state for
   multiple filesystems on different servers will not be merged so that
   it appears under a single common clientid.  We've already seen the
   reasons that this is a problem, with regard to NFSv4.0.
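
   The kind of merger at issue can be sketched as follows.  This is a
   minimal illustration under stated assumptions, not a description of
   any existing server: the class and method names are hypothetical,
   client identity is reduced to an (owner string, boot verifier) pair,
   the destination simply assigns its own clientid, and the case of a
   non-matching verifier (a client reboot) is not handled.

```python
from dataclasses import dataclass, field


@dataclass
class Lease:
    clientid: int                      # shorthand id chosen by this server
    owner: bytes                       # client-supplied identity string
    verifier: bytes                    # client boot verifier
    stateids: set = field(default_factory=set)


class DestinationServer:
    """Hypothetical destination server that keeps one lease per client."""

    def __init__(self):
        self._leases = {}              # (owner, verifier) -> Lease
        self._next_clientid = 1

    def accept_migrated_state(self, owner, verifier, stateids):
        """Install transferred state; merge if the client is already known."""
        key = (owner, verifier)
        lease = self._leases.get(key)
        if lease is None:
            # First state seen for this client: create its single lease.
            lease = Lease(self._next_clientid, owner, verifier)
            self._next_clientid += 1
            self._leases[key] = lease
        # Merge: all state for the client ends up under one clientid.
        lease.stateids |= set(stateids)
        return lease.clientid

    def lease_count(self):
        return len(self._leases)


server = DestinationServer()
id_fs1 = server.accept_migrated_state(b"client-A", b"boot-1", {"open-1"})
id_fs2 = server.accept_migrated_state(b"client-A", b"boot-1", {"lock-7"})
id_other = server.accept_migrated_state(b"client-B", b"boot-9", {"open-3"})
```

   In this sketch, state arriving for two filesystems used by the same
   client lands under a single clientid, so the client never has to
   renew more than one lease with the destination server.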

   Although we don't have the problems stemming from the non-uniform
   client-string approach, there are a number of complexities in the
   existing treatment of state management in the section entitled "Lock
   State and File System Transitions" in [RFC5661] that make this
   non-trivial to address:

   o  Migration is currently treated together with other sorts of
      filesystem transitions including transitioning between replicas
      without any NFS4ERR_MOVED errors.

   o  There is separate handling and discussion of the cases of matching
      and non-matching server scopes.

   o  In the case of matching server scopes, the text calls for an
      impossible degree of transparency.

   o  In the case of non-matching server scopes, the text does not
      mention transparent state migration at all, resulting in a
      functional regression from NFSv4.0.

6.2.  Addressing pNFS relationship with migration

   This is made difficult because, within the pNFS framework, migration
   might mean any of several things:

   o  Transfer of the MDS, leaving DS's alone.

      This would be minimally disruptive to those using layouts but
      would require the pNFS control protocol to support the DS being
      directed to a new MDS.

   o  Transfer of a DS, leaving everything else in place.

      Such a transfer can be handled without using migration at all.
      The server can recall/revoke layouts, as appropriate.

   o  Transfer of the filesystem to a new filesystem with both MDS and
      DS's moving.

      In such a transfer, an entirely different set of DS's will be at
      the target location.  There may even be no pNFS support on the
      destination filesystem at all.

   Migration needs to support both the first and last of these models.

6.3.  Addressing server owner changes in NFSv4.1

   Section 2.10.5 of [RFC5661] states the following:

      The client should be prepared for the possibility that
      eir_server_owner values may be different on subsequent EXCHANGE_ID
      requests made to the same network address, as a result of various
      sorts of reconfiguration events.  When this happens and the
      changes result in the invalidation of previously valid forms of
      trunking, the client should cease to use those forms, either by
      dropping connections or by adding sessions.  For a discussion of
      lock reclaim as it relates to such reconfiguration events, see
      Section 8.4.2.1.

   While this paragraph is literally true in that such reconfiguration
   events can happen and clients have to deal with them, it is confusing
   in that it can be read as suggesting that clients have to deal with
   them without disruption, which in general is impossible.

   A clearer alternative would be:

      It is always possible that, as a result of various sorts of
      reconfiguration events, eir_server_scope and eir_server_owner
      values may be different on subsequent EXCHANGE_ID requests made to
      the same network address.

      In most cases such reconfiguration events will be disruptive and
      indicate that an IP address formerly connected to one server is
      now connected to an entirely different one.

      Some guidelines on client handling of such situations follow:

      *  When eir_server_scope changes, the client has no assurance that
         any id's it obtained previously (e.g., file handles) can be
         validly used on the new server, and, even if the new server
         accepts them, there is no assurance that this is not due to
         accident.  Thus it is best to treat all such state as
         lost/stale, although a client may assume that the probability
         of inadvertent acceptance is low and treat this situation as
         within the next case.

      *  When eir_server_scope remains the same and
         eir_server_owner.so_major_id changes, the client can use
         filehandles it has and attempt reclaims.  It may find that
         these are now stale, but if NFS4ERR_STALE is not received, it
         can proceed to reclaim its opens.

      *  When eir_server_scope and eir_server_owner.so_major_id remain
         the same, the client has to use the now-current values of
         eir_server_owner.so_minor_id in deciding on appropriate forms
         of trunking.

7.  Security Considerations

   With regard to NFSv4.0, the Security Considerations section of
   [RFC7530] encourages clients to protect the integrity of the SECINFO
   operation and any GETATTR operation for the fs_locations attribute.
   A needed change is to include the operations SETCLIENTID/
   SETCLIENTID_CONFIRM as among those for which integrity protection is
   recommended.  A migration recovery event can use any or all of these
   operations.

   With regard to NFSv4.1, the Security Considerations section of
   [RFC5661] takes proper care of migration-related issues.  No change
   is needed.

8.  IANA Considerations

   This document does not require actions by IANA.

9.  Acknowledgements

   The editor and authors of this document gratefully acknowledge the
   contributions of Trond Myklebust of NetApp and Robert Thurlow of
   Oracle.  We also thank Tom Haynes of NetApp and Spencer Shepler of
   Microsoft for their guidance and suggestions.

   Special thanks go to members of the Oracle Solaris NFS team,
   especially Rick Mesta and James Wahlig, for their work implementing
   an NFSv4.0 migration prototype and identifying many of the issues
   documented here.

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5661]  Shepler, S., Ed., Eisler, M., Ed., and D.
              Noveck, Ed., "Network File System (NFS) Version 4 Minor
              Version 1 Protocol", RFC 5661, DOI 10.17487/RFC5661,
              January 2010.

   [RFC7530]  Haynes, T., Ed. and D. Noveck, Ed., "Network File System
              (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530,
              March 2015.

10.2.  Informative References

   [migr-v4.0-update]
              Noveck, D., Ed., Shivam, P., Lever, C., and B. Baker,
              "NFSv4.0 migration: Specification Update", 2016.

              Work in progress.

   [NFSv4-vers]
              Noveck, D., "NFSv4 Version Management", 2016.

              Work in progress.

Authors' Addresses

   David Noveck (editor)
   Hewlett Packard Enterprise
   165 Dascomb Road
   Andover, MA  01810
   US

   Phone: +1 978 474 2011
   Email: davenoveck@gmail.com

   Piyush Shivam
   Oracle Corporation
   5300 Riata Park Ct.
   Austin, TX  78727
   US

   Phone: +1 512 401 1019
   Email: piyush.shivam@oracle.com

   Charles Lever
   Oracle Corporation
   1015 Granger Avenue
   Ann Arbor, MI  48104
   US

   Phone: +1 248 614 5091
   Email: chuck.lever@oracle.com

   Bill Baker
   Oracle Corporation
   5300 Riata Park Ct.
   Austin, TX  78727
   US

   Phone: +1 512 401 1081
   Email: bill.baker@oracle.com