idnits 2.17.1

draft-ietf-nfsv4-rfc3010bis-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 11
     longer pages, the longest (page 47) being 75 lines

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC1094], [RFC1813]), which
     it shouldn't.  Please replace those with straight textual mentions of
     the documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 583 has weird spacing: '...ned int uin...'

  == Line 587 has weird spacing: '...d hyper uint6...'

  == Line 647 has weird spacing: '...8string typ...'

  == Line 727 has weird spacing: '...8string ser...'

  == Line 790 has weird spacing: '...ned int cb_pr...'

  == (38 more instances...)

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'SHOULD not' in this paragraph:

     The filehandle in the NFS protocol is a per server unique identifier
     for a file system object. The contents of the filehandle are opaque to
     the client.
     Therefore, the server is responsible for translating the filehandle to
     an internal representation of the file system object. Since the
     filehandle is the client's reference to an object and the client may
     cache this reference, the server SHOULD not reuse a filehandle for
     another file system object. If the server needs to reuse a filehandle
     value, the time elapsed before reuse SHOULD be large enough such that
     it is unlikely the client has a cached copy of the reused filehandle
     value. Note that a client may cache a filehandle for a very long time.
     For example, a client may cache NFS data to local storage as a method
     to expand its effective cache size and as a means to survive client
     restarts. Therefore, the lifetime of a cached filehandle may be
     extended.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     The reader may be wondering why there are three FH4_VOL* bits and why
     FH4_VOLATILE_ANY is exclusive of FH4_VOL_MIGRATION and FH4_VOL_RENAME.
     If the a filehandle is normally persistent but cannot persist across a
     file set migration, then the presence of the FH4_VOL_MIGRATION or
     FH4_VOL_RENAME tells the client that it can treat the file handle as
     persistent for purposes of maintaining a file name to file handle
     cache, except for the specific event described by the bit. However,
     FH4_VOLATILE_ANY tells the client that it should not maintain such a
     cache for unopened files. A server MUST not present FH4_VOLATILE_ANY
     with FH4_VOL_MIGRATION or FH4_VOL_RENAME as this will lead to
     confusion. FH4_VOLATILE_ANY implies that the file handle will expire
     upon migration or rename, in addition to other events.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.
     If you have contacted all the original authors and they are all willing
     to grant the BCP78 rights to the IETF Trust, then this is fine, and you
     can ignore this comment.  If not, you may need to add the pre-RFC5378
     disclaimer.  (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (July 2002) is 7955 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'RFC1094' on line 10524 looks like a reference
  -- Missing reference section? 'RFC1813' on line 10542 looks like a reference
  -- Missing reference section? 'RFC1831' on line 10548 looks like a reference
  -- Missing reference section? 'RFC1832' on line 10554 looks like a reference
  -- Missing reference section? 'RFC2203' on line 10596 looks like a reference
  -- Missing reference section? 'RFC1964' on line 899 looks like a reference
  -- Missing reference section? 'RFC2847' on line 10627 looks like a reference
  -- Missing reference section? 'RFC2078' on line 10584 looks like a reference
  -- Missing reference section? '12' on line 9351 looks like a reference
  -- Missing reference section? 'RFC1700' on line 10536 looks like a reference
  -- Missing reference section? 'RFC1833' on line 10560 looks like a reference
  -- Missing reference section? 'RFC2581' on line 862 looks like a reference
  -- Missing reference section? 'Floyd' on line 10469 looks like a reference
  -- Missing reference section? 'RFC2623' on line 10614 looks like a reference
  -- Missing reference section? 'RFC2025' on line 10566 looks like a reference
  -- Missing reference section?
     'RFC2054' on line 10572 looks like a reference
  -- Missing reference section? 'RFC2055' on line 10578 looks like a reference
  -- Missing reference section? 'RFC2624' on line 10621 looks like a reference
  -- Missing reference section? 'RFC1345' on line 10530 looks like a reference
  -- Missing reference section? 'XNFS' on line 10663 looks like a reference
  -- Missing reference section? '4' on line 2597 looks like a reference
  -- Missing reference section? 'Juszczak' on line 10484 looks like a reference
  -- Missing reference section? 'ISO10646' on line 10479 looks like a reference
  -- Missing reference section? 'RFC2277' on line 10602 looks like a reference
  -- Missing reference section? 'RFC2279' on line 10608 looks like a reference
  -- Missing reference section? 'RFC2152' on line 10590 looks like a reference
  -- Missing reference section? 'Unicode1' on line 10650 looks like a reference
  -- Missing reference section? 'Unicode2' on line 10657 looks like a reference
  -- Missing reference section? 'Gray' on line 10474 looks like a reference
  -- Missing reference section? 'Kazar' on line 10493 looks like a reference
  -- Missing reference section? 'Macklem' on line 10500 looks like a reference
  -- Missing reference section? 'Mogul' on line 10648 looks like a reference
  -- Missing reference section? 'Nowicki' on line 10513 looks like a reference
  -- Missing reference section? 'Pawlowski' on line 10518 looks like a reference
  -- Missing reference section? 'Sandberg' on line 10633 looks like a reference
  -- Missing reference section? 'Srinivasan' on line 10641 looks like a reference

     Summary: 3 errors (**), 0 flaws (~~), 10 warnings (==), 39 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

NFS Version 4 Working Group                                   S. Shepler
INTERNET-DRAFT                                    Sun Microsystems, Inc.
Document: draft-ietf-nfsv4-rfc3010bis-01.txt                    C. Beame
                                                        Hummingbird Ltd.
                                                            B. Callaghan
                                                  Sun Microsystems, Inc.
                                                               M. Eisler
                                                           Zambeel, Inc.
                                                               D. Noveck
                                                 Network Appliance, Inc.
                                                             D. Robinson
                                                  Sun Microsystems, Inc.
                                                              R. Thurlow
                                                  Sun Microsystems, Inc.
                                                               July 2002

                         NFS version 4 Protocol

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   NFS version 4 is a distributed file system protocol which owes
   heritage to NFS protocol versions 2 [RFC1094] and 3 [RFC1813].

   Unlike earlier versions, the NFS version 4 protocol supports
   traditional file access while integrating support for file locking
   and the mount protocol.  In addition, support for strong security
   (and its negotiation), compound operations, client caching, and
   internationalization has been added.  Of course, attention has been
   applied to making NFS version 4 operate well in an Internet
   environment.

Copyright

   Copyright (C) The Internet Society (2000-2002).  All Rights Reserved.

Key Words

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

Table of Contents

   1. Introduction
   1.1. Overview of NFS Version 4 Features
   1.1.1. RPC and Security
   1.1.2. Procedure and Operation Structure
   1.1.3. File System Model
   1.1.3.1. Filehandle Types
   1.1.3.2. Attribute Types
   1.1.3.3. File System Replication and Migration
   1.1.4. OPEN and CLOSE
   1.1.5. File locking
   1.1.6. Client Caching and Delegation
   1.2. General Definitions
   2. Protocol Data Types
   2.1. Basic Data Types
   2.2. Structured Data Types
   3. RPC and Security Flavor
   3.1. Ports and Transports
   3.2. Security Flavors
   3.2.1. Security mechanisms for NFS version 4
   3.2.1.1. Kerberos V5 as security triple
   3.2.1.2. LIPKEY as a security triple
   3.2.1.3. SPKM-3 as a security triple
   3.3. Security Negotiation
   3.3.1. Security Error
   3.3.2. SECINFO
   3.4. Callback RPC Authentication
   4. Filehandles
   4.1. Obtaining the First Filehandle
   4.1.1. Root Filehandle
   4.1.2. Public Filehandle
   4.2. Filehandle Types
   4.2.1. General Properties of a Filehandle
   4.2.2. Persistent Filehandle
   4.2.3. Volatile Filehandle
   4.2.4. One Method of Constructing a Volatile Filehandle
   4.3. Client Recovery from Filehandle Expiration
   5. File Attributes
   5.1. Mandatory Attributes
   5.2. Recommended Attributes
   5.3. Named Attributes
   5.4. Mandatory Attributes - Definitions
   5.5. Recommended Attributes - Definitions
   5.6. Interpreting owner and owner_group
   5.7. Character Case Attributes
   5.8. Quota Attributes
   5.9. Access Control Lists
   5.9.1. ACE type
   5.9.2. ACE flag
   5.9.3. ACE Access Mask
   5.9.4. ACE who
   6. File System Migration and Replication
   6.1. Replication
   6.2. Migration
   6.3. Interpretation of the fs_locations Attribute
   6.4. Filehandle Recovery for Migration or Replication
   7. NFS Server Name Space
   7.1. Server Exports
   7.2. Browsing Exports
   7.3. Server Pseudo File System
   7.4. Multiple Roots
   7.5. Filehandle Volatility
   7.6. Exported Root
   7.7. Mount Point Crossing
   7.8. Security Policy and Name Space Presentation
   8. File Locking and Share Reservations
   8.1. Locking
   8.1.1. Client ID
   8.1.2. Server Release of Clientid
   8.1.3. nfs_lockowner and stateid Definition
   8.1.4. Use of the stateid
   8.1.5. Sequencing of Lock Requests
   8.1.6. Recovery from Replayed Requests
   8.1.7. Releasing nfs_lockowner State
   8.2. Lock Ranges
   8.3. Blocking Locks
   8.4. Lease Renewal
   8.5. Crash Recovery
   8.5.1. Client Failure and Recovery
   8.5.2. Server Failure and Recovery
   8.5.3. Network Partitions and Recovery
   8.6. Recovery from a Lock Request Timeout or Abort
   8.7. Server Revocation of Locks
   8.8. Share Reservations
   8.9. OPEN/CLOSE Operations
   8.10. Open Upgrade and Downgrade
   8.11. Short and Long Leases
   8.12. Clocks and Calculating Lease Expiration
   8.13. Migration, Replication and State
   8.13.1. Migration and State
   8.13.2. Replication and State
   8.13.3. Notification of Migrated Lease
   9. Client-Side Caching
   9.1. Performance Challenges for Client-Side Caching
   9.2. Delegation and Callbacks
   9.2.1. Delegation Recovery
   9.3. Data Caching
   9.3.1. Data Caching and OPENs
   9.3.2. Data Caching and File Locking
   9.3.3. Data Caching and Mandatory File Locking
   9.3.4. Data Caching and File Identity
   9.4. Open Delegation
   9.4.1. Open Delegation and Data Caching
   9.4.2. Open Delegation and File Locks
   9.4.3. Recall of Open Delegation
   9.4.4. Delegation Revocation
   9.5. Data Caching and Revocation
   9.5.1. Revocation Recovery for Write Open Delegation
   9.6. Attribute Caching
   9.7. Name Caching
   9.8. Directory Caching
   10. Minor Versioning
   11. Internationalization
   11.1. Universal Versus Local Character Sets
   11.2. Overview of Universal Character Set Standards
   11.3. Difficulties with UCS-4, UCS-2, Unicode
   11.4. UTF-8 and its solutions
   11.5. Normalization
   12. Error Definitions
   13. NFS Version 4 Requests
   13.1. Compound Procedure
   13.2. Evaluation of a Compound Request
   13.3. Synchronous Modifying Operations
   13.4. Operation Values
   14. NFS Version 4 Procedures
   14.1. Procedure 0: NULL - No Operation
   14.2. Procedure 1: COMPOUND - Compound Operations
   14.2.1. Operation 3: ACCESS - Check Access Rights
   14.2.2. Operation 4: CLOSE - Close File
   14.2.3. Operation 5: COMMIT - Commit Cached Data
   14.2.4. Operation 6: CREATE - Create a Non-Regular File Object
   14.2.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting Recovery
   14.2.6. Operation 8: DELEGRETURN - Return Delegation
   14.2.7. Operation 9: GETATTR - Get Attributes
   14.2.8. Operation 10: GETFH - Get Current Filehandle
   14.2.9. Operation 11: LINK - Create Link to a File
   14.2.10. Operation 12: LOCK - Create Lock
   14.2.11. Operation 13: LOCKT - Test For Lock
   14.2.12. Operation 14: LOCKU - Unlock File
   14.2.13. Operation 15: LOOKUP - Lookup Filename
   14.2.14. Operation 16: LOOKUPP - Lookup Parent Directory
   14.2.15. Operation 17: NVERIFY - Verify Difference in Attributes
   14.2.16. Operation 18: OPEN - Open a Regular File
   14.2.17. Operation 19: OPENATTR - Open Named Attribute Directory
   14.2.18. Operation 20: OPEN_CONFIRM - Confirm Open
   14.2.19. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access
   14.2.20. Operation 22: PUTFH - Set Current Filehandle
   14.2.21. Operation 23: PUTPUBFH - Set Public Filehandle
   14.2.22. Operation 24: PUTROOTFH - Set Root Filehandle
   14.2.23. Operation 25: READ - Read from File
   14.2.24. Operation 26: READDIR - Read Directory
   14.2.25. Operation 27: READLINK - Read Symbolic Link
   14.2.26. Operation 28: REMOVE - Remove Filesystem Object
   14.2.27. Operation 29: RENAME - Rename Directory Entry
   14.2.28. Operation 30: RENEW - Renew a Lease
   14.2.29. Operation 31: RESTOREFH - Restore Saved Filehandle
   14.2.30. Operation 32: SAVEFH - Save Current Filehandle
   14.2.31. Operation 33: SECINFO - Obtain Available Security
   14.2.32. Operation 34: SETATTR - Set Attributes
   14.2.33. Operation 35: SETCLIENTID - Negotiate Clientid
   14.2.34. Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid
   14.2.35. Operation 37: VERIFY - Verify Same Attributes
   14.2.36. Operation 38: WRITE - Write to File
   14.2.37. Operation 39: RELEASE_LOCKOWNER - Release Lockowner State
   15. NFS Version 4 Callback Procedures
   15.1. Procedure 0: CB_NULL - No Operation
   15.2. Procedure 1: CB_COMPOUND - Compound Operations
   15.2.1. Operation 3: CB_GETATTR - Get Attributes
   15.2.2. Operation 4: CB_RECALL - Recall an Open Delegation
   16. Security Considerations
   17. IANA Considerations
   17.1. Named Attribute Definition
   18. RPC definition file
   19. Bibliography
   20. Authors
   20.1. Editor's Address
   20.2. Authors' Addresses
   20.3. Acknowledgements
   21. Full Copyright Statement

1. Introduction

   The NFS version 4 protocol is a further revision of the NFS protocol
   defined already by versions 2 [RFC1094] and 3 [RFC1813].  It retains
   the essential characteristics of previous versions: design for easy
   recovery, independent of transport protocols, operating systems and
   filesystems, simplicity, and good performance.  The NFS version 4
   revision has the following goals:

   o  Improved access and good performance on the Internet.

      The protocol is designed to transit firewalls easily, perform
      well where latency is high and bandwidth is low, and scale to
      very large numbers of clients per server.

   o  Strong security with negotiation built into the protocol.

      The protocol builds on the work of the ONCRPC working group in
      supporting the RPCSEC_GSS protocol.  Additionally, the NFS
      version 4 protocol provides a mechanism to allow clients and
      servers the ability to negotiate security and require clients
      and servers to support a minimal set of security schemes.

   o  Good cross-platform interoperability.

      The protocol features a file system model that provides a
      useful, common set of features that does not unduly favor one
      file system or operating system over another.

   o  Designed for protocol extensions.

      The protocol is designed to accept standard extensions that do
      not compromise backward compatibility.

1.1. Overview of NFS Version 4 Features

   To provide a reasonable context for the reader, the major features of
   NFS version 4 protocol will be reviewed in brief.
   This will be done to provide an appropriate context for both the
   reader who is familiar with the previous versions of the NFS protocol
   and the reader who is new to the NFS protocols.  For the reader new
   to the NFS protocols, some fundamental knowledge is still expected.
   The reader should be familiar with the RPC and XDR protocols as
   described in [RFC1831] and [RFC1832], respectively.  A basic
   knowledge of file systems and distributed file systems is expected
   as well.

1.1.1. RPC and Security

   As with previous versions of NFS, the External Data Representation
   (XDR) and Remote Procedure Call (RPC) mechanisms used for the NFS
   version 4 protocol are those defined in [RFC1832] and [RFC1831],
   respectively.  To meet end-to-end security requirements, the
   RPCSEC_GSS framework [RFC2203] will be used to extend the basic RPC
   security.  With the use of RPCSEC_GSS, various mechanisms can be
   provided to offer authentication, integrity, and privacy to the NFS
   version 4 protocol.  Kerberos V5 will be used as described in
   [RFC1964] to provide one security framework.  The LIPKEY GSS-API
   mechanism described in [RFC2847] will be used to provide for the use
   of user password and server public key by the NFS version 4
   protocol.  With the use of RPCSEC_GSS, other mechanisms may also be
   specified and used for NFS version 4 security.

   To enable in-band security negotiation, the NFS version 4 protocol
   has added a new operation which provides the client a method of
   querying the server about its policies regarding which security
   mechanisms must be used for access to the server's file system
   resources.  With this, the client can securely match the security
   mechanism that meets the policies specified at both the client and
   server.

1.1.2. Procedure and Operation Structure

   A significant departure from the previous versions of the NFS
   protocol is the introduction of the COMPOUND procedure.
   For the NFS version 4 protocol, there are two RPC procedures, NULL
   and COMPOUND.  The COMPOUND procedure is defined in terms of
   operations, and these operations correspond more closely to the
   traditional NFS procedures.  With the use of the COMPOUND procedure,
   the client is able to build simple or complex requests.  These
   COMPOUND requests allow for a reduction in the number of RPCs needed
   for logical file system operations.  For example, without previous
   contact with a server, a client will be able to read data from a
   file in one request by combining LOOKUP, OPEN, and READ operations
   in a single COMPOUND RPC.  With previous versions of the NFS
   protocol, this type of single request was not possible.

   The model used for COMPOUND is very simple.  There is no logical
   ORing or ANDing of operations.  The operations combined within a
   COMPOUND request are evaluated in order by the server.  Once an
   operation returns a failing result, the evaluation ends and the
   results of all evaluated operations are returned to the client.

   The NFS version 4 protocol continues to have the client refer to a
   file or directory at the server by a "filehandle".  The COMPOUND
   procedure has a method of passing a filehandle from one operation to
   another within the sequence of operations.  There is a concept of a
   "current filehandle" and a "saved filehandle".  Most operations use
   the "current filehandle" as the file system object to operate upon.
   The "saved filehandle" is used as temporary filehandle storage
   within a COMPOUND procedure as well as an additional operand for
   certain operations.

1.1.3. File System Model

   The general file system model used for the NFS version 4 protocol is
   the same as previous versions.  The server file system is
   hierarchical with the regular files contained within being treated as
   opaque byte streams.
   In a slight departure, file and directory names are encoded with
   UTF-8 to deal with the basics of internationalization.

   The NFS version 4 protocol does not require a separate protocol to
   provide for the initial mapping between path name and filehandle.
   Instead of using the older MOUNT protocol for this mapping, the
   server provides a ROOT filehandle that represents the logical root or
   top of the file system tree provided by the server.  The server
   provides multiple file systems by gluing them together with pseudo
   file systems.  These pseudo file systems provide for potential gaps
   in the path names between real file systems.

1.1.3.1. Filehandle Types

   In previous versions of the NFS protocol, the filehandle provided by
   the server was guaranteed to be valid or persistent for the lifetime
   of the file system object to which it referred.  For some server
   implementations, this persistence requirement has been difficult to
   meet.  For the NFS version 4 protocol, this requirement has been
   relaxed by introducing another type of filehandle, volatile.  With
   persistent and volatile filehandle types, the server implementation
   can match the abilities of the file system at the server along with
   the operating environment.  The client will have knowledge of the
   type of filehandle being provided by the server and can be prepared
   to deal with the semantics of each.

1.1.3.2. Attribute Types

   The NFS version 4 protocol introduces three classes of file system or
   file attributes.  Like the additional filehandle type, the
   classification of file attributes has been done to ease server
   implementations along with extending the overall functionality of the
   NFS protocol.  This attribute model is structured to be extensible
   such that new attributes can be introduced in minor revisions of the
   protocol without requiring significant rework.

   The three classifications are: mandatory, recommended and named
   attributes.  This is a significant departure from the previous
   attribute model used in the NFS protocol.  Previously, the attributes
   for the file system and file objects were a fixed set of mainly Unix
   attributes.  If the server or client did not support a particular
   attribute, it would have to simulate the attribute as best it could.

   Mandatory attributes are the minimal set of file or file system
   attributes that must be provided by the server and must be properly
   represented by the server.  Recommended attributes represent
   different file system types and operating environments.  The
   recommended attributes will allow for better interoperability and the
   inclusion of more operating environments.  The mandatory and
   recommended attribute sets are traditional file or file system
   attributes.  The third type of attribute is the named attribute.  A
   named attribute is an opaque byte stream that is associated with a
   directory or file and referred to by a string name.  Named attributes
   are meant to be used by client applications as a method to associate
   application-specific data with a regular file or directory.

   One significant addition to the recommended set of file attributes is
   the Access Control List (ACL) attribute.  This attribute provides for
   directory and file access control beyond the model used in previous
   versions of the NFS protocol.  The ACL definition allows for
   specification of user- and group-level access control.

1.1.3.3. File System Replication and Migration

   With the use of a special file attribute, the ability to migrate or
   replicate server file systems is enabled within the protocol.  The
   file system locations attribute provides a method for the client to
   probe the server about the location of a file system.
   In the event of a migration of a file system, the client will
   receive an error when operating on the file system and it can then
   query as to the new file system location.  Similar steps are used
   for replication: the client is able to query the server for the
   multiple available locations of a particular file system.  From this
   information, the client can use its own policies to access the
   appropriate file system location.

1.1.4. OPEN and CLOSE

   The NFS version 4 protocol introduces OPEN and CLOSE operations.  The
   OPEN operation provides a single point where file lookup, creation,
   and share semantics can be combined.  The CLOSE operation also
   provides for the release of state accumulated by OPEN.

1.1.5. File locking

   With the NFS version 4 protocol, the support for byte range file
   locking is part of the NFS protocol.  The file locking support is
   structured so that an RPC callback mechanism is not required.  This
   is a departure from the previous versions of the NFS file locking
   protocol, Network Lock Manager (NLM).  The state associated with file
   locks is maintained at the server under a lease-based model.  The
   server defines a single lease period for all state held by an NFS
   client.  If the client does not renew its lease within the defined
   period, all state associated with the client's lease may be released
   by the server.  The client may renew its lease with use of the RENEW
   operation or implicitly by use of other operations (primarily READ).

1.1.6. Client Caching and Delegation

   The file, attribute, and directory caching for the NFS version 4
   protocol is similar to previous versions.  Attributes and directory
   information are cached for a duration determined by the client.  At
   the end of a predefined timeout, the client will query the server to
   see if the related file system object has been updated.
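
   The timeout-based revalidation of cached attributes described above
   can be illustrated with a minimal sketch.  It is not part of the
   protocol: the names AttrCache, CacheEntry, fake_getattr and demo,
   the 30-second timeout, and the attribute names are all hypothetical,
   and the fetch callback merely stands in for a GETATTR request.  For
   simplicity the sketch refetches the attributes once the timeout
   elapses rather than comparing them against the cached copy.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CacheEntry:
    attrs: dict        # attribute values last returned by the server
    fetched_at: float  # client clock reading when they were fetched


class AttrCache:
    """Hypothetical client-side attribute cache with a fixed timeout."""

    def __init__(self, fetch: Callable[[bytes], dict],
                 timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # stand-in for sending GETATTR to the server
        self._timeout = timeout  # cache duration chosen by the client
        self._clock = clock
        self._entries: Dict[bytes, CacheEntry] = {}

    def get(self, filehandle: bytes) -> dict:
        now = self._clock()
        entry = self._entries.get(filehandle)
        if entry is not None and now - entry.fetched_at < self._timeout:
            return entry.attrs           # still fresh: answered locally
        attrs = self._fetch(filehandle)  # expired or missing: ask the server
        self._entries[filehandle] = CacheEntry(attrs, now)
        return attrs


def demo() -> int:
    """Return how many server queries three lookups cost."""
    queries = []
    now = [0.0]  # fake clock so the example does not need to sleep

    def fake_getattr(fh: bytes) -> dict:
        queries.append(fh)
        return {"size": 4096, "change": len(queries)}

    cache = AttrCache(fake_getattr, timeout=30.0, clock=lambda: now[0])
    cache.get(b"fh-1")   # miss: queries the server
    now[0] = 10.0
    cache.get(b"fh-1")   # within the timeout: served from the cache
    now[0] = 31.0
    cache.get(b"fh-1")   # timeout elapsed: queries the server again
    return len(queries)
```

   In this sketch, three lookups of the same filehandle cost two server
   queries, because the second lookup falls inside the timeout.  A
   shorter timeout trades more server traffic for fresher attributes;
   the choice is left to the client.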
462 For file data, the client checks its cache validity when the file is 463 opened. A query is sent to the server to determine if the file has 464 been changed. Based on this information, the client determines if 465 the data cache for the file should be kept or released. Also, when the 466 file is closed, any modified data is written to the server. 468 If an application wants to serialize access to file data, file 469 locking of the file data ranges in question should be used. 471 The major addition to NFS version 4 in the area of caching is the 472 ability of the server to delegate certain responsibilities to the 473 client. When the server grants a delegation for a file to a client, 474 the client is guaranteed certain semantics with respect to the 475 sharing of that file with other clients. At OPEN, the server may 476 provide the client either a read or write delegation for the file. 477 If the client is granted a read delegation, it is assured that no 478 other client has the ability to write to the file for the duration of 479 the delegation. If the client is granted a write delegation, the 480 client is assured that no other client has read or write access to 481 the file. 483 Delegations can be recalled by the server. If another client 484 requests access to the file in such a way that the access conflicts 485 with the granted delegation, the server is able to notify the initial 486 client and recall the delegation. This requires that a callback path 487 exist between the server and client. If this callback path does not 488 exist, then delegations cannot be granted. The essence of a 489 delegation is that it allows the client to locally service operations 490 such as OPEN, CLOSE, LOCK, LOCKU, READ, and WRITE without immediate 491 interaction with the server. 493 1.2. General Definitions 495 The following definitions are provided to give the reader 496 appropriate context.
498 Client The "client" is the entity that accesses the NFS server's 499 resources. The client may be an application which contains 500 the logic to access the NFS server directly. The client 501 may also be the traditional operating system client remote 502 file system services for a set of applications. 504 In the case of file locking the client is the entity that 505 maintains a set of locks on behalf of one or more 506 applications. This client is responsible for crash or 507 failure recovery for those locks it manages. 509 Note that multiple clients may share the same transport and 510 multiple clients may exist on the same network node. 512 Clientid A 64-bit quantity used as a unique, short-hand reference to 513 a client supplied Verifier and ID. The server is 514 responsible for supplying the Clientid. 516 Lease An interval of time defined by the server for which the 517 client is irrevocably granted a lock. At the end of a 518 lease period the lock may be revoked if the lease has not 519 been extended. The lock must be revoked if a conflicting 520 lock has been granted after the lease interval. 522 All leases granted by a server have the same fixed 523 interval. Note that the fixed interval was chosen to 524 alleviate the expense a server would have in maintaining 525 state about variable length leases across server failures. 527 Lock The term "lock" is used to refer to both record (byte- 528 range) locks as well as file (share) locks unless 529 specifically stated otherwise. 531 Server The "Server" is the entity responsible for coordinating 532 client access to a set of file systems. 534 Stable Storage 535 NFS version 4 servers must be able to recover without data 536 loss from multiple power failures (including cascading 537 power failures, that is, several power failures in quick 538 succession), operating system failures, and hardware 539 failure of components other than the storage medium itself 540 (for example, disk, nonvolatile RAM). 
542 Some examples of stable storage that are allowable for an 543 NFS server include: 545 1. Media commit of data, that is, the modified data has 546 been successfully written to the disk media, 547 for example, the disk platter. 549 2. An immediate reply disk drive with battery-backed 550 on-drive intermediate storage or uninterruptible power 551 system (UPS). 553 3. Server commit of data with battery-backed intermediate 554 storage and recovery software. 556 4. Cache commit with uninterruptible power system (UPS) 557 and recovery software. 559 Stateid A 128-bit quantity returned by a server that uniquely 560 defines the open and locking state provided by the server 561 for a specific open or lock owner for a specific file. 563 Stateids composed of all bits 0 or all bits 1 have special 564 meaning and are reserved values. 566 Verifier A 64-bit quantity generated by the client that the server 567 can use to determine if the client has restarted and lost 568 all previous lock state. 570 2. Protocol Data Types 572 The syntax and semantics to describe the data types of the NFS 573 version 4 protocol are defined in the XDR [RFC1832] and RPC [RFC1831] 574 documents. The next sections build upon the XDR data types to define 575 types and structures specific to this protocol. 577 2.1. Basic Data Types 579 Data Type Definition 580 _____________________________________________________________________ 581 int32_t typedef int int32_t; 583 uint32_t typedef unsigned int uint32_t; 585 int64_t typedef hyper int64_t; 587 uint64_t typedef unsigned hyper uint64_t; 589 attrlist4 typedef opaque attrlist4<>; 590 Used for file/directory attributes 592 bitmap4 typedef uint32_t bitmap4<>; 593 Used in attribute array encoding. 
595 changeid4 typedef uint64_t changeid4; 596 Used in definition of change_info 598 clientid4 typedef uint64_t clientid4; 599 Shorthand reference to client identification 601 component4 typedef utf8string component4; 602 Represents path name components 604 count4 typedef uint32_t count4; 605 Various count parameters (READ, WRITE, COMMIT) 607 length4 typedef uint64_t length4; 608 Describes LOCK lengths 610 linktext4 typedef utf8string linktext4; 611 Symbolic link contents 613 mode4 typedef uint32_t mode4; 614 Mode attribute data type 616 nfs_cookie4 typedef uint64_t nfs_cookie4; 617 Opaque cookie value for READDIR 619 nfs_fh4 typedef opaque nfs_fh4<NFS4_FHSIZE>; 620 Filehandle definition; NFS4_FHSIZE is defined as 128 622 nfs_ftype4 enum nfs_ftype4; 623 Various defined file types 625 nfsstat4 enum nfsstat4; 626 Return value for operations 628 offset4 typedef uint64_t offset4; 629 Various offset designations (READ, WRITE, LOCK, COMMIT) 631 pathname4 typedef component4 pathname4<>; 632 Represents path name for LOOKUP, OPEN and others 634 qop4 typedef uint32_t qop4; 635 Quality of protection designation in SECINFO 637 sec_oid4 typedef opaque sec_oid4<>; 638 Security Object Identifier 639 The sec_oid4 data type is not really opaque. 640 Instead, it contains an ASN.1 OBJECT IDENTIFIER as used 641 by GSS-API in the mech_type argument to 642 GSS_Init_sec_context. See [RFC2078] for details. 644 seqid4 typedef uint32_t seqid4; 645 Sequence identifier used for file locking 647 utf8string typedef opaque utf8string<>; 648 UTF-8 encoding for strings 650 verifier4 typedef opaque verifier4[NFS4_VERIFIER_SIZE]; 651 Verifier used for various operations (COMMIT, CREATE, 652 OPEN, READDIR, SETCLIENTID, WRITE) 653 NFS4_VERIFIER_SIZE is defined as 8 655 2.2.
Structured Data Types 657 nfstime4 658 struct nfstime4 { 659 int64_t seconds; 660 uint32_t nseconds; 661 }; 663 The nfstime4 structure gives the number of seconds and 664 nanoseconds since midnight or 0 hour January 1, 1970 Coordinated 665 Universal Time (UTC). Values greater than zero for the seconds 666 field denote dates after the 0 hour January 1, 1970. Values 667 less than zero for the seconds field denote dates before the 0 668 hour January 1, 1970. In both cases, the nseconds field is to 669 be added to the seconds field for the final time representation. 670 For example, if the time to be represented is one-half second 671 before 0 hour January 1, 1970, the seconds field would have a 672 value of negative one (-1) and the nseconds field would have a 673 value of one-half second (500000000). Values greater than 674 999,999,999 for nseconds are considered invalid. 676 This data type is used to pass time and date information. A 677 server converts to and from its local representation of time 678 when processing time values, preserving as much accuracy as 679 possible. If the precision of timestamps stored for a file 680 system object is less than defined, loss of precision can occur. 681 An adjunct time maintenance protocol is recommended to reduce 682 client and server time skew. 684 time_how4 686 enum time_how4 { 687 SET_TO_SERVER_TIME4 = 0, 688 SET_TO_CLIENT_TIME4 = 1 689 }; 691 settime4 693 union settime4 switch (time_how4 set_it) { 694 case SET_TO_CLIENT_TIME4: 695 nfstime4 time; 696 default: 697 void; 698 }; 700 The above definitions are used as the attribute definitions to 701 set time values. If set_it is SET_TO_SERVER_TIME4, then the 702 server uses its local representation of time for the time value. 704 specdata4 706 struct specdata4 { 707 uint32_t specdata1; 708 uint32_t specdata2; 709 }; 711 This data type represents additional information for the device 712 file types NF4CHR and NF4BLK.
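The nfstime4 convention described earlier in this section (the nseconds field is always added to the seconds field, even for times before the epoch) can be illustrated with a minimal C sketch. The helper names nfstime4_is_valid and nfstime4_from_ns are hypothetical and are not part of the protocol; the C struct simply mirrors the XDR definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* C mirror of the XDR nfstime4 structure. */
struct nfstime4 {
    int64_t  seconds;   /* whole seconds relative to 0 hour January 1, 1970 UTC */
    uint32_t nseconds;  /* always added to seconds; valid range 0..999999999 */
};

/* Hypothetical helper: reject nseconds values the text calls invalid. */
static bool nfstime4_is_valid(const struct nfstime4 *t)
{
    return t->nseconds <= 999999999u;
}

/*
 * Hypothetical helper: build an nfstime4 from a signed nanosecond offset
 * from the epoch. The division is floored (not truncated toward zero) so
 * that nseconds is never negative, matching the "add nseconds to seconds"
 * rule for times before the epoch.
 */
static struct nfstime4 nfstime4_from_ns(int64_t ns_since_epoch)
{
    struct nfstime4 t;
    int64_t sec = ns_since_epoch / 1000000000LL;
    int64_t rem = ns_since_epoch % 1000000000LL;

    if (rem < 0) {          /* C division truncates; adjust to floor */
        sec -= 1;
        rem += 1000000000LL;
    }
    t.seconds = sec;
    t.nseconds = (uint32_t)rem;
    return t;
}
```

With this sketch, one-half second before the epoch (an input of -500000000 nanoseconds) yields seconds = -1 and nseconds = 500000000, which is the example given in the text.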
714 fsid4 716 struct fsid4 { 717 uint64_t major; 718 uint64_t minor; 719 }; 721 This type is the file system identifier that is used as a 722 mandatory attribute. 724 fs_location4 726 struct fs_location4 { 727 utf8string server<>; 728 pathname4 rootpath; 729 }; 731 fs_locations4 733 struct fs_locations4 { 734 pathname4 fs_root; 735 fs_location4 locations<>; 736 }; 738 The fs_location4 and fs_locations4 data types are used for the 739 fs_locations recommended attribute which is used for migration 740 and replication support. 742 fattr4 744 struct fattr4 { 745 bitmap4 attrmask; 746 attrlist4 attr_vals; 747 }; 749 The fattr4 structure is used to represent file and directory 750 attributes. 752 The bitmap is a counted array of 32-bit integers used to contain 753 bit values. The position of the integer in the array that 754 contains bit n can be computed from the expression (n / 32) and 755 its bit within that integer is (n mod 32). 757 0 1 758 +-----------+-----------+-----------+-- 759 | count | 31 .. 0 | 63 .. 32 | 760 +-----------+-----------+-----------+-- 762 change_info4 764 struct change_info4 { 765 bool atomic; 766 changeid4 before; 767 changeid4 after; 768 }; 770 This structure is used with the CREATE, LINK, REMOVE, and RENAME 771 operations to let the client know the value of the change 772 attribute for the directory in which the target file system 773 object resides. 775 clientaddr4 777 struct clientaddr4 { 778 /* see struct rpcb in RFC 1833 */ 779 string r_netid<>; /* network id */ 780 string r_addr<>; /* universal address */ 781 }; 783 The clientaddr4 structure is used as part of the SETCLIENTID 784 operation to either specify the address of the client that is 785 using a clientid or as part of the call back registration.
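The bitmap4 indexing rule given for fattr4 earlier in this section (word index n / 32, bit position n mod 32) can be sketched in C as follows. The helper names bitmap4_test and bitmap4_set are hypothetical, and the array is assumed to be already decoded into host-order 32-bit words with the count passed separately.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether bit n is set in a bitmap4-style
 * array of 'count' 32-bit words. Bits beyond the end of the array are
 * treated as not set.
 */
static bool bitmap4_test(const uint32_t *words, size_t count, uint32_t n)
{
    size_t   word = n / 32;   /* which integer in the array holds bit n */
    uint32_t bit  = n % 32;   /* position of bit n within that integer  */

    if (word >= count)
        return false;
    return ((words[word] >> bit) & 1u) != 0;
}

/*
 * Hypothetical helper: set bit n. Returns false if the array is too
 * short to hold bit n (the caller would need a larger array).
 */
static bool bitmap4_set(uint32_t *words, size_t count, uint32_t n)
{
    size_t word = n / 32;

    if (word >= count)
        return false;
    words[word] |= (uint32_t)1 << (n % 32);
    return true;
}
```

For example, attribute bit 33 lands in the second word (index 1) at bit position 1, consistent with the diagram in which the second word carries bits 63 .. 32.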
787 cb_client4 789 struct cb_client4 { 790 unsigned int cb_program; 791 clientaddr4 cb_location; 792 }; 794 This structure is used by the client to inform the server of its 795 call back address; it includes the program number and client 796 address. 798 nfs_client_id4 800 struct nfs_client_id4 { 801 verifier4 verifier; 802 opaque id<NFS4_OPAQUE_LIMIT>; 803 }; 805 This structure is part of the arguments to the SETCLIENTID 806 operation. NFS4_OPAQUE_LIMIT is defined as 1024. 808 open_owner4 810 struct open_owner4 { 811 clientid4 clientid; 812 opaque owner<NFS4_OPAQUE_LIMIT>; 813 }; 815 This structure is used to identify the owner of open state. 816 NFS4_OPAQUE_LIMIT is defined as 1024. 818 lock_owner4 820 struct nfs_lockowner4 { 821 clientid4 clientid; 822 opaque owner<NFS4_OPAQUE_LIMIT>; 823 }; 825 This structure is used to identify the owner of file locking 826 state. NFS4_OPAQUE_LIMIT is defined as 1024. 828 stateid4 830 struct stateid4 { 831 uint32_t seqid; 832 opaque other[12]; 833 }; 835 This structure is used for the various state sharing mechanisms 836 between the client and server. For the client, this data 837 structure is read-only. The seqid value is the only field that 838 the client should interpret. See the section for the OPEN 839 operation for further description of how the seqid field is to 840 be interpreted. 842 3. RPC and Security Flavor 844 The NFS version 4 protocol is a Remote Procedure Call (RPC) 845 application that uses RPC version 2 and the corresponding eXternal 846 Data Representation (XDR) as defined in [RFC1831] and [RFC1832]. The 847 RPCSEC_GSS security flavor as defined in [RFC2203] MUST be used as 848 the mechanism to deliver stronger security for the NFS version 4 849 protocol. 851 3.1. Ports and Transports 853 Historically, NFS version 2 and version 3 servers have resided on 854 port 2049. The registered port 2049 [RFC1700] for the NFS protocol 855 should be the default configuration.
Using the registered port for 856 NFS services means the NFS client will not need to use the RPC 857 binding protocols as described in [RFC1833]; this will allow NFS to 858 transit firewalls. 860 The transport used by the RPC service for the NFS version 4 protocol 861 MUST provide congestion control comparable to that defined for TCP in 862 [RFC2581]. If the operating environment implements TCP, the NFS 863 version 4 protocol SHOULD be supported over TCP. The NFS client and 864 server may use other transports if they support congestion control as 865 defined above and in those cases a mechanism may be provided to 866 override TCP usage in favor of another transport. 868 If TCP is used as the transport, the client and server SHOULD use 869 persistent connections. This will prevent the weakening of TCP's 870 congestion control via short lived connections and will improve 871 performance for the WAN environment by eliminating the need for SYN 872 handshakes. 874 Note that for various timers, the client and server should avoid 875 inadvertent synchronization of those timers. For further discussion 876 of the general issue refer to [Floyd]. 878 3.2. Security Flavors 880 Traditional RPC implementations have included AUTH_NONE, AUTH_SYS, 881 AUTH_DH, and AUTH_KRB4 as security flavors. With [RFC2203] an 882 additional security flavor of RPCSEC_GSS has been introduced which 883 uses the functionality of GSS-API [RFC2078]. This allows for the use 884 of varying security mechanisms by the RPC layer without the 885 additional implementation overhead of adding RPC security flavors. 886 For NFS version 4, the RPCSEC_GSS security flavor MUST be used to 887 enable the mandatory security mechanism. Other flavors, such as, 888 AUTH_NONE, AUTH_SYS, and AUTH_DH MAY be implemented as well. 890 3.2.1. Security mechanisms for NFS version 4 892 The use of RPCSEC_GSS requires selection of: mechanism, quality of 893 protection, and service (authentication, integrity, privacy). 
The 894 remainder of this document will refer to these three parameters of 895 the RPCSEC_GSS security as the security triple. 897 3.2.1.1. Kerberos V5 as security triple 899 The Kerberos V5 GSS-API mechanism as described in [RFC1964] MUST be 900 implemented and provide the following security triples. 902 column descriptions: 904 1 == number of pseudo flavor 905 2 == name of pseudo flavor 906 3 == mechanism's OID 907 4 == mechanism's algorithm(s) 908 5 == RPCSEC_GSS service 910 1 2 3 4 5 911 ----------------------------------------------------------------------- 912 390003 krb5 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_none 913 390004 krb5i 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_integrity 914 390005 krb5p 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_privacy 915 for integrity, 916 and 56 bit DES 917 for privacy. 919 Note that the pseudo flavor is presented here as a mapping aid to the 920 implementor. Because this NFS protocol includes a method to 921 negotiate security and it understands the GSS-API mechanism, the 922 pseudo flavor is not needed. The pseudo flavor is needed for NFS 923 version 3 since the security negotiation is done via the MOUNT 924 protocol. 926 For a discussion of NFS' use of RPCSEC_GSS and Kerberos V5, please 927 see [RFC2623]. 929 3.2.1.2. LIPKEY as a security triple 931 The LIPKEY GSS-API mechanism as described in [RFC2847] MUST be 932 implemented and provide the following security triples. The 933 definition of the columns matches the previous subsection "Kerberos 934 V5 as security triple" 936 1 2 3 4 5 937 ----------------------------------------------------------------------- 938 390006 lipkey 1.3.6.1.5.5.9 negotiated rpc_gss_svc_none 939 390007 lipkey-i 1.3.6.1.5.5.9 negotiated rpc_gss_svc_integrity 940 390008 lipkey-p 1.3.6.1.5.5.9 negotiated rpc_gss_svc_privacy 942 The mechanism algorithm is listed as "negotiated". 
This is because 943 LIPKEY is layered on SPKM-3 and in SPKM-3 [RFC2847] the 944 confidentiality and integrity algorithms are negotiated. Since 945 SPKM-3 specifies HMAC-MD5 for integrity as MANDATORY, 128 bit 946 cast5CBC for confidentiality for privacy as MANDATORY, and further 947 specifies that HMAC-MD5 and cast5CBC MUST be listed first before 948 weaker algorithms, specifying "negotiated" in column 4 does not 949 impair interoperability. In the event an SPKM-3 peer does not 950 support the mandatory algorithms, the other peer is free to accept or 951 reject the GSS-API context creation. 953 Because SPKM-3 negotiates the algorithms, subsequent calls to 954 LIPKEY's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality 955 of protection value of 0 (zero). See section 5.2 of [RFC2025] for an 956 explanation. 958 LIPKEY uses SPKM-3 to create a secure channel in which to pass a user 959 name and password from the client to the server. Once the user name 960 and password have been accepted by the server, calls to the LIPKEY 961 context are redirected to the SPKM-3 context. See [RFC2847] for more 962 details. 964 3.2.1.3. SPKM-3 as a security triple 966 The SPKM-3 GSS-API mechanism as described in [RFC2847] MUST be 967 implemented and provide the following security triples. The 968 definition of the columns matches the previous subsection "Kerberos 969 V5 as security triple". 971 1 2 3 4 5 972 ----------------------------------------------------------------------- 973 390009 spkm3 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_none 974 390010 spkm3i 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_integrity 975 390011 spkm3p 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_privacy 977 For a discussion as to why the mechanism algorithm is listed as 978 "negotiated", see the previous section "LIPKEY as a security triple." 
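As a mapping aid only, the three pseudo flavor tables above (Kerberos V5, LIPKEY, SPKM-3) can be collected into a single lookup table. This is a minimal C sketch: the type names, the sec_service enum (which stands in for the RPCSEC_GSS rpc_gss_svc_* values), and lookup_pseudo_flavor are all hypothetical, and, as the text notes, NFS version 4 itself does not need the pseudo flavor numbers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the rpc_gss_svc_* service values. */
enum sec_service { SVC_NONE, SVC_INTEGRITY, SVC_PRIVACY };

/* One row of the tables in the text: columns 1, 2, 3 and 5. */
struct sec_triple_entry {
    uint32_t         pseudo_flavor;  /* column 1 */
    const char      *name;           /* column 2 */
    const char      *mech_oid;       /* column 3 */
    enum sec_service service;        /* column 5 */
};

static const struct sec_triple_entry sec_triples[] = {
    { 390003, "krb5",     "1.2.840.113554.1.2.2", SVC_NONE      },
    { 390004, "krb5i",    "1.2.840.113554.1.2.2", SVC_INTEGRITY },
    { 390005, "krb5p",    "1.2.840.113554.1.2.2", SVC_PRIVACY   },
    { 390006, "lipkey",   "1.3.6.1.5.5.9",        SVC_NONE      },
    { 390007, "lipkey-i", "1.3.6.1.5.5.9",        SVC_INTEGRITY },
    { 390008, "lipkey-p", "1.3.6.1.5.5.9",        SVC_PRIVACY   },
    { 390009, "spkm3",    "1.3.6.1.5.5.1.3",      SVC_NONE      },
    { 390010, "spkm3i",   "1.3.6.1.5.5.1.3",      SVC_INTEGRITY },
    { 390011, "spkm3p",   "1.3.6.1.5.5.1.3",      SVC_PRIVACY   },
};

/* Hypothetical helper: return the row for a pseudo flavor, or NULL. */
static const struct sec_triple_entry *lookup_pseudo_flavor(uint32_t flavor)
{
    size_t i;

    for (i = 0; i < sizeof(sec_triples) / sizeof(sec_triples[0]); i++) {
        if (sec_triples[i].pseudo_flavor == flavor)
            return &sec_triples[i];
    }
    return NULL;
}
```

The algorithm column (column 4) is omitted from the sketch because for LIPKEY and SPKM-3 it is negotiated rather than fixed.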
980 Because SPKM-3 negotiates the algorithms, subsequent calls to SPKM- 981 3's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality of 982 protection value of 0 (zero). See section 5.2 of [RFC2025] for an 983 explanation. 985 Even though LIPKEY is layered over SPKM-3, SPKM-3 is specified as a 986 mandatory set of triples to handle the situations where the initiator 987 (the client) is anonymous or where the initiator has its own 988 certificate. If the initiator is anonymous, there will not be a user 989 name and password to send to the target (the server). If the 990 initiator has its own certificate, then using passwords is 991 superfluous. 993 3.3. Security Negotiation 995 With the NFS version 4 server potentially offering multiple security 996 mechanisms, the client needs a method to determine or negotiate which 997 mechanism is to be used for its communication with the server. The 998 NFS server may have multiple points within its file system name space 999 that are available for use by NFS clients. In turn the NFS server 1000 may be configured such that each of these entry points may have 1001 different or multiple security mechanisms in use. 1003 The security negotiation between client and server must be done with 1004 a secure channel to eliminate the possibility of a third party 1005 intercepting the negotiation sequence and forcing the client and 1006 server to choose a lower level of security than required or desired. 1008 3.3.1. Security Error 1010 Based on the assumption that each NFS version 4 client and server 1011 must support a minimum set of security (i.e. LIPKEY, SPKM-3, and 1012 Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its 1013 communication with the server with one of the minimal security 1014 triples. During communication with the server, the client may 1015 receive an NFS error of NFS4ERR_WRONGSEC. 
This error allows the 1016 server to notify the client that the security triple currently being 1017 used is not appropriate for access to the server's file system 1018 resources. The client is then responsible for determining what 1019 security triples are available at the server and choose one which is 1020 appropriate for the client. 1022 3.3.2. SECINFO 1024 The new SECINFO operation will allow the client to determine, on a 1025 per filehandle basis, what security triple is to be used for server 1026 access. In general, the client will not have to use the SECINFO 1027 procedure except during initial communication with the server or when 1028 the client crosses policy boundaries at the server. It is possible 1029 that the server's policies change during the client's interaction 1030 therefore forcing the client to negotiate a new security triple. 1032 3.4. Callback RPC Authentication 1034 The callback RPC (described later) must mutually authenticate the NFS 1035 server to the principal that acquired the clientid (also described 1036 later), using the same security flavor the original SETCLIENTID 1037 operation used. Because LIPKEY is layered over SPKM-3, it is 1038 permissible for the server to use SPKM-3 and not LIPKEY for the 1039 callback even if the client used LIPKEY for SETCLIENTID. 1041 For AUTH_NONE, there are no principals, so this is a non-issue. 1043 For AUTH_SYS, the server simply uses the AUTH_SYS credential that the 1044 user used when it set up the delegation. 1046 For AUTH_DH, one commonly used convention is that the server uses the 1047 credential corresponding to this AUTH_DH principal: 1049 unix.host@domain 1051 where host and domain are variables corresponding to the name of 1052 server host and directory services domain in which it lives such as a 1053 Network Information System domain or a DNS domain. 
1055 Regardless of what security mechanism under RPCSEC_GSS is being used, 1056 the NFS server MUST identify itself in GSS-API via a 1057 GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE 1058 names are of the form: 1060 service@hostname 1062 For NFS, the "service" element is 1064 nfs 1066 Implementations of security mechanisms will convert nfs@hostname to 1067 various different forms. For Kerberos V5 and LIPKEY, the following 1068 form is RECOMMENDED: 1070 nfs/hostname 1072 For Kerberos V5, nfs/hostname would be a server principal in the 1073 Kerberos Key Distribution Center database. For LIPKEY, this would be 1074 the username passed to the target (the NFS version 4 client that 1075 receives the callback). 1077 It should be noted that LIPKEY may not work for callbacks, since the 1078 LIPKEY client uses a user id/password. If the NFS client receiving 1079 the callback can authenticate the NFS server's user name/password 1080 pair, and if the user that the NFS server is authenticating to has a 1081 public key certificate, then it works. 1083 In situations where the NFS client uses LIPKEY and uses a per-host 1084 principal for the SETCLIENTID operation, instead of using LIPKEY for 1085 SETCLIENTID, it is RECOMMENDED that SPKM-3 with mutual authentication 1086 be used. This effectively means that the client will use a 1087 certificate to authenticate and identify the initiator to the target 1088 on the NFS server. Using SPKM-3 and not LIPKEY has the following 1089 advantages: 1091 o When the server does a callback, it must authenticate to the 1092 principal used in the SETCLIENTID. Even if LIPKEY is used, 1093 because LIPKEY is layered over SPKM-3, the NFS client will need 1094 to have a certificate that corresponds to the principal used in 1095 the SETCLIENTID operation. From an administrative perspective, 1096 having a user name, password, and certificate for both the 1097 client and server is redundant.
1099 o LIPKEY was intended to minimize additional infrastructure 1100 requirements beyond a certificate for the target, and the 1101 expectation is that existing password infrastructure can be 1102 leveraged for the initiator. In some environments, a per-host 1103 password does not exist yet. If certificates are used for any 1104 per-host principals, then additional password infrastructure is 1105 not needed. 1107 o In cases when a host is both an NFS client and server, it can 1108 share the same per-host certificate. 1110 4. Filehandles 1112 The filehandle in the NFS protocol is a per-server unique identifier 1113 for a file system object. The contents of the filehandle are opaque 1114 to the client. Therefore, the server is responsible for translating 1115 the filehandle to an internal representation of the file system 1116 object. Since the filehandle is the client's reference to an object 1117 and the client may cache this reference, the server SHOULD NOT reuse 1118 a filehandle for another file system object. If the server needs to 1119 reuse a filehandle value, the time elapsed before reuse SHOULD be 1120 large enough such that it is unlikely the client has a cached copy of 1121 the reused filehandle value. Note that a client may cache a 1122 filehandle for a very long time. For example, a client may cache NFS 1123 data to local storage as a method to expand its effective cache size 1124 and as a means to survive client restarts. Therefore, the lifetime 1125 of a cached filehandle may be extended. 1127 4.1. Obtaining the First Filehandle 1129 The operations of the NFS protocol are defined in terms of one or 1130 more filehandles. Therefore, the client needs a filehandle to 1131 initiate communication with the server. With the NFS version 2 1132 protocol [RFC1094] and the NFS version 3 protocol [RFC1813], there 1133 exists an ancillary protocol to obtain this first filehandle.
The 1134 MOUNT protocol, RPC program number 100005, provides the mechanism of 1135 translating a string based file system path name to a filehandle 1136 which can then be used by the NFS protocols. 1138 The MOUNT protocol has deficiencies in the area of security and use 1139 via firewalls. This is one reason that the use of the public 1140 filehandle was introduced in [RFC2054] and [RFC2055]. With the use 1141 of the public filehandle in combination with the LOOKUP procedure in 1142 the NFS version 2 and 3 protocols, it has been demonstrated that the 1143 MOUNT protocol is unnecessary for viable interaction between NFS 1144 client and server. 1146 Therefore, the NFS version 4 protocol will not use an ancillary 1147 protocol for translation from string based path names to a 1148 filehandle. Two special filehandles will be used as starting points 1149 for the NFS client. 1151 4.1.1. Root Filehandle 1153 The first of the special filehandles is the ROOT filehandle. The 1154 ROOT filehandle is the "conceptual" root of the file system name 1155 space at the NFS server. The client uses or starts with the ROOT 1156 filehandle by employing the PUTROOTFH operation. The PUTROOTFH 1157 operation instructs the server to set the "current" filehandle to the 1158 ROOT of the server's file tree. Once this PUTROOTFH operation is 1159 used, the client can then traverse the entirety of the server's file 1160 tree with the LOOKUP procedure. A complete discussion of the server 1161 name space is in the section "NFS Server Name Space". 1163 4.1.2. Public Filehandle 1165 The second special filehandle is the PUBLIC filehandle. Unlike the 1166 ROOT filehandle, the PUBLIC filehandle may be bound or represent an 1167 arbitrary file system object at the server. The server is 1168 responsible for this binding. It may be that the PUBLIC filehandle 1169 and the ROOT filehandle refer to the same file system object. 
1170 However, it is up to the administrative software at the server and 1171 the policies of the server administrator to define the binding of the 1172 PUBLIC filehandle and server file system object. The client may not 1173 make any assumptions about this binding. 1175 4.2. Filehandle Types 1177 In the NFS version 2 and 3 protocols, there was one type of 1178 filehandle with a single set of semantics. The NFS version 4 1179 protocol introduces a new type of filehandle in an attempt to 1180 accommodate certain server environments. The first type of 1181 filehandle is "persistent". The semantics of a persistent filehandle 1182 are the same as the filehandles of the NFS version 2 and 3 protocols. 1183 The second or new type of filehandle is the "volatile" filehandle. 1185 The volatile filehandle type is being introduced to address server 1186 functionality or implementation issues which make correct 1187 implementation of a persistent filehandle infeasible. Some server 1188 environments do not provide a file system level invariant that can be 1189 used to construct a persistent filehandle. The underlying server 1190 file system may not provide the invariant or the server's file system 1191 programming interfaces may not provide access to the needed 1192 invariant. Volatile filehandles may ease the implementation of 1193 server functionality such as hierarchical storage management or file 1194 system reorganization or migration. However, the volatile filehandle 1195 increases the implementation burden for the client. This 1196 increased burden is deemed acceptable based on the overall gains 1197 achieved by the protocol. 1199 Since the client will need to handle persistent and volatile 1200 filehandles differently, a file attribute is defined which may be used 1201 by the client to determine the filehandle types being returned by the 1202 server. 1204 4.2.1.
General Properties of a Filehandle 1206 The filehandle contains all the information the server needs to 1207 distinguish an individual file. To the client, the filehandle is 1208 opaque. The client stores filehandles for use in a later request and 1209 can compare two filehandles from the same server for equality by 1210 doing a byte-by-byte comparison. However, the client MUST NOT 1211 otherwise interpret the contents of filehandles. If two filehandles 1212 from the same server are equal, they MUST refer to the same file. If 1213 they are not equal, the client may use information provided by the 1214 server, in the form of file attributes, to determine whether they 1215 denote the same files or different files. The client would do this 1216 as necessary for client side caching. Servers SHOULD try to maintain 1217 a one-to-one correspondence between filehandles and files but this is 1218 not required. Clients MUST use filehandle comparisons only to 1219 improve performance, not for correct behavior. All clients need to 1220 be prepared for situations in which it cannot be determined whether 1221 two filehandles denote the same object and in such cases, avoid 1222 making invalid assumptions which might cause incorrect behavior. 1223 Further discussion of filehandle and attribute comparison in the 1224 context of data caching is presented in the section "Data Caching and 1225 File Identity". 1227 As an example, in the case that two different path names when 1228 traversed at the server terminate at the same file system object, the 1229 server SHOULD return the same filehandle for each path. This can 1230 occur if a hard link is used to create two file names which refer to 1231 the same underlying file object and associated data. For example, if 1232 paths /a/b/c and /a/d/c refer to the same file, the server SHOULD 1233 return the same filehandle for both path name traversals. 1235 4.2.2.
Persistent Filehandle 1237 A persistent filehandle is defined as having a fixed value for the 1238 lifetime of the file system object to which it refers. Once the 1239 server creates the filehandle for a file system object, the server 1240 MUST accept the same filehandle for the object for the lifetime of 1241 the object. If the server restarts or reboots the NFS server must 1242 honor the same filehandle value as it did in the server's previous 1243 instantiation. Similarly, if the file system is migrated, the new 1244 NFS server must honor the same file handle as the old NFS server. 1246 The persistent filehandle will be become stale or invalid when the 1247 file system object is removed. When the server is presented with a 1248 persistent filehandle that refers to a deleted object, it MUST return 1249 an error of NFS4ERR_STALE. A filehandle may become stale when the 1250 file system containing the object is no longer available. The file 1251 system may become unavailable if it exists on removable media and the 1252 media is no longer available at the server or the file system in 1253 whole has been destroyed or the file system has simply been removed 1254 from the server's name space (i.e. unmounted in a Unix environment). 1256 4.2.3. Volatile Filehandle 1258 A volatile filehandle does not share the same longevity 1259 characteristics of a persistent filehandle. The server may determine 1260 that a volatile filehandle is no longer valid at many different 1261 points in time. If the server can definitively determine that a 1262 volatile filehandle refers to an object that has been removed, the 1263 server should return NFS4ERR_STALE to the client (as is the case for 1264 persistent filehandles). In all other cases where the server 1265 determines that a volatile filehandle can no longer be used, it 1266 should return an error of NFS4ERR_FHEXPIRED. 
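
   The choice between the two errors described above can be sketched in
   Python as follows.  This is only an illustration: the function name
   and its two boolean inputs are hypothetical, and how a server learns
   that an object has been removed is implementation specific.

```python
# Error names are kept as strings; no numeric protocol values are assumed.
NFS4ERR_STALE = "NFS4ERR_STALE"
NFS4ERR_FHEXPIRED = "NFS4ERR_FHEXPIRED"


def error_for_unusable_filehandle(is_volatile, object_known_removed):
    """Pick the error for a filehandle the server can no longer use.

    Both parameters are hypothetical inputs for this sketch.
    """
    if object_known_removed:
        # Removed objects yield NFS4ERR_STALE for both filehandle types.
        return NFS4ERR_STALE
    if is_volatile:
        # Any other reason a volatile filehandle is unusable.
        return NFS4ERR_FHEXPIRED
    # A persistent filehandle is never reported as expired; for example,
    # its file system may no longer be available.
    return NFS4ERR_STALE
```

   A client would typically give up on NFS4ERR_STALE and attempt
   recovery on NFS4ERR_FHEXPIRED.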
   The mandatory attribute "fh_expire_type" is used by the client to
   determine what type of filehandle the server is providing for a
   particular file system.  This attribute is a bitmask with the
   following values:

   FH4_PERSISTENT
      The value of FH4_PERSISTENT is used to indicate a persistent
      filehandle, which is valid until the object is removed from the
      file system.  The server will not return NFS4ERR_FHEXPIRED for
      this filehandle.  FH4_PERSISTENT is defined as a value in which
      none of the bits specified below are set.

   FH4_NOEXPIRE_WITH_OPEN
      The filehandle will not expire while the client has the file
      open.  If this bit is set, then the values FH4_VOLATILE_ANY or
      FH4_VOL_RENAME do not impact expiration while the file is open.
      Once the file is closed or if the FH4_NOEXPIRE_WITH_OPEN bit is
      not set, the rest of the volatile related bits apply.

   FH4_VOLATILE_ANY
      The filehandle may expire at any time and will expire during
      file system migration and rename.

   FH4_VOL_MIGRATION
      The filehandle will expire during file system migration.  May
      only be set if FH4_VOLATILE_ANY is not set.

   FH4_VOL_RENAME
      The filehandle may expire due to a rename.  This includes a
      rename by the requesting client or a rename by another client.
      May only be set if FH4_VOLATILE_ANY is not set.

   Servers which provide volatile filehandles should deny a RENAME or
   REMOVE that would affect an OPEN file or any of the components
   leading to the OPEN file.  In addition, the server should deny all
   RENAME or REMOVE requests during the grace or lease period upon
   server restart.

   The reader may be wondering why there are three FH4_VOL* bits and why
   FH4_VOLATILE_ANY is exclusive of FH4_VOL_MIGRATION and
   FH4_VOL_RENAME.
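
   As an illustration of the fh_expire_type bit definitions, a client
   might check and interpret the bitmask along the following lines.
   This Python sketch makes several assumptions: the numeric bit values
   are illustrative (the normative values are those in the protocol's
   XDR description), the event labels "migration" and "rename" are
   hypothetical, and an open file with FH4_NOEXPIRE_WITH_OPEN set is
   treated as fully shielded from FH4_VOLATILE_ANY.

```python
# Assumed bit values, for illustration only.
FH4_PERSISTENT = 0x00000000
FH4_NOEXPIRE_WITH_OPEN = 0x00000001
FH4_VOLATILE_ANY = 0x00000002
FH4_VOL_MIGRATION = 0x00000004
FH4_VOL_RENAME = 0x00000008


def is_valid_expire_type(bits):
    """FH4_VOLATILE_ANY excludes FH4_VOL_MIGRATION and FH4_VOL_RENAME."""
    if bits & FH4_VOLATILE_ANY:
        return not (bits & (FH4_VOL_MIGRATION | FH4_VOL_RENAME))
    return True


def may_expire(bits, event, file_is_open):
    """Return True if the filehandle may expire when `event` occurs.

    `event` is a hypothetical label: "migration", "rename", or any
    other string for an unspecified event.
    """
    shielded = file_is_open and bool(bits & FH4_NOEXPIRE_WITH_OPEN)
    if bits & FH4_VOLATILE_ANY:
        # May expire at any time unless the open file is shielded.
        return not shielded
    if event == "migration":
        # The text exempts only VOLATILE_ANY and VOL_RENAME while open.
        return bool(bits & FH4_VOL_MIGRATION)
    if event == "rename":
        return bool(bits & FH4_VOL_RENAME) and not shielded
    return False
```

   A value of FH4_PERSISTENT (no bits set) never reports expiration in
   this sketch, matching the definition of a persistent filehandle.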
   If a filehandle is normally persistent but cannot persist across a
   file set migration, then the presence of the FH4_VOL_MIGRATION or
   FH4_VOL_RENAME tells the client that it can treat the filehandle as
   persistent for purposes of maintaining a file name to filehandle
   cache, except for the specific event described by the bit.  However,
   FH4_VOLATILE_ANY tells the client that it should not maintain such a
   cache for unopened files.  A server MUST NOT present
   FH4_VOLATILE_ANY with FH4_VOL_MIGRATION or FH4_VOL_RENAME as this
   will lead to confusion.  FH4_VOLATILE_ANY implies that the
   filehandle will expire upon migration or rename, in addition to
   other events.

4.2.4.  One Method of Constructing a Volatile Filehandle

   As mentioned, in some instances a filehandle is stale (no longer
   valid; perhaps because the file was removed from the server) or it is
   expired (the underlying file is valid but since the filehandle is
   volatile, it may have expired).  Thus the server needs to be able to
   return NFS4ERR_STALE in the former case and NFS4ERR_FHEXPIRED in the
   latter case.  This can be done by careful construction of the
   volatile filehandle.  One possible implementation follows.

   A volatile filehandle, while opaque to the client, could contain:

   [volatile bit = 1 | server boot time | slot | generation number]

   o  slot is an index in the server volatile filehandle table

   o  generation number is the generation number for the table
      entry/slot

   If the server boot time is less than the current server boot time,
   return NFS4ERR_FHEXPIRED.  If slot is out of range, return
   NFS4ERR_BADHANDLE.  If the generation number does not match, return
   NFS4ERR_FHEXPIRED.

   When the server reboots, the table is gone (it is volatile).

   If the volatile bit is 0, then it is a persistent filehandle with a
   different structure following it.

4.3.  Client Recovery from Filehandle Expiration

   If possible, the client SHOULD recover from the receipt of an
   NFS4ERR_FHEXPIRED error.  The client must take on additional
   responsibility so that it may prepare itself to recover from the
   expiration of a volatile filehandle.  If the server returns
   persistent filehandles, the client does not need these additional
   steps.

   For volatile filehandles, most commonly the client will need to store
   the component names leading up to and including the file system
   object in question.  With these names, the client should be able to
   recover by finding a filehandle in the name space that is still
   available or by starting at the root of the server's file system name
   space.

   If the expired filehandle refers to an object that has been removed
   from the file system, obviously the client will not be able to
   recover from the expired filehandle.

   It is also possible that the expired filehandle refers to a file that
   has been renamed.  If the file was renamed by another client, again
   it is possible that the original client will not be able to recover.
   However, in the case that the client itself is renaming the file and
   the file is open, it is possible that the client may be able to
   recover.  The client can determine the new path name based on the
   processing of the rename request.  The client can then regenerate the
   new filehandle based on the new path name.  The client could also use
   the compound operation mechanism to construct a set of operations
   like:

         RENAME A B
         LOOKUP B
         GETFH

5.  File Attributes

   To meet the requirements of extensibility and increased
   interoperability with non-Unix platforms, attributes must be handled
   in a flexible manner.  The NFS Version 3 fattr3 structure contains a
   fixed list of attributes that not all clients and servers are able to
   support or care about.
   The fattr3 structure cannot be extended as new needs arise and it
   provides no way to indicate non-support.  With the NFS Version 4
   protocol, the client will be able to ask what attributes the server
   supports and will be able to request only those attributes in which
   it is interested.

   To this end, attributes will be divided into three groups: mandatory,
   recommended, and named.  Both mandatory and recommended attributes
   are supported in the NFS version 4 protocol by a specific and
   well-defined encoding and are identified by number.  They are
   requested by setting a bit in the bit vector sent in the GETATTR
   request; the server response includes a bit vector to list what
   attributes were returned in the response.  New mandatory or
   recommended attributes may be added to the NFS protocol between major
   revisions by publishing a standards-track RFC which allocates a new
   attribute number value and defines the encoding for the attribute.
   See the section "Minor Versioning" for further discussion.

   Named attributes are accessed by the new OPENATTR operation, which
   accesses a hidden directory of attributes associated with a file
   system object.  OPENATTR takes a filehandle for the object and
   returns the filehandle for the attribute hierarchy.  The filehandle
   for the named attributes is a directory object accessible by LOOKUP
   or READDIR and contains files whose names represent the named
   attributes and whose data bytes are the value of the attribute.  For
   example:

         LOOKUP     "foo"       ; look up file
         GETATTR    attrbits
         OPENATTR               ; access foo's named attributes
         LOOKUP     "x11icon"   ; look up specific attribute
         READ       0,4096      ; read stream of bytes

   Named attributes are intended for data needed by applications rather
   than by an NFS client implementation.
   NFS implementors are strongly encouraged to define their new
   attributes as recommended attributes by bringing them to the IETF
   standards-track process.

   The set of attributes which are classified as mandatory is
   deliberately small since servers must do whatever it takes to support
   them.  The recommended attributes may be unsupported, though a server
   should support as many as it can.  Attributes are deemed mandatory if
   the data is both needed by a large number of clients and is not
   otherwise reasonably computable by the client when support is not
   provided on the server.

5.1.  Mandatory Attributes

   These MUST be supported by every NFS Version 4 client and server in
   order to ensure a minimum level of interoperability.  The server must
   store and return these attributes and the client must be able to
   function with an attribute set limited to these attributes.  With
   just the mandatory attributes some client functionality may be
   impaired or limited in some ways.  A client may ask for any of these
   attributes to be returned by setting a bit in the GETATTR request and
   the server must return their value.

5.2.  Recommended Attributes

   These attributes are understood well enough to warrant support in the
   NFS Version 4 protocol.  However, they may not be supported on all
   clients and servers.  A client may ask for any of these attributes to
   be returned by setting a bit in the GETATTR request but must handle
   the case where the server does not return them.  A client may ask for
   the set of attributes the server supports and should not request
   attributes the server does not support.  A server should be tolerant
   of requests for unsupported attributes and simply not return them
   rather than considering the request an error.
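
   The request and response bit vectors described above can be
   illustrated with a short Python sketch.  It is a sketch under stated
   assumptions: the helper names are hypothetical, a Python integer
   stands in for the bit vector, and the split into 32-bit words with
   the lowest-numbered attributes first is an assumed wire layout.  The
   attribute numbers are those listed in the attribute definition
   tables of this document.

```python
# Attribute numbers taken from the attribute definition tables.
ATTR_SUPP_ATTR = 0
ATTR_TYPE = 1
ATTR_SIZE = 4
ATTR_MODE = 33
ATTR_TIME_MODIFY = 53


def make_bitmap(attr_numbers):
    """Build a bit vector with one bit set per attribute number."""
    bits = 0
    for number in attr_numbers:
        bits |= 1 << number
    return bits


def to_words(bits):
    """Split the bit vector into 32-bit words (assumed layout)."""
    words = []
    while bits:
        words.append(bits & 0xFFFFFFFF)
        bits >>= 32
    return words


def restrict_to_supported(wanted, supported):
    """Drop attributes the server did not list in supp_attr."""
    return wanted & supported


def missing_attrs(requested, returned):
    """List attribute numbers requested but absent from the reply."""
    diff = requested & ~returned
    return [n for n in range(diff.bit_length()) if (diff >> n) & 1]
```

   A client would first fetch supp_attr, mask its request with
   restrict_to_supported(), and still use missing_attrs() on the reply,
   since a server may silently omit attributes it does not support.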
   It is expected that servers will support all attributes they
   comfortably can and only fail to support attributes which are
   difficult to support in their operating environments.  A server
   should provide attributes whenever it does not have to "tell lies" to
   the client.  For example, a file modification time should be either
   an accurate time or should not be supported by the server.  This will
   not always be comfortable to clients but it seems that the client has
   a better ability to fabricate or construct an attribute or do without
   the attribute.

5.3.  Named Attributes

   These attributes are not supported by direct encoding in the NFS
   Version 4 protocol but are accessed by string names rather than
   numbers and correspond to an uninterpreted stream of bytes which is
   stored with the file system object.  The name space for these
   attributes may be accessed by using the OPENATTR operation.  The
   OPENATTR operation returns a filehandle for a virtual "attribute
   directory" and further perusal of the name space may be done using
   READDIR and LOOKUP operations on this filehandle.  Named attributes
   may then be examined or changed by normal READ and WRITE and CREATE
   operations on the filehandles returned from READDIR and LOOKUP.
   Named attributes may have attributes.

   It is recommended that servers support arbitrary named attributes.  A
   client should not depend on the ability to store any named attributes
   in the server's file system.  If a server does support named
   attributes, a client which is also able to handle them should be able
   to copy a file's data and meta-data with complete transparency from
   one location to another; this would imply that names allowed for
   regular directory entries are valid for named attribute names as
   well.

   Names of attributes will not be controlled by this document or other
   IETF standards track documents.
   See the section "IANA Considerations" for further discussion.

5.4.  Mandatory Attributes - Definitions

   Name              #    DataType     Access   Description
   ___________________________________________________________________
   supp_attr         0    bitmap       READ
         The bit vector which would retrieve all mandatory and
         recommended attributes that are supported for this object.

   type              1    nfs4_ftype   READ
         The type of the object (file, directory, symlink)

   fh_expire_type    2    uint32       READ
         Server uses this to specify filehandle expiration behavior
         to the client.  See the section "Filehandles" for additional
         description.

   change            3    uint64       READ
         A value created by the server that the client can use to
         determine if file data, directory contents or attributes of
         the object have been modified.  The server may return the
         object's time_modify attribute for this attribute's value
         but only if the file system object cannot be updated more
         frequently than the resolution of time_modify.

   size              4    uint64       R/W
         The size of the object in bytes.

   link_support      5    bool         READ
         Does the object's file system support hard links?

   symlink_support   6    bool         READ
         Does the object's file system support symbolic links?

   named_attr        7    bool         READ
         Does this object have named attributes?

   fsid              8    fsid4        READ
         Unique file system identifier for the file system holding
         this object.  fsid contains major and minor components, each
         of which is a uint64.

   unique_handles    9    bool         READ
         Are two distinct filehandles guaranteed to refer to two
         different file system objects?

   lease_time        10   nfs_lease4   READ
         Duration of leases at server in seconds.

   rdattr_error      11   enum         READ
         Error returned from getattr during readdir.

   filehandle        19   nfs_fh4      READ
         The filehandle of this object (primarily for readdir
         requests).

5.5.  Recommended Attributes - Definitions

   Name              #    Data Type    Access   Description
   _____________________________________________________________________
   acl               12   nfsace4<>    R/W
         The access control list for the object.

   aclsupport        13   uint32       READ
         Indicates what types of ACLs are supported on the current
         file system.

   archive           14   bool         R/W
         Whether or not this file has been archived since the time of
         last modification (deprecated in favor of time_backup).

   cansettime        15   bool         READ
         Is the server able to change the times for a file system
         object as specified in a SETATTR operation?

   case_insensitive  16   bool         READ
         Are filename comparisons on this file system case
         insensitive?

   case_preserving   17   bool         READ
         Is filename case on this file system preserved?

   chown_restricted  18   bool         READ
         If TRUE, the server will reject any request to change either
         the owner or the group associated with a file if the caller
         is not a privileged user (for example, "root" in Unix
         operating environments or in NT the "Take Ownership"
         privilege)

   fileid            20   uint64       READ
         A number uniquely identifying the file within the file
         system.

   files_avail       21   uint64       READ
         File slots available to this user on the file system
         containing this object - this should be the smallest
         relevant limit.

   files_free        22   uint64       READ
         Free file slots on the file system containing this object -
         this should be the smallest relevant limit.

   files_total       23   uint64       READ
         Total file slots on the file system containing this object.

   fs_locations      24   fs_locations READ
         Locations where this file system may be found.  If the
         server returns NFS4ERR_MOVED as an error, this attribute
         must be supported.

   hidden            25   bool         R/W
         Is file considered hidden with respect to the WIN32 API?

   homogeneous       26   bool         READ
         Whether or not this object's file system is homogeneous,
         i.e., whether the per file system attributes are the same
         for all of the file system's objects.

   maxfilesize       27   uint64       READ
         Maximum supported file size for the file system of this
         object.

   maxlink           28   uint32       READ
         Maximum number of links for this object.

   maxname           29   uint32       READ
         Maximum filename size supported for this object.

   maxread           30   uint64       READ
         Maximum read size supported for this object.

   maxwrite          31   uint64       READ
         Maximum write size supported for this object.  This
         attribute SHOULD be supported if the file is writable.  Lack
         of this attribute can lead to the client either wasting
         bandwidth or not receiving the best performance.

   mimetype          32   utf8<>       R/W
         MIME body type/subtype of this object.

   mode              33   mode4        R/W
         Unix-style permission bits for this object (deprecated in
         favor of ACLs)

   no_trunc          34   bool         READ
         If a name longer than name_max is used, will an error be
         returned or will the name be truncated?

   numlinks          35   uint32       READ
         Number of hard links to this object.

   owner             36   utf8<>       R/W
         The string name of the owner of this object.

   owner_group       37   utf8<>       R/W
         The string name of the group ownership of this object.

   quota_avail_hard  38   uint64       READ
         For definition see "Quota Attributes" section below.

   quota_avail_soft  39   uint64       READ
         For definition see "Quota Attributes" section below.

   quota_used        40   uint64       READ
         For definition see "Quota Attributes" section below.

   rawdev            41   specdata4    READ
         Raw device identifier.  Unix device major/minor node
         information.

   space_avail       42   uint64       READ
         Disk space in bytes available to this user on the file
         system containing this object - this should be the smallest
         relevant limit.

   space_free        43   uint64       READ
         Free disk space in bytes on the file system containing this
         object - this should be the smallest relevant limit.

   space_total       44   uint64       READ
         Total disk space in bytes on the file system containing this
         object.

   space_used        45   uint64       READ
         Number of file system bytes allocated to this object.

   system            46   bool         R/W
         Is this file a system file with respect to the WIN32 API?

   time_access       47   nfstime4     READ
         The time of last access to the object.

   time_access_set   48   settime4     WRITE
         Set the time of last access to the object.  SETATTR use
         only.

   time_backup       49   nfstime4     R/W
         The time of last backup of the object.

   time_create       50   nfstime4     R/W
         The time of creation of the object.  This attribute does not
         have any relation to the traditional Unix file attribute
         "ctime" or "change time".

   time_delta        51   nfstime4     READ
         Smallest useful server time granularity.

   time_metadata     52   nfstime4     R/W
         The time of last meta-data modification of the object.

   time_modify       53   nfstime4     READ
         The time of last modification to the object.

   time_modify_set   54   settime4     WRITE
         Set the time of last modification to the object.  SETATTR
         use only.

5.6.  Interpreting owner and owner_group

   The recommended attributes "owner" and "owner_group" (and also users
   and groups within the "acl" attribute) are represented in terms of a
   UTF-8 string.  To avoid a representation that is tied to a particular
   underlying implementation at the client or server, the use of the
   UTF-8 string has been chosen.  Note that section 6.1 of [RFC2624]
   provides additional rationale.  It is expected that the client and
   server will have their own local representation of owner and
   owner_group that is used for local storage or presentation to the end
   user.  Therefore, it is expected that when these attributes are
   transferred between the client and server, the local representation
   is translated to a syntax of the form "user@dns_domain".  This allows
   a client and server that do not use the same local representation to
   translate to a common syntax that can be interpreted by both.

   Similarly, security principals may be represented in different ways
   by different security mechanisms.  Servers normally translate these
   representations into a common format, generally that used by local
   storage, to serve as a means of identifying the users corresponding
   to these security principals.  When these local identifiers are
   translated to the form of the owner attribute associated with files
   created by such principals, they identify, in a common format, the
   users associated with each corresponding set of security principals.

   The translation used to interpret owner and group strings is not
   specified as part of the protocol.  This allows various solutions to
   be employed.  For example, a local translation table may be consulted
   that maps between a numeric id and the user@dns_domain syntax.
   A name service may also be used to accomplish the translation.  A
   server may provide a more general service, not limited by any
   particular translation (which would only translate a limited set of
   possible strings) by storing the owner and owner_group attributes in
   local storage without any translation or it may augment a translation
   method by storing the entire string for attributes for which no
   translation is available while using the local representation for
   those cases in which a translation is available.

   Servers that do not provide support for all possible values of the
   owner and owner_group attributes should return an error
   (NFS4ERR_BADOWNER) when a string is presented that has no
   translation, as the value to be set for a SETATTR of the owner,
   owner_group, or acl attributes.  When a server does accept an owner
   or owner_group value as valid on a SETATTR (and similarly for the
   owner and group strings in an acl), it is promising to return that
   same string when a corresponding GETATTR is done.  Configuration
   changes and ill-constructed name translations (those that contain
   aliasing) may make that promise impossible to honor.  Servers should
   make appropriate efforts to avoid a situation in which these
   attributes have their values changed when no real change to ownership
   has occurred.

   The "dns_domain" portion of the owner string is meant to be a DNS
   domain name, for example, user@ietf.org.  Servers should accept as
   valid a set of users for at least one domain.  A server may treat
   other domains as having no valid translations.  A more general
   service is provided when a server is capable of accepting users for
   multiple domains, or for all domains, subject to security
   constraints.

   In the case where there is no translation available to the client or
   server, the attribute value must be constructed without the "@".
   Therefore, the absence of the @ from the owner or owner_group
   attribute signifies that no translation was available at the sender
   and that the receiver of the attribute should not use that string as
   a basis for translation into its own internal format.  Even though
   the attribute value cannot be translated, it may still be useful.
   In the case of a client, the attribute string may be used for local
   display of ownership.

   To provide a greater degree of compatibility with previous versions
   of NFS (i.e., v2 and v3), which identified users and groups by 32-bit
   unsigned uid's and gid's, owner and group strings that consist of
   decimal numeric values with no leading zeros can be given a special
   interpretation by clients and servers which choose to provide such
   support.  The receiver may treat such a user or group string as
   representing the same user as would be represented by a v2/v3 uid or
   gid having the corresponding numeric value.  A server is not
   obligated to accept such a string, but may return an NFS4ERR_BADOWNER
   instead.  To avoid this mechanism being used to subvert user and
   group translation, so that a client might pass all of the owners and
   groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER
   error when there is a valid translation for the user or group
   designated in this way.  In that case, the client must use the
   appropriate name@domain string and not the special form for
   compatibility.

   The owner string "nobody" may be used to designate an anonymous user,
   which will be associated with a file created by a security principal
   that cannot be mapped through normal means to the owner attribute.

5.7.  Character Case Attributes

   With respect to the case_insensitive and case_preserving attributes,
   each UCS-4 character (which UTF-8 encodes) has a "long descriptive
   name" [RFC1345] which may or may not include the word "CAPITAL" or
   "SMALL".  The presence of SMALL or CAPITAL allows an NFS server to
   implement unambiguous and efficient table driven mappings for case
   insensitive comparisons, and non-case-preserving storage.  For
   general character handling and internationalization issues, see the
   section "Internationalization".

5.8.  Quota Attributes

   For the attributes related to file system quotas, the following
   definitions apply:

   quota_avail_soft
      The value in bytes which represents the amount of additional
      disk space that can be allocated to this file or directory
      before the user may reasonably be warned.  It is understood that
      this space may be consumed by allocations to other files or
      directories though there is a rule as to which other files or
      directories.

   quota_avail_hard
      The value in bytes which represents the amount of additional disk
      space beyond the current allocation that can be allocated to
      this file or directory before further allocations will be
      refused.  It is understood that this space may be consumed by
      allocations to other files or directories.

   quota_used
      The value in bytes which represents the amount of disk space used
      by this file or directory and possibly a number of other similar
      files or directories, where the set of "similar" meets at least
      the criterion that allocating space to any file or directory in
      the set will reduce the "quota_avail_hard" of every other file
      or directory in the set.

      Note that there may be a number of distinct but overlapping sets
      of files or directories for which a quota_used value is
      maintained, e.g., "all files with a given owner", "all files with
      a given group owner", etc.

      The server is at liberty to choose any of those sets but should
      do so in a repeatable way.  The rule may be configured per-
      filesystem or may be "choose the set with the smallest quota".

5.9.  Access Control Lists

   The NFS ACL attribute is an array of access control entries (ACE).
   There are various access control entry types.  The server is able to
   communicate which ACE types are supported by returning the
   appropriate value within the aclsupport attribute.  The types of ACEs
   are defined as follows:

   Type    Description
   _____________________________________________________
   ALLOW
           Explicitly grants the access defined in
           acemask4 to the file or directory.

   DENY
           Explicitly denies the access defined in
           acemask4 to the file or directory.

   AUDIT
           LOG (system dependent) any access
           attempt to a file or directory which
           uses any of the access methods specified
           in acemask4.

   ALARM
           Generate a system ALARM (system
           dependent) when any access attempt is
           made to a file or directory for the
           access methods specified in acemask4.

   The NFS ACE attribute is defined as follows:

   typedef uint32_t        acetype4;
   typedef uint32_t        aceflag4;
   typedef uint32_t        acemask4;

   struct nfsace4 {
           acetype4        type;
           aceflag4        flag;
           acemask4        access_mask;
           utf8string      who;
   };

   To determine if an ACCESS or OPEN request succeeds, each nfsace4
   entry is processed in order by the server.  Only ACEs which have a
   "who" that matches the requester are considered.  Each ACE is
   processed until all of the bits of the requester's access have been
   ALLOWED.  Once a bit (see below) has been ALLOWED by an
   ACCESS_ALLOWED_ACE, it is no longer considered in the processing of
   later ACEs.
   If an ACCESS_DENIED_ACE is encountered where the requester's access
   still has unALLOWED bits in common with the "access_mask" of the
   ACE, the request is denied.

   The bitmask constants used to represent the above definitions within
   the aclsupport attribute are as follows:

   const ACL4_SUPPORT_ALLOW_ACL    = 0x00000001;
   const ACL4_SUPPORT_DENY_ACL     = 0x00000002;
   const ACL4_SUPPORT_AUDIT_ACL    = 0x00000004;
   const ACL4_SUPPORT_ALARM_ACL    = 0x00000008;

5.9.1.  ACE type

   The semantics of the "type" field follow the descriptions provided
   above.

   The constants used for the type field are as follows:

   const ACE4_ACCESS_ALLOWED_ACE_TYPE      = 0x00000000;
   const ACE4_ACCESS_DENIED_ACE_TYPE       = 0x00000001;
   const ACE4_SYSTEM_AUDIT_ACE_TYPE        = 0x00000002;
   const ACE4_SYSTEM_ALARM_ACE_TYPE        = 0x00000003;

5.9.2.  ACE flag

   The "flag" field contains values based on the following descriptions.

   ACE4_FILE_INHERIT_ACE

   Can be placed on a directory and indicates that this ACE should be
   added to each new non-directory file created.

   ACE4_DIRECTORY_INHERIT_ACE

   Can be placed on a directory and indicates that this ACE should be
   added to each new directory created.

   ACE4_INHERIT_ONLY_ACE

   Can be placed on a directory but does not apply to the directory,
   only to newly created files/directories as specified by the above two
   flags.

   ACE4_NO_PROPAGATE_INHERIT_ACE

   Can be placed on a directory.  Normally when a new directory is
   created and an ACE exists on the parent directory which is marked
   ACE4_DIRECTORY_INHERIT_ACE, two ACEs are placed on the new directory:
   one for the directory itself and one which is an inheritable ACE for
   newly created directories.  This flag tells the server to not place
   an ACE on the newly created directory which is inheritable by
   subdirectories of the created directory.
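
   The four inheritance flags described so far can be illustrated with
   a small Python sketch that derives the ACEs for a newly created
   object from its parent directory's ACL.  It is a sketch under stated
   assumptions, not a definitive algorithm: the Ace class and the
   inherited_aces() function are hypothetical, the flag values are the
   constants given for the flag field in this section, clearing the
   inheritance flags on the ACE that applies to the new object is an
   assumption, and so is marking the second, inheritable ACE with
   ACE4_INHERIT_ONLY_ACE.

```python
from dataclasses import dataclass, replace

ACE4_FILE_INHERIT_ACE = 0x00000001
ACE4_DIRECTORY_INHERIT_ACE = 0x00000002
ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004
ACE4_INHERIT_ONLY_ACE = 0x00000008
INHERIT_FLAGS = (ACE4_FILE_INHERIT_ACE | ACE4_DIRECTORY_INHERIT_ACE |
                 ACE4_NO_PROPAGATE_INHERIT_ACE | ACE4_INHERIT_ONLY_ACE)


@dataclass(frozen=True)
class Ace:
    """Hypothetical in-memory form of nfsace4."""
    type: int
    flag: int
    access_mask: int
    who: str


def inherited_aces(parent_acl, child_is_directory):
    """Return the ACEs a new child receives from its parent's ACL."""
    result = []
    for ace in parent_acl:
        if not child_is_directory:
            if ace.flag & ACE4_FILE_INHERIT_ACE:
                # ACE that applies to the new non-directory file.
                result.append(replace(ace, flag=ace.flag & ~INHERIT_FLAGS))
            continue
        if not ace.flag & ACE4_DIRECTORY_INHERIT_ACE:
            continue
        # First ACE: applies to the new directory itself.
        result.append(replace(ace, flag=ace.flag & ~INHERIT_FLAGS))
        if not ace.flag & ACE4_NO_PROPAGATE_INHERIT_ACE:
            # Second ACE: inheritable by subdirectories of the new one.
            result.append(replace(ace, flag=ace.flag | ACE4_INHERIT_ONLY_ACE))
    return result
```

   With ACE4_NO_PROPAGATE_INHERIT_ACE set on the parent's ACE, only the
   first of the two ACEs is produced, so inheritance stops after one
   level.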

   ACE4_SUCCESSFUL_ACCESS_ACE_FLAG

   ACE4_FAILED_ACCESS_ACE_FLAG

   Both indicate, for AUDIT and ALARM ACEs, which outcome of an access
   causes the event to be logged. On every ACCESS or OPEN call which
   occurs on a file or directory which has an ACE of type
   ACE4_SYSTEM_AUDIT_ACE_TYPE or ACE4_SYSTEM_ALARM_ACE_TYPE, the
   attempted access is compared to the acemask4 of these ACEs. If the
   access is a subset of acemask4 and the identifiers match, an AUDIT
   trail or an ALARM is generated. By default this happens regardless
   of the success or failure of the ACCESS or OPEN call.

   The flag ACE4_SUCCESSFUL_ACCESS_ACE_FLAG only produces the AUDIT or
   ALARM if the ACCESS or OPEN call is successful. The
   ACE4_FAILED_ACCESS_ACE_FLAG causes the ALARM or AUDIT if the ACCESS
   or OPEN call fails.

   ACE4_IDENTIFIER_GROUP

   Indicates that the "who" refers to a GROUP as defined under Unix.

   The bitmask constants used for the flag field are as follows:

   const ACE4_FILE_INHERIT_ACE             = 0x00000001;
   const ACE4_DIRECTORY_INHERIT_ACE        = 0x00000002;
   const ACE4_NO_PROPAGATE_INHERIT_ACE     = 0x00000004;
   const ACE4_INHERIT_ONLY_ACE             = 0x00000008;
   const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG   = 0x00000010;
   const ACE4_FAILED_ACCESS_ACE_FLAG       = 0x00000020;
   const ACE4_IDENTIFIER_GROUP             = 0x00000040;

5.9.3. ACE Access Mask

   The access_mask field contains values based on the following:

      Access              Description
      _______________________________________________________________
      READ_DATA           Permission to read the data of the file
      LIST_DIRECTORY      Permission to list the contents of a
                          directory
      WRITE_DATA          Permission to modify the file's data
      ADD_FILE            Permission to add a new file to a
                          directory
      APPEND_DATA         Permission to append data to a file
      ADD_SUBDIRECTORY    Permission to create a subdirectory in a
                          directory
      READ_NAMED_ATTRS    Permission to read the named attributes
                          of a file
      WRITE_NAMED_ATTRS   Permission to write the named attributes
                          of a file
      EXECUTE             Permission to execute a file
      DELETE_CHILD        Permission to delete a file or directory
                          within a directory
      READ_ATTRIBUTES     The ability to read basic attributes
                          (non-acls) of a file
      WRITE_ATTRIBUTES    Permission to change basic attributes
                          (non-acls) of a file

      DELETE              Permission to Delete the file
      READ_ACL            Permission to Read the ACL
      WRITE_ACL           Permission to Write the ACL
      WRITE_OWNER         Permission to change the owner
      SYNCHRONIZE         Permission to access file locally at the
                          server with synchronous reads and writes

   The bitmask constants used for the access mask field are as follows:

   const ACE4_READ_DATA            = 0x00000001;
   const ACE4_LIST_DIRECTORY       = 0x00000001;
   const ACE4_WRITE_DATA           = 0x00000002;
   const ACE4_ADD_FILE             = 0x00000002;
   const ACE4_APPEND_DATA          = 0x00000004;
   const ACE4_ADD_SUBDIRECTORY     = 0x00000004;
   const ACE4_READ_NAMED_ATTRS     = 0x00000008;
   const ACE4_WRITE_NAMED_ATTRS    = 0x00000010;
   const ACE4_EXECUTE              = 0x00000020;
   const ACE4_DELETE_CHILD         = 0x00000040;
   const ACE4_READ_ATTRIBUTES      = 0x00000080;
   const ACE4_WRITE_ATTRIBUTES     = 0x00000100;

   const ACE4_DELETE               = 0x00010000;
   const ACE4_READ_ACL             = 0x00020000;
   const ACE4_WRITE_ACL            = 0x00040000;
   const ACE4_WRITE_OWNER          = 0x00080000;
   const ACE4_SYNCHRONIZE          = 0x00100000;

5.9.4. ACE who

   There are several special identifiers ("who") which need to be
   understood universally. Some of these identifiers cannot be
   understood when an NFS client accesses the server, but have meaning
   when a local process accesses the file. The ability to display and
   modify these permissions is permitted over NFS.

      Who                 Description
      _______________________________________________________________
      "OWNER"             The owner of the file.
      "GROUP"             The group associated with the file.
      "EVERYONE"          The world.
      "INTERACTIVE"       Accessed from an interactive terminal.
      "NETWORK"           Accessed via the network.
      "DIALUP"            Accessed as a dialup user to the server.
      "BATCH"             Accessed from a batch job.
      "ANONYMOUS"         Accessed without any authentication.
      "AUTHENTICATED"     Any authenticated user (opposite of
                          ANONYMOUS)
      "SERVICE"           Access from a system service.

   To avoid conflict, these special identifiers are distinguished by an
   appended "@" and should appear in the form "xxxx@" (note: no domain
   name after the "@"). For example: ANONYMOUS@.

6. File System Migration and Replication

   With the use of the recommended attribute "fs_locations", the NFS
   version 4 server has a method of providing file system migration or
   replication services. For the purposes of migration and replication,
   a file system will be defined as all files that share a given fsid
   (both major and minor values are the same).

   The fs_locations attribute provides a list of file system locations.
   These locations are specified by providing the server name (either
   DNS domain or IP address) and the path name representing the root of
   the file system.
   Depending on the type of service being provided,
   the list will provide a new location or a set of alternate locations
   for the file system. The client will use this information to
   redirect its requests to the new server.

6.1. Replication

   It is expected that file system replication will be used in the case
   of read-only data. Typically, the file system will be replicated on
   two or more servers. The fs_locations attribute will provide the
   list of these locations to the client. On first access of the file
   system, the client should obtain the value of the fs_locations
   attribute. If, in the future, the client finds the server
   unresponsive, the client may attempt to use another server specified
   by fs_locations.

   If applicable, the client must take the appropriate steps to recover
   valid filehandles from the new server. This is described in more
   detail in the following sections.

6.2. Migration

   File system migration is used to move a file system from one server
   to another. Migration is typically used for a file system that is
   writable and has a single copy. The expected use of migration is for
   load balancing or general resource reallocation. The protocol does
   not specify how the file system will be moved between servers. This
   server-to-server transfer mechanism is left to the server
   implementor. However, the method used to communicate the migration
   event between client and server is specified here.

   Once the servers participating in the migration have completed the
   move of the file system, the error NFS4ERR_MOVED will be returned for
   subsequent requests received by the original server. The
   NFS4ERR_MOVED error is returned for all operations except PUTFH and
   GETATTR. Upon receiving the NFS4ERR_MOVED error, the client will
   obtain the value of the fs_locations attribute.
   The client will then
   use the contents of the attribute to redirect its requests to the
   specified server. To facilitate the use of GETATTR, operations such
   as PUTFH must also be accepted by the server for the migrated file
   system's filehandles. Note that if the server returns NFS4ERR_MOVED,
   the server MUST support the fs_locations attribute.

   If the client requests more attributes than just fs_locations, the
   server may return fs_locations only. This is to be expected since
   the server has migrated the file system and may not have a method of
   obtaining additional attribute data.

   The server implementor needs to be careful in developing a migration
   solution. The server must consider all of the state information
   clients may have outstanding at the server. This includes but is not
   limited to locking/share state, delegation state, and asynchronous
   file writes which are represented by WRITE and COMMIT verifiers. The
   server should strive to minimize the impact on its clients during and
   after the migration process.

6.3. Interpretation of the fs_locations Attribute

   The fs_locations attribute is structured in the following way:

   struct fs_location {
           utf8string      server<>;
           pathname4       rootpath;
   };

   struct fs_locations {
           pathname4       fs_root;
           fs_location     locations<>;
   };

   The fs_location struct is used to represent the location of a file
   system by providing a server name and the path to the root of the
   file system. For a multi-homed server or a set of servers that use
   the same rootpath, an array of server names may be provided. An
   entry in the server array is a UTF8 string and represents one of a
   traditional DNS host name, IPv4 address, or IPv6 address. It is not
   a requirement that all servers that share the same rootpath be listed
   in one fs_location struct. The array of server names is provided for
   convenience.
   Servers that share the same rootpath may also be listed
   in separate fs_location entries in the fs_locations attribute.

   The fs_locations struct and attribute then contain an array of
   locations. Since the name space of each server may be constructed
   differently, the "fs_root" field is provided. The fs_root path is
   the location of the file system in the server's name space.
   Therefore, the fs_root path is only associated with the server from
   which the fs_locations attribute was obtained. The fs_root path is
   meant to aid the client in locating the file system at the various
   servers listed.

   As an example, there is a replicated file system located at two
   servers (servA and servB). At servA the file system is located at
   path "/a/b/c". At servB the file system is located at path "/x/y/z".
   In this example the client accesses the file system first at servA
   with a multi-component lookup path of "/a/b/c/d". Since the client
   used a multi-component lookup to obtain the filehandle at "/a/b/c/d",
   it is unaware that the file system's root is located in servA's name
   space at "/a/b/c". When the client switches to servB, it will need
   to determine that the directory it first referenced at servA is now
   represented by the path "/x/y/z/d" on servB. To facilitate this, the
   fs_locations attribute provided by servA would have a fs_root value
   of "/a/b/c" and two entries in the locations array. One entry will
   be for itself (servA) and the other will be for servB with a path of
   "/x/y/z". With this information, the client is able to substitute
   "/x/y/z" for the "/a/b/c" at the beginning of its access path and
   construct "/x/y/z/d" to use for the new server.

6.4. Filehandle Recovery for Migration or Replication

   Filehandles for file systems that are replicated or migrated
   generally have the same semantics as for file systems that are not
   replicated or migrated. For example, if a file system has persistent
   filehandles and it is migrated to another server, the filehandle
   values for the file system will be valid at the new server.

   For volatile filehandles, the servers involved likely do not have a
   mechanism to transfer filehandle format and content between
   themselves. Therefore, a server may have difficulty in determining
   if a volatile filehandle from an old server should return an error of
   NFS4ERR_FHEXPIRED. For this reason, the client is informed, with the
   use of the fh_expire_type attribute, whether volatile filehandles
   will expire at the migration or replication event. If the bit
   FH4_VOL_MIGRATION is set in the fh_expire_type attribute, the client
   must treat the volatile filehandle as if the server had returned the
   NFS4ERR_FHEXPIRED error. At the migration or replication event in
   the presence of the FH4_VOL_MIGRATION bit, the client will not
   present the original or old volatile filehandle to the new server.
   The client will start its communication with the new server by
   recovering its filehandles using the saved file names.

7. NFS Server Name Space

7.1. Server Exports

   On a UNIX server the name space describes all the files reachable by
   pathnames under the root directory or "/". On a Windows NT server
   the name space constitutes all the files on disks named by mapped
   disk letters. NFS server administrators rarely make the entire
   server's file system name space available to NFS clients. More often
   portions of the name space are made available via an "export"
   feature.
   In previous versions of the NFS protocol, the root
   filehandle for each export is obtained through the MOUNT protocol;
   the client sends a string that identifies the export of name space
   and the server returns the root filehandle for it. The MOUNT
   protocol supports an EXPORTS procedure that will enumerate the
   server's exports.

7.2. Browsing Exports

   The NFS version 4 protocol provides a root filehandle that clients
   can use to obtain filehandles for these exports via a multi-component
   LOOKUP. A common user experience is to use a graphical user
   interface (perhaps a file "Open" dialog window) to find a file via
   progressive browsing through a directory tree. The client must be
   able to move from one export to another export via single-component,
   progressive LOOKUP operations.

   This style of browsing is not well supported by the NFS version 2 and
   3 protocols. The client expects all LOOKUP operations to remain
   within a single server file system. For example, the device
   attribute will not change. This prevents a client from taking name
   space paths that span exports.

   An automounter on the client can obtain a snapshot of the server's
   name space using the EXPORTS procedure of the MOUNT protocol. If it
   understands the server's pathname syntax, it can create an image of
   the server's name space on the client. The parts of the name space
   that are not exported by the server are filled in with a "pseudo file
   system" that allows the user to browse from one mounted file system
   to another. There is a drawback to this representation of the
   server's name space on the client: it is static. If the server
   administrator adds a new export the client will be unaware of it.

7.3. Server Pseudo File System

   NFS version 4 servers avoid this name space inconsistency by
   presenting all the exports within the framework of a single server
   name space.
   An NFS version 4 client uses LOOKUP and READDIR
   operations to browse seamlessly from one export to another. Portions
   of the server name space that are not exported are bridged via a
   "pseudo file system" that provides a view of exported directories
   only. A pseudo file system has a unique fsid and behaves like a
   normal, read only file system.

   Based on the construction of the server's name space, it is possible
   that multiple pseudo file systems may exist. For example,

   /a         pseudo file system
   /a/b       real file system
   /a/b/c     pseudo file system
   /a/b/c/d   real file system

   Each of the pseudo file systems is considered a separate entity and
   therefore will have a unique fsid.

7.4. Multiple Roots

   The DOS and Windows operating environments are sometimes described as
   having "multiple roots". File systems are commonly represented as
   disk letters. MacOS represents file systems as top level names. NFS
   version 4 servers for these platforms can construct a pseudo file
   system above these root names so that disk letters or volume names
   are simply directory names in the pseudo root.

7.5. Filehandle Volatility

   The nature of the server's pseudo file system is that it is a logical
   representation of file system(s) available from the server.
   Therefore, the pseudo file system is most likely constructed
   dynamically when the server is first instantiated. It is expected
   that the pseudo file system may not have an on disk counterpart from
   which persistent filehandles could be constructed. Even though it is
   preferable that the server provide persistent filehandles for the
   pseudo file system, the NFS client should expect that pseudo file
   system filehandles are volatile. This can be confirmed by checking
   the associated "fh_expire_type" attribute for those filehandles in
   question.
   If the filehandles are volatile, the NFS client must be
   prepared to recover a filehandle value (e.g. with a multi-component
   LOOKUP) when receiving an error of NFS4ERR_FHEXPIRED.

7.6. Exported Root

   If the server's root file system is exported, one might conclude that
   a pseudo file system is not needed. This would be wrong. Assume the
   following file systems on a server:

   /       disk1  (exported)
   /a      disk2  (not exported)
   /a/b    disk3  (exported)

   Because disk2 is not exported, disk3 cannot be reached with simple
   LOOKUPs. The server must bridge the gap with a pseudo file system.

7.7. Mount Point Crossing

   The server file system environment may be constructed in such a way
   that one file system contains a directory which is 'covered' or
   mounted upon by a second file system. For example:

   /a/b            (file system 1)
   /a/b/c/d        (file system 2)

   The pseudo file system for this server may be constructed to look
   like:

   /               (place holder/not exported)
   /a/b            (file system 1)
   /a/b/c/d        (file system 2)

   It is the server's responsibility to present a complete pseudo file
   system to the client. If the client sends a lookup request for the
   path "/a/b/c/d", the server's response is the filehandle of the file
   system "/a/b/c/d". In previous versions of the NFS protocol, the
   server would respond with the directory "/a/b/c/d" within the file
   system "/a/b".

   The NFS client will be able to determine if it crosses a server mount
   point by a change in the value of the "fsid" attribute.

7.8. Security Policy and Name Space Presentation

   The application of the server's security policy needs to be carefully
   considered by the implementor. One may choose to limit the
   viewability of portions of the pseudo file system based on the
   server's perception of the client's ability to authenticate itself
   properly.
   However, with the support of multiple security mechanisms
   and the ability to negotiate the appropriate use of these mechanisms,
   the server is unable to properly determine if a client will be able
   to authenticate itself. If, based on its policies, the server
   chooses to limit the contents of the pseudo file system, the server
   may effectively hide file systems from a client that may otherwise
   have legitimate access.

8. File Locking and Share Reservations

   Integrating locking into the NFS protocol necessarily causes it to be
   stateful. With the inclusion of "share" file locks the protocol
   becomes substantially more dependent on state than the traditional
   combination of NFS and NLM [XNFS]. There are three components to
   making this state manageable:

   o  Clear division between client and server

   o  Ability to reliably detect inconsistency in state between client
      and server

   o  Simple and robust recovery mechanisms

   In this model, the server owns the state information. The client
   communicates its view of this state to the server as needed. The
   client is also able to detect inconsistent state before modifying a
   file.

   To support Win32 "share" locks it is necessary to atomically OPEN or
   CREATE files. Having a separate share/unshare operation would not
   allow correct implementation of the Win32 OpenFile API. In order to
   correctly implement share semantics, the previous NFS protocol
   mechanisms used when a file is opened or created (LOOKUP, CREATE,
   ACCESS) need to be replaced. The NFS version 4 protocol has an OPEN
   operation that subsumes the NFS version 3 methodology of LOOKUP,
   CREATE, and ACCESS. However, because many operations require a
   filehandle, the traditional LOOKUP is preserved to map a file name to
   filehandle without establishing state on the server.
   The policy of
   granting access or modifying files is managed by the server based on
   the client's state. These mechanisms can implement policy ranging
   from advisory only locking to full mandatory locking.

8.1. Locking

   It is assumed that manipulating a lock is rare when compared to READ
   and WRITE operations. It is also assumed that crashes and network
   partitions are relatively rare. Therefore it is important that the
   READ and WRITE operations have a lightweight mechanism to indicate if
   they possess a held lock. A lock request contains the heavyweight
   information required to establish a lock and uniquely define the lock
   owner.

   The following sections describe the transition from the heavyweight
   information to the eventual stateid used for most client and server
   locking and lease interactions.

8.1.1. Client ID

   For each LOCK request, the client must identify itself to the server.
   This is done in such a way as to allow for correct lock
   identification and crash recovery. Client identification is
   accomplished with two values.

   o  A verifier that is used to detect client reboots.

   o  A variable length opaque array to uniquely define a client.

         For an operating system this may be a fully qualified host
         name or IP address. For a user level NFS client it may
         additionally contain a process id or other unique sequence.

   The data structure for the Client ID would then appear as:

   struct nfs_client_id {
           opaque verifier[4];
           opaque id<>;
   };

   It is possible through the mis-configuration of a client or the
   existence of a rogue client that two clients end up using the same
   nfs_client_id. This situation is avoided by "negotiating" the
   nfs_client_id between client and server with the use of the
   SETCLIENTID and SETCLIENTID_CONFIRM operations. The following
   describes the two scenarios of negotiation.

   1  Client has never connected to the server

      In this case the client generates an nfs_client_id and
      unless another client has the same nfs_client_id.id field,
      the server accepts the request. The server also records the
      principal (or principal to uid mapping) from the credential
      in the RPC request that contains the nfs_client_id
      negotiation request (SETCLIENTID operation).

      Two clients might still use the same nfs_client_id.id,
      perhaps due to a configuration error. An example is a High
      Availability configuration where the nfs_client_id.id is
      derived from the ethernet controller address and both
      systems have the same address. In this case, the result is
      a switched union that returns, in addition to
      NFS4ERR_CLID_INUSE, the network address (the rpcbind netid
      and universal address) of the client that is using the id.

   2  Client is re-connecting to the server after a client reboot

      In this case, the client still generates an nfs_client_id
      but the nfs_client_id.id field will be the same as the
      nfs_client_id.id generated prior to reboot. If the server
      finds that the principal/uid is equal to the one previously
      "registered" for that nfs_client_id.id, the server creates
      and returns a new clientid in response to the SETCLIENTID.
      If the principal/uid is not equal, then this is a rogue
      client and the request is returned in error. For more
      discussion of crash recovery semantics, see the section on
      "Crash Recovery".

   It is possible for a retransmission of a request to be received by
   the server after the server has acted upon and responded to the
   original client request. Therefore to mitigate effects of the
   retransmission of the SETCLIENTID operation, the client and server
   use a confirmation step. The client uses the SETCLIENTID_CONFIRM
   operation with the server provided clientid to confirm the client's
   use of the new clientid.
   Once the server receives the confirmation from the
   client, the locking state for the client is released.

   In both cases, upon success, NFS4_OK is returned. To help reduce the
   amount of data transferred on OPEN and LOCK, the server will also
   return a unique 64-bit clientid value that is a shorthand reference
   to the nfs_client_id values presented by the client. From this point
   forward, the client will use the clientid to refer to itself.

   The clientid assigned by the server should be chosen so that it will
   not conflict with a clientid previously assigned by the server. This
   applies across server restarts or reboots. When a clientid is
   presented to a server and that clientid is not recognized, as would
   happen after a server reboot, the server will reject the request with
   the error NFS4ERR_STALE_CLIENTID. When this happens, the client must
   obtain a new clientid by use of the SETCLIENTID operation and then
   proceed to any other necessary recovery for the server reboot case
   (See the section "Server Failure and Recovery").

   The client must also employ the SETCLIENTID operation when it
   receives a NFS4ERR_STALE_STATEID error using a stateid derived from
   its current clientid, since this also indicates a server reboot which
   has invalidated the existing clientid (see the next section
   "nfs_lockowner and stateid Definition" for details).

8.1.2. Server Release of Clientid

   If the server determines that the client holds no associated state
   for its clientid, the server may choose to release the clientid. The
   server may make this choice for an inactive client so that resources
   are not consumed by those intermittently active clients. If the
   client contacts the server after this release, the server must ensure
   the client receives the appropriate error so that it will use the
   SETCLIENTID/SETCLIENTID_CONFIRM sequence to establish a new identity.
   It should be clear that the server must be very hesitant to release a
   clientid since the resulting work on the client to recover from such
   an event will be the same burden as if the server had failed and
   restarted. Typically a server would not release a clientid unless
   there had been no activity from that client for many minutes.

8.1.3. nfs_lockowner and stateid Definition

   When requesting a lock, the client must present to the server the
   clientid and an identifier for the owner of the requested lock.
   These two fields are referred to as the nfs_lockowner and are
   defined as follows:

   o  A clientid returned by the server as part of the client's use of
      the SETCLIENTID operation.

   o  A variable length opaque array used to uniquely define the owner
      of a lock managed by the client.

         This may be a thread id, process id, or other unique value.

   When the server grants the lock, it responds with a unique 64-bit
   stateid. The stateid is used as a shorthand reference to the
   nfs_lockowner, since the server will be maintaining the
   correspondence between them.

   The server is free to form the stateid in any manner that it chooses
   as long as it is able to recognize invalid and out-of-date stateids.
   This requirement includes those stateids generated by earlier
   instances of the server. From this, the client can be properly
   notified of a server restart. This notification will occur when the
   client presents a stateid to the server from a previous
   instantiation.

   The server must be able to distinguish the following situations and
   return the error as specified:

   o  The stateid was generated by an earlier server instance (i.e.
      before a server reboot). The error NFS4ERR_STALE_STATEID should
      be returned.

   o  The stateid was generated by the current server instance but the
      stateid no longer designates the current locking state for the
      lockowner-file pair in question (i.e. one or more locking
      operations have occurred). The error NFS4ERR_OLD_STATEID should
      be returned.

         This error condition will only occur when the client issues a
         locking request which changes a stateid while an I/O request
         that uses that stateid is outstanding.

   o  The stateid was generated by the current server instance but the
      stateid does not designate a locking state for any active
      lockowner-file pair. The error NFS4ERR_BAD_STATEID should be
      returned.

         This error condition will occur when there has been a logic
         error on the part of the client or server. This should not
         happen.

   One mechanism that may be used to satisfy these requirements is for
   the server to divide stateids into three fields:

   o  A server verifier which uniquely designates a particular server
      instantiation.

   o  An index into a table of locking-state structures.

   o  A sequence value which is incremented for each stateid that is
      associated with the same index into the locking-state table.

   By matching the incoming stateid and its field values with the state
   held at the server, the server is able to easily determine if a
   stateid is valid for its current instantiation and state. If the
   stateid is not valid, the appropriate error can be supplied to the
   client.

8.1.4. Use of the stateid

   All READ, WRITE and SETATTR operations contain a stateid. For the
   purposes of this section, SETATTR operations which change the size
   attribute of a file are treated as if they are writing the area
   between the old and new size (i.e. the range truncated or added to
   the file by means of the SETATTR), even where SETATTR is not
   explicitly mentioned in the text.
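
   As an illustration of how a server might tell stateids apart, the
   three-field layout suggested in the section "nfs_lockowner and
   stateid Definition" can be sketched in Python. This is a minimal
   sketch, not a required implementation. The field widths (16-bit
   verifier, 32-bit index, 16-bit sequence), the StateTable class and
   its method names are assumptions made for illustration; the text
   fixes only the 64-bit size of a stateid and the three error cases.
   The special stateids of all bits 0 and all bits 1 are assumed to be
   handled before classify() is called.

```python
# Hypothetical field widths: the text fixes only the 64-bit total.
VERIFIER_BITS, INDEX_BITS, SEQ_BITS = 16, 32, 16
SEQ_MASK = (1 << SEQ_BITS) - 1
INDEX_MASK = (1 << INDEX_BITS) - 1

NFS4_OK = "NFS4_OK"
NFS4ERR_STALE_STATEID = "NFS4ERR_STALE_STATEID"
NFS4ERR_OLD_STATEID = "NFS4ERR_OLD_STATEID"
NFS4ERR_BAD_STATEID = "NFS4ERR_BAD_STATEID"


def pack_stateid(verifier, index, seq):
    """Combine verifier, table index and sequence into one 64-bit value."""
    return (verifier << (INDEX_BITS + SEQ_BITS)) | (index << SEQ_BITS) | seq


def unpack_stateid(stateid):
    """Split a 64-bit stateid back into (verifier, index, seq)."""
    return (stateid >> (INDEX_BITS + SEQ_BITS),
            (stateid >> SEQ_BITS) & INDEX_MASK,
            stateid & SEQ_MASK)


class StateTable:
    """Locking-state table for one server instantiation (illustrative)."""

    def __init__(self, verifier):
        self.verifier = verifier  # assumed to change on every server restart
        self.current_seq = {}     # table index -> sequence of the live stateid

    def grant(self, index):
        """Issue the next stateid for the locking state stored at index."""
        seq = (self.current_seq.get(index, -1) + 1) & SEQ_MASK
        self.current_seq[index] = seq
        return pack_stateid(self.verifier, index, seq)

    def classify(self, stateid):
        """Return NFS4_OK or the error matching the three cases in the text."""
        verifier, index, seq = unpack_stateid(stateid)
        if verifier != self.verifier:
            return NFS4ERR_STALE_STATEID  # generated by an earlier instance
        if index not in self.current_seq:
            return NFS4ERR_BAD_STATEID    # no active lockowner-file pair
        if seq != self.current_seq[index]:
            return NFS4ERR_OLD_STATEID    # superseded by a later locking op
        return NFS4_OK
```

   Checking the verifier first means a stateid from before a reboot is
   never matched against a table slot that merely happens to reuse the
   same index.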

   If the nfs_lockowner performs a READ or WRITE in a situation in which
   it has established a lock on the server (and for these purposes any
   OPEN constitutes a share lock) the stateid (previously returned by
   the server) must be used to indicate what locks, including both
   record and share locks, are held by the lockowner. If no state is
   established by the client, either record lock or share lock, a
   stateid of all bits 0 is used. Regardless of whether a stateid of
   all bits 0, or a stateid returned by the server is used, if no
   conflicting locks are held on the file, the server may service the
   READ or WRITE operation. If a conflict with an explicit lock occurs,
   an error is returned for the operation (NFS4ERR_LOCKED). This allows
   "mandatory locking" to be implemented.

   Share locks are established by OPEN operations and by their nature
   are mandatory in that when the OPEN denies READ or WRITE operations,
   that denial results in such operations being rejected with error
   NFS4ERR_LOCKED. Record locks may be implemented by the server as
   either mandatory or advisory, or the choice of mandatory or advisory
   behavior may be determined by the server on the basis of the file
   being accessed. When record locks are advisory, they only prevent
   the granting of conflicting lock requests and have no effect on
   READ's or WRITE's. Mandatory record locks, however, prevent
   conflicting I/O operations; when such operations are attempted, they
   are rejected with NFS4ERR_LOCKED.

   Every stateid other than the special stateid values noted above,
   whether returned by an OPEN-type operation (i.e. OPEN,
   OPEN_DOWNGRADE), or by a LOCK-type operation (i.e. LOCK or LOCKU),
   defines an access mode for the file (i.e.
   READ, WRITE, or READ_WRITE)
   as established by the original OPEN which began the stateid sequence,
   and as modified by subsequent OPEN's and OPEN_DOWNGRADE's within that
   stateid sequence. When a READ, a WRITE, or a SETATTR which specifies
   the size attribute is done, the operation is subject to checking
   against the access mode to verify that the operation is appropriate
   given the OPEN with which the operation is associated.

   In the case of WRITE-type operations (i.e. WRITE's and SETATTR's
   which set size), the server must verify that the access mode allows
   writing and return an NFS4ERR_OPENMODE error if it does not. In the
   case of READ, the server may perform the corresponding check on the
   access mode, or it may choose to allow READ on opens for WRITE only,
   to accommodate clients whose write implementation may unavoidably do
   reads (e.g. due to buffer cache constraints). However, even if
   READ's are allowed in these circumstances, the server MUST still
   check for locks that conflict with the READ (e.g. another open
   specifying denial of READ's). Note that a server which does enforce
   the access mode check on READ's need not explicitly check for
   conflicting share reservations since the existence of OPEN for read
   access guarantees that no conflicting share reservation can exist.

   A stateid of all bits 1 (one) allows READ operations to bypass
   locking checks at the server. However, WRITE operations with a
   stateid with bits all 1 (one) do not bypass locking checks and are
   treated exactly the same as if a stateid of all bits 0 were used.

   An explicit lock may not be granted while a READ or WRITE operation
   with conflicting implicit locking is being performed.
   For the
   purposes of this paragraph, a READ is considered as having an
   implicit shared record lock for the area being read while a WRITE is
   considered as having an implicit exclusive record lock for the area
   being written (and similarly for SETATTR's that set size as discussed
   above).

8.1.5. Sequencing of Lock Requests

   Locking differs from most NFS operations in that it requires
   "at-most-one" semantics that are not provided by ONCRPC. ONCRPC over
   a reliable transport is not sufficient because a sequence of locking
   requests may span multiple TCP connections. In the face of
   retransmission or reordering, lock or unlock requests must have a
   well defined and consistent behavior. To accomplish this, each lock
   request contains a sequence number that is a consecutively increasing
   integer. Different nfs_lockowners have different sequences. The
   server maintains the last sequence number (L) received and the
   response that was returned.

   Note that for requests that contain a sequence number, for each
   nfs_lockowner, there should be no more than one outstanding request.

   If a request (r) with a previous sequence number (r < L) is received,
   it is rejected with the return of error NFS4ERR_BAD_SEQID. Given a
   properly-functioning client, the response to (r) must have been
   received before the last request (L) was sent. If a duplicate of
   the last request (r == L) is received, the stored response is
   returned. If a request beyond the next sequence (r == L + 2) is
   received, it is rejected with the return of error NFS4ERR_BAD_SEQID.
   Sequence history is reinitialized whenever the client verifier
   changes.

   Since the sequence number is represented with an unsigned 32-bit
   integer, the arithmetic involved with the sequence number is mod
   2^32.
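
   The sequence-number rules above can be sketched as follows. This is
   an illustrative model only: the class LockOwnerSeq and its handle()
   method are hypothetical names, responses are plain strings, and the
   sketch assumes that any sequence number other than L or L + 1
   (mod 2^32) is rejected.

```python
from typing import Callable

SEQ_MOD = 1 << 32  # sequence numbers are unsigned 32-bit integers


class LockOwnerSeq:
    """Per-nfs_lockowner sequence state (hypothetical helper)."""

    def __init__(self, last_seqid: int, last_response: str) -> None:
        self.last_seqid = last_seqid        # L in the text above
        self.last_response = last_response  # response returned for L

    def handle(self, seqid: int, execute: Callable[[], str]) -> str:
        # r == L: a retransmission, so replay the stored response.
        if seqid == self.last_seqid:
            return self.last_response
        # r == L + 1 (mod 2^32): the next request in sequence.
        if seqid == (self.last_seqid + 1) % SEQ_MOD:
            self.last_response = execute()
            self.last_seqid = seqid
            return self.last_response
        # Anything else is older than L or beyond the next sequence.
        return "NFS4ERR_BAD_SEQID"
```

   For example, with L equal to 0xFFFFFFFF the next acceptable sequence
   number is 0, and a second request carrying 0 receives the stored
   response without being executed again.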

   It is critical that the server maintain the last response sent to the
   client to provide a more reliable cache of duplicate non-idempotent
   requests than that of the traditional cache described in [Juszczak].
   The traditional duplicate request cache uses a least recently used
   algorithm for removing unneeded requests. However, the last lock
   request and response on a given nfs_lockowner must be cached as long
   as the lock state exists on the server.

8.1.6. Recovery from Replayed Requests

   As described above, the sequence number is per nfs_lockowner. As
   long as the server maintains the last sequence number received and
   follows the methods described above, there are no risks of a
   Byzantine router re-sending old requests. The server need only
   maintain the (nfs_lockowner, sequence number) state as long as there
   are open files or closed files with locks outstanding.

   LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain a sequence
   number and therefore the risk of the replay of these operations
   resulting in undesired effects is non-existent while the server
   maintains the nfs_lockowner state.

8.1.7. Releasing nfs_lockowner State

   When a particular nfs_lockowner no longer holds open or file locking
   state at the server, the server may choose to release the sequence
   number state associated with the nfs_lockowner. The server may make
   this choice based on lease expiration, for the reclamation of server
   memory, or other implementation specific details. In any event, the
   server is able to do this safely only when the nfs_lockowner no
   longer is being utilized by the client. The server may choose to
   hold the nfs_lockowner state in the event that retransmitted requests
   are received. However, the period to hold this state is
   implementation specific.

   In the case that a LOCK, LOCKU, OPEN_DOWNGRADE, or CLOSE is
   retransmitted after the server has previously released the
   nfs_lockowner state, the server will find that the nfs_lockowner has
   no files open and an error will be returned to the client. If the
   nfs_lockowner does have a file open, the stateid will not match and
   again an error is returned to the client.

   In the case that an OPEN is retransmitted and the nfs_lockowner is
   being used for the first time or the nfs_lockowner state has been
   previously released by the server, the use of the OPEN_CONFIRM
   operation will prevent incorrect behavior. When the server observes
   the use of the nfs_lockowner for the first time, it will direct the
   client to perform the OPEN_CONFIRM for the corresponding OPEN. This
   sequence establishes the use of an nfs_lockowner and associated
   sequence number. See the section "OPEN_CONFIRM - Confirm Open" for
   further details.

8.2. Lock Ranges

   The protocol allows a lock owner to request a lock with a byte range
   and then either upgrade or unlock a sub-range of the initial lock.
   It is expected that this will be an uncommon type of request. In any
   case, servers or server file systems may not be able to support
   sub-range lock semantics. In the event that a server receives a
   locking request that represents a sub-range of current locking state
   for the lock owner, the server is allowed to return the error
   NFS4ERR_LOCK_RANGE to signify that it does not support sub-range lock
   operations. Therefore, the client should be prepared to receive this
   error and, if appropriate, report the error to the requesting
   application.
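
   A server that does not support sub-range operations needs some way
   to recognize them. The following sketch shows one possible test,
   assuming ranges are given as (offset, length) pairs with finite
   lengths; the function names are hypothetical and the handling of
   "to end of file" lengths is not modelled.

```python
from typing import Iterable, Tuple

Range = Tuple[int, int]  # (offset, length), both in bytes


def is_proper_subrange(req: Range, held: Range) -> bool:
    """True if req lies inside held without being identical to it."""
    req_start, req_end = req[0], req[0] + req[1]
    held_start, held_end = held[0], held[0] + held[1]
    inside = held_start <= req_start and req_end <= held_end
    return inside and (req_start, req_end) != (held_start, held_end)


def check_lock_range(req: Range, held_by_owner: Iterable[Range]) -> str:
    """Result for a server that refuses sub-range operations."""
    for held in held_by_owner:
        if is_proper_subrange(req, held):
            return "NFS4ERR_LOCK_RANGE"
    return "NFS4_OK"
```

   A request that exactly matches an existing lock, or that only
   partially overlaps one, is not treated as a sub-range by this
   sketch; a real server may draw that line differently.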

   The client is discouraged from combining multiple independent locking
   ranges that happen to be adjacent into a single request since the
   server may not support sub-range requests and for reasons related to
   the recovery of file locking state in the event of server failure.
   As discussed in the section "Server Failure and Recovery" below, the
   server may employ certain optimizations during recovery that work
   effectively only when the client's behavior during lock recovery is
   similar to the client's locking behavior prior to server failure.

8.3. Blocking Locks

   Some clients require the support of blocking locks. The NFS version
   4 protocol must not rely on a callback mechanism and therefore is
   unable to notify a client when a previously denied lock has been
   granted. Clients have no choice but to continually poll for the
   lock. This presents a fairness problem. Two new lock types are
   added, READW and WRITEW, and are used to indicate to the server that
   the client is requesting a blocking lock. The server should maintain
   an ordered list of pending blocking locks. When the conflicting lock
   is released, the server may wait the lease period for the first
   waiting client to re-request the lock. After the lease period
   expires the next waiting client request is allowed the lock. Clients
   are required to poll at an interval sufficiently small that they are
   likely to acquire the lock in a timely manner. The server is not
   required to maintain a list of pending blocked locks as it is used to
   increase fairness and not correct operation. Because of the
   unordered nature of crash recovery, storing of lock state to stable
   storage would be required to guarantee ordered granting of blocking
   locks.

   Servers may also note the lock types and delay returning denial of
   the request to allow extra time for a conflicting lock to be
   released, allowing a successful return. In this way, clients can
   avoid the burden of needlessly frequent polling for blocking locks.
   The server should take care in the length of delay in the event the
   client retransmits the request.

8.4. Lease Renewal

   The purpose of a lease is to allow a server to remove stale locks
   that are held by a client that has crashed or is otherwise
   unreachable. It is not a mechanism for cache consistency and lease
   renewals may not be denied if the lease interval has not expired.

   The following events cause implicit renewal of all of the leases for
   a given client (i.e. all those sharing a given clientid). Each of
   these is a positive indication that the client is still active and
   that the associated state held at the server, for the client, is
   still valid.

   o  An OPEN with a valid clientid.

   o  Any operation made with a valid stateid (CLOSE, DELEGPURGE,
      DELEGRETURN, LOCK, LOCKU, OPEN, OPEN_CONFIRM, OPEN_DOWNGRADE,
      READ, RENEW, SETATTR, SETCLIENTID_CONFIRM, WRITE). This does
      not include the special stateids of all bits 0 or all bits 1.

         Note that if the client had restarted or rebooted, the
         client would not be making these requests without issuing
         the SETCLIENTID/SETCLIENTID_CONFIRM sequence. The use of
         the SETCLIENTID/SETCLIENTID_CONFIRM operations notifies the
         server to drop the locking state associated with the
         client.

         If the server has rebooted, the stateids
         (NFS4ERR_STALE_STATEID error) or the clientid
         (NFS4ERR_STALE_CLIENTID error) will not be valid hence
         preventing spurious renewals.

   This approach allows for low overhead lease renewal which scales
   well.
   In the typical case no extra RPC calls are required for lease
   renewal and in the worst case one RPC is required every lease period
   (i.e. a RENEW operation). The number of locks held by the client is
   not a factor since all state for the client is involved with the
   lease renewal action.

   Since all operations that create a new lease also renew existing
   leases, the server must maintain a common lease expiration time for
   all valid leases for a given client. This lease time can then be
   easily updated upon implicit lease renewal actions.

8.5. Crash Recovery

   The important requirement in crash recovery is that both the client
   and the server know when the other has failed. Additionally, it is
   required that a client sees a consistent view of data across server
   restarts or reboots. All READ and WRITE operations that may have
   been queued within the client or network buffers must wait until the
   client has successfully recovered the locks protecting the READ and
   WRITE operations.

8.5.1. Client Failure and Recovery

   In the event that a client fails, the server may recover the client's
   locks when the associated leases have expired. Conflicting locks
   from another client may only be granted after this lease expiration.
   If the client is able to restart or reinitialize within the lease
   period the client may be forced to wait the remainder of the lease
   period before obtaining new locks.

   To minimize client delay upon restart, lock requests are associated
   with an instance of the client by a client supplied verifier. This
   verifier is part of the initial SETCLIENTID call made by the client.
   The server returns a clientid as a result of the SETCLIENTID
   operation. The client then confirms the use of the clientid with
   SETCLIENTID_CONFIRM.
   The clientid in combination with an opaque
   owner field is then used by the client to identify the lock owner for
   OPEN. This chain of associations is then used to identify all locks
   for a particular client.

   Since the verifier will be changed by the client upon each
   initialization, the server can compare a new verifier to the verifier
   associated with currently held locks and determine that they do not
   match. This signifies the client's new instantiation and subsequent
   loss of locking state. As a result, the server is free to release
   all locks held which are associated with the old clientid which was
   derived from the old verifier.

   For secure environments, a change in the verifier must only cause the
   release of locks associated with the authenticated requester. This
   is required to prevent a rogue entity from freeing otherwise valid
   locks.

   Note that the verifier must have the same uniqueness properties as
   the verifier for the COMMIT operation.

8.5.2. Server Failure and Recovery

   If the server loses locking state (usually as a result of a restart
   or reboot), it must allow clients time to discover this fact and
   re-establish the lost locking state. The client must be able to
   re-establish the locking state without having the server deny valid
   requests because the server has granted conflicting access to another
   client. Likewise, if there is the possibility that clients have not
   yet re-established their locking state for a file, the server must
   disallow READ and WRITE operations for that file. The duration of
   this recovery period is equal to the duration of the lease period.

   A client can determine that server failure (and thus loss of locking
   state) has occurred, when it receives one of two errors. The
   NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a
   reboot or restart.
   The NFS4ERR_STALE_CLIENTID error indicates a
   clientid invalidated by reboot or restart. When either of these is
   received, the client must establish a new clientid (see the section
   "Client ID") and re-establish the locking state as discussed below.

   The period of special handling of locking and READs and WRITEs, equal
   in duration to the lease period, is referred to as the "grace
   period". During the grace period, clients recover locks and the
   associated state by reclaim-type locking requests (i.e. LOCK requests
   with reclaim set to true and OPEN operations with a claim type of
   CLAIM_PREVIOUS). During the grace period, the server must reject
   READ and WRITE operations and non-reclaim locking requests (i.e.
   other LOCK and OPEN operations) with an error of NFS4ERR_GRACE.

   If the server can reliably determine that granting a non-reclaim
   request will not conflict with reclamation of locks by other clients,
   the NFS4ERR_GRACE error does not have to be returned and the
   non-reclaim client request can be serviced. For the server to be
   able to service READ and WRITE operations during the grace period, it
   must again be able to guarantee that no possible conflict could arise
   between an impending reclaim locking request and the READ or WRITE
   operation. If the server is unable to offer that guarantee, the
   NFS4ERR_GRACE error must be returned to the client.

   For a server to provide simple, valid handling during the grace
   period, the easiest method is to simply reject all non-reclaim
   locking requests and READ and WRITE operations by returning the
   NFS4ERR_GRACE error. However, a server may keep information about
   granted locks in stable storage. With this information, the server
   could determine if a regular lock or READ or WRITE operation can be
   safely processed.
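
   The grace-period decision described above reduces to a small
   predicate. The sketch below is illustrative only: the function name
   and parameters are hypothetical, times are seconds on the server's
   own clock, and the provably_conflict_free flag stands in for
   whatever stable-storage check a server might implement. Reclaim
   requests arriving after the grace period are not modelled.

```python
def rejects_with_grace(
    now: float,
    server_start: float,
    lease_period: float,
    is_reclaim: bool,
    provably_conflict_free: bool = False,
) -> bool:
    """True if the request should be answered with NFS4ERR_GRACE."""
    # The grace period lasts one lease period from server restart.
    in_grace = server_start <= now < server_start + lease_period
    # Reclaims are always considered during grace; other requests only
    # if the server can guarantee no conflict with a later reclaim.
    return in_grace and not is_reclaim and not provably_conflict_free
```

   The simplest server always passes provably_conflict_free as False,
   which yields the "reject everything except reclaims" behavior.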

   For example, if a count of locks on a given file is available in
   stable storage, the server can track reclaimed locks for the file and
   when all reclaims have been processed, non-reclaim locking requests
   may be processed. This way the server can ensure that non-reclaim
   locking requests will not conflict with potential reclaim requests.
   With respect to I/O requests, if the server is able to determine that
   there are no outstanding reclaim requests for a file by information
   from stable storage or another similar mechanism, the processing of
   I/O requests could proceed normally for the file.

   To reiterate, for a server that allows non-reclaim lock and I/O
   requests to be processed during the grace period, it MUST determine
   that no lock subsequently reclaimed will be rejected and that no lock
   subsequently reclaimed would have prevented any I/O operation
   processed during the grace period.

   Clients should be prepared for the return of NFS4ERR_GRACE errors for
   non-reclaim lock and I/O requests. In this case the client should
   employ a retry mechanism for the request. A delay (on the order of
   several seconds) between retries should be used to avoid overwhelming
   the server. Further discussion of the general issue is included in
   [Floyd]. The client must account for the server that is able to
   perform I/O and non-reclaim locking requests within the grace period
   as well as those that cannot do so.

   A reclaim-type locking request outside the server's grace period can
   only succeed if the server can guarantee that no conflicting lock or
   I/O request has been granted since reboot or restart.

8.5.3. Network Partitions and Recovery

   If the duration of a network partition is greater than the lease
   period provided by the server, the server will not have received a
   lease renewal from the client.
   If this occurs, the server may free
   all locks held for the client. As a result, all stateids held by the
   client will become invalid or stale. Once the client is able to
   reach the server after such a network partition, all I/O submitted by
   the client with the now invalid stateids will fail with the server
   returning the error NFS4ERR_EXPIRED. Once this error is received,
   the client will suitably notify the application that held the lock.

   As a courtesy to the client or as an optimization, the server may
   continue to hold locks on behalf of a client for which the time
   since the most recent communication has extended beyond the lease
   period. If the server receives a lock or I/O request that conflicts
   with one of these courtesy locks, the server must free the courtesy
   lock and grant the new request.

   If the server continues to hold locks beyond the expiration of a
   client's lease, the server MUST employ a method of recording this
   fact in its stable storage. Conflicting lock requests from another
   client may be serviced after the lease expiration. There are various
   scenarios involving server failure after such an event that require
   the storage of these lease expirations or network partitions. One
   scenario is as follows:

      A client holds a lock at the server and encounters a
      network partition and is unable to renew the associated
      lease. A second client obtains a conflicting lock and then
      frees the lock. After the unlock request by the second
      client, the server reboots or reinitializes. Once the
      server recovers, the network partition heals and the
      original client attempts to reclaim the original lock.

   In this scenario and without any state information, the server will
   allow the reclaim and the client will be in an inconsistent state
   because the server or the client has no knowledge of the conflicting
   lock.
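
   One way to avoid the scenario above is to record, in stable storage,
   each client whose lease expired and for which a conflicting lock was
   then granted. The sketch below is a hedged illustration under
   several assumptions: the class ExpiryLog, its method names, and the
   JSON file format are all hypothetical, and state is kept per client
   rather than per lock.

```python
import json
import os
from typing import Set


class ExpiryLog:
    """Hypothetical stable-storage record of expired clients."""

    def __init__(self, path: str) -> None:
        self.path = path

    def _load(self) -> Set[str]:
        if not os.path.exists(self.path):
            return set()
        with open(self.path) as f:
            return set(json.load(f))

    def record_conflict_after_expiry(self, client_id: str) -> None:
        """Call before granting a lock that conflicts with a courtesy lock."""
        ids = self._load()
        ids.add(client_id)
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(sorted(ids), f)
            f.flush()
            os.fsync(f.fileno())   # force the record to disk
        os.replace(tmp, self.path)  # atomically install the new record

    def may_reclaim(self, client_id: str) -> bool:
        """After a restart, refuse reclaims from recorded clients."""
        return client_id not in self._load()
```

   Because the record survives a server restart, the original client
   in the scenario would be refused its reclaim even though the
   conflicting lock has since been released.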

   The server may choose to store this lease expiration or network
   partitioning state in a way that will only identify the client as a
   whole. Note that this may potentially lead to lock reclaims being
   denied unnecessarily because of a mix of conflicting and
   non-conflicting locks. The server may also choose to store
   information about each lock that has an expired lease with an
   associated conflicting lock. The choice of the amount and type of
   state information that is stored is left to the implementor. In any
   case, the server must have enough state information to enable correct
   recovery from multiple partitions and multiple server failures.

8.6. Recovery from a Lock Request Timeout or Abort

   In the event a lock request times out, a client may decide to not
   retry the request. The client may also abort the request when the
   process for which it was issued is terminated (e.g. in UNIX due to a
   signal). It is possible though that the server received the request
   and acted upon it. This would change the state on the server without
   the client being aware of the change. It is paramount that the
   client re-synchronize state with the server before it attempts any
   other operation that takes a seqid and/or a stateid with the same
   nfs_lockowner. This is straightforward to do without a special
   re-synchronize operation.

   Since the server maintains the last lock request and response
   received on the nfs_lockowner, for each nfs_lockowner, the client
   should cache the last lock request it sent for which it did not
   receive a response. From this, the next time the client does
   a lock operation for the nfs_lockowner, it can send the cached
   request, if there is one, and if the request was one that established
   state (e.g. a LOCK or OPEN operation) the client can follow up with a
   request to remove the state (e.g. a LOCKU or CLOSE operation).
   With
   this approach, the sequencing and stateid information on the client
   and server for the given nfs_lockowner will re-synchronize and in
   turn the lock state will re-synchronize.

8.7. Server Revocation of Locks

   At any point, the server can revoke locks held by a client and the
   client must be prepared for this event. When the client detects that
   its locks have been or may have been revoked, the client is
   responsible for validating the state information between itself and
   the server. Validating locking state for the client means that it
   must verify or reclaim state for each lock currently held.

   The first instance of lock revocation is upon server reboot or
   re-initialization. In this instance the client will receive an error
   (NFS4ERR_STALE_STATEID or NFS4ERR_STALE_CLIENTID) and the client will
   proceed with normal crash recovery as described in the previous
   section.

   The second lock revocation event is the inability to renew the lease
   period. While this is considered a rare or unusual event, the client
   must be prepared to recover. Both the server and client will be able
   to detect the failure to renew the lease and are capable of
   recovering without data corruption. The server tracks the
   last renewal event serviced for the client and knows when the lease
   will expire. Similarly, the client must track operations which will
   renew the lease period. Using the time that each such request was
   sent and the time that the corresponding reply was received, the
   client should bound the time that the corresponding renewal could
   have occurred on the server and thus determine if it is possible that
   a lease period expiration could have occurred.

   The third lock revocation event can occur as a result of
   administrative intervention within the lease period.
   While this is
   considered a rare event, it is possible that the server's
   administrator has decided to release or revoke a particular lock held
   by the client. As a result of revocation, the client will receive an
   error of NFS4ERR_EXPIRED and the error is received within the lease
   period for the lock. In this instance the client may assume that
   only the nfs_lockowner's locks have been lost. The client notifies
   the lock holder appropriately. The client may not assume the lease
   period has been renewed as a result of a failed operation.

   When the client determines the lease period may have expired, the
   client must mark all locks held for the associated lease as
   "unvalidated". This means the client has been unable to re-establish
   or confirm the appropriate lock state with the server. As described
   in the previous section on crash recovery, there are scenarios in
   which the server may grant conflicting locks after the lease period
   has expired for a client. When it is possible that the lease period
   has expired, the client must validate each lock currently held to
   ensure that a conflicting lock has not been granted. The client may
   accomplish this task by issuing an I/O request, either a pending I/O
   or a zero-length read, specifying the stateid associated with the
   lock in question. If the response to the request is success, the
   client has validated all of the locks governed by that stateid and
   re-established the appropriate state between itself and the server.
   If the I/O request is not successful, then one or more of the locks
   associated with the stateid was revoked by the server and the client
   must notify the owner.

8.8. Share Reservations

   A share reservation is a mechanism to control access to a file. It
   is a separate and independent mechanism from record locking.
   When a
   client opens a file, it issues an OPEN operation to the server
   specifying the type of access required (READ, WRITE, or BOTH) and the
   type of access to deny others (deny NONE, READ, WRITE, or BOTH). If
   the OPEN fails the client will fail the application's open request.

   Pseudo-code definition of the semantics:

      if ((request.access & file_state.deny) ||
          (request.deny & file_state.access))
              return (NFS4ERR_DENIED)

   The constants used for the OPEN and OPEN_DOWNGRADE operations for the
   access and deny fields are as follows:

      const OPEN4_SHARE_ACCESS_READ   = 0x00000001;
      const OPEN4_SHARE_ACCESS_WRITE  = 0x00000002;
      const OPEN4_SHARE_ACCESS_BOTH   = 0x00000003;

      const OPEN4_SHARE_DENY_NONE     = 0x00000000;
      const OPEN4_SHARE_DENY_READ     = 0x00000001;
      const OPEN4_SHARE_DENY_WRITE    = 0x00000002;
      const OPEN4_SHARE_DENY_BOTH     = 0x00000003;

8.9. OPEN/CLOSE Operations

   To provide correct share semantics, a client MUST use the OPEN
   operation to obtain the initial filehandle and indicate the desired
   access and what if any access to deny. Even if the client intends to
   use a stateid of all 0's or all 1's, it must still obtain the
   filehandle for the regular file with the OPEN operation so the
   appropriate share semantics can be applied. For clients that do not
   have a deny mode built into their open programming interfaces, deny
   equal to NONE should be used.

   The OPEN operation with the CREATE flag also subsumes the CREATE
   operation for regular files as used in previous versions of the NFS
   protocol. This allows a create with a share to be done atomically.

   The CLOSE operation removes all share locks held by the nfs_lockowner
   on that file. If record locks are held, the client SHOULD release
   all locks before issuing a CLOSE.
   The server MAY free all
   outstanding locks on CLOSE but some servers may not support the CLOSE
   of a file that still has record locks held. The server MUST return
   failure if any locks would exist after the CLOSE.

   The LOOKUP operation will return a filehandle without establishing
   any lock state on the server. Without a valid stateid, the server
   will assume the client has the least access. For example, a file
   opened with deny READ/WRITE cannot be accessed using a filehandle
   obtained through LOOKUP because it would not have a valid stateid
   (i.e. using a stateid of all bits 0 or all bits 1).

8.10. Open Upgrade and Downgrade

   When an OPEN is done for a file and the lockowner for which the open
   is being done already has the file open, the result is to upgrade the
   open file status maintained on the server to include the access and
   deny bits specified by the new OPEN as well as those for the existing
   OPEN. The result is that there is one open file, as far as the
   protocol is concerned, and it includes the union of the access and
   deny bits for all of the OPEN requests completed. Only a single
   CLOSE will be done to reset the effects of both OPEN's. Note that
   the client, when issuing the OPEN, may not know that the same file is
   in fact being opened. The above only applies if both OPEN's result
   in the OPEN'ed object being designated by the same filehandle.

   When the server chooses to export multiple filehandles corresponding
   to the same file object and returns different filehandles on two
   different OPEN's of the same file object, the server MUST NOT "OR"
   together the access and deny bits and coalesce the two open files.
   Instead the server must maintain separate OPEN's with separate
   stateid's and will require separate CLOSE's to free them.
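
   The share reservation test of Section 8.8 and the upgrade rule above
   can both be expressed with bitwise operations on the access and deny
   fields. The following is a minimal sketch using the constant values
   given in Section 8.8; the function names conflicts() and upgrade()
   are hypothetical.

```python
from typing import Tuple

OPEN4_SHARE_ACCESS_READ = 0x00000001
OPEN4_SHARE_ACCESS_WRITE = 0x00000002
OPEN4_SHARE_ACCESS_BOTH = 0x00000003

OPEN4_SHARE_DENY_NONE = 0x00000000
OPEN4_SHARE_DENY_READ = 0x00000001
OPEN4_SHARE_DENY_WRITE = 0x00000002
OPEN4_SHARE_DENY_BOTH = 0x00000003


def conflicts(req_access: int, req_deny: int,
              file_access: int, file_deny: int) -> bool:
    """Share reservation test: True means the OPEN is denied."""
    return bool((req_access & file_deny) or (req_deny & file_access))


def upgrade(existing: Tuple[int, int], new: Tuple[int, int]) -> Tuple[int, int]:
    """Union of (access, deny) bits for two OPENs by one lockowner
    that resolve to the same filehandle."""
    return (existing[0] | new[0], existing[1] | new[1])
```

   For instance, a request for READ access is denied when the file is
   already open with deny READ, and a lockowner that opens first for
   READ and then for WRITE ends up with access BOTH on a single open
   file.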

   When multiple open files on the client are merged into a single open
   file object on the server, the close of one of the open files (on the
   client) may necessitate change of the access and deny status of the
   open file on the server. This is because the union of the access and
   deny bits for the remaining open's may be smaller (i.e. a proper
   subset) than previously. The OPEN_DOWNGRADE operation is used to
   make the necessary change and the client should use it to update the
   server so that share reservation requests by other clients are
   handled properly.

8.11. Short and Long Leases

   When determining the time period for the server lease, the usual
   lease tradeoffs apply. Short leases are good for fast server
   recovery at a cost of increased RENEW or READ (with zero length)
   requests. Longer leases place less load on large Internet servers
   trying to handle very large numbers of clients. The number of RENEW
   requests drops in proportion to the lease time. The
   disadvantages of long leases are slower recovery after server failure
   (server must wait for leases to expire and grace period before
   granting new lock requests) and increased file contention (if client
   fails to transmit an unlock request then server must wait for lease
   expiration before granting new locks).

   Long leases are usable if the server is able to store lease state in
   non-volatile memory. Upon recovery, the server can reconstruct the
   lease state from its non-volatile memory and continue operation with
   its clients and therefore long leases are not an issue.

8.12. Clocks and Calculating Lease Expiration

   To avoid the need for synchronized clocks, lease times are granted by
   the server as a time delta. However, there is a requirement that the
   client and server clocks do not drift excessively over the duration
   of the lock.
There is also the issue of propagation delay across the network, which could easily be several hundred milliseconds, as well as the possibility that requests will be lost and need to be retransmitted.

To take propagation delay into account, the client should subtract it from lease times (e.g. if the client estimates the one-way propagation delay as 200 msec, then it can assume that the lease is already 200 msec old when it gets it). In addition, it will take another 200 msec for a renewal request to reach the server. So the client must send a lock renewal or write data back to the server at least 400 msec before the lease would expire.

8.13. Migration, Replication and State

When responsibility for handling a given file system is transferred to a new server (migration) or the client chooses to use an alternate server (e.g. in response to server unresponsiveness) in the context of file system replication, the appropriate handling of state shared between the client and server (i.e. locks, leases, stateids, and clientids) is as described below. The handling differs between migration and replication. For related discussion of file server state and recovery of such state, see the sections under "File Locking and Share Reservations".

8.13.1. Migration and State

In the case of migration, the servers involved in the migration of a file system SHOULD transfer all server state from the original to the new server. This must be done in a way that is transparent to the client. This state transfer will ease the client's transition when a file system migration occurs. If the servers are successful in transferring all state, the client will continue to use stateids assigned by the original server. Therefore the new server must recognize these stateids as valid. This holds true for the clientid as well.
Since responsibility for an entire file system is transferred with a migration event, there is no possibility that conflicts will arise on the new server as a result of the transfer of locks.

As part of the transfer of information between servers, leases would be transferred as well. The leases being transferred to the new server will typically have a different expiration time from those already held by the same client on the new server. To maintain the property that all leases on a given server for a given client expire at the same time, the server should advance the expiration time to the later of the expiration time of the leases being transferred and that of the leases already present. This allows the client to maintain lease renewal of both classes without special effort.

The servers may choose not to transfer the state information upon migration. However, this choice is discouraged. In this case, when the client presents state information from the original server, the client must be prepared to receive either NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new server. The client should then recover its state information as it normally would in response to a server failure. The new server must take care to allow for the recovery of state information as it would in the event of server restart.

8.13.2. Replication and State

Since client switch-over in the case of replication is not under server control, the handling of state is different. In this case, leases, stateids and clientids do not have validity across a transition from one server to another. The client must re-establish its locks on the new server. This can be compared to the re-establishment of locks by means of reclaim-type requests after a server reboot.
The difference is that the server has no provision to distinguish requests reclaiming locks from those obtaining new locks or to defer the latter. Thus, a client re-establishing a lock on the new server (by means of a LOCK or OPEN request) may have the requests denied due to a conflicting lock. Since replication is intended for read-only use of file systems, such denial of locks should not pose large difficulties in practice. When an attempt to re-establish a lock on a new server is denied, the client should treat the situation as if its original lock had been revoked.

8.13.3. Notification of Migrated Lease

In the case of lease renewal, the client may not be submitting requests for a file system that has been migrated to another server. This can occur because of the implicit lease renewal mechanism. The client renews leases for all file systems when submitting a request to any one file system at the server.

In order for the client to schedule renewal of leases that may have been relocated to the new server, the client must find out about lease relocation before those leases expire. To accomplish this, all operations which implicitly renew leases for a client (i.e. OPEN, CLOSE, READ, WRITE, RENEW, LOCK, LOCKT, LOCKU) will return the error NFS4ERR_LEASE_MOVED if responsibility for any of the leases to be renewed has been transferred to a new server. This condition will continue until the client receives an NFS4ERR_MOVED error and the server receives the subsequent GETATTR(fs_locations) for an access to each file system for which a lease has been moved to a new server.

When a client receives an NFS4ERR_LEASE_MOVED error, it should perform some operation, such as a RENEW, on each file system associated with the server in question.
When the client receives an NFS4ERR_MOVED error, the client can follow the normal process to obtain the new server information (through the fs_locations attribute) and perform renewal of those leases on the new server. If the new server has not had state transferred to it transparently, the client will receive either NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new server, as described above, and can then recover state information as it does in the event of server failure.

9. Client-Side Caching

Client-side caching of data, of file attributes, and of file names is essential to providing good performance with the NFS protocol. Providing distributed cache coherence is a difficult problem and previous versions of the NFS protocol have not attempted it. Instead, several NFS client implementation techniques have been used to reduce the problems that a lack of coherence poses for users. These techniques have not been clearly defined by earlier protocol specifications and it is often unclear what is valid or invalid client behavior.

The NFS version 4 protocol uses many techniques similar to those that have been used in previous protocol versions. The NFS version 4 protocol does not provide distributed cache coherence. However, it defines a more limited set of caching guarantees to allow locks and share reservations to be used without destructive interference from client side caching.

In addition, the NFS version 4 protocol introduces a delegation mechanism which allows many decisions normally made by the server to be made locally by clients. This mechanism provides efficient support of the common cases where sharing is infrequent or where sharing is read-only.

9.1. Performance Challenges for Client-Side Caching

Caching techniques used in previous versions of the NFS protocol have been successful in providing good performance. However, several scalability challenges can arise when those techniques are used with very large numbers of clients. This is particularly true when clients are geographically distributed, which typically increases the latency for cache revalidation requests.

The previous versions of the NFS protocol repeat their file data cache validation requests at the time the file is opened. This behavior can have serious performance drawbacks. A common case is one in which a file is only accessed by a single client, so that sharing is infrequent.

In this case, repeated reference to the server to find that no conflicts exist is expensive. A better option with regard to performance is to allow a client that repeatedly opens a file to do so without reference to the server. This is done until potentially conflicting operations from another client actually occur.

A similar situation arises in connection with file locking. Sending file lock and unlock requests to the server as well as the read and write requests necessary to make data caching consistent with the locking semantics (see the section "Data Caching and File Locking") can severely limit performance. When locking is used to provide protection against infrequent conflicts, a large penalty is incurred. This penalty may discourage the use of file locking by applications.

The NFS version 4 protocol provides more aggressive caching strategies with the following design goals:

   o  Compatibility with a large range of server semantics.

   o  Provide the same caching benefits as previous versions of the NFS protocol when unable to provide the more aggressive model.

   o  Requirements for aggressive caching are organized so that a large portion of the benefit can be obtained even when not all of the requirements can be met.

The appropriate requirements for the server are discussed in later sections in which specific forms of caching are covered (see the section "Open Delegation").

9.2. Delegation and Callbacks

Recallable delegation of server responsibilities for a file to a client improves performance by avoiding repeated requests to the server in the absence of inter-client conflict. With the use of a "callback" RPC from server to client, a server recalls delegated responsibilities when another client engages in sharing of a delegated file.

A delegation is passed from the server to the client, specifying the object of the delegation and the type of delegation. There are different types of delegations but each type contains a stateid to be used to represent the delegation when performing operations that depend on the delegation. This stateid is similar to those associated with locks and share reservations but differs in that the stateid for a delegation is associated with a clientid and may be used on behalf of all the nfs_lockowners for the given client. A delegation is made to the client as a whole and not to any specific process or thread of control within it.

Because callback RPCs may not work in all environments (due to firewalls, for example), correct protocol operation does not depend on them. Preliminary testing of callback functionality by means of a CB_NULL procedure determines whether callbacks can be supported. The CB_NULL procedure checks the continuity of the callback path. A server makes a preliminary assessment of callback availability to a given client and avoids delegating responsibilities until it has determined that callbacks are supported.
Because the granting of a delegation is always conditional upon the absence of conflicting access, clients must not assume that a delegation will be granted and they must always be prepared for OPENs to be processed without any delegations being granted.

Once granted, a delegation behaves in most ways like a lock. There is an associated lease that is subject to renewal together with all of the other leases held by that client.

Unlike a lock, however, a delegation is recalled by the server through a callback when a second client performs a conflicting operation on the delegated file.

On recall, the client holding the delegation must flush modified state (such as modified data) to the server and return the delegation. The conflicting request will not receive a response until the recall is complete. The recall is considered complete when the client returns the delegation or the server times out on the recall and revokes the delegation as a result of the timeout. Following the resolution of the recall, the server has the information necessary to grant or deny the second client's request.

At the time the client receives a delegation recall, it may have substantial state that needs to be flushed to the server. Therefore, the server should allow sufficient time for the delegation to be returned since it may involve numerous RPCs to the server. If the server is able to determine that the client is diligently flushing state to the server as a result of the recall, the server may extend the usual time allowed for a recall. However, the time allowed for recall completion should not be unbounded.

An example of this is when responsibility to mediate opens on a given file is delegated to a client (see the section "Open Delegation"). The server will not know what opens are in effect on the client.
Without this knowledge the server will be unable to determine if the access and deny state for the file allows any particular open until the delegation for the file has been returned.

A client failure or a network partition can result in failure to respond to a recall callback. In this case, the server will revoke the delegation, which in turn will render useless any modified state still on the client.

9.2.1. Delegation Recovery

There are three situations that delegation recovery must deal with:

   o  Client reboot or restart

   o  Server reboot or restart

   o  Network partition (full or callback-only)

In the event the client reboots or restarts, the failure to renew leases will result in the revocation of record locks and share reservations. Delegations, however, may be treated a bit differently.

There will be situations in which delegations will need to be reestablished after a client reboots or restarts. The reason for this is that the client may have file data stored locally and this data was associated with the previously held delegations. The client will need to reestablish the appropriate file state on the server.

To allow for this type of client recovery, the server may extend the period for delegation recovery beyond the typical lease expiration period. This implies that requests from other clients that conflict with these delegations will need to wait. Because the normal recall process may require significant time for the client to flush changed state to the server, other clients need to be prepared for delays that occur because of a conflicting delegation. This longer interval would increase the window for clients to reboot and consult stable storage so that the delegations can be reclaimed. For open delegations, such delegations are reclaimed using OPEN with a claim type of CLAIM_DELEGATE_PREV.
(See the sections on "Data Caching and Revocation" and "Operation 18: OPEN" for discussion of open delegation and the details of OPEN respectively).

When the server reboots or restarts, delegations are reclaimed (using the OPEN operation with CLAIM_DELEGATE_PREV) in a similar fashion to record locks and share reservations. However, there is a slight semantic difference. In the normal case if the server decides that a delegation should not be granted, it performs the requested action (e.g. OPEN) without granting any delegation. For reclaim, the server grants the delegation but a special designation is applied so that the client treats the delegation as having been granted but recalled by the server. Because of this, the client has the duty to write all modified state to the server and then return the delegation. This process of handling delegation reclaim reconciles three principles of the NFS version 4 protocol:

   o  Upon reclaim, a client reporting resources assigned to it by an earlier server instance must be granted those resources.

   o  The server has unquestionable authority to determine whether delegations are to be granted and, once granted, whether they are to be continued.

   o  The use of callbacks is not to be depended upon until the client has proven its ability to receive them.

When a network partition occurs, delegations are subject to freeing by the server when the lease renewal period expires. This is similar to the behavior for locks and share reservations. For delegations, however, the server may extend the period in which conflicting requests are held off. Eventually the occurrence of a conflicting request from another client will cause revocation of the delegation. A loss of the callback path (e.g. by later network configuration change) will have the same effect.
A recall request will fail and revocation of the delegation will result.

A client normally finds out about revocation of a delegation when it uses a stateid associated with a delegation and receives the error NFS4ERR_EXPIRED. It also may find out about delegation revocation after a client reboot when it attempts to reclaim a delegation and receives that same error. Note that in the case of a revoked write open delegation, there are issues because data may have been modified by the client whose delegation is revoked and separately by other clients. See the section "Revocation Recovery for Write Open Delegation" for a discussion of such issues. Note also that when delegations are revoked, information about the revoked delegation will be written by the server to stable storage (as described in the section "Crash Recovery"). This is done to deal with the case in which a server reboots after revoking a delegation but before the client holding the revoked delegation is notified about the revocation.

9.3. Data Caching

When applications share access to a set of files, they need to be implemented so as to take account of the possibility of conflicting access by another application. This is true whether the applications in question execute on different clients or reside on the same client.

Share reservations and record locks are the facilities the NFS version 4 protocol provides to allow applications to coordinate access by providing mutual exclusion facilities. The NFS version 4 protocol's data caching must be implemented such that it does not invalidate the assumptions that those using these facilities depend upon.

9.3.1. Data Caching and OPENs

In order to avoid invalidating the sharing assumptions that applications rely on, NFS version 4 clients should not provide cached data to applications or modify it on behalf of an application when it would not be valid to obtain or modify that same data via a READ or WRITE operation.

Furthermore, in the absence of open delegation (see the section "Open Delegation") two additional rules apply. Note that these rules are obeyed in practice by many NFS version 2 and version 3 clients.

   o  First, cached data present on a client must be revalidated after doing an OPEN. This is to ensure that the data for the OPENed file is still correctly reflected in the client's cache. This validation must be done at least when the client's OPEN operation includes DENY=WRITE or BOTH, thus terminating a period in which other clients may have had the opportunity to open the file with WRITE access. Clients may choose to do the revalidation more often (i.e. at OPENs specifying DENY=NONE) to parallel the NFS version 3 protocol's practice for the benefit of users assuming this degree of cache revalidation.

   o  Second, modified data must be flushed to the server before closing a file OPENed for write. This is complementary to the first rule. If the data is not flushed at CLOSE, the revalidation done after a client OPENs a file is unable to achieve its purpose. The other aspect to flushing the data before close is that the data must be committed to stable storage, at the server, before the CLOSE operation is requested by the client. In the case of a server reboot or restart and a CLOSEd file, it may not be possible to retransmit the data to be written to the file. Hence, this requirement.

9.3.2. Data Caching and File Locking

For those applications that choose to use file locking instead of share reservations to exclude inconsistent file access, there is an analogous set of constraints that apply to client side data caching. These rules are effective only if the file locking is used in a way that corresponds to the actual READ and WRITE operations executed. This is as opposed to file locking that is based on pure convention. For example, it is possible to manipulate a two-megabyte file by dividing the file into two one-megabyte regions and protecting access to the two regions by file locks on bytes zero and one. A lock for write on byte zero of the file would represent the right to do READ and WRITE operations on the first region. A lock for write on byte one of the file would represent the right to do READ and WRITE operations on the second region. As long as all applications manipulating the file obey this convention, they will work on a local file system. However, they may not work with the NFS version 4 protocol unless clients refrain from data caching.

The rules for data caching in the file locking environment are:

   o  First, when a client obtains a file lock for a particular region, the data cache corresponding to that region (if any cache data exists) must be revalidated. If the change attribute indicates that the file may have been updated since the cached data was obtained, the client must flush or invalidate the cached data for the newly locked region. A client might choose to invalidate all of the non-modified cached data that it has for the file but the only requirement for correct operation is to invalidate all of the data in the newly locked region.

   o  Second, before releasing a write lock for a region, all modified data for that region must be flushed to the server.
      The modified data must also be written to stable storage.

Note that flushing data to the server and the invalidation of cached data must reflect the actual byte ranges locked or unlocked. Rounding these up or down to reflect client cache block boundaries will cause problems if not carefully done. For example, writing a modified block when only half of that block is within an area being unlocked may cause invalid modification to the region outside the unlocked area. That outside region, in turn, may be part of a region locked by another client. Clients can avoid this situation by synchronously performing portions of write operations that overlap that portion (initial or final) that is not a full block. Similarly, invalidating a locked area which is not an integral number of full buffer blocks would require the client to read one or two partial blocks from the server if the revalidation procedure shows that the data which the client possesses may not be valid.

The data that is written to the server as a prerequisite to the unlocking of a region must be written, at the server, to stable storage. The client may accomplish this either with synchronous writes or by following asynchronous writes with a COMMIT operation. This is required because retransmission of the modified data after a server reboot might conflict with a lock held by another client.

A client implementation may choose to accommodate applications which use record locking in non-standard ways (e.g. using a record lock as a global semaphore) by flushing to the server more data upon a LOCKU than is covered by the locked range. This may include modified data within files other than the one for which the unlocks are being done.
In such cases, the client must not interfere with applications whose READs and WRITEs are being done only within the bounds of record locks which the application holds. For example, an application locks a single byte of a file and proceeds to write that single byte. A client that chose to handle a LOCKU by flushing all modified data to the server could validly write that single byte in response to an unrelated unlock. However, it would not be valid to write the entire block in which that single written byte was located since it includes an area that is not locked and might be locked by another client. Client implementations can avoid this problem by dividing files with modified data into those for which all modifications are done to areas covered by an appropriate record lock and those for which there are modifications not covered by a record lock. Any writes done for the former class of files must not include areas not locked and thus not modified on the client.

9.3.3. Data Caching and Mandatory File Locking

Client side data caching needs to respect mandatory file locking when it is in effect. The presence of mandatory file locking for a given file is indicated in the result flags for an OPEN. When mandatory locking is in effect for a file, the client must check for an appropriate file lock for data being read or written. If a lock exists for the range being read or written, the client may satisfy the request using the client's validated cache. If an appropriate file lock is not held for the range of the read or write, the read or write request must not be satisfied by the client's cache and the request must be sent to the server for processing. When a read or write request partially overlaps a locked region, the request should be subdivided into multiple pieces with each region (locked or not) treated appropriately.
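
The subdivision of a partially locked request described above might be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed client algorithm. The function name split_by_locks is hypothetical, the locks are assumed to be those held by the requester, to be non-overlapping, and to be of a type appropriate to the request (read or write). Ranges are given as (offset, length) pairs in bytes.

```python
def split_by_locks(offset, length, locks):
    """Split the byte range [offset, offset + length) into pieces.

    locks is a list of (lock_offset, lock_length) pairs held by the
    requester (assumed non-overlapping). Each returned piece is a tuple
    (piece_offset, piece_length, locked). Pieces with locked == True may
    be satisfied from the validated cache; the others must be sent to
    the server.
    """
    end = offset + length
    pieces = []
    pos = offset
    for lock_offset, lock_length in sorted(locks):
        # Clip the lock to the requested range.
        lock_start = max(lock_offset, offset)
        lock_end = min(lock_offset + lock_length, end)
        if lock_start >= lock_end:
            continue  # this lock does not intersect the request
        if pos < lock_start:
            # Gap before the lock: not covered by any lock.
            pieces.append((pos, lock_start - pos, False))
        pieces.append((lock_start, lock_end - lock_start, True))
        pos = lock_end
    if pos < end:
        # Tail of the request after the last intersecting lock.
        pieces.append((pos, end - pos, False))
    return pieces
```

For a 100-byte request at offset 0 with a single lock covering bytes 20 through 49, the sketch yields an unlocked piece, a locked piece, and a second unlocked piece, each of which the client would then handle according to the rules of this section.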

9.3.4. Data Caching and File Identity

When clients cache data, the file data needs to be organized according to the file system object to which the data belongs. For NFS version 3 clients, the typical practice has been to assume for the purpose of caching that distinct filehandles represent distinct file system objects. The client then has the choice to organize and maintain the data cache on this basis.

In the NFS version 4 protocol, there is now the possibility to have significant deviations from a "one filehandle per object" model because a filehandle may be constructed on the basis of the object's pathname. Therefore, clients need a reliable method to determine if two filehandles designate the same file system object. If clients were simply to assume that all distinct filehandles denote distinct objects and proceed to do data caching on this basis, caching inconsistencies would arise between the distinct client side objects which mapped to the same server side object.

By providing a method to differentiate filehandles, the NFS version 4 protocol alleviates a potential functional regression in comparison with the NFS version 3 protocol. Without this method, caching inconsistencies within the same client could occur and this has not been present in previous versions of the NFS protocol. Note that it is possible to have such inconsistencies with applications executing on multiple clients but that is not the issue being addressed here.

For the purposes of data caching, the following steps allow an NFS version 4 client to determine whether two distinct filehandles denote the same server side object:

   o  If GETATTR directed to the two filehandles returns different values of the fsid attribute, then the filehandles represent distinct objects.

   o  If GETATTR for any file with an fsid that matches the fsid of the two filehandles in question returns a unique_handles attribute with a value of TRUE, then the two objects are distinct.

   o  If GETATTR directed to the two filehandles does not return the fileid attribute for one or both of the handles, then it cannot be determined whether the two objects are the same. Therefore, operations which depend on that knowledge (e.g. client side data caching) cannot be done reliably.

   o  If GETATTR directed to the two filehandles returns different values for the fileid attribute, then they are distinct objects.

   o  Otherwise they are the same object.

9.4. Open Delegation

When a file is being OPENed, the server may delegate further handling of opens and closes for that file to the opening client. Any such delegation is recallable, since the circumstances that allowed for the delegation are subject to change. In particular, if the server receives a conflicting OPEN from another client, the server must recall the delegation before deciding whether the OPEN from the other client may be granted. Making a delegation is up to the server and clients should not assume that any particular OPEN either will or will not result in an open delegation. The following is a typical set of conditions that servers might use in deciding whether OPEN should be delegated:

   o  The client must be able to respond to the server's callback requests. The server will use the CB_NULL procedure for a test of callback ability.

   o  The client must have responded properly to previous recalls.

   o  There must be no current open conflicting with the requested delegation.

   o  There should be no current delegation that conflicts with the delegation being requested.

   o  The probability of future conflicting open requests should be low based on the recent history of the file.

   o  There should be no server-specific semantics of OPEN/CLOSE that would make the required handling incompatible with the prescribed handling that the delegated client would apply (see below).

There are two types of open delegations, read and write. A read open delegation allows a client to handle, on its own, requests to open a file for reading that do not deny read access to others. Multiple read open delegations may be outstanding simultaneously and do not conflict. A write open delegation allows the client to handle, on its own, all opens. Only one write open delegation may exist for a given file at a given time and it is inconsistent with any read open delegations.

When a client has a read open delegation, it may not make any changes to the contents or attributes of the file but it is assured that no other client may do so. When a client has a write open delegation, it may modify the file data since no other client will be accessing the file's data. The client holding a write delegation may only affect file attributes which are intimately connected with the file data: size, time_modify, change.

When a client has an open delegation, it does not send OPENs or CLOSEs to the server but updates the appropriate status internally. For a read open delegation, opens that cannot be handled locally (opens for write or that deny read access) must be sent to the server.
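
The division between opens handled locally and opens sent to the server can be summarized in a short Python sketch. The function name handled_locally, the string labels for the delegation type, and the bit values are assumptions made for illustration. The sketch covers only the delegation type rule stated above and leaves out the share reservation and permission checks that a client must also apply.

```python
# Illustrative bit values for the access and deny fields of an OPEN.
ACCESS_READ, ACCESS_WRITE = 0x1, 0x2
DENY_NONE, DENY_READ, DENY_WRITE = 0x0, 0x1, 0x2


def handled_locally(delegation, access, deny):
    """Return True if the client may process this open without the server.

    delegation is "read", "write", or None (no delegation held).
    """
    if delegation == "write":
        # A write open delegation lets the client handle all opens.
        return True
    if delegation == "read":
        # Only opens for reading that do not deny read access to others.
        wants_write = bool(access & ACCESS_WRITE)
        denies_read = bool(deny & DENY_READ)
        return not wants_write and not denies_read
    # Without a delegation every OPEN goes to the server.
    return False
```

Under a read open delegation, for example, an open for read with DENY_NONE stays on the client, while an open for write or one that denies read access is forwarded to the server.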

   When an open delegation is made, the response to the OPEN contains an
   open delegation structure which specifies the following:

   o  the type of delegation (read or write)

   o  space limitation information to control flushing of data on
      close (write open delegation only, see the section "Open
      Delegation and Data Caching")

   o  an nfsace4 specifying read and write permissions

   o  a stateid to represent the delegation for READ and WRITE

   The stateid is separate and distinct from the stateid for the OPEN
   proper. The standard stateid, unlike the delegation stateid, is
   associated with a particular nfs_lockowner and will continue to be
   valid after the delegation is recalled and the file remains open.

   When a request internal to the client is made to open a file and an
   open delegation is in effect, it will be accepted or rejected solely
   on the basis of the following conditions. Any requirement for other
   checks to be made by the delegate should result in open delegation
   being denied so that the checks can be made by the server itself.

   o  The access and deny bits for the request and the file as
      described in the section "Share Reservations".

   o  The read and write permissions as determined below.

   The nfsace4 passed with delegation can be used to avoid frequent
   ACCESS calls. The permission check should be as follows:

   o  If the nfsace4 indicates that the open may be done, then it
      should be granted without reference to the server.

   o  If the nfsace4 indicates that the open may not be done, then an
      ACCESS request must be sent to the server to obtain the
      definitive answer.

   The server may return an nfsace4 that is more restrictive than the
   actual ACL of the file. This includes an nfsace4 that specifies
   denial of all access.
   Note that some common practices such as mapping the traditional
   user "root" to the user "nobody" may make it incorrect to return the
   actual ACL of the file in the delegation response.

   The use of delegation together with various other forms of caching
   creates the possibility that no server authentication will ever be
   performed for a given user since all of the user's requests might be
   satisfied locally. Where the client is depending on the server for
   authentication, the client should be sure authentication occurs for
   each user by use of the ACCESS operation. This should be the case
   even if an ACCESS operation would not be required otherwise. As
   mentioned before, the server may enforce frequent authentication by
   returning an nfsace4 denying all access with every open delegation.

9.4.1. Open Delegation and Data Caching

   OPEN delegation allows much of the message overhead associated with
   the opening and closing of files to be eliminated. An open when an
   open delegation is in effect does not require that a validation
   message be sent to the server. The continued endurance of the "read
   open delegation" provides a guarantee that no OPEN for write and thus
   no write has occurred. Similarly, when closing a file opened for
   write and if write open delegation is in effect, the data written
   does not have to be flushed to the server until the open delegation
   is recalled. The continued endurance of the open delegation provides
   a guarantee that no open and thus no read or write has been done by
   another client.

   For the purposes of open delegation, READs and WRITEs done without an
   OPEN are treated as the functional equivalents of a corresponding
   type of OPEN. This refers to the READs and WRITEs that use the
   special stateids consisting of all zero bits or all one bits.
   Therefore, READs or WRITEs with a special stateid done by another
   client will force the server to recall a write open delegation. A
   WRITE with a special stateid done by another client will force a
   recall of read open delegations.

   With delegations, a client is able to avoid writing data to the
   server when the CLOSE of a file is serviced. The CLOSE operation is
   the usual point at which the client is notified of a lack of stable
   storage for the modified file data generated by the application. At
   the CLOSE, file data is written to the server and through normal
   accounting the server is able to determine if the available file
   system space for the data has been exceeded (i.e., server returns
   NFS4ERR_NOSPC or NFS4ERR_DQUOT). This accounting includes quotas.
   The introduction of delegations requires that an alternative method
   be in place for the same type of communication to occur between
   client and server.

   In the delegation response, the server provides either the limit of
   the size of the file or the number of modified blocks and associated
   block size. The server must ensure that the client will be able to
   flush data to the server of a size equal to that provided in the
   original delegation. The server must make this assurance for all
   outstanding delegations. Therefore, the server must be careful in
   its management of available space for new or modified data taking
   into account available file system space and any applicable quotas.
   The server can recall delegations as a result of managing the
   available file system space. The client should abide by the server's
   stated space limits for delegations. If the client exceeds the stated
   limits for the delegation, the server's behavior is undefined.
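
   A client-side check against the two forms of limit described above
   might look like the following. This is a minimal sketch under stated
   assumptions: SpaceLimit, dirty_blocks, and may_defer_flush are
   hypothetical names, the field layout is not the protocol's XDR
   definition, and how a real client counts modified blocks is an
   implementation matter.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class SpaceLimit:
    """Hypothetical client view of the limit from the delegation response.

    Exactly one form is expected to be set: a file size limit, or a
    modified block count together with its block size.
    """
    max_file_size: Optional[int] = None   # bytes
    max_blocks: Optional[int] = None      # number of modified blocks
    block_size: Optional[int] = None      # bytes per block


def dirty_blocks(ranges: Iterable[Tuple[int, int]], block_size: int) -> int:
    """Count distinct blocks touched by (offset, length) byte ranges."""
    touched = set()
    for offset, length in ranges:
        if length <= 0:
            continue
        first = offset // block_size
        last = (offset + length - 1) // block_size
        touched.update(range(first, last + 1))
    return len(touched)


def may_defer_flush(limit: SpaceLimit, file_size: int,
                    modified_blocks: int) -> bool:
    """Return True if cached modifications still fit the stated limit."""
    if limit.max_file_size is not None:
        return file_size <= limit.max_file_size
    if limit.max_blocks is not None:
        return modified_blocks <= limit.max_blocks
    # No usable limit: be conservative and flush to the server.
    return False
```

   In this sketch a client would call may_defer_flush before deciding
   to keep modified data cached past a close; when it returns False the
   data is written to the server instead.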

   Based on server conditions, quotas or available file system space,
   the server may grant write open delegations with very restrictive
   space limitations. The limitations may be defined in a way that will
   always force modified data to be flushed to the server on close.

   With respect to authentication, flushing modified data to the server
   after a CLOSE has occurred may be problematic. For example, the user
   of the application may have logged off of the client and unexpired
   authentication credentials may not be present. In this case, the
   client may need to take special care to ensure that local unexpired
   credentials will in fact be available. This may be accomplished by
   tracking the expiration time of credentials and flushing data well in
   advance of their expiration or by making private copies of
   credentials to assure their availability when needed.

9.4.2. Open Delegation and File Locks

   When a client holds a write open delegation, lock operations are
   performed locally. This includes those required for mandatory file
   locking. This can be done since the delegation implies that there
   can be no conflicting locks. Similarly, all of the revalidations
   that would normally be associated with obtaining locks and the
   flushing of data associated with the releasing of locks need not be
   done.

9.4.3. Recall of Open Delegation

   The following events necessitate recall of an open delegation:

   o  Potentially conflicting OPEN request (or READ/WRITE done with
      "special" stateid)

   o  SETATTR issued by another client

   o  REMOVE request for the file

   o  RENAME request for the file as either source or target of the
      RENAME

   Whether a RENAME of a directory in the path leading to the file
   results in recall of an open delegation depends on the semantics of
   the server file system.
   If that file system denies such RENAMEs when a file is open, the
   recall must be performed to determine whether the file in question
   is, in fact, open.

   In addition to the situations above, the server may choose to recall
   open delegations at any time if resource constraints make it
   advisable to do so. Clients should always be prepared for the
   possibility of recall.

   The server needs to employ special handling for a GETATTR where the
   target is a file that has a write open delegation in effect. In this
   case, the client holding the delegation needs to be interrogated.
   The server will use a CB_GETATTR callback if the GETATTR attribute
   bits include any of the attributes that a write open delegate may
   modify (size, time_modify, change).

   When a client receives a recall for an open delegation, it needs to
   update state on the server before returning the delegation. These
   same updates must be done whenever a client chooses to return a
   delegation voluntarily. The following items of state need to be
   dealt with:

   o  If the file associated with the delegation is no longer open and
      no previous CLOSE operation has been sent to the server, a CLOSE
      operation must be sent to the server.

   o  If a file has other open references at the client, then OPEN
      operations must be sent to the server. The appropriate stateids
      will be provided by the server for subsequent use by the client
      since the delegation stateid will no longer be valid. These
      OPEN requests are done with the claim type of
      CLAIM_DELEGATE_CUR. This will allow the presentation of the
      delegation stateid so that the client can establish the
      appropriate rights to perform the OPEN. (See the section
      "Operation 18: OPEN" for details.)

   o  If there are granted file locks, the corresponding LOCK
      operations need to be performed. This applies to the write open
      delegation case only.

   o  For a write open delegation, if at the time of recall the file
      is not open for write, all modified data for the file must be
      flushed to the server. If the delegation had not existed, the
      client would have done this data flush before the CLOSE
      operation.

   o  For a write open delegation when a file is still open at the
      time of recall, any modified data for the file needs to be
      flushed to the server.

   o  With the write open delegation in place, it is possible that the
      file was truncated during the duration of the delegation. For
      example, the truncation could have occurred as a result of an
      OPEN UNCHECKED with a size attribute value of zero. Therefore,
      if a truncation of the file has occurred and this operation has
      not been propagated to the server, the truncation must occur
      before any modified data is written to the server.

   In the case of write open delegation, file locking imposes some
   additional requirements. The flushing of any modified data in any
   region for which a write lock was released while the write open
   delegation was in effect is what is required to precisely maintain
   the associated invariant. However, because the write open delegation
   implies no other locking by other clients, a simpler implementation
   is to flush all modified data for the file (as described just above)
   if any write lock has been released while the write open delegation
   was in effect.

9.4.4. Delegation Revocation

   At the point a delegation is revoked, if there are associated opens
   on the client, the applications holding these opens need to be
   notified. This notification usually occurs by returning errors for
   READ/WRITE operations or when a close is attempted for the open file.

   If no opens exist for the file at the point the delegation is
   revoked, then notification of the revocation is unnecessary.
   However, if there is modified data present at the client for the
   file, the user of the application should be notified. Unfortunately,
   it may not be possible to notify the user since active applications
   may not be present at the client. See the section "Revocation
   Recovery for Write Open Delegation" for additional details.

9.5. Data Caching and Revocation

   When locks and delegations are revoked, the assumptions upon which
   successful caching depend are no longer guaranteed. The owner of the
   locks or share reservations which have been revoked needs to be
   notified. This notification includes applications with a file open
   that has a corresponding delegation which has been revoked. Cached
   data associated with the revocation must be removed from the client.
   In the case of modified data existing in the client's cache, that
   data must be removed from the client without it being written to the
   server. As mentioned, the assumptions made by the client are no
   longer valid at the point when a lock or delegation has been revoked.
   For example, another client may have been granted a conflicting lock
   after the revocation of the lock at the first client. Therefore, the
   data within the lock range may have been modified by the other
   client. Obviously, the first client is unable to guarantee to the
   application what has occurred to the file in the case of revocation.

   Notification to a lock owner will in many cases consist of simply
   returning an error on the next and all subsequent READs/WRITEs to the
   open file or on the close. Where the methods available to a client
   make such notification impossible because errors for certain
   operations may not be returned, more drastic action such as signals
   or process termination may be appropriate. The justification for
   this is that an invariant on which an application depends may be
   violated.
   Depending on how errors are typically treated for the client
   operating environment, further levels of notification including
   logging, console messages, and GUI pop-ups may be appropriate.

9.5.1. Revocation Recovery for Write Open Delegation

   Revocation recovery for a write open delegation poses the special
   issue of modified data in the client cache while the file is not
   open. In this situation, any client which does not flush modified
   data to the server on each close must ensure that the user receives
   appropriate notification of the failure as a result of the
   revocation. Since such situations may require human action to
   correct problems, notification schemes in which the appropriate user
   or administrator is notified may be necessary. Logging and console
   messages are typical examples.

   If there is modified data on the client, it must not be flushed
   normally to the server. A client may attempt to provide a copy of
   the file data as modified during the delegation under a different
   name in the file system name space to ease recovery. Unless the
   client can determine that the file has not been modified by any other
   client, this technique must be limited to situations in which a
   client has a complete cached copy of the file in question. Use of
   such a technique may be limited to files under a certain size or may
   only be used when sufficient disk space is guaranteed to be available
   within the target file system and when the client has sufficient
   buffering resources to keep the cached copy available until it is
   properly stored to the target file system.

9.6. Attribute Caching

   The attributes discussed in this section do not include named
   attributes. Individual named attributes are analogous to files and
   caching of the data for these needs to be handled just as data
   caching is for ordinary files.
   Similarly, LOOKUP results from an OPENATTR directory are to be
   cached on the same basis as any other pathnames and similarly for
   directory contents.

   Clients may cache file attributes obtained from the server and use
   them to avoid subsequent GETATTR requests. Such caching is write
   through in that modification to file attributes is always done by
   means of requests to the server and should not be done locally and
   cached. The exceptions to this are modifications to attributes that
   are intimately connected with data caching. Therefore, extending a
   file by writing data to the local data cache is reflected immediately
   in the size as seen on the client without this change being
   immediately reflected on the server. Normally such changes are not
   propagated directly to the server but when the modified data is
   flushed to the server, analogous attribute changes are made on the
   server. When open delegation is in effect, the modified attributes
   may be returned to the server in the response to a CB_RECALL call.

   The result of local caching of attributes is that the attribute
   caches maintained on individual clients will not be coherent. Changes
   made in one order on the server may be seen in a different order on
   one client and in a third order on a different client.

   The typical file system application programming interfaces do not
   provide means to atomically modify or interrogate attributes for
   multiple files at the same time. The following rules provide an
   environment where the potential incoherences mentioned above can be
   reasonably managed. These rules are derived from the practice of
   previous NFS protocols.

   o  All attributes for a given file (per-fsid attributes excepted)
      are cached as a unit at the client so that no
      non-serializability can arise within the context of a single
      file.

   o  An upper time boundary is maintained on how long a client cache
      entry can be kept without being refreshed from the server.

   o  When operations are performed that change attributes at the
      server, the updated attribute set is requested as part of the
      containing RPC. This includes directory operations that update
      attributes indirectly. This is accomplished by following the
      modifying operation with a GETATTR operation and then using the
      results of the GETATTR to update the client's cached attributes.

   Note that if the full set of attributes to be cached is requested by
   READDIR, the results can be cached by the client on the same basis as
   attributes obtained via GETATTR.

   A client may validate its cached version of attributes for a file by
   fetching only the change attribute and assuming that if the change
   attribute has the same value as it did when the attributes were
   cached, then no attributes have changed. The possible exception is
   the attribute time_access.

9.7. Name Caching

   The results of LOOKUP and READDIR operations may be cached to avoid
   the cost of subsequent LOOKUP operations. Just as in the case of
   attribute caching, inconsistencies may arise among the various client
   caches. To mitigate the effects of these inconsistencies and given
   the context of typical file system APIs, the following rules should
   be followed:

   o  The results of unsuccessful LOOKUPs should not be cached, unless
      they are specifically reverified at the point of use.

   o  An upper time boundary is maintained on how long a client name
      cache entry can be kept without verifying that the entry has not
      been made invalid by a directory change operation performed by
      another client.
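
   The two rules above can be sketched as a small client-side name
   cache. This is an illustration under stated assumptions rather than
   a prescribed design: the NameCache class, its method names, and the
   use of a single staleness bound for every entry are hypothetical,
   and the sketch takes the simpler option of never caching an
   unsuccessful LOOKUP.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class NameCache:
    """Hypothetical name cache keyed by (directory filehandle, name)."""

    def __init__(self, staleness_bound: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._bound = staleness_bound
        self._clock = clock
        # (dir_fh, name) -> (child_fh, expiration time)
        self._entries: Dict[Tuple[bytes, str], Tuple[bytes, float]] = {}

    def store(self, dir_fh: bytes, name: str,
              child_fh: Optional[bytes]) -> None:
        """Record a LOOKUP result; child_fh is None for a failed LOOKUP."""
        if child_fh is None:
            # Rule 1: results of unsuccessful LOOKUPs are not cached.
            return
        expires_at = self._clock() + self._bound
        self._entries[(dir_fh, name)] = (child_fh, expires_at)

    def lookup(self, dir_fh: bytes, name: str) -> Optional[bytes]:
        """Return a cached filehandle, or None if absent or too old."""
        entry = self._entries.get((dir_fh, name))
        if entry is None:
            return None
        child_fh, expires_at = entry
        if self._clock() >= expires_at:
            # Rule 2: the entry outlived the bound and must be reverified.
            del self._entries[(dir_fh, name)]
            return None
        return child_fh
```

   A None result from lookup means the client falls back to a LOOKUP
   sent to the server. The clock is injectable so the expiry behavior
   can be exercised without waiting.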

   When a client is not making changes to a directory for which there
   exist name cache entries, the client needs to periodically fetch
   attributes for that directory to ensure that it is not being
   modified. After determining that no modification has occurred, the
   expiration time for the associated name cache entries may be updated
   to be the current time plus the name cache staleness bound.

   When a client is making changes to a given directory, it needs to
   determine whether there have been changes made to the directory by
   other clients. It does this by using the change attribute as
   reported before and after the directory operation in the associated
   change_info4 value returned for the operation. The server is able to
   communicate to the client whether the change_info4 data is provided
   atomically with respect to the directory operation. If the change
   values are provided atomically, the client is then able to compare
   the pre-operation change value with the change value in the client's
   name cache. If the comparison indicates that the directory was
   updated by another client, the name cache associated with the
   modified directory is purged from the client. If the comparison
   indicates no modification, the name cache can be updated on the
   client to reflect the directory operation and the associated timeout
   extended. The post-operation change value needs to be saved as the
   basis for future change_info4 comparisons.

   As demonstrated by the scenario above, name caching requires that the
   client revalidate name cache data by inspecting the change attribute
   of a directory at the point when the name cache item was cached.
   This requires that the server update the change attribute for
   directories when the contents of the corresponding directory are
   modified.
   For a client to use the change_info4 information appropriately and
   correctly, the server must report the pre- and post-operation change
   attribute values atomically. When the server is unable to report the
   before and after values atomically with respect to the directory
   operation, the server must indicate that fact in the change_info4
   return value. When the information is not atomically reported, the
   client should not assume that other clients have not changed the
   directory.

9.8. Directory Caching

   The results of READDIR operations may be used to avoid subsequent
   READDIR operations. Just as in the cases of attribute and name
   caching, inconsistencies may arise among the various client caches.
   To mitigate the effects of these inconsistencies, and given the
   context of typical file system APIs, the following rules should be
   followed:

   o  Cached READDIR information for a directory which is not obtained
      in a single READDIR operation must always be a consistent
      snapshot of directory contents. This is determined by using a
      GETATTR before the first READDIR and after the last READDIR
      that contributes to the cache.

   o  An upper time boundary is maintained to indicate the length of
      time a directory cache entry is considered valid before the
      client must revalidate the cached information.

   The revalidation technique parallels that discussed in the case of
   name caching. When the client is not changing the directory in
   question, checking the change attribute of the directory with GETATTR
   is adequate. The lifetime of the cache entry can be extended at
   these checkpoints. When a client is modifying the directory, the
   client needs to use the change_info4 data to determine whether there
   are other clients modifying the directory.
   If it is determined that no other client modifications are
   occurring, the client may update its directory cache to reflect its
   own changes.

   As demonstrated previously, directory caching requires that the
   client revalidate directory cache data by inspecting the change
   attribute of a directory at the point when the directory was cached.
   This requires that the server update the change attribute for
   directories when the contents of the corresponding directory are
   modified. For a client to use the change_info4 information
   appropriately and correctly, the server must report the pre- and
   post-operation change attribute values atomically. When the server
   is unable to report the before and after values atomically with
   respect to the directory operation, the server must indicate that
   fact in the change_info4 return value. When the information is not
   atomically reported, the client should not assume that other clients
   have not changed the directory.

10. Minor Versioning

   To address the requirement of an NFS protocol that can evolve as the
   need arises, the NFS version 4 protocol contains the rules and
   framework to allow for future minor changes or versioning.

   The base assumption with respect to minor versioning is that any
   future accepted minor version must follow the IETF process and be
   documented in a standards track RFC. Therefore, each minor version
   number will correspond to an RFC. Minor version zero of the NFS
   version 4 protocol is represented by this RFC. The COMPOUND
   procedure will support the encoding of the minor version being
   requested by the client.

   The following items represent the basic rules for the development of
   minor versions. Note that a future minor version may decide to
   modify or add to the following rules as part of the minor version
   definition.

   1    Procedures are not added or deleted

        To maintain the general RPC model, NFS version 4 minor versions
        will not add or delete procedures from the NFS program.

   2    Minor versions may add operations to the COMPOUND and
        CB_COMPOUND procedures.

        The addition of operations to the COMPOUND and CB_COMPOUND
        procedures does not affect the RPC model.

   2.1  Minor versions may append attributes to GETATTR4args, bitmap4,
        and GETATTR4res.

        This allows for the expansion of the attribute model to allow
        for future growth or adaptation.

   2.2  Minor version X must append any new attributes after the last
        documented attribute.

        Since attribute results are specified as an opaque array of
        per-attribute XDR encoded results, the complexity of adding new
        attributes in the midst of the current definitions will be too
        burdensome.

   3    Minor versions must not modify the structure of an existing
        operation's arguments or results.

        Again the complexity of handling multiple structure definitions
        for a single operation is too burdensome. New operations should
        be added instead of modifying existing structures for a minor
        version.

        This rule does not preclude the following adaptations in a minor
        version.

        o  adding bits to flag fields such as new attributes to
           GETATTR's bitmap4 data type

        o  adding bits to existing attributes like ACLs that have flag
           words

        o  extending enumerated types (including NFS4ERR_*) with new
           values

   4    Minor versions may not modify the structure of existing
        attributes.

   5    Minor versions may not delete operations.

        This prevents the potential reuse of a particular operation
        "slot" in a future minor version.

   6    Minor versions may not delete attributes.

   7    Minor versions may not delete flag bits or enumeration values.

   8    Minor versions may declare an operation as mandatory to NOT
        implement.

        Specifying an operation as "mandatory to not implement" is
        equivalent to obsoleting an operation. For the client, it means
        that the operation should not be sent to the server. For the
        server, an NFS error can be returned as opposed to "dropping"
        the request as an XDR decode error. This approach allows for
        the obsolescence of an operation while maintaining its structure
        so that a future minor version can reintroduce the operation.

   8.1  Minor versions may declare attributes mandatory to NOT
        implement.

   8.2  Minor versions may declare flag bits or enumeration values as
        mandatory to NOT implement.

   9    Minor versions may downgrade features from mandatory to
        recommended, or recommended to optional.

   10   Minor versions may upgrade features from optional to recommended
        or recommended to mandatory.

   11   A client and server that support minor version X must support
        minor versions 0 (zero) through X-1 as well.

   12   No new features may be introduced as mandatory in a minor
        version.

        This rule allows for the introduction of new functionality and
        forces the use of implementation experience before designating a
        feature as mandatory.

   13   A client MUST NOT attempt to use a stateid, file handle, or
        similar returned object from the COMPOUND procedure with minor
        version X for another COMPOUND procedure with minor version Y,
        where X != Y.

11. Internationalization

   The primary area in which NFS needs to deal with
   internationalization, or I18N, is with respect to file names and
   other strings as used within the protocol. The choice of string
   representation must allow reasonable name/string access to clients
   which use various languages. The UTF-8 encoding of the UCS as
   defined by [ISO10646] allows for this type of access and follows the
   policy described in "IETF Policy on Character Sets and Languages",
   [RFC2277].
   This choice is explained further in the following.

11.1. Universal Versus Local Character Sets

   [RFC1345] describes a table of 16 bit characters for many different
   languages (the bit encodings match Unicode, though of course RFC1345
   is somewhat out of date with respect to current Unicode assignments).
   Each character from each language has a unique 16 bit value in the 16
   bit character set. Thus this table can be thought of as a universal
   character set. [RFC1345] then talks about groupings of subsets of
   the entire 16 bit character set into "Charset Tables". For example
   one might take all the Greek characters from the 16 bit table (which
   are consecutively allocated), and normalize their offsets to a table
   that fits in 7 bits. Thus it is determined that "lower case alpha"
   is in the same position as "upper case a" in the US-ASCII table, and
   "upper case alpha" is in the same position as "lower case a" in the
   US-ASCII table.

   These normalized subset character sets can be thought of as "local
   character sets", suitable for an operating system locale.

   Local character sets are not suitable for the NFS protocol. Consider
   someone who creates a file with a name in a Swedish character set.
   If someone else later goes to access the file with their locale set
   to the Swedish language, then there are no problems. But if someone
   in say the US-ASCII locale goes to access the file, the file name
   will look very different, because the Swedish characters in the 7 bit
   table will now be represented in US-ASCII characters on the display.
   It would be preferable to give the US-ASCII user a way to display the
   file name using Swedish glyphs. In order to do that, the NFS protocol
   would have to include the locale with the file name on each operation
   to create a file.
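
   The effect of reading a name under the wrong local character set can
   be seen in a short sketch. The example is illustrative only and
   makes a substitution: it uses the 8 bit ISO 8859-1 and ISO 8859-7
   character sets as stand-ins for the 7 bit national tables discussed
   above, and the render helper and the sample file name are
   hypothetical.

```python
def render(name_octets: bytes, local_charset: str) -> str:
    """Decode stored name octets the way a client in that locale would."""
    return name_octets.decode(local_charset)


# A name created by a user whose locale uses ISO 8859-1 (assumed here
# as a stand-in for a Swedish local character set).
stored = "r\u00e4ksm\u00f6rg\u00e5s".encode("iso-8859-1")

# Same octets, same locale: the creator's name comes back intact.
swedish_view = render(stored, "iso-8859-1")

# Same octets, Greek locale: every octet above 0x7F maps to a
# different character, so the name looks very different.
greek_view = render(stored, "iso-8859-7")
```

   Nothing in the stored octets says which table was meant, which is
   why the locale would have to travel with the name.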

   But then what of the situation when there is a path name on the
   server like:

      /component-1/component-2/component-3

   Each component could have been created with a different locale. If
   one issues CREATE with a multi-component path name, and if some of
   the leading components already exist, what is to be done with the
   existing components? Is the current locale attribute replaced with
   the user's current one? These types of situations quickly become too
   complex when there is an alternate solution.

   If the NFS version 4 protocol used a universal 16 bit or 32 bit
   character set (or an encoding of a 16 bit or 32 bit character set
   into octets), then the server and client need not care if the locale
   of the user accessing the file is different than the locale of the
   user who created the file. The unique 16 bit or 32 bit encoding of
   the character allows for determination of what language the character
   is from and also how to display that character on the client. The
   server need not know what locales are used.

11.2. Overview of Universal Character Set Standards

   The previous section makes a case for using a universal character
   set. This section makes the case for using UTF-8 as the specific
   universal character set for the NFS version 4 protocol.

   [RFC2279] discusses UTF-* (UTF-8 and other UTF-XXX encodings),
   Unicode, and UCS-*. There are two standards bodies managing
   universal code sets:

   o  ISO/IEC which has the standard 10646-1

   o  Unicode which has the Unicode standard

   Both standards bodies have pledged to track each other's assignments
   of character codes.

   The following is a brief analysis of the various standards.

   UCS       Universal Character Set. This is ISO/IEC 10646-1: "a
             multi-octet character set called the Universal Character
             Set (UCS), which encompasses most of the world's writing
             systems."

   UCS-2     a two octet per character encoding that addresses the first
             2^16 characters of UCS. Currently there are no UCS
             characters beyond that range.

   UCS-4     a four octet per character encoding that permits the
             encoding of up to 2^31 characters.

   UTF       UTF is an abbreviation of the term "UCS transformation
             format" and is used in the naming of various standards for
             encoding of UCS characters as described below.

   UTF-1     Only historical interest; it has been removed from 10646-1.

   UTF-7     Encodes the entire "repertoire" of UCS "characters using
             only octets with the higher order bit clear". [RFC2152]
             describes UTF-7. UTF-7 accomplishes this by reserving one
             of the 7bit US-ASCII characters as a "shift" character to
             indicate non-US-ASCII characters.

   UTF-8     Unlike UTF-7, uses all 8 bits of the octets. US-ASCII
             characters are encoded as before unchanged. Any octet with
             the high bit cleared can only mean a US-ASCII character.
             The high bit set means that a UCS character is being
             encoded.

   UTF-16    Encodes UCS-4 characters into UCS-2 characters using a
             reserved range in UCS-2.

   Unicode   Unicode and UCS-2 are the same; [RFC2279] states:

             Up to the present time, changes in Unicode and amendments
             to ISO/IEC 10646 have tracked each other, so that the
             character repertoires and code point assignments have
             remained in sync. The relevant standardization committees
             have committed to maintain this very useful synchronism.

11.3. Difficulties with UCS-4, UCS-2, Unicode

   Adapting existing applications and file systems to multi-octet
   schemes like UCS and Unicode can be difficult. A significant amount
   of code has been written to process streams of bytes. Also there are
   many existing stored objects described with 7 bit or 8 bit
   characters.
   Doubling or quadrupling the bandwidth and storage
   requirements seems like an expensive way to accomplish I18N.

   UCS-2 and Unicode are "only" 16 bits long. That might seem to be
   enough but, according to [Unicode1], 49,194 Unicode characters are
   already assigned. According to [Unicode2] there are still more
   languages that need to be added.

11.4. UTF-8 and its solutions

   UTF-8 solves problems for NFS that exist with the use of UCS and
   Unicode. UTF-8 will encode 16 bit and 32 bit characters in a way
   that will be compact for most users. The encoding table from UCS-4 to
   UTF-8, as copied from [RFC2279], is:

   UCS-4 range (hex.)    UTF-8 octet sequence (binary)
   0000 0000-0000 007F   0xxxxxxx
   0000 0080-0000 07FF   110xxxxx 10xxxxxx
   0000 0800-0000 FFFF   1110xxxx 10xxxxxx 10xxxxxx
   0001 0000-001F FFFF   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
   0020 0000-03FF FFFF   111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
   0400 0000-7FFF FFFF   1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
                         10xxxxxx

   See [RFC2279] for precise encoding and decoding rules. Note that,
   because of UTF-16, the algorithm from Unicode/UCS-2 to UTF-8 needs to
   account for the reserved range between D800 and DFFF.

   Note that the 16 bit UCS or Unicode characters require no more than 3
   octets to encode into UTF-8.

   Interestingly, UTF-8 has room to handle characters larger than 31
   bits, because the leading octet of form:

      1111111x

   is not defined. If needed, ISO could either use that octet to
   indicate a sequence of an encoded 8 octet character, or perhaps use
   11111110 to permit the next octet to indicate an even more expandable
   character set.

   So using UTF-8 to represent character encodings means never having to
   run out of room.

11.5. Normalization

   The client and server operating environments may differ in their
   policies and operational methods with respect to character
   normalization (see [Unicode1] for a discussion of normalization
   forms). This difference may also exist between applications on the
   same client. This adds to the difficulty of providing a single
   normalization policy for the protocol that allows for maximal
   interoperability. This issue is similar to the character case issues
   where the server may or may not support case insensitive file name
   matching and may or may not preserve the character case when storing
   file names. The protocol does not mandate a particular behavior but
   allows for the various permutations.

   The NFS version 4 protocol does not mandate the use of a particular
   normalization form at this time. A later revision of this
   specification may specify a particular normalization form.
   Therefore, the server and client can expect that they may receive
   unnormalized characters within protocol requests and responses. If
   the operating environment requires normalization, then the
   implementation must normalize the various UTF-8 encoded strings
   within the protocol before presenting the information to an
   application (at the client) or local file system (at the server).

12. Error Definitions

   NFS error numbers are assigned to failed operations within a compound
   request. A compound request contains a number of NFS operations that
   have their results encoded in sequence in a compound reply. The
   results of successful operations will consist of an NFS4_OK status
   followed by the encoded results of the operation. If an NFS
   operation fails, an error status will be entered in the reply and the
   compound request will be terminated.

   A description of each defined error follows:

   NFS4_OK                 Indicates the operation completed successfully.

   NFS4ERR_ACCESS          Permission denied. The caller does not have the
                           correct permission to perform the requested
                           operation. Contrast this with NFS4ERR_PERM,
                           which restricts itself to owner or privileged
                           user permission failures.

   NFS4ERR_ATTRNOTSUPP     An attribute specified is not supported by the
                           server. Does not apply to the GETATTR
                           operation.

   NFS4ERR_BADHANDLE       Illegal NFS file handle. The file handle failed
                           internal consistency checks.

   NFS4ERR_BADOWNER        An owner, owner_group, or ACL attribute value
                           can not be translated to local representation.

   NFS4ERR_BADTYPE         An attempt was made to create an object of a
                           type not supported by the server.

   NFS4ERR_BAD_COOKIE      READDIR cookie is stale.

   NFS4ERR_BAD_SEQID       The sequence number in a locking request is
                           neither the next expected number nor the last
                           number processed.

   NFS4ERR_BAD_STATEID     A stateid generated by the current server
                           instance, but which does not designate any
                           locking state (either current or superseded)
                           for a current lockowner-file pair, was used.

   NFS4ERR_BADXDR          The server encountered an XDR decoding error
                           while processing an operation.

   NFS4ERR_CLID_INUSE      The SETCLIENTID procedure has found that a
                           client id is already in use by another client.

   NFS4ERR_DELAY           The server initiated the request, but was not
                           able to complete it in a timely fashion. The
                           client should wait and then try the request
                           with a new RPC transaction ID. For example,
                           this error should be returned from a server
                           that supports hierarchical storage and receives
                           a request to process a file that has been
                           migrated. In this case, the server should start
                           the immigration process and respond to the
                           client with this error. This error may also
                           occur when a necessary delegation recall makes
                           processing a request in a timely fashion
                           impossible.

   NFS4ERR_DENIED          An attempt to lock a file is denied. Since
                           this may be a temporary condition, the client
                           is encouraged to retry the lock request until
                           the lock is accepted.

   NFS4ERR_DQUOT           Resource (quota) hard limit exceeded. The
                           user's resource limit on the server has been
                           exceeded.

   NFS4ERR_EXIST           File exists. The file specified already exists.

   NFS4ERR_EXPIRED         A lease has expired that is being used in the
                           current procedure.

   NFS4ERR_FBIG            File too large. The operation would have caused
                           a file to grow beyond the server's limit.

   NFS4ERR_FHEXPIRED       The file handle provided is volatile and has
                           expired at the server.

   NFS4ERR_GRACE           The server is in its recovery or grace period
                           which should match the lease period of the
                           server.

   NFS4ERR_INVAL           Invalid argument or unsupported argument for an
                           operation. Two examples are attempting a
                           READLINK on an object other than a symbolic
                           link or attempting to SETATTR a time field on a
                           server that does not support this operation.

   NFS4ERR_IO              I/O error. A hard error (for example, a disk
                           error) occurred while processing the requested
                           operation.

   NFS4ERR_ISDIR           Is a directory. The caller specified a
                           directory in a non-directory operation.

   NFS4ERR_LEASE_MOVED     A lease being renewed is associated with a file
                           system that has been migrated to a new server.

   NFS4ERR_LOCKED          A read or write operation was attempted on a
                           locked file.

   NFS4ERR_LOCK_RANGE      A lock request is operating on a sub-range of a
                           current lock for the lock owner and the server
                           does not support this type of request.

   NFS4ERR_MINOR_VERS_MISMATCH
                           The server has received a request that
                           specifies an unsupported minor version. The
                           server must return a COMPOUND4res with a zero
                           length operations result array.

   NFS4ERR_MLINK           Too many hard links.

   NFS4ERR_MOVED           The filesystem which contains the current
                           filehandle object has been relocated or
                           migrated to another server. The client may
                           obtain the new filesystem location by obtaining
                           the "fs_locations" attribute for the current
                           filehandle. For further discussion, refer to
                           the section "Filesystem Migration or
                           Relocation".

   NFS4ERR_NAMETOOLONG     The filename in an operation was too long.

   NFS4ERR_NODEV           No such device.

   NFS4ERR_NOENT           No such file or directory. The file or
                           directory name specified does not exist.

   NFS4ERR_NOFILEHANDLE    The logical current file handle value has not
                           been set properly. This may be a result of a
                           malformed COMPOUND operation (i.e. no PUTFH or
                           PUTROOTFH before an operation that requires the
                           current file handle be set).

   NFS4ERR_NO_GRACE        A reclaim of client state has fallen outside of
                           the grace period of the server. As a result,
                           the server can not guarantee that conflicting
                           state has not been provided to another client.

   NFS4ERR_NOSPC           No space left on device. The operation would
                           have caused the server's file system to exceed
                           its limit.

   NFS4ERR_NOTDIR          Not a directory. The caller specified a
                           non-directory in a directory operation.

   NFS4ERR_NOTEMPTY        An attempt was made to remove a directory that
                           was not empty.

   NFS4ERR_NOTSUPP         Operation is not supported.

   NFS4ERR_NOT_SAME        This error is returned by the VERIFY operation
                           to signify that the attributes compared were
                           not the same as provided in the client's
                           request.

   NFS4ERR_NXIO            I/O error. No such device or address.

   NFS4ERR_OLD_STATEID     A stateid which designates the locking state
                           for a lockowner-file at an earlier time was
                           used.

   NFS4ERR_OPENMODE        The client attempted a READ, WRITE, or SETATTR
                           operation not sanctioned by the stateid passed
                           (e.g. writing to a file opened only for read).

   NFS4ERR_PERM            Not owner. The operation was not allowed
                           because the caller is either not a privileged
                           user (root) or not the owner of the target of
                           the operation.

   NFS4ERR_READDIR_NOSPC   The encoded response to a READDIR request
                           exceeds the size limit set by the initial
                           request.

   NFS4ERR_RECLAIM_BAD     The reclaim provided by the client does not
                           match any of the server's state consistency
                           checks and is bad.

   NFS4ERR_RECLAIM_CONFLICT
                           The reclaim provided by the client has
                           encountered a conflict and can not be provided.
                           Potentially indicates a misbehaving client.

   NFS4ERR_RESOURCE        For the processing of the COMPOUND procedure,
                           the server may exhaust available resources and
                           can not continue processing procedures within
                           the COMPOUND operation. This error will be
                           returned from the server in those instances of
                           resource exhaustion related to the processing
                           of the COMPOUND procedure.

   NFS4ERR_ROFS            Read-only file system. A modifying operation
                           was attempted on a read-only file system.

   NFS4ERR_SAME            This error is returned by the NVERIFY operation
                           to signify that the attributes compared were
                           the same as provided in the client's request.

   NFS4ERR_SERVERFAULT     An error occurred on the server which does not
                           map to any of the legal NFS version 4 protocol
                           error values. The client should translate this
                           into an appropriate error. UNIX clients may
                           choose to translate this to EIO.

   NFS4ERR_SHARE_DENIED    An attempt to OPEN a file with a share
                           reservation has failed because of a share
                           conflict.

   NFS4ERR_STALE           Invalid file handle. The file handle given in
                           the arguments was invalid. The file referred to
                           by that file handle no longer exists or access
                           to it has been revoked.

   NFS4ERR_STALE_CLIENTID  A clientid not recognized by the server was
                           used in a locking or SETCLIENTID_CONFIRM
                           request.

   NFS4ERR_STALE_STATEID   A stateid generated by an earlier server
                           instance was used.

   NFS4ERR_SYMLINK         The current file handle provided for a LOOKUP
                           is not a directory but a symbolic link. Also
                           used if the final component of the OPEN path is
                           a symbolic link.

   NFS4ERR_TOOSMALL        Buffer or request is too small.

   NFS4ERR_WRONGSEC        The security mechanism being used by the client
                           for the procedure does not match the server's
                           security policy. The client should change the
                           security mechanism being used and retry the
                           operation.

   NFS4ERR_XDEV            Attempt to do a cross-device hard link.

13. NFS Version 4 Requests

   For the NFS version 4 RPC program, there are two traditional RPC
   procedures: NULL and COMPOUND. All other functionality is defined as
   a set of operations and these operations are defined in normal
   XDR/RPC syntax and semantics. However, these operations are
   encapsulated within the COMPOUND procedure. This requires that the
   client combine one or more of the NFS version 4 operations into a
   single request.

   The NFS4_CALLBACK program is used to provide server to client
   signaling and is constructed in a fashion similar to the NFS version
   4 program. The procedures CB_NULL and CB_COMPOUND are defined in the
   same way as NULL and COMPOUND are within the NFS program. The
   CB_COMPOUND request also encapsulates the remaining operations of the
   NFS4_CALLBACK program. There is no predefined RPC program number for
   the NFS4_CALLBACK program. It is up to the client to specify a
   program number in the "transient" program range. The program and
   port number of the NFS4_CALLBACK program are provided by the client
   as part of the SETCLIENTID operation and are therefore fixed for the
   life of the client instantiation.

13.1. Compound Procedure

   The COMPOUND procedure provides the opportunity for better
   performance within high latency networks. The client can avoid
   cumulative latency of multiple RPCs by combining multiple dependent
   operations into a single COMPOUND procedure. A compound operation
   may provide for protocol simplification by allowing the client to
   combine basic procedures into a single request that is customized for
   the client's environment.

   The CB_COMPOUND procedure precisely parallels the features of
   COMPOUND as described above.

   The basic structure of the COMPOUND procedure is:

   +-----+--------------+--------+-----------+-----------+-----------+--
   | tag | minorversion | numops | op + args | op + args | op + args |
   +-----+--------------+--------+-----------+-----------+-----------+--

   and the reply's structure is:

   +------------+-----+--------+-----------------------+--
   |last status | tag | numres | status + op + results |
   +------------+-----+--------+-----------------------+--

   The numops and numres fields, used in the depiction above, represent
   the count for the counted array encoding used to signify the number
   of arguments or results encoded in the request and response. As per
   the XDR encoding, these counts must match exactly the number of
   operation arguments or results encoded.

13.2. Evaluation of a Compound Request

   The server will process the COMPOUND procedure by evaluating each of
   the operations within the COMPOUND procedure in order. Each
   component operation consists of a 32 bit operation code, followed by
   the argument of length determined by the type of operation. The
   results of each operation are encoded in sequence into a reply
   buffer. The results of each operation are preceded by the opcode and
   a status code (normally zero).
   If an operation results in a non-zero
   status code, the status will be encoded and evaluation of the
   compound sequence will halt and the reply will be returned. Note
   that evaluation stops even in the event of "non error" conditions
   such as NFS4ERR_SAME.

   There are no atomicity requirements for the operations contained
   within the COMPOUND procedure. The operations being evaluated as
   part of a COMPOUND request may be evaluated simultaneously with other
   COMPOUND requests that the server receives.

   The client is responsible for recovering from any partially
   completed COMPOUND procedure. Partially completed COMPOUND
   procedures may occur at any point due to errors such as
   NFS4ERR_RESOURCE and NFS4ERR_DELAY. This may occur even given an
   otherwise valid operation string. Further, a server reboot which
   occurs in the middle of processing a COMPOUND procedure may leave the
   client with the difficult task of determining how far COMPOUND
   processing has proceeded. Therefore, the client should avoid overly
   complex COMPOUND procedures in the event of the failure of an
   operation within the procedure.

   Each operation assumes a "current" and "saved" filehandle that is
   available as part of the execution context of the compound request.
   Operations may set, change, or return the current filehandle. The
   "saved" filehandle is used for temporary storage of a filehandle
   value and as operands for the RENAME and LINK operations.

13.3. Synchronous Modifying Operations

   NFS version 4 operations that modify the file system are synchronous.
   When an operation is successfully completed at the server, the client
   can rely on any data associated with the request being on stable
   storage (the one exception is in the case of the file data in a WRITE
   operation with the UNSTABLE option specified).
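
   As an illustration of the in-order evaluation rule of section 13.2
   and of the synchronous behavior just described, the following Python
   sketch evaluates a list of operations and halts at the first non-zero
   status. It is a simplified model under stated assumptions, not a
   server implementation: evaluate_compound(), op_ok(), op_fail(), the
   operation names "A", "B", "C", and the list "applied" (standing in
   for stable storage) are all hypothetical.

```python
NFS4_OK = 0        # success status
NFS4ERR_NOENT = 2  # example failure status used by the demo below


def evaluate_compound(tag, ops):
    """Evaluate (name, handler) pairs in order; stop at the first failure.

    Each handler returns (status, result). The reply carries the status
    of the last operation evaluated and echoes the request tag.
    """
    resarray = []
    status = NFS4_OK
    for name, handler in ops:
        status, result = handler()
        resarray.append((name, status, result))
        if status != NFS4_OK:
            break  # later operations are never evaluated
    return {"status": status, "tag": tag, "resarray": resarray}


applied = []  # hypothetical stand-in for effects on stable storage


def op_ok(name):
    """Build a handler that records its effect and succeeds."""
    def run():
        applied.append(name)
        return NFS4_OK, None
    return run


def op_fail(status):
    """Build a handler that fails without any effect."""
    return lambda: (status, None)


reply = evaluate_compound(
    "demo",
    [("A", op_ok("A")), ("B", op_fail(NFS4ERR_NOENT)), ("C", op_ok("C"))],
)
print(reply["status"], [r[0] for r in reply["resarray"]], applied)
```

   In this model the effects that reach "applied" always form a prefix
   of the operation list, so operation C can never take effect when
   operation B has failed.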

   This implies that any previous operations within the same compound
   request are also reflected in stable storage. This behavior enables
   the client to recover from a partially executed compound request
   which may have resulted from the failure of the server. For
   example, if a compound request contains operations A and B and the
   server is unable to send a response to the client, depending on the
   progress the server made in servicing the request the result of both
   operations may be reflected in stable storage or just operation A may
   be reflected. The server must not have just the results of operation
   B in stable storage.

13.4. Operation Values

   The operations encoded in the COMPOUND procedure are identified by
   operation values. To avoid overlap with the RPC procedure numbers,
   operations 0 (zero) and 1 are not defined. Operation 2 is not
   defined but reserved for future use with minor versioning.

14. NFS Version 4 Procedures

14.1. Procedure 0: NULL - No Operation

   SYNOPSIS

   ARGUMENT

     void;

   RESULT

     void;

   DESCRIPTION

     Standard NULL procedure. Void argument, void response. This
     procedure has no functionality associated with it. Because of this
     it is sometimes used to measure the overhead of processing a
     service request. Therefore, the server should ensure that no
     unnecessary work is done in servicing this procedure.

   ERRORS

     None.

14.2. Procedure 1: COMPOUND - Compound Operations

   SYNOPSIS

     compoundargs -> compoundres

   ARGUMENT

     union nfs_argop4 switch (nfs_opnum4 argop) {
             case : ;
             ...
     };

     struct COMPOUND4args {
             utf8string      tag;
             uint32_t        minorversion;
             nfs_argop4      argarray<>;
     };

   RESULT

     union nfs_resop4 switch (nfs_opnum4 resop){
             case : ;
             ...
     };

     struct COMPOUND4res {
             nfsstat4        status;
             utf8string      tag;
             nfs_resop4      resarray<>;
     };

   DESCRIPTION

     The COMPOUND procedure is used to combine one or more of the NFS
     operations into a single RPC request. The main NFS RPC program has
     two main procedures: NULL and COMPOUND. All other operations use
     the COMPOUND procedure as a wrapper.

     The COMPOUND procedure is used to combine individual operations
     into a single RPC request. The server interprets each of the
     operations in turn. If an operation is executed by the server and
     the status of that operation is NFS4_OK, then the next operation in
     the COMPOUND procedure is executed. The server continues this
     process until there are no more operations to be executed or one of
     the operations has a status value other than NFS4_OK.

     In the processing of the COMPOUND procedure, the server may find
     that it does not have the available resources to execute any or all
     of the operations within the COMPOUND sequence. In this case, the
     error NFS4ERR_RESOURCE will be returned for the particular
     operation within the COMPOUND procedure where the resource
     exhaustion occurred. This assumes that all previous operations
     within the COMPOUND sequence have been evaluated successfully. The
     results for all of the evaluated operations must be returned to the
     client.

     The server will generally choose between two methods of decoding
     the client's request. The first would be the traditional one pass
     XDR decode. If there is an XDR decoding error in this case, the
     RPC XDR decode error would be returned. The second method would be
     to make an initial pass to decode the basic COMPOUND request and
     then to XDR decode the individual operations; the most interesting
     is the decode of attributes. In this case, the server may
     encounter an XDR decode error during the second pass.
     In this
     case, the server would return the error NFS4ERR_BADXDR to signify
     the decode error.

     The COMPOUND arguments contain a "minorversion" field. The initial
     and default value for this field is 0 (zero). This field will be
     used by future minor versions such that the client can communicate
     to the server what minor version is being requested. If the server
     receives a COMPOUND procedure with a minorversion field value that
     it does not support, the server MUST return an error of
     NFS4ERR_MINOR_VERS_MISMATCH and a zero length resultdata array.

     Contained within the COMPOUND results is a "status" field. If the
     results array length is non-zero, this status must be equivalent to
     the status of the last operation that was executed within the
     COMPOUND procedure. Therefore, if an operation incurred an error
     then the "status" value will be the same error value as is being
     returned for the operation that failed.

     Note that operations 0 (zero) and 1 (one) are not defined for the
     COMPOUND procedure. If the server receives an operation array with
     either of these included, an error of NFS4ERR_NOTSUPP must be
     returned. Operation 2 is not defined but reserved for future
     definition and use with minor versioning. If the server receives an
     operation array that contains operation 2 and the minorversion
     field has a value of 0 (zero), an error of NFS4ERR_NOTSUPP is
     returned. If an operation array contains an operation 2 and the
     minorversion field is non-zero and the server does not support the
     minor version, the server returns an error of
     NFS4ERR_MINOR_VERS_MISMATCH. Therefore, the
     NFS4ERR_MINOR_VERS_MISMATCH error takes precedence over all other
     errors.

     It is possible that the server receives a request that contains an
     operation that is beyond the last defined operation (e.g.
     OP_WRITE).
     In this case, the server will fail the
     unknown operation. If this occurs, the server will return an
     operation "opcode" that is 1 greater than the largest defined
     operation. For example, the server would return an opcode of
     OP_WRITE + 1. The server would then return a status of
     NFS4ERR_NOTSUPP to indicate an operation that is not defined and
     therefore not supported.

     The definition of the "tag" in the request is left to the
     implementor. It may be used to summarize the content of the
     compound request for the benefit of packet sniffers and engineers
     debugging implementations. However, the value of "tag" in the
     response MUST be the same value as provided in the request.

   IMPLEMENTATION

     Since an error of any type may occur after only a portion of the
     operations have been evaluated, the client must be prepared to
     recover from any failure. If the source of an NFS4ERR_RESOURCE
     error was a complex or lengthy set of operations, it is likely that
     if the number of operations were reduced the server would be able
     to evaluate them successfully. Therefore, the client is
     responsible for dealing with this type of complexity in recovery.

   ERRORS

     All errors defined in the protocol

14.2.1. Operation 3: ACCESS - Check Access Rights

   SYNOPSIS

     (cfh), accessreq -> supported, accessrights

   ARGUMENT

     const ACCESS4_READ      = 0x00000001;
     const ACCESS4_LOOKUP    = 0x00000002;
     const ACCESS4_MODIFY    = 0x00000004;
     const ACCESS4_EXTEND    = 0x00000008;
     const ACCESS4_DELETE    = 0x00000010;
     const ACCESS4_EXECUTE   = 0x00000020;

     struct ACCESS4args {
             /* CURRENT_FH: object */
             uint32_t        access;
     };

   RESULT

     struct ACCESS4resok {
             uint32_t        supported;
             uint32_t        access;
     };

     union ACCESS4res switch (nfsstat4 status) {
      case NFS4_OK:
              ACCESS4resok   resok4;
      default:
              void;
     };

   DESCRIPTION

     ACCESS determines the access rights that a user, as identified by
     the credentials in the RPC request, has with respect to the file
     system object specified by the current filehandle. The client
     encodes the set of access rights that are to be checked in the bit
     mask "access". The server checks the permissions encoded in the
     bit mask. If a status of NFS4_OK is returned, two bit masks are
     included in the response. The first, "supported", represents the
     access rights that the server can verify reliably. The
     second, "access", represents the access rights available to the
     user for the filehandle provided. On success, the current
     filehandle retains its value.

     Note that the supported field will contain only as many values as
     were originally sent in the arguments. For example, if the client
     sends an ACCESS operation with only the ACCESS4_READ value set and
     the server supports this value, the server will return only
     ACCESS4_READ even if it could have reliably checked other values.

     The results of this operation are necessarily advisory in nature.
     A return status of NFS4_OK and the appropriate bit set in the bit
     mask does not imply that such access will be allowed to the file
     system object in the future.
     This is because access rights can be
     revoked by the server at any time.

     The following access permissions may be requested:

     ACCESS4_READ    Read data from file or read a directory.

     ACCESS4_LOOKUP  Look up a name in a directory (no meaning for
                     non-directory objects).

     ACCESS4_MODIFY  Rewrite existing file data or modify existing
                     directory entries.

     ACCESS4_EXTEND  Write new data or add directory entries.

     ACCESS4_DELETE  Delete an existing directory entry (no meaning for
                     non-directory objects).

     ACCESS4_EXECUTE Execute file (no meaning for a directory).

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     For the NFS version 4 protocol, the use of the ACCESS procedure
     when opening a regular file is deprecated in favor of using OPEN.

     In general, it is not sufficient for the client to attempt to
     deduce access permissions by inspecting the uid, gid, and mode
     fields in the file attributes or by attempting to interpret the
     contents of the ACL attribute. This is because the server may
     perform uid or gid mapping or enforce additional access control
     restrictions. It is also possible that the server may not be in
     the same ID space as the client. In these cases (and perhaps
     others), the client can not reliably perform an access check with
     only current file attributes.

     In the NFS version 2 protocol, the only reliable way to determine
     whether an operation was allowed was to try it and see if it
     succeeded or failed. Using the ACCESS procedure in the NFS version
     4 protocol, the client can ask the server to indicate whether or
     not one or more classes of operations are permitted. The ACCESS
     operation is provided to allow clients to check before doing a
     series of operations which will result in an access failure.
The 5317 OPEN operation provides a point where the server can verify access 5318 to the file object and method to return that information to the 5319 client. The ACCESS operation is still useful for directory 5320 operations or for use in the case the UNIX API "access" is used on 5321 the client. 5323 The information returned by the server in response to an ACCESS 5324 call is not permanent. It was correct at the exact time that the 5325 server performed the checks, but not necessarily afterwards. The 5326 server can revoke access permission at any time. 5328 The client should use the effective credentials of the user to 5329 build the authentication information in the ACCESS request used to 5330 determine access rights. It is the effective user and group 5331 credentials that are used in subsequent read and write operations. 5333 Many implementations do not directly support the ACCESS4_DELETE 5334 permission. Operating systems like UNIX will ignore the 5335 ACCESS4_DELETE bit if set on an access request on a non-directory 5336 object. In these systems, delete permission on a file is 5337 determined by the access permissions on the directory in which the 5338 file resides, instead of being determined by the permissions of the 5339 file itself. Therefore, the mask returned enumerating which access 5340 rights can be determined will have the ACCESS4_DELETE value set to 5341 0. This indicates to the client that the server was unable to 5342 check that particular access right. The ACCESS4_DELETE bit in the 5343 access mask returned will then be ignored by the client. 5345 ERRORS 5347 NFS4ERR_ACCESS 5348 NFS4ERR_BADHANDLE 5349 NFS4ERR_BADXDR 5350 NFS4ERR_DELAY 5351 NFS4ERR_FHEXPIRED 5352 NFS4ERR_IO 5353 NFS4ERR_MOVED 5354 NFS4ERR_NOFILEHANDLE 5355 NFS4ERR_RESOURCE 5356 NFS4ERR_SERVERFAULT 5357 NFS4ERR_STALE 5358 NFS4ERR_WRONGSEC 5360 14.2.2. 
Operation 4: CLOSE - Close File

   SYNOPSIS

     (cfh), seqid, open_stateid -> open_stateid

   ARGUMENT

     struct CLOSE4args {
             /* CURRENT_FH: object */
             seqid4          seqid;
             stateid4        open_stateid;
     };

   RESULT

     union CLOSE4res switch (nfsstat4 status) {
     case NFS4_OK:
             stateid4        open_stateid;
     default:
             void;
     };

   DESCRIPTION

     The CLOSE operation releases share reservations for the regular or
     named attribute file as specified by the current filehandle. The
     share reservations and other state information released at the
     server as a result of this CLOSE are associated only with the
     supplied stateid. The sequence id provides for the correct
     ordering. State associated with other OPENs is not affected.

     If record locks are held, the client SHOULD release all locks
     before issuing a CLOSE. The server MAY free all outstanding locks
     on CLOSE but some servers may not support the CLOSE of a file that
     still has record locks held. The server MUST return failure if any
     locks would exist after the CLOSE.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     Even though CLOSE returns a stateid, this stateid is not useful to
     the client and should be treated as deprecated. CLOSE "shuts down"
     the state associated with all OPENs for the file by a single
     open_owner. As noted above, CLOSE will either release all file
     locking state or return an error. Therefore, the stateid returned
     by CLOSE is not useful for operations that follow.
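
     The record-lock rule for CLOSE can be sketched with a toy server
     model. This is only an illustration under stated assumptions: the
     OpenState class, the close() helper, and the free_locks_on_close
     switch are hypothetical names, not part of the protocol. The
     sketch shows the two permitted outcomes: the server either frees
     the remaining locks and closes, or refuses with NFS4ERR_LOCKS_HELD
     so that no lock outlives the CLOSE.

```python
from dataclasses import dataclass, field

NFS4_OK = "NFS4_OK"
NFS4ERR_LOCKS_HELD = "NFS4ERR_LOCKS_HELD"


@dataclass
class OpenState:
    """Hypothetical per-open state kept by a toy server."""
    stateid: int
    record_locks: set = field(default_factory=set)  # {(offset, length), ...}


def close(open_files: dict, stateid: int, free_locks_on_close: bool) -> str:
    """Model CLOSE: no record lock may remain once the open state is gone."""
    state = open_files[stateid]
    if state.record_locks:
        if not free_locks_on_close:
            # A server that does not support closing with locks held
            # must fail the request rather than leave locks behind.
            return NFS4ERR_LOCKS_HELD
        state.record_locks.clear()
    # Only the state tied to this stateid is released; other opens stay.
    del open_files[stateid]
    return NFS4_OK
```

     A client that follows the SHOULD above and unlocks before closing
     never reaches the failure branch.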

   ERRORS

     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_SEQID
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_EXPIRED
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCKS_HELD
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_OLD_STATEID
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_STATEID

14.2.3.  Operation 5: COMMIT - Commit Cached Data

   SYNOPSIS

     (cfh), offset, count -> verifier

   ARGUMENT

     struct COMMIT4args {
             /* CURRENT_FH: file */
             offset4         offset;
             count4          count;
     };

   RESULT

     struct COMMIT4resok {
             verifier4       writeverf;
     };

     union COMMIT4res switch (nfsstat4 status) {
     case NFS4_OK:
             COMMIT4resok    resok4;
     default:
             void;
     };

   DESCRIPTION

     The COMMIT operation forces or flushes data to stable storage for
     the file specified by the current file handle. The flushed data is
     that which was previously written with a WRITE operation which had
     the stable field set to UNSTABLE4.

     The offset specifies the position within the file where the flush
     is to begin. An offset value of 0 (zero) means to flush data
     starting at the beginning of the file. The count specifies the
     number of bytes of data to flush. If count is 0 (zero), a flush
     from offset to the end of the file is done.

     The server returns a write verifier upon successful completion of
     the COMMIT. The write verifier is used by the client to determine
     if the server has restarted or rebooted between the initial
     WRITE(s) and the COMMIT. The client does this by comparing the
     write verifier returned from the initial writes and the verifier
     returned by the COMMIT procedure. The server must vary the value
     of the write verifier at each server event or instantiation that
     may lead to a loss of uncommitted data.
     Most commonly this occurs when the server is rebooted; however,
     other events at the server may result in uncommitted data loss as
     well.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     The COMMIT procedure is similar in operation and semantics to the
     POSIX fsync(2) system call that synchronizes a file's state with
     the disk (file data and metadata are flushed to disk or stable
     storage). COMMIT performs the same operation for a client, flushing
     any unsynchronized data and metadata on the server to the server's
     disk or stable storage for the specified file. Like fsync(2), it
     may be that there is some modified data or no modified data to
     synchronize. The data may have been synchronized by the server's
     normal periodic buffer synchronization activity. COMMIT should
     return NFS4_OK, unless there has been an unexpected error.

     COMMIT differs from fsync(2) in that it is possible for the client
     to flush a range of the file (most likely triggered by a
     buffer-reclamation scheme on the client before the file has been
     completely written).

     The server implementation of COMMIT is reasonably simple. If the
     server receives a full file COMMIT request, that is, one starting
     at offset 0 with count 0, it should do the equivalent of
     fsync()'ing the file. Otherwise, it should arrange to have the
     cached data in the range specified by offset and count flushed to
     stable storage. In both cases, any metadata associated with the
     file must be flushed to stable storage before returning. It is not
     an error for there to be nothing to flush on the server. This means
     that the data and metadata that needed to be flushed have already
     been flushed or lost during the last server failure.

     The client implementation of COMMIT is a little more complex.
     There are two reasons for wanting to commit a client buffer to
     stable storage.
     The first is that the client wants to reuse a buffer. In this
     case, the offset and count of the buffer are sent to the server in
     the COMMIT request. The server then flushes any cached data based
     on the offset and count, and flushes any metadata associated with
     the file. It then returns the status of the flush and the write
     verifier. The other reason for the client to generate a COMMIT is
     for a full file flush, such as may be done at close. In this case,
     the client would gather all of the buffers for this file that
     contain uncommitted data, do the COMMIT operation with an offset of
     0 and count of 0, and then free all of those buffers. Any other
     dirty buffers would be sent to the server in the normal fashion.

     After a buffer is written by the client with the stable parameter
     set to UNSTABLE4, the buffer must be considered as modified by the
     client until the buffer has either been flushed via a COMMIT
     operation or written via a WRITE operation with stable parameter
     set to FILE_SYNC4 or DATA_SYNC4. This is done to prevent the buffer
     from being freed and reused before the data can be flushed to
     stable storage on the server.

     When a response is returned from either a WRITE or a COMMIT
     operation and it contains a write verifier that is different from
     the one previously returned by the server, the client will need to
     retransmit all of the buffers containing uncommitted cached data to
     the server. How this is to be done is up to the implementor. If
     there is only one buffer of interest, then it should probably be
     sent back over in a WRITE request with the appropriate stable
     parameter. If there is more than one buffer, it might be
     worthwhile retransmitting all of the buffers in WRITE requests with
     the stable parameter set to UNSTABLE4 and then retransmitting the
     COMMIT operation to flush all of the data on the server to stable
     storage.
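
     The verifier comparison described above can be sketched as
     client-side bookkeeping. This is a minimal illustration, not a
     prescribed implementation: the UncommittedTracker class and its
     method names are hypothetical, buffers are keyed by file offset
     for simplicity, and the actual WRITE and COMMIT calls are left
     out. The sketch keeps every UNSTABLE4 buffer until a COMMIT
     returns an unchanged verifier, and reports the buffers to
     retransmit whenever the verifier changes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class UncommittedTracker:
    """Hypothetical client-side record of UNSTABLE4 data for one file."""
    verifier: Optional[bytes] = None
    dirty: Dict[int, bytes] = field(default_factory=dict)  # offset -> data

    def _changed_buffers(self, writeverf: bytes) -> Dict[int, bytes]:
        # A verifier that differs from the last one seen means the server
        # may have lost uncommitted data, so everything still dirty must
        # be sent again.
        changed = self.verifier is not None and writeverf != self.verifier
        self.verifier = writeverf
        return dict(self.dirty) if changed else {}

    def on_write_reply(self, offset: int, data: bytes, writeverf: bytes):
        """Handle an UNSTABLE4 WRITE reply; return buffers to retransmit."""
        resend = self._changed_buffers(writeverf)
        self.dirty[offset] = data  # keep the buffer until it is committed
        return resend

    def on_commit_reply(self, writeverf: bytes):
        """Handle a COMMIT reply; return buffers to retransmit, if any."""
        resend = self._changed_buffers(writeverf)
        if not resend:
            self.dirty.clear()  # data is on stable storage; buffers may go
        return resend
```

     Buffers are freed only on the path where the verifier is
     unchanged, which matches the requirement that an UNSTABLE4 buffer
     stay modified until it has been committed.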
     The timing of these retransmissions is left to the implementor.

     The above description applies to page-cache-based systems as well
     as buffer-cache-based systems. In those systems, the virtual
     memory system will need to be modified instead of the buffer cache.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_ISDIR
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.4.  Operation 6: CREATE - Create a Non-Regular File Object

   SYNOPSIS

     (cfh), name, type, attrs -> (cfh), change_info, attrs_set

   ARGUMENT

     union createtype4 switch (nfs_ftype4 type) {
     case NF4LNK:
             linktext4       linkdata;
     case NF4BLK:
     case NF4CHR:
             specdata4       devdata;
     case NF4SOCK:
     case NF4FIFO:
     case NF4DIR:
             void;
     };

     struct CREATE4args {
             /* CURRENT_FH: directory for creation */
             component4      objname;
             createtype4     objtype;
             fattr4          createattrs;
     };

   RESULT

     struct CREATE4resok {
             change_info4    cinfo;
             bitmap4         attrset;        /* attributes set */
     };

     union CREATE4res switch (nfsstat4 status) {
     case NFS4_OK:
             CREATE4resok    resok4;
     default:
             void;
     };

   DESCRIPTION

     The CREATE operation creates a non-regular file object in a
     directory with a given name. The OPEN procedure MUST be used to
     create a regular file.

     The objname specifies the name for the new object. The objtype
     determines the type of object to be created: directory, symlink,
     etc.

     If an object of the same name already exists in the directory, the
     server will return the error NFS4ERR_EXIST.

     For the directory where the new file object was created, the server
     returns change_info4 information in cinfo.
     With the atomic field of the change_info4 struct, the server will
     indicate if the before and after change attributes were obtained
     atomically with respect to the file object creation.

     If the objname has a length of 0 (zero), or if objname does not
     obey the UTF-8 definition, the error NFS4ERR_INVAL will be
     returned.

     The current filehandle is replaced by that of the new object.

     The createattrs specifies the initial set of attributes for the
     object. The set of attributes may include any writable attribute
     valid for the object type. When the operation is successful, the
     server will return to the client an attribute mask signifying which
     attributes were successfully set for the object.

   IMPLEMENTATION

     If the client desires to set attribute values after the create, a
     SETATTR operation can be added to the COMPOUND request so that the
     appropriate attributes will be set.

     It may be that the server's implementation places special meaning
     on the names "." and ".." where they refer to special directories.
     If this is the case and the client requests to CREATE a directory
     (or other object) with these names, the server may return
     NFS4ERR_INVAL. However, if the server does not place special
     meaning on these names and a file object already exists with a
     matching name, the server may return NFS4ERR_EXIST.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_ATTRNOTSUPP
     NFS4ERR_BADHANDLE
     NFS4ERR_BADTYPE
     NFS4ERR_BADXDR
     NFS4ERR_DQUOT
     NFS4ERR_EXIST
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOSPC
     NFS4ERR_NOTDIR
     NFS4ERR_NOTSUPP
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.5.
Operation 7: DELEGPURGE - Purge Delegations Awaiting Recovery

   SYNOPSIS

     clientid ->

   ARGUMENT

     struct DELEGPURGE4args {
             clientid4       clientid;
     };

   RESULT

     struct DELEGPURGE4res {
             nfsstat4        status;
     };

   DESCRIPTION

     Purges all of the delegations awaiting recovery for a given client.
     This is useful for clients which do not commit delegation
     information to stable storage to indicate that conflicting requests
     need not be delayed by the server awaiting recovery of delegation
     information.

     This operation should be used by clients that record delegation
     information on stable storage on the client. In this case,
     DELEGPURGE should be issued immediately after doing delegation
     recovery on all delegations known to the client. Doing so will
     notify the server that no additional delegations for the client
     will be recovered, allowing it to free resources and avoid delaying
     other clients who make requests that conflict with the unrecovered
     delegations. The set of delegations known to the server and the
     client may be different. The reason for this is that a client may
     fail after making a request which resulted in delegation but before
     it received the results and committed them to the client's stable
     storage.

   ERRORS

     NFS4ERR_BADXDR
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE_CLIENTID

14.2.6.  Operation 8: DELEGRETURN - Return Delegation

   SYNOPSIS

     (cfh), stateid ->

   ARGUMENT

     struct DELEGRETURN4args {
             /* CURRENT_FH: delegated file */
             stateid4        stateid;
     };

   RESULT

     struct DELEGRETURN4res {
             nfsstat4        status;
     };

   DESCRIPTION

     Returns the delegation represented by the current filehandle and
     stateid.

   ERRORS

     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_EXPIRED
     NFS4ERR_OLD_STATEID
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE_STATEID

14.2.7.  Operation 9: GETATTR - Get Attributes

   SYNOPSIS

     (cfh), attrbits -> attrbits, attrvals

   ARGUMENT

     struct GETATTR4args {
             /* CURRENT_FH: directory or file */
             bitmap4         attr_request;
     };

   RESULT

     struct GETATTR4resok {
             fattr4          obj_attributes;
     };

     union GETATTR4res switch (nfsstat4 status) {
     case NFS4_OK:
             GETATTR4resok   resok4;
     default:
             void;
     };

   DESCRIPTION

     The GETATTR operation will obtain attributes for the file system
     object specified by the current filehandle. The client sets a bit
     in the bitmap argument for each attribute value that it would like
     the server to return. The server returns an attribute bitmap that
     indicates the attribute values that it was able to return,
     followed by the attribute values ordered lowest attribute number
     first.

     The server must return a value for each attribute that the client
     requests if the attribute is supported by the server. If the
     server does not support an attribute or cannot approximate a useful
     value then it must not return the attribute value and must not set
     the attribute bit in the result bitmap. The server must return an
     error if it supports an attribute but cannot obtain its value. In
     that case no attribute values will be returned.

     All servers must support the mandatory attributes as specified in
     the section "File Attributes".

     On success, the current filehandle retains its value.
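
     The request and result bitmaps can be illustrated with a small
     sketch. It assumes that a bitmap4 is a variable-length array of
     32-bit words in which attribute number n is bit (n mod 32) of word
     (n div 32); the helper names encode_bitmap4, decode_bitmap4, and
     getattr_reply are hypothetical, and the attribute numbers and
     values used with them are placeholders rather than real attribute
     assignments. The sketch shows a toy server dropping unsupported
     attributes from the result bitmap and returning values lowest
     attribute number first.

```python
def encode_bitmap4(attr_numbers):
    """Pack attribute numbers into a list of 32-bit words."""
    words = []
    for n in attr_numbers:
        word, bit = divmod(n, 32)
        while len(words) <= word:
            words.append(0)
        words[word] |= 1 << bit
    return words


def decode_bitmap4(words):
    """Return the attribute numbers set in the bitmap, in ascending order."""
    return [
        index * 32 + bit
        for index, word in enumerate(words)
        for bit in range(32)
        if (word >> bit) & 1
    ]


def getattr_reply(requested_words, supported_values):
    """Toy GETATTR: answer only supported attributes, lowest number first.

    supported_values maps attribute number -> value for one object.
    """
    returned = [n for n in decode_bitmap4(requested_words)
                if n in supported_values]
    return encode_bitmap4(returned), [supported_values[n] for n in returned]
```

     A client should therefore decode the result bitmap before reading
     values, since the set returned may be smaller than the set
     requested.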

   IMPLEMENTATION

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.8.  Operation 10: GETFH - Get Current Filehandle

   SYNOPSIS

     (cfh) -> filehandle

   ARGUMENT

     /* CURRENT_FH: */
     void;

   RESULT

     struct GETFH4resok {
             nfs_fh4         object;
     };

     union GETFH4res switch (nfsstat4 status) {
     case NFS4_OK:
             GETFH4resok     resok4;
     default:
             void;
     };

   DESCRIPTION

     This operation returns the current filehandle value.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     Operations that change the current filehandle like LOOKUP or CREATE
     do not automatically return the new filehandle as a result. For
     instance, if a client needs to look up a directory entry and obtain
     its filehandle then the following request is needed.

             PUTFH  (directory filehandle)
             LOOKUP (entry name)
             GETFH

   ERRORS

     NFS4ERR_BADHANDLE
     NFS4ERR_FHEXPIRED
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.9.
Operation 11: LINK - Create Link to a File

   SYNOPSIS

     (sfh), (cfh), newname -> (cfh), change_info

   ARGUMENT

     struct LINK4args {
             /* SAVED_FH: source object */
             /* CURRENT_FH: target directory */
             component4      newname;
     };

   RESULT

     struct LINK4resok {
             change_info4    cinfo;
     };

     union LINK4res switch (nfsstat4 status) {
     case NFS4_OK:
             LINK4resok      resok4;
     default:
             void;
     };

   DESCRIPTION

     The LINK operation creates an additional newname for the file
     represented by the saved filehandle, as set by the SAVEFH
     operation, in the directory represented by the current filehandle.
     The existing file and the target directory must reside within the
     same file system on the server. On success, the current filehandle
     will continue to be the target directory. If an object exists in
     the target directory with the same name as newname, the server must
     return NFS4ERR_EXIST.

     For the target directory, the server returns change_info4
     information in cinfo. With the atomic field of the change_info4
     struct, the server will indicate if the before and after change
     attributes were obtained atomically with respect to the link
     creation.

     If the newname has a length of 0 (zero), or if newname does not
     obey the UTF-8 definition, the error NFS4ERR_INVAL will be
     returned.

   IMPLEMENTATION

     Changes to any property of the "hard" linked files are reflected in
     all of the linked files. When a link is made to a file, the
     attributes for the file should have a value for numlinks that is
     one greater than the value before the LINK operation.

     The statement "file and the target directory must reside within the
     same file system on the server" means that the fsid fields in the
     attributes for the objects are the same. If they reside on
     different file systems, the error, NFS4ERR_XDEV, is returned.
     On some servers, the filenames, "." and "..", are illegal as
     newname.

     In the case that newname is already linked to the file represented
     by the saved filehandle, the server will return NFS4ERR_EXIST.

     Note that symbolic links are created with the CREATE operation.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DQUOT
     NFS4ERR_EXIST
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_ISDIR
     NFS4ERR_MLINK
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOSPC
     NFS4ERR_NOTDIR
     NFS4ERR_NOTSUPP
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC
     NFS4ERR_XDEV

14.2.10.  Operation 12: LOCK - Create Lock

   SYNOPSIS

     (cfh) locktype, reclaim, offset, length, locker -> stateid

   ARGUMENT

     struct open_to_lock_owner4 {
             seqid4          open_seqid;
             stateid4        open_stateid;
             seqid4          lock_seqid;
             lock_owner4     lock_owner;
     };

     struct exist_lock_owner4 {
             stateid4        lock_stateid;
             seqid4          lock_seqid;
     };

     union locker4 switch (bool new_lock_owner) {
     case TRUE:
             open_to_lock_owner4     open_owner;
     case FALSE:
             exist_lock_owner4       lock_owner;
     };

     enum nfs_lock_type4 {
             READ_LT         = 1,
             WRITE_LT        = 2,
             READW_LT        = 3,    /* blocking read */
             WRITEW_LT       = 4     /* blocking write */
     };

     struct LOCK4args {
             /* CURRENT_FH: file */
             nfs_lock_type4  locktype;
             bool            reclaim;
             offset4         offset;
             length4         length;
             locker4         locker;
     };

   RESULT

     struct LOCK4denied {
             offset4         offset;
             length4         length;
             nfs_lock_type4  locktype;
             lock_owner4     owner;
     };

     struct LOCK4resok {
             stateid4        lock_stateid;
     };

     union LOCK4res switch (nfsstat4 status) {
     case NFS4_OK:
             LOCK4resok      resok4;
     case NFS4ERR_DENIED:
             LOCK4denied     denied;
     default:
             void;
     };

   DESCRIPTION

     The LOCK operation requests a record lock for the byte range
     specified by the offset and length parameters. The lock type is
     also specified to be one of the nfs_lock_type4 values. If this is
     a reclaim request, the reclaim parameter will be TRUE.

     Bytes in a file may be locked even if those bytes are not currently
     allocated to the file. To lock the file from a specific offset
     through the end-of-file (no matter how long the file actually is)
     use a length field with all bits set to 1 (one). To lock the
     entire file, use an offset of 0 (zero) and a length with all bits
     set to 1. A length of 0 is reserved and should not be used.

     In the case that the lock is denied, the owner, offset, and length
     of a conflicting lock are returned.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     If the server is unable to determine the exact offset and length of
     the conflicting lock, the same offset and length that were provided
     in the arguments should be returned in the denied results. The
     File Locking section contains a full description of this and the
     other file locking operations.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_SEQID
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DENIED
     NFS4ERR_EXPIRED
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCK_RANGE
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NO_GRACE
     NFS4ERR_OLD_STATEID
     NFS4ERR_RECLAIM_BAD
     NFS4ERR_RECLAIM_CONFLICT
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_CLIENTID
     NFS4ERR_STALE_STATEID
     NFS4ERR_WRONGSEC

14.2.11.
Operation 13: LOCKT - Test For Lock

   SYNOPSIS

     (cfh) type, owner, offset, length -> {void, NFS4ERR_DENIED ->
     owner}

   ARGUMENT

     struct LOCKT4args {
             /* CURRENT_FH: file */
             nfs_lock_type4  locktype;
             nfs_lockowner4  owner;
             offset4         offset;
             length4         length;
     };

   RESULT

     struct LOCK4denied {
             nfs_lockowner4  owner;
             offset4         offset;
             length4         length;
             nfs_lock_type4  locktype;
     };

     union LOCKT4res switch (nfsstat4 status) {
     case NFS4ERR_DENIED:
             LOCK4denied     denied;
     case NFS4_OK:
             void;
     default:
             void;
     };

   DESCRIPTION

     The LOCKT operation tests the lock as specified in the arguments.
     If a conflicting lock exists, the owner, offset, length, and type
     of the conflicting lock are returned; if no lock is held, nothing
     other than NFS4_OK is returned.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     If the server is unable to determine the exact offset and length of
     the conflicting lock, the same offset and length that were provided
     in the arguments should be returned in the denied results. The
     File Locking section contains further discussion of the file
     locking mechanisms.

     LOCKT uses nfs_lockowner4 instead of a stateid4, as LOCK does, to
     identify the owner so that the client does not have to open the
     file to test for the existence of a lock.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DENIED
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCK_RANGE
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_CLIENTID
     NFS4ERR_WRONGSEC

14.2.12.
Operation 14: LOCKU - Unlock File

   SYNOPSIS

     (cfh) type, seqid, stateid, offset, length -> stateid

   ARGUMENT

     struct LOCKU4args {
             /* CURRENT_FH: file */
             nfs_lock_type4  locktype;
             seqid4          seqid;
             stateid4        stateid;
             offset4         offset;
             length4         length;
     };

   RESULT

     union LOCKU4res switch (nfsstat4 status) {
     case NFS4_OK:
             stateid4        stateid;
     default:
             void;
     };

   DESCRIPTION

     The LOCKU operation unlocks the record lock specified by the
     parameters.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     The File Locking section contains a full description of this and
     the other file locking procedures.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_SEQID
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_EXPIRED
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LOCK_RANGE
     NFS4ERR_LEASE_MOVED
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_OLD_STATEID
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_CLIENTID
     NFS4ERR_STALE_STATEID

14.2.13.  Operation 15: LOOKUP - Lookup Filename

   SYNOPSIS

     (cfh), component -> (cfh)

   ARGUMENT

     struct LOOKUP4args {
             /* CURRENT_FH: directory */
             component4      objname;
     };

   RESULT

     struct LOOKUP4res {
             /* CURRENT_FH: object */
             nfsstat4        status;
     };

   DESCRIPTION

     This operation looks up or finds a file system object using the
     directory specified by the current filehandle. LOOKUP evaluates
     the component and if the object exists the current filehandle is
     replaced with the component's filehandle.

     If the component cannot be evaluated either because it does not
     exist or because the client does not have permission to evaluate
     the component, then an error will be returned and the current
     filehandle will be unchanged.
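
     The effect of LOOKUP on the current filehandle can be sketched as
     follows. This is a toy model under stated assumptions: the
     lookup() helper and the dictionary standing in for the server's
     directory tree are hypothetical, filehandles are plain strings,
     and only the "does not exist" failure is modeled. The sketch shows
     that success replaces the current filehandle while failure leaves
     it unchanged.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_NOENT = "NFS4ERR_NOENT"


def lookup(tree, current_fh, component):
    """Return (status, current filehandle after the operation).

    tree maps a directory filehandle to a {name: child filehandle} dict.
    """
    child = tree.get(current_fh, {}).get(component)
    if child is None:
        # Evaluation failed: report the error, keep the current filehandle.
        return NFS4ERR_NOENT, current_fh
    # Evaluation succeeded: the component's filehandle becomes current.
    return NFS4_OK, child
```

     Because the filehandle is not returned by LOOKUP itself, a client
     that needs its value would still follow the LOOKUP with a GETFH.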

     If the component is a zero length string or if any component does
     not obey the UTF-8 definition, the error NFS4ERR_INVAL will be
     returned.

   IMPLEMENTATION

     If the client wants to achieve the effect of a multi-component
     lookup, it may construct a COMPOUND request such as (and obtain
     each filehandle):

             PUTFH  (directory filehandle)
             LOOKUP "pub"
             GETFH
             LOOKUP "foo"
             GETFH
             LOOKUP "bar"
             GETFH

     NFS version 4 servers depart from the semantics of previous NFS
     versions in allowing LOOKUP requests to cross mountpoints on the
     server. The client can detect a mountpoint crossing by comparing
     the fsid attribute of the directory with the fsid attribute of the
     directory looked up. If the fsids are different then the new
     directory is a server mountpoint. Unix clients that detect a
     mountpoint crossing will need to mount the server's filesystem.
     This needs to be done to maintain the file object identity checking
     mechanisms common to Unix clients.

     Servers that limit NFS access to "shares" or "exported" filesystems
     should provide a pseudo-filesystem into which the exported
     filesystems can be integrated, so that clients can browse the
     server's name space. The client's view of a pseudo-filesystem will
     be limited to paths that lead to exported filesystems.

     Note: previous versions of the protocol assigned special semantics
     to the names "." and "..". NFS version 4 assigns no special
     semantics to these names. The LOOKUPP operator must be used to
     look up a parent directory.

     Note that this procedure does not follow symbolic links. The
     client is responsible for all parsing of filenames including
     filenames that are modified by symbolic links encountered during
     the lookup process.

     If the current file handle supplied is not a directory but a
     symbolic link, the error NFS4ERR_SYMLINK is returned.
     For all other non-directory file types, the error NFS4ERR_NOTDIR is
     returned.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTDIR
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_SYMLINK
     NFS4ERR_WRONGSEC

14.2.14.  Operation 16: LOOKUPP - Lookup Parent Directory

   SYNOPSIS

     (cfh) -> (cfh)

   ARGUMENT

     /* CURRENT_FH: object */
     void;

   RESULT

     struct LOOKUPP4res {
             /* CURRENT_FH: directory */
             nfsstat4        status;
     };

   DESCRIPTION

     The current filehandle is assumed to refer to a regular directory
     or a named attribute directory. LOOKUPP assigns the filehandle for
     its parent directory to be the current filehandle. If there is no
     parent directory an NFS4ERR_NOENT error must be returned.
     Therefore, NFS4ERR_NOENT will be returned by the server when the
     current filehandle is at the root or top of the server's file tree.

   IMPLEMENTATION

     As for LOOKUP, LOOKUPP will also cross mountpoints.

     If the current filehandle is not a directory or named attribute
     directory, the error NFS4ERR_NOTDIR is returned.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTDIR
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.15.
Operation 17: NVERIFY - Verify Difference in Attributes

   SYNOPSIS

     (cfh), fattr -> -

   ARGUMENT

     struct NVERIFY4args {
             /* CURRENT_FH: object */
             fattr4          obj_attributes;
     };

   RESULT

     struct NVERIFY4res {
             nfsstat4        status;
     };

   DESCRIPTION

     This operation is used to prefix a sequence of operations to be
     performed if one or more attributes have changed on some filesystem
     object. If all the attributes match then the error NFS4ERR_SAME
     must be returned.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     This operation is useful as a cache validation operator. If the
     object to which the attributes belong has changed then the
     following operations may obtain new data associated with that
     object. For instance, to check if a file has been changed and
     obtain new data if it has:

             PUTFH  (public)
             LOOKUP "pub" "foo" "bar"
             NVERIFY attrbits attrs
             READ 0 32767

     In the case that a recommended attribute is specified in the
     NVERIFY operation and the server does not support that attribute
     for the file system object, the error NFS4ERR_NOTSUPP is returned
     to the client.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_ATTRNOTSUPP
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTSUPP
     NFS4ERR_RESOURCE
     NFS4ERR_SAME
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.16.
Operation 18: OPEN - Open a Regular File

   SYNOPSIS

     (cfh), claim, openhow, owner, seqid, access, deny -> (cfh),
     stateid, cinfo, rflags, open_confirm, attrset, delegation

   ARGUMENT

     struct OPEN4args {
             open_claim4     claim;
             openflag4       openhow;
             nfs_lockowner4  owner;
             seqid4          seqid;
             uint32_t        share_access;
             uint32_t        share_deny;
     };

     enum createmode4 {
             UNCHECKED4      = 0,
             GUARDED4        = 1,
             EXCLUSIVE4      = 2
     };

     union createhow4 switch (createmode4 mode) {
     case UNCHECKED4:
     case GUARDED4:
             fattr4          createattrs;
     case EXCLUSIVE4:
             verifier4       createverf;
     };

     enum opentype4 {
             OPEN4_NOCREATE  = 0,
             OPEN4_CREATE    = 1
     };

     union openflag4 switch (opentype4 opentype) {
     case OPEN4_CREATE:
             createhow4      how;
     default:
             void;
     };

     /* Next definitions used for OPEN delegation */
     enum limit_by4 {
             NFS_LIMIT_SIZE          = 1,
             NFS_LIMIT_BLOCKS        = 2
             /* others as needed */
     };

     struct nfs_modified_limit4 {
             uint32_t        num_blocks;
             uint32_t        bytes_per_block;
     };

     union nfs_space_limit4 switch (limit_by4 limitby) {
     /* limit specified as file size */
     case NFS_LIMIT_SIZE:
             uint64_t                filesize;
     /* limit specified by number of blocks */
     case NFS_LIMIT_BLOCKS:
             nfs_modified_limit4     mod_blocks;
     };

     enum open_delegation_type4 {
             OPEN_DELEGATE_NONE      = 0,
             OPEN_DELEGATE_READ      = 1,
             OPEN_DELEGATE_WRITE     = 2
     };

     enum open_claim_type4 {
             CLAIM_NULL              = 0,
             CLAIM_PREVIOUS          = 1,
             CLAIM_DELEGATE_CUR      = 2,
             CLAIM_DELEGATE_PREV     = 3
     };

     struct open_claim_delegate_cur4 {
             stateid4        delegate_stateid;
             component4      file;
     };

     union open_claim4 switch (open_claim_type4 claim) {
      /*
       * No special rights to file. Ordinary OPEN of the specified file.
6532 */ 6533 case CLAIM_NULL: 6534 /* CURRENT_FH: directory */ 6535 component4 file; 6537 /* 6538 * Right to the file established by an open previous to server 6539 * reboot. File identified by filehandle obtained at that time 6540 * rather than by name. 6541 */ 6542 case CLAIM_PREVIOUS: 6543 /* CURRENT_FH: file being reclaimed */ 6544 open_delegation_type4 delegate_type; 6546 /* 6547 * Right to file based on a delegation granted by the server. 6548 * File is specified by name. 6550 */ 6551 case CLAIM_DELEGATE_CUR: 6552 /* CURRENT_FH: directory */ 6553 open_claim_delegate_cur4 delegate_cur_info; 6555 /* Right to file based on a delegation granted to a previous boot 6556 * instance of the client. File is specified by name. 6557 */ 6558 case CLAIM_DELEGATE_PREV: 6559 /* CURRENT_FH: directory */ 6560 component4 file_delegate_prev; 6561 }; 6563 RESULT 6565 struct open_read_delegation4 { 6566 stateid4 stateid; /* Stateid for delegation*/ 6567 bool recall; /* Pre-recalled flag for 6568 delegations obtained 6569 by reclaim 6570 (CLAIM_PREVIOUS) */ 6571 nfsace4 permissions; /* Defines users who don't 6572 need an ACCESS call to 6573 open for read */ 6574 }; 6576 struct open_write_delegation4 { 6577 stateid4 stateid; /* Stateid for delegation*/ 6578 bool recall; /* Pre-recalled flag for 6579 delegations obtained 6580 by reclaim 6581 (CLAIM_PREVIOUS) */ 6582 nfs_space_limit4 space_limit; /* Defines condition that 6583 the client must check to 6584 determine whether the 6585 file needs to be flushed 6586 to the server on close. 6587 */ 6588 nfsace4 permissions; /* Defines users who don't 6589 need an ACCESS call as 6590 part of a delegated 6591 open. 
*/ 6592 }; 6594 union open_delegation4 6595 switch (open_delegation_type4 delegation_type) { 6596 case OPEN_DELEGATE_NONE: 6597 void; 6598 case OPEN_DELEGATE_READ: 6599 open_read_delegation4 read; 6601 case OPEN_DELEGATE_WRITE: 6602 open_write_delegation4 write; 6603 }; 6605 const OPEN4_RESULT_MLOCK = 0x00000001; 6606 const OPEN4_RESULT_CONFIRM= 0x00000002; 6607 const OPEN4_RESULT_LOCKTYPE_POSIX = 0x00000004; 6609 struct OPEN4resok { 6610 stateid4 stateid; /* Stateid for open */ 6611 change_info4 cinfo; /* Directory Change Info */ 6612 uint32_t rflags; /* Result flags */ 6613 bitmap4 attrset; /* attributes on create */ 6614 open_delegation4 delegation; /* Info on any open 6615 delegation */ 6616 }; 6618 union OPEN4res switch (nfsstat4 status) { 6619 case NFS4_OK: 6620 /* CURRENT_FH: opened file */ 6621 OPEN4resok resok4; 6622 default: 6623 void; 6624 }; 6626 WARNING TO CLIENT IMPLEMENTORS 6628 OPEN resembles LOOKUP in that it generates a filehandle for the 6629 client to use. Unlike LOOKUP though, OPEN creates server state on 6630 the filehandle. In normal circumstances, the client can only 6631 release this state with a CLOSE operation. CLOSE uses the current 6632 filehandle to determine which file to close. Therefore the client 6633 MUST follow every OPEN operation with a GETFH operation in the same 6634 COMPOUND procedure. This will supply the client with the 6635 filehandle such that CLOSE can be used appropriately. 6637 Simply waiting for the lease on the file to expire is insufficient 6638 because the server may maintain the state indefinitely as long as 6639 another client does not attempt to make a conflicting access to the 6640 same file. 6642 DESCRIPTION 6644 The OPEN operation creates and/or opens a regular file in a 6645 directory with the provided name. If the file does not exist at 6646 the server and creation is desired, specification of the method of 6647 creation is provided by the openhow parameter. 
The client has the choice of three creation methods: UNCHECKED, GUARDED, or EXCLUSIVE.

      UNCHECKED means that the file should be created if a file of that name does not exist and encountering an existing regular file of that name is not an error. For this type of create, createattrs specifies the initial set of attributes for the file. The set of attributes may include any writable attribute valid for regular files. When an UNCHECKED create encounters an existing file, the attributes specified by createattrs are not used, except that when a size of zero is specified, the existing file is truncated. If GUARDED is specified, the server checks for the presence of a duplicate object by name before performing the create. If a duplicate exists, an error of NFS4ERR_EXIST is returned as the status. If the object does not exist, the request is performed as described for UNCHECKED. For each of these cases (UNCHECKED and GUARDED) where the operation is successful, the server will return to the client an attribute mask signifying which attributes were successfully set for the object.

      EXCLUSIVE specifies that the server is to follow exclusive creation semantics, using the verifier to ensure exclusive creation of the target. The server should check for the presence of a duplicate object by name. If the object does not exist, the server creates the object and stores the verifier with the object. If the object does exist and the stored verifier matches the client provided verifier, the server uses the existing object as the newly created object. If the stored verifier does not match, then an error of NFS4ERR_EXIST is returned. No attributes may be provided in this case, since the server may use an attribute of the target object to store the verifier.
If the server uses an attribute to store the exclusive create verifier, it will signify which attribute by setting the appropriate bit in the attribute mask that is returned in the results.

      For the target directory, the server returns change_info4 information in cinfo. With the atomic field of the change_info4 struct, the server will indicate if the before and after change attributes were obtained atomically with respect to the link creation.

      Upon successful creation, the current filehandle is replaced by that of the new object.

      The OPEN procedure provides for DOS SHARE capability with the use of the access and deny fields of the OPEN arguments. The client specifies at OPEN the required access and deny modes. For clients that do not directly support SHAREs (e.g., Unix), the expected deny value is DENY_NONE. In the case that there is an existing SHARE reservation that conflicts with the OPEN request, the server returns the error NFS4ERR_SHARE_DENIED. For a complete SHARE request, the client must provide values for the owner and seqid fields for the OPEN argument. For additional discussion of SHARE semantics see the section on 'Share Reservations'.

      In the case that the client is recovering state from a server failure, the claim field of the OPEN argument is used to signify that the request is meant to reclaim state previously held.

      The "claim" field of the OPEN argument is used to specify the file to be opened and the state information which the client claims to possess. There are four basic claim types which cover the various situations for an OPEN. They are as follows:

      CLAIM_NULL
            For the client, this is a new OPEN request and there is no previous state associated with the file for the client.

      CLAIM_PREVIOUS
            The client is claiming basic OPEN state for a file that was held prior to a server reboot.
Generally used when a 6720 server is returning persistent file 6721 handles; the client may not have the 6722 file name to reclaim the OPEN. 6724 CLAIM_DELEGATE_CUR 6725 The client is claiming a delegation for 6726 OPEN as granted by the server. 6727 Generally this is done as part of 6728 recalling a delegation. 6730 CLAIM_DELEGATE_PREV 6731 The client is claiming a delegation 6732 granted to a previous client instance; 6733 used after the client reboots. 6735 For OPEN requests whose claim type is other than CLAIM_PREVIOUS 6736 (i.e. requests other than those devoted to reclaiming opens after a 6737 server reboot) that reach the server during its grace or lease 6738 expiration period, the server returns an error of NFS4ERR_GRACE. 6740 For any OPEN request, the server may return an open delegation, 6741 which allows further opens and closes to be handled locally on the 6742 client as described in the section Open Delegation. Note that 6743 delegation is up to the server to decide. The client should never 6744 assume that delegation will or will not be granted in a particular 6745 instance. It should always be prepared for either case. A partial 6746 exception is the reclaim (CLAIM_PREVIOUS) case, in which a 6747 delegation type is claimed. In this case, delegation will always 6748 be granted, although the server may specify an immediate recall in 6749 the delegation structure. 6751 The rflags returned by a successful OPEN allow the server to return 6752 information governing how the open file is to be handled. 6753 OPEN4_RESULT_MLOCK indicates to the caller that mandatory locking 6754 is in effect for this file and the client should act appropriately 6755 with regard to data cached on the client. OPEN4_RESULT_CONFIRM 6756 indicates that the client MUST execute an OPEN_CONFIRM operation 6757 before using the open file. OPEN4_RESULT_LOCKTYPE_POSIX indicates 6758 the server's file locking behavior is Posix like with respect to 6759 lock range coalescing. 
From this the client can choose to manage 6760 file locking state in a way to handle a mis-match of file locking 6761 management. 6763 If the component is of zero length or if the component does not 6764 obey the UTF-8 definition, the error NFS4ERR_INVAL will be 6765 returned. 6767 When an OPEN is done and the specified lockowner already has the 6768 resulting filehandle open, the result is to "OR" together the new 6769 share and deny status together with the existing status. In this 6770 case, only a single CLOSE need be done, even though multiple OPEN's 6771 were completed. 6773 If the underlying filesystem at the server is only accessible in a 6774 read-only mode and the OPEN request has specified ACCESS_WRITE or 6775 ACCESS_BOTH, the server will return NFS4ERR_ROFS to indicate a 6776 read-only filesystem. 6778 IMPLEMENTATION 6780 The OPEN procedure contains support for EXCLUSIVE create. The 6781 mechanism is similar to the support in NFS version 3 [RFC1813]. As 6782 in NFS version 3, this mechanism provides reliable exclusive 6783 creation. Exclusive create is invoked when the how parameter is 6784 EXCLUSIVE. In this case, the client provides a verifier that can 6785 reasonably be expected to be unique. A combination of a client 6786 identifier, perhaps the client network address, and a unique number 6787 generated by the client, perhaps the RPC transaction identifier, 6788 may be appropriate. 6790 If the object does not exist, the server creates the object and 6791 stores the verifier in stable storage. For file systems that do not 6792 provide a mechanism for the storage of arbitrary file attributes, 6793 the server may use one or more elements of the object meta-data to 6794 store the verifier. The verifier must be stored in stable storage 6795 to prevent erroneous failure on retransmission of the request. It 6796 is assumed that an exclusive create is being performed because 6797 exclusive semantics are critical to the application. 
Because of the 6798 expected usage, exclusive CREATE does not rely solely on the 6799 normally volatile duplicate request cache for storage of the 6800 verifier. The duplicate request cache in volatile storage does not 6801 survive a crash and may actually flush on a long network partition, 6802 opening failure windows. In the UNIX local file system 6803 environment, the expected storage location for the verifier on 6804 creation is the meta-data (time stamps) of the object. For this 6805 reason, an exclusive object create may not include initial 6806 attributes because the server would have nowhere to store the 6807 verifier. 6809 If the server can not support these exclusive create semantics, 6810 possibly because of the requirement to commit the verifier to 6811 stable storage, it should fail the OPEN request with the error, 6812 NFS4ERR_NOTSUPP. 6814 During an exclusive CREATE request, if the object already exists, 6815 the server reconstructs the object's verifier and compares it with 6816 the verifier in the request. If they match, the server treats the 6817 request as a success. The request is presumed to be a duplicate of 6818 an earlier, successful request for which the reply was lost and 6819 that the server duplicate request cache mechanism did not detect. 6820 If the verifiers do not match, the request is rejected with the 6821 status, NFS4ERR_EXIST. 6823 Once the client has performed a successful exclusive create, it 6824 must issue a SETATTR to set the correct object attributes. Until 6825 it does so, it should not rely upon any of the object attributes, 6826 since the server implementation may need to overload object meta- 6827 data to store the verifier. The subsequent SETATTR must not occur 6828 in the same COMPOUND request as the OPEN. This separation will 6829 guarantee that the exclusive create mechanism will continue to 6830 function properly in the face of retransmission of the request. 
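The verifier handling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a server implementation: the ToyServer class and its exclusive_create method are hypothetical names, status codes are modeled as strings rather than their wire values, and an in-memory dictionary stands in for the stable storage that a real server would need.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Status codes modeled as plain strings for illustration only.
NFS4_OK = "NFS4_OK"
NFS4ERR_EXIST = "NFS4ERR_EXIST"


@dataclass
class ToyServer:
    """Hypothetical stand-in for a server; the dict mimics stable storage."""
    # file name -> verifier recorded when the file was exclusively created
    verifiers: Dict[str, bytes] = field(default_factory=dict)

    def exclusive_create(self, name: str, createverf: bytes) -> Tuple[str, bool]:
        """Return (status, newly_created) for an EXCLUSIVE create of `name`."""
        stored = self.verifiers.get(name)
        if stored is None:
            # Object does not exist: create it and record the verifier.
            self.verifiers[name] = createverf
            return NFS4_OK, True
        if stored == createverf:
            # Verifiers match: presumed duplicate of an earlier successful
            # request whose reply was lost, so report success again.
            return NFS4_OK, False
        # Verifiers differ: the object was created by some other request.
        return NFS4ERR_EXIST, False
```

A retransmitted request carrying the same verifier therefore succeeds a second time, while a request carrying a different verifier is rejected. The sketch does not model objects that exist without any recorded verifier, nor the follow-up SETATTR.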
6832 Use of the GUARDED attribute does not provide exactly-once 6833 semantics. In particular, if a reply is lost and the server does 6834 not detect the retransmission of the request, the procedure can 6835 fail with NFS4ERR_EXIST, even though the create was performed 6836 successfully. The client would use this behavior in the case that 6837 the application has not requested an exclusive create but has asked 6838 to have the file truncated when the file is opened. In the case of 6839 the client timing out and retransmitting the create request, the 6840 client can use GUARDED to prevent against a sequence like: create, 6841 write, create (retransmitted) from occurring. 6843 For SHARE reservations, the client must specify a value for access 6844 that is one of READ, WRITE, or BOTH. For deny, the client must 6845 specify one of NONE, READ, WRITE, or BOTH. If the client fails to 6846 do this, the server must return NFS4ERR_INVAL. 6848 If the component provided to OPEN is a symbolic link, the error 6849 NFS4ERR_SYMLINK will be returned to the client. If the current 6850 filehandle is not a directory, the error NFS4ERR_NOTDIR will be 6851 returned. 6853 ERRORS 6855 NFS4ERR_ACCESS 6856 NFS4ERR_ATTRNOTSUPP 6857 NFS4ERR_BAD_SEQID 6858 NFS4ERR_BADXDR 6859 NFS4ERR_DELAY 6860 NFS4ERR_DQUOT 6861 NFS4ERR_EXIST 6862 NFS4ERR_EXPIRED 6863 NFS4ERR_FHEXPIRED 6864 NFS4ERR_GRACE 6865 NFS4ERR_IO 6866 NFS4ERR_ISDIR 6867 NFS4ERR_LEASE_MOVED 6868 NFS4ERR_MOVED 6869 NFS4ERR_NAMETOOLONG 6870 NFS4ERR_NOENT* 6871 NFS4ERR_NOFILEHANDLE 6872 NFS4ERR_NO_GRACE 6873 NFS4ERR_NOSPC 6874 NFS4ERR_NOTDIR 6875 NFS4ERR_NOTSUPP 6876 NFS4ERR_RECLAIM_BAD 6877 NFS4ERR_RECLAIM_CONFLICT 6878 NFS4ERR_RESOURCE 6879 NFS4ERR_ROFS 6880 NFS4ERR_SERVERFAULT 6881 NFS4ERR_SHARE_DENIED 6882 NFS4ERR_STALE_CLIENTID 6883 NFS4ERR_SYMLINK 6885 14.2.17. 
Operation 19: OPENATTR - Open Named Attribute Directory 6887 SYNOPSIS 6889 (cfh) createdir -> (cfh) 6891 ARGUMENT 6893 struct OPENATTR4args { 6894 /* CURRENT_FH: object */ 6895 bool createdir; 6896 }; 6898 RESULT 6900 struct OPENATTR4res { 6901 /* CURRENT_FH: named attr directory*/ 6902 nfsstat4 status; 6903 }; 6905 DESCRIPTION 6907 The OPENATTR operation is used to obtain the filehandle of the 6908 named attribute directory associated with the current filehandle. 6909 The result of the OPENATTR will be a filehandle to an object of 6910 type NF4ATTRDIR. From this filehandle, READDIR and LOOKUP 6911 procedures can be used to obtain filehandles for the various named 6912 attributes associated with the original file system object. 6913 Filehandles returned within the named attribute directory will have 6914 a type of NF4NAMEDATTR. 6916 The createdir argument allows the client to signify if a named 6917 attribute directory should be created as a result of the OPENATTR 6918 operation. Some clients may use the OPENATTR operation with a 6919 value of FALSE for createdir to determine if any named attributes 6920 exist for the object. If none exist, then NFS4ERR_NOENT will be 6921 returned. If createdir has a value of TRUE and no named attribute 6922 directory exists, one is created. The creation of a named 6923 attribute directory assumes that the server has implemented named 6924 attribute support in this fashion and is not required to do so by 6925 this definition. 6927 IMPLEMENTATION 6929 If the server does not support named attributes for the current 6930 filehandle, an error of NFS4ERR_NOTSUPP will be returned to the 6931 client. 
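The createdir behavior described for OPENATTR can be summarized in a short sketch. The function name, its parameters, and the string used to represent the resulting filehandle are all assumptions made for illustration; access checks, read-only file systems, and other error cases are not modeled.

```python
from typing import Optional, Set, Tuple

# Status codes modeled as plain strings for illustration only.
NFS4_OK = "NFS4_OK"
NFS4ERR_NOENT = "NFS4ERR_NOENT"
NFS4ERR_NOTSUPP = "NFS4ERR_NOTSUPP"


def openattr(supports_named_attrs: bool,
             attrdirs: Set[str],
             obj: str,
             createdir: bool) -> Tuple[str, Optional[str]]:
    """Return (status, filehandle of the named attribute directory).

    `attrdirs` is a hypothetical set of objects that already have a named
    attribute directory; the returned handle is just a labeled string.
    """
    if not supports_named_attrs:
        # Server has no named attribute support for this object.
        return NFS4ERR_NOTSUPP, None
    if obj not in attrdirs:
        if not createdir:
            # Nothing exists and the client did not ask for creation.
            return NFS4ERR_NOENT, None
        attrdirs.add(obj)  # createdir is TRUE: create the directory
    return NFS4_OK, "attrdir:" + obj
```

A client probing for named attributes would call this with createdir set to FALSE and treat NFS4ERR_NOENT as "no named attributes present".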
6933 ERRORS 6935 NFS4ERR_ACCESS 6936 NFS4ERR_BADHANDLE 6937 NFS4ERR_BADXDR 6938 NFS4ERR_DELAY 6939 NFS4ERR_ROFS 6940 NFS4ERR_FHEXPIRED 6941 NFS4ERR_INVAL 6942 NFS4ERR_IO 6943 NFS4ERR_MOVED 6944 NFS4ERR_NOENT 6945 NFS4ERR_NOFILEHANDLE 6946 NFS4ERR_NOSPC 6947 NFS4ERR_NOTSUPP 6948 NFS4ERR_RESOURCE 6949 NFS4ERR_SERVERFAULT 6950 NFS4ERR_STALE 6951 NFS4ERR_WRONGSEC 6953 14.2.18. Operation 20: OPEN_CONFIRM - Confirm Open 6955 SYNOPSIS 6957 (cfh), seqid, stateid-> stateid 6959 ARGUMENT 6961 struct OPEN_CONFIRM4args { 6962 /* CURRENT_FH: opened file */ 6963 seqid4 seqid; 6964 stateid4 stateid; 6965 }; 6967 RESULT 6969 struct OPEN_CONFIRM4resok { 6970 stateid4 stateid; 6971 }; 6973 union OPEN_CONFIRM4res switch (nfsstat4 status) { 6974 case NFS4_OK: 6975 OPEN_CONFIRM4resok resok4; 6976 default: 6977 void; 6978 }; 6980 DESCRIPTION 6982 This operation is used to confirm the sequence id usage for the 6983 first time that a nfs_lockowner is used by a client. The stateid 6984 returned from the OPEN operation is used as the argument for this 6985 operation along with the next sequence id for the nfs_lockowner. 6986 The sequence id passed to the OPEN_CONFIRM must be 1 (one) greater 6987 than the seqid passed to the OPEN operation from which the 6988 open_confirm value was obtained. If the server receives an 6989 unexpected sequence id with respect to the original open, then the 6990 server assumes that the client will not confirm the original OPEN 6991 and all state associated with the original OPEN is released by the 6992 server. 6994 On success, the current filehandle retains its value. 6996 IMPLEMENTATION 6998 A given client might generate many nfs_lockowner data structures 6999 for a given clientid. The client will periodically either dispose 7000 of its nfs_lockowners or stop using them for indefinite periods of 7001 time. 
The latter situation is why the NFS version 4 protocol does not have an explicit operation to exit an nfs_lockowner: such an operation is of no use in that situation. Instead, to avoid unbounded memory use, the server needs to implement a strategy for disposing of nfs_lockowners that have no current lock, open, or delegation state for any files and have not been used recently. The time period used to determine when to dispose of nfs_lockowners is an implementation choice. The time period should certainly be no less than the lease time plus any grace period the server wishes to implement beyond a lease time. The OPEN_CONFIRM operation allows the server to safely dispose of unused nfs_lockowner data structures.

      In the case that a client issues an OPEN operation and the server no longer has a record of the nfs_lockowner, the server needs to ensure that this is a new OPEN and not a replay or retransmission.

      A lazy server implementation might require confirmation for every nfs_lockowner for which it has no record. However, this is not necessary until the server records the fact that it has disposed of one nfs_lockowner for the given clientid.

      The server must hold unconfirmed OPEN state until one of three events occurs. First, the client sends an OPEN_CONFIRM request with the appropriate sequence id and stateid within the lease period. In this case, the OPEN state on the server goes to confirmed, and the nfs_lockowner on the server is fully established.

      Second, the client sends another OPEN request with a sequence id that is incorrect for the nfs_lockowner (out of sequence). In this case, the server assumes the second OPEN request is valid and the first one is a replay.
The server cancels the OPEN state of the 7033 first OPEN request, establishes an unconfirmed OPEN state for the 7034 second OPEN request, and responds to the second OPEN request with 7035 an indication that an OPEN_CONFIRM is needed. The process then 7036 repeats itself. While there is a potential for a denial of service 7037 attack on the client, it is mitigated if the client and server 7038 require the use of a security flavor based on Kerberos V5, LIPKEY, 7039 or some other flavor that uses cryptography. 7041 What if the server is in the unconfirmed OPEN state for a given 7042 nfs_lockowner, and it receives an operation on the nfs_lockowner 7043 that has a stateid but the operation is not OPEN, or it is 7044 OPEN_CONFIRM but with the wrong stateid? Then, even if the seqid 7045 is correct, the server returns NFS4ERR_BAD_STATEID, because the 7046 server assumes the operation is a replay: if the server has no 7047 established OPEN state, then there is no way, for example, a LOCK 7048 operation could be valid. 7050 Third, neither of the two aforementioned events occur for the 7051 nfs_lockowner within the lease period. In this case, the OPEN 7052 state is cancelled and disposal of the nfs_lockowner can occur. 7054 ERRORS 7056 NFS4ERR_BADHANDLE 7057 NFS4ERR_BAD_SEQID 7058 NFS4ERR_BADXDR 7059 NFS4ERR_EXPIRED 7060 NFS4ERR_FHEXPIRED 7061 NFS4ERR_GRACE 7062 NFS4ERR_INVAL 7063 NFS4ERR_ISDIR 7064 NFS4ERR_MOVED 7065 NFS4ERR_NOENT 7066 NFS4ERR_NOFILEHANDLE 7067 NFS4ERR_NOTSUPP 7068 NFS4ERR_RESOURCE 7069 NFS4ERR_SERVERFAULT 7070 NFS4ERR_STALE 7071 NFS4ERR_WRONGSEC 7073 14.2.19. 
Operation 21: OPEN_DOWNGRADE - Reduce Open File Access

   SYNOPSIS

     (cfh), stateid, seqid, access, deny -> stateid

   ARGUMENT

     struct OPEN_DOWNGRADE4args {
             /* CURRENT_FH: opened file */
             stateid4        stateid;
             seqid4          seqid;
             uint32_t        share_access;
             uint32_t        share_deny;
     };

   RESULT

     struct OPEN_DOWNGRADE4resok {
             stateid4        stateid;
     };

     union OPEN_DOWNGRADE4res switch(nfsstat4 status) {
      case NFS4_OK:
             OPEN_DOWNGRADE4resok    resok4;
      default:
             void;
     };

   DESCRIPTION

      This operation is used to adjust the access and deny bits for a given open. This is necessary when a given lockowner opens the same file multiple times with different access and deny flags. In this situation, a close of one of the opens may change the appropriate access and deny flags to remove bits associated with opens no longer in effect.

      The access and deny bits specified in this operation replace the current ones for the specified open file. If either the access or the deny mode specified includes bits not in effect for the open, the error NFS4ERR_INVAL should be returned. Since access and deny bits are subsets of those already granted, it is not possible for this request to be denied because of conflicting share reservations.

      On success, the current filehandle retains its value.

   ERRORS

      NFS4ERR_BADHANDLE
      NFS4ERR_BAD_SEQID
      NFS4ERR_BAD_STATEID
      NFS4ERR_BADXDR
      NFS4ERR_EXPIRED
      NFS4ERR_FHEXPIRED
      NFS4ERR_INVAL
      NFS4ERR_MOVED
      NFS4ERR_NOFILEHANDLE
      NFS4ERR_OLD_STATEID
      NFS4ERR_RESOURCE
      NFS4ERR_SERVERFAULT
      NFS4ERR_STALE
      NFS4ERR_STALE_STATEID

14.2.20.
Operation 22: PUTFH - Set Current Filehandle 7137 SYNOPSIS 7139 filehandle -> (cfh) 7141 ARGUMENT 7143 struct PUTFH4args { 7144 nfs_fh4 object; 7145 }; 7147 RESULT 7149 struct PUTFH4res { 7150 /* CURRENT_FH: */ 7151 nfsstat4 status; 7152 }; 7154 DESCRIPTION 7156 Replaces the current filehandle with the filehandle provided as an 7157 argument. 7159 IMPLEMENTATION 7161 Commonly used as the first operator in an NFS request to set the 7162 context for following operations. 7164 ERRORS 7166 NFS4ERR_BADHANDLE 7167 NFS4ERR_BADXDR 7168 NFS4ERR_FHEXPIRED 7169 NFS4ERR_MOVED 7170 NFS4ERR_RESOURCE 7171 NFS4ERR_SERVERFAULT 7172 NFS4ERR_STALE 7173 NFS4ERR_WRONGSEC 7175 14.2.21. Operation 23: PUTPUBFH - Set Public Filehandle 7177 SYNOPSIS 7179 - -> (cfh) 7181 ARGUMENT 7183 void; 7185 RESULT 7187 struct PUTPUBFH4res { 7188 /* CURRENT_FH: public fh */ 7189 nfsstat4 status; 7190 }; 7192 DESCRIPTION 7194 Replaces the current filehandle with the filehandle that represents 7195 the public filehandle of the server's name space. This filehandle 7196 may be different from the "root" filehandle which may be associated 7197 with some other directory on the server. 7199 The public filehandle represents the concepts embodied in RFC 2054, 7200 RFC 2055, RFC 2224. The intent for NFS version 4 is that the 7201 public filehandle (represented by the PUTPUBFH operation) be used 7202 as a method of providing WebNFS server compatibility with NFS 7203 versions 2 and 3. 7205 The public filehandle and the root filehandle (represented by the 7206 PUTROOTFH operation) should be equivalent. If the public and root 7207 filehandles are not equivalent, then the public filehandle MUST be 7208 a descendant of the root filehandle. 7210 IMPLEMENTATION 7212 Used as the first operator in an NFS request to set the context for 7213 following operations. 

      With the NFS version 2 and 3 public filehandle, the client is able to specify whether the path name provided in the LOOKUP should be evaluated as an absolute path relative to the server's root or as a path relative to the public filehandle. RFC 2224 contains further discussion of the functionality. With NFS version 4, that type of specification is not directly available in the LOOKUP operation. The reason is that the component separators needed to specify absolute vs. relative are not allowed in NFS version 4. Therefore, the client is responsible for constructing its request such that either PUTROOTFH or PUTPUBFH is used to signify absolute or relative evaluation of an NFS URL, respectively.

      Note that there are warnings mentioned in RFC 2224 with respect to the use of absolute evaluation and the restrictions the server may place on that evaluation with respect to how much of its namespace has been made available. These same warnings apply to NFS version 4. It is likely, therefore, that because of server implementation details, an NFS version 3 absolute public filehandle lookup may behave differently than an NFS version 4 absolute resolution.

      There is a form of security negotiation, described in RFC 2755, that uses the public filehandle as a method of employing SNEGO. This method is not available with NFS version 4 as filehandles are not overloaded with special meaning and therefore do not provide the same framework as NFS versions 2 and 3. Clients should therefore use the security negotiation mechanisms described in this RFC.

   ERRORS

      NFS4ERR_RESOURCE
      NFS4ERR_SERVERFAULT
      NFS4ERR_WRONGSEC

14.2.22.
Operation 24: PUTROOTFH - Set Root Filehandle 7251 SYNOPSIS 7253 - -> (cfh) 7255 ARGUMENT 7257 void; 7259 RESULT 7261 struct PUTROOTFH4res { 7262 /* CURRENT_FH: root fh */ 7263 nfsstat4 status; 7264 }; 7266 DESCRIPTION 7268 Replaces the current filehandle with the filehandle that represents 7269 the root of the server's name space. From this filehandle a LOOKUP 7270 operation can locate any other filehandle on the server. This 7271 filehandle may be different from the "public" filehandle which may 7272 be associated with some other directory on the server. 7274 IMPLEMENTATION 7276 Commonly used as the first operator in an NFS request to set the 7277 context for following operations. 7279 ERRORS 7281 NFS4ERR_RESOURCE 7282 NFS4ERR_SERVERFAULT 7283 NFS4ERR_WRONGSEC 7285 14.2.23. Operation 25: READ - Read from File 7287 SYNOPSIS 7289 (cfh), stateid, offset, count -> eof, data 7291 ARGUMENT 7293 struct READ4args { 7294 /* CURRENT_FH: file */ 7295 stateid4 stateid; 7296 offset4 offset; 7297 count4 count; 7298 }; 7300 RESULT 7302 struct READ4resok { 7303 bool eof; 7304 opaque data<>; 7305 }; 7307 union READ4res switch (nfsstat4 status) { 7308 case NFS4_OK: 7309 READ4resok resok4; 7310 default: 7311 void; 7312 }; 7314 DESCRIPTION 7316 The READ operation reads data from the regular file identified by 7317 the current filehandle. 7319 The client provides an offset of where the READ is to start and a 7320 count of how many bytes are to be read. An offset of 0 (zero) 7321 means to read data starting at the beginning of the file. If 7322 offset is greater than or equal to the size of the file, the 7323 status, NFS4_OK, is returned with a data length set to 0 (zero) and 7324 eof is set to TRUE. The READ is subject to access permissions 7325 checking. 7327 If the client specifies a count value of 0 (zero), the READ 7328 succeeds and returns 0 (zero) bytes of data again subject to access 7329 permissions checking. 
The server may choose to return fewer bytes than specified by the client. The client needs to check for this condition and handle the condition appropriately.

      The stateid value for a READ request represents a value returned from a previous record lock or share reservation request. The server uses it to verify that the associated lock is still valid and to update lease timeouts for the client.

      If the read ended at the end-of-file (formally, in a correctly formed READ request, if offset + count is equal to the size of the file), or the read request extends beyond the size of the file (if offset + count is greater than the size of the file), eof is returned as TRUE; otherwise it is FALSE. A successful READ of an empty file will always return eof as TRUE.

      If the current filehandle is not a regular file, an error will be returned to the client. In the case the current filehandle represents a directory, NFS4ERR_ISDIR is returned; otherwise, NFS4ERR_INVAL is returned.

      On success, the current filehandle retains its value.

   IMPLEMENTATION

      It is possible for the server to return fewer than count bytes of data. If the server returns fewer bytes than requested and eof is set to FALSE, the client should issue another READ to get the remaining data. A server may return less data than requested under several circumstances. The file may have been truncated by another client or perhaps on the server itself, changing the file size from what the requesting client believes to be the case. This would reduce the actual amount of data available to the client. It is possible that the server may back off the transfer size and reduce the read request return. Server resource exhaustion may also occur, necessitating a smaller read return.

      If the file is locked the server will return an NFS4ERR_LOCKED error.
Since the lock may be of short duration, the client may 7368 choose to retransmit the READ request (with exponential backoff) 7369 until the operation succeeds. 7371 ERRORS 7373 NFS4ERR_ACCESS 7374 NFS4ERR_BADHANDLE 7375 NFS4ERR_BAD_STATEID 7376 NFS4ERR_BADXDR 7377 NFS4ERR_DELAY 7378 NFS4ERR_EXPIRED 7379 NFS4ERR_FHEXPIRED 7380 NFS4ERR_GRACE 7381 NFS4ERR_INVAL 7382 NFS4ERR_IO 7383 NFS4ERR_ISDIR 7384 NFS4ERR_LEASE_MOVED 7385 NFS4ERR_LOCKED 7386 NFS4ERR_MOVED 7387 NFS4ERR_NOFILEHANDLE 7388 NFS4ERR_NXIO 7389 NFS4ERR_OLD_STATEID 7390 NFS4ERR_OPENMODE 7391 NFS4ERR_RESOURCE 7392 NFS4ERR_SERVERFAULT 7393 NFS4ERR_STALE 7394 NFS4ERR_STALE_STATEID 7395 NFS4ERR_WRONGSEC 7397 14.2.24. Operation 26: READDIR - Read Directory 7399 SYNOPSIS 7400 (cfh), cookie, cookieverf, dircount, maxcount, attr_request -> 7401 cookieverf { cookie, name, attrs } 7403 ARGUMENT 7405 struct READDIR4args { 7406 /* CURRENT_FH: directory */ 7407 nfs_cookie4 cookie; 7408 verifier4 cookieverf; 7409 count4 dircount; 7410 count4 maxcount; 7411 bitmap4 attr_request; 7412 }; 7414 RESULT 7416 struct entry4 { 7417 nfs_cookie4 cookie; 7418 component4 name; 7419 fattr4 attrs; 7420 entry4 *nextentry; 7421 }; 7423 struct dirlist4 { 7424 entry4 *entries; 7425 bool eof; 7426 }; 7428 struct READDIR4resok { 7429 verifier4 cookieverf; 7430 dirlist4 reply; 7431 }; 7433 union READDIR4res switch (nfsstat4 status) { 7434 case NFS4_OK: 7435 READDIR4resok resok4; 7436 default: 7437 void; 7438 }; 7440 DESCRIPTION 7442 The READDIR operation retrieves a variable number of entries from a 7443 file system directory and returns client requested attributes for 7444 each entry along with information to allow the client to request 7445 additional directory entries in a subsequent READDIR. 7447 The arguments contain a cookie value that represents where the 7448 READDIR should start within the directory. A value of 0 (zero) for 7449 the cookie is used to start reading at the beginning of the 7450 directory. 
     For subsequent READDIR requests, the client specifies a
     cookie value that is provided by the server on a previous READDIR
     request.

     The cookieverf value should be set to 0 (zero) when the cookie
     value is 0 (zero) (first directory read). On subsequent requests,
     it should be a cookieverf as returned by the server. The
     cookieverf must match that returned by the READDIR in which the
     cookie was acquired.

     The dircount portion of the argument is a hint of the maximum
     number of bytes of directory information that should be returned.
     This value represents the length of the names of the directory
     entries and the cookie value for these entries. This length
     represents the XDR encoding of the data (names and cookies) and not
     the length in the native format of the server. The server may
     return less data.

     The maxcount value of the argument is the maximum number of bytes
     for the result. This maximum size represents all of the data being
     returned and includes the XDR overhead. The server may return less
     data. If the server is unable to return a single directory entry
     within the maxcount limit, the error NFS4ERR_READDIR_NOSPC will be
     returned to the client.

     Finally, attr_request represents the list of attributes to be
     returned for each directory entry supplied by the server.

     On successful return, the server's response will provide a list of
     directory entries. Each of these entries contains the name of the
     directory entry, a cookie value for that entry, and the associated
     attributes as requested.

     The cookie value is only meaningful to the server and is used as a
     "bookmark" for the directory entry. As mentioned, this cookie is
     used by the client for subsequent READDIR operations so that it may
     continue reading a directory. The cookie is similar in concept to
     a READ offset but should not be interpreted as such by the client.
     Ideally, the cookie value should not change if the directory is
     modified since the client may be caching these values.

     In some cases, the server may encounter an error while obtaining
     the attributes for a directory entry. Instead of returning an
     error for the entire READDIR operation, the server can return the
     attribute 'fattr4_rdattr_error'. With this, the server is able to
     communicate the failure to the client and not fail the entire
     operation in the instance of what might be a transient failure.
     The client must request the fattr4_rdattr_error attribute for this
     method to work properly. If the client does not request the
     attribute, the server has no choice but to return failure for the
     entire READDIR operation.

     For some file system environments, the directory entries "." and
     ".." have special meaning and in other environments, they may not.
     If the server supports these special entries within a directory,
     they should not be returned to the client as part of the READDIR
     response. To enable some client environments, the cookie values of
     0, 1, and 2 are to be considered reserved. Note that the Unix
     client will use these values when combining the server's response
     and local representations to enable a fully formed Unix directory
     presentation to the application.

     For READDIR arguments, cookie values of 1 and 2 should not be used
     and for READDIR results cookie values of 0, 1, and 2 should not be
     returned.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     The server's file system directory representations can differ
     greatly. A client's programming interfaces may also be bound to
     the local operating environment in a way that does not translate
     well into the NFS protocol.
     Therefore, the dircount and maxcount fields are provided to allow
     the client to provide guidelines to the server. If the client is
     aggressive about attribute collection during a READDIR, the server
     has an idea of how to limit the encoded response. The dircount
     field provides a hint on the number of entries based solely on the
     names of the directory entries. Since it is a hint, it may be
     possible that a dircount value is zero. In this case, the server
     is free to ignore the dircount value and return directory
     information based on the specified maxcount value.

     The cookieverf may be used by the server to help manage cookie
     values that may become stale. It should be a rare occurrence that
     a server is unable to continue properly reading a directory with
     the provided cookie/cookieverf pair. The server should make every
     effort to avoid this condition since the application at the client
     may not be able to properly handle this type of failure.

     The use of the cookieverf will also protect the client from using
     READDIR cookie values that may be stale. For example, if the file
     system has been migrated, the server may or may not be able to use
     the same cookie values to service READDIR as the previous server
     used. With the client providing the cookieverf, the server is able
     to provide the appropriate response to the client. This prevents
     the case where the server may accept a cookie value but the
     underlying directory has changed and the response is invalid from
     the client's context of its previous READDIR.

     Since some servers will not be returning "." and ".." entries as
     has been done with previous versions of the NFS protocol, the
     client that requires these entries be present in READDIR responses
     must fabricate them.
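
     The cookie/cookieverf handshake and the client-side fabrication of
     "." and ".." can be illustrated with a minimal Python sketch. It
     is not part of the protocol definition. The names list_directory
     and make_fake_server, the 8-byte verifier, the page size, and the
     rule that cookies are assigned sequentially from 3 are all
     assumptions made for illustration; a real server chooses its own
     cookie values and the client must treat them as opaque.

```python
from typing import Callable, List, Tuple

Entry = Tuple[int, str]                  # (cookie, name)
Reply = Tuple[bytes, List[Entry], bool]  # (cookieverf, entries, eof)
ReaddirFn = Callable[[int, bytes], Reply]


def list_directory(readdir: ReaddirFn) -> List[str]:
    """Read a whole directory, fabricating "." and ".." on the client."""
    names = [".", ".."]                  # not returned by the server
    cookie, verf = 0, bytes(8)           # zero cookie and verifier: first read
    while True:
        verf, entries, eof = readdir(cookie, verf)
        for entry_cookie, name in entries:
            names.append(name)
            cookie = entry_cookie        # resume after the last entry seen
        if eof:
            return names
        if not entries:
            raise RuntimeError("no entries returned and eof not set")


def make_fake_server(all_names: List[str], page_size: int = 2) -> ReaddirFn:
    """Hypothetical in-memory server; cookies start at 3 (0, 1, 2 reserved)."""
    verf = b"\x00" * 7 + b"\x01"
    table = [(index + 3, name) for index, name in enumerate(all_names)]

    def readdir(cookie: int, cookieverf: bytes) -> Reply:
        # A non-zero cookie is only honored with the verifier it came with.
        if cookie != 0 and cookieverf != verf:
            raise ValueError("stale cookie/cookieverf pair")
        start = 0 if cookie == 0 else cookie - 2
        page = table[start:start + page_size]
        return verf, page, start + page_size >= len(table)

    return readdir


listing = list_directory(make_fake_server(["a", "b", "c"]))
```

     The loop always passes back the verifier from the most recent
     reply together with the cookie of the last entry it consumed,
     which is the pairing the DESCRIPTION requires.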

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_COOKIE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTDIR
     NFS4ERR_NOTSUPP
     NFS4ERR_READDIR_NOSPC
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_TOOSMALL
     NFS4ERR_WRONGSEC

14.2.25. Operation 27: READLINK - Read Symbolic Link

   SYNOPSIS

     (cfh) -> linktext

   ARGUMENT

     /* CURRENT_FH: symlink */
     void;

   RESULT

     struct READLINK4resok {
             linktext4       link;
     };

     union READLINK4res switch (nfsstat4 status) {
      case NFS4_OK:
              READLINK4resok resok4;
      default:
              void;
     };

   DESCRIPTION

     READLINK reads the data associated with a symbolic link. The data
     is a UTF-8 string that is opaque to the server. That is, whether
     created by an NFS client or created locally on the server, the data
     in a symbolic link is not interpreted when created, but is simply
     stored.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     A symbolic link is nominally a pointer to another file. The data
     is not necessarily interpreted by the server, just stored in the
     file. It is possible for a client implementation to store a path
     name that is not meaningful to the server operating system in a
     symbolic link. A READLINK operation returns the data to the client
     for interpretation. If different implementations want to share
     access to symbolic links, then they must agree on the
     interpretation of the data in the symbolic link.

     The READLINK operation is only allowed on objects of type NF4LNK.
     The server should return the error NFS4ERR_INVAL if the object is
     not of type NF4LNK.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTSUPP
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.26. Operation 28: REMOVE - Remove Filesystem Object

   SYNOPSIS

     (cfh), filename -> change_info

   ARGUMENT

     struct REMOVE4args {
             /* CURRENT_FH: directory */
             component4       target;
     };

   RESULT

     struct REMOVE4resok {
             change_info4    cinfo;
     };

     union REMOVE4res switch (nfsstat4 status) {
      case NFS4_OK:
              REMOVE4resok   resok4;
      default:
              void;
     };

   DESCRIPTION

     The REMOVE operation removes (deletes) a directory entry named by
     filename from the directory corresponding to the current
     filehandle. If the entry in the directory was the last reference
     to the corresponding file system object, the object may be
     destroyed.

     For the directory where the filename was removed, the server
     returns change_info4 information in cinfo. With the atomic field
     of the change_info4 struct, the server will indicate if the before
     and after change attributes were obtained atomically with respect
     to the removal.

     If the target has a length of 0 (zero), or if target does not obey
     the UTF-8 definition, the error NFS4ERR_INVAL will be returned.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     NFS versions 2 and 3 required a different operator, RMDIR, for
     directory removal. NFS version 4 REMOVE can be used to delete any
     directory entry independent of its file type.

     The concept of last reference is server specific. However, if the
     numlinks field in the previous attributes of the object had the
     value 1, the client should not rely on referring to the object via
     a file handle.
     Likewise, the client should not rely on the
     resources (disk space, directory entry, and so on) formerly
     associated with the object becoming immediately available. Thus, if
     a client needs to be able to continue to access a file after using
     REMOVE to remove it, the client should take steps to make sure that
     the file will still be accessible. The usual mechanism used is to
     RENAME the file from its old name to a new hidden name.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTDIR
     NFS4ERR_NOTEMPTY
     NFS4ERR_NOTSUPP
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.27. Operation 29: RENAME - Rename Directory Entry

   SYNOPSIS

     (sfh), oldname, (cfh), newname -> source_change_info,
     target_change_info

   ARGUMENT

     struct RENAME4args {
             /* SAVED_FH: source directory */
             component4      oldname;
             /* CURRENT_FH: target directory */
             component4      newname;
     };

   RESULT

     struct RENAME4resok {
             change_info4    source_cinfo;
             change_info4    target_cinfo;
     };

     union RENAME4res switch (nfsstat4 status) {
      case NFS4_OK:
              RENAME4resok   resok4;
      default:
              void;
     };

   DESCRIPTION

     The RENAME operation renames the object identified by oldname in
     the source directory corresponding to the saved filehandle, as set
     by the SAVEFH operation, to newname in the target directory
     corresponding to the current filehandle. The operation is required
     to be atomic to the client. Source and target directories must
     reside on the same file system on the server. On success, the
     current filehandle will continue to be the target directory.

     If the target directory already contains an entry with the name,
     newname, the source object must be compatible with the target:
     either both are non-directories or both are directories and the
     target must be empty. If compatible, the existing target is
     removed before the rename occurs. If they are not compatible or if
     the target is a directory but not empty, the server will return the
     error, NFS4ERR_EXIST.

     If oldname and newname both refer to the same file (they might be
     hard links of each other), then RENAME should perform no action and
     return success.

     For both directories involved in the RENAME, the server returns
     change_info4 information. With the atomic field of the
     change_info4 struct, the server will indicate if the before and
     after change attributes were obtained atomically with respect to
     the rename.

     If the oldname refers to a named attribute and the saved and
     current filehandles refer to different filesystem objects, the
     server will return NFS4ERR_XDEV just as if the saved and current
     filehandles represented directories on different filesystems.

     If the oldname or newname has a length of 0 (zero), or if oldname
     or newname does not obey the UTF-8 definition, the error
     NFS4ERR_INVAL will be returned.

   IMPLEMENTATION

     The RENAME operation must be atomic to the client. The statement
     "source and target directories must reside on the same file system
     on the server" means that the fsid fields in the attributes for the
     directories are the same. If they reside on different file systems,
     the error, NFS4ERR_XDEV, is returned.

     A filehandle may or may not become stale or expire on a rename.
     However, server implementors are strongly encouraged to attempt to
     keep file handles from becoming stale or expiring in this fashion.

     On some servers, the filenames, "." and "..", are illegal as either
     oldname or newname.
     In addition, neither oldname nor newname can
     be an alias for the source directory. These servers will return
     the error, NFS4ERR_INVAL, in these cases.

     If either the source or the target filehandle is not a directory,
     the server will return NFS4ERR_NOTDIR.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DQUOT
     NFS4ERR_EXIST
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOSPC
     NFS4ERR_NOTDIR
     NFS4ERR_NOTEMPTY
     NFS4ERR_NOTSUPP
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC
     NFS4ERR_XDEV

14.2.28. Operation 30: RENEW - Renew a Lease

   SYNOPSIS

     clientid -> ()

   ARGUMENT

     struct RENEW4args {
             clientid4       clientid;
     };

   RESULT

     struct RENEW4res {
             nfsstat4        status;
     };

   DESCRIPTION

     The RENEW operation is used by the client to renew leases which it
     currently holds at a server. In processing the RENEW request, the
     server renews all leases associated with the client. The
     associated leases are determined by the clientid provided via the
     SETCLIENTID procedure.

   IMPLEMENTATION

   ERRORS

     NFS4ERR_BADXDR
     NFS4ERR_EXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_LEASE_MOVED
     NFS4ERR_MOVED
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE_CLIENTID
     NFS4ERR_WRONGSEC

14.2.29. Operation 31: RESTOREFH - Restore Saved Filehandle

   SYNOPSIS

     (sfh) -> (cfh)

   ARGUMENT

     /* SAVED_FH: */
     void;

   RESULT

     struct RESTOREFH4res {
             /* CURRENT_FH: value of saved fh */
             nfsstat4        status;
     };

   DESCRIPTION

     Set the current filehandle to the value in the saved filehandle.
     If there is no saved filehandle then return an error
     NFS4ERR_NOFILEHANDLE.
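
     The relationship between the current and saved filehandles can be
     modeled with a small Python sketch. It is illustrative only: the
     CompoundState class and its method names are hypothetical, status
     codes are represented as strings rather than nfsstat4 values, and
     returning NFS4ERR_NOFILEHANDLE from savefh when no current
     filehandle is set is an assumption based on the SAVEFH error list.

```python
from dataclasses import dataclass
from typing import Optional

NFS4_OK = "NFS4_OK"
NFS4ERR_NOFILEHANDLE = "NFS4ERR_NOFILEHANDLE"


@dataclass
class CompoundState:
    """Hypothetical per-COMPOUND filehandle state kept by a server."""
    current_fh: Optional[bytes] = None
    saved_fh: Optional[bytes] = None

    def putfh(self, fh: bytes) -> str:
        self.current_fh = fh
        return NFS4_OK

    def savefh(self) -> str:
        if self.current_fh is None:
            return NFS4ERR_NOFILEHANDLE
        self.saved_fh = self.current_fh   # any earlier saved value is lost
        return NFS4_OK

    def restorefh(self) -> str:
        if self.saved_fh is None:
            return NFS4ERR_NOFILEHANDLE
        self.current_fh = self.saved_fh   # saved value itself is unchanged
        return NFS4_OK


state = CompoundState()
first_status = state.restorefh()          # nothing saved yet
state.putfh(b"directory")
state.savefh()
state.putfh(b"new-file")                  # stands in for CREATE or LOOKUP
restore_status = state.restorefh()
```

     After the final call the current filehandle again names the
     directory, so a following GETATTR would report directory
     attributes.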

   IMPLEMENTATION

     Operations like OPEN and LOOKUP use the current filehandle to
     represent a directory and replace it with a new filehandle.
     Assuming the previous filehandle was saved with a SAVEFH operator,
     the previous filehandle can be restored as the current filehandle.
     This is commonly used to obtain post-operation attributes for the
     directory, e.g.

       PUTFH (directory filehandle)
       SAVEFH
       GETATTR attrbits     (pre-op dir attrs)
       CREATE optbits "foo" attrs
       GETATTR attrbits     (file attributes)
       RESTOREFH
       GETATTR attrbits     (post-op dir attrs)

   ERRORS

     NFS4ERR_BADHANDLE
     NFS4ERR_FHEXPIRED
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.30. Operation 32: SAVEFH - Save Current Filehandle

   SYNOPSIS

     (cfh) -> (sfh)

   ARGUMENT

     /* CURRENT_FH: */
     void;

   RESULT

     struct SAVEFH4res {
             /* SAVED_FH: value of current fh */
             nfsstat4        status;
     };

   DESCRIPTION

     Save the current filehandle. If a previous filehandle was saved
     then it is no longer accessible. The saved filehandle can be
     restored as the current filehandle with the RESTOREFH operator.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

   ERRORS

     NFS4ERR_BADHANDLE
     NFS4ERR_FHEXPIRED
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.31. Operation 33: SECINFO - Obtain Available Security

   SYNOPSIS

     (cfh), name -> { secinfo }

   ARGUMENT

     struct SECINFO4args {
             /* CURRENT_FH: directory */
             component4     name;
     };

   RESULT

     enum rpc_gss_svc_t {
             RPC_GSS_SVC_NONE        = 1,
             RPC_GSS_SVC_INTEGRITY   = 2,
             RPC_GSS_SVC_PRIVACY     = 3
     };

     struct rpcsec_gss_info {
             sec_oid4        oid;
             qop4            qop;
             rpc_gss_svc_t   service;
     };

     union secinfo4 switch (uint32_t flavor) {
      case RPCSEC_GSS:
              rpcsec_gss_info        flavor_info;
      default:
              void;
     };

     typedef secinfo4 SECINFO4resok<>;

     union SECINFO4res switch (nfsstat4 status) {
      case NFS4_OK:
              SECINFO4resok resok4;
      default:
              void;
     };

   DESCRIPTION

     The SECINFO operation is used by the client to obtain a list of
     valid RPC authentication flavors for a specific directory
     filehandle, file name pair. The result will contain an array which
     represents the security mechanisms available. The array entries
     are represented by the secinfo4 structure. The field 'flavor' will
     contain a value of AUTH_NONE, AUTH_SYS (as defined in [RFC1831]),
     or RPCSEC_GSS (as defined in [RFC2203]).

     For the flavors AUTH_NONE and AUTH_SYS, no additional security
     information is returned. For a return value of RPCSEC_GSS, a
     security triple is returned that contains the mechanism object id
     (as defined in [RFC2078]), the quality of protection (as defined in
     [RFC2078]) and the service type (as defined in [RFC2203]). It is
     possible for SECINFO to return multiple entries with flavor equal
     to RPCSEC_GSS with different security triple values.

     On success, the current filehandle retains its value.

     If the name has a length of 0 (zero), or if name does not obey the
     UTF-8 definition, the error NFS4ERR_INVAL will be returned.
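
     One way a client might pick an entry from the returned array is
     sketched below in Python. The ranking is a hypothetical client
     policy, not something the protocol mandates: it prefers RPCSEC_GSS
     with privacy, then integrity, then no service, then AUTH_SYS, then
     AUTH_NONE, and ignores flavors it does not recognize. The SecInfo
     class and the strength and choose_flavor functions are names
     introduced for illustration, and the mechanism OID bytes in the
     usage lines are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional

# RPC authentication flavor numbers.
AUTH_NONE, AUTH_SYS, RPCSEC_GSS = 0, 1, 6
# rpc_gss_svc_t values, as in the RESULT definition.
RPC_GSS_SVC_NONE, RPC_GSS_SVC_INTEGRITY, RPC_GSS_SVC_PRIVACY = 1, 2, 3


@dataclass(frozen=True)
class SecInfo:
    """One secinfo4 entry; the triple is only present for RPCSEC_GSS."""
    flavor: int
    oid: Optional[bytes] = None
    qop: Optional[int] = None
    service: Optional[int] = None


def strength(entry: SecInfo) -> int:
    """Hypothetical ranking: higher is preferred, -1 means unusable."""
    if entry.flavor == RPCSEC_GSS:
        return 2 + (entry.service or 0)   # 3 (none) .. 5 (privacy)
    if entry.flavor == AUTH_SYS:
        return 1
    if entry.flavor == AUTH_NONE:
        return 0
    return -1


def choose_flavor(entries: List[SecInfo]) -> Optional[SecInfo]:
    usable = [entry for entry in entries if strength(entry) >= 0]
    return max(usable, key=strength, default=None)


offered = [
    SecInfo(AUTH_SYS),
    SecInfo(RPCSEC_GSS, oid=b"mech", qop=0, service=RPC_GSS_SVC_INTEGRITY),
    SecInfo(RPCSEC_GSS, oid=b"mech", qop=0, service=RPC_GSS_SVC_PRIVACY),
]
chosen = choose_flavor(offered)
```

     A client with different requirements, for example one that cannot
     use a given mechanism OID, would filter the list before ranking.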

   IMPLEMENTATION

     The SECINFO operation is expected to be used by the NFS client when
     the error value of NFS4ERR_WRONGSEC is returned from another NFS
     operation. This signifies to the client that the server's security
     policy is different from what the client is currently using. At
     this point, the client is expected to obtain a list of possible
     security flavors and choose what best suits its policies.

     It is recommended that the client issue the SECINFO call protected
     by a security triple that uses either rpc_gss_svc_integrity or
     rpc_gss_svc_privacy service. The use of rpc_gss_svc_none would
     allow an attacker in the middle to modify the SECINFO results such
     that the client might select a weaker algorithm in the set allowed
     by server, making the client and/or server vulnerable to further
     attacks.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTDIR
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.32. Operation 34: SETATTR - Set Attributes

   SYNOPSIS

     (cfh), stateid, attrmask, attr_vals -> attrsset

   ARGUMENT

     struct SETATTR4args {
             /* CURRENT_FH: target object */
             stateid4        stateid;
             fattr4          obj_attributes;
     };

   RESULT

     struct SETATTR4res {
             nfsstat4        status;
             bitmap4         attrsset;
     };

   DESCRIPTION

     The SETATTR operation changes one or more of the attributes of a
     file system object. The new attributes are specified with a bitmap
     and the attributes that follow the bitmap in bit order.

     The stateid argument for SETATTR is used to provide file locking
     context that is necessary for SETATTR requests that set the size
     attribute.
     Since setting the size attribute modifies the file's
     data, it has the same locking requirements as a corresponding
     WRITE. Any SETATTR that sets the size attribute is incompatible
     with a share lock that specifies DENY_WRITE. The area between the
     old end-of-file and the new end-of-file is considered to be
     modified just as would have been the case had the area in question
     been specified as the target of WRITE, for the purpose of checking
     conflicts with record locks, for those cases in which a server is
     implementing mandatory record locking behavior. A valid stateid
     should always be specified. When the file size attribute is not
     set, the special stateid consisting of all bits zero should be
     passed.

     On either success or failure of the operation, the server will
     return the attrsset bitmask to represent what (if any) attributes
     were successfully set. The attrsset in the response is a subset of
     the bitmap4 that is part of the obj_attributes in the argument.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     If the request specifies the owner attribute to be set, the server
     should allow the operation to succeed if the current owner of the
     object matches the value specified in the request. Some servers
     may be implemented in a way as to prohibit the setting of the owner
     attribute unless the requestor has privilege to do so. If the
     server is lenient in this one case of matching owner values, the
     client implementation may be simplified in cases of creation of an
     object followed by a SETATTR.

     The file size attribute is used to request changes to the size of a
     file.
     A value of 0 (zero) causes the file to be truncated, a value
     less than the current size of the file causes data from the new
     size to the end of the file to be discarded, and a size greater
     than the current size of the file causes logically zeroed data
     bytes to be added to the end of the file. Servers are free to
     implement this using holes or actual zero data bytes. Clients
     should not make any assumptions regarding a server's implementation
     of this feature, beyond that the bytes returned will be zeroed.
     Servers must support extending the file size via SETATTR.

     SETATTR is not guaranteed atomic. A failed SETATTR may partially
     change a file's attributes.

     Changing the size of a file with SETATTR indirectly changes the
     time_modify. A client must account for this as size changes can
     result in data deletion.

     The attributes time_access_set and time_modify_set are write-only
     attributes constructed as a switched union so the client can direct
     the server in setting the time values. If the switched union
     specifies SET_TO_CLIENT_TIME4, the client has provided an nfstime4
     to be used for the operation. If the switched union does not
     specify SET_TO_CLIENT_TIME4, the server is to use its current time
     for the SETATTR operation.

     If server and client times differ, programs that compare client
     time to file times can break. A time maintenance protocol should be
     used to limit client/server time skew.

     Use of a COMPOUND containing a VERIFY operation specifying only the
     time_metadata attribute, immediately followed by a SETATTR,
     provides a means whereby a client may specify a request that
     emulates the functionality of the SETATTR guard mechanism of NFS
     version 3.
     Since the function of the guard mechanism is to avoid
     changes to the file attributes based on stale information, delays
     between checking of the guard condition and the setting of the
     attributes have the potential to compromise this function, as would
     the corresponding delay in the NFS version 4 emulation. Therefore,
     NFS version 4 servers should take care to avoid such delays, to the
     degree possible, when executing such a request.

     If the server does not support an attribute as requested by the
     client, the server should return NFS4ERR_ATTRNOTSUPP.

     A mask of the attributes actually set is returned by SETATTR in all
     cases. That mask must not include attribute bits not requested to
     be set by the client, and must be equal to the mask of attributes
     requested to be set only if the SETATTR completes without error.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_ATTRNOTSUPP
     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DQUOT
     NFS4ERR_EXPIRED
     NFS4ERR_FBIG
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_ISDIR
     NFS4ERR_LOCKED
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOSPC
     NFS4ERR_NOTSUPP
     NFS4ERR_OLD_STATEID
     NFS4ERR_OPENMODE
     NFS4ERR_PERM
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_STATEID
     NFS4ERR_WRONGSEC

14.2.33. Operation 35: SETCLIENTID - Negotiate Clientid

   SYNOPSIS

     client, callback -> clientid

   ARGUMENT

     struct SETCLIENTID4args {
             nfs_client_id4  client;
             cb_client4      callback;
             uint32_t        callback_ident;
     };

   RESULT

     struct SETCLIENTID4resok {
             clientid4       clientid;
     };

     union SETCLIENTID4res switch (nfsstat4 status) {
      case NFS4_OK:
              SETCLIENTID4resok      resok4;
      case NFS4ERR_CLID_INUSE:
              clientaddr4    client_using;
      default:
              void;
     };

   DESCRIPTION

     The SETCLIENTID operation introduces the ability of the client to
     notify the server of its intention to use a particular client
     identifier and verifier pair. Upon successful completion the
     server will return a clientid which is used in subsequent file
     locking requests and for a confirmation step. The client will use
     the SETCLIENTID_CONFIRM operation to return the clientid, as a
     verifier, to the server. At that point, the client may use the
     clientid in subsequent operations that require an nfs_lockowner.

     The callback information provided in this operation will be used if
     the client is provided an open delegation at a future point.
     Therefore, the client must correctly reflect the program and port
     numbers for the callback program at the time SETCLIENTID is used.

     The callback_ident value is used by the server on the callback.
     The client can use the callback_ident as a way to use a single
     callback RPC program number while still being able to determine
     which server is initiating the callback.

   IMPLEMENTATION

     The server takes the verifier and client identification supplied in
     the nfs_client_id4 and searches for a match of the client
     identification.
     If no match is found the server saves the
     principal/uid information along with the verifier and client
     identification and returns a unique clientid that is used as a
     shorthand reference to the supplied information.

     If the server finds a matching client identification, the server
     will only assign a new clientid if the principal/uid matches the
     original entry. This is to protect against rogue clients
     attempting to release client state indiscriminately at the server.
     The principal, or principal to user-identifier mapping, is taken
     from the credential presented in the RPC. As mentioned, the server
     will use the credential and associated principal for the matching
     with existing clientids. If the client is a traditional host-based
     client like a Unix NFS client, then the credential presented may be
     the host credential. If the client is a user level client or
     lightweight client, the credential used may be the end user's
     credential. The client should take care in choosing an appropriate
     credential since denial of service attacks could be attempted by a
     rogue client that has access to the credential.

   ERRORS

     NFS4ERR_BADXDR
     NFS4ERR_CLID_INUSE
     NFS4ERR_INVAL
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT

14.2.34. Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid

   SYNOPSIS

     clientid -> -

   ARGUMENT

     struct SETCLIENTID_CONFIRM4args {
             clientid4       clientid;
     };

   RESULT

     struct SETCLIENTID_CONFIRM4res {
             nfsstat4        status;
     };

   DESCRIPTION

     This operation is used by the client to confirm the results from a
     previous call to SETCLIENTID. The client provides the server
     supplied (from a SETCLIENTID response) clientid. The server
     responds with a simple status of success or failure.

   IMPLEMENTATION

     The client must use the SETCLIENTID_CONFIRM operation to confirm
     its use of the client identifier.
     If the server is holding state for a
     client which has presented a new verifier via SETCLIENTID, then the
     state will not be released, as described in the section "Client
     Failure and Recovery", until a valid SETCLIENTID_CONFIRM is
     received. Upon successful confirmation the server will release the
     previous state held on behalf of the client.

   ERRORS

     NFS4ERR_BADXDR
     NFS4ERR_CLID_INUSE
     NFS4ERR_INVAL
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE_CLIENTID

14.2.35. Operation 37: VERIFY - Verify Same Attributes

   SYNOPSIS

     (cfh), fattr -> -

   ARGUMENT

     struct VERIFY4args {
             /* CURRENT_FH: object */
             fattr4          obj_attributes;
     };

   RESULT

     struct VERIFY4res {
             nfsstat4        status;
     };

   DESCRIPTION

     The VERIFY operation is used to verify that attributes have a value
     assumed by the client before proceeding with following operations
     in the compound request. If any of the attributes do not match
     then the error NFS4ERR_NOT_SAME must be returned. The current
     filehandle retains its value after successful completion of the
     operation.

   IMPLEMENTATION

     One possible use of the VERIFY operation is the following compound
     sequence. With this the client is attempting to verify that the
     file being removed will match what the client expects to be
     removed. This sequence can help prevent the unintended deletion of
     a file.

       PUTFH (directory filehandle)
       LOOKUP (file name)
       VERIFY (filehandle == fh)
       PUTFH (directory filehandle)
       REMOVE (file name)

     This sequence does not prevent a second client from removing and
     creating a new file in the middle of this sequence but it does help
     avoid the unintended result.
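
     The guarding effect of VERIFY can be shown with a short Python
     sketch. It assumes that evaluation of a compound request stops at
     the first operation that does not return NFS4_OK. Attributes are
     modeled as a plain dictionary and status codes as strings; the
     names verify and run_compound are hypothetical, and unsupported
     attributes are not modeled.

```python
from typing import Callable, Dict, List

NFS4_OK = "NFS4_OK"
NFS4ERR_NOT_SAME = "NFS4ERR_NOT_SAME"


def verify(actual: Dict[str, object], expected: Dict[str, object]) -> str:
    """Return NFS4_OK only if every expected attribute matches."""
    for name, value in expected.items():
        if actual.get(name) != value:
            return NFS4ERR_NOT_SAME
    return NFS4_OK


def run_compound(operations: List[Callable[[], str]]) -> List[str]:
    """Run operations in order, stopping after the first failure."""
    results = []
    for operation in operations:
        status = operation()
        results.append(status)
        if status != NFS4_OK:
            break
    return results


# The object on the server no longer has the filehandle the client cached.
server_attrs = {"filehandle": b"fh-new", "size": 10}
removed: List[str] = []


def remove_file() -> str:
    removed.append("file name")
    return NFS4_OK


results = run_compound([
    lambda: verify(server_attrs, {"filehandle": b"fh-old"}),
    remove_file,
])
```

     Because the VERIFY step fails, the REMOVE step is never evaluated
     and the file is left in place.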

     In the case that a recommended attribute is specified in the VERIFY
     operation and the server does not support that attribute for the
     file system object, the error NFS4ERR_NOTSUPP is returned to the
     client.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_ATTRNOTSUPP
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTSUPP
     NFS4ERR_NOT_SAME
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.36. Operation 38: WRITE - Write to File

   SYNOPSIS

     (cfh), stateid, offset, stable, data -> count, committed, writeverf

   ARGUMENT

     enum stable_how4 {
             UNSTABLE4       = 0,
             DATA_SYNC4      = 1,
             FILE_SYNC4      = 2
     };

     struct WRITE4args {
             /* CURRENT_FH: file */
             stateid4        stateid;
             offset4         offset;
             stable_how4     stable;
             opaque          data<>;
     };

   RESULT

     struct WRITE4resok {
             count4          count;
             stable_how4     committed;
             verifier4       writeverf;
     };

     union WRITE4res switch (nfsstat4 status) {
      case NFS4_OK:
              WRITE4resok    resok4;
      default:
              void;
     };

   DESCRIPTION

     The WRITE operation is used to write data to a regular file. The
     target file is specified by the current filehandle. The offset
     specifies the offset where the data should be written. An offset
     of 0 (zero) specifies that the write should start at the beginning
     of the file. The count, as encoded as part of the opaque data
     parameter, represents the number of bytes of data that are to be
     written. If the count is 0 (zero), the WRITE will succeed and
     return a count of 0 (zero) subject to permissions checking. The
     server may choose to write fewer bytes than requested by the
     client.

     Part of the write request is a specification of how the write is to
     be performed.
With the stable parameter, the client specifies 8459 how the data is to be processed by the server. If stable 8460 is FILE_SYNC4, the server must commit the data written plus all 8461 file system metadata to stable storage before returning results. 8462 This corresponds to the NFS version 2 protocol semantics. Any 8463 other behavior constitutes a protocol violation. If stable is 8464 DATA_SYNC4, then the server must commit all of the data to stable 8465 storage and enough of the metadata to retrieve the data before 8466 returning. The server implementor is free to implement DATA_SYNC4 8467 in the same fashion as FILE_SYNC4, but with a possible performance 8468 drop. If stable is UNSTABLE4, the server is free to commit any 8469 part of the data and the metadata to stable storage, including all 8470 or none, before returning a reply to the client. There is no 8471 guarantee whether or when any uncommitted data will subsequently be 8472 committed to stable storage. The only guarantees made by the server 8473 are that it will not destroy any data without changing the value of 8474 writeverf and that it will not commit the data and metadata at a level 8475 less than that requested by the client. 8477 The stateid returned from a previous record lock or share 8478 reservation request is provided as part of the argument. The 8479 stateid is used by the server to verify that the associated lock is 8480 still valid and to update lease timeouts for the client. 8482 Upon successful completion, the following results are returned. 8483 The count result is the number of bytes of data written to the 8484 file. The server may write fewer bytes than requested. If so, the 8485 actual number of bytes written, starting at the location given by offset, is 8486 returned. 8488 The server also returns an indication of the level of commitment of 8489 the data and metadata via committed.
If the server committed all 8490 data and metadata to stable storage, committed should be set to 8491 FILE_SYNC4. If the level of commitment was at least as strong as 8492 DATA_SYNC4, then committed should be set to DATA_SYNC4. Otherwise, 8493 committed must be returned as UNSTABLE4. If stable was FILE_SYNC4, 8494 then committed must also be FILE_SYNC4: anything else constitutes a 8495 protocol violation. If stable was DATA_SYNC4, then committed may be 8496 FILE_SYNC4 or DATA_SYNC4: anything else constitutes a protocol 8497 violation. If stable was UNSTABLE4, then committed may be any of 8498 FILE_SYNC4, DATA_SYNC4, or UNSTABLE4. 8500 The final portion of the result is the write verifier. The write 8501 verifier is a cookie that the client can use to determine whether 8502 the server has changed instance (boot) state between a call to 8503 WRITE and a subsequent call to either WRITE or COMMIT. This cookie 8504 must be consistent during a single instance of the NFS version 4 8505 protocol service and must be unique between instances of the NFS 8506 version 4 protocol server, where uncommitted data may be lost. 8508 If a client writes data to the server with the stable argument set 8509 to UNSTABLE4 and the reply yields a committed response of 8510 DATA_SYNC4 or UNSTABLE4, the client will follow up some time in the 8511 future with a COMMIT operation to synchronize outstanding 8512 asynchronous data and metadata with the server's stable storage, 8513 barring client error. It is possible that, due to client crash or 8514 other error, a subsequent COMMIT will not be received by the 8515 server. 8517 On success, the current filehandle retains its value. 8519 IMPLEMENTATION 8521 It is possible for the server to write fewer bytes of data than 8522 requested by the client. In this case, the server should not 8523 return an error unless no data was written at all.
If the server 8524 writes fewer than the number of bytes specified, the client should 8525 issue another WRITE to write the remaining data. 8527 It is assumed that the act of writing data to a file will cause the 8528 time_modify of the file to be updated. However, the 8529 time_modify of the file should not be changed unless the contents 8530 of the file are changed. Thus, a WRITE request with count set to 0 8531 should not cause the time_modify of the file to be updated. 8533 The definition of stable storage has historically been a point of 8534 contention. The following expected properties of stable storage 8535 may help in resolving design issues in the implementation. Stable 8536 storage is persistent storage that survives: 8538 1. Repeated power failures. 8539 2. Hardware failures (of any board, power supply, etc.). 8540 3. Repeated software crashes, including reboot cycle. 8542 This definition does not address failure of the stable storage 8543 module itself. 8545 The verifier is defined to allow a client to detect different 8546 instances of an NFS version 4 protocol server over which cached, 8547 uncommitted data may be lost. In the most likely case, the verifier 8548 allows the client to detect server reboots. This information is 8549 required so that the client can safely determine whether the server 8550 could have lost cached data. If the server fails unexpectedly and 8551 the client has uncommitted data from previous WRITE requests (done 8552 with the stable argument set to UNSTABLE4 and in which the result 8553 committed was returned as UNSTABLE4 as well), the server may not have 8554 flushed cached data to stable storage. The burden of recovery is on 8555 the client and the client will need to retransmit the data to the 8556 server. 8558 A suggested verifier would be to use the time that the server was 8559 booted or the time the server was last started (if restarting the 8560 server without a reboot results in lost buffers).
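The verifier handling described above can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed design: the helper make_write_verifier and the class UncommittedWrites are hypothetical names, the boot time is packed as an 8-byte big-endian integer only because NFS4_VERIFIER_SIZE is 8, and the client is assumed to keep every unstable write until a COMMIT returns the same verifier that the WRITE returned.

```python
import struct

NFS4_VERIFIER_SIZE = 8  # from the "Sizes" constants in the RPC definition file


def make_write_verifier(boot_time_seconds):
    """Server side: derive the verifier from the boot (or restart) time."""
    return struct.pack(">Q", boot_time_seconds)


class UncommittedWrites:
    """Client side: remember UNSTABLE4 writes until a COMMIT confirms them."""

    def __init__(self):
        self.pending = []  # list of (offset, data, writeverf)

    def record(self, offset, data, writeverf):
        self.pending.append((offset, data, writeverf))

    def on_commit(self, commit_verf):
        """Return the (offset, data) pairs that must be retransmitted.

        A write whose verifier differs from the one returned by COMMIT
        was sent to an earlier server instance and may have been lost.
        Writes with a matching verifier are now on stable storage.
        """
        resend = [(off, data) for (off, data, verf) in self.pending
                  if verf != commit_verf]
        self.pending = []
        return resend
```

The caller is expected to issue new WRITE requests for everything returned by on_commit and to record them again with the verifier from the new replies.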
8562 The committed field in the results allows the client to do more 8563 effective caching. If the server is committing all WRITE requests 8564 to stable storage, then it should return with committed set to 8565 FILE_SYNC4, regardless of the value of the stable field in the 8566 arguments. A server that uses an NVRAM accelerator may choose to 8567 implement this policy. The client can use this to increase the 8568 effectiveness of the cache by discarding cached data that has 8569 already been committed on the server. 8571 Some implementations may return NFS4ERR_NOSPC instead of 8572 NFS4ERR_DQUOT when a user's quota is exceeded. In the case that 8573 the current filehandle is a directory, the server will return 8574 NFS4ERR_ISDIR. If the current filehandle is not a regular file or 8575 a directory, the server will return NFS4ERR_INVAL. 8577 ERRORS 8579 NFS4ERR_ACCESS 8580 NFS4ERR_BADHANDLE 8581 NFS4ERR_BAD_STATEID 8582 NFS4ERR_BADXDR 8583 NFS4ERR_DELAY 8584 NFS4ERR_DQUOT 8585 NFS4ERR_EXPIRED 8586 NFS4ERR_FBIG 8587 NFS4ERR_FHEXPIRED 8588 NFS4ERR_GRACE 8589 NFS4ERR_INVAL 8590 NFS4ERR_IO 8591 NFS4ERR_ISDIR 8592 NFS4ERR_LEASE_MOVED 8593 NFS4ERR_LOCKED 8594 NFS4ERR_MOVED 8595 NFS4ERR_NOFILEHANDLE 8596 NFS4ERR_NOSPC 8597 NFS4ERR_NXIO 8598 NFS4ERR_OLD_STATEID 8599 NFS4ERR_OPENMODE 8600 NFS4ERR_RESOURCE 8601 NFS4ERR_ROFS 8602 NFS4ERR_SERVERFAULT 8603 NFS4ERR_STALE 8604 NFS4ERR_STALE_STATEID 8605 NFS4ERR_WRONGSEC 8607 14.2.37. Operation 39: RELEASE_LOCKOWNER - Release Lockowner State 8609 SYNOPSIS 8611 lockowner -> () 8613 ARGUMENT 8615 struct RELEASE_LOCKOWNER4args { 8616 lock_owner4 lock_owner; 8617 }; 8619 RESULT 8621 struct RELEASE_LOCKOWNER4res { 8622 nfsstat4 status; 8623 }; 8625 DESCRIPTION 8627 This operation is used to notify the server that the lock_owner is 8628 no longer in use by the client. This allows the server to release 8629 cached state related to the specified lock_owner. 
8631 IMPLEMENTATION 8633 The client may choose to use this operation to ease the amount of 8634 server state that is cached. 8636 ERRORS 8638 NFS4ERR_BADXDR 8639 NFS4ERR_EXPIRED 8640 NFS4ERR_GRACE 8641 NFS4ERR_LEASE_MOVED 8642 NFS4ERR_RESOURCE 8643 NFS4ERR_SERVERFAULT 8644 NFS4ERR_STALE_CLIENTID 8646 15. NFS Version 4 Callback Procedures 8648 The procedures used for callbacks are defined in the following 8649 sections. In the interest of clarity, the terms "client" and 8650 "server" refer to NFS clients and servers, despite the fact that for 8651 an individual callback RPC, the sense of these terms would be 8652 precisely the opposite. 8654 15.1. Procedure 0: CB_NULL - No Operation 8656 SYNOPSIS 8658 8660 ARGUMENT 8662 void; 8664 RESULT 8666 void; 8668 DESCRIPTION 8670 Standard NULL procedure. Void argument, void response. Even 8671 though there is no direct functionality associated with this 8672 procedure, the server will use CB_NULL to confirm the existence of 8673 a path for RPCs from server to client. 8675 ERRORS 8677 None. 8679 15.2. Procedure 1: CB_COMPOUND - Compound Operations 8681 SYNOPSIS 8683 compoundargs -> compoundres 8685 ARGUMENT 8687 enum nfs_cb_opnum4 { 8688 OP_CB_GETATTR = 3, 8689 OP_CB_RECALL = 4 8690 }; 8692 union nfs_cb_argop4 switch (unsigned argop) { 8693 case OP_CB_GETATTR: CB_GETATTR4args opcbgetattr; 8694 case OP_CB_RECALL: CB_RECALL4args opcbrecall; 8695 }; 8697 struct CB_COMPOUND4args { 8698 utf8string tag; 8699 uint32_t minorversion; 8700 uint32_t callback_ident; 8701 nfs_cb_argop4 argarray<>; 8702 }; 8704 RESULT 8706 union nfs_cb_resop4 switch (unsigned resop){ 8707 case OP_CB_GETATTR: CB_GETATTR4res opcbgetattr; 8708 case OP_CB_RECALL: CB_RECALL4res opcbrecall; 8709 }; 8711 struct CB_COMPOUND4res { 8712 nfsstat4 status; 8713 utf8string tag; 8714 nfs_cb_resop4 resarray<>; 8715 }; 8717 DESCRIPTION 8719 The CB_COMPOUND procedure is used to combine one or more of the 8720 callback procedures into a single RPC request. 
The main callback 8721 RPC program has two main procedures: CB_NULL and CB_COMPOUND. All 8722 other operations use the CB_COMPOUND procedure as a wrapper. 8724 In the processing of the CB_COMPOUND procedure, the client may find 8725 that it does not have the available resources to execute any or all 8726 of the operations within the CB_COMPOUND sequence. In this case, 8727 the error NFS4ERR_RESOURCE will be returned for the particular 8728 operation within the CB_COMPOUND procedure where the resource 8729 exhaustion occurred. This assumes that all previous operations 8730 within the CB_COMPOUND sequence have been evaluated successfully. 8732 Contained within the CB_COMPOUND results is a 'status' field. This 8733 status must be equivalent to the status of the last operation that 8734 was executed within the CB_COMPOUND procedure. Therefore, if an 8735 operation incurred an error then the 'status' value will be the 8736 same error value as is being returned for the operation that 8737 failed. 8739 The definition of the "tag" in the request is left to the 8740 implementor. It may be used to summarize the content of the 8741 callback compound request for the benefit of packet sniffers and 8742 engineers debugging implementations. However, the value of "tag" 8743 in the response MUST be the same value as provided in the request. 8745 The value of callback_ident is supplied by the client during 8746 SETCLIENTID. The server must use the client supplied 8747 callback_ident during the CB_COMPOUND to allow the client to 8748 properly identify the server. 8750 IMPLEMENTATION 8752 The CB_COMPOUND procedure is used to combine individual operations 8753 into a single RPC request. The client interprets each of the 8754 operations in turn. If an operation is executed by the client and 8755 the status of that operation is NFS4_OK, then the next operation in 8756 the CB_COMPOUND procedure is executed. 
The client continues this 8757 process until there are no more operations to be executed or one of 8758 the operations has a status value other than NFS4_OK. 8760 ERRORS 8762 NFS4ERR_BADHANDLE 8763 NFS4ERR_BAD_STATEID 8764 NFS4ERR_RESOURCE 8766 15.2.1. Operation 3: CB_GETATTR - Get Attributes 8768 SYNOPSIS 8770 fh, attrbits -> attrbits, attrvals 8772 ARGUMENT 8774 struct CB_GETATTR4args { 8775 nfs_fh4 fh; 8776 bitmap4 attr_request; 8777 }; 8779 RESULT 8781 struct CB_GETATTR4resok { 8782 fattr4 obj_attributes; 8783 }; 8785 union CB_GETATTR4res switch (nfsstat4 status) { 8786 case NFS4_OK: 8787 CB_GETATTR4resok resok4; 8788 default: 8789 void; 8790 }; 8792 DESCRIPTION 8794 The CB_GETATTR operation is used to obtain the attributes modified 8795 by an open delegate to allow the server to respond to GETATTR 8796 requests for a file which is the subject of an open delegation. 8798 If the handle specified is not one for which the client holds a 8799 write open delegation, an NFS4ERR_BADHANDLE error is returned. 8801 IMPLEMENTATION 8803 The client returns attrbits and the associated attribute values 8804 only for attributes that it may change (change, time_modify, size). 8806 ERRORS 8808 NFS4ERR_BADHANDLE 8809 NFS4ERR_BADXDR 8810 NFS4ERR_RESOURCE 8811 NFS4ERR_SERVERFAULT 8813 15.2.2. Operation 4: CB_RECALL - Recall an Open Delegation 8815 SYNOPSIS 8817 stateid, truncate, fh -> status 8819 ARGUMENT 8821 struct CB_RECALL4args { 8822 stateid4 stateid; 8823 bool truncate; 8824 nfs_fh4 fh; 8825 }; 8827 RESULT 8829 struct CB_RECALL4res { 8830 nfsstat4 status; 8831 }; 8833 DESCRIPTION 8835 The CB_RECALL operation is used to begin the process of recalling 8836 an open delegation and returning it to the server. 8838 The truncate flag is used to optimize recall for a file which is 8839 about to be truncated to zero. When it is set, the client is freed 8840 of obligation to propagate modified data for the file to the 8841 server, since this data is irrelevant. 
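The client's handling of CB_RECALL, as described in this section, can be sketched as below. The class DelegationTable and its fields are hypothetical, and the model assumes that the client keys its delegations by filehandle. The error values come from the nfsstat4 enum. The reply is produced immediately; the recall itself only finishes once the queued delegation is later returned with DELEGRETURN, which the sketch does not model.

```python
# Status codes taken from the nfsstat4 enum in the RPC definition file.
NFS4_OK = 0
NFS4ERR_BADHANDLE = 10001
NFS4ERR_BAD_STATEID = 10025


class DelegationTable:
    """Hypothetical client-side table of open delegations."""

    def __init__(self):
        self.by_fh = {}      # filehandle -> delegation stateid
        self.to_return = []  # (filehandle, stateid, flush_needed)

    def grant(self, fh, stateid):
        self.by_fh[fh] = stateid

    def cb_recall(self, stateid, truncate, fh):
        """Answer CB_RECALL at once and queue the delegation for return."""
        held = self.by_fh.get(fh)
        if held is None:
            return NFS4ERR_BADHANDLE    # no delegation held for this handle
        if held != stateid:
            return NFS4ERR_BAD_STATEID  # stateid does not match the file
        # When truncate is set, modified data need not be written back.
        self.to_return.append((fh, stateid, not truncate))
        return NFS4_OK
```

Queueing the return rather than performing it inside the callback keeps the reply prompt, which matches the advice that the client should reply to the callback immediately.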
8843 If the handle specified is not one for which the client holds an 8844 open delegation, an NFS4ERR_BADHANDLE error is returned. 8846 If the stateid specified is not one corresponding to an open 8847 delegation for the file specified by the filehandle, an 8848 NFS4ERR_BAD_STATEID is returned. 8850 IMPLEMENTATION 8852 The client should reply to the callback immediately. Replying does 8853 not complete the recall. The recall is not complete until the 8854 delegation is returned using a DELEGRETURN. 8856 ERRORS 8858 NFS4ERR_BADHANDLE 8859 NFS4ERR_BAD_STATEID 8860 NFS4ERR_BADXDR 8861 NFS4ERR_RESOURCE 8862 NFS4ERR_SERVERFAULT 8864 16. Security Considerations 8866 The major security feature to consider is the authentication of the 8867 user making the request of NFS service. Consideration should also be 8868 given to the integrity and privacy of this NFS request. These 8869 specific issues are discussed as part of the section on "RPC and 8870 Security Flavor". 8872 17. IANA Considerations 8874 17.1. Named Attribute Definition 8876 The NFS version 4 protocol provides for the association of named 8877 attributes to files. The name space identifiers for these attributes 8878 are defined as string names. The protocol does not define the 8879 specific assignment of the name space for these file attributes; the 8880 application developer or system vendor is allowed to define the 8881 attribute, its semantics, and the associated name. Even though this 8882 name space will not be specifically controlled to prevent collisions, 8883 the application developer or system vendor is strongly encouraged to 8884 provide the name assignment and associated semantics for attributes 8885 via an Informational RFC. This will provide for interoperability 8886 where common interests exist. 8888 18. RPC definition file 8890 /* 8891 * Copyright (C) The Internet Society (1998,1999,2000,2001,2002). 8892 * All Rights Reserved. 
8893 */ 8895 /* 8896 * nfs4_prot.x 8897 * 8898 */ 8900 %#pragma ident "@(#)nfs4_prot.x 1.109" 8902 /* 8903 * Basic typedefs for RFC 1832 data type definitions 8904 */ 8905 typedef int int32_t; 8906 typedef unsigned int uint32_t; 8907 typedef hyper int64_t; 8908 typedef unsigned hyper uint64_t; 8910 /* 8911 * Sizes 8912 */ 8913 const NFS4_FHSIZE = 128; 8914 const NFS4_VERIFIER_SIZE = 8; 8915 const NFS4_OPAQUE_LIMIT = 1024; 8917 /* 8918 * File types 8919 */ 8920 enum nfs_ftype4 { 8921 NF4REG = 1, /* Regular File */ 8922 NF4DIR = 2, /* Directory */ 8923 NF4BLK = 3, /* Special File - block device */ 8924 NF4CHR = 4, /* Special File - character device */ 8925 NF4LNK = 5, /* Symbolic Link */ 8926 NF4SOCK = 6, /* Special File - socket */ 8927 NF4FIFO = 7, /* Special File - fifo */ 8928 NF4ATTRDIR = 8, /* Attribute Directory */ 8929 NF4NAMEDATTR = 9 /* Named Attribute */ 8930 }; 8932 /* 8933 * Error status 8934 */ 8935 enum nfsstat4 { 8936 NFS4_OK = 0, 8937 NFS4ERR_PERM = 1, 8938 NFS4ERR_NOENT = 2, 8939 NFS4ERR_IO = 5, 8940 NFS4ERR_NXIO = 6, 8941 NFS4ERR_ACCESS = 13, 8942 NFS4ERR_EXIST = 17, 8943 NFS4ERR_XDEV = 18, 8944 NFS4ERR_NODEV = 19, 8945 NFS4ERR_NOTDIR = 20, 8946 NFS4ERR_ISDIR = 21, 8947 NFS4ERR_INVAL = 22, 8948 NFS4ERR_FBIG = 27, 8949 NFS4ERR_NOSPC = 28, 8950 NFS4ERR_ROFS = 30, 8951 NFS4ERR_MLINK = 31, 8952 NFS4ERR_NAMETOOLONG = 63, 8953 NFS4ERR_NOTEMPTY = 66, 8954 NFS4ERR_DQUOT = 69, 8955 NFS4ERR_STALE = 70, 8956 NFS4ERR_BADHANDLE = 10001, 8957 NFS4ERR_BAD_COOKIE = 10003, 8958 NFS4ERR_NOTSUPP = 10004, 8959 NFS4ERR_TOOSMALL = 10005, 8960 NFS4ERR_SERVERFAULT = 10006, 8961 NFS4ERR_BADTYPE = 10007, 8962 NFS4ERR_DELAY = 10008, 8963 NFS4ERR_SAME = 10009,/* nverify says attrs same */ 8964 NFS4ERR_DENIED = 10010,/* lock unavailable */ 8965 NFS4ERR_EXPIRED = 10011,/* lock lease expired */ 8966 NFS4ERR_LOCKED = 10012,/* I/O failed due to lock */ 8967 NFS4ERR_GRACE = 10013,/* in grace period */ 8968 NFS4ERR_FHEXPIRED = 10014,/* file handle expired */ 8969 
NFS4ERR_SHARE_DENIED = 10015,/* share reserve denied */ 8970 NFS4ERR_WRONGSEC = 10016,/* wrong security flavor */ 8971 NFS4ERR_CLID_INUSE = 10017,/* clientid in use */ 8972 NFS4ERR_RESOURCE = 10018,/* resource exhaustion */ 8973 NFS4ERR_MOVED = 10019,/* filesystem relocated */ 8974 NFS4ERR_NOFILEHANDLE = 10020,/* current FH is not set */ 8975 NFS4ERR_MINOR_VERS_MISMATCH = 10021,/* minor vers not supp */ 8976 NFS4ERR_STALE_CLIENTID = 10022, 8977 NFS4ERR_STALE_STATEID = 10023, 8978 NFS4ERR_OLD_STATEID = 10024, 8979 NFS4ERR_BAD_STATEID = 10025, 8980 NFS4ERR_BAD_SEQID = 10026, 8981 NFS4ERR_NOT_SAME = 10027,/* verify - attrs not same */ 8982 NFS4ERR_LOCK_RANGE = 10028, 8983 NFS4ERR_SYMLINK = 10029, 8984 NFS4ERR_READDIR_NOSPC = 10030, 8985 NFS4ERR_LEASE_MOVED = 10031, 8986 NFS4ERR_ATTRNOTSUPP = 10032, 8987 NFS4ERR_NO_GRACE = 10033, 8988 NFS4ERR_RECLAIM_BAD = 10034, 8989 NFS4ERR_RECLAIM_CONFLICT = 10035, 8990 NFS4ERR_BADXDR = 10036, 8991 NFS4ERR_LOCKS_HELD = 10037, 8992 NFS4ERR_OPENMODE = 10038, 8993 NFS4ERR_BADOWNER = 10039 8994 }; 8996 /* 8997 * Basic data types 8998 */ 8999 typedef uint32_t bitmap4<>; 9000 typedef uint64_t offset4; 9001 typedef uint32_t count4; 9002 typedef uint64_t length4; 9003 typedef uint64_t clientid4; 9004 typedef uint32_t seqid4; 9005 typedef opaque utf8string<>; 9006 typedef utf8string component4; 9007 typedef component4 pathname4<>; 9008 typedef uint64_t nfs_lockid4; 9009 typedef uint64_t nfs_cookie4; 9010 typedef utf8string linktext4; 9011 typedef opaque sec_oid4<>; 9012 typedef uint32_t qop4; 9013 typedef uint32_t mode4; 9014 typedef uint64_t changeid4; 9015 typedef opaque verifier4[NFS4_VERIFIER_SIZE]; 9017 /* 9018 * Timeval 9019 */ 9020 struct nfstime4 { 9021 int64_t seconds; 9022 uint32_t nseconds; 9023 }; 9025 enum time_how4 { 9026 SET_TO_SERVER_TIME4 = 0, 9027 SET_TO_CLIENT_TIME4 = 1 9028 }; 9030 union settime4 switch (time_how4 set_it) { 9031 case SET_TO_CLIENT_TIME4: 9032 nfstime4 time; 9033 default: 9034 void; 9035 }; 9037 /* 9038 * 
File access handle 9039 */ 9040 typedef opaque nfs_fh4<NFS4_FHSIZE>; 9042 /* 9043 * File attribute definitions 9044 */ 9046 /* 9047 * FSID structure for major/minor 9048 */ 9049 struct fsid4 { 9050 uint64_t major; 9051 uint64_t minor; 9052 }; 9054 /* 9055 * Filesystem locations attribute for relocation/migration 9056 */ 9057 struct fs_location4 { 9058 utf8string server<>; 9059 pathname4 rootpath; 9060 }; 9062 struct fs_locations4 { 9063 pathname4 fs_root; 9064 fs_location4 locations<>; 9065 }; 9067 /* 9068 * Various Access Control Entry definitions 9069 */ 9071 /* 9072 * Mask that indicates which Access Control Entries are supported. 9073 * Values for the fattr4_aclsupport attribute. 9074 */ 9075 const ACL4_SUPPORT_ALLOW_ACL = 0x00000001; 9076 const ACL4_SUPPORT_DENY_ACL = 0x00000002; 9077 const ACL4_SUPPORT_AUDIT_ACL = 0x00000004; 9078 const ACL4_SUPPORT_ALARM_ACL = 0x00000008; 9080 typedef uint32_t acetype4; 9082 /* 9083 * acetype4 values, others can be added as needed. 9084 */ 9085 const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000; 9086 const ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001; 9087 const ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002; 9088 const ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003; 9090 /* 9091 * ACE flag 9092 */ 9094 typedef uint32_t aceflag4; 9096 /* 9097 * ACE flag values 9098 */ 9099 const ACE4_FILE_INHERIT_ACE = 0x00000001; 9100 const ACE4_DIRECTORY_INHERIT_ACE = 0x00000002; 9101 const ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004; 9102 const ACE4_INHERIT_ONLY_ACE = 0x00000008; 9103 const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG = 0x00000010; 9104 const ACE4_FAILED_ACCESS_ACE_FLAG = 0x00000020; 9105 const ACE4_IDENTIFIER_GROUP = 0x00000040; 9107 /* 9108 * ACE mask 9109 */ 9110 typedef uint32_t acemask4; 9112 /* 9113 * ACE mask values 9114 */ 9115 const ACE4_READ_DATA = 0x00000001; 9116 const ACE4_LIST_DIRECTORY = 0x00000001; 9117 const ACE4_WRITE_DATA = 0x00000002; 9118 const ACE4_ADD_FILE = 0x00000002; 9119 const ACE4_APPEND_DATA = 0x00000004; 9120 const ACE4_ADD_SUBDIRECTORY =
0x00000004; 9121 const ACE4_READ_NAMED_ATTRS = 0x00000008; 9122 const ACE4_WRITE_NAMED_ATTRS = 0x00000010; 9123 const ACE4_EXECUTE = 0x00000020; 9124 const ACE4_DELETE_CHILD = 0x00000040; 9125 const ACE4_READ_ATTRIBUTES = 0x00000080; 9126 const ACE4_WRITE_ATTRIBUTES = 0x00000100; 9128 const ACE4_DELETE = 0x00010000; 9129 const ACE4_READ_ACL = 0x00020000; 9130 const ACE4_WRITE_ACL = 0x00040000; 9131 const ACE4_WRITE_OWNER = 0x00080000; 9132 const ACE4_SYNCHRONIZE = 0x00100000; 9134 /* 9135 * ACE4_GENERIC_READ -- defined as combination of 9136 * ACE4_READ_ACL | 9137 * ACE4_READ_DATA | 9138 * ACE4_READ_ATTRIBUTES | 9139 * ACE4_SYNCHRONIZE 9140 */ 9142 const ACE4_GENERIC_READ = 0x00120081; 9144 /* 9145 * ACE4_GENERIC_WRITE -- defined as combination of 9146 * ACE4_READ_ACL | 9147 * ACE4_WRITE_DATA | 9148 * ACE4_WRITE_ATTRIBUTES | 9149 * ACE4_WRITE_ACL | 9150 * ACE4_APPEND_DATA | 9151 * ACE4_SYNCHRONIZE 9152 */ 9153 const ACE4_GENERIC_WRITE = 0x00160106; 9155 /* 9156 * ACE4_GENERIC_EXECUTE -- defined as combination of 9157 * ACE4_READ_ACL 9158 * ACE4_READ_ATTRIBUTES 9159 * ACE4_EXECUTE 9160 * ACE4_SYNCHRONIZE 9161 */ 9162 const ACE4_GENERIC_EXECUTE = 0x001200A0; 9164 /* 9165 * Access Control Entry definition 9166 */ 9167 struct nfsace4 { 9168 acetype4 type; 9169 aceflag4 flag; 9170 acemask4 access_mask; 9171 utf8string who; 9172 }; 9174 /* 9175 * Special data/attribute associated with 9176 * file types NF4BLK and NF4CHR. 
9177 */ 9178 struct specdata4 { 9179 uint32_t specdata1; 9180 uint32_t specdata2; 9181 }; 9183 /* 9184 * Values for fattr4_fh_expire_type 9185 */ 9186 const FH4_PERSISTENT = 0x00000000; 9187 const FH4_NOEXPIRE_WITH_OPEN = 0x00000001; 9188 const FH4_VOLATILE_ANY = 0x00000002; 9189 const FH4_VOL_MIGRATION = 0x00000004; 9190 const FH4_VOL_RENAME = 0x00000008; 9192 typedef bitmap4 fattr4_supported_attrs; 9193 typedef nfs_ftype4 fattr4_type; 9194 typedef uint32_t fattr4_fh_expire_type; 9195 typedef changeid4 fattr4_change; 9196 typedef uint64_t fattr4_size; 9197 typedef bool fattr4_link_support; 9198 typedef bool fattr4_symlink_support; 9199 typedef bool fattr4_named_attr; 9200 typedef fsid4 fattr4_fsid; 9201 typedef bool fattr4_unique_handles; 9202 typedef uint32_t fattr4_lease_time; 9203 typedef nfsstat4 fattr4_rdattr_error; 9205 typedef nfsace4 fattr4_acl<>; 9206 typedef uint32_t fattr4_aclsupport; 9207 typedef bool fattr4_archive; 9208 typedef bool fattr4_cansettime; 9209 typedef bool fattr4_case_insensitive; 9210 typedef bool fattr4_case_preserving; 9211 typedef bool fattr4_chown_restricted; 9212 typedef uint64_t fattr4_fileid; 9213 typedef uint64_t fattr4_files_avail; 9214 typedef nfs_fh4 fattr4_filehandle; 9215 typedef uint64_t fattr4_files_free; 9216 typedef uint64_t fattr4_files_total; 9217 typedef fs_locations4 fattr4_fs_locations; 9218 typedef bool fattr4_hidden; 9219 typedef bool fattr4_homogeneous; 9220 typedef uint64_t fattr4_maxfilesize; 9221 typedef uint32_t fattr4_maxlink; 9222 typedef uint32_t fattr4_maxname; 9223 typedef uint64_t fattr4_maxread; 9224 typedef uint64_t fattr4_maxwrite; 9225 typedef utf8string fattr4_mimetype; 9226 typedef mode4 fattr4_mode; 9227 typedef bool fattr4_no_trunc; 9228 typedef uint32_t fattr4_numlinks; 9229 typedef utf8string fattr4_owner; 9230 typedef utf8string fattr4_owner_group; 9231 typedef uint64_t fattr4_quota_avail_hard; 9232 typedef uint64_t fattr4_quota_avail_soft; 9233 typedef uint64_t fattr4_quota_used; 9234 
typedef specdata4 fattr4_rawdev; 9235 typedef uint64_t fattr4_space_avail; 9236 typedef uint64_t fattr4_space_free; 9237 typedef uint64_t fattr4_space_total; 9238 typedef uint64_t fattr4_space_used; 9239 typedef bool fattr4_system; 9240 typedef nfstime4 fattr4_time_access; 9241 typedef settime4 fattr4_time_access_set; 9242 typedef nfstime4 fattr4_time_backup; 9243 typedef nfstime4 fattr4_time_create; 9244 typedef nfstime4 fattr4_time_delta; 9245 typedef nfstime4 fattr4_time_metadata; 9246 typedef nfstime4 fattr4_time_modify; 9247 typedef settime4 fattr4_time_modify_set; 9249 /* 9250 * Mandatory Attributes 9251 */ 9252 const FATTR4_SUPPORTED_ATTRS = 0; 9253 const FATTR4_TYPE = 1; 9254 const FATTR4_FH_EXPIRE_TYPE = 2; 9255 const FATTR4_CHANGE = 3; 9256 const FATTR4_SIZE = 4; 9257 const FATTR4_LINK_SUPPORT = 5; 9258 const FATTR4_SYMLINK_SUPPORT = 6; 9259 const FATTR4_NAMED_ATTR = 7; 9260 const FATTR4_FSID = 8; 9261 const FATTR4_UNIQUE_HANDLES = 9; 9262 const FATTR4_LEASE_TIME = 10; 9263 const FATTR4_RDATTR_ERROR = 11; 9265 /* 9266 * Recommended Attributes 9267 */ 9268 const FATTR4_ACL = 12; 9269 const FATTR4_ACLSUPPORT = 13; 9270 const FATTR4_ARCHIVE = 14; 9271 const FATTR4_CANSETTIME = 15; 9272 const FATTR4_CASE_INSENSITIVE = 16; 9273 const FATTR4_CASE_PRESERVING = 17; 9274 const FATTR4_CHOWN_RESTRICTED = 18; 9275 const FATTR4_FILEHANDLE = 19; 9276 const FATTR4_FILEID = 20; 9277 const FATTR4_FILES_AVAIL = 21; 9278 const FATTR4_FILES_FREE = 22; 9279 const FATTR4_FILES_TOTAL = 23; 9280 const FATTR4_FS_LOCATIONS = 24; 9281 const FATTR4_HIDDEN = 25; 9282 const FATTR4_HOMOGENEOUS = 26; 9283 const FATTR4_MAXFILESIZE = 27; 9284 const FATTR4_MAXLINK = 28; 9285 const FATTR4_MAXNAME = 29; 9286 const FATTR4_MAXREAD = 30; 9287 const FATTR4_MAXWRITE = 31; 9288 const FATTR4_MIMETYPE = 32; 9289 const FATTR4_MODE = 33; 9290 const FATTR4_NO_TRUNC = 34; 9291 const FATTR4_NUMLINKS = 35; 9292 const FATTR4_OWNER = 36; 9293 const FATTR4_OWNER_GROUP = 37; 9294 const FATTR4_QUOTA_AVAIL_HARD 
= 38; 9295 const FATTR4_QUOTA_AVAIL_SOFT = 39; 9296 const FATTR4_QUOTA_USED = 40; 9297 const FATTR4_RAWDEV = 41; 9298 const FATTR4_SPACE_AVAIL = 42; 9299 const FATTR4_SPACE_FREE = 43; 9300 const FATTR4_SPACE_TOTAL = 44; 9301 const FATTR4_SPACE_USED = 45; 9302 const FATTR4_SYSTEM = 46; 9303 const FATTR4_TIME_ACCESS = 47; 9304 const FATTR4_TIME_ACCESS_SET = 48; 9305 const FATTR4_TIME_BACKUP = 49; 9306 const FATTR4_TIME_CREATE = 50; 9307 const FATTR4_TIME_DELTA = 51; 9308 const FATTR4_TIME_METADATA = 52; 9309 const FATTR4_TIME_MODIFY = 53; 9310 const FATTR4_TIME_MODIFY_SET = 54; 9312 typedef opaque attrlist4<>; 9314 /* 9315 * File attribute container 9316 */ 9317 struct fattr4 { 9318 bitmap4 attrmask; 9319 attrlist4 attr_vals; 9320 }; 9322 /* 9323 * Change info for the client 9324 */ 9325 struct change_info4 { 9326 bool atomic; 9327 changeid4 before; 9328 changeid4 after; 9329 }; 9331 struct clientaddr4 { 9332 /* see struct rpcb in RFC 1833 */ 9333 string r_netid<>; /* network id */ 9334 string r_addr<>; /* universal address */ 9335 }; 9337 /* 9338 * Callback program info as provided by the client 9339 */ 9340 struct cb_client4 { 9341 uint32_t cb_program; 9342 clientaddr4 cb_location; 9343 }; 9345 /* 9346 * Stateid 9347 */ 9349 struct stateid4 { 9350 uint32_t seqid; 9351 opaque other[12]; 9352 }; 9354 /* 9355 * Client ID 9356 */ 9357 struct nfs_client_id4 { 9358 verifier4 verifier; 9359 opaque id<NFS4_OPAQUE_LIMIT>; 9360 }; 9362 struct open_owner4 { 9363 clientid4 clientid; 9364 opaque owner<NFS4_OPAQUE_LIMIT>; 9365 }; 9367 struct lock_owner4 { 9368 clientid4 clientid; 9369 opaque owner<NFS4_OPAQUE_LIMIT>; 9370 }; 9372 enum nfs_lock_type4 { 9373 READ_LT = 1, 9374 WRITE_LT = 2, 9375 READW_LT = 3, /* blocking read */ 9376 WRITEW_LT = 4 /* blocking write */ 9377 }; 9379 /* 9380 * ACCESS: Check access permission 9381 */ 9382 const ACCESS4_READ = 0x00000001; 9383 const ACCESS4_LOOKUP = 0x00000002; 9384 const ACCESS4_MODIFY = 0x00000004; 9385 const ACCESS4_EXTEND = 0x00000008; 9386 const ACCESS4_DELETE = 0x00000010; 9387 const
ACCESS4_EXECUTE = 0x00000020; 9389 struct ACCESS4args { 9390 /* CURRENT_FH: object */ 9391 uint32_t access; 9392 }; 9394 struct ACCESS4resok { 9395 uint32_t supported; 9396 uint32_t access; 9397 }; 9399 union ACCESS4res switch (nfsstat4 status) { 9400 case NFS4_OK: 9402 ACCESS4resok resok4; 9403 default: 9404 void; 9405 }; 9407 /* 9408 * CLOSE: Close a file and release share locks 9409 */ 9410 struct CLOSE4args { 9411 /* CURRENT_FH: object */ 9412 seqid4 seqid; 9413 stateid4 open_stateid; 9414 }; 9416 union CLOSE4res switch (nfsstat4 status) { 9417 case NFS4_OK: 9418 stateid4 open_stateid; 9419 default: 9420 void; 9421 }; 9423 /* 9424 * COMMIT: Commit cached data on server to stable storage 9425 */ 9426 struct COMMIT4args { 9427 /* CURRENT_FH: file */ 9428 offset4 offset; 9429 count4 count; 9430 }; 9432 struct COMMIT4resok { 9433 verifier4 writeverf; 9434 }; 9436 union COMMIT4res switch (nfsstat4 status) { 9437 case NFS4_OK: 9438 COMMIT4resok resok4; 9439 default: 9440 void; 9441 }; 9443 /* 9444 * CREATE: Create a non-regular file 9445 */ 9446 union createtype4 switch (nfs_ftype4 type) { 9447 case NF4LNK: 9448 linktext4 linkdata; 9449 case NF4BLK: 9450 case NF4CHR: 9451 specdata4 devdata; 9452 case NF4SOCK: 9454 case NF4FIFO: 9455 case NF4DIR: 9456 void; 9457 default: 9458 void; /* server should return NFS4ERR_BADTYPE */ 9459 }; 9461 struct CREATE4args { 9462 /* CURRENT_FH: directory for creation */ 9463 createtype4 objtype; 9464 component4 objname; 9465 fattr4 createattrs; 9466 }; 9468 struct CREATE4resok { 9469 change_info4 cinfo; 9470 bitmap4 attrset; /* attributes set */ 9471 }; 9473 union CREATE4res switch (nfsstat4 status) { 9474 case NFS4_OK: 9475 CREATE4resok resok4; 9476 default: 9477 void; 9478 }; 9480 /* 9481 * DELEGPURGE: Purge Delegations Awaiting Recovery 9482 */ 9483 struct DELEGPURGE4args { 9484 clientid4 clientid; 9485 }; 9487 struct DELEGPURGE4res { 9488 nfsstat4 status; 9489 }; 9491 /* 9492 * DELEGRETURN: Return a delegation 9493 */ 9494 struct 
DELEGRETURN4args { 9495 /* CURRENT_FH: delegated file */ 9496 stateid4 deleg_stateid; 9497 }; 9499 struct DELEGRETURN4res { 9500 nfsstat4 status; 9501 }; 9503 /* 9504 * GETATTR: Get file attributes 9505 */ 9507 struct GETATTR4args { 9508 /* CURRENT_FH: directory or file */ 9509 bitmap4 attr_request; 9510 }; 9512 struct GETATTR4resok { 9513 fattr4 obj_attributes; 9514 }; 9516 union GETATTR4res switch (nfsstat4 status) { 9517 case NFS4_OK: 9518 GETATTR4resok resok4; 9519 default: 9520 void; 9521 }; 9523 /* 9524 * GETFH: Get current filehandle 9525 */ 9526 struct GETFH4resok { 9527 nfs_fh4 object; 9528 }; 9530 union GETFH4res switch (nfsstat4 status) { 9531 case NFS4_OK: 9532 GETFH4resok resok4; 9533 default: 9534 void; 9535 }; 9537 /* 9538 * LINK: Create link to an object 9539 */ 9540 struct LINK4args { 9541 /* SAVED_FH: source object */ 9542 /* CURRENT_FH: target directory */ 9543 component4 newname; 9544 }; 9546 struct LINK4resok { 9547 change_info4 cinfo; 9548 }; 9550 union LINK4res switch (nfsstat4 status) { 9551 case NFS4_OK: 9552 LINK4resok resok4; 9553 default: 9554 void; 9555 }; 9557 /* 9558 * For LOCK, transition from open_owner to new lock_owner 9559 */ 9560 struct open_to_lock_owner4 { 9561 seqid4 open_seqid; 9562 stateid4 open_stateid; 9563 seqid4 lock_seqid; 9564 lock_owner4 lock_owner; 9565 }; 9567 /* 9568 * For LOCK, existing lock_owner continues to request file locks 9569 */ 9570 struct exist_lock_owner4 { 9571 stateid4 lock_stateid; 9572 seqid4 lock_seqid; 9573 }; 9575 union locker4 switch (bool new_lock_owner) { 9576 case TRUE: 9577 open_to_lock_owner4 open_owner; 9578 case FALSE: 9579 exist_lock_owner4 lock_owner; 9580 }; 9582 /* 9583 * LOCK/LOCKT/LOCKU: Record lock management 9584 */ 9585 struct LOCK4args { 9586 /* CURRENT_FH: file */ 9587 nfs_lock_type4 locktype; 9588 bool reclaim; 9589 offset4 offset; 9590 length4 length; 9591 locker4 locker; 9592 }; 9594 struct LOCK4denied { 9595 offset4 offset; 9596 length4 length; 9597 nfs_lock_type4 
                   locktype;
           lock_owner4 owner;
   };

   struct LOCK4resok {
           stateid4 lock_stateid;
   };

   union LOCK4res switch (nfsstat4 status) {
    case NFS4_OK:
           LOCK4resok resok4;
    case NFS4ERR_DENIED:
           LOCK4denied denied;
    default:
           void;
   };

   struct LOCKT4args {
           /* CURRENT_FH: file */
           nfs_lock_type4 locktype;
           offset4 offset;
           length4 length;
           lock_owner4 owner;
   };

   union LOCKT4res switch (nfsstat4 status) {
    case NFS4ERR_DENIED:
           LOCK4denied denied;
    case NFS4_OK:
           void;
    default:
           void;
   };

   struct LOCKU4args {
           /* CURRENT_FH: file */
           nfs_lock_type4 locktype;
           seqid4 seqid;
           stateid4 lock_stateid;
           offset4 offset;
           length4 length;
   };

   union LOCKU4res switch (nfsstat4 status) {
    case NFS4_OK:
           stateid4 lock_stateid;
    default:
           void;
   };

   /*
    * LOOKUP: Lookup filename
    */
   struct LOOKUP4args {
           /* CURRENT_FH: directory */
           component4 objname;
   };

   struct LOOKUP4res {
           /* CURRENT_FH: object */
           nfsstat4 status;
   };

   /*
    * LOOKUPP: Lookup parent directory
    */
   struct LOOKUPP4res {
           /* CURRENT_FH: directory */
           nfsstat4 status;
   };

   /*
    * NVERIFY: Verify attributes different
    */
   struct NVERIFY4args {
           /* CURRENT_FH: object */
           fattr4 obj_attributes;
   };

   struct NVERIFY4res {
           nfsstat4 status;
   };

   /*
    * Various definitions for OPEN
    */
   enum createmode4 {
           UNCHECKED4 = 0,
           GUARDED4 = 1,
           EXCLUSIVE4 = 2
   };

   union createhow4 switch (createmode4 mode) {
    case UNCHECKED4:
    case GUARDED4:
           fattr4 createattrs;
    case EXCLUSIVE4:
           verifier4 createverf;
   };

   enum opentype4 {
           OPEN4_NOCREATE = 0,
           OPEN4_CREATE = 1
   };

   union openflag4 switch (opentype4 opentype) {
    case OPEN4_CREATE:
           createhow4 how;
    default:
           void;
   };

   /* Next definitions used for OPEN delegation */
   enum limit_by4 {
           NFS_LIMIT_SIZE = 1,
           NFS_LIMIT_BLOCKS = 2
           /* others as needed */
   };

   struct nfs_modified_limit4 {
           uint32_t num_blocks;
           uint32_t bytes_per_block;
   };

   union nfs_space_limit4 switch (limit_by4 limitby) {
    /* limit specified as file size */
    case NFS_LIMIT_SIZE:
           uint64_t filesize;
    /* limit specified by number of blocks */
    case NFS_LIMIT_BLOCKS:
           nfs_modified_limit4 mod_blocks;
   };

   /*
    * Share Access and Deny constants for open argument
    */
   const OPEN4_SHARE_ACCESS_READ  = 0x00000001;
   const OPEN4_SHARE_ACCESS_WRITE = 0x00000002;
   const OPEN4_SHARE_ACCESS_BOTH  = 0x00000003;

   const OPEN4_SHARE_DENY_NONE    = 0x00000000;
   const OPEN4_SHARE_DENY_READ    = 0x00000001;
   const OPEN4_SHARE_DENY_WRITE   = 0x00000002;
   const OPEN4_SHARE_DENY_BOTH    = 0x00000003;

   enum open_delegation_type4 {
           OPEN_DELEGATE_NONE = 0,
           OPEN_DELEGATE_READ = 1,
           OPEN_DELEGATE_WRITE = 2
   };

   enum open_claim_type4 {
           CLAIM_NULL = 0,
           CLAIM_PREVIOUS = 1,
           CLAIM_DELEGATE_CUR = 2,
           CLAIM_DELEGATE_PREV = 3
   };

   struct open_claim_delegate_cur4 {
           stateid4 delegate_stateid;
           component4 file;
   };

   union open_claim4 switch (open_claim_type4 claim) {
    /*
     * No special rights to file.  Ordinary OPEN of the specified file.
     */
    case CLAIM_NULL:
           /* CURRENT_FH: directory */
           component4 file;

    /*
     * Right to the file established by an open previous to server
     * reboot.  File identified by filehandle obtained at that time
     * rather than by name.
     */
    case CLAIM_PREVIOUS:
           /* CURRENT_FH: file being reclaimed */
           open_delegation_type4 delegate_type;

    /*
     * Right to file based on a delegation granted by the server.
     * File is specified by name.
     */
    case CLAIM_DELEGATE_CUR:
           /* CURRENT_FH: directory */
           open_claim_delegate_cur4 delegate_cur_info;

    /*
     * Right to file based on a delegation granted to a previous boot
     * instance of the client.  File is specified by name.
     */
    case CLAIM_DELEGATE_PREV:
           /* CURRENT_FH: directory */
           component4 file_delegate_prev;
   };

   /*
    * OPEN: Open a file, potentially receiving an open delegation
    */
   struct OPEN4args {
           seqid4 seqid;
           uint32_t share_access;
           uint32_t share_deny;
           open_owner4 owner;
           openflag4 openhow;
           open_claim4 claim;
   };

   struct open_read_delegation4 {
           stateid4 stateid;       /* Stateid for delegation */
           bool recall;            /* Pre-recalled flag for
                                      delegations obtained
                                      by reclaim
                                      (CLAIM_PREVIOUS) */
           nfsace4 permissions;    /* Defines users who don't
                                      need an ACCESS call to
                                      open for read */
   };

   struct open_write_delegation4 {
           stateid4 stateid;       /* Stateid for delegation */
           bool recall;            /* Pre-recalled flag for
                                      delegations obtained
                                      by reclaim
                                      (CLAIM_PREVIOUS) */
           nfs_space_limit4 space_limit;
                                   /* Defines condition that
                                      the client must check to
                                      determine whether the
                                      file needs to be flushed
                                      to the server on close.
                                      */
           nfsace4 permissions;    /* Defines users who don't
                                      need an ACCESS call as
                                      part of a delegated
                                      open. */
   };

   union open_delegation4
   switch (open_delegation_type4 delegation_type) {
    case OPEN_DELEGATE_NONE:
           void;
    case OPEN_DELEGATE_READ:
           open_read_delegation4 read;
    case OPEN_DELEGATE_WRITE:
           open_write_delegation4 write;
   };

   /*
    * Result flags
    */
   /* Mandatory locking is in effect for this file.
   */
   const OPEN4_RESULT_MLOCK          = 0x00000001;
   /* Client must confirm open */
   const OPEN4_RESULT_CONFIRM        = 0x00000002;
   /* Type of file locking behavior at the server */
   const OPEN4_RESULT_LOCKTYPE_POSIX = 0x00000004;

   struct OPEN4resok {
           stateid4 stateid;       /* Stateid for open */
           change_info4 cinfo;     /* Directory Change Info */
           uint32_t rflags;        /* Result flags */
           bitmap4 attrset;        /* attribute set for create */
           open_delegation4 delegation;
                                   /* Info on any open
                                      delegation */
   };

   union OPEN4res switch (nfsstat4 status) {
    case NFS4_OK:
           /* CURRENT_FH: opened file */
           OPEN4resok resok4;
    default:
           void;
   };

   /*
    * OPENATTR: open named attributes directory
    */
   struct OPENATTR4args {
           /* CURRENT_FH: object */
           bool createdir;
   };

   struct OPENATTR4res {
           /* CURRENT_FH: named attr directory */
           nfsstat4 status;
   };

   /*
    * OPEN_CONFIRM: confirm the open
    */
   struct OPEN_CONFIRM4args {
           /* CURRENT_FH: opened file */
           stateid4 open_stateid;
           seqid4 seqid;
   };

   struct OPEN_CONFIRM4resok {
           stateid4 open_stateid;
   };

   union OPEN_CONFIRM4res switch (nfsstat4 status) {
    case NFS4_OK:
           OPEN_CONFIRM4resok resok4;
    default:
           void;
   };

   /*
    * OPEN_DOWNGRADE: downgrade the access/deny for a file
    */
   struct OPEN_DOWNGRADE4args {
           /* CURRENT_FH: opened file */
           stateid4 open_stateid;
           seqid4 seqid;
           uint32_t share_access;
           uint32_t share_deny;
   };

   struct OPEN_DOWNGRADE4resok {
           stateid4 open_stateid;
   };

   union OPEN_DOWNGRADE4res switch (nfsstat4 status) {
    case NFS4_OK:
           OPEN_DOWNGRADE4resok resok4;
    default:
           void;
   };

   /*
    * PUTFH: Set current filehandle
    */
   struct PUTFH4args {
           nfs_fh4 object;
   };

   struct PUTFH4res {
           /* CURRENT_FH: */
           nfsstat4 status;
   };

   /*
    *
      PUTPUBFH: Set public filehandle
    */
   struct PUTPUBFH4res {
           /* CURRENT_FH: public fh */
           nfsstat4 status;
   };

   /*
    * PUTROOTFH: Set root filehandle
    */
   struct PUTROOTFH4res {
           /* CURRENT_FH: root fh */
           nfsstat4 status;
   };

   /*
    * READ: Read from file
    */
   struct READ4args {
           /* CURRENT_FH: file */
           stateid4 stateid;
           offset4 offset;
           count4 count;
   };

   struct READ4resok {
           bool eof;
           opaque data<>;
   };

   union READ4res switch (nfsstat4 status) {
    case NFS4_OK:
           READ4resok resok4;
    default:
           void;
   };

   /*
    * READDIR: Read directory
    */
   struct READDIR4args {
           /* CURRENT_FH: directory */
           nfs_cookie4 cookie;
           verifier4 cookieverf;
           count4 dircount;
           count4 maxcount;
           bitmap4 attr_request;
   };

   struct entry4 {
           nfs_cookie4 cookie;
           component4 name;
           fattr4 attrs;
           entry4 *nextentry;
   };

   struct dirlist4 {
           entry4 *entries;
           bool eof;
   };

   struct READDIR4resok {
           verifier4 cookieverf;
           dirlist4 reply;
   };

   union READDIR4res switch (nfsstat4 status) {
    case NFS4_OK:
           READDIR4resok resok4;
    default:
           void;
   };

   /*
    * READLINK: Read symbolic link
    */
   struct READLINK4resok {
           linktext4 link;
   };

   union READLINK4res switch (nfsstat4 status) {
    case NFS4_OK:
           READLINK4resok resok4;
    default:
           void;
   };

   /*
    * REMOVE: Remove filesystem object
    */
   struct REMOVE4args {
           /* CURRENT_FH: directory */
           component4 target;
   };

   struct REMOVE4resok {
           change_info4 cinfo;
   };

   union REMOVE4res switch (nfsstat4 status) {
    case NFS4_OK:
           REMOVE4resok resok4;
    default:
           void;
   };

   /*
    * RENAME: Rename directory entry
    */
   struct RENAME4args {
           /* SAVED_FH: source
              directory */
           component4 oldname;
           /* CURRENT_FH: target directory */
           component4 newname;
   };

   struct RENAME4resok {
           change_info4 source_cinfo;
           change_info4 target_cinfo;
   };

   union RENAME4res switch (nfsstat4 status) {
    case NFS4_OK:
           RENAME4resok resok4;
    default:
           void;
   };

   /*
    * RENEW: Renew a Lease
    */
   struct RENEW4args {
           clientid4 clientid;
   };

   struct RENEW4res {
           nfsstat4 status;
   };

   /*
    * RESTOREFH: Restore saved filehandle
    */
   struct RESTOREFH4res {
           /* CURRENT_FH: value of saved fh */
           nfsstat4 status;
   };

   /*
    * SAVEFH: Save current filehandle
    */
   struct SAVEFH4res {
           /* SAVED_FH: value of current fh */
           nfsstat4 status;
   };

   /*
    * SECINFO: Obtain Available Security Mechanisms
    */
   struct SECINFO4args {
           /* CURRENT_FH: directory */
           component4 name;
   };

   /*
    * From RFC 2203
    */
   enum rpc_gss_svc_t {
           RPC_GSS_SVC_NONE = 1,
           RPC_GSS_SVC_INTEGRITY = 2,
           RPC_GSS_SVC_PRIVACY = 3
   };

   struct rpcsec_gss_info {
           sec_oid4 oid;
           qop4 qop;
           rpc_gss_svc_t service;
   };

   /* RPCSEC_GSS has a value of '6' - See RFC 2203 */
   union secinfo4 switch (uint32_t flavor) {
    case RPCSEC_GSS:
           rpcsec_gss_info flavor_info;
    default:
           void;
   };

   typedef secinfo4 SECINFO4resok<>;

   union SECINFO4res switch (nfsstat4 status) {
    case NFS4_OK:
           SECINFO4resok resok4;
    default:
           void;
   };

   /*
    * SETATTR: Set attributes
    */
   struct SETATTR4args {
           /* CURRENT_FH: target object */
           stateid4 stateid;
           fattr4 obj_attributes;
   };

   struct SETATTR4res {
           nfsstat4 status;
           bitmap4 attrsset;
   };

   /*
    * SETCLIENTID
    */
   struct SETCLIENTID4args {
           nfs_client_id4 client;
           cb_client4 callback;
           uint32_t callback_ident;
   };

   struct SETCLIENTID4resok {
           clientid4 clientid;
   };

   union SETCLIENTID4res switch (nfsstat4 status) {
    case NFS4_OK:
           SETCLIENTID4resok resok4;
    case NFS4ERR_CLID_INUSE:
           clientaddr4 client_using;
    default:
           void;
   };

   struct SETCLIENTID_CONFIRM4args {
           clientid4 clientid;
   };

   struct SETCLIENTID_CONFIRM4res {
           nfsstat4 status;
   };

   /*
    * VERIFY: Verify attributes same
    */
   struct VERIFY4args {
           /* CURRENT_FH: object */
           fattr4 obj_attributes;
   };

   struct VERIFY4res {
           nfsstat4 status;
   };

   /*
    * WRITE: Write to file
    */
   enum stable_how4 {
           UNSTABLE4 = 0,
           DATA_SYNC4 = 1,
           FILE_SYNC4 = 2
   };

   struct WRITE4args {
           /* CURRENT_FH: file */
           stateid4 stateid;
           offset4 offset;
           stable_how4 stable;
           opaque data<>;
   };

   struct WRITE4resok {
           count4 count;
           stable_how4 committed;
           verifier4 writeverf;
   };

   union WRITE4res switch (nfsstat4 status) {
    case NFS4_OK:
           WRITE4resok resok4;
    default:
           void;
   };

   /*
    * RELEASE_LOCKOWNER: Notify server to release lockowner
    */
   struct RELEASE_LOCKOWNER4args {
           lock_owner4 lock_owner;
   };

   struct RELEASE_LOCKOWNER4res {
           nfsstat4 status;
   };

   /*
    * Operation arrays
    */
   enum nfs_opnum4 {
           OP_ACCESS = 3,
           OP_CLOSE = 4,
           OP_COMMIT = 5,
           OP_CREATE = 6,
           OP_DELEGPURGE = 7,
           OP_DELEGRETURN = 8,
           OP_GETATTR = 9,
           OP_GETFH = 10,
           OP_LINK = 11,
           OP_LOCK = 12,
           OP_LOCKT = 13,
           OP_LOCKU = 14,
           OP_LOOKUP = 15,
           OP_LOOKUPP = 16,
           OP_NVERIFY = 17,
           OP_OPEN = 18,
           OP_OPENATTR = 19,
           OP_OPEN_CONFIRM = 20,
           OP_OPEN_DOWNGRADE = 21,
           OP_PUTFH = 22,
           OP_PUTPUBFH = 23,
           OP_PUTROOTFH = 24,
           OP_READ = 25,
           OP_READDIR = 26,
           OP_READLINK = 27,
           OP_REMOVE = 28,
           OP_RENAME = 29,
           OP_RENEW = 30,
           OP_RESTOREFH = 31,
           OP_SAVEFH = 32,
           OP_SECINFO = 33,
           OP_SETATTR = 34,
           OP_SETCLIENTID = 35,
           OP_SETCLIENTID_CONFIRM = 36,
           OP_VERIFY = 37,
           OP_WRITE = 38,
           OP_RELEASE_LOCKOWNER = 39
   };

   union nfs_argop4 switch (nfs_opnum4 argop) {
    case OP_ACCESS:         ACCESS4args opaccess;
    case OP_CLOSE:          CLOSE4args opclose;
    case OP_COMMIT:         COMMIT4args opcommit;
    case OP_CREATE:         CREATE4args opcreate;
    case OP_DELEGPURGE:     DELEGPURGE4args opdelegpurge;
    case OP_DELEGRETURN:    DELEGRETURN4args opdelegreturn;
    case OP_GETATTR:        GETATTR4args opgetattr;
    case OP_GETFH:          void;
    case OP_LINK:           LINK4args oplink;
    case OP_LOCK:           LOCK4args oplock;
    case OP_LOCKT:          LOCKT4args oplockt;
    case OP_LOCKU:          LOCKU4args oplocku;
    case OP_LOOKUP:         LOOKUP4args oplookup;
    case OP_LOOKUPP:        void;
    case OP_NVERIFY:        NVERIFY4args opnverify;
    case OP_OPEN:           OPEN4args opopen;
    case OP_OPENATTR:       OPENATTR4args opopenattr;
    case OP_OPEN_CONFIRM:   OPEN_CONFIRM4args opopen_confirm;
    case OP_OPEN_DOWNGRADE: OPEN_DOWNGRADE4args opopen_downgrade;
    case OP_PUTFH:          PUTFH4args opputfh;
    case OP_PUTPUBFH:       void;
    case OP_PUTROOTFH:      void;
    case OP_READ:           READ4args opread;
    case OP_READDIR:        READDIR4args opreaddir;
    case OP_READLINK:       void;
    case OP_REMOVE:         REMOVE4args opremove;
    case OP_RENAME:         RENAME4args oprename;
    case OP_RENEW:          RENEW4args oprenew;
    case OP_RESTOREFH:      void;
    case OP_SAVEFH:         void;
    case OP_SECINFO:        SECINFO4args opsecinfo;
    case OP_SETATTR:        SETATTR4args opsetattr;
    case OP_SETCLIENTID:    SETCLIENTID4args opsetclientid;
    case OP_SETCLIENTID_CONFIRM: SETCLIENTID_CONFIRM4args
                                    opsetclientid_confirm;
    case OP_VERIFY:         VERIFY4args opverify;
    case OP_WRITE:          WRITE4args opwrite;
    case OP_RELEASE_LOCKOWNER: RELEASE_LOCKOWNER4args
                                    oprelease_lockowner;
   };

   union nfs_resop4 switch (nfs_opnum4 resop) {
    case OP_ACCESS:         ACCESS4res opaccess;
    case OP_CLOSE:          CLOSE4res opclose;
    case OP_COMMIT:         COMMIT4res opcommit;
    case OP_CREATE:         CREATE4res opcreate;
    case OP_DELEGPURGE:     DELEGPURGE4res opdelegpurge;
    case OP_DELEGRETURN:    DELEGRETURN4res opdelegreturn;
    case OP_GETATTR:        GETATTR4res opgetattr;
    case OP_GETFH:          GETFH4res opgetfh;
    case OP_LINK:           LINK4res oplink;
    case OP_LOCK:           LOCK4res oplock;
    case OP_LOCKT:          LOCKT4res oplockt;
    case OP_LOCKU:          LOCKU4res oplocku;
    case OP_LOOKUP:         LOOKUP4res oplookup;
    case OP_LOOKUPP:        LOOKUPP4res oplookupp;
    case OP_NVERIFY:        NVERIFY4res opnverify;
    case OP_OPEN:           OPEN4res opopen;
    case OP_OPENATTR:       OPENATTR4res opopenattr;
    case OP_OPEN_CONFIRM:   OPEN_CONFIRM4res opopen_confirm;
    case OP_OPEN_DOWNGRADE: OPEN_DOWNGRADE4res opopen_downgrade;
    case OP_PUTFH:          PUTFH4res opputfh;
    case OP_PUTPUBFH:       PUTPUBFH4res opputpubfh;
    case OP_PUTROOTFH:      PUTROOTFH4res opputrootfh;
    case OP_READ:           READ4res opread;
    case OP_READDIR:        READDIR4res opreaddir;
    case OP_READLINK:       READLINK4res opreadlink;
    case OP_REMOVE:         REMOVE4res opremove;
    case OP_RENAME:         RENAME4res oprename;
    case OP_RENEW:          RENEW4res oprenew;
    case OP_RESTOREFH:      RESTOREFH4res oprestorefh;
    case OP_SAVEFH:         SAVEFH4res opsavefh;
    case OP_SECINFO:        SECINFO4res opsecinfo;
    case OP_SETATTR:        SETATTR4res opsetattr;
    case OP_SETCLIENTID:    SETCLIENTID4res opsetclientid;
    case OP_SETCLIENTID_CONFIRM: SETCLIENTID_CONFIRM4res
                                    opsetclientid_confirm;
    case OP_VERIFY:         VERIFY4res opverify;
    case OP_WRITE:          WRITE4res opwrite;
    case OP_RELEASE_LOCKOWNER: RELEASE_LOCKOWNER4res
                                    oprelease_lockowner;
   };

   struct COMPOUND4args {
           utf8string tag;
           uint32_t minorversion;
           nfs_argop4 argarray<>;
   };

   struct COMPOUND4res {
           nfsstat4 status;
           utf8string tag;
           nfs_resop4 resarray<>;
   };

   /*
    * Remote file service routines
    */
   program NFS4_PROGRAM {
           version NFS_V4 {
                   void
                           NFSPROC4_NULL(void) = 0;

                   COMPOUND4res
                           NFSPROC4_COMPOUND(COMPOUND4args) = 1;

           } = 4;
   } = 100003;

   /*
    * NFS4 Callback Procedure Definitions and Program
    */

   /*
    * CB_GETATTR: Get Current Attributes
    */
   struct CB_GETATTR4args {
           nfs_fh4 fh;
           bitmap4 attr_request;
   };

   struct CB_GETATTR4resok {
           fattr4 obj_attributes;
   };

   union CB_GETATTR4res switch (nfsstat4 status) {
    case NFS4_OK:
           CB_GETATTR4resok resok4;
    default:
           void;
   };

   /*
    * CB_RECALL: Recall an Open Delegation
    */
   struct CB_RECALL4args {
           stateid4 stateid;
           bool truncate;
           nfs_fh4 fh;
   };

   struct CB_RECALL4res {
           nfsstat4 status;
   };

   /*
    * Various definitions for CB_COMPOUND
    */
   enum nfs_cb_opnum4 {
           OP_CB_GETATTR = 3,
           OP_CB_RECALL = 4
   };

   union nfs_cb_argop4 switch (unsigned argop) {
    case OP_CB_GETATTR: CB_GETATTR4args opcbgetattr;
    case OP_CB_RECALL:  CB_RECALL4args opcbrecall;
   };

   union nfs_cb_resop4 switch (unsigned resop) {
    case OP_CB_GETATTR: CB_GETATTR4res opcbgetattr;
    case OP_CB_RECALL:  CB_RECALL4res opcbrecall;
   };

   struct CB_COMPOUND4args {
           utf8string tag;
           uint32_t minorversion;
           uint32_t callback_ident;
           nfs_cb_argop4 argarray<>;
   };

   struct CB_COMPOUND4res {
           nfsstat4 status;
           utf8string tag;
           nfs_cb_resop4 resarray<>;
   };

   /*
    * Program number is in the transient range since the client
    * will assign the exact transient
    * program number and provide
    * that to the server via the SETCLIENTID operation.
    */
   program NFS4_CALLBACK {
           version NFS_CB {
                   void
                           CB_NULL(void) = 0;
                   CB_COMPOUND4res
                           CB_COMPOUND(CB_COMPOUND4args) = 1;
           } = 1;
   } = 0x40000000;

19.  Bibliography

   [Floyd]
   S. Floyd, V. Jacobson, "The Synchronization of Periodic Routing
   Messages," IEEE/ACM Transactions on Networking, 2(2), pp. 122-136,
   April 1994.

   [Gray]
   C. Gray, D. Cheriton, "Leases: An Efficient Fault-Tolerant Mechanism
   for Distributed File Cache Consistency," Proceedings of the Twelfth
   Symposium on Operating Systems Principles, pp. 202-210, December 1989.

   [ISO10646]
   "ISO/IEC 10646-1:1993.  International Standard -- Information
   technology -- Universal Multiple-Octet Coded Character Set (UCS) --
   Part 1: Architecture and Basic Multilingual Plane."

   [Juszczak]
   Juszczak, Chet, "Improving the Performance and Correctness of an NFS
   Server," USENIX Conference Proceedings, USENIX Association, Berkeley,
   CA, June 1990, pages 53-63.  Describes reply cache implementation
   that avoids work in the server by handling duplicate requests.  More
   important, though listed as a side-effect, the reply cache aids in
   the avoidance of destructive non-idempotent operation re-application
   -- improving correctness.

   [Kazar]
   Kazar, Michael Leon, "Synchronization and Caching Issues in the
   Andrew File System," USENIX Conference Proceedings, USENIX
   Association, Berkeley, CA, Dallas Winter 1988, pages 27-36.  A
   description of the cache consistency scheme in AFS.  Contrasted with
   other distributed file systems.

   [Macklem]
   Macklem, Rick, "Lessons Learned Tuning the 4.3BSD Reno Implementation
   of the NFS Protocol," Winter USENIX Conference Proceedings, USENIX
   Association, Berkeley, CA, January 1991.
   Describes performance work
   in tuning the 4.3BSD Reno NFS implementation.  Describes performance
   improvement (reduced CPU loading) through elimination of data copies.

   [Mogul]
   Mogul, Jeffrey C., "A Recovery Protocol for Spritely NFS," USENIX
   File System Workshop Proceedings, Ann Arbor, MI, USENIX Association,
   Berkeley, CA, May 1992.  Second paper on Spritely NFS proposes a
   lease-based scheme for recovering state of consistency protocol.

   [Nowicki]
   Nowicki, Bill, "Transport Issues in the Network File System," ACM
   SIGCOMM newsletter Computer Communication Review, April 1989.  A
   brief description of the basis for the dynamic retransmission work.

   [Pawlowski]
   Pawlowski, Brian, Ron Hixon, Mark Stein, Joseph Tumminaro, "Network
   Computing in the UNIX and IBM Mainframe Environment," Uniforum '89
   Conf. Proc., (1989).  Description of an NFS server implementation for
   IBM's MVS operating system.

   [RFC1094]
   Sun Microsystems, Inc., "NFS: Network File System Protocol
   Specification", RFC1094, March 1989.

   http://www.ietf.org/rfc/rfc1094.txt

   [RFC1345]
   Simonsen, K., "Character Mnemonics & Character Sets", RFC1345,
   Rationel Almen Planlaegning, June 1992.

   http://www.ietf.org/rfc/rfc1345.txt

   [RFC1700]
   Reynolds, J., Postel, J., "Assigned Numbers", RFC1700, ISI, October
   1994.

   http://www.ietf.org/rfc/rfc1700.txt

   [RFC1813]
   Callaghan, B., Pawlowski, B., Staubach, P., "NFS Version 3 Protocol
   Specification", RFC1813, Sun Microsystems, Inc., June 1995.

   http://www.ietf.org/rfc/rfc1813.txt

   [RFC1831]
   Srinivasan, R., "RPC: Remote Procedure Call Protocol Specification
   Version 2", RFC1831, Sun Microsystems, Inc., August 1995.

   http://www.ietf.org/rfc/rfc1831.txt

   [RFC1832]
   Srinivasan, R., "XDR: External Data Representation Standard",
   RFC1832, Sun Microsystems, Inc., August 1995.

   http://www.ietf.org/rfc/rfc1832.txt

   [RFC1833]
   Srinivasan, R., "Binding Protocols for ONC RPC Version 2", RFC1833,
   Sun Microsystems, Inc., August 1995.

   http://www.ietf.org/rfc/rfc1833.txt

   [RFC2025]
   Adams, C., "The Simple Public-Key GSS-API Mechanism (SPKM)", RFC2025,
   Bell-Northern Research, October 1996.

   http://www.ietf.org/rfc/rfc2025.txt

   [RFC2054]
   Callaghan, B., "WebNFS Client Specification", RFC2054, Sun
   Microsystems, Inc., October 1996.

   http://www.ietf.org/rfc/rfc2054.txt

   [RFC2055]
   Callaghan, B., "WebNFS Server Specification", RFC2055, Sun
   Microsystems, Inc., October 1996.

   http://www.ietf.org/rfc/rfc2055.txt

   [RFC2078]
   Linn, J., "Generic Security Service Application Program Interface,
   Version 2", RFC2078, OpenVision Technologies, January 1997.

   http://www.ietf.org/rfc/rfc2078.txt

   [RFC2152]
   Goldsmith, D., "UTF-7 A Mail-Safe Transformation Format of Unicode",
   RFC2152, Apple Computer, Inc., May 1997.

   http://www.ietf.org/rfc/rfc2152.txt

   [RFC2203]
   Eisler, M., Chiu, A., Ling, L., "RPCSEC_GSS Protocol Specification",
   RFC2203, Sun Microsystems, Inc., September 1997.

   http://www.ietf.org/rfc/rfc2203.txt

   [RFC2277]
   Alvestrand, H., "IETF Policy on Character Sets and Languages",
   RFC2277, UNINETT, January 1998.

   http://www.ietf.org/rfc/rfc2277.txt

   [RFC2279]
   Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC2279,
   Alis Technologies, January 1998.

   http://www.ietf.org/rfc/rfc2279.txt

   [RFC2623]
   Eisler, M., "NFS Version 2 and Version 3 Security Issues and the NFS
   Protocol's Use of RPCSEC_GSS and Kerberos V5", RFC2623, Sun
   Microsystems, June 1999.

   http://www.ietf.org/rfc/rfc2623.txt

   [RFC2624]
   Shepler, S., "NFS Version 4 Design Considerations", RFC2624, Sun
   Microsystems, June 1999.

   http://www.ietf.org/rfc/rfc2624.txt

   [RFC2847]
   Eisler, M., "LIPKEY - A Low Infrastructure Public Key Mechanism Using
   SPKM", RFC2847, Sun Microsystems, June 2000.

   http://www.ietf.org/rfc/rfc2847.txt

   [Sandberg]
   Sandberg, R., D. Goldberg, S. Kleiman, D. Walsh, B. Lyon, "Design
   and Implementation of the Sun Network Filesystem," USENIX Conference
   Proceedings, USENIX Association, Berkeley, CA, Summer 1985.  The
   basic paper describing the SunOS implementation of the NFS version 2
   protocol; it discusses the goals, protocol specification and
   trade-offs.

   [Srinivasan]
   Srinivasan, V., Jeffrey C. Mogul, "Spritely NFS: Implementation and
   Performance of Cache Consistency Protocols", WRL Research Report
   89/5, Digital Equipment Corporation Western Research Laboratory, 100
   Hamilton Ave., Palo Alto, CA, 94301, May 1989.  This paper analyzes
   the effect of applying a Sprite-like consistency protocol to
   standard NFS.  The issues of recovery in a stateful environment are
   covered in [Mogul].

   [Unicode1]
   The Unicode Consortium, "The Unicode Standard, Version 3.0",
   Addison-Wesley Developers Press, Reading, MA, 2000.  ISBN
   0-201-61633-5.

   More information available at: http://www.unicode.org/

   [Unicode2]
   "Unsupported Scripts" Unicode, Inc., The Unicode Consortium, P.O.
   Box 700519, San Jose, CA 95710-0519 USA, September 1999.

   http://www.unicode.org/unicode/standard/unsupported.html

   [XNFS]
   The Open Group, Protocols for Interworking: XNFS, Version 3W, The
   Open Group, 1010 El Camino Real Suite 380, Menlo Park, CA 94025, ISBN
   1-85912-184-5, February 1998.

   HTML version available: http://www.opengroup.org

20.  Authors

20.1.  Editor's Address

   Spencer Shepler
   Sun Microsystems, Inc.
   7808 Moonflower Drive
   Austin, Texas 78750

   Phone: +1 512-349-9376
   E-mail: spencer.shepler@sun.com

20.2.  Authors' Addresses

   Carl Beame
   Hummingbird Ltd.

   E-mail: beame@bws.com

   Brent Callaghan
   Sun Microsystems, Inc.
   901 San Antonio Road
   Palo Alto, CA 94303

   Phone: +1 650-786-5067
   E-mail: brent.callaghan@sun.com

   Mike Eisler
   5565 Wilson Road
   Colorado Springs, CO 80919

   Phone: +1 719-599-9026
   E-mail: mike@eisler.com

   David Noveck
   Network Appliance
   375 Totten Pond Road
   Waltham, MA 02451

   Phone: +1 781-895-4949
   E-mail: dnoveck@netapp.com

   David Robinson
   Sun Microsystems, Inc.
   901 San Antonio Road
   Palo Alto, CA 94303

   Phone: +1 650-786-5088
   E-mail: david.robinson@sun.com

   Robert Thurlow
   Sun Microsystems, Inc.
   901 San Antonio Road
   Palo Alto, CA 94303

   Phone: +1 650-786-5096
   E-mail: robert.thurlow@sun.com

20.3.  Acknowledgements

   The author thanks and acknowledges:

   Neil Brown for his extensive review of, and comments on, various
   drafts.

21.  Full Copyright Statement

   "Copyright (C) The Internet Society (2000-2002).  All Rights
   Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."