NFS version 4 Working Group                                   S. Shepler
INTERNET-DRAFT                                    Sun Microsystems, Inc.
Obsoletes: 3010                                                 C. Beame
Document: draft-ietf-nfsv4-rfc3010bis-03.txt            Hummingbird Ltd.
                                                            B. Callaghan
                                                  Sun Microsystems, Inc.
                                                               M. Eisler
                                                 Network Appliance, Inc.
                                                               D. Noveck
                                                 Network Appliance, Inc.
                                                             D. Robinson
                                                  Sun Microsystems, Inc.
                                                              R. Thurlow
                                                  Sun Microsystems, Inc.
                                                          September 2002

                         NFS version 4 Protocol

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   NFS version 4 is a distributed filesystem protocol which owes
   heritage to NFS protocol versions 2 [RFC1094] and 3 [RFC1813].

Draft Specification      NFS version 4 Protocol           September 2002

   Unlike earlier versions, the NFS version 4 protocol supports
   traditional file access while integrating support for file locking
   and the mount protocol. In addition, support for strong security
   (and its negotiation), compound operations, client caching, and
   internationalization has been added. Of course, attention has been
   applied to making NFS version 4 operate well in an Internet
   environment.
Copyright

   Copyright (C) The Internet Society (2000-2002). All Rights Reserved.

Key Words

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Table of Contents

   1. Introduction
   1.1. Inconsistencies of this Document with Section 18
   1.2. Overview of NFS version 4 Features
   1.2.1. RPC and Security
   1.2.2. Procedure and Operation Structure
   1.2.3. Filesystem Model
   1.2.3.1. Filehandle Types
   1.2.3.2. Attribute Types
   1.2.3.3. Filesystem Replication and Migration
   1.2.4. OPEN and CLOSE
   1.2.5. File locking
   1.2.6. Client Caching and Delegation
   1.3. General Definitions
   2. Protocol Data Types
   2.1. Basic Data Types
   2.2. Structured Data Types
   3. RPC and Security Flavor
   3.1. Ports and Transports
   3.1.1. Client Retransmission Behavior
   3.2. Security Flavors
   3.2.1. Security mechanisms for NFS version 4
   3.2.1.1. Kerberos V5 as a security triple
   3.2.1.2. LIPKEY as a security triple
   3.2.1.3. SPKM-3 as a security triple
   3.3. Security Negotiation
   3.3.1. SECINFO
   3.3.2. Security Error
   3.4. Callback RPC Authentication
   4. Filehandles
   4.1. Obtaining the First Filehandle
   4.1.1. Root Filehandle
   4.1.2. Public Filehandle
   4.2. Filehandle Types
   4.2.1. General Properties of a Filehandle
   4.2.2. Persistent Filehandle
   4.2.3. Volatile Filehandle
   4.2.4. One Method of Constructing a Volatile Filehandle
   4.3. Client Recovery from Filehandle Expiration
   5. File Attributes
   5.1. Mandatory Attributes
   5.2. Recommended Attributes
   5.3. Named Attributes
   5.4. Classification of Attributes
   5.5. Mandatory Attributes - Definitions
   5.6. Recommended Attributes - Definitions
   5.7. Time Access
   5.8. Interpreting owner and owner_group
   5.9. Character Case Attributes
   5.10. Quota Attributes
   5.11. Access Control Lists
   5.11.1. ACE type
   5.11.2. ACE Access Mask
   5.11.3. ACE flag
   5.11.4. ACE who
   5.11.5. Mode Attribute
   5.11.6. Mode and ACL Attribute
   5.11.7. mounted_on_fileid
   6. Filesystem Migration and Replication
   6.1. Replication
   6.2. Migration
   6.3. Interpretation of the fs_locations Attribute
   6.4. Filehandle Recovery for Migration or Replication
   7. NFS Server Name Space
   7.1. Server Exports
   7.2. Browsing Exports
   7.3. Server Pseudo Filesystem
   7.4. Multiple Roots
   7.5. Filehandle Volatility
   7.6. Exported Root
   7.7. Mount Point Crossing
   7.8. Security Policy and Name Space Presentation
   8. File Locking and Share Reservations
   8.1. Locking
   8.1.1. Client ID
   8.1.2. Server Release of Clientid
   8.1.3. lock_owner and stateid Definition
   8.1.4. Use of the stateid and Locking
   8.1.5. Sequencing of Lock Requests
   8.1.6. Recovery from Replayed Requests
   8.1.7. Releasing lock_owner State
   8.1.8. Use of Open Confirmation
   8.2. Lock Ranges
   8.3. Upgrading and Downgrading Locks
   8.4. Blocking Locks
   8.5. Lease Renewal
   8.6. Crash Recovery
   8.6.1. Client Failure and Recovery
   8.6.2. Server Failure and Recovery
   8.6.3. Network Partitions and Recovery
   8.7. Recovery from a Lock Request Timeout or Abort
   8.8. Server Revocation of Locks
   8.9. Share Reservations
   8.10. OPEN/CLOSE Operations
   8.10.1. Close and Retention of State Information
   8.11. Open Upgrade and Downgrade
   8.12. Short and Long Leases
   8.13. Clocks, Propagation Delay, and Calculating Lease Expiration
   8.14. Migration, Replication and State
   8.14.1. Migration and State
   8.14.2. Replication and State
   8.14.3. Notification of Migrated Lease
   8.14.4. Migration and the Lease_time Attribute
   9. Client-Side Caching
   9.1. Performance Challenges for Client-Side Caching
   9.2. Delegation and Callbacks
   9.2.1. Delegation Recovery
   9.3. Data Caching
   9.3.1. Data Caching and OPENs
   9.3.2. Data Caching and File Locking
   9.3.3. Data Caching and Mandatory File Locking
   9.3.4. Data Caching and File Identity
   9.4. Open Delegation
   9.4.1. Open Delegation and Data Caching
   9.4.2. Open Delegation and File Locks
   9.4.3. Handling of CB_GETATTR
   9.4.4. Recall of Open Delegation
   9.4.5. Clients that Fail to Honor Delegation Recalls
   9.4.6. Delegation Revocation
   9.5. Data Caching and Revocation
   9.5.1. Revocation Recovery for Write Open Delegation
   9.6. Attribute Caching
   9.7. Data and Metadata Caching and Memory Mapped Files
   9.8. Name Caching
   9.9. Directory Caching
   10. Minor Versioning
   11. Internationalization
   11.1. Universal Versus Local Character Sets
   11.2. Overview of Universal Character Set Standards
   11.3. Difficulties with UCS-4, UCS-2, Unicode
   11.4. UTF-8 and its solutions
   11.5. Normalization
   11.6. UTF-8 Related Errors
   12. Error Definitions
   13. NFS version 4 Requests
   13.1. Compound Procedure
   13.2. Evaluation of a Compound Request
   13.3. Synchronous Modifying Operations
   13.4. Operation Values
   14. NFS version 4 Procedures
   14.1. Procedure 0: NULL - No Operation
   14.2. Procedure 1: COMPOUND - Compound Operations
   14.2.1. Operation 3: ACCESS - Check Access Rights
   14.2.2. Operation 4: CLOSE - Close File
   14.2.3. Operation 5: COMMIT - Commit Cached Data
   14.2.4. Operation 6: CREATE - Create a Non-Regular File Object
   14.2.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting Recovery
   14.2.6. Operation 8: DELEGRETURN - Return Delegation
   14.2.7. Operation 9: GETATTR - Get Attributes
   14.2.8. Operation 10: GETFH - Get Current Filehandle
   14.2.9. Operation 11: LINK - Create Link to a File
   14.2.10. Operation 12: LOCK - Create Lock
   14.2.11. Operation 13: LOCKT - Test For Lock
   14.2.12. Operation 14: LOCKU - Unlock File
   14.2.13. Operation 15: LOOKUP - Lookup Filename
   14.2.14. Operation 16: LOOKUPP - Lookup Parent Directory
   14.2.15. Operation 17: NVERIFY - Verify Difference in Attributes
   14.2.16. Operation 18: OPEN - Open a Regular File
   14.2.17. Operation 19: OPENATTR - Open Named Attribute Directory
   14.2.18. Operation 20: OPEN_CONFIRM - Confirm Open
   14.2.19. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access
   14.2.20. Operation 22: PUTFH - Set Current Filehandle
   14.2.21. Operation 23: PUTPUBFH - Set Public Filehandle
   14.2.22. Operation 24: PUTROOTFH - Set Root Filehandle
   14.2.23. Operation 25: READ - Read from File
   14.2.24. Operation 26: READDIR - Read Directory
   14.2.25. Operation 27: READLINK - Read Symbolic Link
   14.2.26. Operation 28: REMOVE - Remove Filesystem Object
   14.2.27. Operation 29: RENAME - Rename Directory Entry
   14.2.28. Operation 30: RENEW - Renew a Lease
   14.2.29. Operation 31: RESTOREFH - Restore Saved Filehandle
   14.2.30. Operation 32: SAVEFH - Save Current Filehandle
   14.2.31. Operation 33: SECINFO - Obtain Available Security
   14.2.32. Operation 34: SETATTR - Set Attributes
   14.2.33. Operation 35: SETCLIENTID - Negotiate Clientid
   14.2.34. Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid
   14.2.35. Operation 37: VERIFY - Verify Same Attributes
   14.2.36. Operation 38: WRITE - Write to File
   14.2.37. Operation 39: RELEASE_LOCKOWNER - Release Lockowner State
   14.2.38. Operation 10044: ILLEGAL - Illegal operation
   15. NFS version 4 Callback Procedures
   15.1. Procedure 0: CB_NULL - No Operation
   15.2. Procedure 1: CB_COMPOUND - Compound Operations
   15.2.1. Operation 3: CB_GETATTR - Get Attributes
   15.2.2. Operation 4: CB_RECALL - Recall an Open Delegation
   15.2.3. Operation 10044: CB_ILLEGAL - Illegal Callback Operation
   16. Security Considerations
   17. IANA Considerations
   17.1. Named Attribute Definition
   17.2. ONC RPC Network Identifiers (netids)
   18. RPC definition file
   19. Bibliography
   20. Authors
   20.1. Editor's Address
   20.2. Authors' Addresses
   20.3. Acknowledgements
   21. Full Copyright Statement

1. Introduction

   The NFS version 4 protocol is a further revision of the NFS protocol
   defined already by versions 2 [RFC1094] and 3 [RFC1813]. It retains
   the essential characteristics of previous versions: design for easy
   recovery, independence from transport protocols, operating systems
   and filesystems, simplicity, and good performance. The NFS version 4
   revision has the following goals:

   o  Improved access and good performance on the Internet.

      The protocol is designed to transit firewalls easily, perform
      well where latency is high and bandwidth is low, and scale to
      very large numbers of clients per server.

   o  Strong security with negotiation built into the protocol.

      The protocol builds on the work of the ONCRPC working group in
      supporting the RPCSEC_GSS protocol. Additionally, the NFS
      version 4 protocol provides a mechanism that allows clients and
      servers to negotiate security, and requires clients and servers
      to support a minimal set of security schemes.

   o  Good cross-platform interoperability.

      The protocol features a filesystem model that provides a useful,
      common set of features that does not unduly favor one filesystem
      or operating system over another.

   o  Designed for protocol extensions.

      The protocol is designed to accept standard extensions that do
      not compromise backward compatibility.

1.1. Inconsistencies of this Document with Section 18

   Section 18, RPC Definition File, contains the definitions in XDR
   description language of the constructs used by the protocol. Prior
   to Section 18, several of the constructs are reproduced for purposes
   of explanation. The reader is warned of the possibility of errors in
   the reproduced constructs outside of Section 18. For any part of the
   document that is inconsistent with Section 18, Section 18 is to be
   considered authoritative.

1.2. Overview of NFS version 4 Features

   To provide a reasonable context for the reader, the major features of
   the NFS version 4 protocol will be reviewed in brief. This will be
   done to provide an appropriate context for both the reader who is
   familiar with the previous versions of the NFS protocol and the
   reader who is new to the NFS protocols. For the reader new to the NFS
   protocols, some fundamental knowledge is still expected. The reader
   should be familiar with the XDR and RPC protocols as described in
   [RFC1831] and [RFC1832]. A basic knowledge of filesystems and
   distributed filesystems is expected as well.

1.2.1. RPC and Security

   As with previous versions of NFS, the External Data Representation
   (XDR) and Remote Procedure Call (RPC) mechanisms used for the NFS
   version 4 protocol are those defined in [RFC1831] and [RFC1832]. To
   meet end-to-end security requirements, the RPCSEC_GSS framework
   [RFC2203] will be used to extend the basic RPC security. With the
   use of RPCSEC_GSS, various mechanisms can be provided to offer
   authentication, integrity, and privacy to the NFS version 4 protocol.
   Kerberos V5 will be used as described in [RFC1964] to provide one
   security framework.
   The LIPKEY GSS-API mechanism described in [RFC2847] will be used to
   provide for the use of user password and server public key by the
   NFS version 4 protocol. With the use of RPCSEC_GSS, other mechanisms
   may also be specified and used for NFS version 4 security.

   To enable in-band security negotiation, the NFS version 4 protocol
   has added a new operation which provides the client a method of
   querying the server about its policies regarding which security
   mechanisms must be used for access to the server's filesystem
   resources. With this, the client can securely match the security
   mechanism that meets the policies specified at both the client and
   server.

1.2.2. Procedure and Operation Structure

   A significant departure from the previous versions of the NFS
   protocol is the introduction of the COMPOUND procedure. For the NFS
   version 4 protocol, there are two RPC procedures, NULL and COMPOUND.
   The COMPOUND procedure is defined in terms of operations and these
   operations correspond more closely to the traditional NFS procedures.
   With the use of the COMPOUND procedure, the client is able to build
   simple or complex requests. These COMPOUND requests allow for a
   reduction in the number of RPCs needed for logical filesystem
   operations. For example, without previous contact with a server a
   client will be able to read data from a file in one request by
   combining LOOKUP, OPEN, and READ operations in a single COMPOUND RPC.
   With previous versions of the NFS protocol, this type of single
   request was not possible.

   The model used for COMPOUND is very simple. There is no logical
   ORing or ANDing of operations. The operations combined within a
   COMPOUND request are evaluated in order by the server.
   Once an operation returns a failing result, the evaluation ends and
   the results of all evaluated operations are returned to the client.

   The NFS version 4 protocol continues to have the client refer to a
   file or directory at the server by a "filehandle". The COMPOUND
   procedure has a method of passing a filehandle from one operation to
   another within the sequence of operations. There is a concept of a
   "current filehandle" and "saved filehandle". Most operations use the
   "current filehandle" as the filesystem object to operate upon. The
   "saved filehandle" is used as temporary filehandle storage within a
   COMPOUND procedure as well as an additional operand for certain
   operations.

1.2.3. Filesystem Model

   The general filesystem model used for the NFS version 4 protocol is
   the same as previous versions. The server filesystem is hierarchical
   with the regular files contained within being treated as opaque byte
   streams. In a slight departure, file and directory names are encoded
   with UTF-8 to deal with the basics of internationalization.

   The NFS version 4 protocol does not require a separate protocol to
   provide for the initial mapping between path name and filehandle.
   Instead of using the older MOUNT protocol for this mapping, the
   server provides a ROOT filehandle that represents the logical root or
   top of the filesystem tree provided by the server. The server
   provides multiple filesystems by gluing them together with pseudo
   filesystems. These pseudo filesystems provide for potential gaps in
   the path names between real filesystems.

1.2.3.1. Filehandle Types

   In previous versions of the NFS protocol, the filehandle provided by
   the server was guaranteed to be valid or persistent for the lifetime
   of the filesystem object to which it referred. For some server
   implementations, this persistence requirement has been difficult to
   meet.
   For the NFS version 4 protocol, this requirement has been
   relaxed by introducing another type of filehandle, volatile. With
   persistent and volatile filehandle types, the server implementation
   can match the abilities of the filesystem at the server along with
   the operating environment. The client will have knowledge of the
   type of filehandle being provided by the server and can be prepared
   to deal with the semantics of each.

1.2.3.2. Attribute Types

   The NFS version 4 protocol introduces three classes of filesystem or
   file attributes. Like the additional filehandle type, the
   classification of file attributes has been done to ease server
   implementations along with extending the overall functionality of the
   NFS protocol. This attribute model is structured to be extensible
   such that new attributes can be introduced in minor revisions of the
   protocol without requiring significant rework.

   The three classifications are: mandatory, recommended and named
   attributes. This is a significant departure from the previous
   attribute model used in the NFS protocol. Previously, the attributes
   for the filesystem and file objects were a fixed set of mainly UNIX
   attributes. If the server or client did not support a particular
   attribute, it would have to simulate the attribute as best it could.

   Mandatory attributes are the minimal set of file or filesystem
   attributes that must be provided by the server and must be properly
   represented by the server. Recommended attributes represent
   different filesystem types and operating environments. The
   recommended attributes will allow for better interoperability and the
   inclusion of more operating environments. The mandatory and
   recommended attribute sets are traditional file or filesystem
   attributes. The third type of attribute is the named attribute.
   A named attribute is an opaque byte stream that is associated with a
   directory or file and referred to by a string name. Named attributes
   are meant to be used by client applications as a method to associate
   application specific data with a regular file or directory.

   One significant addition to the recommended set of file attributes is
   the Access Control List (ACL) attribute. This attribute provides for
   directory and file access control beyond the model used in previous
   versions of the NFS protocol. The ACL definition allows for
   specification of user and group level access control.

1.2.3.3. Filesystem Replication and Migration

   With the use of a special file attribute, the ability to migrate or
   replicate server filesystems is enabled within the protocol. The
   filesystem locations attribute provides a method for the client to
   probe the server about the location of a filesystem. In the event of
   a migration of a filesystem, the client will receive an error when
   operating on the filesystem and it can then query as to the new
   filesystem location. Similar steps are used for replication: the
   client is able to query the server for the multiple available
   locations of a particular filesystem. From this information, the
   client can use its own policies to access the appropriate filesystem
   location.

1.2.4. OPEN and CLOSE

   The NFS version 4 protocol introduces OPEN and CLOSE operations. The
   OPEN operation provides a single point where file lookup, creation,
   and share semantics can be combined. The CLOSE operation also
   provides for the release of state accumulated by OPEN.

1.2.5. File locking

   With the NFS version 4 protocol, the support for byte range file
   locking is part of the NFS protocol. The file locking support is
   structured so that an RPC callback mechanism is not required.
This 497 is a departure from the previous versions of the NFS file locking 498 protocol, Network Lock Manager (NLM). The state associated with file 499 locks is maintained at the server under a lease-based model. The 500 server defines a single lease period for all state held by an NFS 501 client. If the client does not renew its lease within the defined 502 period, all state associated with the client's lease may be released 503 by the server. The client may renew its lease with use of the RENEW 504 operation or implicitly by use of other operations (primarily READ). 506 1.2.6. Client Caching and Delegation 508 The file, attribute, and directory caching for the NFS version 4 509 protocol is similar to previous versions. Attributes and directory 510 information are cached for a duration determined by the client. At 511 the end of a predefined timeout, the client will query the server to 512 see if the related filesystem object has been updated. 514 For file data, the client checks its cache validity when the file is 515 opened. A query is sent to the server to determine if the file has 516 been changed. Based on this information, the client determines if 517 the data cache for the file should be kept or released. Also, when the 518 file is closed, any modified data is written to the server. 520 If an application wants to serialize access to file data, file 521 locking of the file data ranges in question should be used. 523 The major addition to NFS version 4 in the area of caching is the 524 ability of the server to delegate certain responsibilities to the 525 client. When the server grants a delegation for a file to a client, 526 the client is guaranteed certain semantics with respect to the 527 sharing of that file with other clients. At OPEN, the server may 528 provide the client either a read or write delegation for the file.
529 If the client is granted a read delegation, it is assured that no 530 other client has the ability to write to the file for the duration of 531 the delegation. If the client is granted a write delegation, the 532 client is assured that no other client has read or write access to 533 the file. 535 Draft Specification NFS version 4 Protocol September 2002 537 Delegations can be recalled by the server. If another client 538 requests access to the file in such a way that the access conflicts 539 with the granted delegation, the server is able to notify the initial 540 client and recall the delegation. This requires that a callback path 541 exist between the server and client. If this callback path does not 542 exist, then delegations can not be granted. The essence of a 543 delegation is that it allows the client to locally service operations 544 such as OPEN, CLOSE, LOCK, LOCKU, READ, WRITE without immediate 545 interaction with the server. 547 1.3. General Definitions 549 The following definitions are provided for the purpose of providing 550 an appropriate context for the reader. 552 Client The "client" is the entity that accesses the NFS server's 553 resources. The client may be an application which contains 554 the logic to access the NFS server directly. The client 555 may also be the traditional operating system client remote 556 filesystem services for a set of applications. 558 In the case of file locking the client is the entity that 559 maintains a set of locks on behalf of one or more 560 applications. This client is responsible for crash or 561 failure recovery for those locks it manages. 563 Note that multiple clients may share the same transport and 564 multiple clients may exist on the same network node. 566 Clientid A 64-bit quantity used as a unique, short-hand reference to 567 a client supplied Verifier and ID. The server is 568 responsible for supplying the Clientid. 
570 Lease An interval of time defined by the server for which the 571 client is irrevocably granted a lock. At the end of a 572 lease period the lock may be revoked if the lease has not 573 been extended. The lock must be revoked if a conflicting 574 lock has been granted after the lease interval. 576 All leases granted by a server have the same fixed 577 interval. Note that the fixed interval was chosen to 578 alleviate the expense a server would have in maintaining 579 state about variable length leases across server failures. 581 Lock The term "lock" is used to refer to both record (byte- 582 range) locks as well as share reservations unless 583 specifically stated otherwise. 585 Server The "Server" is the entity responsible for coordinating 586 client access to a set of filesystems. 588 Draft Specification NFS version 4 Protocol September 2002 590 Stable Storage 591 NFS version 4 servers must be able to recover without data 592 loss from multiple power failures (including cascading 593 power failures, that is, several power failures in quick 594 succession), operating system failures, and hardware 595 failure of components other than the storage medium itself 596 (for example, disk, nonvolatile RAM). 598 Some examples of stable storage that are allowable for an 599 NFS server include: 601 1. Media commit of data, that is, the modified data has 602 been successfully written to the disk media, 603 for example, the disk platter. 605 2. An immediate reply disk drive with battery-backed 606 on-drive intermediate storage or uninterruptible power 607 system (UPS). 609 3. Server commit of data with battery-backed intermediate 610 storage and recovery software. 612 4. Cache commit with uninterruptible power system (UPS) 613 and recovery software. 615 Stateid A 128-bit quantity returned by a server that uniquely 616 defines the open and locking state provided by the server 617 for a specific open or lock owner for a specific file. 
619 Stateids composed of all bits 0 or all bits 1 have special 620 meaning and are reserved values. 622 Verifier A 64-bit quantity generated by the client that the server 623 can use to determine if the client has restarted and lost 624 all previous lock state. 626 Draft Specification NFS version 4 Protocol September 2002 628 2. Protocol Data Types 630 The syntax and semantics to describe the data types of the NFS 631 version 4 protocol are defined in the XDR [RFC1832] and RPC [RFC1831] 632 documents. The next sections build upon the XDR data types to define 633 types and structures specific to this protocol. 635 2.1. Basic Data Types 637 Data Type Definition 638 _____________________________________________________________________ 639 int32_t typedef int int32_t; 641 uint32_t typedef unsigned int uint32_t; 643 int64_t typedef hyper int64_t; 645 uint64_t typedef unsigned hyper uint64_t; 647 attrlist4 typedef opaque attrlist4<>; 648 Used for file/directory attributes 650 bitmap4 typedef uint32_t bitmap4<>; 651 Used in attribute array encoding. 
653 changeid4 typedef uint64_t changeid4; 654 Used in definition of change_info 656 clientid4 typedef uint64_t clientid4; 657 Shorthand reference to client identification 659 component4 typedef utf8string component4; 660 Represents path name components 662 count4 typedef uint32_t count4; 663 Various count parameters (READ, WRITE, COMMIT) 665 length4 typedef uint64_t length4; 666 Describes LOCK lengths 668 linktext4 typedef utf8string linktext4; 669 Symbolic link contents 671 mode4 typedef uint32_t mode4; 672 Mode attribute data type 674 nfs_cookie4 typedef uint64_t nfs_cookie4; 675 Opaque cookie value for READDIR 677 nfs_fh4 typedef opaque nfs_fh4<NFS4_FHSIZE>; 678 Filehandle definition; NFS4_FHSIZE is defined as 128 682 nfs_ftype4 enum nfs_ftype4; 683 Various defined file types 685 nfsstat4 enum nfsstat4; 686 Return value for operations 688 offset4 typedef uint64_t offset4; 689 Various offset designations (READ, WRITE, LOCK, COMMIT) 691 pathname4 typedef component4 pathname4<>; 692 Represents path name for LOOKUP, OPEN and others 694 qop4 typedef uint32_t qop4; 695 Quality of protection designation in SECINFO 697 sec_oid4 typedef opaque sec_oid4<>; 698 Security Object Identifier 699 The sec_oid4 data type is not really opaque. 700 Instead, it contains an ASN.1 OBJECT IDENTIFIER as used 701 by GSS-API in the mech_type argument to 702 GSS_Init_sec_context. See [RFC2743] for details. 704 seqid4 typedef uint32_t seqid4; 705 Sequence identifier used for file locking 707 utf8string typedef opaque utf8string<>; 708 UTF-8 encoding for strings 710 verifier4 typedef opaque verifier4[NFS4_VERIFIER_SIZE]; 711 Verifier used for various operations (COMMIT, CREATE, 712 OPEN, READDIR, SETCLIENTID, SETCLIENTID_CONFIRM, WRITE) 713 NFS4_VERIFIER_SIZE is defined as 8. 715 2.2.
Structured Data Types 717 nfstime4 718 struct nfstime4 { 719 int64_t seconds; 720 uint32_t nseconds; 721 }; 723 The nfstime4 structure gives the number of seconds and 724 nanoseconds since midnight or 0 hour January 1, 1970 Coordinated 725 Universal Time (UTC). Values greater than zero for the seconds 726 field denote dates after the 0 hour January 1, 1970. Values 727 less than zero for the seconds field denote dates before the 0 728 hour January 1, 1970. In both cases, the nseconds field is to 729 be added to the seconds field for the final time representation. 730 For example, if the time to be represented is one-half second 734 before 0 hour January 1, 1970, the seconds field would have a 735 value of negative one (-1) and the nseconds field would have a 736 value of one-half second (500000000). Values greater than 737 999,999,999 for nseconds are considered invalid. 739 This data type is used to pass time and date information. A 740 server converts to and from its local representation of time 741 when processing time values, preserving as much accuracy as 742 possible. If the precision of timestamps stored for a filesystem 743 object is less than defined, loss of precision can occur. An 744 adjunct time maintenance protocol is recommended to reduce 745 client and server time skew. 747 time_how4 749 enum time_how4 { 750 SET_TO_SERVER_TIME4 = 0, 751 SET_TO_CLIENT_TIME4 = 1 752 }; 754 settime4 756 union settime4 switch (time_how4 set_it) { 757 case SET_TO_CLIENT_TIME4: 758 nfstime4 time; 759 default: 760 void; 761 }; 763 The above definitions are used as the attribute definitions to 764 set time values. If set_it is SET_TO_SERVER_TIME4, then the 765 server uses its local representation of time for the time value.
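As an illustration of the nfstime4 representation described above, the following Python sketch splits a signed nanosecond count relative to 0 hour January 1, 1970 UTC into the (seconds, nseconds) pair. The function names and the choice of an integer nanosecond count as input are assumptions made for this example only; they are not defined by the protocol.

```python
NSEC_PER_SEC = 1_000_000_000


def to_nfstime4(total_nanoseconds):
    """Split a signed nanosecond offset from the epoch into (seconds, nseconds).

    Hypothetical helper: divmod() floors toward negative infinity, so
    nseconds always lands in [0, 999999999] and is *added* to seconds.
    """
    seconds, nseconds = divmod(total_nanoseconds, NSEC_PER_SEC)
    return seconds, nseconds


def from_nfstime4(seconds, nseconds):
    """Recombine an nfstime4 pair, rejecting out-of-range nseconds."""
    if not 0 <= nseconds < NSEC_PER_SEC:
        raise ValueError("nseconds greater than 999,999,999 is invalid")
    return seconds * NSEC_PER_SEC + nseconds


# One-half second before the epoch, as in the example in the text.
half_second_before_epoch = to_nfstime4(-500_000_000)
```

Flooring division is what makes the pre-epoch case come out as seconds = -1 with a positive nseconds value, matching the worked example in the text.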
767 specdata4 769 struct specdata4 { 770 uint32_t specdata1; /* major device number */ 771 uint32_t specdata2; /* minor device number */ 772 }; 774 This data type represents additional information for the device 775 file types NF4CHR and NF4BLK. 777 fsid4 779 struct fsid4 { 780 uint64_t major; 781 uint64_t minor; 785 }; 787 This type is the filesystem identifier that is used as a 788 mandatory attribute. 790 fs_location4 792 struct fs_location4 { 793 utf8string server<>; 794 pathname4 rootpath; 795 }; 797 fs_locations4 799 struct fs_locations4 { 800 pathname4 fs_root; 801 fs_location4 locations<>; 802 }; 804 The fs_location4 and fs_locations4 data types are used for the 805 fs_locations recommended attribute which is used for migration 806 and replication support. 808 fattr4 810 struct fattr4 { 811 bitmap4 attrmask; 812 attrlist4 attr_vals; 813 }; 815 The fattr4 structure is used to represent file and directory 816 attributes. 818 The bitmap is a counted array of 32 bit integers used to contain 819 bit values. The position of the integer in the array that 820 contains bit n can be computed from the expression (n / 32) and 821 its bit within that integer is (n mod 32). 823 0 1 824 +-----------+-----------+-----------+-- 825 | count | 31 .. 0 | 63 .. 32 | 826 +-----------+-----------+-----------+-- 828 change_info4 830 struct change_info4 { 831 bool atomic; 832 changeid4 before; 836 changeid4 after; 837 }; 839 This structure is used with the CREATE, LINK, REMOVE, RENAME 840 operations to let the client know the value of the change 841 attribute for the directory in which the target filesystem 842 object resides.
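The bitmap4 indexing rule given above for fattr4 (array position n / 32, bit position n mod 32) can be sketched as follows. This is a minimal Python illustration that models a bitmap4 as a plain list of 32-bit integers; the helper names are hypothetical and the XDR count word is left to the encoder.

```python
def bitmap4_set(bitmap, n):
    """Set bit n in a list of 32-bit words, growing the list as needed."""
    word, bit = divmod(n, 32)      # word index is n / 32, bit index is n mod 32
    while len(bitmap) <= word:
        bitmap.append(0)
    bitmap[word] |= 1 << bit


def bitmap4_test(bitmap, n):
    """Return True if bit n is set; bits beyond the array are clear."""
    word, bit = divmod(n, 32)
    return word < len(bitmap) and bool((bitmap[word] >> bit) & 1)


mask = []
bitmap4_set(mask, 0)     # lands in word 0, bit 0
bitmap4_set(mask, 33)    # lands in word 1, bit 1
```

Treating bits past the end of the array as clear lets a sender transmit only as many words as its highest set bit requires.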
844 clientaddr4 846 struct clientaddr4 { 847 /* see struct rpcb in RFC1833 */ 848 string r_netid<>; /* network id */ 849 string r_addr<>; /* universal address */ 850 }; 852 The clientaddr4 structure is used as part of the SETCLIENTID 853 operation to either specify the address of the client that is 854 using a clientid or as part of the callback registration. The 855 r_netid and r_addr fields are specified in [RFC1833], but they 856 are underspecified in [RFC1833] as far as what they should look 857 like for specific protocols. 859 For TCP over IPv4 and for UDP over IPv4, the format of r_addr is 860 the US-ASCII string: 862 h1.h2.h3.h4.p1.p2 864 The prefix, "h1.h2.h3.h4", is the standard textual form for 865 representing an IPv4 address, which is always four octets long. 866 Assuming big-endian ordering, h1, h2, h3, and h4 are, 867 respectively, the first through fourth octets each converted to 868 ASCII-decimal. Assuming big-endian ordering, p1 and p2 are, 869 respectively, the first and second octets of the port each converted to 870 ASCII-decimal. For example, if a host, in big-endian order, has 871 an address of 0x0A010307 and there is a service listening on, in 872 big-endian order, port 0x020F (decimal 527), then the complete 873 universal address is "10.1.3.7.2.15". 875 For TCP over IPv4 the value of r_netid is the string "tcp". For 876 UDP over IPv4 the value of r_netid is the string "udp". 878 For TCP over IPv6 and for UDP over IPv6, the format of r_addr is 879 the US-ASCII string: 881 x1:x2:x3:x4:x5:x6:x7:x8.p1.p2 883 The suffix "p1.p2" is the service port, and is computed the same 884 way as with universal addresses for TCP and UDP over IPv4. The 885 prefix, "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form 886 for representing an IPv6 address as defined in Section 2.2 of 890 [RFC1884]. Additionally, the two alternative forms specified in 891 Section 2.2 of [RFC1884] are also acceptable.
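The universal address formats above can be sketched in Python as follows. The function names are hypothetical; the first helper takes the IPv4 address and port as integers in host representation and emits their big-endian octets in ASCII-decimal, and the second splits the trailing "p1.p2" port suffix, which is common to the IPv4 and IPv6 forms.

```python
def ipv4_universal_address(addr, port):
    """Build the r_addr string "h1.h2.h3.h4.p1.p2" from integers."""
    if not 0 <= addr <= 0xFFFFFFFF or not 0 <= port <= 0xFFFF:
        raise ValueError("address or port out of range")
    # Four address octets followed by two port octets, all big-endian.
    octets = addr.to_bytes(4, "big") + port.to_bytes(2, "big")
    return ".".join(str(octet) for octet in octets)


def split_universal_address(r_addr):
    """Split r_addr into (host text, port); the last two fields are p1.p2."""
    host, p1, p2 = r_addr.rsplit(".", 2)
    return host, int(p1) * 256 + int(p2)


# The example from the text: address 0x0A010307, port 0x020F (decimal 527).
example = ipv4_universal_address(0x0A010307, 0x020F)
```

Splitting from the right keeps the host portion intact whether it is a dotted IPv4 address or a colon-separated IPv6 address.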
893 For TCP over IPv6 the value of r_netid is the string "tcp6". 894 For UDP over IPv6 the value of r_netid is the string "udp6". 896 cb_client4 898 struct cb_client4 { 899 unsigned int cb_program; 900 clientaddr4 cb_location; 901 }; 903 This structure is used by the client to inform the server of its 904 callback address; it includes the program number and client 905 address. 907 nfs_client_id4 909 struct nfs_client_id4 { 910 verifier4 verifier; 911 opaque id<NFS4_OPAQUE_LIMIT>; 912 }; 914 This structure is part of the arguments to the SETCLIENTID 915 operation. NFS4_OPAQUE_LIMIT is defined as 1024. 917 open_owner4 919 struct open_owner4 { 920 clientid4 clientid; 921 opaque owner<NFS4_OPAQUE_LIMIT>; 922 }; 924 This structure is used to identify the owner of open state. 925 NFS4_OPAQUE_LIMIT is defined as 1024. 927 lock_owner4 929 struct lock_owner4 { 930 clientid4 clientid; 931 opaque owner<NFS4_OPAQUE_LIMIT>; 932 }; 934 This structure is used to identify the owner of file locking 935 state. NFS4_OPAQUE_LIMIT is defined as 1024. 939 open_to_lock_owner4 941 struct open_to_lock_owner4 { 942 seqid4 open_seqid; 943 stateid4 open_stateid; 944 seqid4 lock_seqid; 945 lock_owner4 lock_owner; 946 }; 948 This structure is used for the first LOCK operation done for an 949 open_owner4. It provides both the open_stateid and lock_owner 950 such that the transition is made from a valid open_stateid 951 sequence to that of the new lock_stateid sequence. Using this 952 mechanism avoids the confirmation of the lock_owner/lock_seqid 953 pair since it is tied to established state in the form of the 954 open_stateid/open_seqid. 956 stateid4 958 struct stateid4 { 959 uint32_t seqid; 960 opaque other[12]; 961 }; 963 This structure is used for the various state sharing mechanisms 964 between the client and server. For the client, this data 965 structure is read-only. The starting value of the seqid field 966 is undefined.
The server is required to increment the seqid 967 field monotonically at each transition of the stateid. This is 968 important since the client will inspect the seqid in OPEN 969 stateids to determine the order of OPEN processing done by the 970 server. 972 Draft Specification NFS version 4 Protocol September 2002 974 3. RPC and Security Flavor 976 The NFS version 4 protocol is a Remote Procedure Call (RPC) 977 application that uses RPC version 2 and the corresponding eXternal 978 Data Representation (XDR) as defined in [RFC1831] and [RFC1832]. The 979 RPCSEC_GSS security flavor as defined in [RFC2203] MUST be used as 980 the mechanism to deliver stronger security for the NFS version 4 981 protocol. 983 3.1. Ports and Transports 985 Historically, NFS version 2 and version 3 servers have resided on 986 port 2049. The registered port 2049 [RFC1700] for the NFS protocol 987 should be the default configuration. Using the registered port for 988 NFS services means the NFS client will not need to use the RPC 989 binding protocols as described in [RFC1833]; this will allow NFS to 990 transit firewalls. 992 Where an NFS version 4 implementation supports operation over the IP 993 network protocol, the supported transports between NFS and IP must be 994 among the IETF-approved congestion control transport protocols, which 995 include TCP and SCTP. To enhance the possibilities for 996 interoperability, an NFS version 4 implementation SHOULD support 997 operation over the TCP transport protocol. 999 If TCP is used as the transport, the client and server SHOULD use 1000 persistent connections. This will prevent the weakening of TCP's 1001 congestion control via short lived connections and will improve 1002 performance for the WAN environment by eliminating the need for SYN 1003 handshakes. 1005 Note that for various timers, the client and server should avoid 1006 inadvertent synchronization of those timers. For further discussion 1007 of the general issue refer to [Floyd]. 
1009 3.1.1. Client Retransmission Behavior 1011 When processing a request received over a reliable transport such as 1012 TCP, the NFS version 4 server MUST NOT silently drop the request, 1013 except if the transport connection has been broken. Given such a 1014 contract between NFS version 4 clients and servers, clients MUST NOT 1015 retry a request unless one or both of the following are true: 1017 o The transport connection has been broken 1019 o The procedure being retried is the NULL procedure 1021 Because reliable transports, such as TCP, do not always synchronously 1022 inform a peer when the other peer has broken the connection (for 1023 example, when an NFS server reboots), the NFS version 4 client may 1027 want to actively "probe" the connection to see if it has been broken. 1028 Use of the NULL procedure is one recommended way to do so. So, when 1029 a client experiences a remote procedure call timeout (of some 1030 arbitrary implementation specific amount), rather than retrying the 1031 remote procedure call, it could instead issue a NULL procedure call 1032 to the server. If the server has died, the transport connection break 1033 will eventually be indicated to the NFS version 4 client. The client 1034 can then reconnect, and then retry the original request. If the NULL 1035 procedure call gets a response, the connection has not broken. The 1036 client can decide to wait longer for the original request's response, 1037 or it can break the transport connection and reconnect before 1038 re-sending the original request. 1040 For callbacks from the server to the client, the same rules apply, 1041 but the server doing the callback becomes the client, and the client 1042 receiving the callback becomes the server. 1044 3.2. Security Flavors 1046 Traditional RPC implementations have included AUTH_NONE, AUTH_SYS, 1047 AUTH_DH, and AUTH_KRB4 as security flavors.
With [RFC2203] an 1048 additional security flavor of RPCSEC_GSS has been introduced which 1049 uses the functionality of GSS-API [RFC2743]. This allows for the use 1050 of various security mechanisms by the RPC layer without the 1051 additional implementation overhead of adding RPC security flavors. 1052 For NFS version 4, the RPCSEC_GSS security flavor MUST be used to 1053 enable the mandatory security mechanism. Other flavors, such as 1054 AUTH_NONE, AUTH_SYS, and AUTH_DH, MAY be implemented as well. 1056 3.2.1. Security mechanisms for NFS version 4 1058 The use of RPCSEC_GSS requires selection of: mechanism, quality of 1059 protection, and service (authentication, integrity, privacy). The 1060 remainder of this document will refer to these three parameters of 1061 the RPCSEC_GSS security as the security triple. 1063 3.2.1.1. Kerberos V5 as a security triple 1065 The Kerberos V5 GSS-API mechanism as described in [RFC1964] MUST be 1066 implemented and provide the following security triples. 1068 column descriptions: 1070 1 == number of pseudo flavor 1071 2 == name of pseudo flavor 1072 3 == mechanism's OID 1073 4 == mechanism's algorithm(s) 1074 5 == RPCSEC_GSS service 1076 1 2 3 4 5 1077 ----------------------------------------------------------------------- 1081 390003 krb5 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_none 1082 390004 krb5i 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_integrity 1083 390005 krb5p 1.2.840.113554.1.2.2 DES MAC MD5 for integrity, and 56 bit DES for privacy rpc_gss_svc_privacy 1088 Note that the pseudo flavor is presented here as a mapping aid to the 1089 implementor. Because this NFS protocol includes a method to 1090 negotiate security and it understands the GSS-API mechanism, the 1091 pseudo flavor is not needed. The pseudo flavor is needed for NFS 1092 version 3 since the security negotiation is done via the MOUNT 1093 protocol.
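Since the pseudo flavor is described as a mapping aid, the Kerberos V5 rows of the table above can be held in a small lookup structure. The following Python sketch is illustrative only; the dictionary and function names are assumptions, while the numbers, names, OID, and services are copied from the table.

```python
KRB5_OID = "1.2.840.113554.1.2.2"

# pseudo flavor number -> (pseudo flavor name, mechanism OID, RPCSEC_GSS service)
KRB5_PSEUDO_FLAVORS = {
    390003: ("krb5", KRB5_OID, "rpc_gss_svc_none"),
    390004: ("krb5i", KRB5_OID, "rpc_gss_svc_integrity"),
    390005: ("krb5p", KRB5_OID, "rpc_gss_svc_privacy"),
}


def lookup_pseudo_flavor(number):
    """Return the table row for a pseudo flavor number, or None if unknown."""
    return KRB5_PSEUDO_FLAVORS.get(number)


krb5i_row = lookup_pseudo_flavor(390004)
```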
1095 For a discussion of NFS' use of RPCSEC_GSS and Kerberos V5, please 1096 see [RFC2623]. 1098 3.2.1.2. LIPKEY as a security triple 1100 The LIPKEY GSS-API mechanism as described in [RFC2847] MUST be 1101 implemented and provide the following security triples. The 1102 definition of the columns matches the previous subsection "Kerberos 1103 V5 as a security triple". 1105 1 2 3 4 5 1106 ----------------------------------------------------------------------- 1107 390006 lipkey 1.3.6.1.5.5.9 negotiated rpc_gss_svc_none 1108 390007 lipkey-i 1.3.6.1.5.5.9 negotiated rpc_gss_svc_integrity 1109 390008 lipkey-p 1.3.6.1.5.5.9 negotiated rpc_gss_svc_privacy 1111 The mechanism algorithm is listed as "negotiated". This is because 1112 LIPKEY is layered on SPKM-3 and in SPKM-3 [RFC2847] the 1113 confidentiality and integrity algorithms are negotiated. Since 1114 SPKM-3 specifies HMAC-MD5 for integrity as MANDATORY, 128 bit 1115 cast5CBC for confidentiality (privacy) as MANDATORY, and further 1116 specifies that HMAC-MD5 and cast5CBC MUST be listed first before 1117 weaker algorithms, specifying "negotiated" in column 4 does not 1118 impair interoperability. In the event an SPKM-3 peer does not 1119 support the mandatory algorithms, the other peer is free to accept or 1120 reject the GSS-API context creation. 1122 Because SPKM-3 negotiates the algorithms, subsequent calls to 1123 LIPKEY's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality 1124 of protection value of 0 (zero). See section 5.2 of [RFC2025] for an 1125 explanation. 1127 LIPKEY uses SPKM-3 to create a secure channel in which to pass a user 1128 name and password from the client to the server. Once the user name 1129 and password have been accepted by the server, calls to the LIPKEY 1130 context are redirected to the SPKM-3 context. See [RFC2847] for more 1131 details. 1135 3.2.1.3.
SPKM-3 as a security triple 1137 The SPKM-3 GSS-API mechanism as described in [RFC2847] MUST be 1138 implemented and provide the following security triples. The 1139 definition of the columns matches the previous subsection "Kerberos 1140 V5 as security triple". 1142 1 2 3 4 5 1143 ----------------------------------------------------------------------- 1144 390009 spkm3 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_none 1145 390010 spkm3i 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_integrity 1146 390011 spkm3p 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_privacy 1148 For a discussion as to why the mechanism algorithm is listed as 1149 "negotiated", see the previous section "LIPKEY as a security triple." 1151 Because SPKM-3 negotiates the algorithms, subsequent calls to SPKM- 1152 3's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality of 1153 protection value of 0 (zero). See section 5.2 of [RFC2025] for an 1154 explanation. 1156 Even though LIPKEY is layered over SPKM-3, SPKM-3 is specified as a 1157 mandatory set of triples to handle the situations where the initiator 1158 (the client) is anonymous or where the initiator has its own 1159 certificate. If the initiator is anonymous, there will not be a user 1160 name and password to send to the target (the server). If the 1161 initiator has its own certificate, then using passwords is 1162 superfluous. 1164 3.3. Security Negotiation 1166 With the NFS version 4 server potentially offering multiple security 1167 mechanisms, the client needs a method to determine or negotiate which 1168 mechanism is to be used for its communication with the server. The 1169 NFS server may have multiple points within its filesystem name space 1170 that are available for use by NFS clients. In turn the NFS server 1171 may be configured such that each of these entry points may have 1172 different or multiple security mechanisms in use. 
1174 The security negotiation between client and server must be done with 1175 a secure channel to eliminate the possibility of a third party 1176 intercepting the negotiation sequence and forcing the client and 1177 server to choose a lower level of security than required or desired. 1178 See the section "Security Considerations" for further discussion. 1180 3.3.1. SECINFO 1182 The new SECINFO operation will allow the client to determine, on a 1183 per filehandle basis, what security triple is to be used for server 1184 access. In general, the client will not have to use the SECINFO 1188 operation except during initial communication with the server or when 1189 the client crosses policy boundaries at the server. It is possible 1190 that the server's policies change during the client's interaction, 1191 therefore forcing the client to negotiate a new security triple. 1193 3.3.2. Security Error 1195 Based on the assumption that each NFS version 4 client and server 1196 must support a minimum set of security (i.e., LIPKEY, SPKM-3, and 1197 Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its 1198 communication with the server with one of the minimal security 1199 triples. During communication with the server, the client may 1200 receive an NFS error of NFS4ERR_WRONGSEC. This error allows the 1201 server to notify the client that the security triple currently being 1202 used is not appropriate for access to the server's filesystem 1203 resources. The client is then responsible for determining what 1204 security triples are available at the server and choosing one which is 1205 appropriate for the client. See the section for the "SECINFO" 1206 operation for further discussion of how the client will respond to 1207 the NFS4ERR_WRONGSEC error and use SECINFO. 1209 3.4.
Callback RPC Authentication 1211 Except as noted elsewhere in this section, the callback RPC 1212 (described later) MUST mutually authenticate the NFS server to the 1213 principal that acquired the clientid (also described later), using 1214 the security flavor the original SETCLIENTID operation used. 1216 For AUTH_NONE, there are no principals, so this is a non-issue. 1218 AUTH_SYS has no notions of mutual authentication or a server 1219 principal, so the callback from the server simply uses the AUTH_SYS 1220 credential that the user used when he set up the delegation. 1222 For AUTH_DH, one commonly used convention is that the server uses the 1223 credential corresponding to this AUTH_DH principal: 1225 unix.host@domain 1227 where host and domain are variables corresponding to the name of 1228 the server host and the directory services domain in which it lives, such as a 1229 Network Information System domain or a DNS domain. 1231 Because LIPKEY is layered over SPKM-3, it is permissible for the 1232 server to use SPKM-3 and not LIPKEY for the callback even if the 1233 client used LIPKEY for SETCLIENTID. 1235 Regardless of what security mechanism under RPCSEC_GSS is being used, 1236 the NFS server MUST identify itself in GSS-API via a 1237 GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE 1241 names are of the form: 1243 service@hostname 1245 For NFS, the "service" element is 1247 nfs 1249 Implementations of security mechanisms will convert nfs@hostname to 1250 various different forms. For Kerberos V5 and LIPKEY, the following 1251 form is RECOMMENDED: 1253 nfs/hostname 1255 For Kerberos V5, nfs/hostname would be a server principal in the 1256 Kerberos Key Distribution Center database. For LIPKEY, this would be 1257 the username passed to the target (the NFS version 4 client that 1258 receives the callback).
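The conversion from the GSS_C_NT_HOSTBASED_SERVICE form to the form RECOMMENDED above for Kerberos V5 and LIPKEY amounts to a simple string rewrite. The following Python sketch shows that rewrite only; the function name and the host name server.example.com are hypothetical, and a real mechanism implementation performs this conversion internally.

```python
def hostbased_to_recommended_form(name):
    """Rewrite "service@hostname" as "service/hostname"."""
    service, separator, hostname = name.partition("@")
    if not separator or not service or not hostname:
        raise ValueError("expected a name of the form service@hostname")
    return service + "/" + hostname


callback_name = hostbased_to_recommended_form("nfs@server.example.com")
```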
1260 It should be noted that LIPKEY may not work for callbacks, since the 1261 LIPKEY client uses a user id/password. If the NFS client receiving 1262 the callback can authenticate the NFS server's user name/password 1263 pair, and if the user that the NFS server is authenticating to has a 1264 public key certificate, then it works. 1266 In situations where the NFS client uses LIPKEY and uses a per-host 1267 principal for the SETCLIENTID operation, instead of using LIPKEY for 1268 SETCLIENTID, it is RECOMMENDED that SPKM-3 with mutual authentication 1269 be used. This effectively means that the client will use a 1270 certificate to authenticate and identify the initiator to the target 1271 on the NFS server. Using SPKM-3 and not LIPKEY has the following 1272 advantages: 1274 o When the server does a callback, it must authenticate to the 1275 principal used in the SETCLIENTID. Even if LIPKEY is used, 1276 because LIPKEY is layered over SPKM-3, the NFS client will need 1277 to have a certificate that corresponds to the principal used in 1278 the SETCLIENTID operation. From an administrative perspective, 1279 having a user name, password, and certificate for both the 1280 client and server is redundant. 1282 o LIPKEY was intended to minimize additional infrastructure 1283 requirements beyond a certificate for the target, and the 1284 expectation is that existing password infrastructure can be 1285 leveraged for the initiator. In some environments, a per-host 1286 password does not exist yet. If certificates are used for any 1287 per-host principals, then additional password infrastructure is 1288 not needed. 1290 o In cases when a host is both an NFS client and server, it can 1291 share the same per-host certificate. 1293 Draft Specification NFS version 4 Protocol September 2002 1295 4. Filehandles 1297 The filehandle in the NFS protocol is a per server unique identifier 1298 for a filesystem object. The contents of the filehandle are opaque 1299 to the client. 
Therefore, the server is responsible for translating 1300 the filehandle to an internal representation of the filesystem 1301 object. 1303 4.1. Obtaining the First Filehandle 1305 The operations of the NFS protocol are defined in terms of one or 1306 more filehandles. Therefore, the client needs a filehandle to 1307 initiate communication with the server. With the NFS version 2 1308 protocol [RFC1094] and the NFS version 3 protocol [RFC1813], there 1309 exists an ancillary protocol to obtain this first filehandle. The 1310 MOUNT protocol, RPC program number 100005, provides the mechanism of 1311 translating a string based filesystem path name to a filehandle which 1312 can then be used by the NFS protocols. 1314 The MOUNT protocol has deficiencies in the area of security and use 1315 via firewalls. This is one reason that the use of the public 1316 filehandle was introduced in [RFC2054] and [RFC2055]. With the use 1317 of the public filehandle in combination with the LOOKUP operation in 1318 the NFS version 2 and 3 protocols, it has been demonstrated that the 1319 MOUNT protocol is unnecessary for viable interaction between NFS 1320 client and server. 1322 Therefore, the NFS version 4 protocol will not use an ancillary 1323 protocol for translation from string based path names to a 1324 filehandle. Two special filehandles will be used as starting points 1325 for the NFS client. 1327 4.1.1. Root Filehandle 1329 The first of the special filehandles is the ROOT filehandle. The 1330 ROOT filehandle is the "conceptual" root of the filesystem name space 1331 at the NFS server. The client uses or starts with the ROOT 1332 filehandle by employing the PUTROOTFH operation. The PUTROOTFH 1333 operation instructs the server to set the "current" filehandle to the 1334 ROOT of the server's file tree. Once this PUTROOTFH operation is 1335 used, the client can then traverse the entirety of the server's file 1336 tree with the LOOKUP operation. 
   A complete discussion of the server
   name space is in the section "NFS Server Name Space".

4.1.2. Public Filehandle

   The second special filehandle is the PUBLIC filehandle. Unlike the
   ROOT filehandle, the PUBLIC filehandle may be bound to or represent
   an arbitrary filesystem object at the server. The server is
   responsible for this binding. It may be that the PUBLIC filehandle
   and the ROOT filehandle refer to the same filesystem object. However,
   it is up to the administrative software at the server and the
   policies of the server administrator to define the binding of the
   PUBLIC filehandle and server filesystem object. The client may not
   make any assumptions about this binding. The client uses the PUBLIC
   filehandle via the PUTPUBFH operation.

4.2. Filehandle Types

   In the NFS version 2 and 3 protocols, there was one type of
   filehandle with a single set of semantics. This type of filehandle
   is termed "persistent" in NFS Version 4. The semantics of a
   persistent filehandle remain the same as before. A new type of
   filehandle introduced in NFS Version 4 is the "volatile" filehandle,
   which attempts to accommodate certain server environments.

   The volatile filehandle type was introduced to address server
   functionality or implementation issues which make correct
   implementation of a persistent filehandle infeasible. Some server
   environments do not provide a filesystem level invariant that can be
   used to construct a persistent filehandle. The underlying server
   filesystem may not provide the invariant or the server's filesystem
   programming interfaces may not provide access to the needed
   invariant. Volatile filehandles may ease the implementation of
   server functionality such as hierarchical storage management or
   filesystem reorganization or migration.
   However, the volatile
   filehandle increases the implementation burden for the client.

   Since the client will need to handle persistent and volatile
   filehandles differently, a file attribute is defined which may be
   used by the client to determine the filehandle types being returned
   by the server.

4.2.1. General Properties of a Filehandle

   The filehandle contains all the information the server needs to
   distinguish an individual file. To the client, the filehandle is
   opaque. The client stores filehandles for use in a later request and
   can compare two filehandles from the same server for equality by
   doing a byte-by-byte comparison. However, the client MUST NOT
   otherwise interpret the contents of filehandles. If two filehandles
   from the same server are equal, they MUST refer to the same file.
   Servers SHOULD try to maintain a one-to-one correspondence between
   filehandles and files but this is not required. Clients MUST use
   filehandle comparisons only to improve performance, not for correct
   behavior. All clients need to be prepared for situations in which it
   cannot be determined whether two filehandles denote the same object
   and in such cases, avoid making invalid assumptions which might cause
   incorrect behavior. Further discussion of filehandle and attribute
   comparison in the context of data caching is presented in the section
   "Data Caching and File Identity".

   As an example, in the case that two different path names when
   traversed at the server terminate at the same filesystem object, the
   server SHOULD return the same filehandle for each path. This can
   occur if a hard link is used to create two file names which refer to
   the same underlying file object and associated data.
   For example, if
   paths /a/b/c and /a/d/c refer to the same file, the server SHOULD
   return the same filehandle for both path name traversals.

4.2.2. Persistent Filehandle

   A persistent filehandle is defined as having a fixed value for the
   lifetime of the filesystem object to which it refers. Once the
   server creates the filehandle for a filesystem object, the server
   MUST accept the same filehandle for the object for the lifetime of
   the object. If the server restarts or reboots, the NFS server must
   honor the same filehandle value as it did in the server's previous
   instantiation. Similarly, if the filesystem is migrated, the new NFS
   server must honor the same filehandle as the old NFS server.

   The persistent filehandle will become stale or invalid when the
   filesystem object is removed. When the server is presented with a
   persistent filehandle that refers to a deleted object, it MUST return
   an error of NFS4ERR_STALE. A filehandle may become stale when the
   filesystem containing the object is no longer available. The
   filesystem may become unavailable if it exists on removable media and
   the media is no longer available at the server, or the filesystem in
   whole has been destroyed, or the filesystem has simply been removed
   from the server's name space (i.e., unmounted in a UNIX environment).

4.2.3. Volatile Filehandle

   A volatile filehandle does not share the same longevity
   characteristics as a persistent filehandle. The server may determine
   that a volatile filehandle is no longer valid at many different
   points in time. If the server can definitively determine that a
   volatile filehandle refers to an object that has been removed, the
   server should return NFS4ERR_STALE to the client (as is the case for
   persistent filehandles).
   In all other cases where the server
   determines that a volatile filehandle can no longer be used, it
   should return an error of NFS4ERR_FHEXPIRED.

   The mandatory attribute "fh_expire_type" is used by the client to
   determine what type of filehandle the server is providing for a
   particular filesystem. This attribute is a bitmask with the
   following values:

   FH4_PERSISTENT
      The value of FH4_PERSISTENT is used to indicate a persistent
      filehandle, which is valid until the object is removed from the
      filesystem. The server will not return NFS4ERR_FHEXPIRED for
      this filehandle. FH4_PERSISTENT is defined as a value in which
      none of the bits specified below are set.

   FH4_VOLATILE_ANY
      The filehandle may expire at any time, except as specifically
      excluded (i.e., FH4_NOEXPIRE_WITH_OPEN).

   FH4_NOEXPIRE_WITH_OPEN
      May only be set when FH4_VOLATILE_ANY is set. If this bit is
      set, then the meaning of FH4_VOLATILE_ANY is qualified to
      exclude any expiration of the filehandle when it is open.

   FH4_VOL_MIGRATION
      The filehandle will expire as a result of migration. If
      FH4_VOLATILE_ANY is set, FH4_VOL_MIGRATION is redundant.

   FH4_VOL_RENAME
      The filehandle will expire during rename. This includes a
      rename by the requesting client or a rename by any other client.
      If FH4_VOLATILE_ANY is set, FH4_VOL_RENAME is redundant.

   Servers which provide volatile filehandles that may expire while
   open (i.e., if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set or if
   FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN is not set)
   should deny a RENAME or REMOVE that would affect an OPEN file of
   any of the components leading to the OPEN file. In addition,
   the server should deny all RENAME or REMOVE requests during the
   grace period upon server restart.
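
   As an illustration of the bitmask rules above, the following Python
   sketch classifies an fh_expire_type value. It is not part of the
   protocol. The numeric bit values are assumed here to be those of the
   protocol's XDR description, which is not reproduced in this section,
   and the function names are hypothetical.

```python
# Bit values assumed from the protocol's XDR description (not shown in
# this section). FH4_PERSISTENT is the value with no bits set.
FH4_PERSISTENT = 0x00000000
FH4_NOEXPIRE_WITH_OPEN = 0x00000001
FH4_VOLATILE_ANY = 0x00000002
FH4_VOL_MIGRATION = 0x00000004
FH4_VOL_RENAME = 0x00000008


def is_persistent(fh_expire_type: int) -> bool:
    """True when none of the volatility bits are set."""
    return fh_expire_type == FH4_PERSISTENT


def is_valid(fh_expire_type: int) -> bool:
    """FH4_NOEXPIRE_WITH_OPEN may only be set with FH4_VOLATILE_ANY."""
    if fh_expire_type & FH4_NOEXPIRE_WITH_OPEN:
        return bool(fh_expire_type & FH4_VOLATILE_ANY)
    return True


def may_expire_while_open(fh_expire_type: int) -> bool:
    """Hypothetical helper mirroring the 'may expire while open' rule:
    FH4_VOL_MIGRATION or FH4_VOL_RENAME is set, or FH4_VOLATILE_ANY is
    set and FH4_NOEXPIRE_WITH_OPEN is not set."""
    if fh_expire_type & (FH4_VOL_MIGRATION | FH4_VOL_RENAME):
        return True
    volatile_any = bool(fh_expire_type & FH4_VOLATILE_ANY)
    noexpire_open = bool(fh_expire_type & FH4_NOEXPIRE_WITH_OPEN)
    return volatile_any and not noexpire_open
```

   For instance, FH4_VOLATILE_ANY combined with FH4_NOEXPIRE_WITH_OPEN
   yields False from may_expire_while_open(), while FH4_VOL_RENAME alone
   yields True.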

   Note that the bits FH4_VOL_MIGRATION and FH4_VOL_RENAME allow
   the client to determine that expiration has occurred whenever a
   specific event occurs, without an explicit filehandle expiration
   error from the server. FH4_VOLATILE_ANY does not provide this form
   of information. In situations where the server will expire many,
   but not all filehandles upon migration (e.g., all but those that
   are open), FH4_VOLATILE_ANY (in this case with
   FH4_NOEXPIRE_WITH_OPEN) is a better choice since the client may
   not assume that all filehandles will expire when migration
   occurs, and it is likely that additional expirations will occur
   (as a result of file CLOSE) that are separated in time from the
   migration event itself.

4.2.4. One Method of Constructing a Volatile Filehandle

   A volatile filehandle, while opaque to the client, could contain:

      [volatile bit = 1 | server boot time | slot | generation number]

   o  slot is an index in the server volatile filehandle table

   o  generation number is the generation number for the table
      entry/slot

   When the client presents a volatile filehandle, the server makes the
   following checks, which assume that the check for the volatile bit
   has passed. If the server boot time is less than the current server
   boot time, return NFS4ERR_FHEXPIRED. If slot is out of range, return
   NFS4ERR_BADHANDLE. If the generation number does not match, return
   NFS4ERR_FHEXPIRED.

   When the server reboots, the table is gone (it is volatile).

   If the volatile bit is 0, then it is a persistent filehandle with a
   different structure following it.

4.3. Client Recovery from Filehandle Expiration

   If possible, the client SHOULD recover from the receipt of an
   NFS4ERR_FHEXPIRED error.
   The client must take on additional
   responsibility so that it may prepare itself to recover from the
   expiration of a volatile filehandle. If the server returns
   persistent filehandles, the client does not need these additional
   steps.

   For volatile filehandles, most commonly the client will need to store
   the component names leading up to and including the filesystem object
   in question. With these names, the client should be able to recover
   by finding a filehandle in the name space that is still available or
   by starting at the root of the server's filesystem name space.

   If the expired filehandle refers to an object that has been removed
   from the filesystem, obviously the client will not be able to recover
   from the expired filehandle.

   It is also possible that the expired filehandle refers to a file that
   has been renamed. If the file was renamed by another client, again
   it is possible that the original client will not be able to recover.
   However, in the case that the client itself is renaming the file and
   the file is open, it is possible that the client may be able to
   recover. The client can determine the new path name based on the
   processing of the rename request. The client can then regenerate the
   new filehandle based on the new path name. The client could also use
   the compound operation mechanism to construct a set of operations
   like:

      RENAME A B
      LOOKUP B
      GETFH

   Note that the COMPOUND procedure does not provide atomicity. This
   example only reduces the overhead of recovering from an expired
   filehandle.

5. File Attributes

   To meet the requirements of extensibility and increased
   interoperability with non-UNIX platforms, attributes must be handled
   in a flexible manner.
   The NFS version 3 fattr3 structure contains a
   fixed list of attributes that not all clients and servers are able to
   support or care about. The fattr3 structure cannot be extended as
   new needs arise and it provides no way to indicate non-support. With
   the NFS version 4 protocol, the client is able to query what
   attributes the server supports and construct requests with only those
   supported attributes (or a subset thereof).

   To this end, attributes are divided into three groups: mandatory,
   recommended, and named. Both mandatory and recommended attributes
   are supported in the NFS version 4 protocol by a specific and
   well-defined encoding and are identified by number. They are
   requested by setting a bit in the bit vector sent in the GETATTR
   request; the server response includes a bit vector to list what
   attributes were returned in the response. New mandatory or
   recommended attributes may be added to the NFS protocol between major
   revisions by publishing a standards-track RFC which allocates a new
   attribute number value and defines the encoding for the attribute.
   See the section "Minor Versioning" for further discussion.

   Named attributes are accessed by the new OPENATTR operation, which
   accesses a hidden directory of attributes associated with a
   filesystem object. OPENATTR takes a filehandle for the object and
   returns the filehandle for the attribute hierarchy. The filehandle
   for the named attributes is a directory object accessible by LOOKUP
   or READDIR and contains files whose names represent the named
   attributes and whose data bytes are the value of the attribute.
   For example:

      LOOKUP     "foo"       ; look up file
      GETATTR    attrbits
      OPENATTR               ; access foo's named attributes
      LOOKUP     "x11icon"   ; look up specific attribute
      READ       0,4096      ; read stream of bytes

   Named attributes are intended for data needed by applications rather
   than by an NFS client implementation. NFS implementors are strongly
   encouraged to define their new attributes as recommended attributes
   by bringing them to the IETF standards-track process.

   The set of attributes which are classified as mandatory is
   deliberately small since servers must do whatever it takes to support
   them. A server should support as many of the recommended attributes
   as possible but by their definition, the server is not required to
   support all of them. Attributes are deemed mandatory if the data is
   both needed by a large number of clients and is not otherwise
   reasonably computable by the client when support is not provided on
   the server.

   Note that the hidden directory returned by OPENATTR is a convenience
   for protocol processing. The client should not make any assumptions
   about the server's implementation of named attributes and whether the
   underlying filesystem at the server has a named attribute directory
   or not. Therefore, operations such as SETATTR and GETATTR on the
   named attribute directory are undefined.

5.1. Mandatory Attributes

   These MUST be supported by every NFS version 4 client and server in
   order to ensure a minimum level of interoperability. The server must
   store and return these attributes and the client must be able to
   function with an attribute set limited to these attributes. With
   just the mandatory attributes some client functionality may be
   impaired or limited in some ways.
   A client may ask for any of these
   attributes to be returned by setting a bit in the GETATTR request and
   the server must return their value.

5.2. Recommended Attributes

   These attributes are understood well enough to warrant support in the
   NFS version 4 protocol. However, they may not be supported on all
   clients and servers. A client may ask for any of these attributes to
   be returned by setting a bit in the GETATTR request but must handle
   the case where the server does not return them. A client may ask for
   the set of attributes the server supports and should not request
   attributes the server does not support. A server should be tolerant
   of requests for unsupported attributes and simply not return them
   rather than considering the request an error. It is expected that
   servers will support all attributes they comfortably can and only
   fail to support attributes which are difficult to support in their
   operating environments. A server should provide attributes whenever
   it does not have to "tell lies" to the client. For example, a file
   modification time should be either an accurate time or should not be
   supported by the server. This will not always be comfortable to
   clients but the client is better positioned to decide whether and how
   to fabricate or construct an attribute or whether to do without the
   attribute.

5.3. Named Attributes

   These attributes are not supported by direct encoding in the NFS
   Version 4 protocol but are accessed by string names rather than
   numbers and correspond to an uninterpreted stream of bytes which are
   stored with the filesystem object. The name space for these
   attributes may be accessed by using the OPENATTR operation.
   The
   OPENATTR operation returns a filehandle for a virtual "attribute
   directory" and further perusal of the name space may be done using
   READDIR and LOOKUP operations on this filehandle. Named attributes
   may then be examined or changed by normal READ and WRITE and CREATE
   operations on the filehandles returned from READDIR and LOOKUP.
   Named attributes may have attributes.

   It is recommended that servers support arbitrary named attributes. A
   client should not depend on the ability to store any named attributes
   in the server's filesystem. If a server does support named
   attributes, a client which is also able to handle them should be able
   to copy a file's data and meta-data with complete transparency from
   one location to another; this would imply that names allowed for
   regular directory entries are valid for named attribute names as
   well.

   Names of attributes will not be controlled by this document or other
   IETF standards track documents. See the section "IANA
   Considerations" for further discussion.

5.4. Classification of Attributes

   Each of the Mandatory and Recommended attributes can be classified in
   one of three categories: per server, per filesystem, or per
   filesystem object. Note that it is possible that some per filesystem
   attributes may vary within the filesystem. See the "homogeneous"
   attribute for its definition. Note that the attributes
   time_access_set and time_modify_set are not listed in this section
   because they are write-only attributes corresponding to time_access
   and time_modify, and are used in a special instance of SETATTR.

   o  The per server attribute is:

         lease_time

   o  The per filesystem attributes are:

         supp_attr, fh_expire_type, link_support, symlink_support,
         unique_handles, aclsupport, cansettime, case_insensitive,
         case_preserving, chown_restricted, files_avail, files_free,
         files_total, fs_locations, homogeneous, maxfilesize, maxname,
         maxread, maxwrite, no_trunc, space_avail, space_free,
         space_total, time_delta

   o  The per filesystem object attributes are:

         type, change, size, named_attr, fsid, rdattr_error, filehandle,
         ACL, archive, fileid, hidden, maxlink, mimetype, mode, numlinks,
         owner, owner_group, rawdev, space_used, system, time_access,
         time_backup, time_create, time_metadata, time_modify,
         mounted_on_fileid

   For quota_avail_hard, quota_avail_soft, and quota_used see their
   definitions below for the appropriate classification.

5.5. Mandatory Attributes - Definitions

   Name             #    DataType     Access   Description
   ___________________________________________________________________
   supp_attr        0    bitmap       READ     The bit vector which
                                               would retrieve all
                                               mandatory and
                                               recommended attributes
                                               that are supported for
                                               this object. The
                                               scope of this
                                               attribute applies to
                                               all objects with a
                                               matching fsid.

   type             1    nfs4_ftype   READ     The type of the object
                                               (file, directory,
                                               symlink, etc.)

   fh_expire_type   2    uint32       READ     Server uses this to
                                               specify filehandle
                                               expiration behavior to
                                               the client. See the
                                               section "Filehandles"
                                               for additional
                                               description.

   change           3    uint64       READ     A value created by the
                                               server that the client
                                               can use to determine
                                               if file data,
                                               directory contents or
                                               attributes of the
                                               object have been
                                               modified. The server
                                               may return the
                                               object's time_metadata
                                               attribute for this
                                               attribute's value but
                                               only if the filesystem
                                               object cannot be
                                               updated more
                                               frequently than the
                                               resolution of
                                               time_metadata.

   size             4    uint64       R/W      The size of the object
                                               in bytes.

   link_support     5    bool         READ     True, if the object's
                                               filesystem supports
                                               hard links.

   symlink_support  6    bool         READ     True, if the object's
                                               filesystem supports
                                               symbolic links.

   named_attr       7    bool         READ     True, if this object
                                               has named attributes.
                                               In other words, object
                                               has a non-empty named
                                               attribute directory.

   fsid             8    fsid4        READ     Unique filesystem
                                               identifier for the
                                               filesystem holding
                                               this object. fsid
                                               contains major and
                                               minor components each
                                               of which is a uint64.

   unique_handles   9    bool         READ     True, if two distinct
                                               filehandles are
                                               guaranteed to refer to
                                               two different
                                               filesystem objects.

   lease_time       10   nfs_lease4   READ     Duration of leases at
                                               server in seconds.

   rdattr_error     11   enum         READ     Error returned from
                                               getattr during
                                               readdir.

   filehandle       19   nfs_fh4      READ     The filehandle of this
                                               object (primarily for
                                               readdir requests).

5.6. Recommended Attributes - Definitions

   Name               #   Data Type     Access Description
   ______________________________________________________________________
   ACL                12  nfsace4<>     R/W    The access control
                                               list for the object.

   aclsupport         13  uint32        READ   Indicates what types
                                               of ACLs are supported
                                               on the current
                                               filesystem.

   archive            14  bool          R/W    True, if this file
                                               has been archived
                                               since the time of
                                               last modification
                                               (deprecated in favor
                                               of time_backup).

   cansettime         15  bool          READ   True, if the server
                                               is able to change the
                                               times for a
                                               filesystem object as
                                               specified in a
                                               SETATTR operation.

   case_insensitive   16  bool          READ   True, if filename
                                               comparisons on this
                                               filesystem are case
                                               insensitive.

   case_preserving    17  bool          READ   True, if filename
                                               case on this
                                               filesystem is
                                               preserved.

   chown_restricted   18  bool          READ   If TRUE, the server
                                               will reject any
                                               request to change
                                               either the owner or
                                               the group associated
                                               with a file if the
                                               caller is not a
                                               privileged user (for
                                               example, "root" in
                                               UNIX operating
                                               environments or in
                                               Windows 2000 the
                                               "Take Ownership"
                                               privilege).

   fileid             20  uint64        READ   A number uniquely
                                               identifying the file
                                               within the
                                               filesystem.

   files_avail        21  uint64        READ   File slots available
                                               to this user on the
                                               filesystem containing
                                               this object - this
                                               should be the
                                               smallest relevant
                                               limit.

   files_free         22  uint64        READ   Free file slots on
                                               the filesystem
                                               containing this
                                               object - this should
                                               be the smallest
                                               relevant limit.

   files_total        23  uint64        READ   Total file slots on
                                               the filesystem
                                               containing this
                                               object.

   fs_locations       24  fs_locations  READ   Locations where this
                                               filesystem may be
                                               found. If the server
                                               returns NFS4ERR_MOVED
                                               as an error, this
                                               attribute MUST be
                                               supported.

   hidden             25  bool          R/W    True, if the file is
                                               considered hidden
                                               with respect to the
                                               Windows API?

   homogeneous        26  bool          READ   True, if this
                                               object's filesystem
                                               is homogeneous, i.e.,
                                               per filesystem
                                               attributes are the
                                               same for all of the
                                               filesystem's objects.

   maxfilesize        27  uint64        READ   Maximum supported
                                               file size for the
                                               filesystem of this
                                               object.

   maxlink            28  uint32        READ   Maximum number of
                                               links for this
                                               object.

   maxname            29  uint32        READ   Maximum filename size
                                               supported for this
                                               object.

   maxread            30  uint64        READ   Maximum read size
                                               supported for this
                                               object.

   maxwrite           31  uint64        READ   Maximum write size
                                               supported for this
                                               object. This
                                               attribute SHOULD be
                                               supported if the file
                                               is writable. Lack of
                                               this attribute can
                                               lead to the client
                                               either wasting
                                               bandwidth or not
                                               receiving the best
                                               performance.

   mimetype           32  utf8<>        R/W    MIME body
                                               type/subtype of this
                                               object.

   mode               33  mode4         R/W    UNIX-style mode and
                                               permission bits for
                                               this object.

   no_trunc           34  bool          READ   True, if a name
                                               longer than name_max
                                               is used, an error
                                               will be returned and
                                               the name is not
                                               truncated.

   numlinks           35  uint32        READ   Number of hard links
                                               to this object.

   owner              36  utf8<>        R/W    The string name of
                                               the owner of this
                                               object.

   owner_group        37  utf8<>        R/W    The string name of
                                               the group ownership
                                               of this object.

   quota_avail_hard   38  uint64        READ   For definition see
                                               "Quota Attributes"
                                               section below.

   quota_avail_soft   39  uint64        READ   For definition see
                                               "Quota Attributes"
                                               section below.

   quota_used         40  uint64        READ   For definition see
                                               "Quota Attributes"
                                               section below.

   rawdev             41  specdata4     READ   Raw device
                                               identifier. UNIX
                                               device major/minor
                                               node information. If
                                               the value of type is
                                               not NF4BLK or NF4CHR,
                                               the value returned
                                               SHOULD NOT be
                                               considered useful.

   space_avail        42  uint64        READ   Disk space in bytes
                                               available to this
                                               user on the
                                               filesystem containing
                                               this object - this
                                               should be the
                                               smallest relevant
                                               limit.

   space_free         43  uint64        READ   Free disk space in
                                               bytes on the
                                               filesystem containing
                                               this object - this
                                               should be the
                                               smallest relevant
                                               limit.

   space_total        44  uint64        READ   Total disk space in
                                               bytes on the
                                               filesystem containing
                                               this object.

   space_used         45  uint64        READ   Number of filesystem
                                               bytes allocated to
                                               this object.

   system             46  bool          R/W    True, if this file is
                                               a "system" file with
                                               respect to the
                                               Windows API?

   time_access        47  nfstime4      READ   The time of last
                                               access to the object
                                               by a read that was
                                               satisfied by the
                                               server.

   time_access_set    48  settime4      WRITE  Set the time of last
                                               access to the object.
                                               SETATTR use only.

   time_backup        49  nfstime4      R/W    The time of last
                                               backup of the object.

   time_create        50  nfstime4      R/W    The time of creation
                                               of the object. This
                                               attribute does not
                                               have any relation to
                                               the traditional UNIX
                                               file attribute
                                               "ctime" or "change
                                               time".

   time_delta         51  nfstime4      READ   Smallest useful
                                               server time
                                               granularity.

   time_metadata      52  nfstime4      READ   The time of last
                                               meta-data
                                               modification of the
                                               object.

   time_modify        53  nfstime4      READ   The time of last
                                               modification to the
                                               object.

   time_modify_set    54  settime4      WRITE  Set the time of last
                                               modification to the
                                               object. SETATTR use
                                               only.

   mounted_on_fileid  55  uint64        READ   Like fileid, but if
                                               the target filehandle
                                               is the root of a
                                               filesystem return the
                                               fileid of the
                                               underlying directory.

5.7. Time Access

   As defined above, the time_access attribute represents the time of
   last access to the object by a read that was satisfied by the server.
   The notion of what is an "access" depends on the server's operating
   environment and/or the server's filesystem semantics. For example,
   for servers obeying POSIX semantics, time_access would be updated
   only by the READLINK, READ, and READDIR operations and not any of the
   operations that modify the content of the object. Of course, setting
   the corresponding time_access_set attribute is another way to modify
   the time_access attribute.

   Whenever the file object resides on a writable filesystem, the server
   should make best efforts to record time_access into stable storage.
   However, to mitigate the performance effects of doing so, and most
   especially whenever the server is satisfying the read of the object's
   content from its cache, the server MAY cache access time updates and
   lazily write them to stable storage. It is also acceptable to give
   administrators of the server the option to disable time_access
   updates.

5.8. Interpreting owner and owner_group

   The recommended attributes "owner" and "owner_group" (and also users
   and groups within the "acl" attribute) are represented in terms of a
   UTF-8 string. To avoid a representation that is tied to a particular
   underlying implementation at the client or server, the use of the
   UTF-8 string has been chosen. Note that section 6.1 of [RFC2624]
   provides additional rationale. It is expected that the client and
   server will have their own local representation of owner and
   owner_group that is used for local storage or presentation to the end
   user. Therefore, it is expected that when these attributes are
   transferred between the client and server, the local representation
   is translated to a syntax of the form "user@dns_domain". This will
   allow a client and server that do not use the same local
   representation to translate to a common syntax that can be
   interpreted by both.
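
   The translation between a local representation and the
   "user@dns_domain" syntax described above might be sketched as
   follows. This is only an illustration in Python, assuming a
   hypothetical in-memory table keyed by numeric id; the table
   contents, the example.org domain, and the function names are
   assumptions made for this sketch, and the protocol does not
   prescribe any particular translation method.

```python
from typing import Dict, Optional

# Hypothetical local table: numeric user id -> "user@dns_domain" string.
UID_TO_OWNER: Dict[int, str] = {
    1001: "alice@example.org",
    1002: "bob@example.org",
}

# Reverse table used when a string arrives from the other side.
OWNER_TO_UID: Dict[str, int] = {
    owner: uid for uid, owner in UID_TO_OWNER.items()
}


def local_to_owner(uid: int) -> Optional[str]:
    """Translate a local numeric id to the common string syntax.
    Returns None when this table holds no translation."""
    return UID_TO_OWNER.get(uid)


def owner_to_local(owner: str) -> Optional[int]:
    """Translate a received owner string to a local numeric id.
    Returns None when this table holds no translation."""
    return OWNER_TO_UID.get(owner)
```

   With this table, local_to_owner(1001) gives "alice@example.org" and
   owner_to_local("bob@example.org") gives 1002; an unknown string
   gives None, leaving the caller to decide how to proceed.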

   Similarly, security principals may be represented in different ways
   by different security mechanisms. Servers normally translate these
   representations into a common format, generally that used by local
   storage, to serve as a means of identifying the users corresponding
   to these security principals. When these local identifiers are
   translated to the form of the owner attribute, associated with files
   created by such principals, they identify, in a common format, the
   users associated with each corresponding set of security principals.

   The translation used to interpret owner and group strings is not
   specified as part of the protocol. This allows various solutions to
   be employed. For example, a local translation table may be consulted
   that maps between a numeric id and the user@dns_domain syntax. A name
   service may also be used to accomplish the translation. A server may
   provide a more general service, not limited by any particular
   translation (which would only translate a limited set of possible
   strings) by storing the owner and owner_group attributes in local
   storage without any translation or it may augment a translation
   method by storing the entire string for attributes for which no
   translation is available while using the local representation for
   those cases in which a translation is available.

   Servers that do not provide support for all possible values of the
   owner and owner_group attributes should return an error
   (NFS4ERR_BADOWNER) when a string is presented that has no
   translation, as the value to be set for a SETATTR of the owner,
   owner_group, or acl attributes.
When a server does accept an owner or owner_group value as valid on a
SETATTR (and similarly for the owner and group strings in an acl), it
is promising to return that same string when a corresponding GETATTR
is done. Configuration changes and ill-constructed name translations
(those that contain aliasing) may make that promise impossible to
honor. Servers should make appropriate efforts to avoid a situation
in which these attributes have their values changed when no real
change to ownership has occurred.

The "dns_domain" portion of the owner string is meant to be a DNS
domain name, for example, user@ietf.org. Servers should accept as
valid a set of users for at least one domain. A server may treat
other domains as having no valid translations. A more general
service is provided when a server is capable of accepting users for
multiple domains, or for all domains, subject to security
constraints.

In the case where there is no translation available to the client or
server, the attribute value must be constructed without the "@".
Therefore, the absence of the @ from the owner or owner_group
attribute signifies that no translation was available at the sender
and that the receiver of the attribute should not use that string as
a basis for translation into its own internal format. Even though
the attribute value cannot be translated, it may still be useful.
In the case of a client, the attribute string may be used for local
display of ownership.

To provide a greater degree of compatibility with previous versions
of NFS (i.e., v2 and v3), which identified users and groups by 32-bit
unsigned uids and gids, owner and group strings that consist of
decimal numeric values with no leading zeros can be given a special
interpretation by clients and servers which choose to provide such
support.
The receiver may treat such a user or group string as representing
the same user as would be represented by a v2/v3 uid or gid having
the corresponding numeric value. A server is not obligated to accept
such a string, but may return an NFS4ERR_BADOWNER instead. To avoid
this mechanism being used to subvert user and group translation, so
that a client might pass all of the owners and groups in numeric
form, a server SHOULD return an NFS4ERR_BADOWNER error when there is
a valid translation for the user or owner designated in this way. In
that case, the client must use the appropriate name@domain string and
not the special form for compatibility.

The owner string "nobody" may be used to designate an anonymous user,
which will be associated with a file created by a security principal
that cannot be mapped through normal means to the owner attribute.

5.9.  Character Case Attributes

With respect to the case_insensitive and case_preserving attributes,
each UCS-4 character (which UTF-8 encodes) has a "long descriptive
name" [RFC1345] which may or may not include the word "CAPITAL" or
"SMALL". The presence of SMALL or CAPITAL allows an NFS server to
implement unambiguous and efficient table driven mappings for case
insensitive comparisons, and non-case-preserving storage. For
general character handling and internationalization issues, see the
section "Internationalization".

5.10.  Quota Attributes

For the attributes related to filesystem quotas, the following
definitions apply:

quota_avail_soft
   The value in bytes which represents the amount of additional
   disk space that can be allocated to this file or directory
   before the user may reasonably be warned.
   It is understood that this space may be consumed by allocations
   to other files or directories though there is a rule as to which
   other files or directories.

quota_avail_hard
   The value in bytes which represents the amount of additional disk
   space beyond the current allocation that can be allocated to
   this file or directory before further allocations will be
   refused. It is understood that this space may be consumed by
   allocations to other files or directories.

quota_used
   The value in bytes which represents the amount of disk space used
   by this file or directory and possibly a number of other similar
   files or directories, where the set of "similar" meets at least
   the criterion that allocating space to any file or directory in
   the set will reduce the "quota_avail_hard" of every other file
   or directory in the set.

   Note that there may be a number of distinct but overlapping sets
   of files or directories for which a quota_used value is
   maintained, for example, "all files with a given owner", "all
   files with a given group owner", etc.

   The server is at liberty to choose any of those sets but should
   do so in a repeatable way. The rule may be configured
   per-filesystem or may be "choose the set with the smallest quota".

5.11.  Access Control Lists

The NFS version 4 ACL attribute is an array of access control entries
(ACE). There are various access control entry types, as defined in
the Section "ACE type". The server is able to communicate which ACE
types are supported by returning the appropriate value within the
aclsupport attribute. Each ACE covers one or more operations on a
file or directory as described in the Section "ACE Access Mask". It
may also contain one or more flags that modify the semantics of the
ACE as defined in the Section "ACE flag".

The NFS ACE attribute is defined as follows:

   typedef uint32_t        acetype4;
   typedef uint32_t        aceflag4;
   typedef uint32_t        acemask4;

   struct nfsace4 {
           acetype4        type;
           aceflag4        flag;
           acemask4        access_mask;
           utf8string      who;
   };

To determine if a request succeeds, each nfsace4 entry is processed
in order by the server. Only ACEs which have a "who" that matches
the requester are considered. Each ACE is processed until all of the
bits of the requester's access have been ALLOWED. Once a bit (see
below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no longer
considered in the processing of later ACEs. If an ACCESS_DENIED_ACE
is encountered where the requester's access still has unALLOWED bits
in common with the "access_mask" of the ACE, the request is denied.
However, unlike the ALLOWED and DENIED ACE types, the ALARM and AUDIT
ACE types do not affect a requester's access, and instead are for
triggering events as a result of a requester's access attempt.
Therefore, all AUDIT and ALARM ACEs are processed until the end of
the ACL. When the ACL is fully processed, if there are bits in the
requester's mask that have not been considered, whether the server
allows or denies the access is undefined. If there is a mode
attribute on the file, then this cannot happen, since the mode's
MODE4_*OTH bits will map to EVERYONE@ ACEs that unambiguously specify
the requester's access.

The NFS version 4 ACL model is quite rich. Some server platforms may
provide access control functionality that goes beyond the UNIX-style
mode attribute, but which is not as rich as the NFS ACL model.
So that users can take advantage of this more limited functionality,
the server may indicate that it supports ACLs as long as it follows
the guidelines for mapping between its ACL model and the NFS version
4 ACL model.

The situation is complicated by the fact that a server may have
multiple modules that enforce ACLs. For example, the enforcement for
NFS version 4 access may be different from the enforcement for local
access, and both may be different from the enforcement for access
through other protocols such as SMB. So it may be useful for a
server to accept an ACL even if not all of its modules are able to
support it.

The guiding principle in all cases is that the server must not accept
ACLs that appear to make the file more secure than it really is.

5.11.1.  ACE type

   Type     Description
   _____________________________________________________
   ALLOW    Explicitly grants the access defined in
            acemask4 to the file or directory.

   DENY     Explicitly denies the access defined in
            acemask4 to the file or directory.

   AUDIT    LOG (system dependent) any access
            attempt to a file or directory which
            uses any of the access methods specified
            in acemask4.

   ALARM    Generate a system ALARM (system
            dependent) when any access attempt is
            made to a file or directory for the
            access methods specified in acemask4.

A server need not support all of the above ACE types. The bitmask
constants used to represent the above definitions within the
aclsupport attribute are as follows:

   const ACL4_SUPPORT_ALLOW_ACL    = 0x00000001;
   const ACL4_SUPPORT_DENY_ACL     = 0x00000002;
   const ACL4_SUPPORT_AUDIT_ACL    = 0x00000004;
   const ACL4_SUPPORT_ALARM_ACL    = 0x00000008;

The semantics of the "type" field follow the descriptions provided
above.
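
The ACE processing order described in section 5.11 can be sketched as
follows. This is an illustrative sketch only, not a normative
algorithm: the Ace class and the evaluate function are hypothetical
names, "who" matching is reduced to plain string equality, and the
SUCCESS and FAILED flags are ignored. The numeric type values mirror
the acetype4 constants defined in this section.

```python
from dataclasses import dataclass

# acetype4 values for the "type" field, as defined in this section.
ALLOW, DENY, AUDIT, ALARM = 0, 1, 2, 3


@dataclass
class Ace:
    type: int
    access_mask: int
    who: str


def evaluate(acl, requester, requested):
    """Return (decision, events) for a requested access bitmask.

    decision is True (allowed), False (denied), or None when the ACL
    leaves some requested bits unconsidered (undefined by the protocol).
    events lists the AUDIT and ALARM ACEs that matched the attempt.
    """
    remaining = requested  # bits that have not yet been ALLOWED
    decision = None
    events = []
    for ace in acl:
        if ace.who != requester:  # simplified matching (assumption)
            continue
        if ace.type in (AUDIT, ALARM):
            # AUDIT and ALARM never affect access; they are collected
            # until the end of the ACL.
            if ace.access_mask & requested:
                events.append(ace)
        elif decision is None:
            if ace.type == ALLOW:
                remaining &= ~ace.access_mask
                if remaining == 0:
                    decision = True
            elif ace.type == DENY and ace.access_mask & remaining:
                decision = False
    return decision, events
```

Note that an earlier ALLOW entry shields a bit from a later DENY
entry, which is why the order of entries in the ACL matters.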

The constants used for the type field (acetype4) are as follows:

   const ACE4_ACCESS_ALLOWED_ACE_TYPE      = 0x00000000;
   const ACE4_ACCESS_DENIED_ACE_TYPE       = 0x00000001;
   const ACE4_SYSTEM_AUDIT_ACE_TYPE        = 0x00000002;
   const ACE4_SYSTEM_ALARM_ACE_TYPE        = 0x00000003;

Clients should not attempt to set an ACE unless the server claims
support for that ACE type. If the server receives a request to set
an ACE that it cannot store, it MUST reject the request with
NFS4ERR_ATTRNOTSUPP. If the server receives a request to set an ACE
that it can store but cannot enforce, the server SHOULD reject the
request with NFS4ERR_ATTRNOTSUPP.

Example: suppose a server can enforce NFS ACLs for NFS access but
cannot enforce ACLs for local access. If arbitrary processes can run
on the server, then the server SHOULD NOT indicate ACL support. On
the other hand, if only trusted administrative programs run locally,
then the server may indicate ACL support.

5.11.2.  ACE Access Mask

The access_mask field contains values based on the following:

   Access              Description
   _______________________________________________________________
   READ_DATA           Permission to read the data of the file
   LIST_DIRECTORY      Permission to list the contents of a
                       directory
   WRITE_DATA          Permission to modify the file's data
   ADD_FILE            Permission to add a new file to a
                       directory
   APPEND_DATA         Permission to append data to a file
   ADD_SUBDIRECTORY    Permission to create a subdirectory in a
                       directory
   READ_NAMED_ATTRS    Permission to read the named attributes
                       of a file
   WRITE_NAMED_ATTRS   Permission to write the named attributes
                       of a file
   EXECUTE             Permission to execute a file
   DELETE_CHILD        Permission to delete a file or directory
                       within a directory
   READ_ATTRIBUTES     Permission to read basic attributes
                       (non-acls) of a file
   WRITE_ATTRIBUTES    Permission to change basic attributes
                       (non-acls) of a file
   DELETE              Permission to delete the file
   READ_ACL            Permission to read the ACL
   WRITE_ACL           Permission to write the ACL
   WRITE_OWNER         Permission to change the owner
   SYNCHRONIZE         Permission to access file locally at the
                       server with synchronous reads and writes

The bitmask constants used for the access mask field are as follows:

   const ACE4_READ_DATA            = 0x00000001;
   const ACE4_LIST_DIRECTORY       = 0x00000001;
   const ACE4_WRITE_DATA           = 0x00000002;
   const ACE4_ADD_FILE             = 0x00000002;
   const ACE4_APPEND_DATA          = 0x00000004;
   const ACE4_ADD_SUBDIRECTORY     = 0x00000004;
   const ACE4_READ_NAMED_ATTRS     = 0x00000008;
   const ACE4_WRITE_NAMED_ATTRS    = 0x00000010;
   const ACE4_EXECUTE              = 0x00000020;
   const ACE4_DELETE_CHILD         = 0x00000040;
   const ACE4_READ_ATTRIBUTES      = 0x00000080;
   const ACE4_WRITE_ATTRIBUTES     = 0x00000100;
   const ACE4_DELETE               = 0x00010000;
   const ACE4_READ_ACL             = 0x00020000;
   const ACE4_WRITE_ACL            = 0x00040000;
   const ACE4_WRITE_OWNER          = 0x00080000;
   const ACE4_SYNCHRONIZE          = 0x00100000;

Server implementations need not provide the granularity of control
that is implied by this list of masks. For example, POSIX-based
systems might not distinguish APPEND_DATA (the ability to append to a
file) from WRITE_DATA (the ability to modify existing contents); both
masks would be tied to a single "write" permission. When such a
server returns attributes to the client, it would show both
APPEND_DATA and WRITE_DATA if and only if the write permission is
enabled.

If a server receives a SETATTR request that it cannot accurately
implement, it should err in the direction of more restricted
access. For example, suppose a server cannot distinguish overwriting
data from appending new data, as described in the previous paragraph.
If a client submits an ACE where APPEND_DATA is set but WRITE_DATA is
not (or vice versa), the server should reject the request with
NFS4ERR_ATTRNOTSUPP. Nonetheless, if the ACE has type DENY, the
server may silently turn on the other bit, so that both APPEND_DATA
and WRITE_DATA are denied.

5.11.3.  ACE flag

The "flag" field contains values based on the following descriptions.

ACE4_FILE_INHERIT_ACE

Can be placed on a directory and indicates that this ACE should be
added to each new non-directory file created.

ACE4_DIRECTORY_INHERIT_ACE

Can be placed on a directory and indicates that this ACE should be
added to each new directory created.

ACE4_INHERIT_ONLY_ACE

Can be placed on a directory but does not apply to the directory,
only to newly created files/directories as specified by the above two
flags.

ACE4_NO_PROPAGATE_INHERIT_ACE

Can be placed on a directory.
Normally when a new directory is created and an ACE exists on the
parent directory which is marked ACE4_DIRECTORY_INHERIT_ACE, two ACEs
are placed on the new directory: one for the directory itself and one
which is an inheritable ACE for newly created directories. This flag
tells the server to not place an ACE on the newly created directory
which is inheritable by subdirectories of the created directory.

ACE4_SUCCESSFUL_ACCESS_ACE_FLAG

ACE4_FAILED_ACCESS_ACE_FLAG

The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and
ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits relate only to
ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE
(ALARM) ACE types. If, during the processing of the file's ACL, the
server encounters an AUDIT or ALARM ACE that matches the principal
attempting the OPEN, the server notes that fact, and the presence, if
any, of the SUCCESS and FAILED flags encountered in the AUDIT or
ALARM ACE. Once the server completes the ACL processing, and the
share reservation processing, and the OPEN call, it then notes if the
OPEN succeeded or failed. If the OPEN succeeded, and if the SUCCESS
flag was set for a matching AUDIT or ALARM, then the appropriate
AUDIT or ALARM event occurs. If the OPEN failed, and if the FAILED
flag was set for the matching AUDIT or ALARM, then the appropriate
AUDIT or ALARM event occurs. Clearly either or both of the SUCCESS
or FAILED can be set, but if neither is set, the AUDIT or ALARM ACE
is not useful.

The previously described processing applies to that of the ACCESS
operation as well. The difference is that "success" or "failure"
does not mean whether ACCESS returns NFS4_OK or not. Success means
whether ACCESS returns all requested and supported bits.
Failure means whether ACCESS failed to return a bit that was
requested and supported.

ACE4_IDENTIFIER_GROUP

Indicates that the "who" refers to a GROUP as defined under UNIX.

The bitmask constants used for the flag field are as follows:

   const ACE4_FILE_INHERIT_ACE             = 0x00000001;
   const ACE4_DIRECTORY_INHERIT_ACE        = 0x00000002;
   const ACE4_NO_PROPAGATE_INHERIT_ACE     = 0x00000004;
   const ACE4_INHERIT_ONLY_ACE             = 0x00000008;
   const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG   = 0x00000010;
   const ACE4_FAILED_ACCESS_ACE_FLAG       = 0x00000020;
   const ACE4_IDENTIFIER_GROUP             = 0x00000040;

A server need not support any of these flags. If the server supports
flags that are similar to, but not exactly the same as, these flags,
the implementation may define a mapping between the protocol-defined
flags and the implementation-defined flags. Again, the guiding
principle is that the file not appear to be more secure than it
really is.

For example, suppose a client tries to set an ACE with
ACE4_FILE_INHERIT_ACE set but not ACE4_DIRECTORY_INHERIT_ACE. If the
server does not support any form of ACL inheritance, the server
should reject the request with NFS4ERR_ATTRNOTSUPP. If the server
supports a single "inherit ACE" flag that applies to both files and
directories, the server may reject the request (i.e., requiring the
client to set both the file and directory inheritance flags). The
server may also accept the request and silently turn on the
ACE4_DIRECTORY_INHERIT_ACE flag.

5.11.4.  ACE who

There are several special identifiers ("who") which need to be
understood universally, rather than in the context of a particular
DNS domain. Some of these identifiers cannot be understood when an
NFS client accesses the server, but have meaning when a local process
accesses the file.
The ability to display and modify these permissions is permitted over
NFS, even if none of the access methods on the server understands the
identifiers.

   Who                 Description
   _______________________________________________________________
   "OWNER"             The owner of the file.
   "GROUP"             The group associated with the file.
   "EVERYONE"          The world.
   "INTERACTIVE"       Accessed from an interactive terminal.
   "NETWORK"           Accessed via the network.
   "DIALUP"            Accessed as a dialup user to the server.
   "BATCH"             Accessed from a batch job.
   "ANONYMOUS"         Accessed without any authentication.
   "AUTHENTICATED"     Any authenticated user (opposite of
                       ANONYMOUS).
   "SERVICE"           Access from a system service.

To avoid conflict, these special identifiers are distinguished by an
appended "@" and should appear in the form "xxxx@" (note: no domain
name after the "@"). For example: ANONYMOUS@.

5.11.5.  Mode Attribute

The NFS version 4 mode attribute is based on the UNIX mode bits. The
following bits are defined:

   const MODE4_SUID = 0x800;  /* set user id on execution */
   const MODE4_SGID = 0x400;  /* set group id on execution */
   const MODE4_SVTX = 0x200;  /* save text even after use */
   const MODE4_RUSR = 0x100;  /* read permission: owner */
   const MODE4_WUSR = 0x080;  /* write permission: owner */
   const MODE4_XUSR = 0x040;  /* execute permission: owner */
   const MODE4_RGRP = 0x020;  /* read permission: group */
   const MODE4_WGRP = 0x010;  /* write permission: group */
   const MODE4_XGRP = 0x008;  /* execute permission: group */
   const MODE4_ROTH = 0x004;  /* read permission: other */
   const MODE4_WOTH = 0x002;  /* write permission: other */
   const MODE4_XOTH = 0x001;  /* execute permission: other */

Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply to the principal
identified in the owner attribute.
Bits MODE4_RGRP, MODE4_WGRP, and MODE4_XGRP apply to the principals
identified in the owner_group attribute. Bits MODE4_ROTH, MODE4_WOTH,
and MODE4_XOTH apply to any principal that does not match that in the
owner attribute, and does not have a group matching that of the
owner_group attribute.

The remaining bits are not defined by this protocol and MUST NOT be
used. The minor version mechanism must be used to define further bit
usage.

Note that in UNIX, if a file has the MODE4_SGID bit set and no
MODE4_XGRP bit set, then READ and WRITE must use mandatory file
locking.

5.11.6.  Mode and ACL Attribute

The server that supports both mode and ACL must take care to
synchronize the MODE4_*USR, MODE4_*GRP, and MODE4_*OTH bits with the
ACEs which have respective who fields of "OWNER@", "GROUP@", and
"EVERYONE@" so that the client can see semantically equivalent access
permissions exist whether the client asks for owner, owner_group and
mode attributes, or for just the ACL.

Because the mode attribute includes bits (e.g., MODE4_SVTX) that have
nothing to do with ACL semantics, it is permitted for clients to
specify both the ACL attribute and mode in the same SETATTR
operation. However, because there is no prescribed order for
processing the attributes in a SETATTR, the client must ensure that
the ACL attribute, if specified without mode, would produce the
desired mode bits, and conversely, the mode attribute, if specified
without ACL, would produce the desired "OWNER@", "GROUP@", and
"EVERYONE@" ACEs.

5.11.7.  mounted_on_fileid

UNIX-based operating environments connect a filesystem into the
namespace by connecting (mounting) the filesystem onto the existing
file object (the mount point, usually a directory) of an existing
filesystem.
When the mount point's parent directory is read via an API like
readdir(), the returned results are directory entries, each with a
component name and a fileid. The fileid of the mount point's
directory entry will be different from the fileid that the stat()
system call returns. The stat() system call is returning the fileid
of the root of the mounted filesystem, whereas readdir() is returning
the fileid stat() would have returned before any filesystems were
mounted on the mount point.

Unlike NFS version 3, NFS version 4 allows a client's LOOKUP request
to cross other filesystems. The client detects the filesystem
crossing whenever the filehandle argument of LOOKUP has an fsid
attribute different from that of the filehandle returned by LOOKUP. A
UNIX-based client will consider this a "mount point crossing". UNIX
has a legacy scheme for allowing a process to determine its current
working directory. This relies on readdir() of a mount point's parent
and stat() of the mount point returning fileids as previously
described. The mounted_on_fileid attribute corresponds to the fileid
that readdir() would have returned as described previously.

While the NFS version 4 client could simply fabricate a fileid
corresponding to what mounted_on_fileid provides (and if the server
does not support mounted_on_fileid, the client has no choice), there
is a risk that the client will generate a fileid that conflicts with
one that is already assigned to another object in the filesystem.
Instead, if the server can provide the mounted_on_fileid, the
potential for client operational problems in this area is eliminated.

If the server detects that there is no mount point at the target
file object, then the value for mounted_on_fileid that it returns is
the same as that of the fileid attribute.
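
The rule above can be illustrated with a small sketch of how a server
might choose the value to return. The function name, the dictionary
layout of obj, and the mount_table mapping are all hypothetical and
stand in for whatever bookkeeping a real server keeps about its mount
points.

```python
def mounted_on_fileid(obj, mount_table):
    """Return the fileid that readdir() of the parent would report.

    obj is a dict with a "path" key and a "fileid" key (the value of
    the fileid attribute). mount_table maps a mount point path to the
    fileid of the directory covered by the mount. Both structures are
    assumptions made for this illustration.
    """
    # When nothing is mounted at this object, the result is the same
    # as the fileid attribute.
    return mount_table.get(obj["path"], obj["fileid"])
```

For the root of a mounted filesystem the two attributes differ; for
every other object they are equal.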

The mounted_on_fileid attribute is RECOMMENDED, so the server SHOULD
provide it if possible, and for a UNIX-based server, this is
straightforward. Usually, mounted_on_fileid will be requested during
a READDIR operation, in which case it is trivial (at least for
UNIX-based servers) to return mounted_on_fileid since it is equal to
the fileid of a directory entry returned by readdir(). If
mounted_on_fileid is requested in a GETATTR operation, the server
should obey an invariant that has it returning a value that is equal
to the file object's entry in the object's parent directory, i.e.,
what readdir() would have returned. Some operating environments
allow a series of two or more filesystems to be mounted onto a single
mount point. In this case, for the server to obey the aforementioned
invariant, it will need to find the base mount point, and not the
intermediate mount points.

6.  Filesystem Migration and Replication

With the use of the recommended attribute "fs_locations", the NFS
version 4 server has a method of providing filesystem migration or
replication services. For the purposes of migration and replication,
a filesystem will be defined as all files that share a given fsid
(both major and minor values are the same).

The fs_locations attribute provides a list of filesystem locations.
These locations are specified by providing the server name (either
DNS domain or IP address) and the path name representing the root of
the filesystem. Depending on the type of service being provided, the
list will provide a new location or a set of alternate locations for
the filesystem. The client will use this information to redirect its
requests to the new server.

6.1.  Replication

It is expected that filesystem replication will be used in the case
of read-only data.
Typically, the filesystem will be replicated on two or more servers.
The fs_locations attribute will provide the list of these locations
to the client. On first access of the filesystem, the client should
obtain the value of the fs_locations attribute. If, in the future,
the client finds the server unresponsive, the client may attempt to
use another server specified by fs_locations.

If applicable, the client must take the appropriate steps to recover
valid filehandles from the new server. This is described in more
detail in the following sections.

6.2.  Migration

Filesystem migration is used to move a filesystem from one server to
another. Migration is typically used for a filesystem that is
writable and has a single copy. The expected use of migration is for
load balancing or general resource reallocation. The protocol does
not specify how the filesystem will be moved between servers. This
server-to-server transfer mechanism is left to the server
implementor. However, the method used to communicate the migration
event between client and server is specified here.

Once the servers participating in the migration have completed the
move of the filesystem, the error NFS4ERR_MOVED will be returned for
subsequent requests received by the original server. The
NFS4ERR_MOVED error is returned for all operations except PUTFH and
GETATTR. Upon receiving the NFS4ERR_MOVED error, the client will
obtain the value of the fs_locations attribute. The client will then
use the contents of the attribute to redirect its requests to the
specified server. To facilitate the use of GETATTR, operations such
as PUTFH must also be accepted by the server for the migrated
filesystem's filehandles.
Note that if the server returns NFS4ERR_MOVED, the server MUST
support the fs_locations attribute.

If the client requests more attributes than just fs_locations, the
server may return fs_locations only. This is to be expected since
the server has migrated the filesystem and may not have a method of
obtaining additional attribute data.

The server implementor needs to be careful in developing a migration
solution. The server must consider all of the state information
clients may have outstanding at the server. This includes but is not
limited to locking/share state, delegation state, and asynchronous
file writes which are represented by WRITE and COMMIT verifiers. The
server should strive to minimize the impact on its clients during and
after the migration process.

6.3.  Interpretation of the fs_locations Attribute

The fs_locations attribute is structured in the following way:

   struct fs_location {
           utf8string      server<>;
           pathname4       rootpath;
   };

   struct fs_locations {
           pathname4       fs_root;
           fs_location     locations<>;
   };

The fs_location struct is used to represent the location of a
filesystem by providing a server name and the path to the root of the
filesystem. For a multi-homed server or a set of servers that use
the same rootpath, an array of server names may be provided. An
entry in the server array is a UTF-8 string and represents one of a
traditional DNS host name, IPv4 address, or IPv6 address. It is not
a requirement that all servers that share the same rootpath be listed
in one fs_location struct. The array of server names is provided for
convenience. Servers that share the same rootpath may also be listed
in separate fs_location entries in the fs_locations attribute.

The fs_locations struct and attribute then contain an array of
locations.
Since the name space of each server may be constructed differently,
the "fs_root" field is provided. The path represented by fs_root
represents the location of the filesystem in the server's name space.
Therefore, the fs_root path is only associated with the server from
which the fs_locations attribute was obtained. The fs_root path is
meant to aid the client in locating the filesystem at the various
servers listed.

As an example, there is a replicated filesystem located at two
servers (servA and servB). At servA the filesystem is located at
path "/a/b/c". At servB the filesystem is located at path "/x/y/z".
In this example the client accesses the filesystem first at servA
with a multi-component lookup path of "/a/b/c/d". Since the client
used a multi-component lookup to obtain the filehandle at "/a/b/c/d",
it is unaware that the filesystem's root is located in servA's name
space at "/a/b/c". When the client switches to servB, it will need
to determine that the directory it first referenced at servA is now
represented by the path "/x/y/z/d" on servB. To facilitate this, the
fs_locations attribute provided by servA would have a fs_root value
of "/a/b/c" and two fs_location entries. One entry will be for
itself (servA) and the other will be for servB with a path of
"/x/y/z". With this information, the client is able to substitute
"/x/y/z" for the "/a/b/c" at the beginning of its access path and
construct "/x/y/z/d" to use for the new server.

See the section "Security Considerations" for a discussion on the
recommendations for the security flavor to be used by any GETATTR
operation that requests the "fs_locations" attribute.

6.4.  Filehandle Recovery for Migration or Replication

Filehandles for filesystems that are replicated or migrated generally
have the same semantics as for filesystems that are not replicated or
migrated. For example, if a filesystem has persistent filehandles
and it is migrated to another server, the filehandle values for the
filesystem will be valid at the new server.

For volatile filehandles, the servers involved likely do not have a
mechanism to transfer filehandle format and content between
themselves. Therefore, a server may have difficulty in determining
if a volatile filehandle from an old server should return an error of
NFS4ERR_FHEXPIRED. For this reason, the client is informed, with the
use of the fh_expire_type attribute, whether volatile filehandles
will expire at the migration or replication event. If the bit
FH4_VOL_MIGRATION is set in the fh_expire_type attribute, the client
must treat the volatile filehandle as if the server had returned the
NFS4ERR_FHEXPIRED error. At the migration or replication event in
the presence of the FH4_VOL_MIGRATION bit, the client will not
present the original or old volatile filehandle to the new server.
The client will start its communication with the new server by
recovering its filehandles using the saved file names.

7.  NFS Server Name Space

7.1.  Server Exports

On a UNIX server the name space describes all the files reachable by
pathnames under the root directory or "/". On a Windows NT server
the name space constitutes all the files on disks named by mapped
disk letters. NFS server administrators rarely make the entire
server's filesystem name space available to NFS clients. More often
portions of the name space are made available via an "export"
feature.
In previous versions of the NFS protocol, the root filehandle for each export is obtained through the MOUNT protocol; the client sends a string that identifies the export of name space and the server returns the root filehandle for it. The MOUNT protocol supports an EXPORTS procedure that will enumerate the server's exports.

7.2. Browsing Exports

The NFS version 4 protocol provides a root filehandle that clients can use to obtain filehandles for these exports via a multi-component LOOKUP. A common user experience is to use a graphical user interface (perhaps a file "Open" dialog window) to find a file via progressive browsing through a directory tree. The client must be able to move from one export to another export via single-component, progressive LOOKUP operations.

This style of browsing is not well supported by the NFS version 2 and 3 protocols. The client expects all LOOKUP operations to remain within a single server filesystem. For example, the device attribute will not change. This prevents a client from taking name space paths that span exports.

An automounter on the client can obtain a snapshot of the server's name space using the EXPORTS procedure of the MOUNT protocol. If it understands the server's pathname syntax, it can create an image of the server's name space on the client. The parts of the name space that are not exported by the server are filled in with a "pseudo filesystem" that allows the user to browse from one mounted filesystem to another. There is a drawback to this representation of the server's name space on the client: it is static. If the server administrator adds a new export the client will be unaware of it.

7.3. Server Pseudo Filesystem

NFS version 4 servers avoid this name space inconsistency by presenting all the exports within the framework of a single server name space.
An NFS version 4 client uses LOOKUP and READDIR operations to browse seamlessly from one export to another. Portions of the server name space that are not exported are bridged via a "pseudo filesystem" that provides a view of exported directories only. A pseudo filesystem has a unique fsid and behaves like a normal, read only filesystem.

Based on the construction of the server's name space, it is possible that multiple pseudo filesystems may exist. For example,

   /a         pseudo filesystem
   /a/b       real filesystem
   /a/b/c     pseudo filesystem
   /a/b/c/d   real filesystem

Each of the pseudo filesystems is considered a separate entity and therefore will have a unique fsid.

7.4. Multiple Roots

The DOS and Windows operating environments are sometimes described as having "multiple roots". Filesystems are commonly represented as disk letters. MacOS represents filesystems as top level names. NFS version 4 servers for these platforms can construct a pseudo file system above these root names so that disk letters or volume names are simply directory names in the pseudo root.

7.5. Filehandle Volatility

The nature of the server's pseudo filesystem is that it is a logical representation of filesystem(s) available from the server. Therefore, the pseudo filesystem is most likely constructed dynamically when the server is first instantiated. It is expected that the pseudo filesystem may not have an on disk counterpart from which persistent filehandles could be constructed. Even though it is preferable that the server provide persistent filehandles for the pseudo filesystem, the NFS client should expect that pseudo file system filehandles are volatile. This can be confirmed by checking the associated "fh_expire_type" attribute for those filehandles in question.
If the filehandles are volatile, the NFS client must be prepared to recover a filehandle value (e.g. with a multi-component LOOKUP) when receiving an error of NFS4ERR_FHEXPIRED.

7.6. Exported Root

If the server's root filesystem is exported, one might conclude that a pseudo-filesystem is not needed. This would be wrong. Assume the following filesystems on a server:

   /          disk1  (exported)
   /a         disk2  (not exported)
   /a/b       disk3  (exported)

Because disk2 is not exported, disk3 cannot be reached with simple LOOKUPs. The server must bridge the gap with a pseudo-filesystem.

7.7. Mount Point Crossing

The server filesystem environment may be constructed in such a way that one filesystem contains a directory which is 'covered' or mounted upon by a second filesystem. For example:

   /a/b            (filesystem 1)
   /a/b/c/d        (filesystem 2)

The pseudo filesystem for this server may be constructed to look like:

   /               (place holder/not exported)
   /a/b            (filesystem 1)
   /a/b/c/d        (filesystem 2)

It is the server's responsibility to present a complete pseudo filesystem to the client. If the client sends a lookup request for the path "/a/b/c/d", the server's response is the filehandle of the filesystem "/a/b/c/d". In previous versions of the NFS protocol, the server would respond with the filehandle of directory "/a/b/c/d" within the filesystem "/a/b".

The NFS client will be able to determine if it crosses a server mount point by a change in the value of the "fsid" attribute.

7.8. Security Policy and Name Space Presentation

The application of the server's security policy needs to be carefully considered by the implementor.
One may choose to limit the viewability of portions of the pseudo filesystem based on the server's perception of the client's ability to authenticate itself properly. However, with the support of multiple security mechanisms and the ability to negotiate the appropriate use of these mechanisms, the server is unable to properly determine if a client will be able to authenticate itself. If, based on its policies, the server chooses to limit the contents of the pseudo filesystem, the server may effectively hide filesystems from a client that may otherwise have legitimate access.

As suggested practice, the server should apply the security policy of a shared resource in the server's namespace to the components of the resource's ancestors. For example:

   /
   /a/b
   /a/b/c

The /a/b/c directory is a real filesystem and is the shared resource. The security policy for /a/b/c is Kerberos with integrity. The server should apply the same security policy to /, /a, and /a/b. This allows for the extension of the protection of the server's namespace to the ancestors of the real shared resource.

For the case of the use of multiple, disjoint security mechanisms in the server's resources, the security for a particular object in the server's namespace should be the union of all security mechanisms of all direct descendants.

8. File Locking and Share Reservations

Integrating locking into the NFS protocol necessarily causes it to be stateful. With the inclusion of share reservations the protocol becomes substantially more dependent on state than the traditional combination of NFS and NLM [XNFS].
There are three components to making this state manageable:

o  Clear division between client and server

o  Ability to reliably detect inconsistency in state between client and server

o  Simple and robust recovery mechanisms

In this model, the server owns the state information. The client communicates its view of this state to the server as needed. The client is also able to detect inconsistent state before modifying a file.

To support Win32 share reservations it is necessary to atomically OPEN or CREATE files. Having a separate share/unshare operation would not allow correct implementation of the Win32 OpenFile API. In order to correctly implement share semantics, the previous NFS protocol mechanisms used when a file is opened or created (LOOKUP, CREATE, ACCESS) need to be replaced. The NFS version 4 protocol has an OPEN operation that subsumes the NFS version 3 methodology of LOOKUP, CREATE, and ACCESS. However, because many operations require a filehandle, the traditional LOOKUP is preserved to map a file name to filehandle without establishing state on the server. The policy of granting access or modifying files is managed by the server based on the client's state. These mechanisms can implement policy ranging from advisory only locking to full mandatory locking.

8.1. Locking

It is assumed that manipulating a lock is rare when compared to READ and WRITE operations. It is also assumed that crashes and network partitions are relatively rare. Therefore it is important that the READ and WRITE operations have a lightweight mechanism to indicate if they possess a held lock. A lock request contains the heavyweight information required to establish a lock and uniquely define the lock owner.

The following sections describe the transition from the heavy weight information to the eventual stateid used for most client and server locking and lease interactions.

8.1.1. Client ID

For each LOCK request, the client must identify itself to the server. This is done in such a way as to allow for correct lock identification and crash recovery. A sequence of a SETCLIENTID operation followed by a SETCLIENTID_CONFIRM operation is required to establish the identification onto the server. Establishment of identification by a new incarnation of the client also has the effect of immediately breaking any leased state that a previous incarnation of the client might have had on the server, as opposed to forcing the new client incarnation to wait for the leases to expire. Breaking the lease state amounts to the server removing all lock, share reservation, and, where the server is not supporting the CLAIM_DELEGATE_PREV claim type, all delegation state associated with the same client with the same identity. For discussion of delegation state recovery, see the section "Delegation Recovery".

Client identification is encapsulated in the following structure:

   struct nfs_client_id4 {
           verifier4       verifier;
           opaque          id;
   };

The first field, verifier, is a client incarnation verifier that is used to detect client reboots. Only if the verifier is different from that which the server has previously recorded for the client (as identified by the second field of the structure, id) does the server start the process of canceling the client's leased state.

The second field, id, is a variable length string that uniquely defines the client.
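
The verifier comparison described above can be sketched as follows. This is an illustrative Python sketch only, not part of the protocol definition: the names ClientRecord and handle_setclientid and the in-memory dictionary are hypothetical, and the sketch leaves out the SETCLIENTID_CONFIRM step and any check of the requesting principal.

```python
from dataclasses import dataclass, field


@dataclass
class ClientRecord:
    """Hypothetical server-side record kept per client id string."""
    verifier: bytes                      # verifier from the last SETCLIENTID
    leased_state: list = field(default_factory=list)  # locks, opens, ...


def handle_setclientid(table, client_id, verifier):
    """Record the client and report whether its old leased state was dropped.

    `table` maps the opaque id string to a ClientRecord.
    """
    record = table.get(client_id)
    if record is None:
        # First time this id string is seen: nothing to cancel.
        table[client_id] = ClientRecord(verifier)
        return False
    if record.verifier == verifier:
        # Same incarnation of the client: leased state is kept.
        return False
    # Different verifier for the same id: the client has a new
    # incarnation, so state left by the previous one is cancelled.
    record.verifier = verifier
    record.leased_state.clear()
    return True
```

In this sketch a repeated SETCLIENTID with an unchanged verifier leaves existing state alone, which matches the statement that only a differing verifier starts cancellation.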

There are several considerations for how the client generates the id string:

o  The string should be unique so that multiple clients do not present the same string. The consequences of two clients presenting the same string range from one client getting an error to one client having its leased state abruptly and unexpectedly canceled.

o  The string should be selected so the subsequent incarnations (e.g. reboots) of the same client cause the client to present the same string. The implementor is cautioned against an approach that requires the string to be recorded in a local file because this precludes the use of the implementation in an environment where there is no local disk and all file access is from an NFS version 4 server.

o  The string should be different for each server network address that the client accesses, rather than common to all server network addresses. The reason is that it may not be possible for the client to tell if the same server is listening on multiple network addresses. If the client issues SETCLIENTID with the same id string to each network address of such a server, the server will think it is the same client, and each successive SETCLIENTID will cause the server to begin the process of removing the client's previous leased state.

o  The algorithm for generating the string should not assume that the client's network address won't change. This includes changes between client incarnations and even changes while the client is still running in its current incarnation.
   This means that if the client includes just the client's and server's network address in the id string, there is a real risk, after the client gives up the network address, that another client, using a similar algorithm for generating the id string, will generate a conflicting id string.

Given the above considerations, an example of a well generated id string is one that includes:

o  The server's network address.

o  The client's network address.

o  For a user level NFS version 4 client, it should contain additional information to distinguish the client from other user level clients running on the same host, such as a process id or other unique sequence.

o  Additional information that tends to be unique, such as one or more of:

   -  The client machine's serial number (for privacy reasons, it is best to perform some one way function on the serial number).

   -  A MAC address.

   -  The timestamp of when the NFS version 4 software was first installed on the client (though this is subject to the previously mentioned caution about using information that is stored in a file, because the file might only be accessible over NFS version 4).

   -  A true random number. However since this number ought to be the same between client incarnations, this shares the same problem as that of using the timestamp of the software installation.

As a security measure, the server MUST NOT cancel a client's leased state if the principal that established the state for a given id string is not the same as the principal issuing the SETCLIENTID.

Note that SETCLIENTID and SETCLIENTID_CONFIRM have a secondary purpose of establishing the information the server needs to make callbacks to the client for the purpose of supporting delegations.
It is permitted to change this information via SETCLIENTID and SETCLIENTID_CONFIRM within the same incarnation of the client without removing the client's leased state.

Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has successfully completed, the client uses the short hand client identifier, of type clientid4, instead of the longer and less compact nfs_client_id4 structure. This short hand client identifier (a clientid) is assigned by the server and should be chosen so that it will not conflict with a clientid previously assigned by the server. This applies across server restarts or reboots. When a clientid is presented to a server and that clientid is not recognized, as would happen after a server reboot, the server will reject the request with the error NFS4ERR_STALE_CLIENTID. When this happens, the client must obtain a new clientid by use of the SETCLIENTID operation and then proceed to any other necessary recovery for the server reboot case (See the section "Server Failure and Recovery").

The client must also employ the SETCLIENTID operation when it receives an NFS4ERR_STALE_STATEID error using a stateid derived from its current clientid, since this also indicates a server reboot which has invalidated the existing clientid (see the next section "lock_owner and stateid Definition" for details).

See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM for a complete specification of the operations.

8.1.2. Server Release of Clientid

If the server determines that the client holds no associated state for its clientid, the server may choose to release the clientid. The server may make this choice for an inactive client so that resources are not consumed by those intermittently active clients.
If the client contacts the server after this release, the server must ensure the client receives the appropriate error so that it will use the SETCLIENTID/SETCLIENTID_CONFIRM sequence to establish a new identity. It should be clear that the server must be very hesitant to release a clientid since the resulting work on the client to recover from such an event will be the same burden as if the server had failed and restarted. Typically a server would not release a clientid unless there had been no activity from that client for many minutes.

Note that if the id string in a SETCLIENTID request is properly constructed, and if the client takes care to use the same principal for each successive use of SETCLIENTID, then, barring an active denial of service attack, NFS4ERR_CLID_INUSE should never be returned.

However, client bugs, server bugs, or perhaps a deliberate change of the principal owner of the id string (such as the case of a client that changes security flavors, and under the new flavor, there is no mapping to the previous owner) will in rare cases result in NFS4ERR_CLID_INUSE.

In that event, when the server gets a SETCLIENTID for a client id that currently has no state, or it has state, but the lease has expired, rather than returning NFS4ERR_CLID_INUSE, the server MUST allow the SETCLIENTID, and confirm the new clientid if followed by the appropriate SETCLIENTID_CONFIRM.

8.1.3. lock_owner and stateid Definition

When requesting a lock, the client must present to the server the clientid and an identifier for the owner of the requested lock. These two fields are referred to as the lock_owner and the definitions of those fields are:

o  A clientid returned by the server as part of the client's use of the SETCLIENTID operation.

o  A variable length opaque array used to uniquely define the owner of a lock managed by the client.

   This may be a thread id, process id, or other unique value.

When the server grants the lock, it responds with a unique stateid. The stateid is used as a shorthand reference to the lock_owner, since the server will be maintaining the correspondence between them.

The server is free to form the stateid in any manner that it chooses as long as it is able to recognize invalid and out-of-date stateids. This requirement includes those stateids generated by earlier instances of the server. From this, the client can be properly notified of a server restart. This notification will occur when the client presents a stateid to the server from a previous instantiation.

The server must be able to distinguish the following situations and return the error as specified:

o  The stateid was generated by an earlier server instance (i.e. before a server reboot). The error NFS4ERR_STALE_STATEID should be returned.

o  The stateid was generated by the current server instance but the stateid no longer designates the current locking state for the lockowner-file pair in question (i.e. one or more locking operations have occurred). The error NFS4ERR_OLD_STATEID should be returned.

   This error condition will only occur when the client issues a locking request which changes a stateid while an I/O request that uses that stateid is outstanding.

o  The stateid was generated by the current server instance but the stateid does not designate a locking state for any active lockowner-file pair. The error NFS4ERR_BAD_STATEID should be returned.

   This error condition will occur when there has been a logic error on the part of the client or server. This should not happen.
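
The three situations above can be told apart with a small amount of bookkeeping. The Python sketch below is illustrative only. It assumes, purely for the example, that each stateid carries a value identifying the server instance, a slot number, and a sequence counter; the names Stateid and classify_stateid, the field layout, and the treatment of a sequence counter that is ahead of the server's as NFS4ERR_BAD_STATEID are assumptions of the sketch. Sequence counter wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stateid:
    """Hypothetical decoded view of a stateid."""
    boot_verifier: int   # identifies the server instance that issued it
    index: int           # slot in the server's locking-state table
    seqid: int           # bumped on each locking operation for the slot


def classify_stateid(sid, current_boot, table):
    """Return the status a server would give for `sid`.

    `table` maps a slot index to the current seqid of an active
    lockowner-file pair.
    """
    if sid.boot_verifier != current_boot:
        # Issued before the most recent server restart.
        return "NFS4ERR_STALE_STATEID"
    current = table.get(sid.index)
    if current is None:
        # No active lockowner-file pair is designated.
        return "NFS4ERR_BAD_STATEID"
    if sid.seqid < current:
        # Locking operations have happened since this stateid was issued.
        return "NFS4ERR_OLD_STATEID"
    if sid.seqid > current:
        # The server never issued this value: a logic error somewhere.
        return "NFS4ERR_BAD_STATEID"
    return "NFS4_OK"
```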

One mechanism that may be used to satisfy these requirements is for the server to:

o  divide the "other" field of each stateid into two fields:

   -  A server verifier which uniquely designates a particular server instantiation.

   -  An index into a table of locking-state structures.

o  utilize the "seqid" field of each stateid, such that seqid is monotonically incremented for each stateid that is associated with the same index into the locking-state table.

By matching the incoming stateid and its field values with the state held at the server, the server is able to easily determine if a stateid is valid for its current instantiation and state. If the stateid is not valid, the appropriate error can be supplied to the client.

8.1.4. Use of the stateid and Locking

All READ, WRITE and SETATTR operations contain a stateid. For the purposes of this section, SETATTR operations which change the size attribute of a file are treated as if they are writing the area between the old and new size (i.e. the range truncated or added to the file by means of the SETATTR), even where SETATTR is not explicitly mentioned in the text.

If the lock_owner performs a READ or WRITE in a situation in which it has established a lock or share reservation on the server (any OPEN constitutes a share reservation) the stateid (previously returned by the server) must be used to indicate what locks, including both record locks and share reservations, are held by the lockowner. If no state is established by the client, either record lock or share reservation, a stateid of all bits 0 is used.
Regardless of whether a stateid of all bits 0, or a stateid returned by the server is used, if there is a conflicting share reservation or mandatory record lock held on the file, the server MUST refuse to service the READ or WRITE operation.

Share reservations are established by OPEN operations and by their nature are mandatory in that when the OPEN denies READ or WRITE operations, that denial results in such operations being rejected with error NFS4ERR_LOCKED. Record locks may be implemented by the server as either mandatory or advisory, or the choice of mandatory or advisory behavior may be determined by the server on the basis of the file being accessed (for example, some UNIX-based servers support a "mandatory lock bit" on the mode attribute such that if set, record locks are required on the file before I/O is possible). When record locks are advisory, they only prevent the granting of conflicting lock requests and have no effect on READs or WRITEs. Mandatory record locks, however, prevent conflicting I/O operations. When they are attempted, they are rejected with NFS4ERR_LOCKED. When the client gets NFS4ERR_LOCKED on a file it knows it has the proper share reservation for, it will need to issue a LOCK request on the region of the file that includes the region the I/O was to be performed on, with an appropriate locktype (i.e. READ*_LT for a READ operation, WRITE*_LT for a WRITE operation).

With NFS version 3, there was no notion of a stateid so there was no way to tell if the application process of the client sending the READ or WRITE operation had also acquired the appropriate record lock on the file. Thus there was no way to implement mandatory locking. With the stateid construct, this barrier has been removed.
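
The difference between share reservations, mandatory record locks, and advisory record locks at I/O time can be sketched as below. This is a simplified Python illustration under stated assumptions: RecordLock, check_io, the string owner names, and the half-open byte ranges are all hypothetical, and a real server would also consult the stateid presented with the request.

```python
from dataclasses import dataclass


@dataclass
class RecordLock:
    """Hypothetical record lock held on a byte range [start, end)."""
    owner: str
    start: int
    end: int
    exclusive: bool  # True for a write lock, False for a read lock


def check_io(op, owner, start, end, denies, locks, mandatory):
    """Decide whether a READ or WRITE on [start, end) may proceed.

    `denies` maps each other open owner to the set of operations
    ("READ", "WRITE") its share reservation denies.
    """
    # Share reservations are always enforced.
    for other, denied in denies.items():
        if other != owner and op in denied:
            return "NFS4ERR_LOCKED"
    # Record locks only block I/O when the server treats them as mandatory.
    if mandatory:
        for lock in locks:
            overlaps = lock.start < end and start < lock.end
            conflicts = lock.exclusive or op == "WRITE"
            if lock.owner != owner and overlaps and conflicts:
                return "NFS4ERR_LOCKED"
    return "NFS4_OK"
```

With advisory locking (mandatory set to False) the record lock list is never consulted for I/O, which is the behavior the text describes: such locks only affect other lock requests.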

Note that for UNIX environments that support mandatory file locking, the distinction between advisory and mandatory locking is subtle. In fact, advisory and mandatory record locks are exactly the same insofar as the APIs and requirements on implementation are concerned. If the mandatory lock attribute is set on the file, the server checks to see if the lockowner has an appropriate shared (read) or exclusive (write) record lock on the region it wishes to read or write to. If there is no appropriate lock, the server checks if there is a conflicting lock (which can be done by attempting to acquire the conflicting lock on the behalf of the lockowner, and if successful, release the lock after the READ or WRITE is done), and if there is, the server returns NFS4ERR_LOCKED.

For Windows environments, there are no advisory record locks, so the server always checks for record locks during I/O requests.

Thus, the NFS version 4 LOCK operation does not need to distinguish between advisory and mandatory record locks. It is the NFS version 4 server's processing of the READ and WRITE operations that introduces the distinction.

Every stateid other than the special stateid values noted in this section, whether returned by an OPEN-type operation (i.e. OPEN, OPEN_DOWNGRADE), or by a LOCK-type operation (i.e. LOCK or LOCKU), defines an access mode for the file (i.e. READ, WRITE, or READ-WRITE) as established by the original OPEN which began the stateid sequence, and as modified by subsequent OPENs and OPEN_DOWNGRADEs within that stateid sequence. When a READ, WRITE, or SETATTR which specifies the size attribute, is done, the operation is subject to checking against the access mode to verify that the operation is appropriate given the OPEN with which the operation is associated.
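
How an access mode follows a stateid sequence can be sketched as below. This Python fragment is illustrative only: the class OpenState, its method names, and the flag values are hypothetical, and the rule that a downgrade may only keep a subset of the current access bits is an assumption of the sketch.

```python
ACCESS_READ = 1    # hypothetical flag values used only in this sketch
ACCESS_WRITE = 2


class OpenState:
    """Tracks the access mode tied to one stateid sequence."""

    def __init__(self, access):
        # Access mode established by the OPEN that began the sequence.
        self.access = access

    def open_again(self, access):
        # A later OPEN in the same sequence can add access bits.
        self.access |= access

    def downgrade(self, access):
        # OPEN_DOWNGRADE keeps a subset of the current bits (assumed rule).
        if access & ~self.access:
            raise ValueError("downgrade cannot add access")
        self.access = access

    def permits(self, op):
        # WRITE and a size-changing SETATTR both need write access.
        needed = ACCESS_READ if op == "READ" else ACCESS_WRITE
        return bool(self.access & needed)
```

The sketch applies the strict check to READ as well as to WRITE-type operations.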

In the case of WRITE-type operations (i.e. WRITEs and SETATTRs which set size), the server must verify that the access mode allows writing and return an NFS4ERR_OPENMODE error if it does not. In the case of READ, the server may perform the corresponding check on the access mode, or it may choose to allow READ on opens for WRITE only, to accommodate clients whose write implementation may unavoidably do reads (e.g. due to buffer cache constraints). However, even if READs are allowed in these circumstances, the server MUST still check for locks that conflict with the READ (e.g. another open specifying denial of READs). Note that a server which does enforce the access mode check on READs need not explicitly check for conflicting share reservations since the existence of OPEN for read access guarantees that no conflicting share reservation can exist.

A stateid of all bits 1 (one) MAY allow READ operations to bypass locking checks at the server. However, WRITE operations with a stateid with bits all 1 (one) MUST NOT bypass locking checks and are treated exactly the same as if a stateid of all bits 0 were used.

A lock may not be granted while a READ or WRITE operation using one of the special stateids is being performed and the range of the lock request conflicts with the range of the READ or WRITE operation. For the purposes of this paragraph, a conflict occurs when a shared lock is requested and a WRITE operation is being performed, or an exclusive lock is requested and either a READ or a WRITE operation is being performed. A SETATTR that sets size is treated similarly to a WRITE as discussed above.

8.1.5. Sequencing of Lock Requests

Locking is different than most NFS operations as it requires "at-most-one" semantics that are not provided by ONCRPC.
ONCRPC over a reliable transport is not sufficient because a sequence of locking requests may span multiple TCP connections. In the face of retransmission or reordering, lock or unlock requests must have a well defined and consistent behavior. To accomplish this, each lock request contains a sequence number that is a consecutively increasing integer. Different lock_owners have different sequences. The server maintains the last sequence number (L) received and the response that was returned. The first request issued for any given lock_owner is issued with a sequence number of zero.

Note that for requests that contain a sequence number, for each lock_owner, there should be no more than one outstanding request.

If a request (r) with a previous sequence number (r < L) is received, it is rejected with the return of error NFS4ERR_BAD_SEQID. Given a properly-functioning client, the response to (r) must have been received before the last request (L) was sent. If a duplicate of the last request (r == L) is received, the stored response is returned. If a request beyond the next sequence (r == L + 2) is received, it is rejected with the return of error NFS4ERR_BAD_SEQID. Sequence history is reinitialized whenever the SETCLIENTID/SETCLIENTID_CONFIRM sequence changes the client verifier.

Since the sequence number is represented with an unsigned 32-bit integer, the arithmetic involved with the sequence number is mod 2^32.

It is critical that the server maintain the last response sent to the client to provide a more reliable cache of duplicate non-idempotent requests than that of the traditional cache described in [Juszczak]. The traditional duplicate request cache uses a least recently used algorithm for removing unneeded requests.
   However, the last lock request and response on a given lock_owner
   must be cached as long as the lock state exists on the server.

   The client MUST monotonically increment the sequence number for the
   CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE
   operations. This is true even in the event that the previous
   operation that used the sequence number received an error. The only
   exception to this rule is if the previous operation received one of
   the following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID,
   NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID, NFS4ERR_BADXDR,
   NFS4ERR_RESOURCE, NFS4ERR_NOFILEHANDLE.

8.1.6. Recovery from Replayed Requests

   As described above, the sequence number is per lock_owner. As long
   as the server maintains the last sequence number received and follows
   the methods described above, there are no risks of a Byzantine router
   re-sending old requests. The server need only maintain the
   (lock_owner, sequence number) state as long as there are open files
   or closed files with locks outstanding.

   LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain a sequence
   number and therefore the risk of the replay of these operations
   resulting in undesired effects is non-existent while the server
   maintains the lock_owner state.

8.1.7. Releasing lock_owner State

   When a particular lock_owner no longer holds open or file locking
   state at the server, the server may choose to release the sequence
   number state associated with the lock_owner. The server may make
   this choice based on lease expiration, for the reclamation of server
   memory, or other implementation specific details. In any event, the
   server is able to do this safely only when the lock_owner no longer
   is being utilized by the client.
   The server may choose to hold the lock_owner state in the event that
   retransmitted requests are received. However, the period to hold
   this state is implementation specific.

   In the case that a LOCK, LOCKU, OPEN_DOWNGRADE, or CLOSE is
   retransmitted after the server has previously released the lock_owner
   state, the server will find that the lock_owner has no files open and
   an error will be returned to the client. If the lock_owner does have
   a file open, the stateid will not match and again an error is
   returned to the client.

8.1.8. Use of Open Confirmation

   In the case that an OPEN is retransmitted and the lock_owner is being
   used for the first time or the lock_owner state has been previously
   released by the server, the use of the OPEN_CONFIRM operation will
   prevent incorrect behavior. When the server observes the use of the
   lock_owner for the first time, it will direct the client to perform
   the OPEN_CONFIRM for the corresponding OPEN. This sequence
   establishes the use of a lock_owner and associated sequence number.
   Since the OPEN_CONFIRM sequence connects a new open_owner on the
   server with an existing open_owner on a client, the sequence number
   may have any value. The OPEN_CONFIRM step assures the server that
   the value received is the correct one. See the section "OPEN_CONFIRM
   - Confirm Open" for further details.

   There are a number of situations in which the requirement to confirm
   an OPEN would pose difficulties for the client and server, in that
   they would be prevented from acting in a timely fashion on
   information received, because that information would be provisional,
   subject to deletion upon non-confirmation. Fortunately, these are
   situations in which the server can avoid the need for confirmation
   when responding to open requests.
   The two constraints are:

   o  The server must not bestow a delegation for any open which would
      require confirmation.

   o  The server MUST NOT require confirmation on a reclaim-type open
      (i.e. one specifying claim type CLAIM_PREVIOUS or
      CLAIM_DELEGATE_PREV).

   These constraints are related in that reclaim-type opens are the only
   ones in which the server may be required to send a delegation. For
   CLAIM_NULL, sending the delegation is optional while for
   CLAIM_DELEGATE_CUR, no delegation is sent.

   Delegations being sent with an open requiring confirmation are
   troublesome because recovering from non-confirmation adds undue
   complexity to the protocol while requiring confirmation on
   reclaim-type opens poses difficulties in that the inability to
   resolve the status of the reclaim until lease expiration may make it
   difficult to have timely determination of the set of locks being
   reclaimed (since the grace period may expire).

   Requiring open confirmation on reclaim-type opens is avoidable
   because of the nature of the environments in which such opens are
   done. For CLAIM_PREVIOUS opens, this is immediately after server
   reboot, so there should be no time for lockowners to be created,
   found to be unused, and recycled. For CLAIM_DELEGATE_PREV opens, we
   are dealing with a client reboot situation. A server which supports
   delegation can be sure that no lockowners for that client have been
   recycled since client initialization and thus can ensure that
   confirmation will not be required.

8.2. Lock Ranges

   The protocol allows a lock owner to request a lock with a byte range
   and then either upgrade or unlock a sub-range of the initial lock.
   It is expected that this will be an uncommon type of request.
   In any case, servers or server filesystems may not be able to support
   sub-range lock semantics. In the event that a server receives a
   locking request that represents a sub-range of current locking state
   for the lock owner, the server is allowed to return the error
   NFS4ERR_LOCK_RANGE to signify that it does not support sub-range lock
   operations. Therefore, the client should be prepared to receive this
   error and, if appropriate, report the error to the requesting
   application.

   The client is discouraged from combining multiple independent locking
   ranges that happen to be adjacent into a single request since the
   server may not support sub-range requests and for reasons related to
   the recovery of file locking state in the event of server failure.
   As discussed in the section "Server Failure and Recovery" below, the
   server may employ certain optimizations during recovery that work
   effectively only when the client's behavior during lock recovery is
   similar to the client's locking behavior prior to server failure.

8.3. Upgrading and Downgrading Locks

   If a client has a write lock on a record, it can request an atomic
   downgrade of the lock to a read lock via the LOCK request, by setting
   the type to READ_LT. If the server supports atomic downgrade, the
   request will succeed. If not, it will return NFS4ERR_LOCK_NOTSUPP.
   The client should be prepared to receive this error, and if
   appropriate, report the error to the requesting application.

   If a client has a read lock on a record, it can request an atomic
   upgrade of the lock to a write lock via the LOCK request by setting
   the type to WRITE_LT or WRITEW_LT. If the server does not support
   atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP. If the upgrade
   can be achieved without an existing conflict, the request will
   succeed.
   Otherwise, the server will return either NFS4ERR_DENIED or
   NFS4ERR_DEADLOCK. The error NFS4ERR_DEADLOCK is returned if the
   client issued the LOCK request with the type set to WRITEW_LT and the
   server has detected a deadlock. The client should be prepared to
   receive such errors and, if appropriate, report the error to the
   requesting application.

8.4. Blocking Locks

   Some clients require the support of blocking locks. The NFS version
   4 protocol must not rely on a callback mechanism and therefore is
   unable to notify a client when a previously denied lock has been
   granted. Clients have no choice but to continually poll for the
   lock. This presents a fairness problem. Two new lock types are
   added, READW and WRITEW, and are used to indicate to the server that
   the client is requesting a blocking lock. The server should maintain
   an ordered list of pending blocking locks. When the conflicting lock
   is released, the server may wait the lease period for the first
   waiting client to re-request the lock. After the lease period
   expires the next waiting client request is allowed the lock. Clients
   are required to poll at an interval sufficiently small that they are
   likely to acquire the lock in a timely manner. The server is not
   required to maintain a list of pending blocked locks as it is used to
   increase fairness and not correct operation. Because of the
   unordered nature of crash recovery, storing of lock state to stable
   storage would be required to guarantee ordered granting of blocking
   locks.

   Servers may also note the lock types and delay returning denial of
   the request to allow extra time for a conflicting lock to be
   released, allowing a successful return. In this way, clients can
   avoid the burden of needlessly frequent polling for blocking locks.
   The server should take care in the length of delay in the event the
   client retransmits the request.

8.5. Lease Renewal

   The purpose of a lease is to allow a server to remove stale locks
   that are held by a client that has crashed or is otherwise
   unreachable. It is not a mechanism for cache consistency and lease
   renewals may not be denied if the lease interval has not expired.

   The following events cause implicit renewal of all of the leases for
   a given client (i.e. all those sharing a given clientid). Each of
   these is a positive indication that the client is still active and
   that the associated state held at the server, for the client, is
   still valid.

   o  An OPEN with a valid clientid.

   o  Any operation made with a valid stateid (CLOSE, DELEGPURGE,
      DELEGRETURN, LOCK, LOCKU, OPEN, OPEN_CONFIRM, OPEN_DOWNGRADE,
      READ, RENEW, SETATTR, WRITE). This does not include the special
      stateids of all bits 0 or all bits 1.

         Note that if the client had restarted or rebooted, the
         client would not be making these requests without issuing
         the SETCLIENTID/SETCLIENTID_CONFIRM sequence. The use of
         the SETCLIENTID/SETCLIENTID_CONFIRM sequence (one that
         changes the client verifier) notifies the server to drop
         the locking state associated with the client.
         SETCLIENTID/SETCLIENTID_CONFIRM never renews a lease.

         If the server has rebooted, the stateids
         (NFS4ERR_STALE_STATEID error) or the clientid
         (NFS4ERR_STALE_CLIENTID error) will not be valid hence
         preventing spurious renewals.

   This approach allows for low overhead lease renewal which scales
   well. In the typical case no extra RPC calls are required for lease
   renewal and in the worst case one RPC is required every lease period
   (i.e. a RENEW operation).
   The number of locks held by the client is not a factor since all
   state for the client is involved with the lease renewal action.

   Since all operations that create a new lease also renew existing
   leases, the server must maintain a common lease expiration time for
   all valid leases for a given client. This lease time can then be
   easily updated upon implicit lease renewal actions.

8.6. Crash Recovery

   The important requirement in crash recovery is that both the client
   and the server know when the other has failed. Additionally, it is
   required that a client sees a consistent view of data across server
   restarts or reboots. All READ and WRITE operations that may have
   been queued within the client or network buffers must wait until the
   client has successfully recovered the locks protecting the READ and
   WRITE operations.

8.6.1. Client Failure and Recovery

   In the event that a client fails, the server may recover the client's
   locks when the associated leases have expired. Conflicting locks
   from another client may only be granted after this lease expiration.
   If the client is able to restart or reinitialize within the lease
   period the client may be forced to wait the remainder of the lease
   period before obtaining new locks.

   To minimize client delay upon restart, lock requests are associated
   with an instance of the client by a client supplied verifier. This
   verifier is part of the initial SETCLIENTID call made by the client.
   The server returns a clientid as a result of the SETCLIENTID
   operation. The client then confirms the use of the clientid with
   SETCLIENTID_CONFIRM. The clientid in combination with an opaque
   owner field is then used by the client to identify the lock owner for
   OPEN. This chain of associations is then used to identify all locks
   for a particular client.
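
   The chain of associations (verifier, clientid, lock owner, locks)
   might be kept on a server roughly as in the following sketch. It is
   illustrative only and is not part of the protocol definition. The
   class ToyServer, its methods, and the string used to describe a lock
   are hypothetical names; the sketch also collapses SETCLIENTID and
   SETCLIENTID_CONFIRM into a single step for brevity.

```python
import itertools


class ToyServer:
    """Hypothetical in-memory model of the verifier/clientid/lock chain."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.clients = {}  # client id string -> (verifier, clientid)
        self.locks = {}    # (clientid, opaque owner) -> list of lock labels

    def setclientid(self, id_string, verifier):
        """Return a clientid; a changed verifier means a new client instance."""
        known = self.clients.get(id_string)
        if known is not None:
            old_verifier, old_clientid = known
            if old_verifier == verifier:
                return old_clientid          # same instance, same clientid
            self.release_client(old_clientid)  # old instance's locks are stale
        clientid = next(self._ids)
        self.clients[id_string] = (verifier, clientid)
        return clientid

    def add_lock(self, clientid, owner, label):
        """Record a lock under the lock owner (clientid, opaque owner)."""
        self.locks.setdefault((clientid, owner), []).append(label)

    def locks_for_client(self, clientid):
        """Follow the chain to find every lock held by one client."""
        found = []
        for (cid, _owner), labels in self.locks.items():
            if cid == clientid:
                found.extend(labels)
        return found

    def release_client(self, clientid):
        """Drop all lock state derived from the given clientid."""
        for key in [k for k in self.locks if k[0] == clientid]:
            del self.locks[key]
```

   In this sketch a repeated call with the same verifier returns the
   same clientid, while a call with a new verifier releases the state
   tied to the old clientid before a new one is issued.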

   Since the verifier will be changed by the client upon each
   initialization, the server can compare a new verifier to the verifier
   associated with currently held locks and determine that they do not
   match. This signifies the client's new instantiation and subsequent
   loss of locking state. As a result, the server is free to release
   all locks held which are associated with the old clientid which was
   derived from the old verifier.

   Note that the verifier must have the same uniqueness properties as
   the verifier for the COMMIT operation.

8.6.2. Server Failure and Recovery

   If the server loses locking state (usually as a result of a restart
   or reboot), it must allow clients time to discover this fact and
   re-establish the lost locking state. The client must be able to
   re-establish the locking state without having the server deny valid
   requests because the server has granted conflicting access to another
   client. Likewise, if there is the possibility that clients have not
   yet re-established their locking state for a file, the server must
   disallow READ and WRITE operations for that file. The duration of
   this recovery period is equal to the duration of the lease period.

   A client can determine that server failure (and thus loss of locking
   state) has occurred, when it receives one of two errors. The
   NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a
   reboot or restart. The NFS4ERR_STALE_CLIENTID error indicates a
   clientid invalidated by reboot or restart. When either of these is
   received, the client must establish a new clientid (See the section
   "Client ID") and re-establish the locking state as discussed below.

   The period of special handling of locking and READs and WRITEs, equal
   in duration to the lease period, is referred to as the "grace
   period".
   During the grace period, clients recover locks and the associated
   state by reclaim-type locking requests (i.e. LOCK requests with
   reclaim set to true and OPEN operations with a claim type of
   CLAIM_PREVIOUS). During the grace period, the server must reject
   READ and WRITE operations and non-reclaim locking requests (i.e.
   other LOCK and OPEN operations) with an error of NFS4ERR_GRACE.

   If the server can reliably determine that granting a non-reclaim
   request will not conflict with reclamation of locks by other clients,
   the NFS4ERR_GRACE error does not have to be returned and the
   non-reclaim client request can be serviced. For the server to be
   able to service READ and WRITE operations during the grace period, it
   must again be able to guarantee that no possible conflict could arise
   between an impending reclaim locking request and the READ or WRITE
   operation. If the server is unable to offer that guarantee, the
   NFS4ERR_GRACE error must be returned to the client.

   For a server to provide simple, valid handling during the grace
   period, the easiest method is to simply reject all non-reclaim
   locking requests and READ and WRITE operations by returning the
   NFS4ERR_GRACE error. However, a server may keep information about
   granted locks in stable storage. With this information, the server
   could determine if a regular lock or READ or WRITE operation can be
   safely processed.

   For example, if a count of locks on a given file is available in
   stable storage, the server can track reclaimed locks for the file and
   when all reclaims have been processed, non-reclaim locking requests
   may be processed. This way the server can ensure that non-reclaim
   locking requests will not conflict with potential reclaim requests.
   With respect to I/O requests, if the server is able to determine that
   there are no outstanding reclaim requests for a file by information
   from stable storage or another similar mechanism, the processing of
   I/O requests could proceed normally for the file.

   To reiterate, for a server that allows non-reclaim lock and I/O
   requests to be processed during the grace period, it MUST determine
   that no lock subsequently reclaimed will be rejected and that no lock
   subsequently reclaimed would have prevented any I/O operation
   processed during the grace period.

   Clients should be prepared for the return of NFS4ERR_GRACE errors for
   non-reclaim lock and I/O requests. In this case the client should
   employ a retry mechanism for the request. A delay (on the order of
   several seconds) between retries should be used to avoid overwhelming
   the server. Further discussion of the general issue is included in
   [Floyd]. The client must account for the server that is able to
   perform I/O and non-reclaim locking requests within the grace period
   as well as those that cannot do so.

   A reclaim-type locking request outside the server's grace period can
   only succeed if the server can guarantee that no conflicting lock or
   I/O request has been granted since reboot or restart.

   A server may, upon restart, establish a new value for the lease
   period. Therefore, clients should, once a new clientid is
   established, refetch the lease_time attribute and use it as the basis
   for lease renewal for the lease associated with that server. However,
   the server must establish, for this restart event, a grace period at
   least as long as the lease period for the previous server
   instantiation. This allows the client state obtained during the
   previous server instance to be reliably re-established.

8.6.3. Network Partitions and Recovery

   If the duration of a network partition is greater than the lease
   period provided by the server, the server will not have received a
   lease renewal from the client. If this occurs, the server may free
   all locks held for the client. As a result, all stateids held by the
   client will become invalid or stale. Once the client is able to
   reach the server after such a network partition, all I/O submitted by
   the client with the now invalid stateids will fail with the server
   returning the error NFS4ERR_EXPIRED. Once this error is received,
   the client will suitably notify the application that held the lock.

   As a courtesy to the client or as an optimization, the server may
   continue to hold locks on behalf of a client for which recent
   communication has extended beyond the lease period. If the server
   receives a lock or I/O request that conflicts with one of these
   courtesy locks, the server must free the courtesy lock and grant the
   new request.

   When a network partition is combined with a server reboot, there are
   edge conditions that place requirements on the server in order to
   avoid silent data corruption following the server reboot. Two of
   these edge conditions are known, and are discussed below.

   The first edge condition has the following scenario:

   1.  Client A acquires a lock.

   2.  Client A and server experience mutual network partition,
       such that client A is unable to renew its lease.

   3.  Client A's lease expires, so server releases lock.

   4.  Client B acquires a lock that would have conflicted with
       that of Client A.

   5.  Client B releases the lock.

   6.  Server reboots.

   7.  Network partition between client A and server heals.

   8.  Client A issues a RENEW operation, and gets back a
       NFS4ERR_STALE_CLIENTID.

   9.  Client A reclaims its lock within the server's grace period.

   Thus, at the final step, the server has erroneously granted client
   A's lock reclaim. If client B modified the object the lock was
   protecting, client A will experience object corruption.

   The second known edge condition follows:

   1.  Client A acquires a lock.

   2.  Server reboots.

   3.  Client A and server experience mutual network partition,
       such that client A is unable to reclaim its lock within the
       grace period.

   4.  Server's reclaim grace period ends. Client A has no locks
       recorded on server.

   5.  Client B acquires a lock that would have conflicted with
       that of Client A.

   6.  Client B releases the lock.

   7.  Server reboots a second time.

   8.  Network partition between client A and server heals.

   9.  Client A issues a RENEW operation, and gets back a
       NFS4ERR_STALE_CLIENTID.

   10. Client A reclaims its lock within the server's grace period.

   As with the first edge condition, the final step of the scenario of
   the second edge condition has the server erroneously granting client
   A's lock reclaim.

   Solving the first and second edge conditions requires that the server
   either assume after it reboots that an edge condition has occurred,
   and thus return NFS4ERR_NO_GRACE for all reclaim attempts, or that
   the server record some information in stable storage. The amount of
   information the server records in stable storage is in inverse
   proportion to how harsh the server wants to be whenever the edge
   conditions occur.
   The server that is completely tolerant of all edge conditions will
   record in stable storage every lock that is acquired, removing the
   lock record from stable storage only when the lock is unlocked by the
   client and the lock's lockowner advances the sequence number such
   that the lock release is not the last stateful event for the
   lockowner's sequence. For the two aforementioned edge conditions, the
   harshest a server can be, and still support a grace period for
   reclaims, requires that the server record in stable storage some
   minimal information. For example, a server implementation could, for
   each client, save in stable storage a record containing:

   o  the client's id string

   o  a boolean that indicates if the client's lease expired or if
      there was administrative intervention (see the section "Server
      Revocation of Locks") to revoke a record lock, share
      reservation, or delegation

   o  a timestamp that is updated the first time after a server
      boot or reboot the client acquires record locking, share
      reservation, or delegation state on the server. The
      timestamp need not be updated on subsequent lock requests
      until the server reboots.

   The server implementation would also record in the stable storage the
   timestamps from the two most recent server reboots.

   Assuming the above record keeping, for the first edge condition,
   after the server reboots, the record that client A's lease expired
   means that another client could have acquired a conflicting record
   lock, share reservation, or delegation. Hence the server must reject
   a reclaim from client A with the error NFS4ERR_NO_GRACE.
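
   A minimal sketch of the per-client record described above, and of a
   reclaim check built on it, is given here for illustration only. The
   names ClientRecord and may_reclaim are hypothetical. Two points are
   assumptions of this sketch rather than requirements of this
   specification: a client with no record at all is refused, and a
   timestamp older than the previous server boot is read as state that
   was established before the server's previous incarnation and never
   reclaimed during it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientRecord:
    """Hypothetical stable-storage record kept for each client."""
    id_string: str
    expired_or_revoked: bool   # lease expired or administrative revocation
    first_state_time: float    # first state acquired after a server boot


def may_reclaim(record: Optional[ClientRecord],
                previous_boot_time: float) -> bool:
    """Return False where the server would answer NFS4ERR_NO_GRACE."""
    if record is None:
        # Assumption: nothing on record, so there is nothing to reclaim.
        return False
    if record.expired_or_revoked:
        # A conflicting lock may have been granted after the lease expired.
        return False
    if record.first_state_time < previous_boot_time:
        # State predates the previous incarnation and was not refreshed
        # during it, so a conflicting lock may have been granted since.
        return False
    return True
```

   Here previous_boot_time stands for the older of the two server
   reboot timestamps that the server keeps in stable storage.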

   For the second edge condition, after the server reboots for a second
   time, the record that the client had an unexpired record lock, share
   reservation, or delegation established before the server's previous
   incarnation means that the server must reject a reclaim from client A
   with the error NFS4ERR_NO_GRACE.

   Regardless of the level and approach to record keeping, the server
   MUST implement one of the following strategies (which apply to
   reclaims of share reservations, record locks, and delegations):

   1.  Reject all reclaims with NFS4ERR_NO_GRACE. This is
       superharsh, but necessary if the server does not want to
       record lock state in stable storage.

   2.  Record sufficient state in stable storage such that all
       known edge conditions involving server reboot, including the
       two noted in this section, are detected. False positives are
       acceptable. Note that at this time, it is not known if there
       are other edge conditions.

       In the event, after a server reboot, the server determines
       that there is unrecoverable damage or corruption to the
       stable storage, then for all clients and/or locks affected,
       the server MUST return NFS4ERR_NO_GRACE.

   A mandate for the client's handling of the NFS4ERR_NO_GRACE error is
   outside the scope of this specification, since the strategies for
   such handling are very dependent on the client's operating
   environment. However, one potential approach is described below.

   When the client receives NFS4ERR_NO_GRACE, it could examine the
   change attribute of the objects the client is trying to reclaim state
   for, and use that to determine whether to re-establish the state via
   normal OPEN or LOCK requests. This is acceptable provided the
   client's operating environment allows it.
   In other words, the client implementor is advised to document the
   behavior for his users. The client could also inform the application
   that its record lock or share reservations (whether they were
   delegated or not) have been lost, such as via a UNIX signal, a GUI
   pop-up window, etc. See the section "Data Caching and Revocation"
   for a discussion of what the client should do for dealing with
   unreclaimed delegations on client state.

   For further discussion of revocation of locks see the section "Server
   Revocation of Locks".

8.7. Recovery from a Lock Request Timeout or Abort

   In the event a lock request times out, a client may decide to not
   retry the request. The client may also abort the request when the
   process for which it was issued is terminated (e.g. in UNIX due to a
   signal). It is possible though that the server received the request
   and acted upon it. This would change the state on the server without
   the client being aware of the change. It is paramount that the
   client re-synchronize state with the server before it attempts any
   other operation that takes a seqid and/or a stateid with the same
   lock_owner. This is straightforward to do without a special
   re-synchronize operation.

   Since the server maintains the last lock request and response
   received on the lock_owner, for each lock_owner, the client should
   cache the last lock request it sent such that the lock request did
   not receive a response. From this, the next time the client does a
   lock operation for the lock_owner, it can send the cached request, if
   there is one, and if the request was one that established state (e.g.
   a LOCK or OPEN operation), the server will return the cached result
   or, if it never saw the request, perform it. The client can follow up
   with a request to remove the state (e.g. a LOCKU or CLOSE operation).
   With this approach, the sequencing and stateid information on the
   client and server for the given lock_owner will re-synchronize and in
   turn the lock state will re-synchronize.

8.8. Server Revocation of Locks

   At any point, the server can revoke locks held by a client and the
   client must be prepared for this event. When the client detects that
   its locks have been or may have been revoked, the client is
   responsible for validating the state information between itself and
   the server. Validating locking state for the client means that it
   must verify or reclaim state for each lock currently held.

   The first instance of lock revocation is upon server reboot or
   re-initialization. In this instance the client will receive an error
   (NFS4ERR_STALE_STATEID or NFS4ERR_STALE_CLIENTID) and the client will
   proceed with normal crash recovery as described in the previous
   section.

   The second lock revocation event is the inability to renew the lease
   before expiration. While this is considered a rare or unusual event,
   the client must be prepared to recover. Both the server and client
   will be able to detect the failure to renew the lease and are capable
   of recovering without data corruption. For the server, it tracks the
   last renewal event serviced for the client and knows when the lease
   will expire. Similarly, the client must track operations which will
   renew the lease period. Using the time that each such request was
   sent and the time that the corresponding reply was received, the
   client should bound the time that the corresponding renewal could
   have occurred on the server and thus determine if it is possible that
   a lease period expiration could have occurred.

   The third lock revocation event can occur as a result of
   administrative intervention within the lease period.
   While this is considered a rare event, it is possible that the
   server's administrator has decided to release or revoke a particular
   lock held by the client. As a result of revocation, the client will
   receive an error of NFS4ERR_ADMIN_REVOKED. In this instance the
   client may assume that only the lock_owner's locks have been lost.
   The client notifies the lock holder appropriately. The client may
   not assume the lease period has been renewed as a result of a failed
   operation.

   When the client determines the lease period may have expired, the
   client must mark all locks held for the associated lease as
   "unvalidated". This means the client has been unable to re-establish
   or confirm the appropriate lock state with the server. As described
   in the previous section on crash recovery, there are scenarios in
   which the server may grant conflicting locks after the lease period
   has expired for a client. When it is possible that the lease period
   has expired, the client must validate each lock currently held to
   ensure that a conflicting lock has not been granted. The client may
   accomplish this task by issuing an I/O request, either a pending I/O
   or a zero-length read, specifying the stateid associated with the
   lock in question. If the response to the request is success, the
   client has validated all of the locks governed by that stateid and
   re-established the appropriate state between itself and the server.
   If the I/O request is not successful, then one or more of the locks
   associated with the stateid was revoked by the server and the client
   must notify the owner.

8.9. Share Reservations

   A share reservation is a mechanism to control access to a file. It
   is a separate and independent mechanism from record locking.
When a client opens a file, it issues an OPEN operation to the server
specifying the type of access required (READ, WRITE, or BOTH) and the
type of access to deny others (deny NONE, READ, WRITE, or BOTH). If
the OPEN fails the client will fail the application's open request.

Pseudo-code definition of the semantics:

     if (request.access == 0)
         return (NFS4ERR_INVAL)
     else
         if ((request.access & file_state.deny) ||
             (request.deny & file_state.access))
             return (NFS4ERR_DENIED)

This checking of share reservations on OPEN is done with no exception
for an existing OPEN for the same open_owner.

The constants used for the OPEN and OPEN_DOWNGRADE operations for the
access and deny fields are as follows:

     const OPEN4_SHARE_ACCESS_READ   = 0x00000001;
     const OPEN4_SHARE_ACCESS_WRITE  = 0x00000002;
     const OPEN4_SHARE_ACCESS_BOTH   = 0x00000003;

     const OPEN4_SHARE_DENY_NONE     = 0x00000000;
     const OPEN4_SHARE_DENY_READ     = 0x00000001;
     const OPEN4_SHARE_DENY_WRITE    = 0x00000002;
     const OPEN4_SHARE_DENY_BOTH     = 0x00000003;

8.10. OPEN/CLOSE Operations

To provide correct share semantics, a client MUST use the OPEN
operation to obtain the initial filehandle and indicate the desired
access and what if any access to deny. Even if the client intends to
use a stateid of all 0's or all 1's, it must still obtain the
filehandle for the regular file with the OPEN operation so the
appropriate share semantics can be applied. For clients that do not
have a deny mode built into their open programming interfaces, deny
equal to NONE should be used.

The OPEN operation with the CREATE flag also subsumes the CREATE
operation for regular files as used in previous versions of the NFS
protocol. This allows a create with a share to be done atomically.
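
As an illustration of the share reservation check given as pseudo-code
in Section 8.9, the following is a minimal Python sketch and not part
of the protocol definition. The function name check_share and the use
of strings for result codes are assumptions made for this example.
Returning NFS4_OK when no conflict is found is also an assumption,
since the pseudo-code lists only the failure cases. The arguments
file_access and file_deny stand for the union of the access and deny
bits of the opens already in effect on the file.

```python
# Constants as listed in the specification for the access and deny fields.
OPEN4_SHARE_ACCESS_READ = 0x00000001
OPEN4_SHARE_ACCESS_WRITE = 0x00000002
OPEN4_SHARE_ACCESS_BOTH = 0x00000003

OPEN4_SHARE_DENY_NONE = 0x00000000
OPEN4_SHARE_DENY_READ = 0x00000001
OPEN4_SHARE_DENY_WRITE = 0x00000002
OPEN4_SHARE_DENY_BOTH = 0x00000003


def check_share(req_access, req_deny, file_access, file_deny):
    """Hypothetical helper mirroring the share reservation pseudo-code.

    Result codes are returned as strings for readability (an assumption
    of this sketch; the protocol uses numeric status values).
    """
    if req_access == 0:
        # An OPEN must request at least one kind of access.
        return "NFS4ERR_INVAL"
    if (req_access & file_deny) or (req_deny & file_access):
        # Either the requested access is denied by an existing open, or
        # the requested deny mode conflicts with access already granted.
        return "NFS4ERR_DENIED"
    # Assumed success value; the pseudo-code only lists the error cases.
    return "NFS4_OK"
```

For example, a request for READ access with deny NONE conflicts with an
existing open that denies READ, but not with one that denies only
WRITE.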
The CLOSE operation removes all share reservations held by the
lock_owner on that file. If record locks are held, the client SHOULD
release all locks before issuing a CLOSE. The server MAY free all
outstanding locks on CLOSE but some servers may not support the CLOSE
of a file that still has record locks held. The server MUST return
failure, NFS4ERR_LOCKS_HELD, if any locks would exist after the
CLOSE.

The LOOKUP operation will return a filehandle without establishing
any lock state on the server. Without a valid stateid, the server
will assume the client has the least access. For example, a file
opened with deny READ/WRITE cannot be accessed using a filehandle
obtained through LOOKUP because the access would not carry a valid
stateid (i.e. it would use a stateid of all bits 0 or all bits 1).

8.10.1. Close and Retention of State Information

Since a CLOSE operation requests deallocation of a stateid, dealing
with retransmission of the CLOSE may pose special difficulties: the
state information that would normally be used to determine the state
of the open file being designated might already be deallocated,
resulting in an NFS4ERR_BAD_STATEID error.

Servers may deal with this problem in a number of ways. To provide
the greatest degree of assurance that the protocol is being used
properly, a server should, rather than deallocate the stateid, mark
it as close-pending, and retain the stateid with this status, until
later deallocation. In this way, a retransmitted CLOSE can be
recognized since the stateid points to state information with this
distinctive status, so that it can be handled without error.

When adopting this strategy, a server should retain the state
information until the earliest of:

   o  Another validly sequenced request for the same lockowner that
      is not a retransmission.

   o  The time that a lockowner is freed by the server due to a
      period with no activity.

   o  All locks for the client are freed as a result of a SETCLIENTID.

Servers may avoid this complexity, at the cost of less complete
protocol error checking, by simply responding NFS4_OK in the event of
a CLOSE for a deallocated stateid, on the assumption that this case
must be caused by a retransmitted close. When adopting this
approach, it is desirable to at least log an error when returning a
no-error indication in this situation. If the server maintains a
reply-cache mechanism, it can verify the CLOSE is indeed a
retransmission and avoid error logging in most cases.

8.11. Open Upgrade and Downgrade

When an OPEN is done for a file and the lockowner for which the open
is being done already has the file open, the result is to upgrade the
open file status maintained on the server to include the access and
deny bits specified by the new OPEN as well as those for the existing
OPEN. The result is that there is one open file, as far as the
protocol is concerned, and it includes the union of the access and
deny bits for all of the OPEN requests completed. Only a single
CLOSE will be done to reset the effects of both OPENs. Note that the
client, when issuing the OPEN, may not know that the same file is in
fact being opened. The above only applies if both OPENs result in
the OPENed object being designated by the same filehandle.

When the server chooses to export multiple filehandles corresponding
to the same file object and returns different filehandles on two
different OPENs of the same file object, the server MUST NOT "OR"
together the access and deny bits and coalesce the two open files.
Instead the server must maintain separate OPENs with separate
stateids and will require separate CLOSEs to free them.

When multiple open files on the client are merged into a single open
file object on the server, the close of one of the open files (on the
client) may necessitate change of the access and deny status of the
open file on the server. This is because the union of the access and
deny bits for the remaining opens may be smaller (i.e. a proper
subset) than previously. The OPEN_DOWNGRADE operation is used to
make the necessary change and the client should use it to update the
server so that share reservation requests by other clients are
handled properly.

8.12. Short and Long Leases

When determining the time period for the server lease, the usual
lease tradeoffs apply. Short leases are good for fast server
recovery at a cost of increased RENEW or READ (with zero length)
requests. Longer leases are certainly kinder and gentler to servers
trying to handle very large numbers of clients. The number of RENEW
requests drops in proportion to the lease time. The disadvantages of
long leases are slower recovery after server failure (the server must
wait for the leases to expire and the grace period to elapse before
granting new lock requests) and increased file contention (if a
client fails to transmit an unlock request, the server must wait for
lease expiration before granting new locks).

Long leases are usable if the server is able to store lease state in
non-volatile memory. Upon recovery, the server can reconstruct the
lease state from its non-volatile memory and continue operation with
its clients and therefore long leases would not be an issue.

8.13. Clocks, Propagation Delay, and Calculating Lease Expiration

To avoid the need for synchronized clocks, lease times are granted by
the server as a time delta. However, there is a requirement that the
client and server clocks do not drift excessively over the duration
of the lock. There is also the issue of propagation delay across the
network which could easily be several hundred milliseconds as well as
the possibility that requests will be lost and need to be
retransmitted.

To take propagation delay into account, the client should subtract it
from lease times (e.g. if the client estimates the one-way
propagation delay as 200 msec, then it can assume that the lease is
already 200 msec old when it gets it). In addition, it will take
another 200 msec to get a response back to the server. So the client
must send a lock renewal or write data back to the server 400 msec
before the lease would expire.

The server's lease period configuration should take into account the
network distance of the clients that will be accessing the server's
resources. It is expected that the lease period will take into
account the network propagation delays and other network delay
factors for the client population. Since the protocol does not allow
for an automatic method to determine an appropriate lease period, the
server's administrator may have to tune the lease period.

8.14. Migration, Replication and State

When responsibility for handling a given file system is transferred
to a new server (migration) or the client chooses to use an alternate
server (e.g. in response to server unresponsiveness) in the context
of file system replication, the appropriate handling of state shared
between the client and server (i.e. locks, leases, stateids, and
clientids) is as described below. The handling differs between
migration and replication.
For related discussion of file server state and recovery of such see
the sections under "File Locking and Share Reservations".

If a server replica or a server immigrating a filesystem agrees to,
or is expected to, accept opaque values from the client that
originated from another server, then it is a wise implementation
practice for the servers to encode the "opaque" values in network
byte order. This way, servers acting as replicas or immigrating
filesystems will be able to parse values like stateids, directory
cookies, filehandles, etc. even if their native byte order is
different from other servers cooperating in the replication and
migration of the filesystem.

8.14.1. Migration and State

In the case of migration, the servers involved in the migration of a
filesystem SHOULD transfer all server state from the original to the
new server. This must be done in a way that is transparent to the
client. This state transfer will ease the client's transition when a
filesystem migration occurs. If the servers are successful in
transferring all state, the client will continue to use stateids
assigned by the original server. Therefore the new server must
recognize these stateids as valid. This holds true for the clientid
as well. Since responsibility for an entire filesystem is
transferred with a migration event, there is no possibility that
conflicts will arise on the new server as a result of the transfer of
locks.

As part of the transfer of information between servers, leases would
be transferred as well. The leases being transferred to the new
server will typically have a different expiration time from those for
the same client, previously on the old server.
To maintain the property that all leases on a given server for a
given client expire at the same time, the server should advance the
expiration time to the later of the leases being transferred or the
leases already present. This allows the client to maintain lease
renewal of both classes without special effort.

The servers may choose not to transfer the state information upon
migration. However, this choice is discouraged. In this case, when
the client presents state information from the original server, the
client must be prepared to receive either NFS4ERR_STALE_CLIENTID or
NFS4ERR_STALE_STATEID from the new server. The client should then
recover its state information as it normally would in response to a
server failure. The new server must take care to allow for the
recovery of state information as it would in the event of server
restart.

8.14.2. Replication and State

Since client switch-over in the case of replication is not under
server control, the handling of state is different. In this case,
leases, stateids and clientids do not have validity across a
transition from one server to another. The client must re-establish
its locks on the new server. This can be compared to the
re-establishment of locks by means of reclaim-type requests after a
server reboot. The difference is that the server has no provision to
distinguish requests reclaiming locks from those obtaining new locks
or to defer the latter. Thus, a client re-establishing a lock on the
new server (by means of a LOCK or OPEN request), may have the
requests denied due to a conflicting lock. Since replication is
intended for read-only use of filesystems, such denial of locks
should not pose large difficulties in practice.
When an attempt to re-establish a lock on a new server is denied, the
client should treat the situation as if its original lock had been
revoked.

8.14.3. Notification of Migrated Lease

In the case of lease renewal, the client may not be submitting
requests for a filesystem that has been migrated to another server.
This can occur because of the implicit lease renewal mechanism. The
client renews leases for all filesystems when submitting a request to
any one filesystem at the server.

In order for the client to schedule renewal of leases that may have
been relocated to the new server, the client must find out about
lease relocation before those leases expire. To accomplish this, all
operations which implicitly renew leases for a client (i.e. OPEN,
CLOSE, READ, WRITE, RENEW, LOCK, LOCKT, LOCKU), will return the error
NFS4ERR_LEASE_MOVED if responsibility for any of the leases to be
renewed has been transferred to a new server. This condition will
continue until the client receives an NFS4ERR_MOVED error and the
server receives the subsequent GETATTR(fs_locations) for an access to
each filesystem for which a lease has been moved to a new server.

When a client receives an NFS4ERR_LEASE_MOVED error, it should
perform an operation on each filesystem associated with the server in
question. When the client receives an NFS4ERR_MOVED error, the
client can follow the normal process to obtain the new server
information (through the fs_locations attribute) and perform renewal
of those leases on the new server. If the server has not had state
transferred to it transparently, the client will receive either
NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new server,
as described above, and the client can then recover state information
as it does in the event of server failure.

8.14.4. Migration and the Lease_time Attribute

In order that the client may appropriately manage its leases in the
case of migration, the destination server must establish proper
values for the lease_time attribute.

When state is transferred transparently, that state should include
the correct value of the lease_time attribute. The lease_time
attribute on the destination server must never be less than that on
the source since this would result in premature expiration of leases
granted by the source server. Upon migration in which state is
transferred transparently, the client is under no obligation to
re-fetch the lease_time attribute and may continue to use the value
previously fetched (on the source server).

If state has not been transferred transparently (i.e. the client sees
a real or simulated server reboot), the client should fetch the value
of lease_time on the new (i.e. destination) server, and use it for
subsequent locking requests. However the server must respect a grace
period at least as long as the lease_time on the source server, in
order to ensure that clients have ample time to reclaim their locks
before potentially conflicting non-reclaimed locks are granted. The
means by which the new server obtains the value of lease_time on the
old server is left to the server implementations. It is not
specified by the NFS version 4 protocol.

9. Client-Side Caching

Client-side caching of data, of file attributes, and of file names is
essential to providing good performance with the NFS protocol.
Providing distributed cache coherence is a difficult problem and
previous versions of the NFS protocol have not attempted it.
Instead, several NFS client implementation techniques have been used
to reduce the problems that a lack of coherence poses for users.
These techniques have not been clearly defined by earlier protocol
specifications and it is often unclear what is valid or invalid
client behavior.

The NFS version 4 protocol uses many techniques similar to those that
have been used in previous protocol versions. The NFS version 4
protocol does not provide distributed cache coherence. However, it
defines a more limited set of caching guarantees to allow locks and
share reservations to be used without destructive interference from
client side caching.

In addition, the NFS version 4 protocol introduces a delegation
mechanism which allows many decisions normally made by the server to
be made locally by clients. This mechanism provides efficient
support of the common cases where sharing is infrequent or where
sharing is read-only.

9.1. Performance Challenges for Client-Side Caching

Caching techniques used in previous versions of the NFS protocol have
been successful in providing good performance. However, several
scalability challenges can arise when those techniques are used with
very large numbers of clients. This is particularly true when
clients are geographically distributed, which classically increases
the latency for cache revalidation requests.

The previous versions of the NFS protocol repeat their file data
cache validation requests at the time the file is opened. This
behavior can have serious performance drawbacks. A common case is
one in which a file is only accessed by a single client. Therefore,
sharing is infrequent.

In this case, repeated reference to the server to find that no
conflicts exist is expensive.
A better option with regard to performance is to allow a client that
repeatedly opens a file to do so without reference to the server.
This is done until potentially conflicting operations from another
client actually occur.

A similar situation arises in connection with file locking. Sending
file lock and unlock requests to the server as well as the read and
write requests necessary to make data caching consistent with the
locking semantics (see the section "Data Caching and File Locking")
can severely limit performance. When locking is used to provide
protection against infrequent conflicts, a large penalty is incurred.
This penalty may discourage the use of file locking by applications.

The NFS version 4 protocol provides more aggressive caching
strategies with the following design goals:

   o  Compatibility with a large range of server semantics.

   o  Provide the same caching benefits as previous versions of the
      NFS protocol when unable to provide the more aggressive model.

   o  Requirements for aggressive caching are organized so that a
      large portion of the benefit can be obtained even when not all
      of the requirements can be met.

The appropriate requirements for the server are discussed in later
sections in which specific forms of caching are covered (see the
section "Open Delegation").

9.2. Delegation and Callbacks

Recallable delegation of server responsibilities for a file to a
client improves performance by avoiding repeated requests to the
server in the absence of inter-client conflict. With the use of a
"callback" RPC from server to client, a server recalls delegated
responsibilities when another client engages in sharing of a
delegated file.

A delegation is passed from the server to the client, specifying the
object of the delegation and the type of delegation. There are
different types of delegations but each type contains a stateid to be
used to represent the delegation when performing operations that
depend on the delegation. This stateid is similar to those
associated with locks and share reservations but differs in that the
stateid for a delegation is associated with a clientid and may be
used on behalf of all the open_owners for the given client. A
delegation is made to the client as a whole and not to any specific
process or thread of control within it.

Because callback RPCs may not work in all environments (due to
firewalls, for example), correct protocol operation does not depend
on them. Preliminary testing of callback functionality by means of a
CB_NULL procedure determines whether callbacks can be supported. The
CB_NULL procedure checks the continuity of the callback path. A
server makes a preliminary assessment of callback availability to a
given client and avoids delegating responsibilities until it has
determined that callbacks are supported. Because the granting of a
delegation is always conditional upon the absence of conflicting
access, clients must not assume that a delegation will be granted and
they must always be prepared for OPENs to be processed without any
delegations being granted.

Once granted, a delegation behaves in most ways like a lock. There
is an associated lease that is subject to renewal together with all
of the other leases held by that client.

Unlike locks, an operation by a second client to a delegated file
will cause the server to recall a delegation through a callback.
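
The conditions under which a server may grant a delegation, as
described earlier in this section, can be sketched as follows. This
is an illustrative Python fragment under stated assumptions and not a
required implementation: the ClientRecord class, its
callback_verified field, and both function names are hypothetical,
and the CB_NULL probe itself is represented only by its boolean
outcome.

```python
from dataclasses import dataclass


@dataclass
class ClientRecord:
    """Hypothetical per-client server state used only for this sketch."""
    clientid: int
    callback_verified: bool = False  # True once a CB_NULL probe succeeded


def record_cb_null_result(client: ClientRecord, succeeded: bool) -> None:
    # The outcome of the CB_NULL probe is the server's assessment of
    # whether the callback path to this client is usable.
    client.callback_verified = succeeded


def may_grant_delegation(client: ClientRecord,
                         conflicting_access: bool) -> bool:
    # A delegation is withheld until callbacks are known to work, and
    # granting is always conditional on the absence of conflicting access.
    return client.callback_verified and not conflicting_access
```

When the function returns False the OPEN is simply processed without a
delegation, which is why clients must not depend on receiving one.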

On recall, the client holding the delegation must flush modified
state (such as modified data) to the server and return the
delegation. The conflicting request will not receive a response
until the recall is complete. The recall is considered complete when
the client returns the delegation or the server times out on the
recall and revokes the delegation as a result of the timeout.
Following the resolution of the recall, the server has the
information necessary to grant or deny the second client's request.

At the time the client receives a delegation recall, it may have
substantial state that needs to be flushed to the server. Therefore,
the server should allow sufficient time for the delegation to be
returned since it may involve numerous RPCs to the server. If the
server is able to determine that the client is diligently flushing
state to the server as a result of the recall, the server may extend
the usual time allowed for a recall. However, the time allowed for
recall completion should not be unbounded.

An example of this is when responsibility to mediate opens on a given
file is delegated to a client (see the section "Open Delegation").
The server will not know what opens are in effect on the client.
Without this knowledge the server will be unable to determine if the
access and deny state for the file allows any particular open until
the delegation for the file has been returned.

A client failure or a network partition can result in failure to
respond to a recall callback. In this case, the server will revoke
the delegation which in turn will render useless any modified state
still on the client.

9.2.1. Delegation Recovery

There are three situations that delegation recovery must deal with:

   o  Client reboot or restart

   o  Server reboot or restart

   o  Network partition (full or callback-only)

In the event the client reboots or restarts, the failure to renew
leases will result in the revocation of record locks and share
reservations. Delegations, however, may be treated a bit
differently.

There will be situations in which delegations will need to be
reestablished after a client reboots or restarts. The reason for
this is the client may have file data stored locally and this data
was associated with the previously held delegations. The client will
need to reestablish the appropriate file state on the server.

To allow for this type of client recovery, the server MAY extend the
period for delegation recovery beyond the typical lease expiration
period. This implies that requests from other clients that conflict
with these delegations will need to wait. Because the normal recall
process may require significant time for the client to flush changed
state to the server, other clients need to be prepared for delays
that occur because of a conflicting delegation. This longer interval
would increase the window for clients to reboot and consult stable
storage so that the delegations can be reclaimed. For open
delegations, such delegations are reclaimed using OPEN with a claim
type of CLAIM_DELEGATE_PREV. (See the sections on "Data Caching and
Revocation" and "Operation 18: OPEN" for discussion of open
delegation and the details of OPEN respectively).

A server MAY support a claim type of CLAIM_DELEGATE_PREV, but if it
does, it MUST NOT remove delegations upon SETCLIENTID_CONFIRM, and
instead MUST, for a period of time no less than that of the value of
the lease_time attribute, maintain the client's delegations to allow
time for the client to issue CLAIM_DELEGATE_PREV requests. The server
that supports CLAIM_DELEGATE_PREV MUST support the DELEGPURGE
operation.

When the server reboots or restarts, delegations are reclaimed (using
the OPEN operation with CLAIM_PREVIOUS) in a similar fashion to
record locks and share reservations. However, there is a slight
semantic difference. In the normal case if the server decides that a
delegation should not be granted, it performs the requested action
(e.g. OPEN) without granting any delegation. For reclaim, the server
grants the delegation but a special designation is applied so that
the client treats the delegation as having been granted but recalled
by the server. Because of this, the client has the duty to write all
modified state to the server and then return the delegation. This
process of handling delegation reclaim reconciles three principles of
the NFS version 4 protocol:

   o  Upon reclaim, a client reporting resources assigned to it by an
      earlier server instance must be granted those resources.

   o  The server has unquestionable authority to determine whether
      delegations are to be granted and, once granted, whether they
      are to be continued.

   o  The use of callbacks is not to be depended upon until the client
      has proven its ability to receive them.

When a network partition occurs, delegations are subject to freeing
by the server when the lease renewal period expires. This is similar
to the behavior for locks and share reservations.
For delegations, however, the server may extend the period in which
conflicting requests are held off. Eventually the occurrence of a
conflicting request from another client will cause revocation of the
delegation. A loss of the callback path (e.g. by later network
configuration change) will have the same effect. A recall request
will fail and revocation of the delegation will result.

A client normally finds out about revocation of a delegation when it
uses a stateid associated with a delegation and receives the error
NFS4ERR_EXPIRED. It also may find out about delegation revocation
after a client reboot when it attempts to reclaim a delegation and
receives that same error. Note that in the case of a revoked write
open delegation, there are issues because data may have been modified
by the client whose delegation is revoked and separately by other
clients. See the section "Revocation Recovery for Write Open
Delegation" for a discussion of such issues. Note also that when
delegations are revoked, information about the revoked delegation
will be written by the server to stable storage (as described in the
section "Crash Recovery"). This is done to deal with the case in
which a server reboots after revoking a delegation but before the
client holding the revoked delegation is notified about the
revocation.

9.3. Data Caching

When applications share access to a set of files, they need to be
implemented so as to take account of the possibility of conflicting
access by another application. This is true whether the applications
in question execute on different clients or reside on the same
client.

Share reservations and record locks are the facilities the NFS
version 4 protocol provides to allow applications to coordinate
access by providing mutual exclusion facilities.
The NFS version 4 protocol's data caching must be implemented such
that it does not invalidate the assumptions that those using these
facilities depend upon.

9.3.1. Data Caching and OPENs

In order to avoid invalidating the sharing assumptions that
applications rely on, NFS version 4 clients should not provide cached
data to applications or modify it on behalf of an application when it
would not be valid to obtain or modify that same data via a READ or
WRITE operation.

Furthermore, in the absence of open delegation (see the section "Open
Delegation") two additional rules apply. Note that these rules are
obeyed in practice by many NFS version 2 and version 3 clients.

   o  First, cached data present on a client must be revalidated after
      doing an OPEN. Revalidating means that the client fetches the
      change attribute from the server, compares it with the cached
      change attribute, and if different, declares the cached data (as
      well as the cached attributes) as invalid. This is to ensure
      that the data for the OPENed file is still correctly reflected
      in the client's cache. This validation must be done at least
      when the client's OPEN operation includes DENY=WRITE or BOTH
      thus terminating a period in which other clients may have had
      the opportunity to open the file with WRITE access. Clients may
      choose to do the revalidation more often (i.e. at OPENs
      specifying DENY=NONE) to parallel the NFS version 3 protocol's
      practice for the benefit of users assuming this degree of cache
      revalidation.

      Since the change attribute is updated for data and metadata
      modifications, some client implementors may be tempted to use
      the time_modify attribute and not change to validate cached
      data, so that metadata changes do not spuriously invalidate
      clean data.
      The implementor should be cautious with this approach. The change attribute is guaranteed to change for each update to the file, whereas time_modify is guaranteed to change only at the granularity of the time_delta attribute. Use by the client's data cache validation logic of time_modify and not change runs the risk of the client incorrectly marking stale data as valid.

   o  Second, modified data must be flushed to the server before closing a file OPENed for write. This is complementary to the first rule. If the data is not flushed at CLOSE, the revalidation done after the client OPENs a file is unable to achieve its purpose. The other aspect to flushing the data before close is that the data must be committed to stable storage, at the server, before the CLOSE operation is requested by the client. In the case of a server reboot or restart and a CLOSEd file, it may not be possible to retransmit the data to be written to the file. Hence, this requirement.

9.3.2.  Data Caching and File Locking

For those applications that choose to use file locking instead of share reservations to exclude inconsistent file access, there is an analogous set of constraints that apply to client side data caching. These rules are effective only if the file locking is used in a way that matches in an equivalent way the actual READ and WRITE operations executed. This is as opposed to file locking that is based on pure convention. For example, it is possible to manipulate a two-megabyte file by dividing the file into two one-megabyte regions and protecting access to the two regions by file locks on bytes zero and one. A lock for write on byte zero of the file would represent the right to do READ and WRITE operations on the first region.
A lock for write on byte one of the file would represent the right to do READ and WRITE operations on the second region. As long as all applications manipulating the file obey this convention, they will work on a local filesystem. However, they may not work with the NFS version 4 protocol unless clients refrain from data caching.

The rules for data caching in the file locking environment are:

   o  First, when a client obtains a file lock for a particular region, the data cache corresponding to that region (if any cache data exists) must be revalidated. If the change attribute indicates that the file may have been updated since the cached data was obtained, the client must flush or invalidate the cached data for the newly locked region. A client might choose to invalidate all of the non-modified cached data that it has for the file but the only requirement for correct operation is to invalidate all of the data in the newly locked region.

   o  Second, before releasing a write lock for a region, all modified data for that region must be flushed to the server. The modified data must also be written to stable storage.

Note that flushing data to the server and the invalidation of cached data must reflect the actual byte ranges locked or unlocked. Rounding these up or down to reflect client cache block boundaries will cause problems if not carefully done. For example, writing a modified block when only half of that block is within an area being unlocked may cause invalid modification to the region outside the unlocked area. This, in turn, may be part of a region locked by another client. Clients can avoid this situation by synchronously performing portions of write operations that overlap that portion (initial or final) that is not a full block.
Similarly, invalidating a locked area which is not an integral number of full buffer blocks would require the client to read one or two partial blocks from the server if the revalidation procedure shows that the data which the client possesses may not be valid.

The data that is written to the server as a prerequisite to the unlocking of a region must be written, at the server, to stable storage. The client may accomplish this either with synchronous writes or by following asynchronous writes with a COMMIT operation. This is required because retransmission of the modified data after a server reboot might conflict with a lock held by another client.

A client implementation may choose to accommodate applications which use record locking in non-standard ways (e.g. using a record lock as a global semaphore) by flushing to the server more data upon a LOCKU than is covered by the locked range. This may include modified data within files other than the one for which the unlocks are being done. In such cases, the client must not interfere with applications whose READs and WRITEs are being done only within the bounds of record locks which the application holds. For example, an application locks a single byte of a file and proceeds to write that single byte. A client that chose to handle a LOCKU by flushing all modified data to the server could validly write that single byte in response to an unrelated unlock. However, it would not be valid to write the entire block in which that single written byte was located since it includes an area that is not locked and might be locked by another client.
Client implementations can avoid this problem by dividing files with modified data into those for which all modifications are done to areas covered by an appropriate record lock and those for which there are modifications not covered by a record lock. Any writes done for the former class of files must not include areas not locked and thus not modified on the client.

9.3.3.  Data Caching and Mandatory File Locking

Client side data caching needs to respect mandatory file locking when it is in effect. The presence of mandatory file locking for a given file is indicated when the client gets back NFS4ERR_LOCKED from a READ or WRITE on a file it has an appropriate share reservation for. When mandatory locking is in effect for a file, the client must check for an appropriate file lock for data being read or written. If a lock exists for the range being read or written, the client may satisfy the request using the client's validated cache. If an appropriate file lock is not held for the range of the read or write, the read or write request must not be satisfied by the client's cache and the request must be sent to the server for processing. When a read or write request partially overlaps a locked region, the request should be subdivided into multiple pieces with each region (locked or not) treated appropriately.

9.3.4.  Data Caching and File Identity

When clients cache data, the file data needs to be organized according to the filesystem object to which the data belongs. For NFS version 3 clients, the typical practice has been to assume for the purpose of caching that distinct filehandles represent distinct filesystem objects. The client then has the choice to organize and maintain the data cache on this basis.
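
As an illustration of the request subdivision described in the section "Data Caching and Mandatory File Locking", the following is a minimal sketch, not a prescribed implementation. It assumes byte ranges are represented as half-open (start, end) pairs and that the client tracks the locks it holds as a list of such pairs. The name split_by_locks and this representation are hypothetical and do not come from the protocol.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # hypothetical half-open byte range [start, end)


def split_by_locks(req: Range, held_locks: List[Range]) -> List[Tuple[int, int, bool]]:
    """Split a READ/WRITE range into pieces tagged with lock coverage.

    Each returned piece is (start, end, locked). A piece with locked=True
    lies entirely inside one held lock and may be satisfied from the
    validated cache; a piece with locked=False must be sent to the server.
    """
    start, end = req
    # Cut the request at every lock boundary that falls strictly inside it.
    cuts = {start, end}
    for lo, hi in held_locks:
        for edge in (lo, hi):
            if start < edge < end:
                cuts.add(edge)
    points = sorted(cuts)

    pieces = []
    for a, b in zip(points, points[1:]):
        locked = any(lo <= a and b <= hi for lo, hi in held_locks)
        pieces.append((a, b, locked))
    return pieces


if __name__ == "__main__":
    # A request for bytes 0..99 when only bytes 50..199 are locked.
    for piece in split_by_locks((0, 100), [(50, 200)]):
        print(piece)
```

Cutting at lock boundaries, rather than at cache block boundaries, follows the earlier requirement that cache handling reflect the actual byte ranges locked.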

In the NFS version 4 protocol, there is now the possibility to have significant deviations from a "one filehandle per object" model because a filehandle may be constructed on the basis of the object's pathname. Therefore, clients need a reliable method to determine if two filehandles designate the same filesystem object. If clients were simply to assume that all distinct filehandles denote distinct objects and proceed to do data caching on this basis, caching inconsistencies would arise between the distinct client side objects which mapped to the same server side object.

By providing a method to differentiate filehandles, the NFS version 4 protocol alleviates a potential functional regression in comparison with the NFS version 3 protocol. Without this method, caching inconsistencies within the same client could occur and this has not been present in previous versions of the NFS protocol. Note that it is possible to have such inconsistencies with applications executing on multiple clients but that is not the issue being addressed here.

For the purposes of data caching, the following steps allow an NFS version 4 client to determine whether two distinct filehandles denote the same server side object:

   o  If GETATTR directed to two filehandles returns different values of the fsid attribute, then the filehandles represent distinct objects.

   o  If GETATTR for any file with an fsid that matches the fsid of the two filehandles in question returns a unique_handles attribute with a value of TRUE, then the two objects are distinct.

   o  If GETATTR directed to the two filehandles does not return the fileid attribute for both of the handles, then it cannot be determined whether the two objects are the same. Therefore, operations which depend on that knowledge (e.g.
      client side data caching) cannot be done reliably.

   o  If GETATTR directed to the two filehandles returns different values for the fileid attribute, then they are distinct objects.

   o  Otherwise they are the same object.

9.4.  Open Delegation

When a file is being OPENed, the server may delegate further handling of opens and closes for that file to the opening client. Any such delegation is recallable, since the circumstances that allowed for the delegation are subject to change. In particular, if the server receives a conflicting OPEN from another client, the server must recall the delegation before deciding whether the OPEN from the other client may be granted. Making a delegation is up to the server and clients should not assume that any particular OPEN either will or will not result in an open delegation. The following is a typical set of conditions that servers might use in deciding whether OPEN should be delegated:

   o  The client must be able to respond to the server's callback requests. The server will use the CB_NULL procedure for a test of callback ability.

   o  The client must have responded properly to previous recalls.

   o  There must be no current open conflicting with the requested delegation.

   o  There should be no current delegation that conflicts with the delegation being requested.

   o  The probability of future conflicting open requests should be low based on the recent history of the file.

   o  The existence of any server-specific semantics of OPEN/CLOSE that would make the required handling incompatible with the prescribed handling that the delegated client would apply (see below).

There are two types of open delegations, read and write.
A read open delegation allows a client to handle, on its own, requests to open a file for reading that do not deny read access to others. Multiple read open delegations may be outstanding simultaneously and do not conflict. A write open delegation allows the client to handle, on its own, all opens. Only one write open delegation may exist for a given file at a given time and it is inconsistent with any read open delegations.

When a client has a read open delegation, it may not make any changes to the contents or attributes of the file but it is assured that no other client may do so. When a client has a write open delegation, it may modify the file data since no other client will be accessing the file's data. The client holding a write delegation may only affect file attributes which are intimately connected with the file data: size, time_modify, change.

When a client has an open delegation, it does not send OPENs or CLOSEs to the server but updates the appropriate status internally. For a read open delegation, opens that cannot be handled locally (opens for write or that deny read access) must be sent to the server.

When an open delegation is made, the response to the OPEN contains an open delegation structure which specifies the following:

   o  the type of delegation (read or write)

   o  space limitation information to control flushing of data on close (write open delegation only, see the section "Open Delegation and Data Caching")

   o  an nfsace4 specifying read and write permissions

   o  a stateid to represent the delegation for READ and WRITE

The delegation stateid is separate and distinct from the stateid for the OPEN proper.
The standard stateid, unlike the delegation stateid, is associated with a particular lock_owner and will continue to be valid after the delegation is recalled and the file remains open.

When a request internal to the client is made to open a file and open delegation is in effect, it will be accepted or rejected solely on the basis of the following conditions. Any requirement for other checks to be made by the delegate should result in open delegation being denied so that the checks can be made by the server itself.

   o  The access and deny bits for the request and the file as described in the section "Share Reservations".

   o  The read and write permissions as determined below.

The nfsace4 passed with delegation can be used to avoid frequent ACCESS calls. The permission check should be as follows:

   o  If the nfsace4 indicates that the open may be done, then it should be granted without reference to the server.

   o  If the nfsace4 indicates that the open may not be done, then an ACCESS request must be sent to the server to obtain the definitive answer.

The server may return an nfsace4 that is more restrictive than the actual ACL of the file. This includes an nfsace4 that specifies denial of all access. Note that some common practices such as mapping the traditional user "root" to the user "nobody" may make it incorrect to return the actual ACL of the file in the delegation response.

The use of delegation together with various other forms of caching creates the possibility that no server authentication will ever be performed for a given user since all of the user's requests might be satisfied locally. Where the client is depending on the server for authentication, the client should be sure authentication occurs for each user by use of the ACCESS operation.
This should be the case even if an ACCESS operation would not be required otherwise. As mentioned before, the server may enforce frequent authentication by returning an nfsace4 denying all access with every open delegation.

9.4.1.  Open Delegation and Data Caching

OPEN delegation allows much of the message overhead associated with opening and closing files to be eliminated. An open when an open delegation is in effect does not require that a validation message be sent to the server. The continued endurance of the "read open delegation" provides a guarantee that no OPEN for write and thus no write has occurred. Similarly, when closing a file opened for write and if write open delegation is in effect, the data written does not have to be flushed to the server until the open delegation is recalled. The continued endurance of the open delegation provides a guarantee that no open and thus no read or write has been done by another client.

For the purposes of open delegation, READs and WRITEs done without an OPEN are treated as the functional equivalents of a corresponding type of OPEN. This refers to the READs and WRITEs that use the special stateids consisting of all zero bits or all one bits. Therefore, READs or WRITEs with a special stateid done by another client will force the server to recall a write open delegation. A WRITE with a special stateid done by another client will force a recall of read open delegations.

With delegations, a client is able to avoid writing data to the server when the CLOSE of a file is serviced. The file close system call is the usual point at which the client is notified of a lack of stable storage for the modified file data generated by the application.
At the close, file data is written to the server and through normal accounting the server is able to determine if the available filesystem space for the data has been exceeded (i.e. server returns NFS4ERR_NOSPC or NFS4ERR_DQUOT). This accounting includes quotas. The introduction of delegations requires that an alternative method be in place for the same type of communication to occur between client and server.

In the delegation response, the server provides either the limit of the size of the file or the number of modified blocks and associated block size. The server must ensure that the client will be able to flush data to the server of a size equal to that provided in the original delegation. The server must make this assurance for all outstanding delegations. Therefore, the server must be careful in its management of available space for new or modified data taking into account available filesystem space and any applicable quotas. The server can recall delegations as a result of managing the available filesystem space. The client should abide by the server's stated space limits for delegations. If the client exceeds the stated limits for the delegation, the server's behavior is undefined.

Based on server conditions, quotas or available filesystem space, the server may grant write open delegations with very restrictive space limitations. The limitations may be defined in a way that will always force modified data to be flushed to the server on close.

With respect to authentication, flushing modified data to the server after a CLOSE has occurred may be problematic. For example, the user of the application may have logged off the client and unexpired authentication credentials may not be present.
In this case, the client may need to take special care to ensure that local unexpired credentials will in fact be available. This may be accomplished by tracking the expiration time of credentials and flushing data well in advance of their expiration or by making private copies of credentials to assure their availability when needed.

9.4.2.  Open Delegation and File Locks

When a client holds a write open delegation, lock operations are performed locally. This includes those required for mandatory file locking. This can be done since the delegation implies that there can be no conflicting locks. Similarly, all of the revalidations that would normally be associated with obtaining locks and the flushing of data associated with the releasing of locks need not be done.

When a client holds a read open delegation, lock operations are not performed locally. All lock operations, including those requesting non-exclusive locks, are sent to the server for resolution.

9.4.3.  Handling of CB_GETATTR

The server needs to employ special handling for a GETATTR where the target is a file that has a write open delegation in effect. The reason for this is that the client holding the write delegation may have modified the data and the server needs to reflect this change to the second client that submitted the GETATTR. Therefore, the client holding the write delegation needs to be interrogated. The server will use the CB_GETATTR operation. The only attributes that the server can reliably query via CB_GETATTR are size and change.

Since CB_GETATTR is being used to satisfy another client's GETATTR request, the server only needs to know if the client holding the delegation has a modified version of the file.
If the client's copy of the delegated file is not modified (data or size), the server can satisfy the second client's GETATTR request from the attributes stored locally at the server. If the file is modified, the server only needs to know about this modified state. If the server determines that the file is currently modified, it will respond to the second client's GETATTR as if the file had been modified locally at the server.

Since the form of the change attribute is determined by the server and is opaque to the client, the client and server need to agree on a method of communicating the modified state of the file. For the size attribute, the client will report its current view of the file size. For the change attribute, the handling is more involved.

For the client, the following steps will be taken when receiving a write delegation:

   o  The value of the change attribute will be obtained from the server and cached. Let this value be represented by c.

   o  The client will create a value greater than c that will be used for communicating that modified data is held at the client. Let this value be represented by d.

   o  When the client is queried via CB_GETATTR for the change attribute, it checks to see if it holds modified data. If the file is modified, the value d is returned for the change attribute value. If this file is not currently modified, the client returns the value c for the change attribute.

For simplicity of implementation, the client MAY for each CB_GETATTR return the same value d. This is true even if, between successive CB_GETATTR operations, the client again modifies the file's data or metadata in its cache.
The client can return the same value because the only requirement is that the client be able to indicate to the server that the client holds modified data. Therefore, the value of d may always be c + 1.

While the change attribute is opaque to the client in the sense that it has no idea what units of time, if any, the server is counting change with, it is not opaque in that the client has to treat it as an unsigned integer, and the server has to be able to see the results of the client's changes to that integer. Therefore, the server MUST encode the change attribute in network order when sending it to the client. The client MUST decode it from network order to its native order when receiving it and the client MUST encode it in network order when sending it to the server. For this reason, change is defined as an unsigned integer rather than an opaque array of octets.

For the server, the following steps will be taken when providing a write delegation:

   o  Upon providing a write delegation, the server will cache a copy of the change attribute in the data structure it uses to record the delegation. Let this value be represented by sc.

   o  When a second client sends a GETATTR operation on the same file to the server, the server obtains the change attribute from the first client. Let this value be cc.

   o  If the value cc is equal to sc, the file is not modified and the server returns the current values for change, time_metadata, and time_modify (for example) to the second client.

   o  If the value cc is NOT equal to sc, the file is currently modified at the first client and most likely will be modified at the server at a future time. The server then uses its current time to construct attribute values for time_metadata and time_modify.
      A new value of sc, which we will call nsc, is computed by the server, such that nsc >= sc + 1. The server then returns the constructed time_metadata, time_modify, and nsc values to the requester. The server replaces sc in the delegation record with nsc. To prevent the possibility of time_modify, time_metadata, and change from appearing to go backward (which would happen if the client holding the delegation fails to write its modified data to the server before the delegation is revoked or returned), the server SHOULD update the file's metadata record with the constructed attribute values. For reasons of reasonable performance, committing the constructed attribute values to stable storage is OPTIONAL.

      As discussed earlier in this section, the client MAY return the same cc value on subsequent CB_GETATTR calls, even if the file was modified in the client's cache yet again between successive CB_GETATTR calls. Therefore, the server must assume that the file has been modified yet again, and MUST take care to ensure that the new nsc it constructs and returns is greater than the previous nsc it returned. An example implementation's delegation record would satisfy this mandate by including a boolean field (let us call it "modified") that is set to false when the delegation is granted, and an sc value set at the time of grant to the change attribute value. The modified field would be set to true the first time cc != sc, and would stay true until the delegation is returned or revoked.
      The processing for constructing nsc, time_modify, and time_metadata would use this pseudo code:

          if (!modified) {
              do CB_GETATTR for change and size;

              if (cc != sc)
                  modified = TRUE;
          } else {
              do CB_GETATTR for size;
          }

          if (modified) {
              sc = sc + 1;
              time_modify = time_metadata = current_time;
              update sc, time_modify, time_metadata into file's metadata;
          }

          return to client (that sent GETATTR) the attributes
          it requested, but make sure size comes from what
          CB_GETATTR returned. Do not update the file's metadata
          with the client's modified size.

   o  In the case that the file attribute size is different than the server's current value, the server treats this as a modification regardless of the value of the change attribute retrieved via CB_GETATTR and responds to the second client as in the last step.

This methodology resolves issues of clock differences between client and server and other scenarios where the use of CB_GETATTR breaks down.

It should be noted that the server is under no obligation to use CB_GETATTR and therefore the server MAY simply recall the delegation to avoid its use.

9.4.4.  Recall of Open Delegation

The following events necessitate recall of an open delegation:

   o  Potentially conflicting OPEN request (or READ/WRITE done with "special" stateid)

   o  SETATTR issued by another client

   o  REMOVE request for the file

   o  RENAME request for the file as either source or target of the RENAME

Whether a RENAME of a directory in the path leading to the file results in recall of an open delegation depends on the semantics of the server filesystem. If that filesystem denies such RENAMEs when a file is open, the recall must be performed to determine whether the file in question is, in fact, open.
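
The change attribute exchange described in the section "Handling of CB_GETATTR" can be sketched as follows. This is only an illustration under stated assumptions: the record layouts (DelegationRecord, FileMetadata) and the function names client_change_reply and server_getattr are hypothetical, the CB_GETATTR results are passed in as plain arguments rather than obtained over a callback channel, and timestamps are simple numbers. It combines the client's choice of d = c + 1 with the server's "modified" flag, and folds in the rule that a size difference alone counts as a modification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DelegationRecord:      # hypothetical server-side delegation record
    sc: int                  # change value cached when the delegation was granted
    modified: bool = False   # stays True once the client has reported changes


@dataclass
class FileMetadata:          # hypothetical server-side attribute record
    change: int
    size: int
    time_modify: float
    time_metadata: float


def client_change_reply(c: int, holds_modified_data: bool) -> int:
    """Client side: return d = c + 1 if modified data is held, else c."""
    return c + 1 if holds_modified_data else c


def server_getattr(rec: DelegationRecord, meta: FileMetadata,
                   cc: Optional[int], cb_size: int, now: float) -> dict:
    """Server side: build the GETATTR reply for the second client.

    cc and cb_size stand for the values CB_GETATTR returned; cc is
    ignored (and may be None) once rec.modified is already set.
    """
    if not rec.modified:
        if cc != rec.sc or cb_size != meta.size:
            rec.modified = True

    if rec.modified:
        # Each reply gets a strictly larger change value (nsc > previous nsc).
        rec.sc += 1
        meta.change = rec.sc
        meta.time_modify = meta.time_metadata = now

    # Size comes from CB_GETATTR; meta.size is deliberately left untouched.
    return {"change": meta.change, "size": cb_size,
            "time_modify": meta.time_modify,
            "time_metadata": meta.time_metadata}
```

Because the flag is sticky, a client that keeps returning the same value d still causes the server to hand out increasing change values on successive GETATTR requests.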

In addition to the situations above, the server may choose to recall open delegations at any time if resource constraints make it advisable to do so. Clients should always be prepared for the possibility of recall.

When a client receives a recall for an open delegation, it needs to update state on the server before returning the delegation. These same updates must be done whenever a client chooses to return a delegation voluntarily. The following items of state need to be dealt with:

   o  If the file associated with the delegation is no longer open and no previous CLOSE operation has been sent to the server, a CLOSE operation must be sent to the server.

   o  If a file has other open references at the client, then OPEN operations must be sent to the server. The appropriate stateids will be provided by the server for subsequent use by the client since the delegation stateid will no longer be valid. These OPEN requests are done with the claim type of CLAIM_DELEGATE_CUR. This will allow the presentation of the delegation stateid so that the client can establish the appropriate rights to perform the OPEN. (See the section "Operation 18: OPEN" for details.)

   o  If there are granted file locks, the corresponding LOCK operations need to be performed. This applies to the write open delegation case only.

   o  For a write open delegation, if at the time of recall the file is not open for write, all modified data for the file must be flushed to the server. If the delegation had not existed, the client would have done this data flush before the CLOSE operation.

   o  For a write open delegation when a file is still open at the time of recall, any modified data for the file needs to be flushed to the server.

   o  With the write open delegation in place, it is possible that the file was truncated during the duration of the delegation. For example, the truncation could have occurred as a result of an OPEN UNCHECKED with a size attribute value of zero. Therefore, if a truncation of the file has occurred and this operation has not been propagated to the server, the truncation must occur before any modified data is written to the server.

In the case of write open delegation, file locking imposes some additional requirements. To precisely maintain the associated invariant, it is required to flush any modified data in any region for which a write lock was released while the write delegation was in effect. However, because the write open delegation implies no other locking by other clients, a simpler implementation is to flush all modified data for the file (as described just above) if any write lock has been released while the write open delegation was in effect.

An implementation need not wait until delegation recall (or deciding to voluntarily return a delegation) to perform any of the above actions, if implementation considerations (e.g. resource availability constraints) make that desirable. Generally, however, the fact that the actual open state of the file may continue to change makes it not worthwhile to send information about opens and closes to the server, except as part of delegation return. Only in the case of closing the open that resulted in obtaining the delegation would clients be likely to do this early, since, in that case, the close once done will not be undone. Regardless of the client's choices on scheduling these actions, all must be performed before the delegation is returned, including (when applicable) the close that corresponds to the open that resulted in the delegation.
These actions can be performed either in previous requests or in
previous operations in the same COMPOUND request.

9.4.5. Clients that Fail to Honor Delegation Recalls

A client may fail to respond to a recall for various reasons, such as
a failure of the callback path from server to the client. The client
may be unaware of a failure in the callback path. This lack of
awareness could result in the client finding out long after the
failure that its delegation has been revoked, and another client has
modified the data for which the client had a delegation. This is
especially a problem for the client that held a write delegation.

The server also has a dilemma in that the client that fails to
respond to the recall might also be sending other NFS requests,
including those that renew the lease before the lease expires.
Without returning an error for those lease renewing operations, the
server leads the client to believe that the delegation it has is in
force.

This difficulty is solved by the following rules:

o  When the callback path is down, the server MUST NOT revoke the
   delegation if one of the following occurs:

   -  The client has issued a RENEW operation and the server has
      returned an NFS4ERR_CB_PATH_DOWN error. The server MUST renew
      the lease for any record locks and share reservations the
      client has that the server has known about (as opposed to those
      locks and share reservations the client has established but not
      yet sent to the server, due to the delegation). The server
      SHOULD give the client a reasonable time to return its
      delegations to the server before revoking the client's
      delegations.

   -  The client has not issued a RENEW operation for some period of
      time after the server attempted to recall the delegation.
      This period of time MUST NOT be less than the value of the
      lease_time attribute.

o  When the client holds a delegation, it cannot rely on operations,
   except for RENEW, that take a stateid, to renew delegation leases
   across callback path failures. The client that wants to keep
   delegations in force across callback path failures must use RENEW
   to do so.

9.4.6. Delegation Revocation

At the point a delegation is revoked, if there are associated opens
on the client, the applications holding these opens need to be
notified. This notification usually occurs by returning errors for
READ/WRITE operations or when a close is attempted for the open file.

If no opens exist for the file at the point the delegation is
revoked, then notification of the revocation is unnecessary.
However, if there is modified data present at the client for the
file, the user of the application should be notified. Unfortunately,
it may not be possible to notify the user since active applications
may not be present at the client. See the section "Revocation
Recovery for Write Open Delegation" for additional details.

9.5. Data Caching and Revocation

When locks and delegations are revoked, the assumptions upon which
successful caching depends are no longer guaranteed. For any locks or
share reservations that have been revoked, the corresponding owner
needs to be notified. This notification includes applications with a
file open that has a corresponding delegation which has been revoked.
Cached data associated with the revocation must be removed from the
client. In the case of modified data existing in the client's cache,
that data must be removed from the client without it being written to
the server.
As mentioned, the assumptions made by the client are no
longer valid at the point when a lock or delegation has been revoked.
For example, another client may have been granted a conflicting lock
after the revocation of the lock at the first client. Therefore, the
data within the lock range may have been modified by the other
client. Obviously, the first client is unable to guarantee to the
application what has occurred to the file in the case of revocation.

Notification to a lock owner will in many cases consist of simply
returning an error on the next and all subsequent READs/WRITEs to the
open file or on the close. Where the methods available to a client
make such notification impossible because errors for certain
operations may not be returned, more drastic action such as signals
or process termination may be appropriate. The justification for
this is that an invariant on which an application depends may be
violated. Depending on how errors are typically treated for the
client operating environment, further levels of notification
including logging, console messages, and GUI pop-ups may be
appropriate.

9.5.1. Revocation Recovery for Write Open Delegation

Revocation recovery for a write open delegation poses the special
issue of modified data in the client cache while the file is not
open. In this situation, any client which does not flush modified
data to the server on each close must ensure that the user receives
appropriate notification of the failure as a result of the
revocation. Since such situations may require human action to
correct problems, notification schemes in which the appropriate user
or administrator is notified may be necessary. Logging and console
messages are typical examples.

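As an informal illustration only, the client-side handling described
above (discard cached data without writing it to the server, return
errors on later I/O, and log when modified data is lost while the file
is not open) might be sketched as follows. The names CachedFile,
DelegationRevoked, on_delegation_revoked, and the logger name are
hypothetical and are not defined by this protocol; a real client would
tie this to its own page cache and open state.

```python
import logging
from dataclasses import dataclass, field

log = logging.getLogger("nfs4_client_sketch")  # hypothetical logger name


class DelegationRevoked(OSError):
    """Hypothetical error surfaced to applications after a revocation."""


@dataclass
class CachedFile:
    """Hypothetical client-side cache state for one delegated file."""
    path: str
    open_count: int = 0
    clean: dict = field(default_factory=dict)  # offset -> bytes read from server
    dirty: dict = field(default_factory=dict)  # offset -> bytes not yet flushed
    revoked: bool = False

    def on_delegation_revoked(self) -> int:
        """Drop all cached data; return the number of modified bytes lost."""
        lost = sum(len(data) for data in self.dirty.values())
        self.clean.clear()
        self.dirty.clear()  # discarded, NOT written back to the server
        self.revoked = True
        if lost and self.open_count == 0:
            # No open exists to carry an error back to an application,
            # so fall back to logging for the user or administrator.
            log.error("%s: %d modified bytes lost to delegation revocation",
                      self.path, lost)
        return lost

    def write(self, offset: int, data: bytes) -> None:
        """Cache a write locally, or fail if the delegation was revoked."""
        if self.revoked:
            raise DelegationRevoked(f"{self.path}: delegation was revoked")
        self.dirty[offset] = data
```

In this sketch the error is raised on the next write; an implementation
would apply the same check to reads and to close.
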
If there is modified data on the client, it must not be flushed
normally to the server. A client may attempt to provide a copy of
the file data as modified during the delegation under a different
name in the filesystem name space to ease recovery. Note that when
the client can determine that the file has not been modified by any
other client, or when the client has a complete cached copy of the
file in question, such a saved copy of the client's view of the file
may be of particular value for recovery. In other cases, recovery
using a copy of the file based partially on the client's cached data
and partially on the server copy as modified by other clients will be
anything but straightforward, so clients may avoid saving file
contents in these situations or mark the results specially to warn
users of possible problems.

Saving of such modified data in delegation revocation situations may
be limited to files of a certain size or might be used only when
sufficient disk space is available within the target filesystem.
Such saving may also be restricted to situations when the client has
sufficient buffering resources to keep the cached copy available
until it is properly stored to the target filesystem.

9.6. Attribute Caching

The attributes discussed in this section do not include named
attributes. Individual named attributes are analogous to files and
caching of the data for these needs to be handled just as data
caching is for ordinary files. Similarly, LOOKUP results from an
OPENATTR directory are to be cached on the same basis as any other
pathnames and similarly for directory contents.

Clients may cache file attributes obtained from the server and use
them to avoid subsequent GETATTR requests.
Such caching is write
through in that modification to file attributes is always done by
means of requests to the server and should not be done locally and
cached. The exceptions to this are modifications to attributes that
are intimately connected with data caching. Therefore, extending a
file by writing data to the local data cache is reflected immediately
in the size as seen on the client without this change being
immediately reflected on the server. Normally such changes are not
propagated directly to the server but when the modified data is
flushed to the server, analogous attribute changes are made on the
server. When open delegation is in effect, the modified attributes
may be returned to the server in the response to a CB_RECALL call.

The result of local caching of attributes is that the attribute
caches maintained on individual clients will not be coherent. Changes
made in one order on the server may be seen in a different order on
one client and in a third order on a different client.

The typical filesystem application programming interfaces do not
provide means to atomically modify or interrogate attributes for
multiple files at the same time. The following rules provide an
environment where the potential incoherences mentioned above can be
reasonably managed. These rules are derived from the practice of
previous NFS protocols.

o  All attributes for a given file (per-fsid attributes excepted)
   are cached as a unit at the client so that no
   non-serializability can arise within the context of a single file.

o  An upper time boundary is maintained on how long a client cache
   entry can be kept without being refreshed from the server.

o  When operations are performed that change attributes at the
   server, the updated attribute set is requested as part of the
   containing RPC. This includes directory operations that update
   attributes indirectly. This is accomplished by following the
   modifying operation with a GETATTR operation and then using the
   results of the GETATTR to update the client's cached attributes.

Note that if the full set of attributes to be cached is requested by
READDIR, the results can be cached by the client on the same basis as
attributes obtained via GETATTR.

A client may validate its cached version of attributes for a file by
fetching both the change and time_access attributes and assuming
that if the change attribute has the same value as it did when the
attributes were cached, then no attributes other than time_access
have changed. The reason time_access is also fetched is that
many servers operate in environments where the operation that updates
change does not update time_access. For example, POSIX file
semantics do not update access time when a file is modified by the
write system call. Therefore, the client that wants a current
time_access value should fetch it with change during the attribute
cache validation processing and update its cached time_access.

The client may maintain a cache of modified attributes for those
attributes intimately connected with data of modified regular files
(size, time_modify, and change). Other than those three attributes,
the client MUST NOT maintain a cache of modified attributes. Instead,
attribute changes are immediately sent to the server.

In some operating environments, the equivalent to time_access is
expected to be implicitly updated by each read of the content of the
file object.
If an NFS client is caching the content of a file
object, whether it is a regular file, directory, or symbolic link,
the client SHOULD NOT update the time_access attribute (via SETATTR
or a small READ or READDIR request) on the server with each read that
is satisfied from cache. The reason is that this can defeat the
performance benefits of caching content, especially since an explicit
SETATTR of time_access may alter the change attribute on the server.
If the change attribute changes, clients that are caching the content
will think the content has changed, and will re-read unmodified data
from the server. Nor is the client encouraged to maintain a modified
version of time_access in its cache, since this would mean that the
client will either eventually have to write the access time to the
server with bad performance effects, or it would never update the
server's time_access, thereby resulting in a situation where an
application that caches access time between a close and open of the
same file observes the access time oscillating between the past and
present. The time_access attribute always means the time of last
access to a file by a read that was satisfied by the server. This way
clients will tend to see only time_access changes that go forward in
time.

9.7. Data and Metadata Caching and Memory Mapped Files

Some operating environments include the capability for an application
to map a file's content into the application's address space. Each
time the application accesses a memory location that corresponds to a
block that has not been loaded into the address space, a page fault
occurs and the file is read (or if the block does not exist in the
file, the block is allocated and then instantiated in the
application's address space).

As long as each memory mapped access to the file requires a page
fault, the relevant attributes of the file that are used to detect
access and modification (time_access, time_metadata, time_modify, and
change) will be updated. However, in many operating environments,
when page faults are not required these attributes will not be
updated on reads or updates to the file via memory access (regardless
of whether the file is a local file or is being accessed remotely). A
client or server MAY fail to update attributes of a file that is
being accessed via memory mapped I/O. This has several implications:

o  If there is an application on the server that has memory mapped
   a file that a client is also accessing, the client may not be
   able to get a consistent value of the change attribute to
   determine whether its cache is stale or not. A server that
   knows that the file is memory mapped could always
   pessimistically return updated values for change so as to force
   the application to always get the most up to date data and
   metadata for the file. However, due to the negative performance
   implications of this, such behavior is OPTIONAL.

o  If the memory mapped file is not being modified on the server,
   and instead is just being read by an application via the memory
   mapped interface, the client will not see an updated time_access
   attribute. However, in many operating environments, neither
   will any process running on the server. Thus NFS clients are at
   no disadvantage with respect to local processes.

o  If there is another client that is memory mapping the file, and
   if that client is holding a write delegation, the same set of
   issues as discussed in the previous two bullet items apply.
   So, when a server does a CB_GETATTR to a file that the client has
   modified in its cache, the response from CB_GETATTR will not
   necessarily be accurate. As discussed earlier, the client's
   obligation is to report that the file has been modified since
   the delegation was granted, not whether it has been modified
   again between successive CB_GETATTR calls, and the server MUST
   assume that any file the client has modified in cache has been
   modified again between successive CB_GETATTR calls. Depending
   on the nature of the client's memory management system, this
   weak obligation may not be possible. A client MAY return stale
   information in CB_GETATTR whenever the file is memory mapped.

o  The mixture of memory mapping and file locking on the same file
   is problematic. Consider the following scenario, where a page
   size on each client is 8192 bytes.

   -  Client A memory maps first page (8192 bytes) of file X

   -  Client B memory maps first page (8192 bytes) of file X

   -  Client A write locks first 4096 bytes

   -  Client B write locks second 4096 bytes

   -  Client A, via a STORE instruction modifies part of its
      locked region.

   -  Simultaneous to client A, client B issues a STORE on part
      of its locked region.

   Here the challenge is for each client to resynchronize to get a
   correct view of the first page. In many operating environments,
   the virtual memory management systems on each client only know a
   page is modified, not that a subset of the page corresponding to
   the respective lock regions has been modified. So it is not
   possible for each client to do the right thing, which is to only
   write to the server that portion of the page that is locked.
   For example, if client A simply writes out the page, and then
   client B writes out the page, client A's data is lost.

   Moreover, if mandatory locking is enabled on the file, then we
   have a different problem. When clients A and B issue the STORE
   instructions, the resulting page faults require a record lock on
   the entire page. Each client then tries to extend its locked
   range to the entire page, which results in a deadlock.
   Communicating the NFS4ERR_DEADLOCK error to a STORE instruction
   is difficult at best.

   If a client is locking the entire memory mapped file, there is
   no problem with advisory or mandatory record locking, at least
   until the client unlocks a region in the middle of the file.

   Given the above issues the following are permitted:

   -  Clients and servers MAY deny memory mapping a file they
      know there are record locks for.

   -  Clients and servers MAY deny a record lock on a file they
      know is memory mapped.

   -  A client MAY deny memory mapping a file that it knows
      requires mandatory locking for I/O. If mandatory locking
      is enabled after the file is opened and mapped, the client
      MAY deny the application further access to its mapped file.

9.8. Name Caching

The results of LOOKUP and READDIR operations may be cached to avoid
the cost of subsequent LOOKUP operations. Just as in the case of
attribute caching, inconsistencies may arise among the various client
caches. To mitigate the effects of these inconsistencies and given
the context of typical filesystem APIs, an upper time boundary is
maintained on how long a client name cache entry can be kept without
verifying that the entry has not been made invalid by a directory
change operation performed by another client.

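As an informal illustration only, a name cache that enforces such an
upper time boundary, and revalidates an expired entry against the
directory's change attribute before reusing it, might be sketched as
follows. The class NameCache, the staleness_bound parameter, and the
fetch_dir_change callback (standing in for a GETATTR of the directory's
change attribute) are hypothetical names chosen for this sketch; they
are not part of the protocol.

```python
import time


class NameCache:
    """Hypothetical client name cache with a staleness bound in seconds."""

    def __init__(self, staleness_bound=30.0, clock=time.monotonic):
        self.bound = staleness_bound
        self.clock = clock
        # (directory filehandle, name) -> [filehandle, dir change, expiry]
        self.entries = {}

    def insert(self, dir_fh, name, fh, dir_change):
        """Cache a LOOKUP result along with the directory's change value."""
        expires = self.clock() + self.bound
        self.entries[(dir_fh, name)] = [fh, dir_change, expires]

    def lookup(self, dir_fh, name, fetch_dir_change):
        """Return the cached filehandle, or None if the server must be asked."""
        entry = self.entries.get((dir_fh, name))
        if entry is None:
            return None
        fh, cached_change, expires = entry
        now = self.clock()
        if now < expires:
            return fh  # still within the staleness bound
        # Bound exceeded: check whether the directory has been modified.
        if fetch_dir_change(dir_fh) == cached_change:
            entry[2] = now + self.bound  # unchanged, so extend the lifetime
            return fh
        # Directory changed: purge every cached name under that directory.
        for key in [k for k in self.entries if k[0] == dir_fh]:
            del self.entries[key]
        return None
```

The clock is injectable so that the expiry behavior can be exercised
without waiting; a client would normally leave it at the default.
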
When a client is not making changes to a directory for which there
exist name cache entries, the client needs to periodically fetch
attributes for that directory to ensure that it is not being
modified. After determining that no modification has occurred, the
expiration time for the associated name cache entries may be updated
to be the current time plus the name cache staleness bound.

When a client is making changes to a given directory, it needs to
determine whether there have been changes made to the directory by
other clients. It does this by using the change attribute as
reported before and after the directory operation in the associated
change_info4 value returned for the operation. The server is able to
communicate to the client whether the change_info4 data is provided
atomically with respect to the directory operation. If the change
values are provided atomically, the client is then able to compare
the pre-operation change value with the change value in the client's
name cache. If the comparison indicates that the directory was
updated by another client, the name cache associated with the
modified directory is purged from the client. If the comparison
indicates no modification, the name cache can be updated on the
client to reflect the directory operation and the associated timeout
extended. The post-operation change value needs to be saved as the
basis for future change_info4 comparisons.

As demonstrated by the scenario above, name caching requires that the
client revalidate name cache data by inspecting the change attribute
of a directory at the point when the name cache item was cached.
This requires that the server update the change attribute for
directories when the contents of the corresponding directory are
modified.
For a client to use the change_info4 information
appropriately and correctly, the server must report the pre and post
operation change attribute values atomically. When the server is
unable to report the before and after values atomically with respect
to the directory operation, the server must indicate that fact in the
change_info4 return value. When the information is not atomically
reported, the client should not assume that other clients have not
changed the directory.

9.9. Directory Caching

The results of READDIR operations may be used to avoid subsequent
READDIR operations. Just as in the cases of attribute and name
caching, inconsistencies may arise among the various client caches.
To mitigate the effects of these inconsistencies, and given the
context of typical filesystem APIs, the following rules should be
followed:

o  Cached READDIR information for a directory which is not obtained
   in a single READDIR operation must always be a consistent
   snapshot of directory contents. This is determined by using a
   GETATTR before the first READDIR and after the last READDIR
   that contributes to the cache.

o  An upper time boundary is maintained to indicate the length of
   time a directory cache entry is considered valid before the
   client must revalidate the cached information.

The revalidation technique parallels that discussed in the case of
name caching. When the client is not changing the directory in
question, checking the change attribute of the directory with GETATTR
is adequate. The lifetime of the cache entry can be extended at
these checkpoints. When a client is modifying the directory, the
client needs to use the change_info4 data to determine whether there
are other clients modifying the directory.
If it is determined that
no other client modifications are occurring, the client may update
its directory cache to reflect its own changes.

As demonstrated previously, directory caching requires that the
client revalidate directory cache data by inspecting the change
attribute of a directory at the point when the directory was cached.
This requires that the server update the change attribute for
directories when the contents of the corresponding directory are
modified. For a client to use the change_info4 information
appropriately and correctly, the server must report the pre and post
operation change attribute values atomically. When the server is
unable to report the before and after values atomically with respect
to the directory operation, the server must indicate that fact in the
change_info4 return value. When the information is not atomically
reported, the client should not assume that other clients have not
changed the directory.

10. Minor Versioning

To address the requirement of an NFS protocol that can evolve as the
need arises, the NFS version 4 protocol contains the rules and
framework to allow for future minor changes or versioning.

The base assumption with respect to minor versioning is that any
future accepted minor version must follow the IETF process and be
documented in a standards track RFC. Therefore, each minor version
number will correspond to an RFC. Minor version zero of the NFS
version 4 protocol is represented by this RFC. The COMPOUND
procedure will support the encoding of the minor version being
requested by the client.

The following items represent the basic rules for the development of
minor versions.
Note that a future minor version may decide to
modify or add to the following rules as part of the minor version
definition.

1    Procedures are not added or deleted

     To maintain the general RPC model, NFS version 4 minor versions
     will not add to or delete procedures from the NFS program.

2    Minor versions may add operations to the COMPOUND and
     CB_COMPOUND procedures.

     The addition of operations to the COMPOUND and CB_COMPOUND
     procedures does not affect the RPC model.

2.1  Minor versions may append attributes to GETATTR4args, bitmap4,
     and GETATTR4res.

     This allows for the expansion of the attribute model to allow
     for future growth or adaptation.

2.2  Minor version X must append any new attributes after the last
     documented attribute.

     Since attribute results are specified as an opaque array of
     per-attribute XDR encoded results, the complexity of adding new
     attributes in the midst of the current definitions will be too
     burdensome.

3    Minor versions must not modify the structure of an existing
     operation's arguments or results.

     Again the complexity of handling multiple structure definitions
     for a single operation is too burdensome. New operations should
     be added instead of modifying existing structures for a minor
     version.

     This rule does not preclude the following adaptations in a minor
     version.

     o  adding bits to flag fields such as new attributes to
        GETATTR's bitmap4 data type

     o  adding bits to existing attributes like ACLs that have flag
        words

     o  extending enumerated types (including NFS4ERR_*) with new
        values

4    Minor versions may not modify the structure of existing
     attributes.

5    Minor versions may not delete operations.

     This prevents the potential reuse of a particular operation
     "slot" in a future minor version.

6    Minor versions may not delete attributes.

7    Minor versions may not delete flag bits or enumeration values.

8    Minor versions may declare an operation as mandatory to NOT
     implement.

     Specifying an operation as "mandatory to not implement" is
     equivalent to obsoleting an operation. For the client, it means
     that the operation should not be sent to the server. For the
     server, an NFS error can be returned as opposed to "dropping"
     the request as an XDR decode error. This approach allows for
     the obsolescence of an operation while maintaining its structure
     so that a future minor version can reintroduce the operation.

8.1  Minor versions may declare attributes mandatory to NOT
     implement.

8.2  Minor versions may declare flag bits or enumeration values as
     mandatory to NOT implement.

9    Minor versions may downgrade features from mandatory to
     recommended, or recommended to optional.

10   Minor versions may upgrade features from optional to recommended
     or recommended to mandatory.

11   A client and server that support minor version X must support
     minor versions 0 (zero) through X-1 as well.

12   No new features may be introduced as mandatory in a minor
     version.

     This rule allows for the introduction of new functionality and
     forces the use of implementation experience before designating a
     feature as mandatory.

13   A client MUST NOT attempt to use a stateid, filehandle, or
     similar returned object from the COMPOUND procedure with minor
     version X for another COMPOUND procedure with minor version Y,
     where X != Y.

11. Internationalization

The primary area in which NFS needs to deal with
internationalization, or I18N, is file names and
other strings as used within the protocol.
The choice of string
representation must allow reasonable name/string access to clients
which use various languages. The UTF-8 encoding of the UCS as
defined by [ISO10646] allows for this type of access and follows the
policy described in "IETF Policy on Character Sets and Languages",
[RFC2277]. This choice is explained further in the following.

11.1. Universal Versus Local Character Sets

[RFC1345] describes a table of 16 bit characters for many different
languages (the bit encodings match Unicode, though of course
[RFC1345] is somewhat out of date with respect to current Unicode
assignments). Each character from each language has a unique 16 bit
value in the 16 bit character set. Thus this table can be thought of
as a universal character set. [RFC1345] then talks about groupings
of subsets of the entire 16 bit character set into "Charset Tables".
For example one might take all the Greek characters from the 16 bit
table (which are consecutively allocated), and normalize their
offsets to a table that fits in 7 bits. Thus it is determined that
"lower case alpha" is in the same position as "upper case a" in the
US-ASCII table, and "upper case alpha" is in the same position as
"lower case a" in the US-ASCII table.

These normalized subset character sets can be thought of as "local
character sets", suitable for an operating system locale.

Local character sets are not suitable for the NFS protocol. Consider
someone who creates a file with a name in a Swedish character set.
If someone else later goes to access the file with their locale set
to the Swedish language, then there are no problems. But if someone
in say the US-ASCII locale goes to access the file, the file name
will look very different, because the Swedish characters in the 7 bit
table will now be represented in US-ASCII characters on the display.
It would be preferable to give the US-ASCII user a way to display the
file name using Swedish glyphs. In order to do that, the NFS protocol
would have to include the locale with the file name on each operation
to create a file.

However, the complexity burden of defining such locales in a way that
could be understood by all clients and servers, and maintaining them
in the face of changes would be considerable. A better solution is
desirable.

If the NFS version 4 protocol used a universal 16 bit or 32 bit
character set (or an encoding of a 16 bit or 32 bit character set
into octets), then the server and client need not care if the locale
of the user accessing the file is different from the locale of the
user who created the file. The unique 16 bit or 32 bit encoding of
the character allows for determination of what language the character
is from and also how to display that character on the client. The
server need not know what locales are used.

11.2. Overview of Universal Character Set Standards

The previous section makes a case for using a universal character
set. This section makes the case for using UTF-8 as the specific
universal character set for the NFS version 4 protocol.

[RFC2279] discusses UTF-* (UTF-8 and other UTF-XXX encodings),
Unicode, and UCS-*. There are two standards bodies managing
universal code sets:

o  ISO/IEC which has the standard 10646-1

o  Unicode which has the Unicode standard

Both standards bodies have pledged to track each other's assignments
of character codes.

The following is a brief analysis of the various standards.

UCS      Universal Character Set. This is ISO/IEC 10646-1: "a
         multi-octet character set called the Universal Character
         Set (UCS), which encompasses most of the world's writing
         systems."

5960 UCS-2 a two octet per character encoding that addresses the first 5961 2^16 characters of UCS. Currently there are no UCS 5962 characters beyond that range. 5964 UCS-4 a four octet per character encoding that permits the 5965 encoding of up to 2^31 characters. 5967 UTF UTF is an abbreviation of the term "UCS transformation 5968 format" and is used in the naming of various standards for 5969 encoding of UCS characters as described below. 5971 UTF-1 Only historical interest; it has been removed from 10646-1 5973 UTF-7 Encodes the entire "repertoire" of UCS "characters using 5974 only octets with the higher order bit clear". [RFC2152] 5975 describes UTF-7. UTF-7 accomplishes this by reserving one 5976 of the 7bit US-ASCII characters as a "shift" character to 5977 indicate non-US-ASCII characters. 5979 Draft Specification NFS version 4 Protocol September 2002 5981 UTF-8 Unlike UTF-7, uses all 8 bits of the octets. US-ASCII 5982 characters are encoded as before unchanged. Any octet with 5983 the high bit cleared can only mean a US-ASCII character. 5984 The high bit set means that a UCS character is being 5985 encoded. 5987 UTF-16 Encodes UCS-4 characters into UCS-2 characters using a 5988 reserved range in UCS-2. 5990 Unicode Unicode and UCS-2 are the same; [RFC2279] states: 5992 Up to the present time, changes in Unicode and amendments 5993 to ISO/IEC 10646 have tracked each other, so that the 5994 character repertoires and code point assignments have 5995 remained in sync. The relevant standardization committees 5996 have committed to maintain this very useful synchronism. 5998 11.3. Difficulties with UCS-4, UCS-2, Unicode 6000 Adapting existing applications, and filesystems to multi-octet 6001 schemes like UCS and Unicode can be difficult. A significant amount 6002 of code has been written to process streams of bytes. Also there are 6003 many existing stored objects described with 7 bit or 8 bit 6004 characters. 
Doubling or quadrupling the bandwidth and storage 6005 requirements seems like an expensive way to accomplish I18N. 6007 UCS-2 and Unicode are "only" 16 bits long. That might seem to be 6008 enough but, according to [Unicode1], 49,194 Unicode characters are 6009 already assigned. According to [Unicode2] there are still more 6010 languages that need to be added. 6012 11.4. UTF-8 and its solutions 6014 UTF-8 solves problems for NFS that exist with the use of UCS and 6015 Unicode. UTF-8 will encode 16 bit and 32 bit characters in a way 6016 that will be compact for most users. The encoding table from UCS-4 to 6017 UTF-8, as copied from [RFC2279]:
6019 UCS-4 range (hex.)    UTF-8 octet sequence (binary)
6020 0000 0000-0000 007F   0xxxxxxx
6021 0000 0080-0000 07FF   110xxxxx 10xxxxxx
6022 0000 0800-0000 FFFF   1110xxxx 10xxxxxx 10xxxxxx
6024 0001 0000-001F FFFF   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
6025 0020 0000-03FF FFFF   111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
6026 0400 0000-7FFF FFFF   1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
6027                       10xxxxxx
6029 See [RFC2279] for precise encoding and decoding rules. Note that because 6033 of UTF-16, the algorithm from Unicode/UCS-2 to UTF-8 needs to account 6034 for the reserved range between D800 and DFFF. 6036 Note that the 16 bit UCS or Unicode characters require no more than 3 6037 octets to encode into UTF-8. 6039 Interestingly, UTF-8 has room to handle characters larger than 31 6040 bits, because the leading octet of form: 6042 1111111x 6044 is not defined. If needed, ISO could either use that octet to 6045 indicate a sequence of an encoded 8 octet character, or perhaps use 6046 11111110 to permit the next octet to indicate an even more expandable 6047 character set. 6049 So using UTF-8 to represent character encodings means never having to 6050 run out of room. 6052 11.5.
Normalization 6054 The client and server operating environments may differ in their 6055 policies and operational methods with respect to character 6056 normalization (See [Unicode1] for a discussion of normalization 6057 forms). This difference may also exist between applications on the 6058 same client. This adds to the difficulty of providing a single 6059 normalization policy for the protocol that allows for maximal 6060 interoperability. This issue is similar to the character case issues 6061 where the server may or may not support case insensitive file name 6062 matching and may or may not preserve the character case when storing 6063 file names. The protocol does not mandate a particular behavior but 6064 allows for the various permutations. 6066 The NFS version 4 protocol does not mandate the use of a particular 6067 normalization form at this time. A later revision of this 6068 specification may specify a particular normalization form. 6069 Therefore, the server and client can expect that they may receive 6070 unnormalized characters within protocol requests and responses. If 6071 the operating environment requires normalization, then the 6072 implementation must normalize the various UTF-8 encoded strings 6073 within the protocol before presenting the information to an 6074 application (at the client) or local filesystem (at the server). 6076 11.6. UTF-8 Related Errors 6078 Where the client sends an invalid UTF-8 string, the server should 6079 return an NFS4ERR_INVAL error. This includes cases in which 6080 inappropriate prefixes are detected and where the count includes 6081 trailing bytes that do not constitute a full UCS character. 6083 Where the client supplied string is valid UTF-8 but contains 6085 Draft Specification NFS version 4 Protocol September 2002 6087 characters that are not supported by the server as a value for that 6088 string (e.g. 
names containing characters that have more than two 6089 octets on a filesystem that supports Unicode characters only), the 6090 server should return an NFS4ERR_BADCHAR error. 6092 Where a UTF-8 string is used as a file name, and the filesystem, 6093 while supporting all of the characters within the name, does not 6094 allow that particular name to be used, the server should return the 6095 error NFS4ERR_BADNAME. This includes situations in which the server 6096 filesystem imposes a normalization constraint on name strings, but 6097 will also include such situations as filesystem prohibitions of "." 6098 and ".." as file names for certain operations, and other such 6099 constraints. 6103 12. Error Definitions 6105 NFS error numbers are assigned to failed operations within a compound 6106 request. A compound request contains a number of NFS operations that 6107 have their results encoded in sequence in a compound reply. The 6108 results of successful operations will consist of an NFS4_OK status 6109 followed by the encoded results of the operation. If an NFS 6110 operation fails, an error status will be entered in the reply and the 6111 compound request will be terminated. 6113 A description of each defined error follows: 6115 NFS4_OK Indicates the operation completed successfully. 6117 NFS4ERR_ACCESS Permission denied. The caller does not have the 6118 correct permission to perform the requested 6119 operation. Contrast this with NFS4ERR_PERM, 6120 which restricts itself to owner or privileged 6121 user permission failures. 6123 NFS4ERR_ATTRNOTSUPP An attribute specified is not supported by the 6124 server. Does not apply to the GETATTR 6125 operation. 6127 NFS4ERR_ADMIN_REVOKED Due to administrator intervention, the 6128 lockowner's record locks, share reservations, 6129 and delegations have been revoked by the 6130 server.
6132 NFS4ERR_BADCHAR A UTF-8 string contains a character which is 6133 not supported by the server in the context in 6134 which it is being used. 6136 NFS4ERR_BAD_COOKIE READDIR cookie is stale. 6138 NFS4ERR_BADHANDLE Illegal NFS filehandle. The filehandle failed 6139 internal consistency checks. 6141 NFS4ERR_BADNAME A name string in a request consists of valid 6142 UTF-8 characters supported by the server but 6143 the name is not supported by the server as a 6144 valid name for the current operation. 6146 NFS4ERR_BADOWNER An owner, owner_group, or ACL attribute value 6147 can not be translated to local representation. 6149 NFS4ERR_BADTYPE An attempt was made to create an object of a 6150 type not supported by the server. 6152 NFS4ERR_BAD_RANGE The range for a LOCK, LOCKT, or LOCKU operation 6156 is not appropriate to the allowable range of 6157 offsets for the server. 6159 NFS4ERR_BAD_SEQID The sequence number in a locking request is 6160 neither the next expected number nor the last 6161 number processed. 6163 NFS4ERR_BAD_STATEID A stateid generated by the current server 6164 instance, but which does not designate any 6165 locking state (either current or superseded) 6166 for a current lockowner-file pair, was used. 6168 NFS4ERR_BADXDR The server encountered an XDR decoding error 6169 while processing an operation. 6171 NFS4ERR_CLID_INUSE The SETCLIENTID operation has found that a 6172 client id is already in use by another client. 6174 NFS4ERR_DEADLOCK The server has been able to determine a file 6175 locking deadlock condition for a blocking lock 6176 request. 6178 NFS4ERR_DELAY The server initiated the request, but was not 6179 able to complete it in a timely fashion. The 6180 client should wait and then try the request 6181 with a new RPC transaction ID.
For example, 6182 this error should be returned from a server 6183 that supports hierarchical storage and receives 6184 a request to process a file that has been 6185 migrated. In this case, the server should start 6186 the immigration process and respond to client 6187 with this error. This error may also occur 6188 when a necessary delegation recall makes 6189 processing a request in a timely fashion 6190 impossible. 6192 NFS4ERR_DENIED An attempt to lock a file is denied. Since 6193 this may be a temporary condition, the client 6194 is encouraged to retry the lock request until 6195 the lock is accepted. 6197 NFS4ERR_DQUOT Resource (quota) hard limit exceeded. The 6198 user's resource limit on the server has been 6199 exceeded. 6201 NFS4ERR_EXIST File exists. The file specified already exists. 6203 NFS4ERR_EXPIRED A lease has expired that is being used in the 6204 current operation. 6206 NFS4ERR_FBIG File too large. The operation would have caused 6207 a file to grow beyond the server's limit. 6209 Draft Specification NFS version 4 Protocol September 2002 6211 NFS4ERR_FHEXPIRED The filehandle provided is volatile and has 6212 expired at the server. 6214 NFS4ERR_FILE_OPEN The operation can not be successfully processed 6215 because a file involved in the operation is 6216 currently open. 6218 NFS4ERR_GRACE The server is in its recovery or grace period 6219 which should match the lease period of the 6220 server. 6222 NFS4ERR_INVAL Invalid argument or unsupported argument for an 6223 operation. Two examples are attempting a 6224 READLINK on an object other than a symbolic 6225 link or specifying a value for an enum field 6226 that is not defined in the protocol (e.g. 6227 nfs_ftype4). 6229 NFS4ERR_IO I/O error. A hard error (for example, a disk 6230 error) occurred while processing the requested 6231 operation. 6233 NFS4ERR_ISDIR Is a directory. The caller specified a 6234 directory in a non-directory operation. 
6236 NFS4ERR_LEASE_MOVED A lease being renewed is associated with a 6237 filesystem that has been migrated to a new 6238 server. 6240 NFS4ERR_LOCKED A read or write operation was attempted on a 6241 locked file. 6243 NFS4ERR_LOCK_NOTSUPP Server does not support atomic upgrade or 6244 downgrade of locks. 6246 NFS4ERR_LOCK_RANGE A lock request is operating on a sub-range of a 6247 current lock for the lock owner and the server 6248 does not support this type of request. 6250 NFS4ERR_LOCKS_HELD A CLOSE was attempted and file locks would 6251 exist after the CLOSE. 6253 NFS4ERR_MINOR_VERS_MISMATCH 6254 The server has received a request that 6255 specifies an unsupported minor version. The 6256 server must return a COMPOUND4res with a zero 6257 length operations result array. 6259 NFS4ERR_MLINK Too many hard links. 6261 NFS4ERR_MOVED The filesystem which contains the current 6262 filehandle object has been relocated or 6264 Draft Specification NFS version 4 Protocol September 2002 6266 migrated to another server. The client may 6267 obtain the new filesystem location by obtaining 6268 the "fs_locations" attribute for the current 6269 filehandle. For further discussion, refer to 6270 the section "Filesystem Migration or 6271 Relocation". 6273 NFS4ERR_NAMETOOLONG The filename in an operation was too long. 6275 NFS4ERR_NOENT No such file or directory. The file or 6276 directory name specified does not exist. 6278 NFS4ERR_NOFILEHANDLE The logical current filehandle value (or, in 6279 the case of RESTOREFH, the saved filehandle 6280 value) has not been set properly. This may be 6281 a result of a malformed COMPOUND operation 6282 (i.e. no PUTFH or PUTROOTFH before an operation 6283 that requires the current filehandle be set). 6285 NFS4ERR_NO_GRACE A reclaim of client state has fallen outside of 6286 the grace period of the server. As a result, 6287 the server can not guarantee that conflicting 6288 state has not been provided to another client. 
6290 NFS4ERR_NOSPC No space left on device. The operation would 6291 have caused the server's filesystem to exceed 6292 its limit. 6294 NFS4ERR_NOTDIR Not a directory. The caller specified a non- 6295 directory in a directory operation. 6297 NFS4ERR_NOTEMPTY An attempt was made to remove a directory that 6298 was not empty. 6300 NFS4ERR_NOTSUPP Operation is not supported. 6302 NFS4ERR_NOT_SAME This error is returned by the VERIFY operation 6303 to signify that the attributes compared were 6304 not the same as provided in the client's 6305 request. 6307 NFS4ERR_NXIO I/O error. No such device or address. 6309 NFS4ERR_OLD_STATEID A stateid which designates the locking state 6310 for a lockowner-file at an earlier time was 6311 used. 6313 NFS4ERR_OPENMODE The client attempted a READ, WRITE, LOCK or 6314 SETATTR operation not sanctioned by the stateid 6315 passed (e.g. writing to a file opened only for 6316 read). 6318 Draft Specification NFS version 4 Protocol September 2002 6320 NFS4ERR_OP_ILLEGAL An illegal operation value has been specified 6321 in the argop field of a COMPOUND or CB_COMPOUND 6322 procedure. 6324 NFS4ERR_PERM Not owner. The operation was not allowed 6325 because the caller is either not a privileged 6326 user (root) or not the owner of the target of 6327 the operation. 6329 NFS4ERR_RECLAIM_BAD The reclaim provided by the client does not 6330 match any of the server's state consistency 6331 checks and is bad. 6333 NFS4ERR_RECLAIM_CONFLICT 6334 The reclaim provided by the client has 6335 encountered a conflict and can not be provided. 6336 Potentially indicates a misbehaving client. 6338 NFS4ERR_RESOURCE For the processing of the COMPOUND procedure, 6339 the server may exhaust available resources and 6340 can not continue processing operations within 6341 the COMPOUND procedure. This error will be 6342 returned from the server in those instances of 6343 resource exhaustion related to the processing 6344 of the COMPOUND procedure. 
6346 NFS4ERR_RESTOREFH The RESTOREFH operation does not have a saved 6347 filehandle (identified by SAVEFH) to operate 6348 upon. 6350 NFS4ERR_ROFS Read-only filesystem. A modifying operation was 6351 attempted on a read-only filesystem. 6353 NFS4ERR_SAME This error is returned by the NVERIFY operation 6354 to signify that the attributes compared were 6355 the same as provided in the client's request. 6357 NFS4ERR_SERVERFAULT An error occurred on the server which does not 6358 map to any of the legal NFS version 4 protocol 6359 error values. The client should translate this 6360 into an appropriate error. UNIX clients may 6361 choose to translate this to EIO. 6363 NFS4ERR_SHARE_DENIED An attempt to OPEN a file with a share 6364 reservation has failed because of a share 6365 conflict. 6367 NFS4ERR_STALE Invalid filehandle. The filehandle given in the 6368 arguments was invalid. The file referred to by 6369 that filehandle no longer exists or access to 6370 it has been revoked. 6372 Draft Specification NFS version 4 Protocol September 2002 6374 NFS4ERR_STALE_CLIENTID A clientid not recognized by the server was 6375 used in a locking or SETCLIENTID_CONFIRM 6376 request. 6378 NFS4ERR_STALE_STATEID A stateid generated by an earlier server 6379 instance was used. 6381 NFS4ERR_SYMLINK The current filehandle provided for a LOOKUP is 6382 not a directory but a symbolic link. Also used 6383 if the final component of the OPEN path is a 6384 symbolic link. 6386 NFS4ERR_TOOSMALL The encoded response to a READDIR request 6387 exceeds the size limit set by the initial 6388 request. 6390 NFS4ERR_WRONGSEC The security mechanism being used by the client 6391 for the operation does not match the server's 6392 security policy. The client should change the 6393 security mechanism being used and retry the 6394 operation. 6396 NFS4ERR_XDEV Attempt to do an operation between different 6397 fsids. 6399 Draft Specification NFS version 4 Protocol September 2002 6401 13. 
NFS version 4 Requests 6403 For the NFS version 4 RPC program, there are two traditional RPC 6404 procedures: NULL and COMPOUND. All other functionality is defined as 6405 a set of operations and these operations are defined in normal 6406 XDR/RPC syntax and semantics. However, these operations are 6407 encapsulated within the COMPOUND procedure. This requires that the 6408 client combine one or more of the NFS version 4 operations into a 6409 single request. 6411 The NFS4_CALLBACK program is used to provide server to client 6412 signaling and is constructed in a similar fashion as the NFS version 6413 4 program. The procedures CB_NULL and CB_COMPOUND are defined in the 6414 same way as NULL and COMPOUND are within the NFS program. The 6415 CB_COMPOUND request also encapsulates the remaining operations of the 6416 NFS4_CALLBACK program. There is no predefined RPC program number for 6417 the NFS4_CALLBACK program. It is up to the client to specify a 6418 program number in the "transient" program range. The program and 6419 port number of the NFS4_CALLBACK program are provided by the client 6420 as part of the SETCLIENTID/SETCLIENTID_CONFIRM sequence. The program 6421 and port can be changed by another SETCLIENTID/SETCLIENTID_CONFIRM 6422 sequence, and it is possible to use the sequence to change them 6423 within a client incarnation without removing relevant leased client 6424 state. 6426 13.1. Compound Procedure 6428 The COMPOUND procedure provides the opportunity for better 6429 performance within high latency networks. The client can avoid 6430 cumulative latency of multiple RPCs by combining multiple dependent 6431 operations into a single COMPOUND procedure. A compound operation 6432 may provide for protocol simplification by allowing the client to 6433 combine basic procedures into a single request that is customized for 6434 the client's environment. 6436 The CB_COMPOUND procedure precisely parallels the features of 6437 COMPOUND as described above. 
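As a rough illustration of the batching described in section 13.1, the following Python sketch models a client assembling several dependent operations into one COMPOUND request instead of issuing one RPC per operation. The class name CompoundRequest, the add helper, and the tuple form of an operation are hypothetical and exist only for this example; a real client would XDR-encode the operations rather than keep them as Python objects.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class CompoundRequest:
    """Hypothetical in-memory model of the COMPOUND arguments."""
    tag: str = ""                      # free-form label chosen by the client
    minorversion: int = 0              # 0 is the initial and default value
    argarray: List[Tuple[str, Any]] = field(default_factory=list)

    def add(self, op: str, args: Any = None) -> "CompoundRequest":
        """Append one operation and its arguments; returns self for chaining."""
        self.argarray.append((op, args))
        return self

    @property
    def numops(self) -> int:
        """Count that the counted-array encoding would carry on the wire."""
        return len(self.argarray)


if __name__ == "__main__":
    # Three dependent operations travel in a single request.
    req = (CompoundRequest(tag="lookup-and-stat")
           .add("PUTROOTFH")
           .add("LOOKUP", "export")
           .add("GETATTR", ["size"]))
    print(req.numops, req.argarray)
```

Because each later operation works on the current filehandle left by the earlier ones, the three steps need only one network round trip in this model.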
6439 The basic structure of the COMPOUND procedure is:
6441 +-----+--------------+--------+-----------+-----------+-----------+--
6442 | tag | minorversion | numops | op + args | op + args | op + args |
6443 +-----+--------------+--------+-----------+-----------+-----------+--
6445 and the reply's structure is:
6447 +------------+-----+--------+-----------------------+--
6448 |last status | tag | numres | status + op + results |
6449 +------------+-----+--------+-----------------------+--
6451 The numops and numres fields, used in the depiction above, represent 6455 the count for the counted array encoding used to signify the number of 6456 arguments or results encoded in the request and response. As per the 6457 XDR encoding, these counts must match exactly the number of operation 6458 arguments or results encoded. 6460 13.2. Evaluation of a Compound Request 6462 The server will process the COMPOUND procedure by evaluating each of 6463 the operations within the COMPOUND procedure in order. Each 6464 component operation consists of a 32 bit operation code, followed by 6465 the argument of length determined by the type of operation. The 6466 results of each operation are encoded in sequence into a reply 6467 buffer. The results of each operation are preceded by the opcode and 6468 a status code (normally zero). If an operation results in a non-zero 6469 status code, the status will be encoded and evaluation of the 6470 compound sequence will halt and the reply will be returned. Note 6471 that evaluation stops even in the event of "non error" conditions 6472 such as NFS4ERR_SAME. 6474 There are no atomicity requirements for the operations contained 6475 within the COMPOUND procedure. The operations being evaluated as 6476 part of a COMPOUND request may be evaluated simultaneously with other 6477 COMPOUND requests that the server receives.
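The evaluation rule of section 13.2 can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not a server implementation: the function name evaluate_compound, the representation of an operation as an (opcode, handler) pair, and the numeric constants are illustrative only, and no XDR encoding is performed.

```python
from typing import Any, Callable, List, Tuple

NFS4_OK = 0            # success status
NFS4ERR_SAME = 10009   # illustrative value for a "non error" status

# Hypothetical model: an operation is (opcode, handler), and the handler
# returns (status, result) when invoked.
Operation = Tuple[int, Callable[[], Tuple[int, Any]]]


def evaluate_compound(tag: str, ops: List[Operation]):
    """Evaluate operations in order, halting at the first non-zero status."""
    results = []
    status = NFS4_OK
    for opcode, handler in ops:
        status, result = handler()
        # Each result is preceded by its opcode and status code; a failed
        # operation carries no result body in this model.
        results.append((opcode, status, result if status == NFS4_OK else None))
        if status != NFS4_OK:
            break  # stops even for "non error" conditions such as NFS4ERR_SAME
    # The reply status is the status of the last operation evaluated.
    return status, tag, results


if __name__ == "__main__":
    reply = evaluate_compound("demo", [
        (22, lambda: (NFS4_OK, None)),         # stands in for PUTFH
        (17, lambda: (NFS4ERR_SAME, None)),    # stands in for NVERIFY
        (9, lambda: (NFS4_OK, {"size": 0})),   # stands in for GETATTR, never run
    ])
    print(reply)
```

In the usage example the third operation is never evaluated, so the results list holds only two entries, and the client is left to decide how to continue from that point.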
6479 The client is responsible for recovering from any partially 6480 completed COMPOUND procedure. Partially completed COMPOUND 6481 procedures may occur at any point due to errors such as 6482 NFS4ERR_RESOURCE and NFS4ERR_DELAY. This may occur even given an 6483 otherwise valid operation string. Further, a server reboot which 6484 occurs in the middle of processing a COMPOUND procedure may leave the 6485 client with the difficult task of determining how far COMPOUND 6486 processing has proceeded. Therefore, the client should avoid overly 6487 complex COMPOUND procedures in the event of the failure of an 6488 operation within the procedure. 6490 Each operation assumes a "current" and "saved" filehandle that is 6491 available as part of the execution context of the compound request. 6492 Operations may set, change, or return the current filehandle. The 6493 "saved" filehandle is used for temporary storage of a filehandle 6494 value and as operands for the RENAME and LINK operations. 6496 13.3. Synchronous Modifying Operations 6498 NFS version 4 operations that modify the filesystem are synchronous. 6499 When an operation is successfully completed at the server, the client 6500 can depend on any data associated with the request now being on stable 6501 storage (the one exception is in the case of the file data in a WRITE 6502 operation with the UNSTABLE option specified). 6504 This implies that any previous operations within the same compound 6508 request are also reflected in stable storage. This behavior enables 6509 the client to recover from a partially executed compound 6510 request which may have resulted from the failure of the server.
For 6511 example, if a compound request contains operations A and B and the 6512 server is unable to send a response to the client, depending on the 6513 progress the server made in servicing the request the result of both 6514 operations may be reflected in stable storage or just operation A may 6515 be reflected. The server must not have just the results of operation 6516 B in stable storage. 6518 13.4. Operation Values 6520 The operations encoded in the COMPOUND procedure are identified by 6521 operation values. To avoid overlap with the RPC procedure numbers, 6522 operations 0 (zero) and 1 are not defined. Operation 2 is not 6523 defined but reserved for future use with minor versioning. 6525 Draft Specification NFS version 4 Protocol September 2002 6527 14. NFS version 4 Procedures 6529 14.1. Procedure 0: NULL - No Operation 6531 SYNOPSIS 6533 6535 ARGUMENT 6537 void; 6539 RESULT 6541 void; 6543 DESCRIPTION 6545 Standard NULL procedure. Void argument, void response. This 6546 procedure has no functionality associated with it. Because of this 6547 it is sometimes used to measure the overhead of processing a 6548 service request. Therefore, the server should ensure that no 6549 unnecessary work is done in servicing this procedure. 6551 ERRORS 6553 None. 6555 Draft Specification NFS version 4 Protocol September 2002 6557 14.2. Procedure 1: COMPOUND - Compound Operations 6559 SYNOPSIS 6561 compoundargs -> compoundres 6563 ARGUMENT 6565 union nfs_argop4 switch (nfs_opnum4 argop) { 6566 case : ; 6567 ... 6568 }; 6570 struct COMPOUND4args { 6571 utf8string tag; 6572 uint32_t minorversion; 6573 nfs_argop4 argarray<>; 6574 }; 6576 RESULT 6578 union nfs_resop4 switch (nfs_opnum4 resop){ 6579 case : ; 6580 ... 6581 }; 6583 struct COMPOUND4res { 6584 nfsstat4 status; 6585 utf8string tag; 6586 nfs_resop4 resarray<>; 6587 }; 6589 DESCRIPTION 6591 The COMPOUND procedure is used to combine one or more of the NFS 6592 operations into a single RPC request. 
The main NFS RPC program has 6593 two main procedures: NULL and COMPOUND. All other operations use 6594 the COMPOUND procedure as a wrapper. 6596 The COMPOUND procedure is used to combine individual operations 6597 into a single RPC request. The server interprets each of the 6598 operations in turn. If an operation is executed by the server and 6599 the status of that operation is NFS4_OK, then the next operation in 6600 the COMPOUND procedure is executed. The server continues this 6601 process until there are no more operations to be executed or one of 6602 the operations has a status value other than NFS4_OK. 6604 Draft Specification NFS version 4 Protocol September 2002 6606 In the processing of the COMPOUND procedure, the server may find 6607 that it does not have the available resources to execute any or all 6608 of the operations within the COMPOUND sequence. In this case, the 6609 error NFS4ERR_RESOURCE will be returned for the particular 6610 operation within the COMPOUND procedure where the resource 6611 exhaustion occurred. This assumes that all previous operations 6612 within the COMPOUND sequence have been evaluated successfully. The 6613 results for all of the evaluated operations must be returned to the 6614 client. 6616 The server will generally choose between two methods of decoding 6617 the client's request. The first would be the traditional one pass 6618 XDR decode. If there is an XDR decoding error in this case, the 6619 RPC XDR decode error would be returned. The second method would be 6620 to make an initial pass to decode the basic COMPOUND request and 6621 then to XDR decode the individual operations; the most interesting 6622 is the decode of attributes. In this case, the server may 6623 encounter an XDR decode error during the second pass. In this 6624 case, the server would return the error NFS4ERR_BADXDR to signify 6625 the decode error. 6627 The COMPOUND arguments contain a "minorversion" field. 
The initial 6628 and default value for this field is 0 (zero). This field will be 6629 used by future minor versions such that the client can communicate 6630 to the server what minor version is being requested. If the server 6631 receives a COMPOUND procedure with a minorversion field value that 6632 it does not support, the server MUST return an error of 6633 NFS4ERR_MINOR_VERS_MISMATCH and a zero length results array (resarray). 6635 Contained within the COMPOUND results is a "status" field. If the 6636 results array length is non-zero, this status must be equivalent to 6637 the status of the last operation that was executed within the 6638 COMPOUND procedure. Therefore, if an operation incurred an error 6639 then the "status" value will be the same error value as is being 6640 returned for the operation that failed. 6642 Note that operations 0 (zero) and 1 (one) are not defined for the 6643 COMPOUND procedure. Operation 2 is not defined but reserved for 6644 future definition and use with minor versioning. If the server 6645 receives an operation array that contains operation 2 and the 6646 minorversion field has a value of 0 (zero), an error of 6647 NFS4ERR_OP_ILLEGAL, as described in the next paragraph, is returned 6648 to the client. If an operation array contains an operation 2 and 6649 the minorversion field is non-zero and the server does not support 6650 the minor version, the server returns an error of 6651 NFS4ERR_MINOR_VERS_MISMATCH. Therefore, the 6652 NFS4ERR_MINOR_VERS_MISMATCH error takes precedence over all other 6653 errors. 6655 It is possible that the server receives a request that contains an 6656 operation that is less than the first legal operation (OP_ACCESS) 6657 or greater than the last legal operation (OP_RELEASE_LOCKOWNER). 6661 In this case, the server's response will encode the opcode 6662 OP_ILLEGAL rather than the illegal opcode of the request.
The 6663 status field in the ILLEGAL return results will be set to 6664 NFS4ERR_OP_ILLEGAL. The COMPOUND procedure's return results will 6665 also be NFS4ERR_OP_ILLEGAL. 6667 The definition of the "tag" in the request is left to the 6668 implementor. It may be used to summarize the content of the 6669 compound request for the benefit of packet sniffers and engineers 6670 debugging implementations. However, the value of "tag" in the 6671 response SHOULD be the same value as provided in the request. This 6672 applies to the tag field of the CB_COMPOUND procedure as well. 6674 IMPLEMENTATION 6676 Since an error of any type may occur after only a portion of the 6677 operations have been evaluated, the client must be prepared to 6678 recover from any failure. If the source of an NFS4ERR_RESOURCE 6679 error was a complex or lengthy set of operations, it is likely that 6680 if the number of operations were reduced the server would be able 6681 to evaluate them successfully. Therefore, the client is 6682 responsible for dealing with this type of complexity in recovery. 6684 ERRORS 6686 All errors defined in the protocol 6690 14.2.1.
Operation 3: ACCESS - Check Access Rights 6692 SYNOPSIS 6694 (cfh), accessreq -> supported, accessrights 6696 ARGUMENT
6698 const ACCESS4_READ    = 0x00000001;
6699 const ACCESS4_LOOKUP  = 0x00000002;
6700 const ACCESS4_MODIFY  = 0x00000004;
6701 const ACCESS4_EXTEND  = 0x00000008;
6702 const ACCESS4_DELETE  = 0x00000010;
6703 const ACCESS4_EXECUTE = 0x00000020;
6705 struct ACCESS4args {
6706         /* CURRENT_FH: object */
6707         uint32_t access;
6708 };
6710 RESULT
6712 struct ACCESS4resok {
6713         uint32_t supported;
6714         uint32_t access;
6715 };
6717 union ACCESS4res switch (nfsstat4 status) {
6718  case NFS4_OK:
6719          ACCESS4resok resok4;
6720  default:
6721          void;
6722 };
6724 DESCRIPTION 6726 ACCESS determines the access rights that a user, as identified by 6727 the credentials in the RPC request, has with respect to the file 6728 system object specified by the current filehandle. The client 6729 encodes the set of access rights that are to be checked in the bit 6730 mask "access". The server checks the permissions encoded in the 6731 bit mask. If a status of NFS4_OK is returned, two bit masks are 6732 included in the response. The first, "supported", represents the 6733 access rights that the server can verify reliably. The 6734 second, "access", represents the access rights available to the 6735 user for the filehandle provided. On success, the current 6736 filehandle retains its value. 6740 Note that the supported field will contain only as many values as 6741 were originally sent in the arguments. For example, if the client 6742 sends an ACCESS operation with only the ACCESS4_READ value set and 6743 the server supports this value, the server will return only 6744 ACCESS4_READ even if it could have reliably checked other values. 6746 The results of this operation are necessarily advisory in nature.
     A return status of NFS4_OK and the appropriate bit set in the bit
     mask do not imply that such access will be allowed to the file
     system object in the future. This is because access rights can be
     revoked by the server at any time.

     The following access permissions may be requested:

     ACCESS4_READ    Read data from file or read a directory.

     ACCESS4_LOOKUP  Look up a name in a directory (no meaning for
                     non-directory objects).

     ACCESS4_MODIFY  Rewrite existing file data or modify existing
                     directory entries.

     ACCESS4_EXTEND  Write new data or add directory entries.

     ACCESS4_DELETE  Delete an existing directory entry.

     ACCESS4_EXECUTE Execute file (no meaning for a directory).

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     In general, it is not sufficient for the client to attempt to
     deduce access permissions by inspecting the uid, gid, and mode
     fields in the file attributes or by attempting to interpret the
     contents of the ACL attribute. This is because the server may
     perform uid or gid mapping or enforce additional access control
     restrictions. It is also possible that the server may not be in
     the same ID space as the client. In these cases (and perhaps
     others), the client cannot reliably perform an access check with
     only current file attributes.

     In the NFS version 2 protocol, the only reliable way to determine
     whether an operation was allowed was to try it and see if it
     succeeded or failed. Using the ACCESS operation in the NFS version
     4 protocol, the client can ask the server to indicate whether or
     not one or more classes of operations are permitted. The ACCESS
     operation is provided to allow clients to check before doing a
     series of operations which will result in an access failure.
     The OPEN operation provides a point where the server can verify
     access to the file object and a method to return that information
     to the client. The ACCESS operation is still useful for directory
     operations or for use in the case where the UNIX API "access" is
     used on the client.

     The information returned by the server in response to an ACCESS
     call is not permanent. It was correct at the exact time that the
     server performed the checks, but not necessarily afterwards. The
     server can revoke access permission at any time.

     The client should use the effective credentials of the user to
     build the authentication information in the ACCESS request used to
     determine access rights. It is the effective user and group
     credentials that are used in subsequent read and write operations.

     Many implementations do not directly support the ACCESS4_DELETE
     permission. Operating systems like UNIX will ignore the
     ACCESS4_DELETE bit if set on an access request on a non-directory
     object. In these systems, delete permission on a file is
     determined by the access permissions on the directory in which the
     file resides, instead of being determined by the permissions of the
     file itself. Therefore, the mask returned enumerating which access
     rights can be determined will have the ACCESS4_DELETE value set to
     0. This indicates to the client that the server was unable to
     check that particular access right. The ACCESS4_DELETE bit in the
     access mask returned will then be ignored by the client.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.2.
Operation 4: CLOSE - Close File

   SYNOPSIS

     (cfh), seqid, open_stateid -> open_stateid

   ARGUMENT

     struct CLOSE4args {
             /* CURRENT_FH: object */
             seqid4          seqid;
             stateid4        open_stateid;
     };

   RESULT

     union CLOSE4res switch (nfsstat4 status) {
      case NFS4_OK:
              stateid4       open_stateid;
      default:
              void;
     };

   DESCRIPTION

     The CLOSE operation releases share reservations for the regular or
     named attribute file as specified by the current filehandle. The
     share reservations and other state information released at the
     server as a result of this CLOSE are associated only with the
     supplied stateid. The sequence id provides for the correct
     ordering. State associated with other OPENs is not affected.

     If record locks are held, the client SHOULD release all locks
     before issuing a CLOSE. The server MAY free all outstanding locks
     on CLOSE but some servers may not support the CLOSE of a file that
     still has record locks held. The server MUST return failure if any
     locks would exist after the CLOSE.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     Even though CLOSE returns a stateid, this stateid is not useful to
     the client and should be treated as deprecated. CLOSE "shuts down"
     the state associated with all OPENs for the file by a single
     open_owner. As noted above, CLOSE will either release all file
     locking state or return an error. Therefore, the stateid returned
     by CLOSE is not useful for operations that follow.
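
     The rule that CLOSE either releases all file locking state or
     fails can be illustrated with a small model. This is only a
     sketch: the function name, the dictionary layout, and the
     frees_locks_on_close flag are assumptions made for illustration
     and are not defined by the protocol.

```python
# Toy model only: the names and data layout here are assumptions for
# illustration, not part of the protocol or of any real server.

def close_open_state(record_locks, open_stateid, frees_locks_on_close):
    """Model a server's handling of CLOSE for one open stateid.

    record_locks maps an open stateid to the list of (offset, length)
    record locks still held under it.  Returns (status, record_locks),
    where the returned mapping reflects the state after the CLOSE.
    """
    held = record_locks.get(open_stateid, [])
    if held and not frees_locks_on_close:
        # Locks would exist after the CLOSE, so the server must fail it.
        return "NFS4ERR_LOCKS_HELD", record_locks
    # Release everything tied to this stateid; other OPENs are untouched.
    remaining = {sid: locks for sid, locks in record_locks.items()
                 if sid != open_stateid}
    return "NFS4_OK", remaining
```

     A client that releases its record locks with LOCKU before CLOSE
     never depends on which of the two server behaviors it encounters.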

   ERRORS

     NFS4ERR_ADMIN_REVOKED
     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_SEQID
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_EXPIRED
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCKS_HELD
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_OLD_STATEID
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_STATEID

14.2.3. Operation 5: COMMIT - Commit Cached Data

   SYNOPSIS

     (cfh), offset, count -> verifier

   ARGUMENT

     struct COMMIT4args {
             /* CURRENT_FH: file */
             offset4         offset;
             count4          count;
     };

   RESULT

     struct COMMIT4resok {
             verifier4       writeverf;
     };

     union COMMIT4res switch (nfsstat4 status) {
      case NFS4_OK:
              COMMIT4resok   resok4;
      default:
              void;
     };

   DESCRIPTION

     The COMMIT operation forces or flushes data to stable storage for
     the file specified by the current filehandle. The flushed data is
     that which was previously written with a WRITE operation which had
     the stable field set to UNSTABLE4.

     The offset specifies the position within the file where the flush
     is to begin. An offset value of 0 (zero) means to flush data
     starting at the beginning of the file. The count specifies the
     number of bytes of data to flush. If count is 0 (zero), a flush
     from offset to the end of the file is done.

     The server returns a write verifier upon successful completion of
     the COMMIT. The write verifier is used by the client to determine
     if the server has restarted or rebooted between the initial
     WRITE(s) and the COMMIT. The client does this by comparing the
     write verifier returned from the initial writes and the verifier
     returned by the COMMIT operation.
     The server must vary the value of the write verifier at each
     server event or instantiation that may lead to a loss of
     uncommitted data. Most commonly this occurs when the server is
     rebooted; however, other events at the server may result in
     uncommitted data loss as well.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     The COMMIT operation is similar in operation and semantics to the
     POSIX fsync(2) system call that synchronizes a file's state with
     the disk (file data and metadata are flushed to disk or stable
     storage). COMMIT performs the same operation for a client, flushing
     any unsynchronized data and metadata on the server to the server's
     disk or stable storage for the specified file. Like fsync(2), it
     may be that there is some modified data or no modified data to
     synchronize. The data may have been synchronized by the server's
     normal periodic buffer synchronization activity. COMMIT should
     return NFS4_OK, unless there has been an unexpected error.

     COMMIT differs from fsync(2) in that it is possible for the client
     to flush a range of the file (most likely triggered by a
     buffer-reclamation scheme on the client before the file has been
     completely written).

     The server implementation of COMMIT is reasonably simple. If the
     server receives a full file COMMIT request, that is starting at
     offset 0 and count 0, it should do the equivalent of fsync()'ing
     the file. Otherwise, it should arrange to have the cached data in
     the range specified by offset and count to be flushed to stable
     storage. In both cases, any metadata associated with the file must
     be flushed to stable storage before returning. It is not an error
     for there to be nothing to flush on the server.
     This means that the data and metadata that needed to be flushed
     have already been flushed or lost during the last server failure.

     The client implementation of COMMIT is a little more complex.
     There are two reasons for wanting to commit a client buffer to
     stable storage. The first is that the client wants to reuse a
     buffer. In this case, the offset and count of the buffer are sent
     to the server in the COMMIT request. The server then flushes any
     cached data based on the offset and count, and flushes any metadata
     associated with the file. It then returns the status of the flush
     and the write verifier. The other reason for the client to
     generate a COMMIT is for a full file flush, such as may be done at
     close. In this case, the client would gather all of the buffers
     for this file that contain uncommitted data, do the COMMIT
     operation with an offset of 0 and count of 0, and then free all of
     those buffers. Any other dirty buffers would be sent to the server
     in the normal fashion.

     After a buffer is written by the client with the stable parameter
     set to UNSTABLE4, the buffer must be considered as modified by the
     client until the buffer has either been flushed via a COMMIT
     operation or written via a WRITE operation with stable parameter
     set to FILE_SYNC4 or DATA_SYNC4. This is done to prevent the buffer
     from being freed and reused before the data can be flushed to
     stable storage on the server.

     When a response is returned from either a WRITE or a COMMIT
     operation and it contains a write verifier that is different from
     the one previously returned by the server, the client will need to
     retransmit all of the buffers containing uncommitted cached data to
     the server. How this is to be done is up to the implementor.
     If there is only one buffer of interest, then it should probably be
     sent back over in a WRITE request with the appropriate stable
     parameter. If there is more than one buffer, it might be
     worthwhile retransmitting all of the buffers in WRITE requests with
     the stable parameter set to UNSTABLE4 and then retransmitting the
     COMMIT operation to flush all of the data on the server to stable
     storage. The timing of these retransmissions is left to the
     implementor.

     The above description applies to page-cache-based systems as well
     as buffer-cache-based systems. In those systems, the virtual
     memory system will need to be modified instead of the buffer cache.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_ISDIR
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.4. Operation 6: CREATE - Create a Non-Regular File Object

   SYNOPSIS

     (cfh), name, type, attrs -> (cfh), change_info, attrs_set

   ARGUMENT

     union createtype4 switch (nfs_ftype4 type) {
      case NF4LNK:
              linktext4      linkdata;
      case NF4BLK:
      case NF4CHR:
              specdata4      devdata;
      case NF4SOCK:
      case NF4FIFO:
      case NF4DIR:
              void;
     };

     struct CREATE4args {
             /* CURRENT_FH: directory for creation */
             createtype4     objtype;
             component4      objname;
             fattr4          createattrs;
     };

   RESULT

     struct CREATE4resok {
             change_info4    cinfo;
             bitmap4         attrset;        /* attributes set */
     };

     union CREATE4res switch (nfsstat4 status) {
      case NFS4_OK:
              CREATE4resok   resok4;
      default:
              void;
     };

   DESCRIPTION

     The CREATE operation creates a non-regular file object in a
     directory with a given name.
     The OPEN operation MUST be used to create a regular file.

     The objname specifies the name for the new object. The objtype
     determines the type of object to be created: directory, symlink,
     etc.

     If an object of the same name already exists in the directory, the
     server will return the error NFS4ERR_EXIST.

     For the directory where the new file object was created, the server
     returns change_info4 information in cinfo. With the atomic field
     of the change_info4 struct, the server will indicate if the before
     and after change attributes were obtained atomically with respect
     to the file object creation.

     If the objname has a length of 0 (zero), or if objname does not
     obey the UTF-8 definition, the error NFS4ERR_INVAL will be
     returned.

     The current filehandle is replaced by that of the new object.

     The createattrs specifies the initial set of attributes for the
     object. The set of attributes may include any writable attribute
     valid for the object type. When the operation is successful, the
     server will return to the client an attribute mask signifying which
     attributes were successfully set for the object.

     If createattrs includes neither the owner attribute nor an ACL with
     an ACE for the owner, and if the server's filesystem both supports
     and requires an owner attribute (or an owner ACE) then the server
     MUST derive the owner (or the owner ACE). This would typically be
     from the principal indicated in the RPC credentials of the call,
     but the server's operating environment or filesystem semantics may
     dictate other methods of derivation.
     Similarly, if createattrs includes neither the group attribute nor
     a group ACE, and if the server's filesystem both supports and
     requires the notion of a group attribute (or group ACE), the server
     MUST derive the group attribute (or the corresponding owner ACE)
     for the file. This could be from the RPC call's credentials, such
     as the group principal if the credentials include it (such as with
     AUTH_SYS), from the group identifier associated with the principal
     in the credentials (e.g., POSIX systems have a passwd database that
     has the group identifier for every user identifier), inherited from
     the directory the object is created in, or whatever else the
     server's operating environment or filesystem semantics dictate.
     This applies to the OPEN operation too.

     Conversely, it is possible the client will specify in createattrs
     an owner attribute or group attribute or ACL that the principal
     indicated by the RPC call's credentials does not have permissions
     to create files for. The error to be returned in this instance is
     NFS4ERR_PERM. This applies to the OPEN operation too.

   IMPLEMENTATION

     If the client desires to set attribute values after the create, a
     SETATTR operation can be added to the COMPOUND request so that the
     appropriate attributes will be set.
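
     The owner and group derivation described for CREATE can be
     sketched as follows. This is a minimal illustration under stated
     assumptions: the function, its parameters, the "owner" and "group"
     dictionary keys, and the order in which group sources are tried
     are all hypothetical, ACLs and ACEs are ignored, and the
     NFS4ERR_PERM check on client-supplied values is not modeled. A
     real server follows its own filesystem semantics.

```python
# Hypothetical sketch: plain dictionaries stand in for fattr4, and the
# precedence among group sources is an assumption, not a protocol rule.

def derive_owner_group(createattrs, principal, cred_group=None,
                       primary_groups=None, parent_dir_group=None,
                       inherit_from_dir=False):
    """Fill in owner and group when createattrs leaves them out."""
    attrs = dict(createattrs)          # do not modify the caller's copy
    if "owner" not in attrs:
        # Typically taken from the principal in the RPC credentials.
        attrs["owner"] = principal
    if "group" not in attrs:
        if inherit_from_dir and parent_dir_group is not None:
            # Inherited from the directory the object is created in.
            attrs["group"] = parent_dir_group
        elif cred_group is not None:
            # Group carried in the credentials (as with AUTH_SYS).
            attrs["group"] = cred_group
        else:
            # Group associated with the principal, for example from a
            # passwd-style database; None if nothing is known.
            attrs["group"] = (primary_groups or {}).get(principal)
    return attrs
```

     Values supplied by the client are left untouched, so only the
     attributes missing from createattrs are derived.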

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_ATTRNOTSUPP
     NFS4ERR_BADCHAR
     NFS4ERR_BADHANDLE
     NFS4ERR_BADNAME
     NFS4ERR_BADOWNER
     NFS4ERR_BADTYPE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DQUOT
     NFS4ERR_EXIST
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOSPC
     NFS4ERR_NOTDIR
     NFS4ERR_PERM
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting Recovery

   SYNOPSIS

     clientid ->

   ARGUMENT

     struct DELEGPURGE4args {
             clientid4       clientid;
     };

   RESULT

     struct DELEGPURGE4res {
             nfsstat4        status;
     };

   DESCRIPTION

     Purges all of the delegations awaiting recovery for a given client.
     This is useful for clients which do not commit delegation
     information to stable storage to indicate that conflicting requests
     need not be delayed by the server awaiting recovery of delegation
     information.

     This operation should be used by clients that record delegation
     information on stable storage on the client. In this case,
     DELEGPURGE should be issued immediately after doing delegation
     recovery on all delegations known to the client. Doing so will
     notify the server that no additional delegations for the client
     will be recovered allowing it to free resources, and avoid delaying
     other clients who make requests that conflict with the unrecovered
     delegations. The set of delegations known to the server and the
     client may be different. The reason for this is that a client may
     fail after making a request which resulted in delegation but before
     it received the results and committed them to the client's stable
     storage.
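
     The effect of DELEGPURGE on the server can be pictured with a toy
     bookkeeping model. The class and method names below are
     assumptions made for illustration; the protocol does not
     prescribe how a server tracks delegations awaiting recovery.

```python
# Toy model only: a real server keeps far more state per delegation.

class DelegationRecovery:
    """Track delegations that a server is holding for possible recovery."""

    def __init__(self):
        self.awaiting = {}             # clientid -> set of filehandles

    def record(self, clientid, filehandle):
        # Delegation known to the server but not yet recovered.
        self.awaiting.setdefault(clientid, set()).add(filehandle)

    def reclaim(self, clientid, filehandle):
        # The client recovered this delegation; it no longer awaits.
        self.awaiting.get(clientid, set()).discard(filehandle)

    def delegpurge(self, clientid):
        # The client will recover nothing more; drop what is left.
        return self.awaiting.pop(clientid, set())

    def must_delay(self, filehandle):
        # A conflicting request waits while any client might still
        # recover a delegation on this file.
        return any(filehandle in fhs for fhs in self.awaiting.values())
```

     In this model a delegation that the server granted but the client
     never recorded stays in the table until DELEGPURGE removes it,
     which is the mismatch between server and client knowledge that the
     text describes.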

     The server MAY support DELEGPURGE, but if it does not, it MUST NOT
     support CLAIM_DELEGATE_PREV.

   ERRORS

     NFS4ERR_BADXDR
     NFS4ERR_NOTSUPP
     NFS4ERR_LEASE_MOVED
     NFS4ERR_MOVED
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE_CLIENTID

14.2.6. Operation 8: DELEGRETURN - Return Delegation

   SYNOPSIS

     (cfh), stateid ->

   ARGUMENT

     struct DELEGRETURN4args {
             /* CURRENT_FH: delegated file */
             stateid4        stateid;
     };

   RESULT

     struct DELEGRETURN4res {
             nfsstat4        status;
     };

   DESCRIPTION

     Returns the delegation represented by the current filehandle and
     stateid.

     Delegations may be returned when recalled or voluntarily (i.e.
     before the server has recalled them). In either case the client
     must properly propagate state changed under the context of the
     delegation to the server before returning the delegation.

   ERRORS

     NFS4ERR_ADMIN_REVOKED
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_EXPIRED
     NFS4ERR_INVAL
     NFS4ERR_LEASE_MOVED
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTSUPP
     NFS4ERR_OLD_STATEID
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_STATEID

14.2.7.
Operation 9: GETATTR - Get Attributes

   SYNOPSIS

     (cfh), attrbits -> attrbits, attrvals

   ARGUMENT

     struct GETATTR4args {
             /* CURRENT_FH: directory or file */
             bitmap4         attr_request;
     };

   RESULT

     struct GETATTR4resok {
             fattr4          obj_attributes;
     };

     union GETATTR4res switch (nfsstat4 status) {
      case NFS4_OK:
              GETATTR4resok  resok4;
      default:
              void;
     };

   DESCRIPTION

     The GETATTR operation will obtain attributes for the filesystem
     object specified by the current filehandle. The client sets a bit
     in the bitmap argument for each attribute value that it would like
     the server to return. The server returns an attribute bitmap that
     indicates the attribute values that it was able to return,
     followed by the attribute values ordered lowest attribute number
     first.

     The server must return a value for each attribute that the client
     requests if the attribute is supported by the server. If the
     server does not support an attribute or cannot approximate a useful
     value then it must not return the attribute value and must not set
     the attribute bit in the result bitmap. The server must return an
     error if it supports an attribute but cannot obtain its value. In
     that case no attribute values will be returned.

     All servers must support the mandatory attributes as specified in
     the section "File Attributes".

     On success, the current filehandle retains its value.

   IMPLEMENTATION

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.8.
Operation 10: GETFH - Get Current Filehandle

   SYNOPSIS

     (cfh) -> filehandle

   ARGUMENT

     /* CURRENT_FH: */
     void;

   RESULT

     struct GETFH4resok {
             nfs_fh4         object;
     };

     union GETFH4res switch (nfsstat4 status) {
      case NFS4_OK:
             GETFH4resok     resok4;
      default:
             void;
     };

   DESCRIPTION

     This operation returns the current filehandle value.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     Operations that change the current filehandle like LOOKUP or CREATE
     do not automatically return the new filehandle as a result. For
     instance, if a client needs to look up a directory entry and obtain
     its filehandle then the following request is needed.

               PUTFH  (directory filehandle)
               LOOKUP (entry name)
               GETFH

   ERRORS

     NFS4ERR_BADHANDLE
     NFS4ERR_FHEXPIRED
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.9. Operation 11: LINK - Create Link to a File

   SYNOPSIS

     (sfh), (cfh), newname -> (cfh), change_info

   ARGUMENT

     struct LINK4args {
             /* SAVED_FH: source object */
             /* CURRENT_FH: target directory */
             component4      newname;
     };

   RESULT

     struct LINK4resok {
             change_info4    cinfo;
     };

     union LINK4res switch (nfsstat4 status) {
      case NFS4_OK:
              LINK4resok resok4;
      default:
              void;
     };

   DESCRIPTION

     The LINK operation creates an additional newname for the file
     represented by the saved filehandle, as set by the SAVEFH
     operation, in the directory represented by the current filehandle.
     The existing file and the target directory must reside within the
     same filesystem on the server. On success, the current filehandle
     will continue to be the target directory.
     If an object exists in the target directory with the same name as
     newname, the server must return NFS4ERR_EXIST.

     For the target directory, the server returns change_info4
     information in cinfo. With the atomic field of the change_info4
     struct, the server will indicate if the before and after change
     attributes were obtained atomically with respect to the link
     creation.

     If the newname has a length of 0 (zero), or if newname does not
     obey the UTF-8 definition, the error NFS4ERR_INVAL will be
     returned.

   IMPLEMENTATION

     Changes to any property of the "hard" linked files are reflected in
     all of the linked files. When a link is made to a file, the
     attributes for the file should have a value for numlinks that is
     one greater than the value before the LINK operation.

     The statement "file and the target directory must reside within the
     same filesystem on the server" means that the fsid fields in the
     attributes for the objects are the same. If they reside on
     different filesystems, the error, NFS4ERR_XDEV, is returned. On
     some servers, the filenames, "." and "..", are illegal as newname.

     In the case that newname is already linked to the file represented
     by the saved filehandle, the server will return NFS4ERR_EXIST.

     Note that symbolic links are created with the CREATE operation.
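
     The checks described for LINK can be collected into a short
     sketch. The dictionaries standing in for the saved-filehandle
     object and the target directory, the function name, and the order
     in which the checks are made are assumptions for illustration
     only; the UTF-8 check and server-specific rules for "." and ".."
     are left out.

```python
# Hypothetical model: dictionaries stand in for server-side objects.

def link(source, target_dir, newname):
    """Apply the LINK checks and, if they pass, create the new name."""
    if len(newname) == 0:
        return "NFS4ERR_INVAL"         # zero-length name
    if source["fsid"] != target_dir["fsid"]:
        return "NFS4ERR_XDEV"          # objects on different filesystems
    if newname in target_dir["entries"]:
        return "NFS4ERR_EXIST"         # name already present in directory
    target_dir["entries"][newname] = source
    source["numlinks"] += 1            # one greater than before the LINK
    return "NFS4_OK"
```

     Because every name for a hard-linked file refers to the same
     object in this model, a change made through one name is visible
     through all of them.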

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADCHAR
     NFS4ERR_BADHANDLE
     NFS4ERR_BADNAME
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DQUOT
     NFS4ERR_EXIST
     NFS4ERR_FHEXPIRED
     NFS4ERR_FILE_OPEN
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_ISDIR
     NFS4ERR_MLINK
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOSPC
     NFS4ERR_NOTDIR
     NFS4ERR_NOTSUPP
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC
     NFS4ERR_XDEV

14.2.10. Operation 12: LOCK - Create Lock

   SYNOPSIS

     (cfh) locktype, reclaim, offset, length, locker -> stateid

   ARGUMENT

     struct open_to_lock_owner4 {
             seqid4          open_seqid;
             stateid4        open_stateid;
             seqid4          lock_seqid;
             lock_owner4     lock_owner;
     };

     struct exist_lock_owner4 {
             stateid4        lock_stateid;
             seqid4          lock_seqid;
     };

     union locker4 switch (bool new_lock_owner) {
      case TRUE:
             open_to_lock_owner4     open_owner;
      case FALSE:
             exist_lock_owner4       lock_owner;
     };

     enum nfs_lock_type4 {
             READ_LT         = 1,
             WRITE_LT        = 2,
             READW_LT        = 3,    /* blocking read */
             WRITEW_LT       = 4     /* blocking write */
     };

     struct LOCK4args {
             /* CURRENT_FH: file */
             nfs_lock_type4  locktype;
             bool            reclaim;
             offset4         offset;
             length4         length;
             locker4         locker;
     };

   RESULT

     struct LOCK4denied {
             offset4         offset;
             length4         length;
             nfs_lock_type4  locktype;
             lock_owner4     owner;
     };

     struct LOCK4resok {
             stateid4        lock_stateid;
     };

     union LOCK4res switch (nfsstat4 status) {
      case NFS4_OK:
              LOCK4resok     resok4;
      case NFS4ERR_DENIED:
              LOCK4denied    denied;
      default:
              void;
     };

   DESCRIPTION

     The LOCK operation requests a record lock
     for the byte range specified by the offset and length parameters.
     The lock type is also specified to be one of the nfs_lock_type4s.
     If this is a reclaim request, the reclaim parameter will be TRUE.

     Bytes in a file may be locked even if those bytes are not currently
     allocated to the file. To lock the file from a specific offset
     through the end-of-file (no matter how long the file actually is)
     use a length field with all bits set to 1 (one). If the length is
     zero, or if a length which is not all bits set to one is specified,
     and length when added to the offset exceeds the maximum 64-bit
     unsigned integer value, the error NFS4ERR_INVAL will result.

     Some servers may only support locking for byte offsets that fit
     within 32 bits. If the client specifies a range that includes a
     byte beyond the last byte offset of the 32-bit range, but does not
     include the last byte offset of the 32-bit range and all of the
     byte offsets beyond it, up to the end of the valid 64-bit range,
     such a 32-bit server MUST return the error NFS4ERR_BAD_RANGE.

     In the case that the lock is denied, the owner, offset, and length
     of a conflicting lock are returned.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     If the server is unable to determine the exact offset and length of
     the conflicting lock, the same offset and length that were provided
     in the arguments should be returned in the denied results. The
     File Locking section contains a full description of this and the
     other file locking operations.

     LOCK operations are subject to permission checks and to checks
     against the access type of the associated file. However, the
     specific rights and modes required for various types of locks
     reflect the semantics of the server-exported filesystem, and are
     not specified by the protocol.
     For example, Windows 2000 allows a write lock of a file open for
     READ, while a POSIX-compliant system does not.

     When the client makes a lock request that corresponds to a range
     that the lockowner has locked already (with the same or different
     lock type), or to a sub-region of such a range, or to a region
     which includes multiple locks already granted to that lockowner, in
     whole or in part, and the server does not support such locking
     operations (i.e. does not support POSIX locking semantics), the
     server will return the error NFS4ERR_LOCK_RANGE. In that case, the
     client may return an error, or it may emulate the required
     operations, using only LOCK for ranges that do not include any
     bytes already locked by that lock_owner and LOCKU of locks held by
     that lock_owner (specifying an exactly-matching range and type).
     Similarly, when the client makes a lock request that amounts to
     upgrading (changing from a read lock to a write lock) or
     downgrading (changing from write lock to a read lock) an existing
     record lock, and the server does not support such a lock, the
     server will return NFS4ERR_LOCK_NOTSUPP. Such operations may not
     perfectly reflect the required semantics in the face of conflicting
     lock requests from other clients.

     The locker argument specifies the lock_owner that is associated
     with the LOCK request. The locker4 structure is a switched union
     that indicates whether the lock_owner is known to the server or if
     the lock_owner is new to the server. In the case that the
     lock_owner is known to the server and has an established
     lock_seqid, the argument is just the lock_owner and lock_seqid. In
     the case that the lock_owner is not known to the server, the
     argument contains not only the lock_owner and lock_seqid but also
     the open_stateid and open_seqid.
     The new lock_owner case covers the very first lock done by the
     lock_owner and offers a method to use the established state of the
     open_stateid to transition to the use of the lock_owner.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_ADMIN_REVOKED
     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_RANGE
     NFS4ERR_BAD_SEQID
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_DEADLOCK
     NFS4ERR_DELAY
     NFS4ERR_DENIED
     NFS4ERR_EXPIRED
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCK_NOTSUPP
     NFS4ERR_LOCK_RANGE
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NO_GRACE
     NFS4ERR_OLD_STATEID
     NFS4ERR_OPENMODE
     NFS4ERR_RECLAIM_BAD
     NFS4ERR_RECLAIM_CONFLICT
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_CLIENTID
     NFS4ERR_STALE_STATEID

14.2.11. Operation 13: LOCKT - Test For Lock

   SYNOPSIS

     (cfh) locktype, offset, length, owner -> {void, NFS4ERR_DENIED ->
     owner}

   ARGUMENT

     struct LOCKT4args {
             /* CURRENT_FH: file */
             nfs_lock_type4  locktype;
             offset4         offset;
             length4         length;
             lock_owner4     owner;
     };

   RESULT

     struct LOCK4denied {
             offset4         offset;
             length4         length;
             nfs_lock_type4  locktype;
             lock_owner4     owner;
     };

     union LOCKT4res switch (nfsstat4 status) {
      case NFS4ERR_DENIED:
              LOCK4denied    denied;
      case NFS4_OK:
              void;
      default:
              void;
     };

   DESCRIPTION

     The LOCKT operation tests the lock as specified in the arguments.
     If a conflicting lock exists, the owner, offset, length, and type
     of the conflicting lock are returned; if no lock is held, nothing
     other than NFS4_OK is returned.
     Lock types READ_LT and READW_LT are processed in the same way in
     that a conflicting lock test is done without regard to blocking or
     non-blocking. The same is true for WRITE_LT and WRITEW_LT.

     The ranges are specified as for LOCK. The NFS4ERR_INVAL and
     NFS4ERR_BAD_RANGE errors are returned under the same circumstances
     as for LOCK.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     If the server is unable to determine the exact offset and length of
     the conflicting lock, the same offset and length that were provided
     in the arguments should be returned in the denied results. The
     File Locking section contains further discussion of the file
     locking mechanisms.

     LOCKT uses a lock_owner4 rather than a stateid4, as is used in
     LOCK, to identify the owner. This is because the client does not
     have to open the file to test for the existence of a lock, so a
     stateid may not be available.

     The test for conflicting locks should exclude locks for the current
     lockowner. Note that since such locks are not examined, the
     possible existence of overlapping ranges may not affect the results
     of LOCKT. If the server does examine locks that match the
     lockowner for the purpose of range checking, NFS4ERR_LOCK_RANGE may
     be returned. In the event that it returns NFS4_OK, clients may do
     a LOCK and receive NFS4ERR_LOCK_RANGE on the LOCK request because
     of the flexibility provided to the server.
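
     The conflict test described above can be sketched as follows. This
     is a minimal illustration only, not a normative algorithm. The
     HeldLock record, the lockt() helper and the in-memory list of locks
     are hypothetical names introduced for the example. The numeric
     lock type values and the convention that a length of all ones
     means "to end of file" are taken to be those of the LOCK
     definitions elsewhere in this specification.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed to match the nfs_lock_type4 enumeration of the protocol.
READ_LT, WRITE_LT, READW_LT, WRITEW_LT = 1, 2, 3, 4
MAX_UINT64 = 0xFFFFFFFFFFFFFFFF  # length of all ones: lock runs to end of file


@dataclass(frozen=True)
class HeldLock:
    """Hypothetical server-side record of one granted byte-range lock."""
    owner: bytes
    locktype: int
    offset: int
    length: int


def is_write(locktype: int) -> bool:
    # Blocking and non-blocking variants are treated alike by LOCKT.
    return locktype in (WRITE_LT, WRITEW_LT)


def range_end(offset: int, length: int) -> Optional[int]:
    """Exclusive end of the range, or None when it extends to end of file."""
    return None if length == MAX_UINT64 else offset + length


def overlaps(a_off: int, a_len: int, b_off: int, b_len: int) -> bool:
    a_end, b_end = range_end(a_off, a_len), range_end(b_off, b_len)
    return (b_end is None or a_off < b_end) and (a_end is None or b_off < a_end)


def lockt(held: List[HeldLock], owner: bytes, locktype: int,
          offset: int, length: int) -> Optional[HeldLock]:
    """Return the first conflicting lock (NFS4ERR_DENIED case) or None (NFS4_OK)."""
    for lock in held:
        if lock.owner == owner:
            continue  # locks of the requesting lockowner are excluded
        if not overlaps(offset, length, lock.offset, lock.length):
            continue
        if is_write(locktype) or is_write(lock.locktype):
            return lock  # two read locks never conflict
    return None
```

     A server that cannot determine the exact conflicting range would
     return the requested offset and length instead of those of the
     HeldLock found, as noted above.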

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_RANGE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DENIED
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCK_RANGE
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_CLIENTID

14.2.12. Operation 14: LOCKU - Unlock File

   SYNOPSIS

     (cfh) type, seqid, stateid, offset, length -> stateid

   ARGUMENT

     struct LOCKU4args {
             /* CURRENT_FH: file */
             nfs_lock_type4  locktype;
             seqid4          seqid;
             stateid4        stateid;
             offset4         offset;
             length4         length;
     };

   RESULT

     union LOCKU4res switch (nfsstat4 status) {
      case NFS4_OK:
              stateid4       stateid;
      default:
              void;
     };

   DESCRIPTION

     The LOCKU operation unlocks the record lock specified by the
     parameters. The client may set the locktype field to any value that
     is legal for the nfs_lock_type4 enumerated type, and the server
     MUST accept any legal value for locktype. Any legal value for
     locktype has no effect on the success or failure of the LOCKU
     operation.

     The ranges are specified as for LOCK. The NFS4ERR_INVAL and
     NFS4ERR_BAD_RANGE errors are returned under the same circumstances
     as for LOCK.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     If the area to be unlocked does not correspond exactly to a lock
     actually held by the lockowner, the server may return the error
     NFS4ERR_LOCK_RANGE. This includes the case in which the area is
     not locked, where the area is a sub-range of the area locked, where
     it overlaps the area locked without matching exactly, or the area
     specified includes multiple locks held by the lockowner.
     In all of these cases, allowed by POSIX locking semantics, a client
     receiving this error should, if it desires support for such
     operations, simulate the operation using LOCKU on ranges
     corresponding to locks it actually holds, possibly followed by LOCK
     requests for the sub-ranges not being unlocked.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_ADMIN_REVOKED
     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_RANGE
     NFS4ERR_BAD_SEQID
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_EXPIRED
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCK_RANGE
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_OLD_STATEID
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_STATEID

14.2.13. Operation 15: LOOKUP - Lookup Filename

   SYNOPSIS

     (cfh), component -> (cfh)

   ARGUMENT

     struct LOOKUP4args {
             /* CURRENT_FH: directory */
             component4      objname;
     };

   RESULT

     struct LOOKUP4res {
             /* CURRENT_FH: object */
             nfsstat4        status;
     };

   DESCRIPTION

     This operation looks up (finds) a filesystem object in the
     directory specified by the current filehandle. LOOKUP evaluates
     the component and if the object exists the current filehandle is
     replaced with the component's filehandle.

     If the component cannot be evaluated either because it does not
     exist or because the client does not have permission to evaluate
     the component, then an error will be returned and the current
     filehandle will be unchanged.

     If the component is a zero length string or if any component does
     not obey the UTF-8 definition, the error NFS4ERR_INVAL will be
     returned.

   IMPLEMENTATION

     If the client wants to achieve the effect of a multi-component
     lookup, it may construct a COMPOUND request such as (and obtain
     each filehandle):

           PUTFH (directory filehandle)
           LOOKUP "pub"
           GETFH
           LOOKUP "foo"
           GETFH
           LOOKUP "bar"
           GETFH

     NFS version 4 servers depart from the semantics of previous NFS
     versions in allowing LOOKUP requests to cross mountpoints on the
     server. The client can detect a mountpoint crossing by comparing
     the fsid attribute of the directory with the fsid attribute of the
     directory looked up. If the fsids are different then the new
     directory is a server mountpoint. UNIX clients that detect a
     mountpoint crossing will need to mount the server's filesystem.
     This needs to be done to maintain the file object identity checking
     mechanisms common to UNIX clients.

     Servers that limit NFS access to "shares" or "exported" filesystems
     should provide a pseudo-filesystem into which the exported
     filesystems can be integrated, so that clients can browse the
     server's name space. The client's view of a pseudo filesystem will
     be limited to paths that lead to exported filesystems.

     Note: previous versions of the protocol assigned special semantics
     to the names "." and "..". NFS version 4 assigns no special
     semantics to these names. The LOOKUPP operator must be used to
     look up a parent directory.

     Note that this operation does not follow symbolic links. The
     client is responsible for all parsing of filenames including
     filenames that are modified by symbolic links encountered during
     the lookup process.

     If the current filehandle supplied is not a directory but a
     symbolic link, the error NFS4ERR_SYMLINK is returned. For all
     other non-directory file types, the error NFS4ERR_NOTDIR is
     returned.
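
     As an illustration of the client-side parsing described above, the
     following sketch turns a slash-separated path into the list of
     operations for one COMPOUND request. The tuple representation of
     operations and the helper names build_lookup_compound() and
     crossed_mountpoint() are hypothetical. Mapping ".." to LOOKUPP
     and dropping "." and empty components is an assumption about how
     a UNIX-style client might choose to behave, since the protocol
     itself gives those names no special meaning.

```python
from typing import List, Tuple

Op = Tuple  # an operation is modelled as (name, optional argument)


def build_lookup_compound(dir_fh: bytes, path: str) -> List[Op]:
    """Build PUTFH + (LOOKUP|LOOKUPP, GETFH)* for a multi-component lookup."""
    ops: List[Op] = [("PUTFH", dir_fh)]
    for component in path.split("/"):
        if component in ("", "."):
            # A zero-length component would draw NFS4ERR_INVAL, so the
            # client filters it out rather than sending it.
            continue
        if component == "..":
            ops.append(("LOOKUPP",))  # parent lookup is a separate operation
        else:
            ops.append(("LOOKUP", component))
        ops.append(("GETFH",))  # obtain each intermediate filehandle
    return ops


def crossed_mountpoint(parent_fsid: Tuple[int, int],
                       child_fsid: Tuple[int, int]) -> bool:
    """A differing fsid means the directory looked up is a server mountpoint."""
    return parent_fsid != child_fsid
```

     The fsid values compared by crossed_mountpoint() would come from
     GETATTR requests on the two directories; those requests are not
     shown here.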

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADCHAR
     NFS4ERR_BADHANDLE
     NFS4ERR_BADNAME
     NFS4ERR_BADXDR
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTDIR
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_SYMLINK
     NFS4ERR_WRONGSEC

14.2.14. Operation 16: LOOKUPP - Lookup Parent Directory

   SYNOPSIS

     (cfh) -> (cfh)

   ARGUMENT

     /* CURRENT_FH: object */
     void;

   RESULT

     struct LOOKUPP4res {
             /* CURRENT_FH: directory */
             nfsstat4        status;
     };

   DESCRIPTION

     The current filehandle is assumed to refer to a regular directory
     or a named attribute directory. LOOKUPP assigns the filehandle for
     its parent directory to be the current filehandle. If there is no
     parent directory an NFS4ERR_NOENT error must be returned.
     Therefore, NFS4ERR_NOENT will be returned by the server when the
     current filehandle is at the root or top of the server's file tree.

   IMPLEMENTATION

     As for LOOKUP, LOOKUPP will also cross mountpoints.

     If the current filehandle is not a directory or named attribute
     directory, the error NFS4ERR_NOTDIR is returned.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_FHEXPIRED
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTDIR
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.15. Operation 17: NVERIFY - Verify Difference in Attributes

   SYNOPSIS

     (cfh), fattr -> -

   ARGUMENT

     struct NVERIFY4args {
             /* CURRENT_FH: object */
             fattr4          obj_attributes;
     };

   RESULT

     struct NVERIFY4res {
             nfsstat4        status;
     };

   DESCRIPTION

     This operation is used to prefix a sequence of operations to be
     performed if one or more attributes have changed on some filesystem
     object. If all the attributes match then the error NFS4ERR_SAME
     must be returned.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     This operation is useful as a cache validation operator. If the
     object to which the attributes belong has changed then the
     following operations may obtain new data associated with that
     object. For instance, to check if a file has been changed and
     obtain new data if it has:

           PUTFH (public)
           LOOKUP "foobar"
           NVERIFY attrbits attrs
           READ 0 32767

     In the case that a recommended attribute is specified in the
     NVERIFY operation and the server does not support that attribute
     for the filesystem object, the error NFS4ERR_ATTRNOTSUPP is
     returned to the client.

     When the attribute rdattr_error or any write-only attribute (e.g.
     time_modify_set) is specified, the error NFS4ERR_INVAL is returned
     to the client.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_ATTRNOTSUPP
     NFS4ERR_BADCHAR
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SAME
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.16. Operation 18: OPEN - Open a Regular File

   SYNOPSIS

     (cfh), seqid, share_access, share_deny, owner, openhow, claim ->
     (cfh), stateid, cinfo, rflags, open_confirm, attrset, delegation

   ARGUMENT

     struct OPEN4args {
             seqid4          seqid;
             uint32_t        share_access;
             uint32_t        share_deny;
             open_owner4     owner;
             openflag4       openhow;
             open_claim4     claim;
     };

     enum createmode4 {
             UNCHECKED4      = 0,
             GUARDED4        = 1,
             EXCLUSIVE4      = 2
     };

     union createhow4 switch (createmode4 mode) {
      case UNCHECKED4:
      case GUARDED4:
              fattr4         createattrs;
      case EXCLUSIVE4:
              verifier4      createverf;
     };

     enum opentype4 {
             OPEN4_NOCREATE  = 0,
             OPEN4_CREATE    = 1
     };

     union openflag4 switch (opentype4 opentype) {
      case OPEN4_CREATE:
              createhow4     how;
      default:
              void;
     };

     /* Next definitions used for OPEN delegation */
     enum limit_by4 {
             NFS_LIMIT_SIZE          = 1,
             NFS_LIMIT_BLOCKS        = 2
             /* others as needed */
     };

     struct nfs_modified_limit4 {
             uint32_t        num_blocks;
             uint32_t        bytes_per_block;
     };

     union nfs_space_limit4 switch (limit_by4 limitby) {
      /* limit specified as file size */
      case NFS_LIMIT_SIZE:
              uint64_t               filesize;
      /* limit specified by number of blocks */
      case NFS_LIMIT_BLOCKS:
              nfs_modified_limit4    mod_blocks;
     } ;

     enum open_delegation_type4 {
             OPEN_DELEGATE_NONE      = 0,
             OPEN_DELEGATE_READ      = 1,
             OPEN_DELEGATE_WRITE     = 2
     };

     enum open_claim_type4 {
             CLAIM_NULL              = 0,
             CLAIM_PREVIOUS          = 1,
             CLAIM_DELEGATE_CUR      = 2,
             CLAIM_DELEGATE_PREV     = 3
     };

     struct open_claim_delegate_cur4 {
             stateid4        delegate_stateid;
             component4      file;
     };

     union open_claim4 switch (open_claim_type4 claim) {
      /*
       * No special rights to file. Ordinary OPEN of the specified file.
       */
      case CLAIM_NULL:
              /* CURRENT_FH: directory */
              component4     file;

      /*
       * Right to the file established by an open previous to server
       * reboot. File identified by filehandle obtained at that time
       * rather than by name.
       */
      case CLAIM_PREVIOUS:
              /* CURRENT_FH: file being reclaimed */
              open_delegation_type4   delegate_type;

      /*
       * Right to file based on a delegation granted by the server.
       * File is specified by name.
       */
      case CLAIM_DELEGATE_CUR:
              /* CURRENT_FH: directory */
              open_claim_delegate_cur4        delegate_cur_info;

      /* Right to file based on a delegation granted to a previous boot
       * instance of the client. File is specified by name.
       */
      case CLAIM_DELEGATE_PREV:
              /* CURRENT_FH: directory */
              component4     file_delegate_prev;
     };

   RESULT

     struct open_read_delegation4 {
             stateid4        stateid;        /* Stateid for delegation */
             bool            recall;         /* Pre-recalled flag for
                                                delegations obtained
                                                by reclaim
                                                (CLAIM_PREVIOUS) */
             nfsace4         permissions;    /* Defines users who don't
                                                need an ACCESS call to
                                                open for read */
     };

     struct open_write_delegation4 {
             stateid4        stateid;        /* Stateid for delegation */
             bool            recall;         /* Pre-recalled flag for
                                                delegations obtained
                                                by reclaim
                                                (CLAIM_PREVIOUS) */
             nfs_space_limit4 space_limit;   /* Defines condition that
                                                the client must check to
                                                determine whether the
                                                file needs to be flushed
                                                to the server on close.
                                                */
             nfsace4         permissions;    /* Defines users who don't
                                                need an ACCESS call as
                                                part of a delegated
                                                open.
                                                */
     };

     union open_delegation4
     switch (open_delegation_type4 delegation_type) {
             case OPEN_DELEGATE_NONE:
                     void;
             case OPEN_DELEGATE_READ:
                     open_read_delegation4 read;
             case OPEN_DELEGATE_WRITE:
                     open_write_delegation4 write;
     };

     const OPEN4_RESULT_CONFIRM      = 0x00000002;
     const OPEN4_RESULT_LOCKTYPE_POSIX = 0x00000004;

     struct OPEN4resok {
             stateid4        stateid;        /* Stateid for open */
             change_info4    cinfo;          /* Directory Change Info */
             uint32_t        rflags;         /* Result flags */
             bitmap4         attrset;        /* attributes on create */
             open_delegation4 delegation;    /* Info on any open
                                                delegation */
     };

     union OPEN4res switch (nfsstat4 status) {
      case NFS4_OK:
             /* CURRENT_FH: opened file */
             OPEN4resok      resok4;
      default:
             void;
     };

   WARNING TO CLIENT IMPLEMENTORS

     OPEN resembles LOOKUP in that it generates a filehandle for the
     client to use. Unlike LOOKUP though, OPEN creates server state on
     the filehandle. In normal circumstances, the client can only
     release this state with a CLOSE operation. CLOSE uses the current
     filehandle to determine which file to close. Therefore the client
     MUST follow every OPEN operation with a GETFH operation in the same
     COMPOUND procedure. This will supply the client with the
     filehandle such that CLOSE can be used appropriately.

     Simply waiting for the lease on the file to expire is insufficient
     because the server may maintain the state indefinitely as long as
     another client does not attempt to make a conflicting access to the
     same file.

   DESCRIPTION

     The OPEN operation creates and/or opens a regular file in a
     directory with the provided name. If the file does not exist at
     the server and creation is desired, specification of the method of
     creation is provided by the openhow parameter.
     The client has the choice of three creation methods: UNCHECKED,
     GUARDED, or EXCLUSIVE.

     If the current filehandle is a named attribute directory, OPEN will
     then create or open a named attribute file. Note that exclusive
     create of a named attribute is not supported. If the createmode is
     EXCLUSIVE4 and the current filehandle is a named attribute
     directory, the server will return NFS4ERR_INVAL.

     UNCHECKED means that the file should be created if a file of that
     name does not exist and encountering an existing regular file of
     that name is not an error. For this type of create, createattrs
     specifies the initial set of attributes for the file. The set of
     attributes may include any writable attribute valid for regular
     files. When an UNCHECKED create encounters an existing file, the
     attributes specified by createattrs are not used, except that when
     a size of zero is specified, the existing file is truncated. If
     GUARDED is specified, the server checks for the presence of a
     duplicate object by name before performing the create. If a
     duplicate exists, an error of NFS4ERR_EXIST is returned as the
     status. If the object does not exist, the request is performed as
     described for UNCHECKED. For each of these cases (UNCHECKED and
     GUARDED) where the operation is successful, the server will return
     to the client an attribute mask signifying which attributes were
     successfully set for the object.

     EXCLUSIVE specifies that the server is to follow exclusive creation
     semantics, using the verifier to ensure exclusive creation of the
     target. The server should check for the presence of a duplicate
     object by name. If the object does not exist, the server creates
     the object and stores the verifier with the object.
     If the object does exist and the stored verifier matches the client
     provided verifier, the server uses the existing object as the newly
     created object. If the stored verifier does not match, then an
     error of NFS4ERR_EXIST is returned. No attributes may be provided
     in this case, since the server may use an attribute of the target
     object to store the verifier. If the server uses an attribute to
     store the exclusive create verifier, it will signify which
     attribute by setting the appropriate bit in the attribute mask that
     is returned in the results.

     For the target directory, the server returns change_info4
     information in cinfo. With the atomic field of the change_info4
     struct, the server will indicate if the before and after change
     attributes were obtained atomically with respect to the link
     creation.

     Upon successful creation, the current filehandle is replaced by
     that of the new object.

     The OPEN operation provides for Windows share reservation
     capability with the use of the share_access and share_deny fields
     of the OPEN arguments. The client specifies at OPEN the required
     share_access and share_deny modes. For clients that do not
     directly support SHAREs (i.e. UNIX), the expected deny value is
     DENY_NONE. In the case that there is an existing SHARE reservation
     that conflicts with the OPEN request, the server returns the error
     NFS4ERR_SHARE_DENIED. For a complete SHARE request, the client
     must provide values for the owner and seqid fields for the OPEN
     argument. For additional discussion of SHARE semantics see the
     section on 'Share Reservations'.

     In the case that the client is recovering state from a server
     failure, the claim field of the OPEN argument is used to signify
     that the request is meant to reclaim state previously held.
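
     The three creation methods described earlier in this section can be
     summarized with a small model. This is a sketch under simplifying
     assumptions, not a server implementation: the directory is a plain
     dictionary, FileObj and open_create() are hypothetical names,
     status codes are represented by their names as strings, and
     attributes other than size are ignored.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Values as given in the createmode4 enumeration above.
UNCHECKED4, GUARDED4, EXCLUSIVE4 = 0, 1, 2


@dataclass
class FileObj:
    """Hypothetical stand-in for a regular file and its stored verifier."""
    size: int = 0
    verifier: Optional[bytes] = None


def open_create(directory: Dict[str, FileObj], name: str, mode: int,
                size: Optional[int] = None,
                verifier: Optional[bytes] = None) -> str:
    existing = directory.get(name)

    if mode == EXCLUSIVE4:
        if existing is None:
            # Create the object and store the verifier with it.
            directory[name] = FileObj(verifier=verifier)
            return "NFS4_OK"
        # A matching verifier is presumed to be a retransmitted create.
        return "NFS4_OK" if existing.verifier == verifier else "NFS4ERR_EXIST"

    if existing is not None:
        if mode == GUARDED4:
            return "NFS4ERR_EXIST"
        # UNCHECKED: createattrs are ignored, except that size 0 truncates.
        if size == 0:
            existing.size = 0
        return "NFS4_OK"

    directory[name] = FileObj(size=size or 0)
    return "NFS4_OK"
```

     A real server would keep the verifier in stable storage and would
     report the attributes actually set in the returned attribute mask;
     both are omitted here for brevity.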

     The "claim" field of the OPEN argument is used to specify the file
     to be opened and the state information which the client claims to
     possess. There are four basic claim types which cover the various
     situations for an OPEN. They are as follows:

     CLAIM_NULL
                           For the client, this is a new OPEN
                           request and there is no previous state
                           associated with the file for the client.

     CLAIM_PREVIOUS
                           The client is claiming basic OPEN state
                           for a file that was held previous to a
                           server reboot. Generally used when a
                           server is returning persistent
                           filehandles; the client may not have the
                           file name to reclaim the OPEN.

     CLAIM_DELEGATE_CUR
                           The client is claiming a delegation for
                           OPEN as granted by the server.
                           Generally this is done as part of
                           recalling a delegation.

     CLAIM_DELEGATE_PREV
                           The client is claiming a delegation
                           granted to a previous client instance;
                           used after the client reboots. The
                           server MAY support CLAIM_DELEGATE_PREV.
                           If it does support CLAIM_DELEGATE_PREV,
                           SETCLIENTID_CONFIRM MUST NOT remove the
                           client's delegation state, and the
                           server MUST support the DELEGPURGE
                           operation.

     For OPEN requests whose claim type is other than CLAIM_PREVIOUS
     (i.e. requests other than those devoted to reclaiming opens after a
     server reboot) that reach the server during its grace or lease
     expiration period, the server returns an error of NFS4ERR_GRACE.

     For any OPEN request, the server may return an open delegation,
     which allows further opens and closes to be handled locally on the
     client as described in the section Open Delegation. Note that
     delegation is up to the server to decide. The client should never
     assume that delegation will or will not be granted in a particular
     instance. It should always be prepared for either case.
     A partial exception is the reclaim (CLAIM_PREVIOUS) case, in which
     a delegation type is claimed. In this case, delegation will always
     be granted, although the server may specify an immediate recall in
     the delegation structure.

     The rflags returned by a successful OPEN allow the server to return
     information governing how the open file is to be handled.
     OPEN4_RESULT_CONFIRM indicates that the client MUST execute an
     OPEN_CONFIRM operation before using the open file.
     OPEN4_RESULT_LOCKTYPE_POSIX indicates the server's file locking
     behavior supports the complete set of POSIX locking techniques.
     From this the client can choose to manage file locking state in a
     way to handle a mismatch of file locking management.

     If the component is of zero length, NFS4ERR_INVAL will be returned.
     The component is also subject to the normal UTF-8, character
     support, and name checks. See the section "UTF-8 Related Errors"
     for further discussion.

     When an OPEN is done and the specified lockowner already has the
     resulting filehandle open, the result is to "OR" together the new
     share and deny status with the existing status. In this case, only
     a single CLOSE need be done, even though multiple OPENs were
     completed. When such an OPEN is done, checking of share
     reservations for the new OPEN proceeds normally, with no exception
     for the existing OPEN held by the same lockowner.

     If the underlying filesystem at the server is only accessible in a
     read-only mode and the OPEN request has specified ACCESS_WRITE or
     ACCESS_BOTH, the server will return NFS4ERR_ROFS to indicate a
     read-only filesystem.

     As with the CREATE operation, the server MUST derive the owner,
     owner ACE, group, or group ACE if any of the four attributes are
     required and supported by the server's filesystem.
     For an OPEN with the EXCLUSIVE4 createmode, the server has no
     choice, since such OPEN calls do not include the createattrs field.
     Conversely, if createattrs is specified, and includes owner or
     group (or corresponding ACEs) that the principal in the RPC call's
     credentials does not have authorization to create files for, then
     the server may return NFS4ERR_PERM.

     In the case of an OPEN which specifies a size of zero (e.g.
     truncation) and the file has named attributes, the named attributes
     are left as is. They are not removed.

   IMPLEMENTATION

     The OPEN operation contains support for EXCLUSIVE create. The
     mechanism is similar to the support in NFS version 3 [RFC1813]. As
     in NFS version 3, this mechanism provides reliable exclusive
     creation. Exclusive create is invoked when the how parameter is
     EXCLUSIVE. In this case, the client provides a verifier that can
     reasonably be expected to be unique. A combination of a client
     identifier, perhaps the client network address, and a unique number
     generated by the client, perhaps the RPC transaction identifier,
     may be appropriate.

     If the object does not exist, the server creates the object and
     stores the verifier in stable storage. For filesystems that do not
     provide a mechanism for the storage of arbitrary file attributes,
     the server may use one or more elements of the object meta-data to
     store the verifier. The verifier must be stored in stable storage
     to prevent erroneous failure on retransmission of the request. It
     is assumed that an exclusive create is being performed because
     exclusive semantics are critical to the application. Because of the
     expected usage, exclusive CREATE does not rely solely on the
     normally volatile duplicate request cache for storage of the
     verifier.
     The duplicate request cache in volatile storage does not survive a
     crash and may actually flush on a long network partition, opening
     failure windows. In the UNIX local filesystem environment, the
     expected storage location for the verifier on creation is the
     meta-data (time stamps) of the object. For this reason, an
     exclusive object create may not include initial attributes because
     the server would have nowhere to store the verifier.

     If the server cannot support these exclusive create semantics,
     possibly because of the requirement to commit the verifier to
     stable storage, it should fail the OPEN request with the error
     NFS4ERR_NOTSUPP.

     During an exclusive CREATE request, if the object already exists,
     the server reconstructs the object's verifier and compares it with
     the verifier in the request. If they match, the server treats the
     request as a success. The request is presumed to be a duplicate of
     an earlier, successful request for which the reply was lost and
     that the server duplicate request cache mechanism did not detect.
     If the verifiers do not match, the request is rejected with the
     status NFS4ERR_EXIST.

     Once the client has performed a successful exclusive create, it
     must issue a SETATTR to set the correct object attributes. Until
     it does so, it should not rely upon any of the object attributes,
     since the server implementation may need to overload object
     meta-data to store the verifier. The subsequent SETATTR must not
     occur in the same COMPOUND request as the OPEN. This separation
     will guarantee that the exclusive create mechanism will continue to
     function properly in the face of retransmission of the request.

     Use of the GUARDED attribute does not provide exactly-once
     semantics.
     In particular, if a reply is lost and the server does not detect
     the retransmission of the request, the operation can fail with
     NFS4ERR_EXIST, even though the create was performed successfully.
     The client would use this behavior in the case that the application
     has not requested an exclusive create but has asked to have the
     file truncated when the file is opened. In the case of the client
     timing out and retransmitting the create request, the client can
     use GUARDED to prevent a sequence like create, write, create
     (retransmitted) from occurring.

     For SHARE reservations, the client must specify a value for
     share_access that is one of READ, WRITE, or BOTH. For share_deny,
     the client must specify one of NONE, READ, WRITE, or BOTH. If the
     client fails to do this, the server must return NFS4ERR_INVAL.

     Based on the share_access value (READ, WRITE, or BOTH) the client
     should check that the requester has the proper access rights to
     perform the specified operation. This would generally be the
     results of applying the ACL access rules to the file for the
     current requester. However, just as with the ACCESS operation, the
     client should not attempt to second-guess the server's decisions,
     as access rights may change and may be subject to server
     administrative controls outside the ACL framework. If the
     requester is not authorized to READ or WRITE (depending on the
     share_access value), the server must return NFS4ERR_ACCESS. Note
     that since the NFS version 4 protocol does not impose any
     requirement that READs and WRITEs issued for an open file have the
     same credentials as the OPEN itself, the server still must do
     appropriate access checking on the READs and WRITEs themselves.
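
     The share reservation rules above (validation of share_access and
     share_deny, conflict detection, and the "OR" of a repeated OPEN by
     the same owner) can be sketched as follows. The helper names and
     the per-file dictionary of opens are hypothetical, and status
     codes are represented by their names as strings. The bit values
     are assumed to be those of the OPEN4_SHARE_ACCESS_* and
     OPEN4_SHARE_DENY_* constants defined elsewhere in this
     specification.

```python
from typing import Dict, Tuple

# Assumed values of the share_access / share_deny constants.
ACCESS_READ, ACCESS_WRITE, ACCESS_BOTH = 1, 2, 3
DENY_NONE, DENY_READ, DENY_WRITE, DENY_BOTH = 0, 1, 2, 3


def share_conflict(new_access: int, new_deny: int,
                   cur_access: int, cur_deny: int) -> bool:
    """True if the new OPEN wants what is denied, or denies what is in use."""
    return bool(new_access & cur_deny) or bool(new_deny & cur_access)


def open_share(opens: Dict[str, Tuple[int, int]], owner: str,
               access: int, deny: int) -> str:
    """Apply one OPEN's share reservation to the per-file table `opens`."""
    if access not in (ACCESS_READ, ACCESS_WRITE, ACCESS_BOTH):
        return "NFS4ERR_INVAL"
    if deny not in (DENY_NONE, DENY_READ, DENY_WRITE, DENY_BOTH):
        return "NFS4ERR_INVAL"

    # Checking proceeds normally, with no exception for an existing
    # OPEN held by the same owner.
    for cur_access, cur_deny in opens.values():
        if share_conflict(access, deny, cur_access, cur_deny):
            return "NFS4ERR_SHARE_DENIED"

    # A repeated OPEN by the same owner ORs the new modes into the old.
    old_access, old_deny = opens.get(owner, (0, 0))
    opens[owner] = (old_access | access, old_deny | deny)
    return "NFS4_OK"
```

     Access checking against the file's ACL, which the server performs
     in addition to the share check, is deliberately left out of this
     sketch.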

     If the component provided to OPEN is a symbolic link, the error
     NFS4ERR_SYMLINK will be returned to the client. If the current
     filehandle is not a directory, the error NFS4ERR_NOTDIR will be
     returned.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_ADMIN_REVOKED
     NFS4ERR_ATTRNOTSUPP
     NFS4ERR_BADCHAR
     NFS4ERR_BADHANDLE
     NFS4ERR_BADNAME
     NFS4ERR_BADOWNER
     NFS4ERR_BAD_SEQID
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DQUOT
     NFS4ERR_EXIST
     NFS4ERR_EXPIRED
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_IO
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOSPC
     NFS4ERR_NOTDIR
     NFS4ERR_NO_GRACE
     NFS4ERR_PERM
     NFS4ERR_RECLAIM_BAD
     NFS4ERR_RECLAIM_CONFLICT
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_SHARE_DENIED
     NFS4ERR_STALE
     NFS4ERR_STALE_CLIENTID
     NFS4ERR_SYMLINK
     NFS4ERR_WRONGSEC

14.2.17. Operation 19: OPENATTR - Open Named Attribute Directory

   SYNOPSIS

     (cfh) createdir -> (cfh)

   ARGUMENT

     struct OPENATTR4args {
             /* CURRENT_FH: object */
             bool    createdir;
     };

   RESULT

     struct OPENATTR4res {
             /* CURRENT_FH: named attr directory */
             nfsstat4        status;
     };

   DESCRIPTION

     The OPENATTR operation is used to obtain the filehandle of the
     named attribute directory associated with the current filehandle.
     The result of the OPENATTR will be a filehandle to an object of
     type NF4ATTRDIR. From this filehandle, READDIR and LOOKUP
     operations can be used to obtain filehandles for the various named
     attributes associated with the original filesystem object.
8667 Filehandles returned within the named attribute directory will have 8668 a type of NF4NAMEDATTR. 8670 The createdir argument allows the client to signify whether a named 8671 attribute directory should be created as a result of the OPENATTR 8672 operation. Some clients may use the OPENATTR operation with a 8673 value of FALSE for createdir to determine if any named attributes 8674 exist for the object. If none exist, then NFS4ERR_NOENT will be 8675 returned. If createdir has a value of TRUE and no named attribute 8676 directory exists, one is created. The creation of a named 8677 attribute directory assumes that the server has implemented named 8678 attribute support in this fashion; the server is not required to do so by 8679 this definition. 8681 IMPLEMENTATION 8683 If the server does not support named attributes for the current 8684 filehandle, an error of NFS4ERR_NOTSUPP will be returned to the 8685 client. 8689 ERRORS 8691 NFS4ERR_ACCESS 8692 NFS4ERR_BADHANDLE 8693 NFS4ERR_BADXDR 8694 NFS4ERR_DELAY 8695 NFS4ERR_DQUOT 8696 NFS4ERR_FHEXPIRED 8697 NFS4ERR_IO 8698 NFS4ERR_MOVED 8699 NFS4ERR_NOENT 8700 NFS4ERR_NOFILEHANDLE 8701 NFS4ERR_NOSPC 8702 NFS4ERR_NOTSUPP 8703 NFS4ERR_RESOURCE 8704 NFS4ERR_ROFS 8705 NFS4ERR_SERVERFAULT 8706 NFS4ERR_STALE 8710 14.2.18. Operation 20: OPEN_CONFIRM - Confirm Open 8712 SYNOPSIS 8714 (cfh), seqid, stateid -> stateid 8716 ARGUMENT 8718 struct OPEN_CONFIRM4args { 8719 /* CURRENT_FH: opened file */ 8720 seqid4 seqid; 8721 stateid4 stateid; 8722 }; 8724 RESULT 8726 struct OPEN_CONFIRM4resok { 8727 stateid4 stateid; 8728 }; 8730 union OPEN_CONFIRM4res switch (nfsstat4 status) { 8731 case NFS4_OK: 8732 OPEN_CONFIRM4resok resok4; 8733 default: 8734 void; 8735 }; 8737 DESCRIPTION 8739 This operation is used to confirm the sequence id usage for the 8740 first time that an open_owner is used by a client.
The stateid 8741 returned from the OPEN operation is used as the argument for this 8742 operation along with the next sequence id for the open_owner. The 8743 sequence id passed to the OPEN_CONFIRM must be 1 (one) greater than 8744 the seqid passed to the OPEN operation from which the open_confirm 8745 value was obtained. If the server receives an unexpected sequence 8746 id with respect to the original open, then the server assumes that 8747 the client will not confirm the original OPEN and all state 8748 associated with the original OPEN is released by the server. 8750 On success, the current filehandle retains its value. 8752 IMPLEMENTATION 8754 A given client might generate many open_owner4 data structures for 8755 a given clientid. The client will periodically either dispose of 8756 its open_owner4s or stop using them for indefinite periods of time. 8757 The latter situation is why the NFS version 4 protocol does not 8761 have an explicit operation to exit an open_owner4: such an 8762 operation is of no use in that situation. Instead, to avoid 8763 unbounded memory use, the server needs to implement a strategy for 8764 disposing of open_owner4s that have no current lock, open, or 8765 delegation state for any files and have not been used recently. 8766 The time period used to determine when to dispose of open_owner4s 8767 is an implementation choice. The time period should certainly be 8768 no less than the lease time plus any grace period the server wishes 8769 to implement beyond a lease time. The OPEN_CONFIRM operation 8770 allows the server to safely dispose of unused open_owner4 data 8771 structures. 8773 In the case that a client issues an OPEN operation and the server 8774 no longer has a record of the open_owner4, the server needs to ensure 8775 that this is a new OPEN and not a replay or retransmission.
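The sequence id rule given in the DESCRIPTION of OPEN_CONFIRM can be pictured with a small sketch. Everything here is illustrative: the function name, the dictionary used to track unconfirmed opens, the choice of NFS4ERR_BAD_SEQID as the returned status, and the wraparound at 2^32 are assumptions rather than statements of this specification.

```python
NFS4_OK = 0
NFS4ERR_BAD_SEQID = 10026

# seqid4 is a 32-bit unsigned value; wrapping to zero is an assumption here.
SEQID_MODULUS = 2 ** 32


def handle_open_confirm(unconfirmed, open_owner, seqid):
    """Hypothetical server-side handling of OPEN_CONFIRM.

    unconfirmed maps an open_owner to the seqid of the OPEN that is
    awaiting confirmation. Returns (status, confirmed).
    """
    # The entry leaves the unconfirmed table in both outcomes: it is
    # either promoted to confirmed state or released.
    open_seqid = unconfirmed.pop(open_owner)
    expected = (open_seqid + 1) % SEQID_MODULUS
    if seqid != expected:
        # Unexpected sequence id: state for the original OPEN is released.
        return NFS4ERR_BAD_SEQID, False
    return NFS4_OK, True
```

A real server would also compare the stateid and would track the confirmed state elsewhere; both are omitted to keep the sketch short.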
8777 Servers must not require confirmation on OPENs that grant 8778 delegations or are doing reclaim operations. See section "Use of 8779 Open Confirmation" for details. The server can easily avoid this 8780 by noting whether it has disposed of one open_owner4 for the given 8781 clientid. If the server does not support delegation, it might 8782 simply maintain a single bit that notes whether any open_owner4 8783 (for any client) has been disposed of. 8785 The server must hold unconfirmed OPEN state until one of three 8786 events occurs. First, the client sends an OPEN_CONFIRM request with 8787 the appropriate sequence id and stateid within the lease period. 8788 In this case, the OPEN state on the server goes to confirmed, and 8789 the open_owner4 on the server is fully established. 8791 Second, the client sends another OPEN request with a sequence id 8792 that is incorrect for the open_owner4 (out of sequence). In this 8793 case, the server assumes the second OPEN request is valid and the 8794 first one is a replay. The server cancels the OPEN state of the 8795 first OPEN request, establishes an unconfirmed OPEN state for the 8796 second OPEN request, and responds to the second OPEN request with 8797 an indication that an OPEN_CONFIRM is needed. The process then 8798 repeats itself. While there is a potential for a denial of service 8799 attack on the client, it is mitigated if the client and server 8800 require the use of a security flavor based on Kerberos V5, LIPKEY, 8801 or some other flavor that uses cryptography. 8803 What if the server is in the unconfirmed OPEN state for a given 8804 open_owner4, and it receives an operation on the open_owner4 that 8805 has a stateid but the operation is not OPEN, or it is OPEN_CONFIRM 8806 but with the wrong stateid?
Then, even if the seqid is correct, 8807 the server returns NFS4ERR_BAD_STATEID, because the server assumes 8808 the operation is a replay: if the server has no established OPEN 8809 state, then there is no way, for example, a LOCK operation could be 8810 valid. 8812 Third, neither of the two aforementioned events occurs for the 8816 open_owner4 within the lease period. In this case, the OPEN state 8817 is canceled and disposal of the open_owner4 can occur. 8819 ERRORS 8821 NFS4ERR_ADMIN_REVOKED 8822 NFS4ERR_BADHANDLE 8823 NFS4ERR_BAD_SEQID 8824 NFS4ERR_BAD_STATEID 8825 NFS4ERR_BADXDR 8826 NFS4ERR_EXPIRED 8827 NFS4ERR_FHEXPIRED 8828 NFS4ERR_INVAL 8829 NFS4ERR_ISDIR 8830 NFS4ERR_MOVED 8831 NFS4ERR_NOFILEHANDLE 8832 NFS4ERR_OLD_STATEID 8833 NFS4ERR_RESOURCE 8834 NFS4ERR_SERVERFAULT 8835 NFS4ERR_STALE 8836 NFS4ERR_STALE_STATEID 8840 14.2.19. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access 8842 SYNOPSIS 8844 (cfh), stateid, seqid, access, deny -> stateid 8846 ARGUMENT 8848 struct OPEN_DOWNGRADE4args { 8849 /* CURRENT_FH: opened file */ 8850 stateid4 stateid; 8851 seqid4 seqid; 8852 uint32_t share_access; 8853 uint32_t share_deny; 8854 }; 8856 RESULT 8858 struct OPEN_DOWNGRADE4resok { 8859 stateid4 stateid; 8860 }; 8862 union OPEN_DOWNGRADE4res switch (nfsstat4 status) { 8863 case NFS4_OK: 8864 OPEN_DOWNGRADE4resok resok4; 8865 default: 8866 void; 8867 }; 8869 DESCRIPTION 8871 This operation is used to adjust the share_access and share_deny 8872 bits for a given open. This is necessary when a given lockowner 8873 opens the same file multiple times with different share_access and 8874 share_deny flags. In this situation, a close of one of the opens 8875 may change the appropriate share_access and share_deny flags to 8876 remove bits associated with opens no longer in effect.
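To make the bit adjustment concrete, the following sketch computes the share bits that remain after some opens go away, and checks a requested downgrade against the constraint stated in this operation's DESCRIPTION (the new bits must equal the union over some subset of the opens in effect). The function names and the representation of an open as an (access, deny) pair are hypothetical.

```python
from itertools import combinations


def union_bits(opens):
    """Return the OR of share_access and share_deny over (access, deny) pairs."""
    access = 0
    deny = 0
    for open_access, open_deny in opens:
        access |= open_access
        deny |= open_deny
    return access, deny


def is_valid_downgrade(opens, share_access, share_deny):
    """Check whether the requested bits equal the union of some non-empty
    subset of the opens currently in effect.

    Brute force over subsets; adequate for the handful of opens an
    owner typically holds on one file.
    """
    for size in range(1, len(opens) + 1):
        for subset in combinations(opens, size):
            if union_bits(subset) == (share_access, share_deny):
                return True
    return False
```

For example, with one open holding access READ and deny NONE and another holding access WRITE and deny READ, closing the second leaves (READ, NONE) as the bits a client would pass to OPEN_DOWNGRADE.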
8878 The share_access and share_deny bits specified in this operation 8879 replace the current ones for the specified open file. The 8880 share_access and share_deny bits specified must be exactly equal to 8881 the union of the share_access and share_deny bits specified for 8882 some subset of the OPENs in effect for the current openowner on the 8883 current file. If that constraint is not respected, the error 8884 NFS4ERR_INVAL should be returned. Since share_access and 8885 share_deny bits are subsets of those already granted, it is not 8886 possible for this request to be denied because of conflicting share 8887 reservations. 8891 On success, the current filehandle retains its value. 8893 ERRORS 8895 NFS4ERR_ADMIN_REVOKED 8896 NFS4ERR_BADHANDLE 8897 NFS4ERR_BAD_SEQID 8898 NFS4ERR_BAD_STATEID 8899 NFS4ERR_BADXDR 8900 NFS4ERR_EXPIRED 8901 NFS4ERR_FHEXPIRED 8902 NFS4ERR_INVAL 8903 NFS4ERR_MOVED 8904 NFS4ERR_NOFILEHANDLE 8905 NFS4ERR_OLD_STATEID 8906 NFS4ERR_RESOURCE 8907 NFS4ERR_SERVERFAULT 8908 NFS4ERR_STALE 8909 NFS4ERR_STALE_STATEID 8913 14.2.20. Operation 22: PUTFH - Set Current Filehandle 8915 SYNOPSIS 8917 filehandle -> (cfh) 8919 ARGUMENT 8921 struct PUTFH4args { 8922 nfs_fh4 object; 8923 }; 8925 RESULT 8927 struct PUTFH4res { 8928 /* CURRENT_FH: */ 8929 nfsstat4 status; 8930 }; 8932 DESCRIPTION 8934 Replaces the current filehandle with the filehandle provided as an 8935 argument. 8937 If the security mechanism used by the requester does not meet the 8938 requirements of the filehandle provided to this operation, the 8939 server MUST return NFS4ERR_WRONGSEC. 8941 IMPLEMENTATION 8943 Commonly used as the first operator in an NFS request to set the 8944 context for following operations.
8946 ERRORS 8948 NFS4ERR_BADHANDLE 8949 NFS4ERR_BADXDR 8950 NFS4ERR_FHEXPIRED 8951 NFS4ERR_MOVED 8952 NFS4ERR_RESOURCE 8953 NFS4ERR_SERVERFAULT 8954 NFS4ERR_STALE 8955 NFS4ERR_WRONGSEC 8957 Draft Specification NFS version 4 Protocol September 2002 8959 14.2.21. Operation 23: PUTPUBFH - Set Public Filehandle 8961 SYNOPSIS 8963 - -> (cfh) 8965 ARGUMENT 8967 void; 8969 RESULT 8971 struct PUTPUBFH4res { 8972 /* CURRENT_FH: public fh */ 8973 nfsstat4 status; 8974 }; 8976 DESCRIPTION 8978 Replaces the current filehandle with the filehandle that represents 8979 the public filehandle of the server's name space. This filehandle 8980 may be different from the "root" filehandle which may be associated 8981 with some other directory on the server. 8983 The public filehandle represents the concepts embodied in 8984 [RFC2054], [RFC2055], [RFC2224]. The intent for NFS version 4 is 8985 that the public filehandle (represented by the PUTPUBFH operation) 8986 be used as a method of providing WebNFS server compatibility with 8987 NFS versions 2 and 3. 8989 The public filehandle and the root filehandle (represented by the 8990 PUTROOTFH operation) should be equivalent. If the public and root 8991 filehandles are not equivalent, then the public filehandle MUST be 8992 a descendant of the root filehandle. 8994 IMPLEMENTATION 8996 Used as the first operator in an NFS request to set the context for 8997 following operations. 8999 With the NFS version 2 and 3 public filehandle, the client is able 9000 to specify whether the path name provided in the LOOKUP should be 9001 evaluated as either an absolute path relative to the server's root 9002 or relative to the public filehandle. [RFC2224] contains further 9003 discussion of the functionality. With NFS version 4, that type of 9004 specification is not directly available in the LOOKUP operation. 9005 The reason for this is because the component separators needed to 9006 specify absolute vs. relative are not allowed in NFS version 4. 
9010 Therefore, the client is responsible for constructing its request 9011 such that either PUTROOTFH or PUTPUBFH is used to 9012 signify absolute or relative evaluation of an NFS URL, respectively. 9014 Note that there are warnings mentioned in [RFC2224] with respect to 9015 the use of absolute evaluation and the restrictions the server may 9016 place on that evaluation with respect to how much of its namespace 9017 has been made available. These same warnings apply to NFS version 9018 4. It is likely, therefore, that because of server implementation 9019 details, an NFS version 3 absolute public filehandle lookup may 9020 behave differently than an NFS version 4 absolute resolution. 9022 There is a form of security negotiation as described in [RFC2755] 9023 that uses the public filehandle as a method of employing SNEGO. This 9024 method is not available with NFS version 4 as filehandles are not 9025 overloaded with special meaning and therefore do not provide the 9026 same framework as NFS versions 2 and 3. Clients should therefore 9027 use the security negotiation mechanisms described in this RFC. 9029 ERRORS 9031 NFS4ERR_RESOURCE 9032 NFS4ERR_SERVERFAULT 9033 NFS4ERR_WRONGSEC 9037 14.2.22. Operation 24: PUTROOTFH - Set Root Filehandle 9039 SYNOPSIS 9041 - -> (cfh) 9043 ARGUMENT 9045 void; 9047 RESULT 9049 struct PUTROOTFH4res { 9050 /* CURRENT_FH: root fh */ 9051 nfsstat4 status; 9052 }; 9054 DESCRIPTION 9056 Replaces the current filehandle with the filehandle that represents 9057 the root of the server's name space. From this filehandle a LOOKUP 9058 operation can locate any other filehandle on the server. This 9059 filehandle may be different from the "public" filehandle which may 9060 be associated with some other directory on the server.
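As a sketch of the client's responsibility described for PUTPUBFH and PUTROOTFH, the following builds the operation list of a COMPOUND request that resolves a path one component at a time. The function name, the tuple representation of operations, and the trailing GETFH are illustrative assumptions, not protocol requirements; how the client decides that a URL is absolute is left to the caller.

```python
def build_lookup_compound(path, absolute):
    """Return a list of operation tuples for resolving path.

    absolute=True starts from the root filehandle (PUTROOTFH);
    absolute=False starts from the public filehandle (PUTPUBFH).
    """
    ops = [("PUTROOTFH",) if absolute else ("PUTPUBFH",)]
    # LOOKUP takes a single component, so separators never reach the server.
    for component in path.split("/"):
        if component:
            ops.append(("LOOKUP", component))
    # Fetch the resulting filehandle so the client can reuse it later.
    ops.append(("GETFH",))
    return ops
```

Because each LOOKUP carries exactly one component, the choice between absolute and relative evaluation is expressed solely by the first operation of the request.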
9062 IMPLEMENTATION 9064 Commonly used as the first operator in an NFS request to set the 9065 context for following operations. 9067 ERRORS 9069 NFS4ERR_RESOURCE 9070 NFS4ERR_SERVERFAULT 9071 NFS4ERR_WRONGSEC 9073 Draft Specification NFS version 4 Protocol September 2002 9075 14.2.23. Operation 25: READ - Read from File 9077 SYNOPSIS 9079 (cfh), stateid, offset, count -> eof, data 9081 ARGUMENT 9083 struct READ4args { 9084 /* CURRENT_FH: file */ 9085 stateid4 stateid; 9086 offset4 offset; 9087 count4 count; 9088 }; 9090 RESULT 9092 struct READ4resok { 9093 bool eof; 9094 opaque data<>; 9095 }; 9097 union READ4res switch (nfsstat4 status) { 9098 case NFS4_OK: 9099 READ4resok resok4; 9100 default: 9101 void; 9102 }; 9104 DESCRIPTION 9106 The READ operation reads data from the regular file identified by 9107 the current filehandle. 9109 The client provides an offset of where the READ is to start and a 9110 count of how many bytes are to be read. An offset of 0 (zero) 9111 means to read data starting at the beginning of the file. If 9112 offset is greater than or equal to the size of the file, the 9113 status, NFS4_OK, is returned with a data length set to 0 (zero) and 9114 eof is set to TRUE. The READ is subject to access permissions 9115 checking. 9117 If the client specifies a count value of 0 (zero), the READ 9118 succeeds and returns 0 (zero) bytes of data again subject to access 9119 permissions checking. The server may choose to return fewer bytes 9120 than specified by the client. The client needs to check for this 9121 condition and handle the condition appropriately. 9123 The stateid value for a READ request represents a value returned 9125 Draft Specification NFS version 4 Protocol September 2002 9127 from a previous record lock or share reservation request. The 9128 stateid is used by the server to verify that the associated share 9129 reservation and any record locks are still valid and to update 9130 lease timeouts for the client. 
9132 If the read ended at the end-of-file (formally, in a correctly 9133 formed READ request, if offset + count is equal to the size of the 9134 file), or the read request extends beyond the size of the file (if 9135 offset + count is greater than the size of the file), eof is 9136 returned as TRUE; otherwise it is FALSE. A successful READ of an 9137 empty file will always return eof as TRUE. 9139 If the current filehandle is not a regular file, an error will be 9140 returned to the client. In the case the current filehandle 9141 represents a directory, NFS4ERR_ISDIR is returned; otherwise, 9142 NFS4ERR_INVAL is returned. 9144 For a READ with a stateid value of all bits 0, the server MAY allow 9145 the READ to be serviced subject to mandatory file locks or the 9146 current share deny modes for the file. For a READ with a stateid 9147 value of all bits 1, the server MAY allow READ operations to bypass 9148 locking checks at the server. 9150 On success, the current filehandle retains its value. 9152 IMPLEMENTATION 9154 It is possible for the server to return fewer than count bytes of 9155 data. If the server returns fewer bytes than the count requested and eof 9156 is set to FALSE, the client should issue another READ to get the 9157 remaining data. A server may return less data than requested under 9158 several circumstances. The file may have been truncated by another 9159 client or perhaps on the server itself, changing the file size from 9160 what the requesting client believes to be the case. This would 9161 reduce the actual amount of data available to the client. It is 9162 possible that the server may back off the transfer size and reduce 9163 the read request return. Server resource exhaustion may also occur, 9164 necessitating a smaller read return.
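The short-read handling described above can be sketched as a client loop. The names read_all and make_fake_read are hypothetical, and the in-memory "server" exists only so the loop can be exercised; it models a server that never returns more than max_transfer bytes per READ and sets eof when a reply reaches the end of the file.

```python
def make_fake_read(content, max_transfer):
    """Return a stand-in for the READ operation over an in-memory file."""
    def read_op(offset, count):
        data = content[offset:offset + min(count, max_transfer)]
        eof = offset + len(data) >= len(content)
        return eof, data
    return read_op


def read_all(read_op, offset, count):
    """Issue READs until count bytes arrive or the server reports eof."""
    chunks = []
    remaining = count
    while remaining > 0:
        eof, data = read_op(offset, remaining)
        chunks.append(data)
        offset += len(data)
        remaining -= len(data)
        # Stop at end-of-file; also stop on an empty reply so a
        # misbehaving server cannot cause an endless loop.
        if eof or not data:
            break
    return b"".join(chunks)
```

The empty-reply guard is a defensive choice of this sketch and is not something the text above requires.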
9166 If mandatory file locking is on for the file, and if the region 9167 corresponding to the data to be read from the file is write locked by 9168 an owner not associated with the stateid, the server will return the 9169 NFS4ERR_LOCKED error. The client should try to get the appropriate 9170 read record lock via the LOCK operation before re-attempting the 9171 READ. When the READ completes, the client should release the 9172 record lock via LOCKU. 9174 ERRORS 9176 NFS4ERR_ACCESS 9180 NFS4ERR_ADMIN_REVOKED 9181 NFS4ERR_BADHANDLE 9182 NFS4ERR_BAD_STATEID 9183 NFS4ERR_BADXDR 9184 NFS4ERR_DELAY 9185 NFS4ERR_EXPIRED 9186 NFS4ERR_FHEXPIRED 9187 NFS4ERR_GRACE 9188 NFS4ERR_IO 9189 NFS4ERR_INVAL 9190 NFS4ERR_ISDIR 9191 NFS4ERR_LEASE_MOVED 9192 NFS4ERR_LOCKED 9193 NFS4ERR_MOVED 9194 NFS4ERR_NOFILEHANDLE 9195 NFS4ERR_NXIO 9196 NFS4ERR_OLD_STATEID 9197 NFS4ERR_OPENMODE 9198 NFS4ERR_RESOURCE 9199 NFS4ERR_SERVERFAULT 9200 NFS4ERR_STALE 9201 NFS4ERR_STALE_STATEID 9205 14.2.24.
Operation 26: READDIR - Read Directory 9207 SYNOPSIS 9208 (cfh), cookie, cookieverf, dircount, maxcount, attr_request -> 9209 cookieverf { cookie, name, attrs } 9211 ARGUMENT 9213 struct READDIR4args { 9214 /* CURRENT_FH: directory */ 9215 nfs_cookie4 cookie; 9216 verifier4 cookieverf; 9217 count4 dircount; 9218 count4 maxcount; 9219 bitmap4 attr_request; 9220 }; 9222 RESULT 9224 struct entry4 { 9225 nfs_cookie4 cookie; 9226 component4 name; 9227 fattr4 attrs; 9228 entry4 *nextentry; 9229 }; 9231 struct dirlist4 { 9232 entry4 *entries; 9233 bool eof; 9234 }; 9236 struct READDIR4resok { 9237 verifier4 cookieverf; 9238 dirlist4 reply; 9239 }; 9241 union READDIR4res switch (nfsstat4 status) { 9242 case NFS4_OK: 9243 READDIR4resok resok4; 9244 default: 9245 void; 9246 }; 9248 DESCRIPTION 9250 The READDIR operation retrieves a variable number of entries from a 9251 filesystem directory and returns client requested attributes for 9252 each entry along with information to allow the client to request 9254 Draft Specification NFS version 4 Protocol September 2002 9256 additional directory entries in a subsequent READDIR. 9258 The arguments contain a cookie value that represents where the 9259 READDIR should start within the directory. A value of 0 (zero) for 9260 the cookie is used to start reading at the beginning of the 9261 directory. For subsequent READDIR requests, the client specifies a 9262 cookie value that is provided by the server on a previous READDIR 9263 request. 9265 The cookieverf value should be set to 0 (zero) when the cookie 9266 value is 0 (zero) (first directory read). On subsequent requests, 9267 it should be a cookieverf as returned by the server. The 9268 cookieverf must match that returned by the READDIR in which the 9269 cookie was acquired. If the server determines that the cookieverf 9270 is no longer valid for the directory, the error NFS4ERR_NOT_SAME 9271 must be returned. 
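The cookie and cookieverf handling described above might look as follows from the client's side. This is a sketch under stated assumptions: list_directory and make_fake_readdir are hypothetical names, the in-memory server hands out cookies starting at 3 purely for illustration, and a mismatched verifier is modeled as a Python exception rather than an nfsstat4 status.

```python
VERIFIER_SIZE = 8  # verifier4 is an 8-byte opaque value


def make_fake_readdir(all_names, per_reply):
    """Return a stand-in for READDIR that yields per_reply entries per call."""
    server_verf = b"\x00" * (VERIFIER_SIZE - 1) + b"\x01"

    def readdir_op(cookie, cookieverf):
        if cookie == 0:
            start = 0  # first read of the directory
        else:
            if cookieverf != server_verf:
                raise ValueError("NFS4ERR_NOT_SAME")
            start = cookie - 2  # entry i carries cookie i + 3
        batch = all_names[start:start + per_reply]
        entries = [(index + 3, name) for index, name in enumerate(batch, start)]
        eof = start + len(batch) >= len(all_names)
        return server_verf, entries, eof

    return readdir_op


def list_directory(readdir_op):
    """Read a whole directory by following cookies until eof."""
    cookie = 0
    cookieverf = bytes(VERIFIER_SIZE)  # zero verifier on the first request
    names = []
    while True:
        cookieverf, entries, eof = readdir_op(cookie, cookieverf)
        for entry_cookie, name in entries:
            names.append(name)
            cookie = entry_cookie  # resume after the last entry seen
        if eof or not entries:
            break
    return names
```

The client treats the cookie as opaque: it only stores the value from the last entry and hands it back together with the verifier from the same reply.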
9273 The dircount portion of the argument is a hint of the maximum 9274 number of bytes of directory information that should be returned. 9275 This value represents the length of the names of the directory 9276 entries and the cookie value for these entries. This length 9277 represents the XDR encoding of the data (names and cookies) and not 9278 the length in the native format of the server. 9280 The maxcount value of the argument is the maximum number of bytes 9281 for the result. This maximum size represents all of the data being 9282 returned within the READDIR4resok structure and includes the XDR 9283 overhead. The server may return less data. If the server is 9284 unable to return a single directory entry within the maxcount 9285 limit, the error NFS4ERR_TOOSMALL will be returned to the client. 9287 Finally, attr_request represents the list of attributes to be 9288 returned for each directory entry supplied by the server. 9290 On successful return, the server's response will provide a list of 9291 directory entries. Each of these entries contains the name of the 9292 directory entry, a cookie value for that entry, and the associated 9293 attributes as requested. The "eof" flag has a value of TRUE if 9294 there are no more entries in the directory. 9296 The cookie value is only meaningful to the server and is used as a 9297 "bookmark" for the directory entry. As mentioned, this cookie is 9298 used by the client for subsequent READDIR operations so that it may 9299 continue reading a directory. The cookie is similar in concept to 9300 a READ offset but should not be interpreted as such by the client. 9301 Ideally, the cookie value should not change if the directory is 9302 modified since the client may be caching these values. 9304 In some cases, the server may encounter an error while obtaining 9305 the attributes for a directory entry. 
Instead of returning an 9306 error for the entire READDIR operation, the server can instead 9307 return the attribute 'fattr4_rdattr_error'. With this, the server 9309 Draft Specification NFS version 4 Protocol September 2002 9311 is able to communicate the failure to the client and not fail the 9312 entire operation in the instance of what might be a transient 9313 failure. Obviously, the client must request the 9314 fattr4_rdattr_error attribute for this method to work properly. If 9315 the client does not request the attribute, the server has no choice 9316 but to return failure for the entire READDIR operation. 9318 For some filesystem environments, the directory entries "." and 9319 ".." have special meaning and in other environments, they may not. 9320 If the server supports these special entries within a directory, 9321 they should not be returned to the client as part of the READDIR 9322 response. To enable some client environments, the cookie values of 9323 0, 1, and 2 are to be considered reserved. Note that the UNIX 9324 client will use these values when combining the server's response 9325 and local representations to enable a fully formed UNIX directory 9326 presentation to the application. 9328 For READDIR arguments, cookie values of 1 and 2 should not be used 9329 and for READDIR results cookie values of 0, 1, and 2 should not be 9330 returned. 9332 On success, the current filehandle retains its value. 9334 IMPLEMENTATION 9336 The server's filesystem directory representations can differ 9337 greatly. A client's programming interfaces may also be bound to 9338 the local operating environment in a way that does not translate 9339 well into the NFS protocol. Therefore the use of the dircount and 9340 maxcount fields are provided to allow the client the ability to 9341 provide guidelines to the server. If the client is aggressive 9342 about attribute collection during a READDIR, the server has an idea 9343 of how to limit the encoded response. 
The dircount field provides 9344 a hint on the number of entries based solely on the names of the 9345 directory entries. Since it is a hint, it may be possible that a 9346 dircount value is zero. In this case, the server is free to ignore 9347 the dircount value and return directory information based on the 9348 specified maxcount value. 9350 The cookieverf may be used by the server to help manage cookie 9351 values that may become stale. It should be a rare occurrence that 9352 a server is unable to continue properly reading a directory with 9353 the provided cookie/cookieverf pair. The server should make every 9354 effort to avoid this condition since the application at the client 9355 may not be able to properly handle this type of failure. 9357 The use of the cookieverf will also protect the client from using 9358 READDIR cookie values that may be stale. For example, if the file 9359 system has been migrated, the server may or may not be able to use 9360 the same cookie values to service READDIR as the previous server 9361 used. With the client providing the cookieverf, the server is able 9363 Draft Specification NFS version 4 Protocol September 2002 9365 to provide the appropriate response to the client. This prevents 9366 the case where the server may accept a cookie value but the 9367 underlying directory has changed and the response is invalid from 9368 the client's context of its previous READDIR. 9370 Since some servers will not be returning "." and ".." entries as 9371 has been done with previous versions of the NFS protocol, the 9372 client that requires these entries be present in READDIR responses 9373 must fabricate them. 
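A client that must present "." and ".." to its applications could fabricate them along these lines. The function name is hypothetical, and placing the two entries first is an assumption about what a local interface expects; the filter simply guards against a server that returns them anyway.

```python
def with_dot_entries(server_names):
    """Return a directory listing with locally fabricated '.' and '..'."""
    merged = [".", ".."]
    # Drop any '.' or '..' the server returned so they are not duplicated.
    merged.extend(name for name in server_names if name not in (".", ".."))
    return merged
```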
9375 ERRORS 9377 NFS4ERR_ACCESS 9378 NFS4ERR_BADHANDLE 9379 NFS4ERR_BAD_COOKIE 9380 NFS4ERR_BADXDR 9381 NFS4ERR_DELAY 9382 NFS4ERR_FHEXPIRED 9383 NFS4ERR_INVAL 9384 NFS4ERR_IO 9385 NFS4ERR_MOVED 9386 NFS4ERR_NOFILEHANDLE 9387 NFS4ERR_NOTDIR 9388 NFS4ERR_RESOURCE 9389 NFS4ERR_SERVERFAULT 9390 NFS4ERR_STALE 9391 NFS4ERR_TOOSMALL 9393 Draft Specification NFS version 4 Protocol September 2002 9395 14.2.25. Operation 27: READLINK - Read Symbolic Link 9397 SYNOPSIS 9399 (cfh) -> linktext 9401 ARGUMENT 9403 /* CURRENT_FH: symlink */ 9404 void; 9406 RESULT 9408 struct READLINK4resok { 9409 linktext4 link; 9410 }; 9412 union READLINK4res switch (nfsstat4 status) { 9413 case NFS4_OK: 9414 READLINK4resok resok4; 9415 default: 9416 void; 9417 }; 9419 DESCRIPTION 9421 READLINK reads the data associated with a symbolic link. The data 9422 is a UTF-8 string that is opaque to the server. That is, whether 9423 created by an NFS client or created locally on the server, the data 9424 in a symbolic link is not interpreted when created, but is simply 9425 stored. 9427 On success, the current filehandle retains its value. 9429 IMPLEMENTATION 9431 A symbolic link is nominally a pointer to another file. The data 9432 is not necessarily interpreted by the server, just stored in the 9433 file. It is possible for a client implementation to store a path 9434 name that is not meaningful to the server operating system in a 9435 symbolic link. A READLINK operation returns the data to the client 9436 for interpretation. If different implementations want to share 9437 access to symbolic links, then they must agree on the 9438 interpretation of the data in the symbolic link. 9440 The READLINK operation is only allowed on objects of type NF4LNK. 9441 The server should return the error, NFS4ERR_INVAL, if the object is 9442 not of type, NF4LNK. 
9446 ERRORS 9448 NFS4ERR_ACCESS 9449 NFS4ERR_BADHANDLE 9450 NFS4ERR_DELAY 9451 NFS4ERR_FHEXPIRED 9452 NFS4ERR_INVAL 9453 NFS4ERR_IO 9454 NFS4ERR_ISDIR 9455 NFS4ERR_MOVED 9456 NFS4ERR_NOFILEHANDLE 9457 NFS4ERR_NOTSUPP 9458 NFS4ERR_RESOURCE 9459 NFS4ERR_SERVERFAULT 9460 NFS4ERR_STALE 9464 14.2.26. Operation 28: REMOVE - Remove Filesystem Object 9466 SYNOPSIS 9468 (cfh), filename -> change_info 9470 ARGUMENT 9472 struct REMOVE4args { 9473 /* CURRENT_FH: directory */ 9474 component4 target; 9475 }; 9477 RESULT 9479 struct REMOVE4resok { 9480 change_info4 cinfo; 9481 }; 9483 union REMOVE4res switch (nfsstat4 status) { 9484 case NFS4_OK: 9485 REMOVE4resok resok4; 9486 default: 9487 void; 9488 }; 9490 DESCRIPTION 9492 The REMOVE operation removes (deletes) a directory entry named by 9493 filename from the directory corresponding to the current 9494 filehandle. If the entry in the directory was the last reference 9495 to the corresponding filesystem object, the object may be 9496 destroyed. 9498 For the directory where the filename was removed, the server 9499 returns change_info4 information in cinfo. With the atomic field 9500 of the change_info4 struct, the server will indicate if the before 9501 and after change attributes were obtained atomically with respect 9502 to the removal. 9504 If the target has a length of 0 (zero), or if target does not obey 9505 the UTF-8 definition, the error NFS4ERR_INVAL will be returned. 9507 On success, the current filehandle retains its value. 9509 IMPLEMENTATION 9511 NFS versions 2 and 3 required a different operator RMDIR for 9515 directory removal and REMOVE for non-directory removal. This 9516 allowed clients to skip checking the file type when being passed a 9517 non-directory delete system call (e.g.
unlink() in POSIX) to remove 9518 a directory, as well as the converse (e.g. a rmdir() on a non- 9519 directory) because they knew the server would check the file type. 9520 NFS version 4 REMOVE can be used to delete any directory entry 9521 independent of its file type. The implementor of an NFS version 4 9522 client's entry points from the unlink() and rmdir() system calls 9523 should first check the file type against the types the system call 9524 is allowed to remove before issuing a REMOVE. Alternatively, the 9525 implementor can produce a COMPOUND call that includes a 9526 LOOKUP/VERIFY sequence to verify the file type before a REMOVE 9527 operation in the same COMPOUND call. 9529 The concept of last reference is server specific. However, if the 9530 numlinks field in the previous attributes of the object had the 9531 value 1, the client should not rely on referring to the object via 9532 a filehandle. Likewise, the client should not rely on the resources 9533 (disk space, directory entry, and so on) formerly associated with 9534 the object becoming immediately available. Thus, if a client needs 9535 to be able to continue to access a file after using REMOVE to 9536 remove it, the client should take steps to make sure that the file 9537 will still be accessible. The usual mechanism used is to RENAME 9538 the file from its old name to a new hidden name. 9540 If the server finds that the file is still open when the REMOVE 9541 arrives: 9543 o The server SHOULD NOT delete the file's directory entry if the 9544 file was opened with OPEN4_SHARE_DENY_WRITE or 9545 OPEN4_SHARE_DENY_BOTH. 9547 o If the file was not opened with OPEN4_SHARE_DENY_WRITE or 9548 OPEN4_SHARE_DENY_BOTH, the server SHOULD delete the file's 9549 directory entry. However, until last CLOSE of the file, the 9550 server MAY continue to allow access to the file via its 9551 filehandle. 
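The client-side type check recommended above for the unlink() and rmdir() entry points could be sketched as follows. The function name precheck_remove is hypothetical, and the particular errno values returned are an assumption about local conventions; only the NF4DIR value is taken from the protocol's file type enumeration.

```python
import errno

NF4DIR = 2  # directory, per the protocol's nfs_ftype4 enumeration


def precheck_remove(syscall_name, file_type):
    """Return 0 if REMOVE may be issued, else a local errno value.

    syscall_name is "unlink" or "rmdir"; file_type is the object's
    type as previously obtained from the server.
    """
    is_directory = file_type == NF4DIR
    if syscall_name == "unlink" and is_directory:
        return errno.EISDIR   # assumed local error for unlink on a directory
    if syscall_name == "rmdir" and not is_directory:
        return errno.ENOTDIR
    return 0
```

A check based on cached attributes can race with changes on the server, which is why the text also offers the alternative of a LOOKUP/VERIFY sequence in the same COMPOUND as the REMOVE.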
9553 ERRORS 9555 NFS4ERR_ACCESS 9556 NFS4ERR_BADCHAR 9557 NFS4ERR_BADHANDLE 9558 NFS4ERR_BADNAME 9559 NFS4ERR_BADXDR 9560 NFS4ERR_DELAY 9561 NFS4ERR_FHEXPIRED 9562 NFS4ERR_FILE_OPEN 9563 NFS4ERR_INVAL 9564 NFS4ERR_IO 9565 NFS4ERR_MOVED 9567 Draft Specification NFS version 4 Protocol September 2002 9569 NFS4ERR_NAMETOOLONG 9570 NFS4ERR_NOENT 9571 NFS4ERR_NOFILEHANDLE 9572 NFS4ERR_NOTDIR 9573 NFS4ERR_NOTEMPTY 9574 NFS4ERR_RESOURCE 9575 NFS4ERR_ROFS 9576 NFS4ERR_SERVERFAULT 9577 NFS4ERR_STALE 9579 Draft Specification NFS version 4 Protocol September 2002 9581 14.2.27. Operation 29: RENAME - Rename Directory Entry 9583 SYNOPSIS 9585 (sfh), oldname, (cfh), newname -> source_change_info, 9586 target_change_info 9588 ARGUMENT 9590 struct RENAME4args { 9591 /* SAVED_FH: source directory */ 9592 component4 oldname; 9593 /* CURRENT_FH: target directory */ 9594 component4 newname; 9595 }; 9597 RESULT 9599 struct RENAME4resok { 9600 change_info4 source_cinfo; 9601 change_info4 target_cinfo; 9602 }; 9604 union RENAME4res switch (nfsstat4 status) { 9605 case NFS4_OK: 9606 RENAME4resok resok4; 9607 default: 9608 void; 9609 }; 9611 DESCRIPTION 9613 The RENAME operation renames the object identified by oldname in 9614 the source directory corresponding to the saved filehandle, as set 9615 by the SAVEFH operation, to newname in the target directory 9616 corresponding to the current filehandle. The operation is required 9617 to be atomic to the client. Source and target directories must 9618 reside on the same filesystem on the server. On success, the 9619 current filehandle will continue to be the target directory. 9621 If the target directory already contains an entry with the name, 9622 newname, the source object must be compatible with the target: 9623 either both are non-directories or both are directories and the 9624 target must be empty. 
     If compatible, the existing target is removed before the rename
     occurs (See the IMPLEMENTATION subsection of the section
     "Operation 28: REMOVE - Remove Filesystem Object" for client and
     server actions whenever a target is removed). If they are not
     compatible or if the target is a directory but not empty, the
     server will return the error, NFS4ERR_EXIST.

     If oldname and newname both refer to the same file (they might be
     hard links of each other), then RENAME should perform no action and
     return success.

     For both directories involved in the RENAME, the server returns
     change_info4 information. With the atomic field of the
     change_info4 struct, the server will indicate if the before and
     after change attributes were obtained atomically with respect to
     the rename.

     If the oldname refers to a named attribute and the saved and
     current filehandles refer to different filesystem objects, the
     server will return NFS4ERR_XDEV just as if the saved and current
     filehandles represented directories on different filesystems.

     If the oldname or newname has a length of 0 (zero), or if oldname
     or newname does not obey the UTF-8 definition, the error
     NFS4ERR_INVAL will be returned.

   IMPLEMENTATION

     The RENAME operation must be atomic to the client. The statement
     "source and target directories must reside on the same filesystem
     on the server" means that the fsid fields in the attributes for the
     directories are the same. If they reside on different filesystems,
     the error, NFS4ERR_XDEV, is returned.

     Based on the value of the fh_expire_type attribute for the object,
     the filehandle may or may not expire on a RENAME. However, server
     implementors are strongly encouraged to attempt to keep filehandles
     from expiring in this fashion.

     On some servers, the file names "." and ".."
     are illegal as either oldname or newname, and will result in the
     error NFS4ERR_BADNAME. In addition, many servers check for the
     case of oldname or newname being an alias for the source directory.
     Such servers will return the error NFS4ERR_INVAL in these cases.

     If either the source or the target filehandle is not a directory,
     the server will return NFS4ERR_NOTDIR.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADCHAR
     NFS4ERR_BADHANDLE
     NFS4ERR_BADNAME
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DQUOT
     NFS4ERR_EXIST
     NFS4ERR_FHEXPIRED
     NFS4ERR_FILE_OPEN
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOSPC
     NFS4ERR_NOTDIR
     NFS4ERR_NOTEMPTY
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC
     NFS4ERR_XDEV

14.2.28.  Operation 30: RENEW - Renew a Lease

   SYNOPSIS

     clientid -> ()

   ARGUMENT

     struct RENEW4args {
             clientid4       clientid;
     };

   RESULT

     struct RENEW4res {
             nfsstat4        status;
     };

   DESCRIPTION

     The RENEW operation is used by the client to renew leases which it
     currently holds at a server. In processing the RENEW request, the
     server renews all leases associated with the client. The
     associated leases are determined by the clientid provided via the
     SETCLIENTID operation.

   IMPLEMENTATION

     When the client holds delegations, it needs to use RENEW to detect
     when the server has determined that the callback path is down.
     When the server has made such a determination, only the RENEW
     operation will renew the lease on delegations. If the server
     determines the callback path is down, it returns
     NFS4ERR_CB_PATH_DOWN.
     Even though it returns NFS4ERR_CB_PATH_DOWN, the server MUST renew
     the lease on the record locks and share reservations that the
     client has established on the server. If for some reason the lock
     and share reservation lease cannot be renewed, then the server MUST
     return an error other than NFS4ERR_CB_PATH_DOWN, even if the
     callback path is also down.

   ERRORS

     NFS4ERR_ADMIN_REVOKED
     NFS4ERR_BADXDR
     NFS4ERR_CB_PATH_DOWN
     NFS4ERR_EXPIRED
     NFS4ERR_LEASE_MOVED
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE_CLIENTID

14.2.29.  Operation 31: RESTOREFH - Restore Saved Filehandle

   SYNOPSIS

     (sfh) -> (cfh)

   ARGUMENT

     /* SAVED_FH: */
     void;

   RESULT

     struct RESTOREFH4res {
             /* CURRENT_FH: value of saved fh */
             nfsstat4        status;
     };

   DESCRIPTION

     Set the current filehandle to the value in the saved filehandle.
     If there is no saved filehandle then return the error
     NFS4ERR_RESTOREFH.

   IMPLEMENTATION

     Operations like OPEN and LOOKUP use the current filehandle to
     represent a directory and replace it with a new filehandle.
     Assuming the previous filehandle was saved with a SAVEFH operation,
     the previous filehandle can be restored as the current filehandle.
     This is commonly used to obtain post-operation attributes for the
     directory, e.g.

          PUTFH (directory filehandle)
          SAVEFH
          GETATTR attrbits     (pre-op dir attrs)
          CREATE optbits "foo" attrs
          GETATTR attrbits     (file attributes)
          RESTOREFH
          GETATTR attrbits     (post-op dir attrs)

   ERRORS

     NFS4ERR_BADHANDLE
     NFS4ERR_FHEXPIRED
     NFS4ERR_MOVED
     NFS4ERR_RESOURCE
     NFS4ERR_RESTOREFH
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC

14.2.30.  Operation 32: SAVEFH - Save Current Filehandle

   SYNOPSIS

     (cfh) -> (sfh)

   ARGUMENT

     /* CURRENT_FH: */
     void;

   RESULT

     struct SAVEFH4res {
             /* SAVED_FH: value of current fh */
             nfsstat4        status;
     };

   DESCRIPTION

     Save the current filehandle. If a previous filehandle was saved
     then it is no longer accessible. The saved filehandle can be
     restored as the current filehandle with the RESTOREFH operation.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

   ERRORS

     NFS4ERR_BADHANDLE
     NFS4ERR_FHEXPIRED
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.31.  Operation 33: SECINFO - Obtain Available Security

   SYNOPSIS

     (cfh), name -> { secinfo }

   ARGUMENT

     struct SECINFO4args {
             /* CURRENT_FH: directory */
             component4      name;
     };

   RESULT

     enum rpc_gss_svc_t {/* From RFC 2203 */
             RPC_GSS_SVC_NONE        = 1,
             RPC_GSS_SVC_INTEGRITY   = 2,
             RPC_GSS_SVC_PRIVACY     = 3
     };

     struct rpcsec_gss_info {
             sec_oid4        oid;
             qop4            qop;
             rpc_gss_svc_t   service;
     };

     union secinfo4 switch (uint32_t flavor) {
      case RPCSEC_GSS:
              rpcsec_gss_info        flavor_info;
      default:
              void;
     };

     typedef secinfo4 SECINFO4resok<>;

     union SECINFO4res switch (nfsstat4 status) {
      case NFS4_OK:
              SECINFO4resok resok4;
      default:
              void;
     };

   DESCRIPTION

     The SECINFO operation is used by the client to obtain a list of
     valid RPC authentication flavors for a specific directory
     filehandle, file name pair. SECINFO should apply the same access
     methodology used for LOOKUP when evaluating the name. Therefore,
     if the requester does not have the appropriate access to LOOKUP the
     name then SECINFO must behave the same way and return
     NFS4ERR_ACCESS.

     The result will contain an array which represents the security
     mechanisms available, with an order corresponding to the server's
     preferences, the most preferred being first in the array. The
     client is free to pick whatever security mechanism it both desires
     and supports, or to pick in the server's preference order the first
     one it supports. The array entries are represented by the secinfo4
     structure. The field 'flavor' will contain a value of AUTH_NONE,
     AUTH_SYS (as defined in [RFC1831]), or RPCSEC_GSS (as defined in
     [RFC2203]).

     For the flavors AUTH_NONE and AUTH_SYS, no additional security
     information is returned.
     For a return value of RPCSEC_GSS, a security triple is returned
     that contains the mechanism object id (as defined in [RFC2743]),
     the quality of protection (as defined in [RFC2743]) and the service
     type (as defined in [RFC2203]). It is possible for SECINFO to
     return multiple entries with flavor equal to RPCSEC_GSS with
     different security triple values.

     On success, the current filehandle retains its value.

     If the name has a length of 0 (zero), or if name does not obey the
     UTF-8 definition, the error NFS4ERR_INVAL will be returned.

   IMPLEMENTATION

     The SECINFO operation is expected to be used by the NFS client when
     the error value of NFS4ERR_WRONGSEC is returned from another NFS
     operation. This signifies to the client that the server's security
     policy is different from what the client is currently using. At
     this point, the client is expected to obtain a list of possible
     security flavors and choose what best suits its policies.

     As mentioned, the server's security policies will determine when a
     client request receives NFS4ERR_WRONGSEC. The operations which may
     receive this error are: LINK, LOOKUP, OPEN, PUTFH, PUTPUBFH,
     PUTROOTFH, RESTOREFH, RENAME, and indirectly READDIR. LINK and
     RENAME will only receive this error if the security used for the
     operation is inappropriate for the saved filehandle. With the
     exception of READDIR, these operations represent the point at which
     the client can instantiate a filehandle into the "current
     filehandle" at the server. The filehandle is either provided by
     the client (PUTFH, PUTPUBFH, PUTROOTFH) or generated as a result of
     a name to filehandle translation (LOOKUP and OPEN). RESTOREFH is
     different because the filehandle is a result of a previous SAVEFH.
     Even though the filehandle, for RESTOREFH, might have previously
     passed the server's inspection for a security match, the server
     will check it again on RESTOREFH to ensure that the security policy
     has not changed.

     If the client wants to resolve an error return of NFS4ERR_WRONGSEC,
     the following will occur:

     o  For LOOKUP and OPEN, the client will use SECINFO with the same
        current filehandle and name as provided in the original LOOKUP
        or OPEN to enumerate the available security triples.

     o  For LINK, PUTFH, RENAME, and RESTOREFH, the client will use
        SECINFO and provide the parent directory filehandle and object
        name which corresponds to the filehandle originally provided by
        the PUTFH, RESTOREFH, or, for LINK and RENAME, the SAVEFH.

     o  For PUTROOTFH and PUTPUBFH, the client will be unable to use
        the SECINFO operation since SECINFO requires a current
        filehandle and none exists for these two operations. Therefore,
        the client must iterate through the security triples available
        at the client and reattempt the PUTROOTFH or PUTPUBFH
        operation. In the unfortunate event none of the MANDATORY
        security triples are supported by the client and server, the
        client SHOULD try using others that support integrity. Failing
        that, the client can try using AUTH_NONE, but because such
        forms lack integrity checks, this puts the client at risk.
        Nonetheless, the server SHOULD allow the client to use whatever
        security form the client requests and the server supports,
        since the risks of doing so are on the client.

     The READDIR operation will not directly return the NFS4ERR_WRONGSEC
     error. However, if the READDIR request included a request for
     attributes, it is possible that the READDIR request's security
     triple does not match that of a directory entry.
     If this is the case and the client has requested the rdattr_error
     attribute, the server will return the NFS4ERR_WRONGSEC error in
     rdattr_error for the entry.

     See the section "Security Considerations" for a discussion on the
     recommendations for security flavor used by SECINFO.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADCHAR
     NFS4ERR_BADHANDLE
     NFS4ERR_BADNAME
     NFS4ERR_BADXDR
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTDIR
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.32.  Operation 34: SETATTR - Set Attributes

   SYNOPSIS

     (cfh), stateid, attrmask, attr_vals -> attrsset

   ARGUMENT

     struct SETATTR4args {
             /* CURRENT_FH: target object */
             stateid4        stateid;
             fattr4          obj_attributes;
     };

   RESULT

     struct SETATTR4res {
             nfsstat4        status;
             bitmap4         attrsset;
     };

   DESCRIPTION

     The SETATTR operation changes one or more of the attributes of a
     filesystem object. The new attributes are specified with a bitmap
     and the attributes that follow the bitmap in bit order.

     The stateid argument for SETATTR is used to provide file locking
     context that is necessary for SETATTR requests that set the size
     attribute. Since setting the size attribute modifies the file's
     data, it has the same locking requirements as a corresponding
     WRITE. Any SETATTR that sets the size attribute is incompatible
     with a share reservation that specifies DENY_WRITE.
     The area between the old end-of-file and the new end-of-file is
     considered to be modified just as would have been the case had the
     area in question been specified as the target of WRITE, for the
     purpose of checking conflicts with record locks, for those cases in
     which a server is implementing mandatory record locking behavior.
     A valid stateid should always be specified. When the file size
     attribute is not set, the special stateid consisting of all bits
     zero should be passed.

     On either success or failure of the operation, the server will
     return the attrsset bitmask to represent what (if any) attributes
     were successfully set. The attrsset in the response is a subset of
     the bitmap4 that is part of the obj_attributes in the argument.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     If the request specifies the owner attribute to be set, the server
     should allow the operation to succeed if the current owner of the
     object matches the value specified in the request. Some servers
     may be implemented in such a way as to prohibit the setting of the
     owner attribute unless the requester has privilege to do so. If
     the server is lenient in this one case of matching owner values,
     the client implementation may be simplified in cases of creation of
     an object followed by a SETATTR.

     The file size attribute is used to request changes to the size of a
     file. A value of 0 (zero) causes the file to be truncated, a value
     less than the current size of the file causes data from the new
     size to the end of the file to be discarded, and a size greater
     than the current size of the file causes logically zeroed data
     bytes to be added to the end of the file. Servers are free to
     implement this using holes or actual zero data bytes.
     Clients should not make any assumptions regarding a server's
     implementation of this feature, beyond that the bytes returned will
     be zeroed. Servers must support extending the file size via
     SETATTR.

     SETATTR is not guaranteed atomic. A failed SETATTR may partially
     change a file's attributes.

     Changing the size of a file with SETATTR indirectly changes the
     time_modify. A client must account for this as size changes can
     result in data deletion.

     The attributes time_access_set and time_modify_set are write-only
     attributes constructed as a switched union so the client can direct
     the server in setting the time values. If the switched union
     specifies SET_TO_CLIENT_TIME4, the client has provided an nfstime4
     to be used for the operation. If the switched union does not
     specify SET_TO_CLIENT_TIME4, the server is to use its current time
     for the SETATTR operation.

     If server and client times differ, programs that compare client
     time to file times can break. A time maintenance protocol should be
     used to limit client/server time skew.

     Use of a COMPOUND containing a VERIFY operation specifying only the
     change attribute, immediately followed by a SETATTR, provides a
     means whereby a client may specify a request that emulates the
     functionality of the SETATTR guard mechanism of NFS version 3.
     Since the function of the guard mechanism is to avoid changes to
     the file attributes based on stale information, delays between
     checking of the guard condition and the setting of the attributes
     have the potential to compromise this function, as would the
     corresponding delay in the NFS version 4 emulation. Therefore, NFS
     version 4 servers should take care to avoid such delays, to the
     degree possible, when executing such a request.
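
     The guard emulation described above (VERIFY on the change
     attribute, immediately followed by SETATTR) can be modeled with a
     small in-memory sketch. Everything here is illustrative: the
     FileObject class, the string status constants, and the rule that
     every successful attribute update increments the change attribute
     by one are assumptions of the model, not requirements taken from
     this specification.

```python
from dataclasses import dataclass, field

NFS4_OK = "NFS4_OK"
NFS4ERR_NOT_SAME = "NFS4ERR_NOT_SAME"


@dataclass
class FileObject:
    """Hypothetical server-side object holding a change attribute."""
    change: int = 1
    attrs: dict = field(default_factory=lambda: {"mode": 0o644, "size": 0})


def verify_change(obj, expected_change):
    """Model of VERIFY restricted to the change attribute."""
    return NFS4_OK if obj.change == expected_change else NFS4ERR_NOT_SAME


def setattr_op(obj, new_attrs):
    """Model of SETATTR: apply attributes and report which were set."""
    attrsset = set()
    for name, value in new_attrs.items():
        obj.attrs[name] = value
        attrsset.add(name)
    if attrsset:
        obj.change += 1  # assumption: any modification bumps change
    return NFS4_OK, attrsset


def guarded_setattr(obj, expected_change, new_attrs):
    """VERIFY followed by SETATTR, stopping at the first failure,
    as operations in a COMPOUND are evaluated."""
    status = verify_change(obj, expected_change)
    if status != NFS4_OK:
        return status, set()
    return setattr_op(obj, new_attrs)
```

     In this model a client that cached change value 1 succeeds once,
     and a second request still carrying the stale value 1 is rejected
     with NFS4ERR_NOT_SAME, leaving the attributes untouched.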

     If the server does not support an attribute as requested by the
     client, the server should return NFS4ERR_ATTRNOTSUPP.

     A mask of the attributes actually set is returned by SETATTR in all
     cases. That mask must not include attribute bits not requested to
     be set by the client, and must be equal to the mask of attributes
     requested to be set only if the SETATTR completes without error.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_ADMIN_REVOKED
     NFS4ERR_ATTRNOTSUPP
     NFS4ERR_BADCHAR
     NFS4ERR_BADHANDLE
     NFS4ERR_BADOWNER
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DQUOT
     NFS4ERR_EXPIRED
     NFS4ERR_FBIG
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_ISDIR
     NFS4ERR_LOCKED
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOSPC
     NFS4ERR_OLD_STATEID
     NFS4ERR_OPENMODE
     NFS4ERR_PERM
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_STATEID

14.2.33.  Operation 35: SETCLIENTID - Negotiate Clientid

   SYNOPSIS

     client, callback, callback_ident -> clientid, setclientid_confirm

   ARGUMENT

     struct SETCLIENTID4args {
             nfs_client_id4  client;
             cb_client4      callback;
             uint32_t        callback_ident;
     };

   RESULT

     struct SETCLIENTID4resok {
             clientid4       clientid;
             verifier4       setclientid_confirm;
     };

     union SETCLIENTID4res switch (nfsstat4 status) {
      case NFS4_OK:
              SETCLIENTID4resok      resok4;
      case NFS4ERR_CLID_INUSE:
              clientaddr4    client_using;
      default:
              void;
     };

   DESCRIPTION

     The client uses the SETCLIENTID operation to notify the server of
     its intention to use a particular client identifier, callback, and
     callback_ident for subsequent requests that entail creating lock,
     share reservation, and delegation state on the server. Upon
     successful completion the server will return a shorthand clientid
     which, if confirmed via a separate step, will be used in subsequent
     file locking and file open requests. Confirmation of the clientid
     must be done via the SETCLIENTID_CONFIRM operation to return the
     clientid and setclientid_confirm values, as verifiers, to the
     server. The reason why two verifiers are necessary is that it is
     possible to use SETCLIENTID and SETCLIENTID_CONFIRM to modify the
     callback and callback_ident information but not the shorthand
     clientid. In that event, the setclientid_confirm value is
     effectively the only verifier.

     The callback information provided in this operation will be used if
     the client is provided an open delegation at a future point.
     Therefore, the client must correctly reflect the program and port
     numbers for the callback program at the time SETCLIENTID is used.

     The callback_ident value is used by the server on the callback.
     The client can leverage the callback_ident to eliminate the need
     for more than one callback RPC program number while still being
     able to determine which server is initiating the callback.

   IMPLEMENTATION

     To understand how to implement SETCLIENTID, make the following
     notations. Let:

     x  be the value of the client.id subfield of the SETCLIENTID4args
        structure.

     v  be the value of the client.verifier subfield of the
        SETCLIENTID4args structure.

     c  be the value of the clientid field returned in the
        SETCLIENTID4resok structure.

     k  represent the value combination of the callback and
        callback_ident fields of the SETCLIENTID4args structure.

     s  be the setclientid_confirm value returned in the
        SETCLIENTID4resok structure.

     { v, x, c, k, s }
        be a quintuple for a client record. A client record is
        confirmed if there has been a SETCLIENTID_CONFIRM operation to
        confirm it. Otherwise it is unconfirmed. An unconfirmed
        record is established by a SETCLIENTID call.

     Since SETCLIENTID is a non-idempotent operation, let us assume that
     the server is implementing the duplicate request cache (DRC).

     When the server gets a SETCLIENTID { v, x, k } request, it
     processes it in the following manner.

     o  It first looks up the request in the DRC. If there is a hit, it
        returns the result cached in the DRC. The server does NOT
        remove client state (locks, shares, delegations) nor does it
        modify any recorded callback and callback_ident information for
        client { x }.

        For any DRC miss, the server takes the client id string x, and
        searches for client records for x that the server may have
        recorded from previous SETCLIENTID calls.
        For any confirmed record with the same id string x, if the
        recorded principal does not match that of the SETCLIENTID call,
        then the server returns an NFS4ERR_CLID_INUSE error.

        For brevity of discussion, the remaining description of the
        processing assumes that there was a DRC miss, and that where the
        server has previously recorded a confirmed record for client x,
        the aforementioned principal check has successfully passed.

     o  The server checks if it has recorded a confirmed record for
        { v, x, c, l, s }, where l may or may not equal k. If so, and
        since the id verifier v of the request matches that which is
        confirmed and recorded, the server treats this as a probable
        callback information update and records an unconfirmed
        { v, x, c, k, t } and leaves the confirmed { v, x, c, l, s } in
        place, such that t != s. It does not matter if k equals l or
        not. Any pre-existing unconfirmed { v, x, c, *, * } is removed.

        The server returns { c, t }. It is indeed returning the old
        clientid4 value c, because the client apparently only wants to
        update the callback value from l to k. It's possible this
        request is one from a Byzantine router that has stale callback
        information, but this is not a problem. The callback
        information update is only confirmed if followed up by a
        SETCLIENTID_CONFIRM { c, t }.

        The server awaits confirmation of k via SETCLIENTID_CONFIRM
        { c, t }.

        The server does NOT remove client (lock/share/delegation) state
        for x.

     o  The server has previously recorded a confirmed { u, x, c, l, s }
        record such that v != u, l may or may not equal k, and has not
        recorded any unconfirmed { *, x, *, *, * } record for x. The
        server records an unconfirmed { v, x, d, k, t } (d != c,
        t != s).

        The server returns { d, t }.

        The server awaits confirmation of { d, k } via
        SETCLIENTID_CONFIRM { d, t }.

        The server does NOT remove client (lock/share/delegation) state
        for x.

     o  The server has previously recorded a confirmed { u, x, c, l, s }
        record such that v != u, l may or may not equal k, and recorded
        an unconfirmed { w, x, d, m, t } record such that c != d,
        t != s, m may or may not equal k, m may or may not equal l, and
        k may or may not equal l. Whether w == v or w != v makes no
        difference. The server simply removes the unconfirmed
        { w, x, d, m, t } record and replaces it with an unconfirmed
        { v, x, e, k, r } record, such that e != d, e != c, r != t,
        r != s.

        The server returns { e, r }.

        The server awaits confirmation of { e, k } via
        SETCLIENTID_CONFIRM { e, r }.

        The server does NOT remove client (lock/share/delegation) state
        for x.

     o  The server has no confirmed { *, x, *, *, * } for x. It may or
        may not have recorded an unconfirmed { u, x, c, l, s }, where l
        may or may not equal k, and u may or may not equal v. Any
        unconfirmed record { u, x, c, l, * }, regardless of whether
        u == v or l == k, is replaced with an unconfirmed record
        { v, x, d, k, t } where d != c, t != s.

        The server returns { d, t }.

        The server awaits confirmation of { d, k } via
        SETCLIENTID_CONFIRM { d, t }. The server does NOT remove client
        (lock/share/delegation) state for x.

     The server generates the clientid and setclientid_confirm values
     and must take care to ensure that these values are extremely
     unlikely to ever be regenerated.

   ERRORS

     NFS4ERR_BADXDR
     NFS4ERR_CLID_INUSE
     NFS4ERR_INVAL
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT

14.2.34.  Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid

   SYNOPSIS

     clientid, verifier -> -

   ARGUMENT

     struct SETCLIENTID_CONFIRM4args {
             clientid4       clientid;
             verifier4       setclientid_confirm;
     };

   RESULT

     struct SETCLIENTID_CONFIRM4res {
             nfsstat4        status;
     };

   DESCRIPTION

     This operation is used by the client to confirm the results from a
     previous call to SETCLIENTID. The client provides the
     server-supplied clientid and setclientid_confirm verifier (both
     from a SETCLIENTID response). The server responds with a simple
     status of success or failure.

   IMPLEMENTATION

     The client must use the SETCLIENTID_CONFIRM operation to confirm
     the following two distinct cases:

     o  The client's use of a new shorthand client identifier (as
        returned from the server in the response to SETCLIENTID), a new
        callback value (as specified in the arguments to SETCLIENTID)
        and a new callback_ident (as specified in the arguments to
        SETCLIENTID) value. The client's use of SETCLIENTID_CONFIRM in
        this case also confirms the removal of any of the client's
        previous relevant leased state. Relevant leased client state
        includes record locks, share reservations, and where the server
        does not support the CLAIM_DELEGATE_PREV claim type,
        delegations. If the server supports CLAIM_DELEGATE_PREV, then
        SETCLIENTID_CONFIRM MUST NOT remove delegations for this client;
        relevant leased client state would then just include record
        locks and share reservations.

     o  The client's re-use of an old, previously confirmed, shorthand
        client identifier, a new callback value, and a new callback_ident
        value. The client's use of SETCLIENTID_CONFIRM in this case MUST
        NOT result in the removal of any previous leased state (locks,
        share reservations, and delegations).

     We use the same notation and definitions for v, x, c, k, s, and
     unconfirmed and confirmed client records as introduced in the
     description of the SETCLIENTID operation. The arguments to
     SETCLIENTID_CONFIRM are indicated by the notation { c, s }, where c
     is a value of type clientid4, and s is a value of type verifier4
     corresponding to the setclientid_confirm field.

     As with SETCLIENTID, SETCLIENTID_CONFIRM is a non-idempotent
     operation, and we assume that the server is implementing the
     duplicate request cache (DRC).

     When the server gets a SETCLIENTID_CONFIRM { c, s } request, it
     processes it in the following manner.

     o  It first looks up the request in the DRC. If there is a hit, it
        returns the result cached in the DRC. The server does not
        remove any relevant leased client state nor does it modify any
        recorded callback and callback_ident information for client
        { x } as represented by the shorthand value c.

     For a DRC miss, the server checks for client records that match the
     shorthand value c. The processing cases are as follows:

     o  The server has recorded an unconfirmed { v, x, c, k, s } record
        and a confirmed { v, x, c, l, t } record, such that s != t. If
        the principals of the records do not match that of the
        SETCLIENTID_CONFIRM, the server returns NFS4ERR_CLID_INUSE, and
        no relevant leased client state is removed and no recorded
        callback and callback_ident information for client { x } is
        changed. Otherwise, the confirmed { v, x, c, l, t } record is
        removed and the unconfirmed { v, x, c, k, s } is marked as
        confirmed, thereby modifying recorded and confirmed callback and
        callback_ident information for client { x }.

        The server does not remove any relevant leased client state.

        The server returns NFS4_OK.

     o  The server has not recorded an unconfirmed { v, x, c, *, * } and
        has recorded a confirmed { v, x, c, *, s }. If the principals
        of the record and of SETCLIENTID_CONFIRM do not match, the
        server returns NFS4ERR_CLID_INUSE without removing any relevant
        leased client state and without changing recorded callback and
        callback_ident values for client { x }.

        If the principals match, then what has likely happened is that
        the client never got the response from the SETCLIENTID_CONFIRM,
        and the DRC entry has been purged. Whatever the scenario, since
        the principals match, as well as { c, s } matching a confirmed
        record, the server leaves client x's relevant leased client
        state intact, leaves its callback and callback_ident values
        unmodified, and returns NFS4_OK.

     o  The server has not recorded a confirmed { *, *, c, *, * }, and
        has recorded an unconfirmed { *, x, c, k, s }. Even if this is
        a retry from the client, the client's first SETCLIENTID_CONFIRM
        attempt was not received by the server. Retry or not, the
        server doesn't know, but it processes it as if it were a first
        try. If the principal of the unconfirmed { *, x, c, k, s }
        record mismatches that of the SETCLIENTID_CONFIRM request, the
        server returns NFS4ERR_CLID_INUSE without removing any relevant
        leased client state.

        Otherwise, the server records a confirmed { *, x, c, k, s }. If
        there is also a confirmed { *, x, d, *, t }, the server MUST
        remove the client x's relevant leased client state, and
        overwrite the callback state with k. The confirmed record
        { *, x, d, *, t } is removed.

        The server returns NFS4_OK.

     o  The server has no record of a confirmed or unconfirmed
        { *, *, c, *, s }.
The server returns NFS4ERR_STALE_CLIENTID. The server 10502 does not remove any relevant leased client state, nor does it 10503 modify any recorded callback and callback_ident information for 10504 any client. 10506 The server needs to cache unconfirmed { v, x, c, k, s } client 10507 records and wait some time for their confirmation. As should be 10508 clear from the record processing discussions for SETCLIENTID and 10509 SETCLIENTID_CONFIRM, there are cases where the server does not 10510 deterministically remove unconfirmed client records. To avoid 10511 running out of resources, the server is not required to hold 10512 unconfirmed records indefinitely. One strategy the server might 10513 use is to set a limit on how many unconfirmed client records it 10514 will maintain, and then when the limit would be exceeded, remove 10515 the oldest record. Another strategy might be to remove an 10516 unconfirmed record when some amount of time has elapsed. The choice 10517 of the amount of time is fairly arbitrary, but it need not exceed 10518 the server's lease time period. Consider that leases 10519 need to be renewed before the lease time expires via an operation 10520 from the client. If the client cannot issue a SETCLIENTID_CONFIRM 10521 after a SETCLIENTID within a period of time equal to that of a 10522 lease, then the client is unlikely to be able to maintain 10523 state on the server during steady state operation. 10525 If the client does send a SETCLIENTID_CONFIRM for an unconfirmed 10526 record that the server has already deleted, the client will get 10527 NFS4ERR_STALE_CLIENTID back. If so, the client should then start 10528 over, and send SETCLIENTID to reestablish an unconfirmed client 10532 record and get back an unconfirmed clientid and setclientid_confirm 10533 verifier. The client should then send the SETCLIENTID_CONFIRM to 10534 confirm the clientid.
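The two strategies described above for bounding unconfirmed client records (a cap on the number of records, and removal after some amount of time) might be combined as in the following minimal sketch. It is illustrative only: the class name, the method names, and the use of string status values are assumptions made for this example, and the confirm step ignores confirmed records and principal checks, which the processing rules above require of a real server.

```python
from collections import OrderedDict


class UnconfirmedRecords:
    """Bounded, age-limited store of unconfirmed client records (hypothetical)."""

    def __init__(self, max_records, max_age):
        self.max_records = max_records  # cap on unconfirmed records (assumed policy)
        self.max_age = max_age          # seconds; need not exceed the lease time
        # (clientid, confirm verifier) -> (creation time, record), oldest first
        self._records = OrderedDict()

    def expire(self, now):
        """Remove every record that has waited max_age or longer."""
        while self._records:
            key, (created, _) = next(iter(self._records.items()))
            if now - created < self.max_age:
                break
            del self._records[key]

    def add(self, clientid, confirm, record, now):
        """Record an unconfirmed entry, evicting the oldest one if at the cap."""
        self.expire(now)
        key = (clientid, confirm)
        self._records.pop(key, None)
        if len(self._records) >= self.max_records:
            self._records.popitem(last=False)  # drop the oldest record
        self._records[key] = (now, record)

    def confirm(self, clientid, confirm, now):
        """Confirm a record; a record already removed yields STALE_CLIENTID."""
        self.expire(now)
        entry = self._records.pop((clientid, confirm), None)
        if entry is None:
            return "NFS4ERR_STALE_CLIENTID", None
        return "NFS4_OK", entry[1]
```

A client that receives NFS4ERR_STALE_CLIENTID from such a server starts over with SETCLIENTID, as described above. The sketch assumes that the caller supplies a monotonically increasing clock value in now.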
10536 SETCLIENTID_CONFIRM does not establish or renew a lease. However, 10537 if SETCLIENTID_CONFIRM removes relevant leased client state, and 10538 that state does not include existing delegations, the server MUST 10539 allow the client a period of time no less than the value of 10540 lease_time attribute, to reclaim, (via the CLAIM_DELEGATE_PREV 10541 claim type of the OPEN operation) its delegations before removing 10542 unreclaimed delegations. 10544 ERRORS 10546 NFS4ERR_BADXDR 10547 NFS4ERR_CLID_INUSE 10548 NFS4ERR_RESOURCE 10549 NFS4ERR_SERVERFAULT 10550 NFS4ERR_STALE_CLIENTID 10552 Draft Specification NFS version 4 Protocol September 2002 10554 14.2.35. Operation 37: VERIFY - Verify Same Attributes 10556 SYNOPSIS 10558 (cfh), fattr -> - 10560 ARGUMENT 10562 struct VERIFY4args { 10563 /* CURRENT_FH: object */ 10564 fattr4 obj_attributes; 10565 }; 10567 RESULT 10569 struct VERIFY4res { 10570 nfsstat4 status; 10571 }; 10573 DESCRIPTION 10575 The VERIFY operation is used to verify that attributes have a value 10576 assumed by the client before proceeding with following operations 10577 in the compound request. If any of the attributes do not match 10578 then the error NFS4ERR_NOT_SAME must be returned. The current 10579 filehandle retains its value after successful completion of the 10580 operation. 10582 IMPLEMENTATION 10584 One possible use of the VERIFY operation is the following compound 10585 sequence. With this the client is attempting to verify that the 10586 file being removed will match what the client expects to be 10587 removed. This sequence can help prevent the unintended deletion of 10588 a file. 10590 PUTFH (directory filehandle) 10591 LOOKUP (file name) 10592 VERIFY (filehandle == fh) 10593 PUTFH (directory filehandle) 10594 REMOVE (file name) 10596 This sequence does not prevent a second client from removing and 10597 creating a new file in the middle of this sequence but it does help 10598 avoid the unintended result. 
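The guarded removal sequence above can be modelled as follows. This is a toy sketch, not protocol code: the directory is a plain dictionary from file name to filehandle, the function names are hypothetical, and the only behavior modelled is that evaluation of the compound stops at the first operation whose status is not NFS4_OK, so a failed VERIFY prevents the REMOVE.

```python
def run_compound(ops):
    """Evaluate (name, callable) pairs in order; stop at the first failure."""
    results = []
    for name, op in ops:
        status = op()
        results.append((name, status))
        if status != "NFS4_OK":
            break
    return results


def guarded_remove(directory, name, expected_fh):
    """Toy model of PUTFH, LOOKUP, VERIFY, PUTFH, REMOVE on a dict directory."""

    def putfh():
        return "NFS4_OK"

    def lookup():
        return "NFS4_OK" if name in directory else "NFS4ERR_NOENT"

    def verify():
        # Only reached after a successful LOOKUP, so the name is present.
        same = directory[name] == expected_fh
        return "NFS4_OK" if same else "NFS4ERR_NOT_SAME"

    def remove():
        directory.pop(name)
        return "NFS4_OK"

    return run_compound([
        ("PUTFH", putfh),
        ("LOOKUP", lookup),
        ("VERIFY", verify),
        ("PUTFH", putfh),
        ("REMOVE", remove),
    ])
```

As in the protocol, the model offers no atomicity: another client could change the directory between the VERIFY and the REMOVE.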
10600 In the case that a recommended attribute is specified in the VERIFY 10601 operation and the server does not support that attribute for the 10602 filesystem object, the error NFS4ERR_ATTRNOTSUPP is returned to the 10604 Draft Specification NFS version 4 Protocol September 2002 10606 client. 10608 When the attribute rdattr_error or any write-only attribute (e.g. 10609 time_modify_set) is specified, the error NFS4ERR_INVAL is returned 10610 to the client. 10612 ERRORS 10614 NFS4ERR_ACCESS 10615 NFS4ERR_ATTRNOTSUPP 10616 NFS4ERR_BADCHAR 10617 NFS4ERR_BADHANDLE 10618 NFS4ERR_BADXDR 10619 NFS4ERR_DELAY 10620 NFS4ERR_FHEXPIRED 10621 NFS4ERR_INVAL 10622 NFS4ERR_MOVED 10623 NFS4ERR_NOFILEHANDLE 10624 NFS4ERR_NOT_SAME 10625 NFS4ERR_RESOURCE 10626 NFS4ERR_SERVERFAULT 10627 NFS4ERR_STALE 10629 Draft Specification NFS version 4 Protocol September 2002 10631 14.2.36. Operation 38: WRITE - Write to File 10633 SYNOPSIS 10635 (cfh), stateid, offset, stable, data -> count, committed, writeverf 10637 ARGUMENT 10639 enum stable_how4 { 10640 UNSTABLE4 = 0, 10641 DATA_SYNC4 = 1, 10642 FILE_SYNC4 = 2 10643 }; 10645 struct WRITE4args { 10646 /* CURRENT_FH: file */ 10647 stateid4 stateid; 10648 offset4 offset; 10649 stable_how4 stable; 10650 opaque data<>; 10651 }; 10653 RESULT 10655 struct WRITE4resok { 10656 count4 count; 10657 stable_how4 committed; 10658 verifier4 writeverf; 10659 }; 10661 union WRITE4res switch (nfsstat4 status) { 10662 case NFS4_OK: 10663 WRITE4resok resok4; 10664 default: 10665 void; 10666 }; 10668 DESCRIPTION 10670 The WRITE operation is used to write data to a regular file. The 10671 target file is specified by the current filehandle. The offset 10672 specifies the offset where the data should be written. An offset 10673 of 0 (zero) specifies that the write should start at the beginning 10674 of the file. The count, as encoded as part of the opaque data 10675 parameter, represents the number of bytes of data that are to be 10676 written. 
If the count is 0 (zero), the WRITE will succeed and 10677 return a count of 0 (zero) subject to permissions checking. The 10678 server may choose to write fewer bytes than requested by the 10679 client. 10681 Draft Specification NFS version 4 Protocol September 2002 10683 Part of the write request is a specification of how the write is to 10684 be performed. The client specifies with the stable parameter the 10685 method of how the data is to be processed by the server. If stable 10686 is FILE_SYNC4, the server must commit the data written plus all 10687 filesystem metadata to stable storage before returning results. 10688 This corresponds to the NFS version 2 protocol semantics. Any 10689 other behavior constitutes a protocol violation. If stable is 10690 DATA_SYNC4, then the server must commit all of the data to stable 10691 storage and enough of the metadata to retrieve the data before 10692 returning. The server implementor is free to implement DATA_SYNC4 10693 in the same fashion as FILE_SYNC4, but with a possible performance 10694 drop. If stable is UNSTABLE4, the server is free to commit any 10695 part of the data and the metadata to stable storage, including all 10696 or none, before returning a reply to the client. There is no 10697 guarantee whether or when any uncommitted data will subsequently be 10698 committed to stable storage. The only guarantees made by the server 10699 are that it will not destroy any data without changing the value of 10700 verf and that it will not commit the data and metadata at a level 10701 less than that requested by the client. 10703 The stateid value for a WRITE request represents a value returned 10704 from a previous record lock or share reservation request. The 10705 stateid is used by the server to verify that the associated share 10706 reservation and any record locks are still valid and to update 10707 lease timeouts for the client. 10709 Upon successful completion, the following results are returned. 
10710 The count result is the number of bytes of data written to the 10711 file. The server may write fewer bytes than requested. If so, the 10712 actual number of bytes written, starting at location offset, is 10713 returned. 10715 The server also returns an indication of the level of commitment of 10716 the data and metadata via committed. If the server committed all 10717 data and metadata to stable storage, committed should be set to 10718 FILE_SYNC4. If the level of commitment was at least as strong as 10719 DATA_SYNC4, then committed should be set to DATA_SYNC4. Otherwise, 10720 committed must be returned as UNSTABLE4. If stable was FILE_SYNC4, 10721 then committed must also be FILE_SYNC4; anything else constitutes a 10722 protocol violation. If stable was DATA_SYNC4, then committed may be 10723 FILE_SYNC4 or DATA_SYNC4; anything else constitutes a protocol 10724 violation. If stable was UNSTABLE4, then committed may be any of 10725 FILE_SYNC4, DATA_SYNC4, or UNSTABLE4. 10727 The final portion of the result is the write verifier. The write 10728 verifier is a cookie that the client can use to determine whether 10729 the server has changed instance (boot) state between a call to 10730 WRITE and a subsequent call to either WRITE or COMMIT. This cookie 10731 must be consistent during a single instance of the NFS version 4 10732 protocol service and must be unique between instances of the NFS 10733 version 4 protocol server, where uncommitted data may be lost. 10737 If a client writes data to the server with the stable argument set 10738 to UNSTABLE4 and the reply yields a committed response of 10739 DATA_SYNC4 or UNSTABLE4, the client will follow up some time in the 10740 future with a COMMIT operation to synchronize outstanding 10741 asynchronous data and metadata with the server's stable storage, 10742 barring client error.
It is possible that, due to client crash or 10743 other error, a subsequent COMMIT will not be received by the 10744 server. 10746 For a WRITE with a stateid value of all bits 0, the server MAY 10747 allow the WRITE to be serviced subject to mandatory file locks or 10748 the current share deny modes for the file. For a WRITE with a 10749 stateid value of all bits 1, the server MUST NOT allow the WRITE 10750 operation to bypass locking checks at the server, and the WRITE is treated 10751 exactly the same as if a stateid of all bits 0 were used. 10753 On success, the current filehandle retains its value. 10755 IMPLEMENTATION 10757 It is possible for the server to write fewer bytes of data than 10758 requested by the client. In this case, the server should not 10759 return an error unless no data was written at all. If the server 10760 writes fewer than the number of bytes specified, the client should 10761 issue another WRITE to write the remaining data. 10763 It is assumed that the act of writing data to a file will cause the 10764 time_modify of the file to be updated. However, the 10765 time_modify of the file should not be changed unless the contents 10766 of the file are changed. Thus, a WRITE request with count set to 0 10767 should not cause the time_modify of the file to be updated. 10769 The definition of stable storage has historically been a point of 10770 contention. The following expected properties of stable storage 10771 may help in resolving design issues in the implementation. Stable 10772 storage is persistent storage that survives: 10774 1. Repeated power failures. 10775 2. Hardware failures (of any board, power supply, etc.). 10776 3. Repeated software crashes, including reboot cycle. 10778 This definition does not address failure of the stable storage 10779 module itself. 10781 The verifier is defined to allow a client to detect different 10782 instances of an NFS version 4 protocol server over which cached, 10783 uncommitted data may be lost.
In the most likely case, the verifier 10784 allows the client to detect server reboots. This information is 10785 required so that the client can safely determine whether the server 10789 could have lost cached data. If the server fails unexpectedly and 10790 the client has uncommitted data from previous WRITE requests (done 10791 with the stable argument set to UNSTABLE4 and in which the result 10792 committed was returned as UNSTABLE4 as well), the server may not have 10793 flushed cached data to stable storage. The burden of recovery is on 10794 the client and the client will need to retransmit the data to the 10795 server. 10797 A suggested verifier would be to use the time that the server was 10798 booted or the time the server was last started (if restarting the 10799 server without a reboot results in lost buffers). 10801 The committed field in the results allows the client to do more 10802 effective caching. If the server is committing all WRITE requests 10803 to stable storage, then it should return with committed set to 10804 FILE_SYNC4, regardless of the value of the stable field in the 10805 arguments. A server that uses an NVRAM accelerator may choose to 10806 implement this policy. The client can use this to increase the 10807 effectiveness of the cache by discarding cached data that has 10808 already been committed on the server. 10810 Some implementations may return NFS4ERR_NOSPC instead of 10811 NFS4ERR_DQUOT when a user's quota is exceeded. In the case that 10812 the current filehandle is a directory, the server will return 10813 NFS4ERR_ISDIR. If the current filehandle is not a regular file or 10814 a directory, the server will return NFS4ERR_INVAL. 10816 If mandatory file locking is on for the file, and the corresponding 10817 record of the file to be written is read or write locked by an 10818 owner that is not associated with the stateid, the server will 10819 return NFS4ERR_LOCKED.
If so, the client must check if the owner 10820 corresponding to the stateid used with the WRITE operation has a 10821 conflicting read lock that overlaps with the region that was to be 10822 written. If the stateid's owner has no conflicting read lock, then 10823 the client should try to get the appropriate write record lock via 10824 the LOCK operation before re-attempting the WRITE. When the WRITE 10825 completes, the client should release the record lock via LOCKU. 10827 If the stateid's owner had a conflicting read lock, then the client 10828 has no choice but to return an error to the application that 10829 attempted the WRITE. The reason is that since the stateid's owner 10830 had a read lock, the server either attempted, in effect, to temporarily 10831 upgrade this read lock to a write lock, or the server 10832 has no upgrade capability. If the server attempted to upgrade the 10833 read lock and failed, it is pointless for the client to re-attempt 10834 the upgrade via the LOCK operation, because there might be another 10835 client also trying to upgrade. If two clients are blocked trying 10836 to upgrade the same lock, the clients deadlock. If the server has no 10837 upgrade capability, then it is pointless to try a LOCK operation to 10838 upgrade.
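The relationship between the stable argument, the committed result, and the write verifier described for WRITE can be summarized in a short sketch. The enum values are those of stable_how4; the function names are hypothetical and are introduced only for illustration. The validity check relies on the observation that each rule in the DESCRIPTION amounts to "committed is at least as strong as stable".

```python
from enum import IntEnum


class StableHow(IntEnum):
    """Values of stable_how4, ordered from weakest to strongest."""
    UNSTABLE4 = 0
    DATA_SYNC4 = 1
    FILE_SYNC4 = 2


def committed_is_valid(stable, committed):
    """True unless the reply is a protocol violation for the given request.

    FILE_SYNC4 requires FILE_SYNC4; DATA_SYNC4 allows DATA_SYNC4 or
    FILE_SYNC4; UNSTABLE4 allows any value.
    """
    return committed >= stable


def needs_commit(committed):
    """True if the client should follow up with a COMMIT for this data."""
    return committed != StableHow.FILE_SYNC4


def must_resend(verifier_at_write, verifier_now):
    """True if the verifier changed, so uncommitted data may have been lost."""
    return verifier_at_write != verifier_now
```

A client would typically keep the data of a write until needs_commit is false for it or a COMMIT returns the same verifier, and retransmit the data when must_resend is true.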
10840 Draft Specification NFS version 4 Protocol September 2002 10842 ERRORS 10844 NFS4ERR_ACCESS 10845 NFS4ERR_ADMIN_REVOKED 10846 NFS4ERR_BADHANDLE 10847 NFS4ERR_BAD_STATEID 10848 NFS4ERR_BADXDR 10849 NFS4ERR_DELAY 10850 NFS4ERR_DQUOT 10851 NFS4ERR_EXPIRED 10852 NFS4ERR_FBIG 10853 NFS4ERR_FHEXPIRED 10854 NFS4ERR_GRACE 10855 NFS4ERR_INVAL 10856 NFS4ERR_IO 10857 NFS4ERR_ISDIR 10858 NFS4ERR_LEASE_MOVED 10859 NFS4ERR_LOCKED 10860 NFS4ERR_MOVED 10861 NFS4ERR_NOFILEHANDLE 10862 NFS4ERR_NOSPC 10863 NFS4ERR_NXIO 10864 NFS4ERR_OLD_STATEID 10865 NFS4ERR_OPENMODE 10866 NFS4ERR_RESOURCE 10867 NFS4ERR_ROFS 10868 NFS4ERR_SERVERFAULT 10869 NFS4ERR_STALE 10870 NFS4ERR_STALE_STATEID 10872 Draft Specification NFS version 4 Protocol September 2002 10874 14.2.37. Operation 39: RELEASE_LOCKOWNER - Release Lockowner State 10876 SYNOPSIS 10878 lockowner -> () 10880 ARGUMENT 10882 struct RELEASE_LOCKOWNER4args { 10883 lock_owner4 lock_owner; 10884 }; 10886 RESULT 10888 struct RELEASE_LOCKOWNER4res { 10889 nfsstat4 status; 10890 }; 10892 DESCRIPTION 10894 This operation is used to notify the server that the lock_owner is 10895 no longer in use by the client. This allows the server to release 10896 cached state related to the specified lock_owner. If file locks, 10897 associated with the lock_owner, are held at the server, the error 10898 NFS4ERR_LOCKS_HELD will be returned and no further action will be 10899 taken. 10901 IMPLEMENTATION 10903 The client may choose to use this operation to ease the amount of 10904 server state that is held. Depending on behavior of applications 10905 at the client, it may be important for the client to use this 10906 operation since the server has certain obligations with respect to 10907 holding a reference to a lock_owner as long as the associated file 10908 is open. Therefore, if the client knows for certain that the 10909 lock_owner will no longer be used under the context of the 10910 associated open_owner4, it should use RELEASE_LOCKOWNER. 
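The server side of RELEASE_LOCKOWNER, as described above, reduces to a check for held locks followed by the release of cached state. The sketch below is a minimal model under stated assumptions: the server's per-lock_owner state is represented as a dictionary from lock_owner to the set of locks held, and both that representation and the function name are hypothetical.

```python
def release_lockowner(lock_state, lock_owner):
    """Release cached state for lock_owner unless it still holds file locks.

    lock_state maps each lock_owner to the set of locks it holds
    (an assumed representation of the server's cached state).
    """
    if lock_state.get(lock_owner):
        # File locks are still held: return the error and take no action.
        return "NFS4ERR_LOCKS_HELD"
    lock_state.pop(lock_owner, None)
    return "NFS4_OK"
```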
10912 ERRORS 10914 NFS4ERR_ADMIN_REVOKED 10915 NFS4ERR_BADXDR 10916 NFS4ERR_EXPIRED 10917 NFS4ERR_LEASE_MOVED 10918 NFS4ERR_LOCKS_HELD 10919 NFS4ERR_RESOURCE 10920 NFS4ERR_SERVERFAULT 10922 Draft Specification NFS version 4 Protocol September 2002 10924 NFS4ERR_STALE_CLIENTID 10926 Draft Specification NFS version 4 Protocol September 2002 10928 14.2.38. Operation 10044: ILLEGAL - Illegal operation 10930 SYNOPSIS 10932 -> () 10934 ARGUMENT 10936 void; 10938 RESULT 10940 struct ILLEGAL4res { 10941 nfsstat4 status; 10942 }; 10944 DESCRIPTION 10946 This operation is a placeholder for encoding a result to handle the 10947 case of the client sending an operation code within COMPOUND that 10948 is not supported. See the COMPOUND procedure description for more 10949 details. 10951 The status field of ILLEGAL4res MUST be set to NFS4ERR_OP_ILLEGAL. 10953 IMPLEMENTATION 10955 A client will probably not send an operation with code OP_ILLEGAL 10956 but if it does, the response will be ILLEGAL4res just as it would 10957 be with any other invalid operation code. Note that if the server 10958 gets an illegal operation code that is not OP_ILLEGAL, and if the 10959 server checks for legal operation codes during the XDR decode 10960 phase, then the ILLEGAL4res would not be returned. 10962 ERRORS 10964 NFS4ERR_OP_ILLEGAL 10966 Draft Specification NFS version 4 Protocol September 2002 10968 15. NFS version 4 Callback Procedures 10970 The procedures used for callbacks are defined in the following 10971 sections. In the interest of clarity, the terms "client" and 10972 "server" refer to NFS clients and servers, despite the fact that for 10973 an individual callback RPC, the sense of these terms would be 10974 precisely the opposite. 10976 15.1. Procedure 0: CB_NULL - No Operation 10978 SYNOPSIS 10980 10982 ARGUMENT 10984 void; 10986 RESULT 10988 void; 10990 DESCRIPTION 10992 Standard NULL procedure. Void argument, void response. 
Even 10993 though there is no direct functionality associated with this 10994 procedure, the server will use CB_NULL to confirm the existence of 10995 a path for RPCs from server to client. 10997 ERRORS 10999 None. 11001 Draft Specification NFS version 4 Protocol September 2002 11003 15.2. Procedure 1: CB_COMPOUND - Compound Operations 11005 SYNOPSIS 11007 compoundargs -> compoundres 11009 ARGUMENT 11011 enum nfs_cb_opnum4 { 11012 OP_CB_GETATTR = 3, 11013 OP_CB_RECALL = 4, 11014 OP_CB_ILLEGAL = 10044 11015 }; 11017 union nfs_cb_argop4 switch (unsigned argop) { 11018 case OP_CB_GETATTR: CB_GETATTR4args opcbgetattr; 11019 case OP_CB_RECALL: CB_RECALL4args opcbrecall; 11020 case OP_CB_ILLEGAL: void opcbillegal; 11021 }; 11023 struct CB_COMPOUND4args { 11024 utf8string tag; 11025 uint32_t minorversion; 11026 uint32_t callback_ident; 11027 nfs_cb_argop4 argarray<>; 11028 }; 11030 RESULT 11032 union nfs_cb_resop4 switch (unsigned resop){ 11033 case OP_CB_GETATTR: CB_GETATTR4res opcbgetattr; 11034 case OP_CB_RECALL: CB_RECALL4res opcbrecall; 11035 }; 11037 struct CB_COMPOUND4res { 11038 nfsstat4 status; 11039 utf8string tag; 11040 nfs_cb_resop4 resarray<>; 11041 }; 11043 DESCRIPTION 11045 The CB_COMPOUND procedure is used to combine one or more of the 11046 callback procedures into a single RPC request. The main callback 11047 RPC program has two main procedures: CB_NULL and CB_COMPOUND. All 11048 other operations use the CB_COMPOUND procedure as a wrapper. 11050 In the processing of the CB_COMPOUND procedure, the client may find 11051 that it does not have the available resources to execute any or all 11053 Draft Specification NFS version 4 Protocol September 2002 11055 of the operations within the CB_COMPOUND sequence. In this case, 11056 the error NFS4ERR_RESOURCE will be returned for the particular 11057 operation within the CB_COMPOUND procedure where the resource 11058 exhaustion occurred. 
This assumes that all previous operations 11059 within the CB_COMPOUND sequence have been evaluated successfully. 11061 Contained within the CB_COMPOUND results is a 'status' field. This 11062 status must be equivalent to the status of the last operation that 11063 was executed within the CB_COMPOUND procedure. Therefore, if an 11064 operation incurred an error then the 'status' value will be the 11065 same error value as is being returned for the operation that 11066 failed. 11068 For the definition of the "tag" field, see the section "Procedure 11069 1: COMPOUND - Compound Operations". 11071 The value of callback_ident is supplied by the client during 11072 SETCLIENTID. The server must use the client supplied 11073 callback_ident during the CB_COMPOUND to allow the client to 11074 properly identify the server. 11076 Illegal operation codes are handled in the same way as they are 11077 handled for the COMPOUND procedure. 11079 IMPLEMENTATION 11081 The CB_COMPOUND procedure is used to combine individual operations 11082 into a single RPC request. The client interprets each of the 11083 operations in turn. If an operation is executed by the client and 11084 the status of that operation is NFS4_OK, then the next operation in 11085 the CB_COMPOUND procedure is executed. The client continues this 11086 process until there are no more operations to be executed or one of 11087 the operations has a status value other than NFS4_OK. 11089 ERRORS 11091 NFS4ERR_BADHANDLE 11092 NFS4ERR_BAD_STATEID 11093 NFS4ERR_BADXDR 11094 NFS4ERR_OP_ILLEGAL 11095 NFS4ERR_RESOURCE 11096 NFS4ERR_SERVERFAULT 11098 Draft Specification NFS version 4 Protocol September 2002 11100 15.2.1. 
Operation 3: CB_GETATTR - Get Attributes 11102 SYNOPSIS 11104 fh, attr_request -> attrmask, attr_vals 11106 ARGUMENT 11108 struct CB_GETATTR4args { 11109 nfs_fh4 fh; 11110 bitmap4 attr_request; 11111 }; 11113 RESULT 11115 struct CB_GETATTR4resok { 11116 fattr4 obj_attributes; 11117 }; 11119 union CB_GETATTR4res switch (nfsstat4 status) { 11120 case NFS4_OK: 11121 CB_GETATTR4resok resok4; 11122 default: 11123 void; 11124 }; 11126 DESCRIPTION 11128 The CB_GETATTR operation is used by the server to obtain the 11129 current modified state of a file that has been write delegated. 11130 The attributes size and change are the only ones guaranteed to be 11131 serviced by the client. See the section "Handling of CB_GETATTR" 11132 for a full description of how the client and server are to interact 11133 with the use of CB_GETATTR. 11135 If the filehandle specified is not one for which the client holds a 11136 write open delegation, an NFS4ERR_BADHANDLE error is returned. 11138 IMPLEMENTATION 11140 The client returns attrmask bits and the associated attribute 11141 values only for the change attribute, and attributes that it may 11142 change (time_modify, and size). 11144 ERRORS 11146 Draft Specification NFS version 4 Protocol September 2002 11148 NFS4ERR_BADHANDLE 11149 NFS4ERR_BADXDR 11150 NFS4ERR_RESOURCE 11151 NFS4ERR_SERVERFAULT 11153 Draft Specification NFS version 4 Protocol September 2002 11155 15.2.2. Operation 4: CB_RECALL - Recall an Open Delegation 11157 SYNOPSIS 11159 stateid, truncate, fh -> () 11161 ARGUMENT 11163 struct CB_RECALL4args { 11164 stateid4 stateid; 11165 bool truncate; 11166 nfs_fh4 fh; 11167 }; 11169 RESULT 11171 struct CB_RECALL4res { 11172 nfsstat4 status; 11173 }; 11175 DESCRIPTION 11177 The CB_RECALL operation is used to begin the process of recalling 11178 an open delegation and returning it to the server. 11180 The truncate flag is used to optimize recall for a file which is 11181 about to be truncated to zero. 
When it is set, the client is freed 11182 of the obligation to propagate modified data for the file to the 11183 server, since this data is irrelevant. 11185 If the handle specified is not one for which the client holds an 11186 open delegation, an NFS4ERR_BADHANDLE error is returned. 11188 If the stateid specified is not one corresponding to an open 11189 delegation for the file specified by the filehandle, an 11190 NFS4ERR_BAD_STATEID is returned. 11192 IMPLEMENTATION 11194 The client should reply to the callback immediately. Replying does 11195 not complete the recall except when an error was returned. The 11196 recall is not complete until the delegation is returned using a 11197 DELEGRETURN. 11199 ERRORS 11201 NFS4ERR_BADHANDLE 11205 NFS4ERR_BAD_STATEID 11206 NFS4ERR_BADXDR 11207 NFS4ERR_RESOURCE 11208 NFS4ERR_SERVERFAULT 11212 15.2.3. Operation 10044: CB_ILLEGAL - Illegal Callback Operation 11214 SYNOPSIS 11216 -> () 11218 ARGUMENT 11220 void; 11222 RESULT 11224 struct CB_ILLEGAL4res { 11225 nfsstat4 status; 11226 }; 11228 DESCRIPTION 11230 This operation is a placeholder for encoding a result to handle the 11231 case of the server sending an operation code within CB_COMPOUND that 11232 is not supported. See the CB_COMPOUND procedure description for more 11233 details. 11235 The status field of CB_ILLEGAL4res MUST be set to 11236 NFS4ERR_OP_ILLEGAL. 11238 IMPLEMENTATION 11240 A server will probably not send an operation with code 11241 OP_CB_ILLEGAL but if it does, the response will be CB_ILLEGAL4res 11242 just as it would be with any other invalid operation code. Note 11243 that if the client gets an illegal operation code that is not 11244 OP_CB_ILLEGAL, and if the client checks for legal operation codes 11245 during the XDR decode phase, then the CB_ILLEGAL4res would not be 11246 returned.
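The handling of illegal callback operation codes can be sketched as part of the client's CB_COMPOUND evaluation loop. The numeric values are those of nfs_cb_opnum4 and nfsstat4; the dispatch-table layout and the function name are assumptions made for this example. The sketch assumes a client that does not reject illegal codes during XDR decode, so an unknown code (including OP_CB_ILLEGAL itself) produces a result encoded as OP_CB_ILLEGAL with status NFS4ERR_OP_ILLEGAL and ends evaluation.

```python
OP_CB_GETATTR = 3
OP_CB_RECALL = 4
OP_CB_ILLEGAL = 10044

NFS4_OK = 0
NFS4ERR_OP_ILLEGAL = 10044


def run_cb_compound(argarray, handlers):
    """Evaluate (opnum, args) pairs in order and return (status, resarray).

    handlers maps a supported opnum to a callable returning an nfsstat4
    value (a hypothetical dispatch table).
    """
    resarray = []
    status = NFS4_OK
    for opnum, args in argarray:
        handler = handlers.get(opnum)
        if handler is None:
            # Unsupported code: answer with the CB_ILLEGAL placeholder result.
            resop, status = OP_CB_ILLEGAL, NFS4ERR_OP_ILLEGAL
        else:
            resop, status = opnum, handler(args)
        resarray.append((resop, status))
        if status != NFS4_OK:
            break  # the compound status is that of the last operation executed
    return status, resarray
```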
11248 ERRORS 11250 NFS4ERR_OP_ILLEGAL 11252 Draft Specification NFS version 4 Protocol September 2002 11254 16. Security Considerations 11256 The major security feature to consider is the authentication of the 11257 user making the request of NFS service. Consideration should also be 11258 given to the integrity and privacy of this NFS request. These 11259 specific issues are discussed as part of the section on "RPC and 11260 Security Flavor". 11262 For reasons of reduced administration overhead, better performance 11263 and/or reduction of CPU utilization, users of NFS version 4 11264 implementations may choose to not use security mechanisms that enable 11265 integrity protection on each remote procedure call and response. The 11266 use of mechanisms without integrity leaves the customer vulnerable to 11267 an attacker in between the NFS client and server that modifies the 11268 RPC request and/or the response. While implementations are free to 11269 provide the option to use weaker security mechanisms, there are two 11270 operations in particular that warrant the implementation overriding 11271 user choices. 11273 The first such operation is SECINFO. It is recommended that the 11274 client issue the SECINFO call such that it is protected with a 11275 security flavor that has integrity protection, such as RPCSEC_GSS 11276 with a security triple that uses either rpc_gss_svc_integrity or 11277 rpc_gss_svc_privacy (rpc_gss_svc_privacy includes integrity 11278 protection) service. Without integrity protection encapsulating 11279 SECINFO and therefore its results, an attacker in the middle could 11280 modify results such that the client might select a weaker algorithm 11281 in the set allowed by server, making the client and/or server 11282 vulnerable to further attacks. 11284 The second operation that should definitely use integrity protection 11285 is any GETATTR for the fs_locations attribute. The attack has two 11286 steps. 
First, the attacker modifies the unprotected results of some 11287 operation to return NFS4ERR_MOVED. Second, when the client follows up 11288 with a GETATTR for the fs_locations attribute, the attacker modifies 11289 the results to cause the client to migrate its traffic to a server 11290 controlled by the attacker. 11292 Because the operations SETCLIENTID/SETCLIENTID_CONFIRM are 11293 responsible for the release of client state, it is imperative that 11294 the principal used for these operations be checked against, and match, 11295 the principal used in the previous use of these operations. See the section "Client ID" 11296 for further discussion. 11300 17. IANA Considerations 11302 17.1. Named Attribute Definition 11304 The NFS version 4 protocol provides for the association of named 11305 attributes with files. The name space identifiers for these attributes 11306 are defined as string names. The protocol does not define the 11307 specific assignment of the name space for these file attributes; the 11308 application developer or system vendor is allowed to define the 11309 attribute, its semantics, and the associated name. Even though this 11310 name space will not be specifically controlled to prevent collisions, 11311 the application developer or system vendor is strongly encouraged 11312 to register its named attributes with IANA, and provide the name 11313 assignment and associated semantics for attributes via an 11314 Informational RFC. This will provide for interoperability where 11315 common interests exist. 11317 17.2. ONC RPC Network Identifiers (netids) 11319 The section "Structured Data Types" discussed the r_netid field and 11320 the corresponding r_addr field of a clientaddr4 structure. There 11321 should be a registry at IANA for netids and the 11322 universal address format corresponding to the native address format 11323 for the transport represented by a netid.
11325 Draft Specification NFS version 4 Protocol September 2002 11327 18. RPC definition file 11329 /* 11330 * Copyright (C) The Internet Society (1998,1999,2000,2001,2002). 11331 * All Rights Reserved. 11332 */ 11334 /* 11335 * nfs4_prot.x 11336 * 11337 */ 11339 %#pragma ident "@(#)nfs4_prot.x 1.119" 11341 /* 11342 * Basic typedefs for RFC 1832 data type definitions 11343 */ 11344 typedef int int32_t; 11345 typedef unsigned int uint32_t; 11346 typedef hyper int64_t; 11347 typedef unsigned hyper uint64_t; 11349 /* 11350 * Sizes 11351 */ 11352 const NFS4_FHSIZE = 128; 11353 const NFS4_VERIFIER_SIZE = 8; 11354 const NFS4_OPAQUE_LIMIT = 1024; 11356 /* 11357 * File types 11358 */ 11359 enum nfs_ftype4 { 11360 NF4REG = 1, /* Regular File */ 11361 NF4DIR = 2, /* Directory */ 11362 NF4BLK = 3, /* Special File - block device */ 11363 NF4CHR = 4, /* Special File - character device */ 11364 NF4LNK = 5, /* Symbolic Link */ 11365 NF4SOCK = 6, /* Special File - socket */ 11366 NF4FIFO = 7, /* Special File - fifo */ 11367 NF4ATTRDIR = 8, /* Attribute Directory */ 11368 NF4NAMEDATTR = 9 /* Named Attribute */ 11369 }; 11371 /* 11372 * Error status 11373 */ 11374 enum nfsstat4 { 11375 NFS4_OK = 0, /* everything is okay */ 11376 NFS4ERR_PERM = 1, /* caller not privileged */ 11377 NFS4ERR_NOENT = 2, /* no such file/directory */ 11378 NFS4ERR_IO = 5, /* hard I/O error */ 11380 Draft Specification NFS version 4 Protocol September 2002 11382 NFS4ERR_NXIO = 6, /* no such device */ 11383 NFS4ERR_ACCESS = 13, /* access denied */ 11384 NFS4ERR_EXIST = 17, /* file already exists */ 11385 NFS4ERR_XDEV = 18, /* different filesystems */ 11386 /* Unused/reserved 19 */ 11387 NFS4ERR_NOTDIR = 20, /* should be a directory */ 11388 NFS4ERR_ISDIR = 21, /* should not be directory */ 11389 NFS4ERR_INVAL = 22, /* invalid argument */ 11390 NFS4ERR_FBIG = 27, /* file exceeds server max */ 11391 NFS4ERR_NOSPC = 28, /* no space on filesystem */ 11392 NFS4ERR_ROFS = 30, /* read-only filesystem */ 11393 
           NFS4ERR_MLINK           = 31,   /* too many hard links     */
           NFS4ERR_NAMETOOLONG     = 63,   /* name exceeds server max */
           NFS4ERR_NOTEMPTY        = 66,   /* directory not empty     */
           NFS4ERR_DQUOT           = 69,   /* hard quota limit reached*/
           NFS4ERR_STALE           = 70,   /* file no longer exists   */
           NFS4ERR_BADHANDLE       = 10001,/* Illegal filehandle      */
           NFS4ERR_BAD_COOKIE      = 10003,/* READDIR cookie is stale */
           NFS4ERR_NOTSUPP         = 10004,/* operation not supported */
           NFS4ERR_TOOSMALL        = 10005,/* response limit exceeded */
           NFS4ERR_SERVERFAULT     = 10006,/* undefined server error  */
           NFS4ERR_BADTYPE         = 10007,/* type invalid for CREATE */
           NFS4ERR_DELAY           = 10008,/* file "busy" - retry     */
           NFS4ERR_SAME            = 10009,/* nverify says attrs same */
           NFS4ERR_DENIED          = 10010,/* lock unavailable        */
           NFS4ERR_EXPIRED         = 10011,/* lock lease expired      */
           NFS4ERR_LOCKED          = 10012,/* I/O failed due to lock  */
           NFS4ERR_GRACE           = 10013,/* in grace period         */
           NFS4ERR_FHEXPIRED       = 10014,/* filehandle expired      */
           NFS4ERR_SHARE_DENIED    = 10015,/* share reserve denied    */
           NFS4ERR_WRONGSEC        = 10016,/* wrong security flavor   */
           NFS4ERR_CLID_INUSE      = 10017,/* clientid in use         */
           NFS4ERR_RESOURCE        = 10018,/* resource exhaustion     */
           NFS4ERR_MOVED           = 10019,/* filesystem relocated    */
           NFS4ERR_NOFILEHANDLE    = 10020,/* current FH is not set   */
           NFS4ERR_MINOR_VERS_MISMATCH = 10021,/* minor vers not supp */
           NFS4ERR_STALE_CLIENTID  = 10022,/* server has rebooted     */
           NFS4ERR_STALE_STATEID   = 10023,/* server has rebooted     */
           NFS4ERR_OLD_STATEID     = 10024,/* state is out of sync    */
           NFS4ERR_BAD_STATEID     = 10025,/* incorrect stateid       */
           NFS4ERR_BAD_SEQID       = 10026,/* request is out of seq.  */
           NFS4ERR_NOT_SAME        = 10027,/* verify - attrs not same */
           NFS4ERR_LOCK_RANGE      = 10028,/* lock range not supported*/
           NFS4ERR_SYMLINK         = 10029,/* should be file/directory*/
           NFS4ERR_RESTOREFH       = 10030,/* no saved filehandle     */
           NFS4ERR_LEASE_MOVED     = 10031,/* some filesystem moved   */
           NFS4ERR_ATTRNOTSUPP     = 10032,/* recommended attr not sup*/
           NFS4ERR_NO_GRACE        = 10033,/* reclaim outside of grace*/
           NFS4ERR_RECLAIM_BAD     = 10034,/* reclaim error at server */
           NFS4ERR_RECLAIM_CONFLICT = 10035,/* conflict on reclaim    */
           NFS4ERR_BADXDR          = 10036,/* XDR decode failed       */
           NFS4ERR_LOCKS_HELD      = 10037,/* file locks held at CLOSE*/
           NFS4ERR_OPENMODE        = 10038,/* conflict in OPEN and I/O*/
           NFS4ERR_BADOWNER        = 10039,/* owner translation bad   */
           NFS4ERR_BADCHAR         = 10040,/* utf-8 char not supported*/
           NFS4ERR_BADNAME         = 10041,/* name not supported      */
           NFS4ERR_BAD_RANGE       = 10042,/* lock range not supported*/
           NFS4ERR_LOCK_NOTSUPP    = 10043,/* no atomic up/downgrade  */
           NFS4ERR_OP_ILLEGAL      = 10044,/* undefined operation     */
           NFS4ERR_DEADLOCK        = 10045,/* file locking deadlock   */
           NFS4ERR_FILE_OPEN       = 10046,/* open file blocks op.    */
           NFS4ERR_ADMIN_REVOKED   = 10047,/* lockowner state revoked */
           NFS4ERR_CB_PATH_DOWN    = 10048 /* callback path down      */
   };

   /*
    * Basic data types
    */
   typedef uint32_t        bitmap4<>;
   typedef uint64_t        offset4;
   typedef uint32_t        count4;
   typedef uint64_t        length4;
   typedef uint64_t        clientid4;
   typedef uint32_t        seqid4;
   typedef opaque          utf8string<>;
   typedef utf8string      component4;
   typedef component4      pathname4<>;
   typedef uint64_t        nfs_lockid4;
   typedef uint64_t        nfs_cookie4;
   typedef utf8string      linktext4;
   typedef opaque          sec_oid4<>;
   typedef uint32_t        qop4;
   typedef uint32_t        mode4;
   typedef uint64_t        changeid4;
   typedef opaque          verifier4[NFS4_VERIFIER_SIZE];

   /*
    * Timeval
    */
   struct nfstime4 {
           int64_t         seconds;
           uint32_t        nseconds;
   };

   enum time_how4 {
           SET_TO_SERVER_TIME4 = 0,
           SET_TO_CLIENT_TIME4 = 1
   };

   union settime4 switch (time_how4 set_it) {
    case SET_TO_CLIENT_TIME4:
            nfstime4       time;
    default:
            void;
   };

   /*
    * File access handle
    */
   typedef opaque  nfs_fh4<NFS4_FHSIZE>;

   /*
    * File attribute definitions
    */

   /*
    * FSID structure for major/minor
    */
   struct fsid4 {
           uint64_t        major;
           uint64_t        minor;
   };

   /*
    * Filesystem locations attribute for relocation/migration
    */
   struct fs_location4 {
           utf8string      server<>;
           pathname4       rootpath;
   };

   struct fs_locations4 {
           pathname4       fs_root;
           fs_location4    locations<>;
   };

   /*
    * Various Access Control Entry definitions
    */

   /*
    * Mask that indicates which Access Control Entries are supported.
    * Values for the fattr4_aclsupport attribute.
    */
   const ACL4_SUPPORT_ALLOW_ACL    = 0x00000001;
   const ACL4_SUPPORT_DENY_ACL     = 0x00000002;
   const ACL4_SUPPORT_AUDIT_ACL    = 0x00000004;
   const ACL4_SUPPORT_ALARM_ACL    = 0x00000008;

   typedef uint32_t        acetype4;

   /*
    * acetype4 values, others can be added as needed.
    */
   const ACE4_ACCESS_ALLOWED_ACE_TYPE      = 0x00000000;
   const ACE4_ACCESS_DENIED_ACE_TYPE       = 0x00000001;
   const ACE4_SYSTEM_AUDIT_ACE_TYPE        = 0x00000002;
   const ACE4_SYSTEM_ALARM_ACE_TYPE        = 0x00000003;

   /*
    * ACE flag
    */
   typedef uint32_t aceflag4;

   /*
    * ACE flag values
    */
   const ACE4_FILE_INHERIT_ACE             = 0x00000001;
   const ACE4_DIRECTORY_INHERIT_ACE        = 0x00000002;
   const ACE4_NO_PROPAGATE_INHERIT_ACE     = 0x00000004;
   const ACE4_INHERIT_ONLY_ACE             = 0x00000008;
   const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG   = 0x00000010;
   const ACE4_FAILED_ACCESS_ACE_FLAG       = 0x00000020;
   const ACE4_IDENTIFIER_GROUP             = 0x00000040;

   /*
    * ACE mask
    */
   typedef uint32_t        acemask4;

   /*
    * ACE mask values
    */
   const ACE4_READ_DATA            = 0x00000001;
   const ACE4_LIST_DIRECTORY       = 0x00000001;
   const ACE4_WRITE_DATA           = 0x00000002;
   const ACE4_ADD_FILE             = 0x00000002;
   const ACE4_APPEND_DATA          = 0x00000004;
   const ACE4_ADD_SUBDIRECTORY     = 0x00000004;
   const ACE4_READ_NAMED_ATTRS     = 0x00000008;
   const ACE4_WRITE_NAMED_ATTRS    = 0x00000010;
   const ACE4_EXECUTE              = 0x00000020;
   const ACE4_DELETE_CHILD         = 0x00000040;
   const ACE4_READ_ATTRIBUTES      = 0x00000080;
   const ACE4_WRITE_ATTRIBUTES     = 0x00000100;

   const ACE4_DELETE               = 0x00010000;
   const ACE4_READ_ACL             = 0x00020000;
   const ACE4_WRITE_ACL            = 0x00040000;
   const ACE4_WRITE_OWNER          = 0x00080000;
   const ACE4_SYNCHRONIZE          = 0x00100000;

   /*
    * ACE4_GENERIC_READ -- defined as combination of
    *      ACE4_READ_ACL |
    *      ACE4_READ_DATA |
    *      ACE4_READ_ATTRIBUTES |
    *      ACE4_SYNCHRONIZE
    */

   const ACE4_GENERIC_READ = 0x00120081;

   /*
    * ACE4_GENERIC_WRITE -- defined as combination of
    *      ACE4_READ_ACL |
    *      ACE4_WRITE_DATA |
    *      ACE4_WRITE_ATTRIBUTES |
    *      ACE4_WRITE_ACL |
    *      ACE4_APPEND_DATA |
    *      ACE4_SYNCHRONIZE
    */
   const ACE4_GENERIC_WRITE = 0x00160106;

   /*
    * ACE4_GENERIC_EXECUTE -- defined as combination of
    *      ACE4_READ_ACL
    *      ACE4_READ_ATTRIBUTES
    *      ACE4_EXECUTE
    *      ACE4_SYNCHRONIZE
    */
   const ACE4_GENERIC_EXECUTE = 0x001200A0;

   /*
    * Access Control Entry definition
    */
   struct nfsace4 {
           acetype4        type;
           aceflag4        flag;
           acemask4        access_mask;
           utf8string      who;
   };

   /*
    * Field definitions for the fattr4_mode attribute
    */
   const MODE4_SUID = 0x800;  /* set user id on execution */
   const MODE4_SGID = 0x400;  /* set group id on execution */
   const MODE4_SVTX = 0x200;  /* save text even after use */
   const MODE4_RUSR = 0x100;  /* read permission: owner */
   const MODE4_WUSR = 0x080;  /* write permission: owner */
   const MODE4_XUSR = 0x040;  /* execute permission: owner */
   const MODE4_RGRP = 0x020;  /* read permission: group */
   const MODE4_WGRP = 0x010;  /* write permission: group */
   const MODE4_XGRP = 0x008;  /* execute permission: group */
   const MODE4_ROTH = 0x004;  /* read permission: other */
   const MODE4_WOTH = 0x002;  /* write permission: other */
   const MODE4_XOTH = 0x001;  /* execute permission: other */

   /*
    * Special data/attribute associated with
    * file types NF4BLK and NF4CHR.
    */
   struct specdata4 {
           uint32_t        specdata1;      /* major device number */
           uint32_t        specdata2;      /* minor device number */
   };

   /*
    * Values for fattr4_fh_expire_type
    */
   const   FH4_PERSISTENT          = 0x00000000;
   const   FH4_NOEXPIRE_WITH_OPEN  = 0x00000001;
   const   FH4_VOLATILE_ANY        = 0x00000002;
   const   FH4_VOL_MIGRATION       = 0x00000004;
   const   FH4_VOL_RENAME          = 0x00000008;

   typedef bitmap4         fattr4_supported_attrs;
   typedef nfs_ftype4      fattr4_type;
   typedef uint32_t        fattr4_fh_expire_type;
   typedef changeid4       fattr4_change;
   typedef uint64_t        fattr4_size;
   typedef bool            fattr4_link_support;
   typedef bool            fattr4_symlink_support;
   typedef bool            fattr4_named_attr;
   typedef fsid4           fattr4_fsid;
   typedef bool            fattr4_unique_handles;
   typedef uint32_t        fattr4_lease_time;
   typedef nfsstat4        fattr4_rdattr_error;

   typedef nfsace4         fattr4_acl<>;
   typedef uint32_t        fattr4_aclsupport;
   typedef bool            fattr4_archive;
   typedef bool            fattr4_cansettime;
   typedef bool            fattr4_case_insensitive;
   typedef bool            fattr4_case_preserving;
   typedef bool            fattr4_chown_restricted;
   typedef uint64_t        fattr4_fileid;
   typedef uint64_t        fattr4_files_avail;
   typedef nfs_fh4         fattr4_filehandle;
   typedef uint64_t        fattr4_files_free;
   typedef uint64_t        fattr4_files_total;
   typedef fs_locations4   fattr4_fs_locations;
   typedef bool            fattr4_hidden;
   typedef bool            fattr4_homogeneous;
   typedef uint64_t        fattr4_maxfilesize;
   typedef uint32_t        fattr4_maxlink;
   typedef uint32_t        fattr4_maxname;
   typedef uint64_t        fattr4_maxread;
   typedef uint64_t        fattr4_maxwrite;
   typedef utf8string      fattr4_mimetype;
   typedef mode4           fattr4_mode;
   typedef uint64_t        fattr4_mounted_on_fileid;
   typedef bool            fattr4_no_trunc;
   typedef uint32_t        fattr4_numlinks;
   typedef utf8string      fattr4_owner;
   typedef utf8string      fattr4_owner_group;
   typedef uint64_t        fattr4_quota_avail_hard;
   typedef uint64_t        fattr4_quota_avail_soft;
   typedef uint64_t        fattr4_quota_used;
   typedef specdata4       fattr4_rawdev;
   typedef uint64_t        fattr4_space_avail;
   typedef uint64_t        fattr4_space_free;
   typedef uint64_t        fattr4_space_total;
   typedef uint64_t        fattr4_space_used;
   typedef bool            fattr4_system;
   typedef nfstime4        fattr4_time_access;
   typedef settime4        fattr4_time_access_set;
   typedef nfstime4        fattr4_time_backup;
   typedef nfstime4        fattr4_time_create;
   typedef nfstime4        fattr4_time_delta;
   typedef nfstime4        fattr4_time_metadata;
   typedef nfstime4        fattr4_time_modify;
   typedef settime4        fattr4_time_modify_set;

   /*
    * Mandatory Attributes
    */
   const FATTR4_SUPPORTED_ATTRS    = 0;
   const FATTR4_TYPE               = 1;
   const FATTR4_FH_EXPIRE_TYPE     = 2;
   const FATTR4_CHANGE             = 3;
   const FATTR4_SIZE               = 4;
   const FATTR4_LINK_SUPPORT       = 5;
   const FATTR4_SYMLINK_SUPPORT    = 6;
   const FATTR4_NAMED_ATTR         = 7;
   const FATTR4_FSID               = 8;
   const FATTR4_UNIQUE_HANDLES     = 9;
   const FATTR4_LEASE_TIME         = 10;
   const FATTR4_RDATTR_ERROR       = 11;
   const FATTR4_FILEHANDLE         = 19;

   /*
    * Recommended Attributes
    */
   const FATTR4_ACL                = 12;
   const FATTR4_ACLSUPPORT         = 13;
   const FATTR4_ARCHIVE            = 14;
   const FATTR4_CANSETTIME         = 15;
   const FATTR4_CASE_INSENSITIVE   = 16;
   const FATTR4_CASE_PRESERVING    = 17;
   const FATTR4_CHOWN_RESTRICTED   = 18;
   const FATTR4_FILEID             = 20;
   const FATTR4_FILES_AVAIL        = 21;
   const FATTR4_FILES_FREE         = 22;
   const FATTR4_FILES_TOTAL        = 23;
   const FATTR4_FS_LOCATIONS       = 24;
   const FATTR4_HIDDEN             = 25;
   const FATTR4_HOMOGENEOUS        = 26;
   const FATTR4_MAXFILESIZE        = 27;
   const FATTR4_MAXLINK            = 28;
   const FATTR4_MAXNAME            = 29;
   const FATTR4_MAXREAD            = 30;
   const FATTR4_MAXWRITE           = 31;
   const FATTR4_MIMETYPE           = 32;
   const FATTR4_MODE               = 33;
   const FATTR4_NO_TRUNC           = 34;
   const FATTR4_NUMLINKS           = 35;
   const FATTR4_OWNER              = 36;
   const FATTR4_OWNER_GROUP        = 37;
   const FATTR4_QUOTA_AVAIL_HARD   = 38;
   const FATTR4_QUOTA_AVAIL_SOFT   = 39;
   const FATTR4_QUOTA_USED         = 40;
   const FATTR4_RAWDEV             = 41;
   const FATTR4_SPACE_AVAIL        = 42;
   const FATTR4_SPACE_FREE         = 43;
   const FATTR4_SPACE_TOTAL        = 44;
   const FATTR4_SPACE_USED         = 45;
   const FATTR4_SYSTEM             = 46;
   const FATTR4_TIME_ACCESS        = 47;
   const FATTR4_TIME_ACCESS_SET    = 48;
   const FATTR4_TIME_BACKUP        = 49;
   const FATTR4_TIME_CREATE        = 50;
   const FATTR4_TIME_DELTA         = 51;
   const FATTR4_TIME_METADATA      = 52;
   const FATTR4_TIME_MODIFY        = 53;
   const FATTR4_TIME_MODIFY_SET    = 54;
   const FATTR4_MOUNTED_ON_FILEID  = 55;

   typedef opaque  attrlist4<>;

   /*
    * File attribute container
    */
   struct fattr4 {
           bitmap4         attrmask;
           attrlist4       attr_vals;
   };

   /*
    * Change info for the client
    */
   struct change_info4 {
           bool            atomic;
           changeid4       before;
           changeid4       after;
   };

   struct clientaddr4 {
           /* see struct rpcb in RFC 1833 */
           string r_netid<>;       /* network id */
           string r_addr<>;        /* universal address */
   };

   /*
    * Callback program info as provided by the client
    */
   struct cb_client4 {
           uint32_t        cb_program;
           clientaddr4     cb_location;
   };

   /*
    * Stateid
    */
   struct stateid4 {
           uint32_t        seqid;
           opaque          other[12];
   };

   /*
    * Client ID
    */
   struct nfs_client_id4 {
           verifier4       verifier;
           opaque          id<NFS4_OPAQUE_LIMIT>;
   };

   struct open_owner4 {
           clientid4       clientid;
           opaque          owner<NFS4_OPAQUE_LIMIT>;
   };

   struct lock_owner4 {
           clientid4       clientid;
           opaque          owner<NFS4_OPAQUE_LIMIT>;
   };

   enum nfs_lock_type4 {
           READ_LT         = 1,
           WRITE_LT        = 2,
           READW_LT        = 3,    /* blocking read */
           WRITEW_LT       = 4     /* blocking write */
   };

   /*
    * ACCESS: Check access permission
    */
   const ACCESS4_READ      = 0x00000001;
   const ACCESS4_LOOKUP    = 0x00000002;
   const ACCESS4_MODIFY    = 0x00000004;
   const ACCESS4_EXTEND    = 0x00000008;
   const ACCESS4_DELETE    = 0x00000010;
   const ACCESS4_EXECUTE   = 0x00000020;

   struct ACCESS4args {
           /* CURRENT_FH: object */
           uint32_t        access;
   };

   struct ACCESS4resok {
           uint32_t        supported;
           uint32_t        access;
   };

   union ACCESS4res switch (nfsstat4 status) {
    case NFS4_OK:
            ACCESS4resok   resok4;
    default:
            void;
   };

   /*
    * CLOSE: Close a file and release share reservations
    */
   struct CLOSE4args {
           /* CURRENT_FH: object */
           seqid4          seqid;
           stateid4        open_stateid;
   };

   union CLOSE4res switch (nfsstat4 status) {
    case NFS4_OK:
            stateid4       open_stateid;
    default:
            void;
   };

   /*
    * COMMIT: Commit cached data on server to stable storage
    */
   struct COMMIT4args {
           /* CURRENT_FH: file */
           offset4         offset;
           count4          count;
   };

   struct COMMIT4resok {
           verifier4       writeverf;
   };

   union COMMIT4res switch (nfsstat4 status) {
    case NFS4_OK:
            COMMIT4resok   resok4;
    default:
            void;
   };

   /*
    * CREATE: Create a non-regular file
    */
   union createtype4 switch (nfs_ftype4 type) {
    case NF4LNK:
            linktext4      linkdata;
    case NF4BLK:
    case NF4CHR:
            specdata4      devdata;
    case NF4SOCK:
    case NF4FIFO:
    case NF4DIR:
            void;
    default:
            void;          /* server should return NFS4ERR_BADTYPE */
   };

   struct CREATE4args {
           /* CURRENT_FH: directory for creation */
           createtype4     objtype;
           component4      objname;
           fattr4          createattrs;
   };

   struct CREATE4resok {
           change_info4    cinfo;
           bitmap4         attrset;        /* attributes set */
   };

   union CREATE4res switch (nfsstat4 status) {
    case NFS4_OK:
            CREATE4resok   resok4;
    default:
            void;
   };

   /*
    * DELEGPURGE: Purge Delegations Awaiting Recovery
    */
   struct DELEGPURGE4args {
           clientid4       clientid;
   };

   struct DELEGPURGE4res {
           nfsstat4        status;
   };

   /*
    * DELEGRETURN: Return a delegation
    */
   struct DELEGRETURN4args {
           /* CURRENT_FH: delegated file */
           stateid4        deleg_stateid;
   };

   struct DELEGRETURN4res {
           nfsstat4        status;
   };

   /*
    * GETATTR: Get file attributes
    */
   struct GETATTR4args {
           /* CURRENT_FH: directory or file */
           bitmap4         attr_request;
   };

   struct GETATTR4resok {
           fattr4          obj_attributes;
   };

   union GETATTR4res switch (nfsstat4 status) {
    case NFS4_OK:
            GETATTR4resok  resok4;
    default:
            void;
   };

   /*
    * GETFH: Get current filehandle
    */
   struct GETFH4resok {
           nfs_fh4         object;
   };

   union GETFH4res switch (nfsstat4 status) {
    case NFS4_OK:
            GETFH4resok    resok4;
    default:
            void;
   };

   /*
    * LINK: Create link to an object
    */
   struct LINK4args {
           /* SAVED_FH: source object */
           /* CURRENT_FH: target directory */
           component4      newname;
   };

   struct LINK4resok {
           change_info4    cinfo;
   };

   union LINK4res switch (nfsstat4 status) {
    case NFS4_OK:
            LINK4resok     resok4;
    default:
            void;
   };

   /*
    * For LOCK, transition from open_owner to new lock_owner
    */
   struct open_to_lock_owner4 {
           seqid4          open_seqid;
           stateid4        open_stateid;
           seqid4          lock_seqid;
           lock_owner4     lock_owner;
   };

   /*
    * For LOCK, existing lock_owner continues to request file locks
    */
   struct exist_lock_owner4 {
           stateid4        lock_stateid;
           seqid4          lock_seqid;
   };

   union locker4 switch (bool new_lock_owner) {
    case TRUE:
            open_to_lock_owner4    open_owner;
    case FALSE:
            exist_lock_owner4      lock_owner;
   };

   /*
    * LOCK/LOCKT/LOCKU: Record lock management
    */
   struct LOCK4args {
           /* CURRENT_FH: file */
           nfs_lock_type4  locktype;
           bool            reclaim;
           offset4         offset;
           length4         length;
           locker4         locker;
   };

   struct LOCK4denied {
           offset4         offset;
           length4         length;
           nfs_lock_type4  locktype;
           lock_owner4     owner;
   };

   struct LOCK4resok {
           stateid4        lock_stateid;
   };

   union LOCK4res switch (nfsstat4 status) {
    case NFS4_OK:
            LOCK4resok     resok4;
    case NFS4ERR_DENIED:
            LOCK4denied    denied;
    default:
            void;
   };

   struct LOCKT4args {
           /* CURRENT_FH: file */
           nfs_lock_type4  locktype;
           offset4         offset;
           length4         length;
           lock_owner4     owner;
   };

   union LOCKT4res switch (nfsstat4 status) {
    case NFS4ERR_DENIED:
            LOCK4denied    denied;
    case NFS4_OK:
            void;
    default:
            void;
   };

   struct LOCKU4args {
           /* CURRENT_FH: file */
           nfs_lock_type4  locktype;
           seqid4          seqid;
           stateid4        lock_stateid;
           offset4         offset;
           length4         length;
   };

   union LOCKU4res switch (nfsstat4 status) {
    case NFS4_OK:
            stateid4       lock_stateid;
    default:
            void;
   };

   /*
    * LOOKUP: Lookup filename
    */
   struct LOOKUP4args {
           /* CURRENT_FH: directory */
           component4      objname;
   };

   struct LOOKUP4res {
           /* CURRENT_FH: object */
           nfsstat4        status;
   };

   /*
    * LOOKUPP: Lookup parent directory
    */
   struct LOOKUPP4res {
           /* CURRENT_FH: directory */
           nfsstat4        status;
   };

   /*
    * NVERIFY: Verify attributes different
    */
   struct NVERIFY4args {
           /* CURRENT_FH: object */
           fattr4          obj_attributes;
   };

   struct NVERIFY4res {
           nfsstat4        status;
   };

   /*
    * Various definitions for OPEN
    */
   enum createmode4 {
           UNCHECKED4      = 0,
           GUARDED4        = 1,
           EXCLUSIVE4      = 2
   };

   union createhow4 switch (createmode4 mode) {
    case UNCHECKED4:
    case GUARDED4:
            fattr4         createattrs;
    case EXCLUSIVE4:
            verifier4      createverf;
   };

   enum opentype4 {
           OPEN4_NOCREATE  = 0,
           OPEN4_CREATE    = 1
   };

   union openflag4 switch (opentype4 opentype) {
    case OPEN4_CREATE:
            createhow4     how;
    default:
            void;
   };

   /* Next definitions used for OPEN delegation */
   enum limit_by4 {
           NFS_LIMIT_SIZE          = 1,
           NFS_LIMIT_BLOCKS        = 2
           /* others as needed */
   };

   struct nfs_modified_limit4 {
           uint32_t        num_blocks;
           uint32_t        bytes_per_block;
   };

   union nfs_space_limit4 switch (limit_by4 limitby) {
    /* limit specified as file size */
    case NFS_LIMIT_SIZE:
            uint64_t               filesize;
    /* limit specified by number of blocks */
    case NFS_LIMIT_BLOCKS:
            nfs_modified_limit4    mod_blocks;
   };

   /*
    * Share Access and Deny constants for open argument
    */
   const OPEN4_SHARE_ACCESS_READ   = 0x00000001;
   const OPEN4_SHARE_ACCESS_WRITE  = 0x00000002;
   const OPEN4_SHARE_ACCESS_BOTH   = 0x00000003;

   const OPEN4_SHARE_DENY_NONE     = 0x00000000;
   const OPEN4_SHARE_DENY_READ     = 0x00000001;
   const OPEN4_SHARE_DENY_WRITE    = 0x00000002;
   const OPEN4_SHARE_DENY_BOTH     = 0x00000003;

   enum open_delegation_type4 {
           OPEN_DELEGATE_NONE      = 0,
           OPEN_DELEGATE_READ      = 1,
           OPEN_DELEGATE_WRITE     = 2
   };

   enum open_claim_type4 {
           CLAIM_NULL              = 0,
           CLAIM_PREVIOUS          = 1,
           CLAIM_DELEGATE_CUR      = 2,
           CLAIM_DELEGATE_PREV     = 3
   };

   struct open_claim_delegate_cur4 {
           stateid4        delegate_stateid;
           component4      file;
   };

   union open_claim4 switch (open_claim_type4 claim) {
    /*
     * No special rights to file. Ordinary OPEN of the specified file.
     */
    case CLAIM_NULL:
            /* CURRENT_FH: directory */
            component4     file;

    /*
     * Right to the file established by an open previous to server
     * reboot.  File identified by filehandle obtained at that time
     * rather than by name.
     */
    case CLAIM_PREVIOUS:
            /* CURRENT_FH: file being reclaimed */
            open_delegation_type4  delegate_type;

    /*
     * Right to file based on a delegation granted by the server.
     * File is specified by name.
     */
    case CLAIM_DELEGATE_CUR:
            /* CURRENT_FH: directory */
            open_claim_delegate_cur4       delegate_cur_info;

    /* Right to file based on a delegation granted to a previous boot
     * instance of the client.  File is specified by name.
     */
    case CLAIM_DELEGATE_PREV:
            /* CURRENT_FH: directory */
            component4     file_delegate_prev;
   };

   /*
    * OPEN: Open a file, potentially receiving an open delegation
    */
   struct OPEN4args {
           seqid4          seqid;
           uint32_t        share_access;
           uint32_t        share_deny;
           open_owner4     owner;
           openflag4       openhow;
           open_claim4     claim;
   };

   struct open_read_delegation4 {
           stateid4        stateid;        /* Stateid for delegation*/
           bool            recall;         /* Pre-recalled flag for
                                              delegations obtained
                                              by reclaim
                                              (CLAIM_PREVIOUS) */
           nfsace4         permissions;    /* Defines users who don't
                                              need an ACCESS call to
                                              open for read */
   };

   struct open_write_delegation4 {
           stateid4        stateid;        /* Stateid for delegation */
           bool            recall;         /* Pre-recalled flag for
                                              delegations obtained
                                              by reclaim
                                              (CLAIM_PREVIOUS) */
           nfs_space_limit4 space_limit;   /* Defines condition that
                                              the client must check to
                                              determine whether the
                                              file needs to be flushed
                                              to the server on close.
                                              */
           nfsace4         permissions;    /* Defines users who don't
                                              need an ACCESS call as
                                              part of a delegated
                                              open. */
   };

   union open_delegation4
   switch (open_delegation_type4 delegation_type) {
           case OPEN_DELEGATE_NONE:
                   void;
           case OPEN_DELEGATE_READ:
                   open_read_delegation4 read;
           case OPEN_DELEGATE_WRITE:
                   open_write_delegation4 write;
   };

   /*
    * Result flags
    */
   /* Client must confirm open */
   const OPEN4_RESULT_CONFIRM      = 0x00000002;
   /* Type of file locking behavior at the server */
   const OPEN4_RESULT_LOCKTYPE_POSIX = 0x00000004;

   struct OPEN4resok {
           stateid4        stateid;        /* Stateid for open */
           change_info4    cinfo;          /* Directory Change Info */
           uint32_t        rflags;         /* Result flags */
           bitmap4         attrset;        /* attribute set for create*/
           open_delegation4 delegation;    /* Info on any open
                                              delegation */
   };

   union OPEN4res switch (nfsstat4 status) {
    case NFS4_OK:
            /* CURRENT_FH: opened file */
            OPEN4resok     resok4;
    default:
            void;
   };

   /*
    * OPENATTR: open named attributes directory
    */
   struct OPENATTR4args {
           /* CURRENT_FH: object */
           bool    createdir;
   };

   struct OPENATTR4res {
           /* CURRENT_FH: named attr directory */
           nfsstat4        status;
   };

   /*
    * OPEN_CONFIRM: confirm the open
    */
   struct OPEN_CONFIRM4args {
           /* CURRENT_FH: opened file */
           stateid4        open_stateid;
           seqid4          seqid;
   };

   struct OPEN_CONFIRM4resok {
           stateid4        open_stateid;
   };

   union OPEN_CONFIRM4res switch (nfsstat4 status) {
    case NFS4_OK:
            OPEN_CONFIRM4resok     resok4;
    default:
            void;
   };

   /*
    * OPEN_DOWNGRADE: downgrade the access/deny for a file
    */
   struct OPEN_DOWNGRADE4args {
           /* CURRENT_FH: opened file */
           stateid4        open_stateid;
           seqid4          seqid;
           uint32_t        share_access;
           uint32_t        share_deny;
   };

   struct OPEN_DOWNGRADE4resok {
           stateid4        open_stateid;
   };

   union OPEN_DOWNGRADE4res switch(nfsstat4 status) {
    case NFS4_OK:
            OPEN_DOWNGRADE4resok   resok4;
    default:
            void;
   };

   /*
    * PUTFH: Set current filehandle
    */
   struct PUTFH4args {
           nfs_fh4         object;
   };

   struct PUTFH4res {
           /* CURRENT_FH: */
           nfsstat4        status;
   };

   /*
    * PUTPUBFH: Set public filehandle
    */
   struct PUTPUBFH4res {
           /* CURRENT_FH: public fh */
           nfsstat4        status;
   };

   /*
    * PUTROOTFH: Set root filehandle
    */
   struct PUTROOTFH4res {
           /* CURRENT_FH: root fh */
           nfsstat4        status;
   };

   /*
    * READ: Read from file
    */
   struct READ4args {
           /* CURRENT_FH: file */
           stateid4        stateid;
           offset4         offset;
           count4          count;
   };

   struct READ4resok {
           bool            eof;
           opaque          data<>;
   };

   union READ4res switch (nfsstat4 status) {
    case NFS4_OK:
            READ4resok     resok4;
    default:
            void;
   };

   /*
    * READDIR: Read directory
    */
   struct READDIR4args {
           /* CURRENT_FH: directory */
           nfs_cookie4     cookie;
           verifier4       cookieverf;
           count4          dircount;
           count4          maxcount;
           bitmap4         attr_request;
   };

   struct entry4 {
           nfs_cookie4     cookie;
           component4      name;
           fattr4          attrs;
           entry4          *nextentry;
   };

   struct dirlist4 {
           entry4          *entries;
           bool            eof;
   };

   struct READDIR4resok {
           verifier4       cookieverf;
           dirlist4        reply;
   };

   union READDIR4res switch (nfsstat4 status) {
    case NFS4_OK:
            READDIR4resok  resok4;
    default:
            void;
   };

   /*
    * READLINK: Read symbolic link
    */
   struct READLINK4resok {
           linktext4       link;
   };

   union READLINK4res switch (nfsstat4 status) {
    case NFS4_OK:
            READLINK4resok resok4;
    default:
            void;
   };

   /*
    * REMOVE: Remove filesystem object
    */
   struct REMOVE4args {
           /* CURRENT_FH: directory */
           component4      target;
   };

   struct REMOVE4resok {
           change_info4    cinfo;
   };

   union REMOVE4res switch (nfsstat4 status) {
    case NFS4_OK:
            REMOVE4resok   resok4;
    default:
            void;
   };

   /*
    * RENAME: Rename directory entry
    */
   struct RENAME4args {
           /* SAVED_FH: source directory */
           component4      oldname;
           /* CURRENT_FH: target directory */
           component4      newname;
   };

   struct RENAME4resok {
           change_info4    source_cinfo;
           change_info4    target_cinfo;
   };

   union RENAME4res switch (nfsstat4 status) {
    case NFS4_OK:
            RENAME4resok   resok4;
    default:
            void;
   };

   /*
    * RENEW: Renew a Lease
    */
   struct RENEW4args {
           clientid4       clientid;
   };

   struct RENEW4res {
           nfsstat4        status;
   };

   /*
    * RESTOREFH: Restore saved filehandle
    */

   struct RESTOREFH4res {
           /* CURRENT_FH: value of saved fh */
           nfsstat4        status;
   };

   /*
    * SAVEFH: Save current filehandle
    */
   struct SAVEFH4res {
           /* SAVED_FH: value of current fh */
           nfsstat4        status;
   };

   /*
    * SECINFO: Obtain Available Security Mechanisms
    */
   struct SECINFO4args {
           /* CURRENT_FH: directory */
           component4      name;
   };

   /*
    * From RFC 2203
    */
   enum rpc_gss_svc_t {
           RPC_GSS_SVC_NONE        = 1,
           RPC_GSS_SVC_INTEGRITY   = 2,
           RPC_GSS_SVC_PRIVACY     = 3
   };

   struct rpcsec_gss_info {
           sec_oid4        oid;
           qop4            qop;
           rpc_gss_svc_t   service;
   };

   /* RPCSEC_GSS has a value of '6' - See RFC 2203 */
   union secinfo4 switch (uint32_t flavor) {
    case RPCSEC_GSS:
            rpcsec_gss_info        flavor_info;
    default:
            void;
   };

   typedef secinfo4 SECINFO4resok<>;

   union SECINFO4res switch (nfsstat4 status) {
    case NFS4_OK:
            SECINFO4resok  resok4;
    default:
            void;
   };

   /*
    * SETATTR: Set attributes
    */
   struct SETATTR4args {
           /* CURRENT_FH: target object */
           stateid4        stateid;
           fattr4          obj_attributes;
   };

   struct SETATTR4res {
           nfsstat4        status;
           bitmap4         attrsset;
   };

   /*
    * SETCLIENTID
    */
   struct SETCLIENTID4args {
           nfs_client_id4  client;
           cb_client4      callback;
           uint32_t        callback_ident;
   };

   struct SETCLIENTID4resok {
           clientid4       clientid;
           verifier4       setclientid_confirm;
   };

   union SETCLIENTID4res switch (nfsstat4 status) {
    case NFS4_OK:
            SETCLIENTID4resok      resok4;
    case NFS4ERR_CLID_INUSE:
            clientaddr4    client_using;
    default:
            void;
   };

   struct SETCLIENTID_CONFIRM4args {
           clientid4       clientid;
           verifier4       setclientid_confirm;
   };

   struct SETCLIENTID_CONFIRM4res {
           nfsstat4        status;
   };

   /*
    * VERIFY: Verify attributes same
    */
   struct VERIFY4args {
           /* CURRENT_FH: object */
           fattr4          obj_attributes;
   };

   struct VERIFY4res {
           nfsstat4        status;
   };

   /*
    * WRITE: Write to file
    */
   enum stable_how4 {
           UNSTABLE4       = 0,
           DATA_SYNC4      = 1,
           FILE_SYNC4      = 2
   };

   struct WRITE4args {
           /* CURRENT_FH: file */
           stateid4        stateid;
           offset4         offset;
           stable_how4     stable;
           opaque          data<>;
   };

   struct WRITE4resok {
           count4          count;
Specification NFS version 4 Protocol September 2002 12744 stable_how4 committed; 12745 verifier4 writeverf; 12746 }; 12748 union WRITE4res switch (nfsstat4 status) { 12749 case NFS4_OK: 12750 WRITE4resok resok4; 12751 default: 12752 void; 12753 }; 12755 /* 12756 * RELEASE_LOCKOWNER: Notify server to release lockowner 12757 */ 12758 struct RELEASE_LOCKOWNER4args { 12759 lock_owner4 lock_owner; 12760 }; 12762 struct RELEASE_LOCKOWNER4res { 12763 nfsstat4 status; 12764 }; 12766 /* 12767 * ILLEGAL: Response for illegal operation numbers 12768 */ 12769 struct ILLEGAL4res { 12770 nfsstat4 status; 12771 }; 12773 /* 12774 * Operation arrays 12775 */ 12777 enum nfs_opnum4 { 12778 OP_ACCESS = 3, 12779 OP_CLOSE = 4, 12780 OP_COMMIT = 5, 12781 OP_CREATE = 6, 12782 OP_DELEGPURGE = 7, 12783 OP_DELEGRETURN = 8, 12784 OP_GETATTR = 9, 12785 OP_GETFH = 10, 12786 OP_LINK = 11, 12787 OP_LOCK = 12, 12788 OP_LOCKT = 13, 12789 OP_LOCKU = 14, 12790 OP_LOOKUP = 15, 12791 OP_LOOKUPP = 16, 12792 OP_NVERIFY = 17, 12793 OP_OPEN = 18, 12794 OP_OPENATTR = 19, 12795 OP_OPEN_CONFIRM = 20, 12797 Draft Specification NFS version 4 Protocol September 2002 12799 OP_OPEN_DOWNGRADE = 21, 12800 OP_PUTFH = 22, 12801 OP_PUTPUBFH = 23, 12802 OP_PUTROOTFH = 24, 12803 OP_READ = 25, 12804 OP_READDIR = 26, 12805 OP_READLINK = 27, 12806 OP_REMOVE = 28, 12807 OP_RENAME = 29, 12808 OP_RENEW = 30, 12809 OP_RESTOREFH = 31, 12810 OP_SAVEFH = 32, 12811 OP_SECINFO = 33, 12812 OP_SETATTR = 34, 12813 OP_SETCLIENTID = 35, 12814 OP_SETCLIENTID_CONFIRM = 36, 12815 OP_VERIFY = 37, 12816 OP_WRITE = 38, 12817 OP_RELEASE_LOCKOWNER = 39, 12818 OP_ILLEGAL = 10044 12819 }; 12821 union nfs_argop4 switch (nfs_opnum4 argop) { 12822 case OP_ACCESS: ACCESS4args opaccess; 12823 case OP_CLOSE: CLOSE4args opclose; 12824 case OP_COMMIT: COMMIT4args opcommit; 12825 case OP_CREATE: CREATE4args opcreate; 12826 case OP_DELEGPURGE: DELEGPURGE4args opdelegpurge; 12827 case OP_DELEGRETURN: DELEGRETURN4args opdelegreturn; 12828 case OP_GETATTR: 
GETATTR4args opgetattr; 12829 case OP_GETFH: void; 12830 case OP_LINK: LINK4args oplink; 12831 case OP_LOCK: LOCK4args oplock; 12832 case OP_LOCKT: LOCKT4args oplockt; 12833 case OP_LOCKU: LOCKU4args oplocku; 12834 case OP_LOOKUP: LOOKUP4args oplookup; 12835 case OP_LOOKUPP: void; 12836 case OP_NVERIFY: NVERIFY4args opnverify; 12837 case OP_OPEN: OPEN4args opopen; 12838 case OP_OPENATTR: OPENATTR4args opopenattr; 12839 case OP_OPEN_CONFIRM: OPEN_CONFIRM4args opopen_confirm; 12840 case OP_OPEN_DOWNGRADE: OPEN_DOWNGRADE4args opopen_downgrade; 12841 case OP_PUTFH: PUTFH4args opputfh; 12842 case OP_PUTPUBFH: void; 12843 case OP_PUTROOTFH: void; 12844 case OP_READ: READ4args opread; 12845 case OP_READDIR: READDIR4args opreaddir; 12846 case OP_READLINK: void; 12847 case OP_REMOVE: REMOVE4args opremove; 12848 case OP_RENAME: RENAME4args oprename; 12849 case OP_RENEW: RENEW4args oprenew; 12850 case OP_RESTOREFH: void; 12852 Draft Specification NFS version 4 Protocol September 2002 12854 case OP_SAVEFH: void; 12855 case OP_SECINFO: SECINFO4args opsecinfo; 12856 case OP_SETATTR: SETATTR4args opsetattr; 12857 case OP_SETCLIENTID: SETCLIENTID4args opsetclientid; 12858 case OP_SETCLIENTID_CONFIRM: SETCLIENTID_CONFIRM4args 12859 opsetclientid_confirm; 12860 case OP_VERIFY: VERIFY4args opverify; 12861 case OP_WRITE: WRITE4args opwrite; 12862 case OP_RELEASE_LOCKOWNER: RELEASE_LOCKOWNER4args 12863 oprelease_lockowner; 12864 case OP_ILLEGAL: void; 12865 }; 12867 union nfs_resop4 switch (nfs_opnum4 resop){ 12868 case OP_ACCESS: ACCESS4res opaccess; 12869 case OP_CLOSE: CLOSE4res opclose; 12870 case OP_COMMIT: COMMIT4res opcommit; 12871 case OP_CREATE: CREATE4res opcreate; 12872 case OP_DELEGPURGE: DELEGPURGE4res opdelegpurge; 12873 case OP_DELEGRETURN: DELEGRETURN4res opdelegreturn; 12874 case OP_GETATTR: GETATTR4res opgetattr; 12875 case OP_GETFH: GETFH4res opgetfh; 12876 case OP_LINK: LINK4res oplink; 12877 case OP_LOCK: LOCK4res oplock; 12878 case OP_LOCKT: LOCKT4res oplockt; 
12879 case OP_LOCKU: LOCKU4res oplocku; 12880 case OP_LOOKUP: LOOKUP4res oplookup; 12881 case OP_LOOKUPP: LOOKUPP4res oplookupp; 12882 case OP_NVERIFY: NVERIFY4res opnverify; 12883 case OP_OPEN: OPEN4res opopen; 12884 case OP_OPENATTR: OPENATTR4res opopenattr; 12885 case OP_OPEN_CONFIRM: OPEN_CONFIRM4res opopen_confirm; 12886 case OP_OPEN_DOWNGRADE: OPEN_DOWNGRADE4res opopen_downgrade; 12887 case OP_PUTFH: PUTFH4res opputfh; 12888 case OP_PUTPUBFH: PUTPUBFH4res opputpubfh; 12889 case OP_PUTROOTFH: PUTROOTFH4res opputrootfh; 12890 case OP_READ: READ4res opread; 12891 case OP_READDIR: READDIR4res opreaddir; 12892 case OP_READLINK: READLINK4res opreadlink; 12893 case OP_REMOVE: REMOVE4res opremove; 12894 case OP_RENAME: RENAME4res oprename; 12895 case OP_RENEW: RENEW4res oprenew; 12896 case OP_RESTOREFH: RESTOREFH4res oprestorefh; 12897 case OP_SAVEFH: SAVEFH4res opsavefh; 12898 case OP_SECINFO: SECINFO4res opsecinfo; 12899 case OP_SETATTR: SETATTR4res opsetattr; 12900 case OP_SETCLIENTID: SETCLIENTID4res opsetclientid; 12901 case OP_SETCLIENTID_CONFIRM: SETCLIENTID_CONFIRM4res 12902 opsetclientid_confirm; 12903 case OP_VERIFY: VERIFY4res opverify; 12904 case OP_WRITE: WRITE4res opwrite; 12905 case OP_RELEASE_LOCKOWNER: RELEASE_LOCKOWNER4res 12907 Draft Specification NFS version 4 Protocol September 2002 12909 oprelease_lockowner; 12910 case OP_ILLEGAL: ILLEGAL4res opillegal; 12911 }; 12913 struct COMPOUND4args { 12914 utf8string tag; 12915 uint32_t minorversion; 12916 nfs_argop4 argarray<>; 12917 }; 12919 struct COMPOUND4res { 12920 nfsstat4 status; 12921 utf8string tag; 12922 nfs_resop4 resarray<>; 12923 }; 12925 /* 12926 * Remote file service routines 12927 */ 12928 program NFS4_PROGRAM { 12929 version NFS_V4 { 12930 void 12931 NFSPROC4_NULL(void) = 0; 12933 COMPOUND4res 12934 NFSPROC4_COMPOUND(COMPOUND4args) = 1; 12936 } = 4; 12937 } = 100003; 12939 /* 12940 * NFS4 Callback Procedure Definitions and Program 12941 */ 12943 /* 12944 * CB_GETATTR: Get Current 
Attributes 12945 */ 12946 struct CB_GETATTR4args { 12947 nfs_fh4 fh; 12948 bitmap4 attr_request; 12949 }; 12951 struct CB_GETATTR4resok { 12952 fattr4 obj_attributes; 12953 }; 12955 union CB_GETATTR4res switch (nfsstat4 status) { 12956 case NFS4_OK: 12957 CB_GETATTR4resok resok4; 12958 default: 12960 Draft Specification NFS version 4 Protocol September 2002 12962 void; 12963 }; 12965 /* 12966 * CB_RECALL: Recall an Open Delegation 12967 */ 12968 struct CB_RECALL4args { 12969 stateid4 stateid; 12970 bool truncate; 12971 nfs_fh4 fh; 12972 }; 12974 struct CB_RECALL4res { 12975 nfsstat4 status; 12976 }; 12978 /* 12979 * CB_ILLEGAL: Response for illegal operation numbers 12980 */ 12981 struct CB_ILLEGAL4res { 12982 nfsstat4 status; 12983 }; 12985 /* 12986 * Various definitions for CB_COMPOUND 12987 */ 12988 enum nfs_cb_opnum4 { 12989 OP_CB_GETATTR = 3, 12990 OP_CB_RECALL = 4, 12991 OP_CB_ILLEGAL = 10044 12992 }; 12994 union nfs_cb_argop4 switch (unsigned argop) { 12995 case OP_CB_GETATTR: CB_GETATTR4args opcbgetattr; 12996 case OP_CB_RECALL: CB_RECALL4args opcbrecall; 12997 case OP_CB_ILLEGAL: void; 12998 }; 13000 union nfs_cb_resop4 switch (unsigned resop){ 13001 case OP_CB_GETATTR: CB_GETATTR4res opcbgetattr; 13002 case OP_CB_RECALL: CB_RECALL4res opcbrecall; 13003 case OP_CB_ILLEGAL: CB_ILLEGAL4res opcbillegal; 13004 }; 13006 struct CB_COMPOUND4args { 13007 utf8string tag; 13008 uint32_t minorversion; 13009 uint32_t callback_ident; 13010 nfs_cb_argop4 argarray<>; 13011 }; 13013 struct CB_COMPOUND4res { 13015 Draft Specification NFS version 4 Protocol September 2002 13017 nfsstat4 status; 13018 utf8string tag; 13019 nfs_cb_resop4 resarray<>; 13020 }; 13022 /* 13023 * Program number is in the transient range since the client 13024 * will assign the exact transient program number and provide 13025 * that to the server via the SETCLIENTID operation. 
13026 */ 13027 program NFS4_CALLBACK { 13028 version NFS_CB { 13029 void 13030 CB_NULL(void) = 0; 13031 CB_COMPOUND4res 13032 CB_COMPOUND(CB_COMPOUND4args) = 1; 13033 } = 1; 13034 } = 0x40000000; 13036 Draft Specification NFS version 4 Protocol September 2002 13038 19. Bibliography 13040 [Floyd] 13041 S. Floyd, V. Jacobson, "The Synchronization of Periodic Routing 13042 Messages," IEEE/ACM Transactions on Networking, 2(2), pp. 122-136, 13043 April 1994. 13045 [Gray] 13046 C. Gray, D. Cheriton, "Leases: An Efficient Fault-Tolerant Mechanism 13047 for Distributed File Cache Consistency," Proceedings of the Twelfth 13048 Symposium on Operating Systems Principles, p. 202-210, December 1989. 13050 [ISO10646] 13051 "ISO/IEC 10646-1:1993. International Standard -- Information 13052 technology -- Universal Multiple-Octet Coded Character Set (UCS) -- 13053 Part 1: Architecture and Basic Multilingual Plane." 13055 [Juszczak] 13056 Juszczak, Chet, "Improving the Performance and Correctness of an NFS 13057 Server," USENIX Conference Proceedings, USENIX Association, Berkeley, 13058 CA, June 1990, pages 53-63. Describes reply cache implementation 13059 that avoids work in the server by handling duplicate requests. More 13060 important, though listed as a side-effect, the reply cache aids in 13061 the avoidance of destructive non-idempotent operation re-application 13062 -- improving correctness. 13064 [Kazar] 13065 Kazar, Michael Leon, "Synchronization and Caching Issues in the 13066 Andrew File System," USENIX Conference Proceedings, USENIX 13067 Association, Berkeley, CA, Dallas Winter 1988, pages 27-36. A 13068 description of the cache consistency scheme in AFS. Contrasted with 13069 other distributed file systems. 13071 [Macklem] 13072 Macklem, Rick, "Lessons Learned Tuning the 4.3BSD Reno Implementation 13073 of the NFS Protocol," Winter USENIX Conference Proceedings, USENIX 13074 Association, Berkeley, CA, January 1991. 
Describes performance work 13075 in tuning the 4.3BSD Reno NFS implementation. Describes performance 13076 improvement (reduced CPU loading) through elimination of data copies. 13078 [Mogul] 13079 Mogul, Jeffrey C., "A Recovery Protocol for Spritely NFS," USENIX 13080 File System Workshop Proceedings, Ann Arbor, MI, USENIX Association, 13081 Berkeley, CA, May 1992. Second paper on Spritely NFS proposes a 13082 lease-based scheme for recovering state of consistency protocol. 13084 Draft Specification NFS version 4 Protocol September 2002 13086 [Nowicki] 13087 Nowicki, Bill, "Transport Issues in the Network File System," ACM 13088 SIGCOMM newsletter Computer Communication Review, April 1989. A 13089 brief description of the basis for the dynamic retransmission work. 13091 [Pawlowski] 13092 Pawlowski, Brian, Ron Hixon, Mark Stein, Joseph Tumminaro, "Network 13093 Computing in the UNIX and IBM Mainframe Environment," Uniforum `89 13094 Conf. Proc., (1989) Description of an NFS server implementation for 13095 IBM's MVS operating system. 13097 [RFC1094] 13098 Sun Microsystems, Inc., "NFS: Network File System Protocol 13099 Specification", RFC1094, March 1989. 13101 http://www.ietf.org/rfc/rfc1094.txt 13103 [RFC1345] 13104 Simonsen, K., "Character Mnemonics & Character Sets", RFC1345, 13105 Rationel Almen Planlaegning, June 1992. 13107 http://www.ietf.org/rfc/rfc1345.txt 13109 [RFC1700] 13110 Reynolds, J., Postel, J., "Assigned Numbers", RFC1700, ISI, October 13111 1994 13113 http://www.ietf.org/rfc/rfc1700.txt 13115 [RFC1813] 13116 Callaghan, B., Pawlowski, B., Staubach, P., "NFS Version 3 Protocol 13117 Specification", RFC1813, Sun Microsystems, Inc., June 1995. 13119 http://www.ietf.org/rfc/rfc1813.txt 13121 [RFC1831] 13122 Srinivasan, R., "RPC: Remote Procedure Call Protocol Specification 13123 Version 2", RFC1831, Sun Microsystems, Inc., August 1995. 
13125 http://www.ietf.org/rfc/rfc1831.txt 13127 [RFC1832] 13128 Srinivasan, R., "XDR: External Data Representation Standard", 13129 RFC1832, Sun Microsystems, Inc., August 1995. 13133 http://www.ietf.org/rfc/rfc1832.txt 13135 [RFC1833] 13136 Srinivasan, R., "Binding Protocols for ONC RPC Version 2", RFC1833, 13137 Sun Microsystems, Inc., August 1995. 13139 http://www.ietf.org/rfc/rfc1833.txt 13141 [RFC1884] 13142 Hinden, R., Deering, S., "IP Version 6 Addressing Architecture", 13143 RFC1884, December 1995. 13145 http://www.ietf.org/rfc/rfc1884.txt 13147 [RFC1964] 13148 Linn, J., "The Kerberos Version 5 GSS-API Mechanism", RFC1964, 13149 OpenVision Technologies, June 1996. 13151 http://www.ietf.org/rfc/rfc1964.txt 13153 [RFC2025] 13154 Adams, C., "The Simple Public-Key GSS-API Mechanism (SPKM)", RFC2025, 13155 Bell-Northern Research, October 1996. 13157 http://www.ietf.org/rfc/rfc2025.txt 13159 [RFC2054] 13160 Callaghan, B., "WebNFS Client Specification", RFC2054, Sun 13161 Microsystems, Inc., October 1996. 13163 http://www.ietf.org/rfc/rfc2054.txt 13165 [RFC2055] 13166 Callaghan, B., "WebNFS Server Specification", RFC2055, Sun 13167 Microsystems, Inc., October 1996. 13169 http://www.ietf.org/rfc/rfc2055.txt 13171 [RFC2119] 13172 Bradner, S., "Key words for use in RFCs to Indicate Requirement 13173 Levels", RFC2119, Harvard University, March 1997. 13175 http://www.ietf.org/rfc/rfc2119.txt 13179 [RFC2152] 13180 Goldsmith, D., "UTF-7 A Mail-Safe Transformation Format of Unicode", 13181 RFC2152, Apple Computer, Inc., May 1997. 13183 http://www.ietf.org/rfc/rfc2152.txt 13185 [RFC2203] 13186 Eisler, M., Chiu, A., Ling, L., "RPCSEC_GSS Protocol Specification", 13187 RFC2203, Sun Microsystems, Inc., September 1997.
13189 http://www.ietf.org/rfc/rfc2203.txt 13191 [RFC2224] 13192 Callaghan, B., "NFS URL Scheme", RFC2224, Sun Microsystems, Inc., 13193 October 1997. 13195 http://www.ietf.org/rfc/rfc2224.txt 13197 [RFC2277] 13198 Alvestrand, H., "IETF Policy on Character Sets and Languages", 13199 RFC2277, UNINETT, January 1998. 13201 http://www.ietf.org/rfc/rfc2277.txt 13203 [RFC2279] 13204 Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC2279, 13205 Alis Technologies, January 1998. 13207 http://www.ietf.org/rfc/rfc2279.txt 13209 [RFC2623] 13210 Eisler, M., "NFS Version 2 and Version 3 Security Issues and the NFS 13211 Protocol's Use of RPCSEC_GSS and Kerberos V5", RFC2623, Sun 13212 Microsystems, June 1999. 13214 http://www.ietf.org/rfc/rfc2623.txt 13216 [RFC2624] 13217 Shepler, S., "NFS Version 4 Design Considerations", RFC2624, Sun 13218 Microsystems, June 1999. 13220 http://www.ietf.org/rfc/rfc2624.txt 13222 [RFC2743] 13223 Linn, J., "Generic Security Service Application Program Interface, 13227 Version 2, Update 1", RFC2743, RSA Laboratories, January 2000. 13229 http://www.ietf.org/rfc/rfc2743.txt 13231 [RFC2755] 13232 Chiu, A., Eisler, M., Callaghan, B., "Security Negotiation for 13233 WebNFS", RFC2755, Sun Microsystems, January 2000. 13235 http://www.ietf.org/rfc/rfc2755.txt 13237 [RFC2847] 13238 Eisler, M., "LIPKEY - A Low Infrastructure Public Key Mechanism Using 13239 SPKM", RFC2847, Zambeel, June 2000. 13241 http://www.ietf.org/rfc/rfc2847.txt 13243 [Sandberg] 13244 Sandberg, R., D. Goldberg, S. Kleiman, D. Walsh, B. Lyon, "Design 13245 and Implementation of the Sun Network Filesystem," USENIX Conference 13246 Proceedings, USENIX Association, Berkeley, CA, Summer 1985. The 13247 basic paper describing the SunOS implementation of the NFS version 2 13248 protocol; it discusses the goals, protocol specification and 13249 trade-offs. 13251 [Srinivasan] 13252 Srinivasan, V., Jeffrey C.
Mogul, "Spritely NFS: Implementation and 13253 Performance of Cache Consistency Protocols", WRL Research Report 13254 89/5, Digital Equipment Corporation Western Research Laboratory, 100 13255 Hamilton Ave., Palo Alto, CA, 94301, May 1989. This paper analyzes 13256 the effect of applying a Sprite-like consistency protocol to 13257 standard NFS. The issues of recovery in a stateful environment are 13258 covered in [Mogul]. 13260 [Unicode1] 13261 The Unicode Consortium, "The Unicode Standard, Version 3.0", 13262 Addison-Wesley Developers Press, Reading, MA, 2000. ISBN 13263 0-201-61633-5. 13265 More information available at: http://www.unicode.org/ 13267 [Unicode2] 13268 "Unsupported Scripts", Unicode, Inc., The Unicode Consortium, P.O. Box 13269 700519, San Jose, CA 95710-0519 USA, September 1999. 13271 http://www.unicode.org/unicode/standard/unsupported.html 13275 [XNFS] 13276 The Open Group, Protocols for Interworking: XNFS, Version 3W, The 13277 Open Group, 1010 El Camino Real Suite 380, Menlo Park, CA 94025, ISBN 13278 1-85912-184-5, February 1998. 13280 HTML version available: http://www.opengroup.org 13284 20. Authors 13286 20.1. Editor's Address 13288 Spencer Shepler 13289 Sun Microsystems, Inc. 13290 7808 Moonflower Drive 13291 Austin, Texas 78750 13293 Phone: +1 512-349-9376 13294 E-mail: spencer.shepler@sun.com 13296 20.2. Authors' Addresses 13298 Carl Beame 13299 Hummingbird Ltd. 13301 E-mail: beame@bws.com 13303 Brent Callaghan 13304 Sun Microsystems, Inc.
13305 17 Network Circle 13306 Menlo Park, CA 94025 13308 Phone: +1 650-786-5067 13309 E-mail: brent.callaghan@sun.com 13311 Mike Eisler 13312 5765 Chase Point Circle 13313 Colorado Springs, CO 80919 13315 Phone: +1 719-599-9026 13316 E-mail: mike@eisler.com 13318 David Noveck 13319 Network Appliance 13320 375 Totten Pond Road 13321 Waltham, MA 02451 13323 Phone: +1 781-768-5347 13324 E-mail: dnoveck@netapp.com 13326 David Robinson 13327 Sun Microsystems, Inc. 13328 5300 Riata Park Court 13329 Austin, TX 78727 13333 Phone: +1 650-786-5088 13334 E-mail: david.robinson@sun.com 13336 Robert Thurlow 13337 Sun Microsystems, Inc. 13338 500 Eldorado Blvd. 13339 Broomfield, CO 80021 13341 Phone: +1 650-786-5096 13342 E-mail: robert.thurlow@sun.com 13344 20.3. Acknowledgements 13346 The authors thank and acknowledge: 13348 Neil Brown for his extensive review of and comments on various drafts. 13349 Andy Adamson, Jim Rees, and Kendrick Smith from the CITI organization 13350 at the University of Michigan for their implementation efforts and 13351 feedback on the protocol specification. Mike Kupfer for his review 13352 of the file locking and ACL mechanisms. Alan Yoder for his input to 13353 ACL mechanisms. Peter Astrand for his close review of the protocol 13354 specification. Ran Atkinson for his constant reminder that users do 13355 matter. 13359 21. Full Copyright Statement 13361 "Copyright (C) The Internet Society (2000-2002). All Rights 13362 Reserved.
13364 This document and translations of it may be copied and furnished to 13365 others, and derivative works that comment on or otherwise explain it 13366 or assist in its implementation may be prepared, copied, published 13367 and distributed, in whole or in part, without restriction of any 13368 kind, provided that the above copyright notice and this paragraph are 13369 included on all such copies and derivative works. However, this 13370 document itself may not be modified in any way, such as by removing 13371 the copyright notice or references to the Internet Society or other 13372 Internet organizations, except as needed for the purpose of 13373 developing Internet standards in which case the procedures for 13374 copyrights defined in the Internet Standards process must be 13375 followed, or as required to translate it into languages other than 13376 English. 13378 The limited permissions granted above are perpetual and will not be 13379 revoked by the Internet Society or its successors or assigns. 13381 This document and the information contained herein is provided on an 13382 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 13383 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 13384 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 13385 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF 13386 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."