NFS version 4 Working Group                                   S. Shepler
INTERNET-DRAFT                                     Sun Microsystems, Inc.
Document: draft-ietf-nfsv4-rfc3010bis-02.txt                    C. Beame
                                                        Hummingbird Ltd.
                                                            B. Callaghan
                                                  Sun Microsystems, Inc.
                                                               M. Eisler
                                                 Network Appliance, Inc.
                                                               D. Noveck
                                                 Network Appliance, Inc.
                                                             D. Robinson
                                                  Sun Microsystems, Inc.
                                                              R. Thurlow
                                                  Sun Microsystems, Inc.
                                                             August 2002

                         NFS version 4 Protocol

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   NFS version 4 is a distributed filesystem protocol which owes
   heritage to NFS protocol versions 2 [RFC1094] and 3 [RFC1813].

Draft Specification        NFS version 4 Protocol            August 2002

   Unlike earlier versions, the NFS version 4 protocol supports
   traditional file access while integrating support for file locking
   and the mount protocol. In addition, support for strong security
   (and its negotiation), compound operations, client caching, and
   internationalization has been added. Attention has also been given
   to making NFS version 4 operate well in an Internet environment.

Copyright

   Copyright (C) The Internet Society (2000-2002). All Rights Reserved.

Key Words

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Table of Contents

   1. Introduction
   1.1. Inconsistencies of this Document with Section 18
   1.2. Overview of NFS version 4 Features
   1.2.1. RPC and Security
   1.2.2. Procedure and Operation Structure
   1.2.3. Filesystem Model
   1.2.3.1. Filehandle Types
   1.2.3.2. Attribute Types
   1.2.3.3. Filesystem Replication and Migration
   1.2.4. OPEN and CLOSE
   1.2.5. File locking
   1.2.6. Client Caching and Delegation
   1.3. General Definitions
   2. Protocol Data Types
   2.1. Basic Data Types
   2.2. Structured Data Types
   3. RPC and Security Flavor
   3.1. Ports and Transports
   3.1.1. Client Retransmission Behavior
   3.2. Security Flavors
   3.2.1. Security mechanisms for NFS version 4
   3.2.1.1. Kerberos V5 as a security triple
   3.2.1.2. LIPKEY as a security triple
   3.2.1.3. SPKM-3 as a security triple
   3.3. Security Negotiation
   3.3.1. SECINFO
   3.3.2. Security Error
   3.4. Callback RPC Authentication
   4. Filehandles
   4.1. Obtaining the First Filehandle
   4.1.1. Root Filehandle
   4.1.2. Public Filehandle
   4.2. Filehandle Types
   4.2.1. General Properties of a Filehandle
   4.2.2. Persistent Filehandle
   4.2.3. Volatile Filehandle
   4.2.4. One Method of Constructing a Volatile Filehandle
   4.3. Client Recovery from Filehandle Expiration
   5. File Attributes
   5.1. Mandatory Attributes
   5.2. Recommended Attributes
   5.3. Named Attributes
   5.4. Classification of Attributes
   5.5. Mandatory Attributes - Definitions
   5.6. Recommended Attributes - Definitions
   5.7. Time Access
   5.8. Interpreting owner and owner_group
   5.9. Character Case Attributes
   5.10. Quota Attributes
   5.11. Access Control Lists
   5.11.1. ACE type
   5.11.2. ACE Access Mask
   5.11.3. ACE flag
   5.11.4. ACE who
   5.11.5. Mode Attribute
   5.11.6. Mode and ACL Attribute
   5.11.7. mounted_on_fileid
   6. Filesystem Migration and Replication
   6.1. Replication
   6.2. Migration
   6.3. Interpretation of the fs_locations Attribute
   6.4. Filehandle Recovery for Migration or Replication
   7. NFS Server Name Space
   7.1. Server Exports
   7.2. Browsing Exports
   7.3. Server Pseudo Filesystem
   7.4. Multiple Roots
   7.5. Filehandle Volatility
   7.6. Exported Root
   7.7. Mount Point Crossing
   7.8. Security Policy and Name Space Presentation
   8. File Locking and Share Reservations
   8.1. Locking
   8.1.1. Client ID
   8.1.2. Server Release of Clientid
   8.1.3. lock_owner and stateid Definition
   8.1.4. Use of the stateid and Locking
   8.1.5. Sequencing of Lock Requests
   8.1.6. Recovery from Replayed Requests
   8.1.7. Releasing lock_owner State
   8.1.8. Use of Open Confirmation
   8.2. Lock Ranges
   8.3. Upgrading and Downgrading Locks
   8.4. Blocking Locks
   8.5. Lease Renewal
   8.6. Crash Recovery
   8.6.1. Client Failure and Recovery
   8.6.2. Server Failure and Recovery
   8.6.3. Network Partitions and Recovery
   8.7. Recovery from a Lock Request Timeout or Abort
   8.8. Server Revocation of Locks
   8.9. Share Reservations
   8.10. OPEN/CLOSE Operations
   8.10.1. Close and Retention of State Information
   8.11. Open Upgrade and Downgrade
   8.12. Short and Long Leases
   8.13. Clocks, Propagation Delay, and Calculating Lease Expiration
   8.14. Migration, Replication and State
   8.14.1. Migration and State
   8.14.2. Replication and State
   8.14.3. Notification of Migrated Lease
   8.14.4. Migration and the Lease_time Attribute
   9. Client-Side Caching
   9.1. Performance Challenges for Client-Side Caching
   9.2. Delegation and Callbacks
   9.2.1. Delegation Recovery
   9.3. Data Caching
   9.3.1. Data Caching and OPENs
   9.3.2. Data Caching and File Locking
   9.3.3. Data Caching and Mandatory File Locking
   9.3.4. Data Caching and File Identity
   9.4. Open Delegation
   9.4.1. Open Delegation and Data Caching
   9.4.2. Open Delegation and File Locks
   9.4.3. Handling of CB_GETATTR
   9.4.4. Recall of Open Delegation
   9.4.5. Delegation Revocation
   9.5. Data Caching and Revocation
   9.5.1. Revocation Recovery for Write Open Delegation
   9.6. Attribute Caching
   9.7. Name Caching
   9.8. Directory Caching
   10. Minor Versioning
   11. Internationalization
   11.1. Universal Versus Local Character Sets
   11.2. Overview of Universal Character Set Standards
   11.3. Difficulties with UCS-4, UCS-2, Unicode
   11.4. UTF-8 and its solutions
   11.5. Normalization
   11.6. UTF-8 Related Errors
   12. Error Definitions
   13. NFS version 4 Requests
   13.1. Compound Procedure
   13.2. Evaluation of a Compound Request
   13.3. Synchronous Modifying Operations
   13.4. Operation Values
   14. NFS version 4 Procedures
   14.1. Procedure 0: NULL - No Operation
   14.2. Procedure 1: COMPOUND - Compound Operations
   14.2.1. Operation 3: ACCESS - Check Access Rights
   14.2.2. Operation 4: CLOSE - Close File
   14.2.3. Operation 5: COMMIT - Commit Cached Data
   14.2.4. Operation 6: CREATE - Create a Non-Regular File Object
   14.2.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting Recovery
   14.2.6. Operation 8: DELEGRETURN - Return Delegation
   14.2.7. Operation 9: GETATTR - Get Attributes
   14.2.8. Operation 10: GETFH - Get Current Filehandle
   14.2.9. Operation 11: LINK - Create Link to a File
   14.2.10. Operation 12: LOCK - Create Lock
   14.2.11. Operation 13: LOCKT - Test For Lock
   14.2.12. Operation 14: LOCKU - Unlock File
   14.2.13. Operation 15: LOOKUP - Lookup Filename
   14.2.14. Operation 16: LOOKUPP - Lookup Parent Directory
   14.2.15. Operation 17: NVERIFY - Verify Difference in Attributes
   14.2.16. Operation 18: OPEN - Open a Regular File
   14.2.17. Operation 19: OPENATTR - Open Named Attribute Directory
   14.2.18. Operation 20: OPEN_CONFIRM - Confirm Open
   14.2.19. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access
   14.2.20. Operation 22: PUTFH - Set Current Filehandle
   14.2.21. Operation 23: PUTPUBFH - Set Public Filehandle
   14.2.22. Operation 24: PUTROOTFH - Set Root Filehandle
   14.2.23. Operation 25: READ - Read from File
   14.2.24. Operation 26: READDIR - Read Directory
   14.2.25. Operation 27: READLINK - Read Symbolic Link
   14.2.26. Operation 28: REMOVE - Remove Filesystem Object
   14.2.27. Operation 29: RENAME - Rename Directory Entry
   14.2.28. Operation 30: RENEW - Renew a Lease
   14.2.29. Operation 31: RESTOREFH - Restore Saved Filehandle
   14.2.30. Operation 32: SAVEFH - Save Current Filehandle
   14.2.31. Operation 33: SECINFO - Obtain Available Security
   14.2.32. Operation 34: SETATTR - Set Attributes
   14.2.33. Operation 35: SETCLIENTID - Negotiate Clientid
   14.2.34. Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid
   14.2.35. Operation 37: VERIFY - Verify Same Attributes
   14.2.36. Operation 38: WRITE - Write to File
   14.2.37. Operation 39: RELEASE_LOCKOWNER - Release Lockowner State
   14.2.38. Operation 10044: ILLEGAL - Illegal operation
   15. NFS version 4 Callback Procedures
   15.1. Procedure 0: CB_NULL - No Operation
   15.2. Procedure 1: CB_COMPOUND - Compound Operations
   15.2.1. Operation 3: CB_GETATTR - Get Attributes
   15.2.2. Operation 4: CB_RECALL - Recall an Open Delegation
   15.2.3. Operation 10044: CB_ILLEGAL - Illegal Callback Operation
   16. Security Considerations
   17. IANA Considerations
   17.1. Named Attribute Definition
   17.2. ONC RPC Network Identifiers (netids)
   18. RPC definition file
   19. Bibliography
   20. Authors
   20.1. Editor's Address
   20.2. Authors' Addresses
   20.3. Acknowledgements
   21. Full Copyright Statement

1. Introduction

   The NFS version 4 protocol is a further revision of the NFS protocol
   defined already by versions 2 [RFC1094] and 3 [RFC1813]. It retains
   the essential characteristics of previous versions: design for easy
   recovery; independence from transport protocols, operating systems,
   and filesystems; simplicity; and good performance. The NFS version 4
   revision has the following goals:

   o  Improved access and good performance on the Internet.

      The protocol is designed to transit firewalls easily, perform
      well where latency is high and bandwidth is low, and scale to
      very large numbers of clients per server.

   o  Strong security with negotiation built into the protocol.

      The protocol builds on the work of the ONCRPC working group in
      supporting the RPCSEC_GSS protocol. Additionally, the NFS
      version 4 protocol provides a mechanism that allows clients and
      servers to negotiate security, and it requires clients and
      servers to support a minimal set of security schemes.

   o  Good cross-platform interoperability.

      The protocol features a filesystem model that provides a useful,
      common set of features that does not unduly favor one filesystem
      or operating system over another.

   o  Designed for protocol extensions.

      The protocol is designed to accept standard extensions that do
      not compromise backward compatibility.

1.1. Inconsistencies of this Document with Section 18

   Section 18, RPC Definition File, contains the definitions in XDR
   description language of the constructs used by the protocol.
   Prior to Section 18, several of the constructs are reproduced for
   purposes of explanation. The reader is warned of the possibility of
   errors in the reproduced constructs outside of Section 18. For any
   part of the document that is inconsistent with Section 18, Section
   18 is to be considered authoritative.

1.2. Overview of NFS version 4 Features

   To provide context, the major features of the NFS version 4 protocol
   are reviewed here in brief, both for the reader who is familiar with
   the previous versions of the NFS protocol and for the reader who is
   new to the NFS protocols. Some fundamental knowledge is still
   expected of the new reader: familiarity with the RPC and XDR
   protocols as described in [RFC1831] and [RFC1832], respectively, and
   a basic knowledge of filesystems and distributed filesystems.

1.2.1. RPC and Security

   As with previous versions of NFS, the Remote Procedure Call (RPC)
   and External Data Representation (XDR) mechanisms used for the NFS
   version 4 protocol are those defined in [RFC1831] and [RFC1832],
   respectively. To meet end-to-end security requirements, the
   RPCSEC_GSS framework [RFC2203] will be used to extend the basic RPC
   security. With the use of RPCSEC_GSS, various mechanisms can be
   provided to offer authentication, integrity, and privacy to the NFS
   version 4 protocol. Kerberos V5 will be used as described in
   [RFC1964] to provide one security framework. The LIPKEY GSS-API
   mechanism described in [RFC2847] will be used to provide for the use
   of user password and server public key by the NFS version 4
   protocol. With the use of RPCSEC_GSS, other mechanisms may also be
   specified and used for NFS version 4 security.

   To enable in-band security negotiation, the NFS version 4 protocol
   has added a new operation, SECINFO, which provides the client a
   method of querying the server about its policies regarding which
   security mechanisms must be used for access to the server's
   filesystem resources. With this, the client can securely match the
   security mechanism that meets the policies specified at both the
   client and server.

1.2.2. Procedure and Operation Structure

   A significant departure from the previous versions of the NFS
   protocol is the introduction of the COMPOUND procedure. For the NFS
   version 4 protocol, there are two RPC procedures, NULL and COMPOUND.
   The COMPOUND procedure is defined in terms of operations and these
   operations correspond more closely to the traditional NFS
   procedures. With the use of the COMPOUND procedure, the client is
   able to build simple or complex requests. These COMPOUND requests
   allow for a reduction in the number of RPCs needed for logical
   filesystem operations. For example, without previous contact with a
   server a client will be able to read data from a file in one request
   by combining LOOKUP, OPEN, and READ operations in a single COMPOUND
   RPC. With previous versions of the NFS protocol, this type of single
   request was not possible.

   The model used for COMPOUND is very simple. There is no logical
   ORing or ANDing of operations. The operations combined within a
   COMPOUND request are evaluated in order by the server. Once an
   operation returns a failing result, the evaluation ends and the
   results of all evaluated operations are returned to the client.

   The NFS version 4 protocol continues to have the client refer to a
   file or directory at the server by a "filehandle". The COMPOUND
   procedure has a method of passing a filehandle from one operation to
   another within the sequence of operations.
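
   The evaluation model described above can be sketched in Python. This
   is a minimal illustration under stated assumptions, not the
   protocol's definition: the RequestState class, the status strings,
   the toy EXPORTED_PATHS namespace, and the helper functions are all
   hypothetical names invented for this sketch, and a filehandle is
   modeled as a plain path string. Only the operation names PUTROOTFH
   and LOOKUP are borrowed from the operation list, and their behavior
   here is deliberately simplified.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

OK = "OK"  # hypothetical status string, not a protocol constant

# Toy namespace for this sketch: the paths the pretend server knows about.
EXPORTED_PATHS = {"/", "/export", "/export/data.txt"}


@dataclass
class RequestState:
    """State carried from one operation to the next within one request."""
    filehandle: Optional[str] = None  # modeled as a path string


Operation = Callable[[RequestState], str]


def put_root(state: RequestState) -> str:
    """Set the carried filehandle to the root of the toy namespace."""
    state.filehandle = "/"
    return OK


def lookup(name: str) -> Operation:
    """Build an operation that descends from the carried filehandle."""
    def op(state: RequestState) -> str:
        if state.filehandle is None:
            return "NO_FILEHANDLE"
        child = state.filehandle.rstrip("/") + "/" + name
        if child not in EXPORTED_PATHS:
            return "NO_ENTRY"
        state.filehandle = child
        return OK
    return op


def evaluate_compound(
    ops: List[Tuple[str, Operation]]
) -> List[Tuple[str, str]]:
    """Evaluate operations in order; stop after the first failing one."""
    state = RequestState()
    results: List[Tuple[str, str]] = []
    for label, op in ops:
        status = op(state)
        results.append((label, status))
        if status != OK:
            break  # operations after the failure are never evaluated
    return results


if __name__ == "__main__":
    request = [
        ("PUTROOTFH", put_root),
        ("LOOKUP export", lookup("export")),
        ("LOOKUP missing", lookup("missing")),
        ("LOOKUP data.txt", lookup("data.txt")),
    ]
    for label, status in evaluate_compound(request):
        print(label, status)
```

   In this sketch the third operation fails, so the reply carries three
   results and the fourth operation is never attempted. There is no
   branching or rollback: the list is simply walked from the front.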
   There is a concept of a "current filehandle" and "saved filehandle".
   Most operations use the "current filehandle" as the filesystem
   object to operate upon. The "saved filehandle" is used as temporary
   filehandle storage within a COMPOUND procedure as well as an
   additional operand for certain operations.

1.2.3. Filesystem Model

   The general filesystem model used for the NFS version 4 protocol is
   the same as previous versions. The server filesystem is hierarchical
   with the regular files contained within being treated as opaque byte
   streams. In a slight departure, file and directory names are encoded
   with UTF-8 to deal with the basics of internationalization.

   The NFS version 4 protocol does not require a separate protocol to
   provide for the initial mapping between path name and filehandle.
   Instead of using the older MOUNT protocol for this mapping, the
   server provides a ROOT filehandle that represents the logical root
   or top of the filesystem tree provided by the server. The server
   provides multiple filesystems by gluing them together with pseudo
   filesystems. These pseudo filesystems provide for potential gaps in
   the path names between real filesystems.

1.2.3.1. Filehandle Types

   In previous versions of the NFS protocol, the filehandle provided by
   the server was guaranteed to be valid or persistent for the lifetime
   of the filesystem object to which it referred. For some server
   implementations, this persistence requirement has been difficult to
   meet. For the NFS version 4 protocol, this requirement has been
   relaxed by introducing another type of filehandle, volatile. With
   persistent and volatile filehandle types, the server implementation
   can match the abilities of the filesystem at the server along with
   the operating environment.
   The client will have knowledge of the type of filehandle being
   provided by the server and can be prepared to deal with the
   semantics of each.

1.2.3.2. Attribute Types

   The NFS version 4 protocol introduces three classes of filesystem or
   file attributes. Like the additional filehandle type, the
   classification of file attributes has been done to ease server
   implementations along with extending the overall functionality of
   the NFS protocol. This attribute model is structured to be
   extensible such that new attributes can be introduced in minor
   revisions of the protocol without requiring significant rework.

   The three classifications are: mandatory, recommended and named
   attributes. This is a significant departure from the previous
   attribute model used in the NFS protocol. Previously, the attributes
   for the filesystem and file objects were a fixed set of mainly UNIX
   attributes. If the server or client did not support a particular
   attribute, it would have to simulate the attribute as best it could.

   Mandatory attributes are the minimal set of file or filesystem
   attributes that must be provided by the server and must be properly
   represented by the server. Recommended attributes represent
   different filesystem types and operating environments. The
   recommended attributes will allow for better interoperability and
   the inclusion of more operating environments. The mandatory and
   recommended attribute sets are traditional file or filesystem
   attributes. The third type of attribute is the named attribute. A
   named attribute is an opaque byte stream that is associated with a
   directory or file and referred to by a string name. Named attributes
   are meant to be used by client applications as a method to associate
   application specific data with a regular file or directory.
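
   The three attribute classes can be pictured with a small Python
   sketch. Everything in it is an assumption made for illustration: the
   ObjectAttributes class and its method are hypothetical, and the two
   example sets hold only a few illustrative attribute names rather
   than the full lists given in Sections 5.5 and 5.6. The point is the
   shape of the model: a required set, an optional set, and free-form
   named byte streams.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Illustrative subsets only, chosen for this sketch; the authoritative
# lists are the attribute definitions later in the document.
MANDATORY: Set[str] = {"type", "size"}
RECOMMENDED: Set[str] = {"owner", "mode", "acl"}


@dataclass
class ObjectAttributes:
    """Attributes of one file or directory, split into the three classes."""
    mandatory: Dict[str, object] = field(default_factory=dict)
    recommended: Dict[str, object] = field(default_factory=dict)
    # Named attributes: opaque byte streams referred to by a string name.
    named: Dict[str, bytes] = field(default_factory=dict)

    def missing_mandatory(self) -> Set[str]:
        """Return mandatory attribute names the server failed to supply."""
        return MANDATORY - set(self.mandatory)


if __name__ == "__main__":
    attrs = ObjectAttributes(
        mandatory={"type": "regular", "size": 4096},
        recommended={"mode": 0o644},  # "owner" and "acl" simply absent
    )
    # An application attaches its own data under a name of its choosing.
    attrs.named["thumbnail"] = b"\x89PNG..."
    print(sorted(attrs.missing_mandatory()))
    print(sorted(RECOMMENDED - set(attrs.recommended)))
```

   A missing recommended attribute is not an error in this model, while
   a missing mandatory attribute is. Named attributes are never
   interpreted; they are stored and returned as bytes.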

   One significant addition to the recommended set of file attributes
   is the Access Control List (ACL) attribute. This attribute provides
   for directory and file access control beyond the model used in
   previous versions of the NFS protocol. The ACL definition allows for
   specification of user and group level access control.

1.2.3.3. Filesystem Replication and Migration

   With the use of a special file attribute, the ability to migrate or
   replicate server filesystems is enabled within the protocol. The
   filesystem locations attribute provides a method for the client to
   probe the server about the location of a filesystem. In the event of
   a migration of a filesystem, the client will receive an error when
   operating on the filesystem and it can then query as to the new
   filesystem location. Similar steps are used for replication: the
   client is able to query the server for the multiple available
   locations of a particular filesystem. From this information, the
   client can use its own policies to access the appropriate filesystem
   location.

1.2.4. OPEN and CLOSE

   The NFS version 4 protocol introduces OPEN and CLOSE operations. The
   OPEN operation provides a single point where file lookup, creation,
   and share semantics can be combined. The CLOSE operation also
   provides for the release of state accumulated by OPEN.

1.2.5. File locking

   With the NFS version 4 protocol, the support for byte range file
   locking is part of the NFS protocol. The file locking support is
   structured so that an RPC callback mechanism is not required. This
   is a departure from the previous versions of the NFS file locking
   protocol, Network Lock Manager (NLM). The state associated with file
   locks is maintained at the server under a lease-based model. The
   server defines a single lease period for all state held by an NFS
   client.
   If the client does not renew its lease within the defined
   period, all state associated with the client's lease may be released
   by the server. The client may renew its lease with use of the RENEW
   operation or implicitly by use of other operations (primarily READ).

1.2.6. Client Caching and Delegation

   The file, attribute, and directory caching for the NFS version 4
   protocol is similar to previous versions. Attributes and directory
   information are cached for a duration determined by the client. At
   the end of a predefined timeout, the client will query the server to
   see if the related filesystem object has been updated.

   For file data, the client checks its cache validity when the file is
   opened. A query is sent to the server to determine if the file has
   been changed. Based on this information, the client determines if
   the data cache for the file should be kept or released. Also, when
   the file is closed, any modified data is written to the server.

   If an application wants to serialize access to file data, file
   locking of the file data ranges in question should be used.

   The major addition to NFS version 4 in the area of caching is the
   ability of the server to delegate certain responsibilities to the
   client. When the server grants a delegation for a file to a client,
   the client is guaranteed certain semantics with respect to the
   sharing of that file with other clients. At OPEN, the server may
   provide the client either a read or write delegation for the file.
   If the client is granted a read delegation, it is assured that no
   other client has the ability to write to the file for the duration of
   the delegation. If the client is granted a write delegation, the
   client is assured that no other client has read or write access to
   the file.

   Delegations can be recalled by the server.
   If another client
   requests access to the file in such a way that the access conflicts
   with the granted delegation, the server is able to notify the initial
   client and recall the delegation. This requires that a callback path
   exist between the server and client. If this callback path does not
   exist, then delegations cannot be granted. The essence of a
   delegation is that it allows the client to locally service operations
   such as OPEN, CLOSE, LOCK, LOCKU, READ, and WRITE without immediate
   interaction with the server.

1.3. General Definitions

   The following definitions are provided to give the reader an
   appropriate context.

   Client    The "client" is the entity that accesses the NFS server's
             resources. The client may be an application which contains
             the logic to access the NFS server directly. The client
             may also be the traditional operating system client that
             provides remote filesystem services for a set of
             applications.

             In the case of file locking the client is the entity that
             maintains a set of locks on behalf of one or more
             applications. This client is responsible for crash or
             failure recovery for those locks it manages.

             Note that multiple clients may share the same transport and
             multiple clients may exist on the same network node.

   Clientid  A 64-bit quantity used as a unique, short-hand reference to
             a client supplied Verifier and ID. The server is
             responsible for supplying the Clientid.

   Lease     An interval of time defined by the server for which the
             client is irrevocably granted a lock. At the end of a
             lease period the lock may be revoked if the lease has not
             been extended. The lock must be revoked if a conflicting
             lock has been granted after the lease interval.

             All leases granted by a server have the same fixed
             interval.
             Note that the fixed interval was chosen to
             alleviate the expense a server would have in maintaining
             state about variable length leases across server failures.

   Lock      The term "lock" is used to refer to both record
             (byte-range) locks as well as share reservations unless
             specifically stated otherwise.

   Server    The "Server" is the entity responsible for coordinating
             client access to a set of filesystems.

   Stable Storage
             NFS version 4 servers must be able to recover without data
             loss from multiple power failures (including cascading
             power failures, that is, several power failures in quick
             succession), operating system failures, and hardware
             failure of components other than the storage medium itself
             (for example, disk, nonvolatile RAM).

             Some examples of stable storage that are allowable for an
             NFS server include:

             1. Media commit of data, that is, the modified data has
                been successfully written to the disk media,
                for example, the disk platter.

             2. An immediate reply disk drive with battery-backed
                on-drive intermediate storage or uninterruptible power
                system (UPS).

             3. Server commit of data with battery-backed intermediate
                storage and recovery software.

             4. Cache commit with uninterruptible power system (UPS)
                and recovery software.

   Stateid   A 128-bit quantity returned by a server that uniquely
             defines the open and locking state provided by the server
             for a specific open or lock owner for a specific file.

             Stateids composed of all bits 0 or all bits 1 have special
             meaning and are reserved values.

   Verifier  A 64-bit quantity generated by the client that the server
             can use to determine if the client has restarted and lost
             all previous lock state.

2. Protocol Data Types

   The syntax and semantics to describe the data types of the NFS
   version 4 protocol are defined in the XDR [RFC1832] and RPC [RFC1831]
   documents. The next sections build upon the XDR data types to define
   types and structures specific to this protocol.

2.1. Basic Data Types

   Data Type     Definition
   _____________________________________________________________________
   int32_t       typedef int             int32_t;

   uint32_t      typedef unsigned int    uint32_t;

   int64_t       typedef hyper           int64_t;

   uint64_t      typedef unsigned hyper  uint64_t;

   attrlist4     typedef opaque          attrlist4<>;
                 Used for file/directory attributes

   bitmap4       typedef uint32_t        bitmap4<>;
                 Used in attribute array encoding.

   changeid4     typedef uint64_t        changeid4;
                 Used in definition of change_info

   clientid4     typedef uint64_t        clientid4;
                 Shorthand reference to client identification

   component4    typedef utf8string      component4;
                 Represents path name components

   count4        typedef uint32_t        count4;
                 Various count parameters (READ, WRITE, COMMIT)

   length4       typedef uint64_t        length4;
                 Describes LOCK lengths

   linktext4     typedef utf8string      linktext4;
                 Symbolic link contents

   mode4         typedef uint32_t        mode4;
                 Mode attribute data type

   nfs_cookie4   typedef uint64_t        nfs_cookie4;
                 Opaque cookie value for READDIR

   nfs_fh4       typedef opaque          nfs_fh4<NFS4_FHSIZE>;
                 Filehandle definition; NFS4_FHSIZE is defined as 128

   nfs_ftype4    enum                    nfs_ftype4;
                 Various defined file types

   nfsstat4      enum                    nfsstat4;
                 Return value for operations

   offset4       typedef uint64_t        offset4;
                 Various offset designations (READ, WRITE, LOCK, COMMIT)

   pathname4     typedef component4      pathname4<>;
                 Represents path name for LOOKUP, OPEN and others

   qop4          typedef uint32_t        qop4;
                 Quality of protection designation in SECINFO

   sec_oid4      typedef opaque          sec_oid4<>;
                 Security Object Identifier
                 The sec_oid4 data type
                 is not really opaque.
                 Instead it contains an ASN.1 OBJECT IDENTIFIER as used
                 by GSS-API in the mech_type argument to
                 GSS_Init_sec_context. See [RFC2743] for details.

   seqid4        typedef uint32_t        seqid4;
                 Sequence identifier used for file locking

   utf8string    typedef opaque          utf8string<>;
                 UTF-8 encoding for strings

   verifier4     typedef opaque        verifier4[NFS4_VERIFIER_SIZE];
                 Verifier used for various operations (COMMIT, CREATE,
                 OPEN, READDIR, SETCLIENTID, SETCLIENTID_CONFIRM, WRITE)
                 NFS4_VERIFIER_SIZE is defined as 8

2.2. Structured Data Types

   nfstime4

      struct nfstime4 {
              int64_t         seconds;
              uint32_t        nseconds;
      };

      The nfstime4 structure gives the number of seconds and
      nanoseconds since midnight or 0 hour January 1, 1970 Coordinated
      Universal Time (UTC). Values greater than zero for the seconds
      field denote dates after the 0 hour January 1, 1970. Values
      less than zero for the seconds field denote dates before the 0
      hour January 1, 1970. In both cases, the nseconds field is to
      be added to the seconds field for the final time representation.
      For example, if the time to be represented is one-half second
      before 0 hour January 1, 1970, the seconds field would have a
      value of negative one (-1) and the nseconds field would have a
      value of one-half second (500000000). Values greater than
      999,999,999 for nseconds are considered invalid.

      This data type is used to pass time and date information. A
      server converts to and from its local representation of time
      when processing time values, preserving as much accuracy as
      possible. If the precision of timestamps stored for a filesystem
      object is less than defined, loss of precision can occur. An
      adjunct time maintenance protocol is recommended to reduce
      client and server time skew.
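
      The nfstime4 convention for times before 1970 (a negative
      seconds field plus a non-negative nseconds field) can be
      illustrated with a minimal sketch. The function names below are
      hypothetical and are not part of the protocol; the sketch assumes
      the input is a signed count of nanoseconds relative to 0 hour
      January 1, 1970 UTC.

```python
NSEC_PER_SEC = 1_000_000_000


def to_nfstime4(ns_since_epoch):
    """Split a signed nanosecond count into (seconds, nseconds).

    Floor division keeps nseconds in the range 0..999,999,999, so the
    represented time is always seconds + nseconds / 1e9.
    """
    seconds, nseconds = divmod(ns_since_epoch, NSEC_PER_SEC)
    return seconds, nseconds


def is_valid_nfstime4(seconds, nseconds):
    """Check the field ranges: int64_t seconds, nseconds <= 999,999,999."""
    return -(2**63) <= seconds < 2**63 and 0 <= nseconds <= 999_999_999


# One-half second before the epoch, as in the example in the text.
half_second_before_epoch = to_nfstime4(-500_000_000)
```

      With this sketch, one-half second before the epoch yields a
      seconds value of -1 and an nseconds value of 500000000, matching
      the example in the text.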

   time_how4

      enum time_how4 {
              SET_TO_SERVER_TIME4 = 0,
              SET_TO_CLIENT_TIME4 = 1
      };

   settime4

      union settime4 switch (time_how4 set_it) {
       case SET_TO_CLIENT_TIME4:
               nfstime4       time;
       default:
               void;
      };

      The above definitions are used as the attribute definitions to
      set time values. If set_it is SET_TO_SERVER_TIME4, then the
      server uses its local representation of time for the time value.

   specdata4

      struct specdata4 {
              uint32_t specdata1; /* major device number */
              uint32_t specdata2; /* minor device number */
      };

      This data type represents additional information for the device
      file types NF4CHR and NF4BLK.

   fsid4

      struct fsid4 {
              uint64_t        major;
              uint64_t        minor;
      };

      This type is the filesystem identifier that is used as a
      mandatory attribute.

   fs_location4

      struct fs_location4 {
              utf8string      server<>;
              pathname4       rootpath;
      };

   fs_locations4

      struct fs_locations4 {
              pathname4       fs_root;
              fs_location4    locations<>;
      };

      The fs_location4 and fs_locations4 data types are used for the
      fs_locations recommended attribute which is used for migration
      and replication support.

   fattr4

      struct fattr4 {
              bitmap4         attrmask;
              attrlist4       attr_vals;
      };

      The fattr4 structure is used to represent file and directory
      attributes.

      The bitmap is a counted array of 32 bit integers used to contain
      bit values. The position of the integer in the array that
      contains bit n can be computed from the expression (n / 32) and
      its bit within that integer is (n mod 32).

                        0            1
      +-----------+-----------+-----------+--
      |  count    | 31  ..  0 | 63  .. 32 |
      +-----------+-----------+-----------+--

   change_info4

      struct change_info4 {
              bool            atomic;
              changeid4       before;
              changeid4       after;
      };

      This structure is used with the CREATE, LINK, REMOVE, RENAME
      operations to let the client know the value of the change
      attribute for the directory in which the target filesystem
      object resides.

   clientaddr4

      struct clientaddr4 {
              /* see struct rpcb in RFC1833 */
              string r_netid<>;    /* network id */
              string r_addr<>;     /* universal address */
      };

      The clientaddr4 structure is used as part of the SETCLIENTID
      operation to either specify the address of the client that is
      using a clientid or as part of the callback registration. The
      r_netid and r_addr fields are specified in [RFC1833], but they
      are underspecified in [RFC1833] as far as what they should look
      like for specific protocols.

      For TCP over IPv4 and for UDP over IPv4, the format of r_addr is
      the US-ASCII string:

         h1.h2.h3.h4.p1.p2

      The prefix, "h1.h2.h3.h4", is the standard textual form for
      representing an IPv4 address, which is always four octets long.
      Assuming big-endian ordering, h1, h2, h3, and h4 are,
      respectively, the first through fourth octets each converted to
      ASCII-decimal. Assuming big-endian ordering, p1 and p2 are,
      respectively, the first and second octets of the port each
      converted to ASCII-decimal. For example, if a host, in
      big-endian order, has an address of 0x0A010307 and there is a
      service listening on, in big-endian order, port 0x020F (decimal
      527), then the complete universal address is "10.1.3.7.2.15".

      For TCP over IPv4 the value of r_netid is the string "tcp". For
      UDP over IPv4 the value of r_netid is the string "udp".
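
      The IPv4 universal address format described above can be sketched
      as follows. This is a minimal illustration only; the function
      names are hypothetical and do not come from this specification or
      from [RFC1833]. The address and port are assumed to be given as
      unsigned integers.

```python
def ipv4_universal_address(addr, port):
    """Format a 32-bit IPv4 address and 16-bit port as h1.h2.h3.h4.p1.p2."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("IPv4 address must fit in 32 bits")
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    # Big-endian ordering: most significant octet first.
    octets = addr.to_bytes(4, "big") + port.to_bytes(2, "big")
    return ".".join(str(octet) for octet in octets)


def parse_ipv4_universal_address(uaddr):
    """Split h1.h2.h3.h4.p1.p2 into (dotted-quad host string, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError("expected six dot-separated decimal fields")
    values = [int(part) for part in parts]
    if any(not 0 <= value <= 255 for value in values):
        raise ValueError("each field must be a single octet")
    host = ".".join(parts[:4])
    port = values[4] * 256 + values[5]
    return host, port


uaddr = ipv4_universal_address(0x0A010307, 0x020F)
```

      For the example in the text, address 0x0A010307 and port 0x020F
      produce "10.1.3.7.2.15", and parsing that string recovers the
      host "10.1.3.7" and port 527.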

      For TCP over IPv6 and for UDP over IPv6, the format of r_addr is
      the US-ASCII string:

         x1:x2:x3:x4:x5:x6:x7:x8.p1.p2

      The suffix "p1.p2" is the service port, and is computed the same
      way as with universal addresses for TCP and UDP over IPv4. The
      prefix, "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form
      for representing an IPv6 address as defined in Section 2.2 of
      [RFC1884]. Additionally, the two alternative forms specified in
      Section 2.2 of [RFC1884] are also acceptable.

      For TCP over IPv6 the value of r_netid is the string "tcp6".
      For UDP over IPv6 the value of r_netid is the string "udp6".

   cb_client4

      struct cb_client4 {
              unsigned int    cb_program;
              clientaddr4     cb_location;
      };

      This structure is used by the client to inform the server of its
      callback address; it includes the program number and client
      address.

   nfs_client_id4

      struct nfs_client_id4 {
              verifier4       verifier;
              opaque          id<NFS4_OPAQUE_LIMIT>;
      };

      This structure is part of the arguments to the SETCLIENTID
      operation. NFS4_OPAQUE_LIMIT is defined as 1024.

   open_owner4

      struct open_owner4 {
              clientid4       clientid;
              opaque          owner<NFS4_OPAQUE_LIMIT>;
      };

      This structure is used to identify the owner of open state.
      NFS4_OPAQUE_LIMIT is defined as 1024.

   lock_owner4

      struct lock_owner4 {
              clientid4       clientid;
              opaque          owner<NFS4_OPAQUE_LIMIT>;
      };

      This structure is used to identify the owner of file locking
      state. NFS4_OPAQUE_LIMIT is defined as 1024.

   open_to_lock_owner4

      struct open_to_lock_owner4 {
              seqid4          open_seqid;
              stateid4        open_stateid;
              seqid4          lock_seqid;
              lock_owner4     lock_owner;
      };

      This structure is used for the first LOCK operation done for an
      open_owner4.
      It provides both the open_stateid and lock_owner
      such that the transition is made from a valid open_stateid
      sequence to that of the new lock_stateid sequence. Using this
      mechanism avoids the confirmation of the lock_owner/lock_seqid
      pair since it is tied to established state in the form of the
      open_stateid/open_seqid.

   stateid4

      struct stateid4 {
              uint32_t        seqid;
              opaque          other[12];
      };

      This structure is used for the various state sharing mechanisms
      between the client and server. For the client, this data
      structure is read-only. The starting value of the seqid field
      is undefined. The server is required to increment the seqid
      field monotonically at each transition of the stateid. This is
      important since the client will inspect the seqid in OPEN
      stateids to determine the order of OPEN processing done by the
      server.

3. RPC and Security Flavor

   The NFS version 4 protocol is a Remote Procedure Call (RPC)
   application that uses RPC version 2 and the corresponding eXternal
   Data Representation (XDR) as defined in [RFC1831] and [RFC1832]. The
   RPCSEC_GSS security flavor as defined in [RFC2203] MUST be used as
   the mechanism to deliver stronger security for the NFS version 4
   protocol.

3.1. Ports and Transports

   Historically, NFS version 2 and version 3 servers have resided on
   port 2049. The registered port 2049 [RFC1700] for the NFS protocol
   should be the default configuration. Using the registered port for
   NFS services means the NFS client will not need to use the RPC
   binding protocols as described in [RFC1833]; this will allow NFS to
   transit firewalls.

   The transport used by the RPC service for the NFS version 4 protocol
   MUST provide congestion control comparable to that defined for TCP in
   [RFC2581].
   If the operating environment implements TCP, the NFS
   version 4 protocol SHOULD be supported over TCP. The NFS client and
   server MAY use other transports if they support congestion control as
   defined above and in those cases a mechanism may be provided to
   override TCP usage in favor of another transport.

   If TCP is used as the transport, the client and server SHOULD use
   persistent connections. This will prevent the weakening of TCP's
   congestion control via short lived connections and will improve
   performance for the WAN environment by eliminating the need for SYN
   handshakes.

   Note that for various timers, the client and server should avoid
   inadvertent synchronization of those timers. For further discussion
   of the general issue refer to [Floyd].

3.1.1. Client Retransmission Behavior

   When processing a request received over a reliable transport such as
   TCP, the NFS version 4 server MUST NOT silently drop the request,
   except if the transport connection has been broken. Given such a
   contract between NFS version 4 clients and servers, clients MUST NOT
   retry a request unless one or both of the following are true:

   o  The transport connection has been broken

   o  The procedure being retried is the NULL procedure

   Because transports, including TCP, do not always synchronously
   inform a peer when the other peer has broken the connection (for
   example, when an NFS server reboots), the NFS version 4 client may
   want to actively "probe" the connection to see if it has been
   broken. Use of the NULL procedure is one recommended way to do so.
   So, when a client experiences a remote procedure call timeout (of
   some arbitrary implementation specific amount), rather than retrying
   the remote procedure call, it could instead issue a NULL procedure
   call to the server.
   If the server has died, the transport connection break will
   eventually be indicated to the NFS version 4 client. The client can
   then reconnect, and then retry the original request. If the NULL
   procedure call gets a response, the connection has not broken. The
   client can decide to wait longer for the original request's response,
   or it can break the transport connection and reconnect before
   re-sending the original request.

   For callbacks from the server to the client, the same rules apply,
   but the server doing the callback becomes the client, and the client
   receiving the callback becomes the server.

3.2. Security Flavors

   Traditional RPC implementations have included AUTH_NONE, AUTH_SYS,
   AUTH_DH, and AUTH_KRB4 as security flavors. With [RFC2203] an
   additional security flavor of RPCSEC_GSS has been introduced which
   uses the functionality of GSS-API [RFC2743]. This allows for the use
   of various security mechanisms by the RPC layer without the
   additional implementation overhead of adding RPC security flavors.
   For NFS version 4, the RPCSEC_GSS security flavor MUST be used to
   enable the mandatory security mechanism. Other flavors, such as
   AUTH_NONE, AUTH_SYS, and AUTH_DH, MAY be implemented as well.

3.2.1. Security mechanisms for NFS version 4

   The use of RPCSEC_GSS requires selection of: mechanism, quality of
   protection, and service (authentication, integrity, privacy). The
   remainder of this document will refer to these three parameters of
   the RPCSEC_GSS security as the security triple.

3.2.1.1. Kerberos V5 as a security triple

   The Kerberos V5 GSS-API mechanism as described in [RFC1964] MUST be
   implemented and provide the following security triples.

   column descriptions:

   1 == number of pseudo flavor
   2 == name of pseudo flavor
   3 == mechanism's OID
   4 == mechanism's algorithm(s)
   5 == RPCSEC_GSS service

   1      2     3                    4              5
   -----------------------------------------------------------------------
   390003 krb5  1.2.840.113554.1.2.2 DES MAC MD5    rpc_gss_svc_none
   390004 krb5i 1.2.840.113554.1.2.2 DES MAC MD5    rpc_gss_svc_integrity
   390005 krb5p 1.2.840.113554.1.2.2 DES MAC MD5    rpc_gss_svc_privacy
                                     for integrity,
                                     and 56 bit DES
                                     for privacy.

   Note that the pseudo flavor is presented here as a mapping aid to the
   implementor. Because this NFS protocol includes a method to
   negotiate security and it understands the GSS-API mechanism, the
   pseudo flavor is not needed. The pseudo flavor is needed for NFS
   version 3 since the security negotiation is done via the MOUNT
   protocol.

   For a discussion of NFS' use of RPCSEC_GSS and Kerberos V5, please
   see [RFC2623].

3.2.1.2. LIPKEY as a security triple

   The LIPKEY GSS-API mechanism as described in [RFC2847] MUST be
   implemented and provide the following security triples. The
   definition of the columns matches the previous subsection "Kerberos
   V5 as a security triple".

   1      2        3             4          5
   -----------------------------------------------------------------------
   390006 lipkey   1.3.6.1.5.5.9 negotiated rpc_gss_svc_none
   390007 lipkey-i 1.3.6.1.5.5.9 negotiated rpc_gss_svc_integrity
   390008 lipkey-p 1.3.6.1.5.5.9 negotiated rpc_gss_svc_privacy

   The mechanism algorithm is listed as "negotiated". This is because
   LIPKEY is layered on SPKM-3 and in SPKM-3 [RFC2847] the
   confidentiality and integrity algorithms are negotiated.
   Since
   SPKM-3 specifies HMAC-MD5 for integrity as MANDATORY, 128 bit
   cast5CBC for confidentiality (privacy) as MANDATORY, and further
   specifies that HMAC-MD5 and cast5CBC MUST be listed first before
   weaker algorithms, specifying "negotiated" in column 4 does not
   impair interoperability. In the event an SPKM-3 peer does not
   support the mandatory algorithms, the other peer is free to accept or
   reject the GSS-API context creation.

   Because SPKM-3 negotiates the algorithms, subsequent calls to
   LIPKEY's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality
   of protection value of 0 (zero). See section 5.2 of [RFC2025] for an
   explanation.

   LIPKEY uses SPKM-3 to create a secure channel in which to pass a user
   name and password from the client to the server. Once the user name
   and password have been accepted by the server, calls to the LIPKEY
   context are redirected to the SPKM-3 context. See [RFC2847] for more
   details.

3.2.1.3. SPKM-3 as a security triple

   The SPKM-3 GSS-API mechanism as described in [RFC2847] MUST be
   implemented and provide the following security triples. The
   definition of the columns matches the previous subsection "Kerberos
   V5 as a security triple".

   1      2      3               4          5
   -----------------------------------------------------------------------
   390009 spkm3  1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_none
   390010 spkm3i 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_integrity
   390011 spkm3p 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_privacy

   For a discussion as to why the mechanism algorithm is listed as
   "negotiated", see the previous section "LIPKEY as a security triple".

   Because SPKM-3 negotiates the algorithms, subsequent calls to
   SPKM-3's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality
   of protection value of 0 (zero).
   See section 5.2 of [RFC2025] for an
   explanation.

   Even though LIPKEY is layered over SPKM-3, SPKM-3 is specified as a
   mandatory set of triples to handle the situations where the initiator
   (the client) is anonymous or where the initiator has its own
   certificate. If the initiator is anonymous, there will not be a user
   name and password to send to the target (the server). If the
   initiator has its own certificate, then using passwords is
   superfluous.

3.3. Security Negotiation

   With the NFS version 4 server potentially offering multiple security
   mechanisms, the client needs a method to determine or negotiate which
   mechanism is to be used for its communication with the server. The
   NFS server may have multiple points within its filesystem name space
   that are available for use by NFS clients. In turn the NFS server
   may be configured such that each of these entry points may have
   different or multiple security mechanisms in use.

   The security negotiation between client and server must be done with
   a secure channel to eliminate the possibility of a third party
   intercepting the negotiation sequence and forcing the client and
   server to choose a lower level of security than required or desired.
   See the section "Security Considerations" for further discussion.

3.3.1. SECINFO

   The new SECINFO operation will allow the client to determine, on a
   per filehandle basis, what security triple is to be used for server
   access. In general, the client will not have to use the SECINFO
   operation except during initial communication with the server or when
   the client crosses policy boundaries at the server. It is possible
   that the server's policies change during the client's interaction,
   thereby forcing the client to negotiate a new security triple.

3.3.2. Security Error

   Based on the assumption that each NFS version 4 client and server
   must support a minimum set of security (i.e. LIPKEY, SPKM-3, and
   Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its
   communication with the server with one of the minimal security
   triples. During communication with the server, the client may
   receive an NFS error of NFS4ERR_WRONGSEC. This error allows the
   server to notify the client that the security triple currently being
   used is not appropriate for access to the server's filesystem
   resources. The client is then responsible for determining what
   security triples are available at the server and choosing one which
   is appropriate for the client. See the section for the "SECINFO"
   operation for further discussion of how the client will respond to
   the NFS4ERR_WRONGSEC error and use SECINFO.

3.4. Callback RPC Authentication

   Except as noted elsewhere in this section, the callback RPC
   (described later) MUST mutually authenticate the NFS server to the
   principal that acquired the clientid (also described later), using
   the security flavor the original SETCLIENTID operation used.

   For AUTH_NONE, there are no principals, so this is a non-issue.

   AUTH_SYS has no notion of mutual authentication or a server
   principal, so the callback from the server simply uses the AUTH_SYS
   credential that the user used when setting up the delegation.

   For AUTH_DH, one commonly used convention is that the server uses the
   credential corresponding to this AUTH_DH principal:

      unix.host@domain

   where host and domain are variables corresponding to the name of
   server host and directory services domain in which it lives such as a
   Network Information System domain or a DNS domain.

   Because LIPKEY is layered over SPKM-3, it is permissible for the
   server to use SPKM-3 and not LIPKEY for the callback even if the
   client used LIPKEY for SETCLIENTID.

   Regardless of what security mechanism under RPCSEC_GSS is being used,
   the NFS server MUST identify itself in GSS-API via a
   GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE
   names are of the form:

      service@hostname

   For NFS, the "service" element is

      nfs

   Implementations of security mechanisms will convert nfs@hostname to
   various different forms. For Kerberos V5 and LIPKEY, the following
   form is RECOMMENDED:

      nfs/hostname

   For Kerberos V5, nfs/hostname would be a server principal in the
   Kerberos Key Distribution Center database. For LIPKEY, this would be
   the username passed to the target (the NFS version 4 client that
   receives the callback).

   It should be noted that LIPKEY may not work for callbacks, since the
   LIPKEY client uses a user id/password. If the NFS client receiving
   the callback can authenticate the NFS server's user name/password
   pair, and if the user that the NFS server is authenticating to has a
   public key certificate, then it works.

   In situations where the NFS client uses LIPKEY and uses a per-host
   principal for the SETCLIENTID operation, instead of using LIPKEY for
   SETCLIENTID, it is RECOMMENDED that SPKM-3 with mutual authentication
   be used. This effectively means that the client will use a
   certificate to authenticate and identify the initiator to the target
   on the NFS server. Using SPKM-3 and not LIPKEY has the following
   advantages:

   o  When the server does a callback, it must authenticate to the
      principal used in the SETCLIENTID.
      Even if LIPKEY is used,
      because LIPKEY is layered over SPKM-3, the NFS client will need
      to have a certificate that corresponds to the principal used in
      the SETCLIENTID operation. From an administrative perspective,
      having a user name, password, and certificate for both the
      client and server is redundant.

   o  LIPKEY was intended to minimize additional infrastructure
      requirements beyond a certificate for the target, and the
      expectation is that existing password infrastructure can be
      leveraged for the initiator. In some environments, a per-host
      password does not exist yet. If certificates are used for any
      per-host principals, then additional password infrastructure is
      not needed.

   o  In cases when a host is both an NFS client and server, it can
      share the same per-host certificate.

4. Filehandles

   The filehandle in the NFS protocol is a per server unique identifier
   for a filesystem object. The contents of the filehandle are opaque
   to the client. Therefore, the server is responsible for translating
   the filehandle to an internal representation of the filesystem
   object.

4.1. Obtaining the First Filehandle

   The operations of the NFS protocol are defined in terms of one or
   more filehandles. Therefore, the client needs a filehandle to
   initiate communication with the server. With the NFS version 2
   protocol [RFC1094] and the NFS version 3 protocol [RFC1813], there
   exists an ancillary protocol to obtain this first filehandle. The
   MOUNT protocol, RPC program number 100005, provides the mechanism of
   translating a string based filesystem path name to a filehandle which
   can then be used by the NFS protocols.

   The MOUNT protocol has deficiencies in the area of security and use
   via firewalls.
This is one reason that the use of the public 1318 filehandle was introduced in [RFC2054] and [RFC2055]. With the use 1319 of the public filehandle in combination with the LOOKUP operation in 1320 the NFS version 2 and 3 protocols, it has been demonstrated that the 1321 MOUNT protocol is unnecessary for viable interaction between NFS 1322 client and server. 1324 Therefore, the NFS version 4 protocol will not use an ancillary 1325 protocol for translation from string based path names to a 1326 filehandle. Two special filehandles will be used as starting points 1327 for the NFS client. 1329 4.1.1. Root Filehandle 1331 The first of the special filehandles is the ROOT filehandle. The 1332 ROOT filehandle is the "conceptual" root of the filesystem name space 1333 at the NFS server. The client uses or starts with the ROOT 1334 filehandle by employing the PUTROOTFH operation. The PUTROOTFH 1335 operation instructs the server to set the "current" filehandle to the 1336 ROOT of the server's file tree. Once this PUTROOTFH operation is 1337 used, the client can then traverse the entirety of the server's file 1338 tree with the LOOKUP operation. A complete discussion of the server 1339 name space is in the section "NFS Server Name Space". 1341 4.1.2. Public Filehandle 1343 The second special filehandle is the PUBLIC filehandle. Unlike the 1344 ROOT filehandle, the PUBLIC filehandle may be bound or represent an 1345 arbitrary filesystem object at the server. The server is responsible 1347 Draft Specification NFS version 4 Protocol August 2002 1349 for this binding. It may be that the PUBLIC filehandle and the ROOT 1350 filehandle refer to the same filesystem object. However, it is up to 1351 the administrative software at the server and the policies of the 1352 server administrator to define the binding of the PUBLIC filehandle 1353 and server filesystem object. The client may not make any 1354 assumptions about this binding. 
The client uses the PUBLIC filehandle 1355 via the PUTPUBFH operation. 1357 4.2. Filehandle Types 1359 In the NFS version 2 and 3 protocols, there was one type of 1360 filehandle with a single set of semantics. This type of filehandle 1361 is termed "persistent" in NFS Version 4. The semantics of a 1362 persistent filehandle remain the same as before. A new type of 1363 filehandle introduced in NFS Version 4 is the "volatile" filehandle, 1364 which attempts to accommodate certain server environments. 1366 The volatile filehandle type was introduced to address server 1367 functionality or implementation issues which make correct 1368 implementation of a persistent filehandle infeasible. Some server 1369 environments do not provide a filesystem level invariant that can be 1370 used to construct a persistent filehandle. The underlying server 1371 filesystem may not provide the invariant or the server's filesystem 1372 programming interfaces may not provide access to the needed 1373 invariant. Volatile filehandles may ease the implementation of 1374 server functionality such as hierarchical storage management or 1375 filesystem reorganization or migration. However, the volatile 1376 filehandle increases the implementation burden for the client. 1378 Since the client will need to handle persistent and volatile 1379 filehandles differently, a file attribute is defined which may be 1380 used by the client to determine the filehandle types being returned 1381 by the server. 1383 4.2.1. General Properties of a Filehandle 1385 The filehandle contains all the information the server needs to 1386 distinguish an individual file. To the client, the filehandle is 1387 opaque. The client stores filehandles for use in a later request and 1388 can compare two filehandles from the same server for equality by 1389 doing a byte-by-byte comparison. However, the client MUST NOT 1390 otherwise interpret the contents of filehandles. 
If two filehandles 1391 from the same server are equal, they MUST refer to the same file. 1392 Servers SHOULD try to maintain a one-to-one correspondence between 1393 filehandles and files but this is not required. Clients MUST use 1394 filehandle comparisons only to improve performance, not for correct 1395 behavior. All clients need to be prepared for situations in which it 1396 cannot be determined whether two filehandles denote the same object 1397 and, in such cases, avoid making invalid assumptions which might cause 1398 incorrect behavior. Further discussion of filehandle and attribute 1402 comparison in the context of data caching is presented in the section 1403 "Data Caching and File Identity". 1405 As an example, in the case that two different path names when 1406 traversed at the server terminate at the same filesystem object, the 1407 server SHOULD return the same filehandle for each path. This can 1408 occur if a hard link is used to create two file names which refer to 1409 the same underlying file object and associated data. For example, if 1410 paths /a/b/c and /a/d/c refer to the same file, the server SHOULD 1411 return the same filehandle for both path name traversals. 1413 4.2.2. Persistent Filehandle 1415 A persistent filehandle is defined as having a fixed value for the 1416 lifetime of the filesystem object to which it refers. Once the 1417 server creates the filehandle for a filesystem object, the server 1418 MUST accept the same filehandle for the object for the lifetime of 1419 the object. If the server restarts or reboots, the NFS server must 1420 honor the same filehandle value as it did in the server's previous 1421 instantiation. Similarly, if the filesystem is migrated, the new NFS 1422 server must honor the same filehandle as the old NFS server. 1424 The persistent filehandle will become stale or invalid when the 1425 filesystem object is removed.
When the server is presented with a 1426 persistent filehandle that refers to a deleted object, it MUST return 1427 an error of NFS4ERR_STALE. A filehandle may become stale when the 1428 filesystem containing the object is no longer available. The 1429 filesystem may become unavailable if it exists on removable media and the 1430 media is no longer available at the server, or the filesystem as a whole 1431 has been destroyed, or the filesystem has simply been removed from the 1432 server's name space (i.e. unmounted in a UNIX environment). 1434 4.2.3. Volatile Filehandle 1436 A volatile filehandle does not share the same longevity 1437 characteristics as a persistent filehandle. The server may determine 1438 that a volatile filehandle is no longer valid at many different 1439 points in time. If the server can definitively determine that a 1440 volatile filehandle refers to an object that has been removed, the 1441 server should return NFS4ERR_STALE to the client (as is the case for 1442 persistent filehandles). In all other cases where the server 1443 determines that a volatile filehandle can no longer be used, it 1444 should return an error of NFS4ERR_FHEXPIRED. 1446 The mandatory attribute "fh_expire_type" is used by the client to 1447 determine what type of filehandle the server is providing for a 1448 particular filesystem. This attribute is a bitmask with the 1449 following values: 1453 FH4_PERSISTENT 1454 The value of FH4_PERSISTENT is used to indicate a persistent 1455 filehandle, which is valid until the object is removed from the 1456 filesystem. The server will not return NFS4ERR_FHEXPIRED for 1457 this filehandle. FH4_PERSISTENT is defined as a value in which 1458 none of the bits specified below are set. 1460 FH4_VOLATILE_ANY 1461 The filehandle may expire at any time, except as specifically 1462 excluded (i.e. FH4_NOEXPIRE_WITH_OPEN).
1464 FH4_NOEXPIRE_WITH_OPEN 1465 May only be set when FH4_VOLATILE_ANY is set. If this bit is 1466 set, then the meaning of FH4_VOLATILE_ANY is qualified to 1467 exclude any expiration of the filehandle when it is open. 1469 FH4_VOL_MIGRATION 1470 The filehandle will expire as a result of migration. If 1471 FH4_VOLATILE_ANY is set, FH4_VOL_MIGRATION is redundant. 1473 FH4_VOL_RENAME 1474 The filehandle will expire during rename. This includes a 1475 rename by the requesting client or a rename by any other client. 1476 If FH4_VOLATILE_ANY is set, FH4_VOL_RENAME is redundant. 1478 Servers which provide volatile filehandles that may expire while 1479 open (i.e. if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set or if 1480 FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN is not set) 1481 should deny a RENAME or REMOVE that would affect an OPEN file or 1482 any of the components leading to the OPEN file. In addition, 1483 the server should deny all RENAME or REMOVE requests during the 1484 grace period upon server restart. 1486 Note that the bits FH4_VOL_MIGRATION and FH4_VOL_RENAME allow 1487 the client to determine that expiration has occurred whenever a 1488 specific event occurs, without an explicit filehandle expiration 1489 error from the server. FH4_VOLATILE_ANY does not provide this form 1490 of information. In situations where the server will expire many, 1491 but not all, filehandles upon migration (e.g. all but those that 1492 are open), FH4_VOLATILE_ANY (in this case with 1493 FH4_NOEXPIRE_WITH_OPEN) is a better choice since the client may 1494 not assume that all filehandles will expire when migration 1495 occurs, and it is likely that additional expirations will occur 1496 (as a result of file CLOSE) that are separated in time from the 1497 migration event itself. 1499 4.2.4.
One Method of Constructing a Volatile Filehandle 1501 As mentioned, in some instances a filehandle is stale (no longer 1502 valid; perhaps because the file was removed from the server) or it is 1503 expired (the underlying file is valid but since the filehandle is 1507 volatile, it may have expired). Thus the server needs to be able to 1508 return NFS4ERR_STALE in the former case and NFS4ERR_FHEXPIRED in the 1509 latter case. This can be done by careful construction of the volatile 1510 filehandle. One possible implementation follows. 1512 A volatile filehandle, while opaque to the client, could contain: 1514 [volatile bit = 1 | server boot time | slot | generation number] 1516 o slot is an index in the server volatile filehandle table 1518 o generation number is the generation number for the table 1519 entry/slot 1521 If the server boot time recorded in the filehandle is earlier than the current server boot time, 1522 return NFS4ERR_FHEXPIRED. If slot is out of range, return 1523 NFS4ERR_BADHANDLE. If the generation number does not match, return 1524 NFS4ERR_FHEXPIRED. 1526 When the server reboots, the table is gone (it is volatile). 1528 If the volatile bit is 0, then it is a persistent filehandle with a 1529 different structure following it. 1531 4.3. Client Recovery from Filehandle Expiration 1533 If possible, the client SHOULD recover from the receipt of an 1534 NFS4ERR_FHEXPIRED error. The client must take on additional 1535 responsibility so that it may prepare itself to recover from the 1536 expiration of a volatile filehandle. If the server returns 1537 persistent filehandles, the client does not need these additional 1538 steps. 1540 For volatile filehandles, most commonly the client will need to store 1541 the component names leading up to and including the filesystem object 1542 in question.
With these names, the client should be able to recover 1543 by finding a filehandle in the name space that is still available or 1544 by starting at the root of the server's filesystem name space. 1546 If the expired filehandle refers to an object that has been removed 1547 from the filesystem, obviously the client will not be able to recover 1548 from the expired filehandle. 1550 It is also possible that the expired filehandle refers to a file that 1551 has been renamed. If the file was renamed by another client, again 1552 it is possible that the original client will not be able to recover. 1553 However, in the case that the client itself is renaming the file and 1554 the file is open, it is possible that the client may be able to 1555 recover. The client can determine the new path name based on the 1556 processing of the rename request. The client can then regenerate the 1558 Draft Specification NFS version 4 Protocol August 2002 1560 new filehandle based on the new path name. The client could also use 1561 the compound operation mechanism to construct a set of operations 1562 like: 1563 RENAME A B 1564 LOOKUP B 1565 GETFH 1566 Note that the COMPOUND procedure does not provide atomicity. This 1567 example only reduces the overhead of recovering from an expired 1568 filehandle. 1570 Draft Specification NFS version 4 Protocol August 2002 1572 5. File Attributes 1574 To meet the requirements of extensibility and increased 1575 interoperability with non-UNIX platforms, attributes must be handled 1576 in a flexible manner. The NFS version 3 fattr3 structure contains a 1577 fixed list of attributes that not all clients and servers are able to 1578 support or care about. The fattr3 structure can not be extended as 1579 new needs arise and it provides no way to indicate non-support. 
With 1580 the NFS version 4 protocol, the client is able to query what attributes 1581 the server supports and construct requests with only those supported 1582 attributes (or a subset thereof). 1584 To this end, attributes are divided into three groups: mandatory, 1585 recommended, and named. Both mandatory and recommended attributes 1586 are supported in the NFS version 4 protocol by a specific and well-defined 1587 encoding and are identified by number. They are requested by 1588 setting a bit in the bit vector sent in the GETATTR request; the 1589 server response includes a bit vector to list what attributes were 1590 returned in the response. New mandatory or recommended attributes 1591 may be added to the NFS protocol between major revisions by 1592 publishing a standards-track RFC which allocates a new attribute 1593 number value and defines the encoding for the attribute. See the 1594 section "Minor Versioning" for further discussion. 1596 Named attributes are accessed by the new OPENATTR operation, which 1597 accesses a hidden directory of attributes associated with a 1598 filesystem object. OPENATTR takes a filehandle for the object and 1599 returns the filehandle for the attribute hierarchy. The filehandle 1600 for the named attributes is a directory object accessible by LOOKUP 1601 or READDIR and contains files whose names represent the named 1602 attributes and whose data bytes are the value of the attribute. For 1603 example: 1605 LOOKUP "foo" ; look up file 1606 GETATTR attrbits 1607 OPENATTR ; access foo's named attributes 1608 LOOKUP "x11icon" ; look up specific attribute 1609 READ 0,4096 ; read stream of bytes 1611 Named attributes are intended for data needed by applications rather 1612 than by an NFS client implementation. NFS implementors are strongly 1613 encouraged to define their new attributes as recommended attributes 1614 by bringing them to the IETF standards-track process.
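As a non-normative illustration of the bit vector used to request mandatory and recommended attributes, the following Python sketch builds and reads such a vector. It assumes the vector is an array of 32-bit words in which attribute number n occupies bit (n mod 32) of word (n / 32). The helper names are hypothetical, and the attribute numbers are the ones assigned in the definition tables of this section.

```python
# Attribute numbers as assigned in the attribute definition tables.
ATTR_NUMBERS = {"supp_attr": 0, "type": 1, "size": 4, "fsid": 8, "time_modify": 53}


def encode_bitmap(names):
    """Return a list of 32-bit words with one bit set per named attribute.

    Assumed layout: attribute n is bit (n % 32) of word (n // 32).
    """
    words = []
    for name in names:
        word, bit = divmod(ATTR_NUMBERS[name], 32)
        while len(words) <= word:
            words.append(0)  # grow the vector only as far as needed
        words[word] |= 1 << bit
    return words


def decode_bitmap(words):
    """Return the sorted attribute numbers whose bits are set in the vector."""
    numbers = []
    for index, word in enumerate(words):
        for bit in range(32):
            if word & (1 << bit):
                numbers.append(index * 32 + bit)
    return numbers


def missing_attributes(requested, returned):
    """Attribute numbers that were requested but are absent from the reply."""
    got = set(decode_bitmap(returned))
    return [n for n in decode_bitmap(requested) if n not in got]
```

A client that compares the vector it sent with the vector in the reply, as missing_attributes does, can tell which attributes the server did not return and then decide whether to do without them.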
1616 The set of attributes which are classified as mandatory is 1617 deliberately small since servers must do whatever it takes to support 1618 them. A server should support as many of the recommended attributes 1619 as possible but by their definition, the server is not required to 1620 support all of them. Attributes are deemed mandatory if the data is 1621 both needed by a large number of clients and is not otherwise 1623 Draft Specification NFS version 4 Protocol August 2002 1625 reasonably computable by the client when support is not provided on 1626 the server. 1628 Note that the hidden directory returned by OPENATTR is a convenience 1629 for protocol processing. The client should not make any assumptions 1630 about the server's implementation of named attributes and whether the 1631 underlying filesystem at the server has a named attribute directory 1632 or not. Therefore, operations such as SETATTR and GETATTR on the 1633 named attribute directory are undefined. 1635 5.1. Mandatory Attributes 1637 These MUST be supported by every NFS version 4 client and server in 1638 order to ensure a minimum level of interoperability. The server must 1639 store and return these attributes and the client must be able to 1640 function with an attribute set limited to these attributes. With 1641 just the mandatory attributes some client functionality may be 1642 impaired or limited in some ways. A client may ask for any of these 1643 attributes to be returned by setting a bit in the GETATTR request and 1644 the server must return their value. 1646 5.2. Recommended Attributes 1648 These attributes are understood well enough to warrant support in the 1649 NFS version 4 protocol. However, they may not be supported on all 1650 clients and servers. A client may ask for any of these attributes to 1651 be returned by setting a bit in the GETATTR request but must handle 1652 the case where the server does not return them. 
A client may ask for 1653 the set of attributes the server supports and should not request 1654 attributes the server does not support. A server should be tolerant 1655 of requests for unsupported attributes and simply not return them 1656 rather than considering the request an error. It is expected that 1657 servers will support all attributes they comfortably can and only 1658 fail to support attributes which are difficult to support in their 1659 operating environments. A server should provide attributes whenever 1660 it does not have to "tell lies" to the client. For example, a file 1661 modification time should either be an accurate time or not be 1662 supported by the server. This will not always be comfortable to 1663 clients but the client is better positioned to decide whether and how to 1664 fabricate or construct an attribute or whether to do without the 1665 attribute. 1667 5.3. Named Attributes 1669 These attributes are not supported by direct encoding in the NFS 1670 Version 4 protocol but are accessed by string names rather than 1671 numbers and correspond to an uninterpreted stream of bytes which is 1672 stored with the filesystem object. The name space for these 1676 attributes may be accessed by using the OPENATTR operation. The 1677 OPENATTR operation returns a filehandle for a virtual "attribute 1678 directory" and further perusal of the name space may be done using 1679 READDIR and LOOKUP operations on this filehandle. Named attributes 1680 may then be examined or changed by normal READ and WRITE and CREATE 1681 operations on the filehandles returned from READDIR and LOOKUP. 1682 Named attributes may have attributes. 1684 It is recommended that servers support arbitrary named attributes. A 1685 client should not depend on the ability to store any named attributes 1686 in the server's filesystem.
If a server does support named 1687 attributes, a client which is also able to handle them should be able 1688 to copy a file's data and meta-data with complete transparency from 1689 one location to another; this would imply that names allowed for 1690 regular directory entries are valid for named attribute names as 1691 well. 1693 Names of attributes will not be controlled by this document or other 1694 IETF standards track documents. See the section "IANA 1695 Considerations" for further discussion. 1697 5.4. Classification of Attributes 1699 Each of the Mandatory and Recommended attributes can be classified in 1700 one of three categories: per server, per filesystem, or per 1701 filesystem object. Note that it is possible that some per filesystem 1702 attributes may vary within the filesystem. See the "homogeneous" 1703 attribute for its definition. Note that the attributes 1704 time_access_set and time_modify_set are not listed below because they 1705 are write-only attributes used in a special instance of SETATTR. 
1707 o The per server attribute is: 1709 lease_time 1711 o The per filesystem attributes are: 1713 supp_attr, fh_expire_type, link_support, symlink_support, 1714 unique_handles, aclsupport, cansettime, case_insensitive, 1715 case_preserving, chown_restricted, files_avail, files_free, 1716 files_total, fs_locations, homogeneous, maxfilesize, maxname, 1717 maxread, maxwrite, no_trunc, space_avail, space_free, 1718 space_total, time_delta 1720 o The per filesystem object attributes are: 1722 type, change, size, named_attr, fsid, rdattr_error, filehandle, 1723 ACL, archive, fileid, hidden, maxlink, mimetype, mode, numlinks, 1724 owner, owner_group, rawdev, space_used, system, time_access, 1725 time_backup, time_create, time_metadata, time_modify, 1726 mounted_on_fileid 1728 Draft Specification NFS version 4 Protocol August 2002 1730 For quota_avail_hard, quota_avail_soft, and quota_used see their 1731 definitions below for the appropriate classification. 1733 Draft Specification NFS version 4 Protocol August 2002 1735 5.5. Mandatory Attributes - Definitions 1737 Name # DataType Access Description 1738 ___________________________________________________________________ 1739 supp_attr 0 bitmap READ The bit vector which 1740 would retrieve all 1741 mandatory and 1742 recommended attributes 1743 that are supported for 1744 this object. The 1745 scope of this 1746 attribute applies to 1747 all objects with a 1748 matching fsid. 1750 type 1 nfs4_ftype READ The type of the object 1751 (file, directory, 1752 symlink, etc.) 1754 fh_expire_type 2 uint32 READ Server uses this to 1755 specify filehandle 1756 expiration behavior to 1757 the client. See the 1758 section "Filehandles" 1759 for additional 1760 description. 1762 change 3 uint64 READ A value created by the 1763 server that the client 1764 can use to determine 1765 if file data, 1766 directory contents or 1767 attributes of the 1768 object have been 1769 modified. 
The server 1770 may return the 1771 object's time_metadata 1772 attribute for this 1773 attribute's value but 1774 only if the filesystem 1775 object can not be 1776 updated more 1777 frequently than the 1778 resolution of 1779 time_metadata. 1781 size 4 uint64 R/W 1782 The size of the object 1783 in bytes. 1787 link_support 5 bool READ True, if the object's 1788 filesystem supports 1789 hard links. 1791 symlink_support 6 bool READ True, if the object's 1792 filesystem supports 1793 symbolic links. 1795 named_attr 7 bool READ True, if this object 1796 has named attributes. 1797 In other words, the object 1798 has a non-empty named 1799 attribute directory. 1801 fsid 8 fsid4 READ Unique filesystem 1802 identifier for the 1803 filesystem holding 1804 this object. fsid 1805 contains major and 1806 minor components each 1807 of which is a uint64. 1809 unique_handles 9 bool READ 1810 True, if two distinct 1811 filehandles are guaranteed 1812 to refer to two 1813 different filesystem 1814 objects. 1816 lease_time 10 nfs_lease4 READ Duration of leases at 1817 server in seconds. 1819 rdattr_error 11 enum READ Error returned from 1820 getattr during 1821 readdir. 1823 filehandle 19 nfs_fh4 READ The filehandle of this 1824 object (primarily for 1825 readdir requests). 1829 5.6. Recommended Attributes - Definitions 1831 Name # Data Type Access Description 1832 ______________________________________________________________________ 1833 ACL 12 nfsace4<> R/W The access control 1834 list for the object. 1836 aclsupport 13 uint32 READ Indicates what types 1837 of ACLs are supported 1838 on the current 1839 filesystem. 1841 archive 14 bool R/W True, if this file 1842 has been archived 1843 since the time of 1844 last modification 1845 (deprecated in favor 1846 of time_backup).
1848 cansettime 15 bool READ True, if the server 1849 is able to change the 1850 times for a 1851 filesystem object as 1852 specified in a 1853 SETATTR operation. 1855 case_insensitive 16 bool READ True, if filename 1856 comparisons on this 1857 filesystem are case 1858 insensitive. 1860 case_preserving 17 bool READ True, if filename 1861 case on this 1862 filesystem is preserved. 1864 chown_restricted 18 bool READ If TRUE, the server 1865 will reject any 1866 request to change 1867 either the owner or 1868 the group associated 1869 with a file if the 1870 caller is not a 1871 privileged user (for 1872 example, "root" in 1873 UNIX operating 1874 environments or in 1875 Windows 2000 the 1876 "Take Ownership" 1877 privilege). 1881 fileid 20 uint64 READ A number uniquely 1882 identifying the file 1883 within the 1884 filesystem. 1886 files_avail 21 uint64 READ File slots available 1887 to this user on the 1888 filesystem containing 1889 this object - this 1890 should be the 1891 smallest relevant 1892 limit. 1894 files_free 22 uint64 READ Free file slots on 1895 the filesystem 1896 containing this 1897 object - this should 1898 be the smallest 1899 relevant limit. 1901 files_total 23 uint64 READ Total file slots on 1902 the filesystem 1903 containing this 1904 object. 1906 fs_locations 24 fs_locations READ Locations where this 1907 filesystem may be 1908 found. If the server 1909 returns NFS4ERR_MOVED 1910 as an error, this 1911 attribute MUST be 1912 supported. 1914 hidden 25 bool R/W True, if the file is 1915 considered hidden 1916 with respect to the 1917 Windows API. 1919 homogeneous 26 bool READ True, if this 1920 object's filesystem 1921 is homogeneous, i.e. 1922 per filesystem 1923 attributes are the same 1924 for all of the filesystem's 1925 objects. 1927 maxfilesize 27 uint64 READ Maximum supported 1928 file size for the 1929 filesystem of this 1930 object.
1934 maxlink 28 uint32 READ Maximum number of 1935 links for this 1936 object. 1938 maxname 29 uint32 READ Maximum filename size 1939 supported for this 1940 object. 1942 maxread 30 uint64 READ Maximum read size 1943 supported for this 1944 object. 1946 maxwrite 31 uint64 READ 1947 Maximum write size 1948 supported for this 1949 object. This 1950 attribute SHOULD be 1951 supported if the file 1952 is writable. Lack of 1953 this attribute can 1954 lead to the client 1955 either wasting 1956 bandwidth or not 1957 receiving the best 1958 performance. 1960 mimetype 32 utf8<> R/W MIME body 1961 type/subtype of this 1962 object. 1964 mode 33 mode4 R/W UNIX-style mode and 1965 permission bits for 1966 this object. 1968 no_trunc 34 bool READ True, if, when a name 1969 longer than name_max 1970 is used, an error is 1971 returned and the name is 1972 not truncated. 1974 numlinks 35 uint32 READ Number of hard links 1975 to this object. 1977 owner 36 utf8<> R/W The string name of 1978 the owner of this 1979 object. 1981 owner_group 37 utf8<> R/W The string name of 1982 the group ownership 1983 of this object. 1987 quota_avail_hard 38 uint64 READ For definition see 1988 "Quota Attributes" 1989 section below. 1991 quota_avail_soft 39 uint64 READ For definition see 1992 "Quota Attributes" 1993 section below. 1995 quota_used 40 uint64 READ For definition see 1996 "Quota Attributes" 1997 section below. 1999 rawdev 41 specdata4 READ Raw device 2000 identifier. UNIX 2001 device major/minor 2002 node information. If 2003 the value of type is 2004 not NF4BLK or NF4CHR, 2005 the value returned 2006 SHOULD NOT be 2007 considered useful. 2009 space_avail 42 uint64 READ Disk space in bytes 2010 available to this 2011 user on the 2012 filesystem containing 2013 this object - this 2014 should be the 2015 smallest relevant 2016 limit.
2018 space_free 43 uint64 READ Free disk space in 2019 bytes on the 2020 filesystem containing 2021 this object - this 2022 should be the 2023 smallest relevant 2024 limit. 2026 space_total 44 uint64 READ Total disk space in 2027 bytes on the 2028 filesystem containing 2029 this object. 2031 space_used 45 uint64 READ Number of filesystem 2032 bytes allocated to 2033 this object. 2035 Draft Specification NFS version 4 Protocol August 2002 2037 system 46 bool R/W True, if this file is 2038 a "system" file with 2039 respect to the 2040 Windows API? 2042 time_access 47 nfstime4 READ The time of last 2043 access to the object 2044 by a read that was 2045 satisfied by the 2046 server. 2048 time_access_set 48 settime4 WRITE Set the time of last 2049 access to the object. 2050 SETATTR use only. 2052 time_backup 49 nfstime4 R/W The time of last 2053 backup of the object. 2055 time_create 50 nfstime4 R/W 2056 The time of creation 2057 of the object. This 2058 attribute does not 2059 have any relation to 2060 the traditional UNIX 2061 file attribute 2062 "ctime" or "change 2063 time". 2065 time_delta 51 nfstime4 READ Smallest useful 2066 server time 2067 granularity. 2069 time_metadata 52 nfstime4 R/W The time of last 2070 meta-data 2071 modification of the 2072 object. 2074 time_modify 53 nfstime4 READ The time of last 2075 modification to the 2076 object. 2078 time_modify_set 54 settime4 WRITE Set the time of last 2079 modification to the 2080 object. SETATTR use 2081 only. 2083 mounted_on_fileid 55 uint64 READ Like fileid, but if 2084 the target filehandle 2085 is the root of a 2086 filesystem return the 2087 fileid of the 2088 underlying directory. 2090 Draft Specification NFS version 4 Protocol August 2002 2092 5.7. Time Access 2094 As defined above, the time_access attribute represents the time of 2095 last access to the object by a read that was satisfied by the server. 
2096 The notion of what is an "access" depends on the server's operating 2097 environment and/or the server's filesystem semantics. For example, 2098 for servers obeying POSIX semantics, time_access would be updated 2099 only by the READLINK, READ, and READDIR operations and not any of the 2100 operations that modify the content of the object. Of course, setting 2101 the corresponding time_access_set attribute is another way to modify 2102 the time_access attribute. 2104 Whenever the file object resides on a writable filesystem, the 2105 server should make best efforts to record time_access into stable 2106 storage. However, to mitigate the performance effects of doing so, 2107 and most especially whenever the server is satisfying the read of 2108 the object's content from its cache, the server MAY cache access time 2109 updates and lazily write them to stable storage. It is also 2110 acceptable to give administrators of the server the option to disable 2111 time_access updates. 2113 5.8. Interpreting owner and owner_group 2115 The recommended attributes "owner" and "owner_group" (and also users 2116 and groups within the "acl" attribute) are represented in terms of a 2117 UTF-8 string. To avoid a representation that is tied to a particular 2118 underlying implementation at the client or server, the use of the 2119 UTF-8 string has been chosen. Note that section 6.1 of [RFC2624] 2120 provides additional rationale. It is expected that the client and 2121 server will have their own local representation of owner and 2122 owner_group that is used for local storage or presentation to the end 2123 user. Therefore, it is expected that, when these attributes are 2124 transferred between the client and server, the local 2125 representation is translated to a syntax of the form 2126 "user@dns_domain". This allows a client and server that do 2127 not use the same local representation to translate to a 2128 common syntax that can be interpreted by both.
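As a non-normative illustration of the translation between a local representation and the "user@dns_domain" syntax, the following Python sketch uses a fixed lookup table. The table contents, the domain "example.org", and the function names are hypothetical; the protocol does not specify the translation method, and a real implementation might consult a name service instead. The sketch also treats a string without an "@" as having no translation, which matches the rule given later in this section.

```python
# Hypothetical local identity table; a name service could be used instead.
DOMAIN = "example.org"
UID_TO_NAME = {1001: "alice", 1002: "bob"}
NAME_TO_UID = {name: uid for uid, name in UID_TO_NAME.items()}


def owner_to_wire(uid):
    """Translate a local numeric id to "user@dns_domain", or None if unknown."""
    name = UID_TO_NAME.get(uid)
    if name is None:
        return None
    return f"{name}@{DOMAIN}"


def owner_from_wire(owner):
    """Translate "user@dns_domain" to a local numeric id, or None.

    None means no translation is available: the string has no "@",
    names a domain this host does not serve, or names an unknown user.
    """
    user, sep, domain = owner.rpartition("@")
    if not sep or domain != DOMAIN:
        return None
    return NAME_TO_UID.get(user)
```

Returning None rather than guessing leaves the caller free to choose how to report the failure, for example by displaying the untranslated string to the end user.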
2130 Similarly, security principals may be represented in different ways 2131 by different security mechanisms. Servers normally translate these 2132 representations into a common format, generally that used by local 2133 storage, to serve as a means of identifying the users corresponding 2134 to these security principals. When these local identifiers are 2135 translated to the form of the owner attribute, associated with files 2136 created by such principals they identify, in a common format, the 2137 users associated with each corresponding set of security principals. 2139 The translation used to interpret owner and group strings is not 2140 specified as part of the protocol. This allows various solutions to 2141 be employed. For example, a local translation table may be consulted 2142 that maps between a numeric id to the user@dns_domain syntax. A name 2144 Draft Specification NFS version 4 Protocol August 2002 2146 service may also be used to accomplish the translation. A server may 2147 provide a more general service, not limited by any particular 2148 translation (which would only translate a limited set of possible 2149 strings) by storing the owner and owner_group attributes in local 2150 storage without any translation or it may augment a translation 2151 method by storing the entire string for attributes for which no 2152 translation is available while using the local representation for 2153 those cases in which a translation is available. 2155 Servers that do not provide support for all possible values of the 2156 owner and owner_group attributes, should return an error 2157 (NFS4ERR_BADOWNER) when a string is presented that has no 2158 translation, as the value to be set for a SETATTR of the owner, 2159 owner_group, or acl attributes. 
   When a server does accept an owner or owner_group value as valid on
   a SETATTR (and similarly for the owner and group strings in an acl),
   it is promising to return that same string when a corresponding
   GETATTR is done. Configuration changes and ill-constructed name
   translations (those that contain aliasing) may make that promise
   impossible to honor. Servers should make appropriate efforts to
   avoid a situation in which these attributes have their values
   changed when no real change to ownership has occurred.

   The "dns_domain" portion of the owner string is meant to be a DNS
   domain name, for example, user@ietf.org. Servers should accept as
   valid a set of users for at least one domain. A server may treat
   other domains as having no valid translations. A more general
   service is provided when a server is capable of accepting users for
   multiple domains, or for all domains, subject to security
   constraints.

   In the case where there is no translation available to the client or
   server, the attribute value must be constructed without the "@".
   Therefore, the absence of the @ from the owner or owner_group
   attribute signifies that no translation was available at the sender
   and that the receiver of the attribute should not use that string as
   a basis for translation into its own internal format. Even though
   the attribute value cannot be translated, it may still be useful.
   In the case of a client, the attribute string may be used for local
   display of ownership.

   To provide a greater degree of compatibility with previous versions
   of NFS (i.e. v2 and v3), which identified users and groups by 32-bit
   unsigned uid's and gid's, owner and group strings that consist of
   decimal numeric values with no leading zeros can be given a special
   interpretation by clients and servers which choose to provide such
   support.
   The receiver may treat such a user or group string as
   representing the same user or group as would be represented by a
   v2/v3 uid or gid having the corresponding numeric value. A server is
   not obligated to accept such a string, but may return an
   NFS4ERR_BADOWNER instead. To avoid this mechanism being used to
   subvert user and group translation, so that a client might pass all
   of the owners and groups in numeric form, a server SHOULD return an
   NFS4ERR_BADOWNER error when there is a valid translation for the
   user or group designated in this way. In that case, the client must
   use the appropriate name@domain string and not the special form for
   compatibility.

   The owner string "nobody" may be used to designate an anonymous user,
   which will be associated with a file created by a security principal
   that cannot be mapped through normal means to the owner attribute.

5.9. Character Case Attributes

   With respect to the case_insensitive and case_preserving attributes,
   each UCS-4 character (which UTF-8 encodes) has a "long descriptive
   name" [RFC1345] which may or may not include the word "CAPITAL" or
   "SMALL". The presence of SMALL or CAPITAL allows an NFS server to
   implement unambiguous and efficient table driven mappings for case
   insensitive comparisons, and non-case-preserving storage. For
   general character handling and internationalization issues, see the
   section "Internationalization".

5.10. Quota Attributes

   For the attributes related to filesystem quotas, the following
   definitions apply:

   quota_avail_soft
      The value in bytes which represents the amount of additional
      disk space that can be allocated to this file or directory
      before the user may reasonably be warned.
      It is understood that
      this space may be consumed by allocations to other files or
      directories though there is a rule as to which other files or
      directories.

   quota_avail_hard
      The value in bytes which represents the amount of additional disk
      space beyond the current allocation that can be allocated to
      this file or directory before further allocations will be
      refused. It is understood that this space may be consumed by
      allocations to other files or directories.

   quota_used
      The value in bytes which represents the amount of disk space used
      by this file or directory and possibly a number of other similar
      files or directories, where the set of "similar" meets at least
      the criterion that allocating space to any file or directory in
      the set will reduce the "quota_avail_hard" of every other file
      or directory in the set.

      Note that there may be a number of distinct but overlapping sets
      of files or directories for which a quota_used value is
      maintained, e.g. "all files with a given owner", "all files with
      a given group owner", etc.

      The server is at liberty to choose any of those sets but should
      do so in a repeatable way. The rule may be configured
      per-filesystem or may be "choose the set with the smallest quota".

5.11. Access Control Lists

   The NFS version 4 ACL attribute is an array of access control entries
   (ACE). There are various access control entry types, as defined in
   the Section "ACE type". The server is able to communicate which ACE
   types are supported by returning the appropriate value within the
   aclsupport attribute. Each ACE covers one or more operations on a
   file or directory as described in the Section "ACE Access Mask". It
   may also contain one or more flags that modify the semantics of the
   ACE as defined in the Section "ACE flag".

   The NFS ACE attribute is defined as follows:

      typedef uint32_t        acetype4;
      typedef uint32_t        aceflag4;
      typedef uint32_t        acemask4;

      struct nfsace4 {
              acetype4        type;
              aceflag4        flag;
              acemask4        access_mask;
              utf8string      who;
      };

   To determine if a request succeeds, each nfsace4 entry is processed
   in order by the server. Only ACEs which have a "who" that matches
   the requester are considered. Each ACE is processed until all of the
   bits of the requester's access have been ALLOWED. Once a bit (see
   below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no longer
   considered in the processing of later ACEs. If an ACCESS_DENIED_ACE
   is encountered where the requester's access still has unALLOWED bits
   in common with the "access_mask" of the ACE, the request is denied.
   However, unlike the ALLOWED and DENIED ACE types, the ALARM and AUDIT
   ACE types do not affect a requester's access, and instead are for
   triggering events as a result of a requester's access attempt.
   Therefore, all AUDIT and ALARM ACEs are processed until the end of
   the ACL.

   The NFS version 4 ACL model is quite rich. Some server platforms may
   provide access control functionality that goes beyond the UNIX-style
   mode attribute, but which is not as rich as the NFS ACL model. So
   that users can take advantage of this more limited functionality, the
   server may indicate that it supports ACLs as long as it follows the
   guidelines for mapping between its ACL model and the NFS version 4
   ACL model.

   The situation is complicated by the fact that a server may have
   multiple modules that enforce ACLs. For example, the enforcement for
   NFS version 4 access may be different from the enforcement for local
   access, and both may be different from the enforcement for access
   through other protocols such as SMB.
   So it may be useful for a
   server to accept an ACL even if not all of its modules are able to
   support it.

   The guiding principle in all cases is that the server must not accept
   ACLs that appear to make the file more secure than it really is.

5.11.1. ACE type

   Type      Description
   _____________________________________________________
   ALLOW     Explicitly grants the access defined in
             acemask4 to the file or directory.

   DENY      Explicitly denies the access defined in
             acemask4 to the file or directory.

   AUDIT     LOG (system dependent) any access
             attempt to a file or directory which
             uses any of the access methods specified
             in acemask4.

   ALARM     Generate a system ALARM (system
             dependent) when any access attempt is
             made to a file or directory for the
             access methods specified in acemask4.

   A server need not support all of the above ACE types. The bitmask
   constants used to represent the above definitions within the
   aclsupport attribute are as follows:

      const ACL4_SUPPORT_ALLOW_ACL    = 0x00000001;
      const ACL4_SUPPORT_DENY_ACL     = 0x00000002;
      const ACL4_SUPPORT_AUDIT_ACL    = 0x00000004;
      const ACL4_SUPPORT_ALARM_ACL    = 0x00000008;

   The semantics of the "type" field follow the descriptions provided
   above.

   The constants used for the type field (acetype4) are as follows:

      const ACE4_ACCESS_ALLOWED_ACE_TYPE      = 0x00000000;
      const ACE4_ACCESS_DENIED_ACE_TYPE       = 0x00000001;
      const ACE4_SYSTEM_AUDIT_ACE_TYPE        = 0x00000002;
      const ACE4_SYSTEM_ALARM_ACE_TYPE        = 0x00000003;

   Clients should not attempt to set an ACE unless the server claims
   support for that ACE type. If the server receives a request to set
   an ACE that it cannot store, it must reject the request with
   NFS4ERR_ATTRNOTSUPP.
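
   The ACE processing order described in Section 5.11 and the
   aclsupport check described above can be sketched as follows. This is
   a minimal Python sketch, not a normative algorithm. The Ace tuple,
   the function names, and the treatment of a request as not granted
   when bits remain unALLOWED at the end of the ACL are assumptions
   made for illustration. Matching of the "who" field is reduced to
   string equality, and AUDIT and ALARM ACEs are only collected, not
   acted upon.

```python
from collections import namedtuple

# Constants copied from this specification.
ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000
ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001
ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002
ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003

ACL4_SUPPORT_ALLOW_ACL = 0x00000001
ACL4_SUPPORT_DENY_ACL = 0x00000002
ACL4_SUPPORT_AUDIT_ACL = 0x00000004
ACL4_SUPPORT_ALARM_ACL = 0x00000008

ACE4_READ_DATA = 0x00000001
ACE4_WRITE_DATA = 0x00000002

# Hypothetical in-memory form of nfsace4 (not a wire encoding).
Ace = namedtuple("Ace", ["type", "flag", "access_mask", "who"])

# ACE type -> aclsupport bit that signals support for that type.
SUPPORT_BIT = {
    ACE4_ACCESS_ALLOWED_ACE_TYPE: ACL4_SUPPORT_ALLOW_ACL,
    ACE4_ACCESS_DENIED_ACE_TYPE: ACL4_SUPPORT_DENY_ACL,
    ACE4_SYSTEM_AUDIT_ACE_TYPE: ACL4_SUPPORT_AUDIT_ACL,
    ACE4_SYSTEM_ALARM_ACE_TYPE: ACL4_SUPPORT_ALARM_ACL,
}


def type_supported(aclsupport, ace_type):
    """True if the aclsupport bitmask claims support for ace_type."""
    return bool(aclsupport & SUPPORT_BIT[ace_type])


def evaluate(acl, who, requested):
    """Return (granted, noted) for the requested access bits.

    `noted` lists the matching AUDIT/ALARM ACEs whose mask overlaps
    the request; they never change the access decision.
    """
    remaining = requested  # bits not yet ALLOWED
    denied = False
    noted = []
    for ace in acl:
        if ace.who != who:
            continue  # only ACEs matching the requester are considered
        if ace.type in (ACE4_SYSTEM_AUDIT_ACE_TYPE,
                        ACE4_SYSTEM_ALARM_ACE_TYPE):
            if ace.access_mask & requested:
                noted.append(ace)  # processed until the end of the ACL
        elif denied or remaining == 0:
            continue  # the access decision is already settled
        elif ace.type == ACE4_ACCESS_ALLOWED_ACE_TYPE:
            remaining &= ~ace.access_mask
        elif ace.access_mask & remaining:
            denied = True  # DENY ACE overlapping unALLOWED bits
    # Assumption: bits still unALLOWED at the end mean "not granted".
    return (not denied) and remaining == 0, noted
```

   For instance, with a DENY ACE for WRITE_DATA followed by an ALLOW
   ACE for READ_DATA and WRITE_DATA, a request for READ_DATA alone is
   granted under this sketch, while a request for both bits is denied
   at the first ACE.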

   If the server receives a request to set an ACE that it can store but
   cannot enforce, the server SHOULD reject the request.

   Example: suppose a server can enforce NFS ACLs for NFS access but
   cannot enforce ACLs for local access. If arbitrary processes can run
   on the server, then the server SHOULD NOT indicate ACL support. On
   the other hand, if only trusted administrative programs run locally,
   then the server may indicate ACL support.

5.11.2. ACE Access Mask

   The access_mask field contains values based on the following:

   Access             Description
   _______________________________________________________________
   READ_DATA          Permission to read the data of the file
   LIST_DIRECTORY     Permission to list the contents of a
                      directory
   WRITE_DATA         Permission to modify the file's data
   ADD_FILE           Permission to add a new file to a
                      directory
   APPEND_DATA        Permission to append data to a file
   ADD_SUBDIRECTORY   Permission to create a subdirectory in a
                      directory
   READ_NAMED_ATTRS   Permission to read the named attributes
                      of a file
   WRITE_NAMED_ATTRS  Permission to write the named attributes
                      of a file
   EXECUTE            Permission to execute a file
   DELETE_CHILD       Permission to delete a file or directory
                      within a directory
   READ_ATTRIBUTES    Permission to read basic attributes
                      (non-acls) of a file
   WRITE_ATTRIBUTES   Permission to change basic attributes
                      (non-acls) of a file

   DELETE             Permission to delete the file
   READ_ACL           Permission to read the ACL
   WRITE_ACL          Permission to write the ACL
   WRITE_OWNER        Permission to change the owner
   SYNCHRONIZE        Permission to access file locally at the
                      server with synchronous reads and writes

   The bitmask constants used for the access mask field are as follows:

      const ACE4_READ_DATA            = 0x00000001;
      const ACE4_LIST_DIRECTORY       = 0x00000001;
      const ACE4_WRITE_DATA           = 0x00000002;
      const ACE4_ADD_FILE             = 0x00000002;
      const ACE4_APPEND_DATA          = 0x00000004;
      const ACE4_ADD_SUBDIRECTORY     = 0x00000004;
      const ACE4_READ_NAMED_ATTRS     = 0x00000008;
      const ACE4_WRITE_NAMED_ATTRS    = 0x00000010;
      const ACE4_EXECUTE              = 0x00000020;
      const ACE4_DELETE_CHILD         = 0x00000040;
      const ACE4_READ_ATTRIBUTES      = 0x00000080;
      const ACE4_WRITE_ATTRIBUTES     = 0x00000100;
      const ACE4_DELETE               = 0x00010000;
      const ACE4_READ_ACL             = 0x00020000;
      const ACE4_WRITE_ACL            = 0x00040000;
      const ACE4_WRITE_OWNER          = 0x00080000;
      const ACE4_SYNCHRONIZE          = 0x00100000;

   Server implementations need not provide the granularity of control
   that is implied by this list of masks. For example, POSIX-based
   systems might not distinguish APPEND_DATA (the ability to append to a
   file) from WRITE_DATA (the ability to modify existing contents); both
   masks would be tied to a single "write" permission. When such a
   server returns attributes to the client, it would show both
   APPEND_DATA and WRITE_DATA if and only if the write permission is
   enabled.

   If a server receives a SETATTR request that it cannot accurately
   implement, it should err in the direction of more restricted
   access. For example, suppose a server cannot distinguish overwriting
   data from appending new data, as described in the previous paragraph.
   If a client submits an ACE where APPEND_DATA is set but WRITE_DATA is
   not (or vice versa), the server should reject the request with
   NFS4ERR_ATTRNOTSUPP. Nonetheless, if the ACE has type DENY, the
   server may silently turn on the other bit, so that both APPEND_DATA
   and WRITE_DATA are denied.

5.11.3. ACE flag

   The "flag" field contains values based on the following descriptions.

   ACE4_FILE_INHERIT_ACE

   Can be placed on a directory and indicates that this ACE should be
   added to each new non-directory file created.

   ACE4_DIRECTORY_INHERIT_ACE

   Can be placed on a directory and indicates that this ACE should be
   added to each new directory created.

   ACE4_INHERIT_ONLY_ACE

   Can be placed on a directory but does not apply to the directory,
   only to newly created files/directories as specified by the above two
   flags.

   ACE4_NO_PROPAGATE_INHERIT_ACE

   Can be placed on a directory. Normally when a new directory is
   created and an ACE exists on the parent directory which is marked
   ACE4_DIRECTORY_INHERIT_ACE, two ACEs are placed on the new directory:
   one for the directory itself and one which is an inheritable ACE for
   newly created directories. This flag tells the server to not place
   an ACE on the newly created directory which is inheritable by
   subdirectories of the created directory.

   ACE4_SUCCESSFUL_ACCESS_ACE_FLAG

   ACE4_FAILED_ACCESS_ACE_FLAG

   The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and
   ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits relate only to
   ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE
   (ALARM) ACE types. If, during the processing of the file's ACL, the
   server encounters an AUDIT or ALARM ACE that matches the principal
   attempting the OPEN, the server notes that fact, and the presence,
   if any, of the SUCCESS and FAILED flags encountered in the AUDIT or
   ALARM ACE. Once the server completes the ACL processing, and the
   share reservation processing, and the OPEN call, it then notes if the
   OPEN succeeded or failed. If the OPEN succeeded, and if the SUCCESS
   flag was set for a matching AUDIT or ALARM, then the appropriate
   AUDIT or ALARM event occurs.
   If the OPEN failed, and if the FAILED
   flag was set for the matching AUDIT or ALARM, then the appropriate
   AUDIT or ALARM event occurs. Clearly either or both of the SUCCESS
   or FAILED can be set, but if neither is set, the AUDIT or ALARM ACE
   is not useful.

   The previously described processing applies to the ACCESS operation
   as well. The difference is that "success" or "failure" does not
   depend on whether ACCESS returns NFS4_OK. Success means that ACCESS
   returns all requested and supported bits. Failure means that ACCESS
   failed to return a bit that was requested and supported.

   ACE4_IDENTIFIER_GROUP

   Indicates that the "who" refers to a GROUP as defined under UNIX.

   The bitmask constants used for the flag field are as follows:

      const ACE4_FILE_INHERIT_ACE             = 0x00000001;
      const ACE4_DIRECTORY_INHERIT_ACE        = 0x00000002;
      const ACE4_NO_PROPAGATE_INHERIT_ACE     = 0x00000004;
      const ACE4_INHERIT_ONLY_ACE             = 0x00000008;
      const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG   = 0x00000010;
      const ACE4_FAILED_ACCESS_ACE_FLAG       = 0x00000020;
      const ACE4_IDENTIFIER_GROUP             = 0x00000040;

   A server need not support any of these flags. If the server supports
   flags that are similar to, but not exactly the same as, these flags,
   the implementation may define a mapping between the protocol-defined
   flags and the implementation-defined flags. Again, the guiding
   principle is that the file not appear to be more secure than it
   really is.

   For example, suppose a client tries to set an ACE with
   ACE4_FILE_INHERIT_ACE set but not ACE4_DIRECTORY_INHERIT_ACE. If the
   server does not support any form of ACL inheritance, the server
   should reject the request with NFS4ERR_ATTRNOTSUPP.
   If the server
   supports a single "inherit ACE" flag that applies to both files and
   directories, the server may reject the request (i.e., requiring the
   client to set both the file and directory inheritance flags). The
   server may also accept the request and silently turn on the
   ACE4_DIRECTORY_INHERIT_ACE flag.

5.11.4. ACE who

   There are several special identifiers ("who") which need to be
   understood universally, rather than in the context of a particular
   DNS domain. Some of these identifiers cannot be understood when an
   NFS client accesses the server, but have meaning when a local process
   accesses the file. The ability to display and modify these
   permissions is permitted over NFS, even if none of the access methods
   on the server understands the identifiers.

   Who                    Description
   _______________________________________________________________
   "OWNER"                The owner of the file.
   "GROUP"                The group associated with the file.
   "EVERYONE"             The world.
   "INTERACTIVE"          Accessed from an interactive terminal.
   "NETWORK"              Accessed via the network.
   "DIALUP"               Accessed as a dialup user to the server.
   "BATCH"                Accessed from a batch job.
   "ANONYMOUS"            Accessed without any authentication.
   "AUTHENTICATED"        Any authenticated user (opposite of
                          ANONYMOUS)
   "SERVICE"              Access from a system service.

   To avoid conflict, these special identifiers are distinguished by an
   appended "@" and should appear in the form "xxxx@" (note: no domain
   name after the "@"). For example: ANONYMOUS@.

5.11.5. Mode Attribute

   The NFS version 4 mode attribute is based on the UNIX mode bits.
   The following bits are defined:

      const MODE4_SUID = 0x800;  /* set user id on execution */
      const MODE4_SGID = 0x400;  /* set group id on execution */
      const MODE4_SVTX = 0x200;  /* save text even after use */
      const MODE4_RUSR = 0x100;  /* read permission: owner */
      const MODE4_WUSR = 0x080;  /* write permission: owner */
      const MODE4_XUSR = 0x040;  /* execute permission: owner */
      const MODE4_RGRP = 0x020;  /* read permission: group */
      const MODE4_WGRP = 0x010;  /* write permission: group */
      const MODE4_XGRP = 0x008;  /* execute permission: group */
      const MODE4_ROTH = 0x004;  /* read permission: other */
      const MODE4_WOTH = 0x002;  /* write permission: other */
      const MODE4_XOTH = 0x001;  /* execute permission: other */

   Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply to the principal
   identified in the owner attribute. Bits MODE4_RGRP, MODE4_WGRP, and
   MODE4_XGRP apply to the principals identified in the owner_group
   attribute. Bits MODE4_ROTH, MODE4_WOTH, and MODE4_XOTH apply to any
   principal that does not match that in the owner attribute, and does
   not have a group matching that of the owner_group attribute.

   The remaining bits are not defined by this protocol and MUST NOT be
   used. The minor version mechanism must be used to define further bit
   usage.

   Note that in UNIX, if a file has the MODE4_SGID bit set and no
   MODE4_XGRP bit set, then READ and WRITE must use mandatory file
   locking.

5.11.6. Mode and ACL Attribute

   The server that supports both mode and ACL must take care to
   synchronize the MODE4_*USR, MODE4_*GRP, and MODE4_*OTH bits with the
   ACEs which have respective who fields of "OWNER@", "GROUP@", and
   "EVERYONE@" so that the client can see that semantically equivalent
   access permissions exist whether the client asks for owner,
   owner_group and mode attributes, or for just the ACL.

   Because the mode attribute includes bits (e.g. MODE4_SVTX) that have
   nothing to do with ACL semantics, it is permitted for clients to
   specify both the ACL attribute and mode in the same SETATTR
   operation. However, because there is no prescribed order for
   processing the attributes in a SETATTR, the client must ensure that
   the ACL attribute, if specified without mode, would produce the
   desired mode bits, and conversely, the mode attribute, if specified
   without ACL, would produce the desired "OWNER@", "GROUP@", and
   "EVERYONE@" ACEs.

5.11.7. mounted_on_fileid

   UNIX-based operating environments connect a filesystem into the
   namespace by connecting (mounting) the filesystem onto the existing
   file object (the mount point, usually a directory) of an existing
   filesystem. When the mount point's parent directory is read via an
   API like readdir(), the return results are directory entries, each
   with a component name and a fileid. The fileid of the mount point's
   directory entry will be different from the fileid that the stat()
   system call returns. The stat() system call is returning the fileid
   of the root of the mounted filesystem, whereas readdir() is returning
   the fileid stat() would have returned before any filesystems were
   mounted on the mount point.

   Unlike NFS version 3, NFS version 4 allows a client's LOOKUP request
   to cross other filesystems.
   The client detects the filesystem
   crossing whenever the filehandle argument of LOOKUP has an fsid
   attribute different from that of the filehandle returned by LOOKUP. A
   UNIX-based client will consider this a "mount point crossing". UNIX
   has a legacy scheme for allowing a process to determine its current
   working directory. This relies on readdir() of a mount point's parent
   and stat() of the mount point returning fileids as previously
   described. The mounted_on_fileid attribute corresponds to the fileid
   that readdir() would have returned as described previously.

   While the NFS version 4 client could simply fabricate a fileid
   corresponding to what mounted_on_fileid provides (and if the server
   does not support mounted_on_fileid, the client has no choice), there
   is a risk that the client will generate a fileid that conflicts with
   one that is already assigned to another object in the filesystem.
   Instead, if the server can provide the mounted_on_fileid, the
   potential for client operational problems in this area is eliminated.

   If the server detects that there is no mount point at the target
   file object, then the value for mounted_on_fileid that it returns is
   the same as that of the fileid attribute.

   The mounted_on_fileid attribute is RECOMMENDED, so the server SHOULD
   provide it if possible, and for a UNIX-based server, this is
   straightforward. Usually, mounted_on_fileid will be requested during
   a READDIR operation, in which case it is trivial (at least for
   UNIX-based servers) to return mounted_on_fileid since it is equal to
   the fileid of a directory entry returned by readdir(). If
   mounted_on_fileid is requested in a GETATTR operation, the server
   should obey an invariant that has it returning a value that is equal
   to the file object's entry in the object's parent directory, i.e.
   what readdir() would have returned. Some operating environments
   allow a series of two or more filesystems to be mounted onto a single
   mount point. In this case, for the server to obey the aforementioned
   invariant, it will need to find the base mount point, and not the
   intermediate mount points.

6. Filesystem Migration and Replication

   With the use of the recommended attribute "fs_locations", the NFS
   version 4 server has a method of providing filesystem migration or
   replication services. For the purposes of migration and replication,
   a filesystem will be defined as all files that share a given fsid
   (both major and minor values are the same).

   The fs_locations attribute provides a list of filesystem locations.
   These locations are specified by providing the server name (either
   DNS domain or IP address) and the path name representing the root of
   the filesystem. Depending on the type of service being provided, the
   list will provide a new location or a set of alternate locations for
   the filesystem. The client will use this information to redirect its
   requests to the new server.

6.1. Replication

   It is expected that filesystem replication will be used in the case
   of read-only data. Typically, the filesystem will be replicated on
   two or more servers. The fs_locations attribute will provide the
   list of these locations to the client. On first access of the
   filesystem, the client should obtain the value of the fs_locations
   attribute. If, in the future, the client finds the server
   unresponsive, the client may attempt to use another server specified
   by fs_locations.

   If applicable, the client must take the appropriate steps to recover
   valid filehandles from the new server. This is described in more
   detail in the following sections.

6.2. Migration

   Filesystem migration is used to move a filesystem from one server to
   another. Migration is typically used for a filesystem that is
   writable and has a single copy. The expected use of migration is for
   load balancing or general resource reallocation. The protocol does
   not specify how the filesystem will be moved between servers. This
   server-to-server transfer mechanism is left to the server
   implementor. However, the method used to communicate the migration
   event between client and server is specified here.

   Once the servers participating in the migration have completed the
   move of the filesystem, the error NFS4ERR_MOVED will be returned for
   subsequent requests received by the original server. The
   NFS4ERR_MOVED error is returned for all operations except PUTFH and
   GETATTR. Upon receiving the NFS4ERR_MOVED error, the client will
   obtain the value of the fs_locations attribute. The client will then
   use the contents of the attribute to redirect its requests to the
   specified server. To facilitate the use of GETATTR, operations such
   as PUTFH must also be accepted by the server for the migrated file
   system's filehandles. Note that if the server returns NFS4ERR_MOVED,
   the server MUST support the fs_locations attribute.

   If the client requests more attributes than just fs_locations, the
   server may return fs_locations only. This is to be expected since
   the server has migrated the filesystem and may not have a method of
   obtaining additional attribute data.

   The server implementor needs to be careful in developing a migration
   solution. The server must consider all of the state information
   clients may have outstanding at the server.
   This includes but is not
   limited to locking/share state, delegation state, and asynchronous
   file writes which are represented by WRITE and COMMIT verifiers. The
   server should strive to minimize the impact on its clients during and
   after the migration process.

6.3. Interpretation of the fs_locations Attribute

   The fs_locations attribute is structured in the following way:

      struct fs_location {
              utf8string      server<>;
              pathname4       rootpath;
      };

      struct fs_locations {
              pathname4       fs_root;
              fs_location     locations<>;
      };

   The fs_location struct is used to represent the location of a
   filesystem by providing a server name and the path to the root of the
   filesystem. For a multi-homed server or a set of servers that use
   the same rootpath, an array of server names may be provided. An
   entry in the server array is a UTF-8 string and represents one of a
   traditional DNS host name, IPv4 address, or IPv6 address. It is not
   a requirement that all servers that share the same rootpath be listed
   in one fs_location struct. The array of server names is provided for
   convenience. Servers that share the same rootpath may also be listed
   in separate fs_location entries in the fs_locations attribute.

   The fs_locations struct and attribute then contain an array of
   locations. Since the name space of each server may be constructed
   differently, the "fs_root" field is provided. The path represented
   by fs_root represents the location of the filesystem in the server's
   name space. Therefore, the fs_root path is only associated with the
   server from which the fs_locations attribute was obtained. The
   fs_root path is meant to aid the client in locating the filesystem at
   the various servers listed.

   As an example, there is a replicated filesystem located at two
   servers (servA and servB).
   At servA the filesystem is located at
   path "/a/b/c". At servB the filesystem is located at path "/x/y/z".
   In this example the client accesses the filesystem first at servA
   with a multi-component lookup path of "/a/b/c/d". Since the client
   used a multi-component lookup to obtain the filehandle at "/a/b/c/d",
   it is unaware that the filesystem's root is located in servA's name
   space at "/a/b/c". When the client switches to servB, it will need
   to determine that the directory it first referenced at servA is now
   represented by the path "/x/y/z/d" on servB. To facilitate this, the
   fs_locations attribute provided by servA would have a fs_root value
   of "/a/b/c" and two fs_location entries. One fs_location entry
   will be for itself (servA) and the other will be for servB with a
   path of "/x/y/z". With this information, the client is able to
   substitute "/x/y/z" for the "/a/b/c" at the beginning of its access
   path and construct "/x/y/z/d" to use for the new server.

   See the section "Security Considerations" for a discussion on the
   recommendations for the security flavor to be used by any GETATTR
   operation that requests the "fs_locations" attribute.

6.4. Filehandle Recovery for Migration or Replication

   Filehandles for filesystems that are replicated or migrated generally
   have the same semantics as for filesystems that are not replicated or
   migrated. For example, if a filesystem has persistent filehandles
   and it is migrated to another server, the filehandle values for the
   filesystem will be valid at the new server.

   For volatile filehandles, the servers involved likely do not have a
   mechanism to transfer filehandle format and content between
   themselves. Therefore, a server may have difficulty in determining
   if a volatile filehandle from an old server should return an error of
   NFS4ERR_FHEXPIRED.
Therefore, the client is informed, with the use of the fh_expire_type
attribute, whether volatile filehandles will expire at the migration
or replication event. If the bit FH4_VOL_MIGRATION is set in the
fh_expire_type attribute, the client must treat the volatile
filehandle as if the server had returned the NFS4ERR_FHEXPIRED error.
At the migration or replication event in the presence of the
FH4_VOL_MIGRATION bit, the client will not present the original or
old volatile filehandle to the new server. The client will start its
communication with the new server by recovering its filehandles using
the saved file names.

7. NFS Server Name Space

7.1. Server Exports

On a UNIX server the name space describes all the files reachable by
pathnames under the root directory or "/". On a Windows NT server
the name space constitutes all the files on disks named by mapped
disk letters. NFS server administrators rarely make the entire
server's filesystem name space available to NFS clients. More often
portions of the name space are made available via an "export"
feature. In previous versions of the NFS protocol, the root
filehandle for each export is obtained through the MOUNT protocol;
the client sends a string that identifies the export of name space
and the server returns the root filehandle for it. The MOUNT
protocol supports an EXPORTS procedure that will enumerate the
server's exports.

7.2. Browsing Exports

The NFS version 4 protocol provides a root filehandle that clients
can use to obtain filehandles for these exports via a multi-component
LOOKUP. A common user experience is to use a graphical user
interface (perhaps a file "Open" dialog window) to find a file via
progressive browsing through a directory tree.
The client must be able to move from one export to another export via
single-component, progressive LOOKUP operations.

This style of browsing is not well supported by the NFS version 2 and
3 protocols. The client expects all LOOKUP operations to remain
within a single server filesystem. For example, the device attribute
will not change. This prevents a client from taking name space paths
that span exports.

An automounter on the client can obtain a snapshot of the server's
name space using the EXPORTS procedure of the MOUNT protocol. If it
understands the server's pathname syntax, it can create an image of
the server's name space on the client. The parts of the name space
that are not exported by the server are filled in with a "pseudo
filesystem" that allows the user to browse from one mounted
filesystem to another. There is a drawback to this representation of
the server's name space on the client: it is static. If the server
administrator adds a new export the client will be unaware of it.

7.3. Server Pseudo Filesystem

NFS version 4 servers avoid this name space inconsistency by
presenting all the exports within the framework of a single server
name space. An NFS version 4 client uses LOOKUP and READDIR
operations to browse seamlessly from one export to another. Portions
of the server name space that are not exported are bridged via a
"pseudo filesystem" that provides a view of exported directories
only. A pseudo filesystem has a unique fsid and behaves like a
normal, read only filesystem.

Based on the construction of the server's name space, it is possible
that multiple pseudo filesystems may exist.
For example,

   /a         pseudo filesystem
   /a/b       real filesystem
   /a/b/c     pseudo filesystem
   /a/b/c/d   real filesystem

Each of the pseudo filesystems is considered a separate entity and
therefore will have a unique fsid.

7.4. Multiple Roots

The DOS and Windows operating environments are sometimes described as
having "multiple roots". Filesystems are commonly represented as
disk letters. MacOS represents filesystems as top level names. NFS
version 4 servers for these platforms can construct a pseudo file
system above these root names so that disk letters or volume names
are simply directory names in the pseudo root.

7.5. Filehandle Volatility

The nature of the server's pseudo filesystem is that it is a logical
representation of filesystem(s) available from the server.
Therefore, the pseudo filesystem is most likely constructed
dynamically when the server is first instantiated. It is expected
that the pseudo filesystem may not have an on disk counterpart from
which persistent filehandles could be constructed. Even though it is
preferable that the server provide persistent filehandles for the
pseudo filesystem, the NFS client should expect that pseudo file
system filehandles are volatile. This can be confirmed by checking
the associated "fh_expire_type" attribute for those filehandles in
question. If the filehandles are volatile, the NFS client must be
prepared to recover a filehandle value (e.g. with a multi-component
LOOKUP) when receiving an error of NFS4ERR_FHEXPIRED.

7.6. Exported Root

If the server's root filesystem is exported, one might conclude that
a pseudo-filesystem is not needed. This would be wrong.
Assume the following filesystems on a server:

   /          disk1  (exported)
   /a         disk2  (not exported)
   /a/b       disk3  (exported)

Because disk2 is not exported, disk3 cannot be reached with simple
LOOKUPs. The server must bridge the gap with a pseudo-filesystem.

7.7. Mount Point Crossing

The server filesystem environment may be constructed in such a way
that one filesystem contains a directory which is 'covered' or
mounted upon by a second filesystem. For example:

   /a/b       (filesystem 1)
   /a/b/c/d   (filesystem 2)

The pseudo filesystem for this server may be constructed to look
like:

   /          (place holder/not exported)
   /a/b       (filesystem 1)
   /a/b/c/d   (filesystem 2)

It is the server's responsibility to present the pseudo filesystem
that is complete to the client. If the client sends a lookup request
for the path "/a/b/c/d", the server's response is the filehandle of
the filesystem "/a/b/c/d". In previous versions of the NFS protocol,
the server would respond with the filehandle of directory "/a/b/c/d"
within the filesystem "/a/b".

The NFS client will be able to determine if it crosses a server mount
point by a change in the value of the "fsid" attribute.

7.8. Security Policy and Name Space Presentation

The application of the server's security policy needs to be carefully
considered by the implementor. One may choose to limit the
viewability of portions of the pseudo filesystem based on the
server's perception of the client's ability to authenticate itself
properly. However, with the support of multiple security mechanisms
and the ability to negotiate the appropriate use of these mechanisms,
the server is unable to properly determine if a client will be able
to authenticate itself.
If, based on its policies, the server chooses to limit the contents
of the pseudo filesystem, the server may effectively hide filesystems
from a client that may otherwise have legitimate access.

As suggested practice, the server should apply the security policy of
a shared resource in the server's namespace to the ancestor
components of the namespace. For example:

   /
   /a/b
   /a/b/c

The /a/b/c directory is a real filesystem and is the shared resource.
The security policy for /a/b/c is Kerberos with integrity. The
server should apply the same security policy to /, /a, and /a/b.
This allows for the extension of the protection of the server's
namespace to the ancestors of the real shared resource.

For the case of the use of multiple, disjoint security mechanisms in
the server's resources, the security for a particular object in the
server's namespace should be the union of all security mechanisms of
all direct descendants.

8. File Locking and Share Reservations

Integrating locking into the NFS protocol necessarily causes it to be
stateful. With the inclusion of share reservations the protocol
becomes substantially more dependent on state than the traditional
combination of NFS and NLM [XNFS]. There are three components to
making this state manageable:

   o  Clear division between client and server

   o  Ability to reliably detect inconsistency in state between client
      and server

   o  Simple and robust recovery mechanisms

In this model, the server owns the state information. The client
communicates its view of this state to the server as needed. The
client is also able to detect inconsistent state before modifying a
file.

To support Win32 share reservations it is necessary to atomically
OPEN or CREATE files. Having a separate share/unshare operation
would not allow correct implementation of the Win32 OpenFile API. In
order to correctly implement share semantics, the previous NFS
protocol mechanisms used when a file is opened or created (LOOKUP,
CREATE, ACCESS) need to be replaced. The NFS version 4 protocol has
an OPEN operation that subsumes the NFS version 3 methodology of
LOOKUP, CREATE, and ACCESS. However, because many operations require
a filehandle, the traditional LOOKUP is preserved to map a file name
to filehandle without establishing state on the server. The policy
of granting access or modifying files is managed by the server based
on the client's state. These mechanisms can implement policy ranging
from advisory only locking to full mandatory locking.

8.1. Locking

It is assumed that manipulating a lock is rare when compared to READ
and WRITE operations. It is also assumed that crashes and network
partitions are relatively rare. Therefore it is important that the
READ and WRITE operations have a lightweight mechanism to indicate if
they possess a held lock. A lock request contains the heavyweight
information required to establish a lock and uniquely define the lock
owner.

The following sections describe the transition from the heavyweight
information to the eventual stateid used for most client and server
locking and lease interactions.

8.1.1. Client ID

For each LOCK request, the client must identify itself to the server.
This is done in such a way as to allow for correct lock
identification and crash recovery. A sequence of a SETCLIENTID
operation followed by a SETCLIENTID_CONFIRM operation is required to
establish the identification onto the server.
Establishment of identification by a new incarnation of the client
also has the effect of immediately breaking any leased state that a
previous incarnation of the client might have had on the server, as
opposed to forcing the new client incarnation to wait for the leases
to expire. Breaking the lease state amounts to the server removing
all lock, share reservation, and, where the server is not supporting
the CLAIM_DELEGATE_PREV claim type, all delegation state associated
with the same client with the same identity. For discussion of
delegation state recovery, see the section "Delegation Recovery".

Client identification is encapsulated in the following structure:

   struct nfs_client_id4 {
           verifier4       verifier;
           opaque          id;
   };

The first field, verifier, is a client incarnation verifier that is
used to detect client reboots. Only if the verifier is different from
that which the server has previously recorded for the client (as
identified by the second field of the structure, id) does the server
start the process of cancelling the client's leased state.

The second field, id, is a variable length string that uniquely
defines the client.

There are several considerations for how the client generates the id
string:

   o  The string should be unique so that multiple clients do not
      present the same string. The consequences of two clients
      presenting the same string range from one client getting an
      error to one client having its leased state abruptly and
      unexpectedly cancelled.

   o  The string should be selected so the subsequent incarnations
      (e.g. reboots) of the same client cause the client to present
      the same string.
      The implementor is cautioned against an approach
      that requires the string to be recorded in a local file because
      this precludes the use of the implementation in an environment
      where there is no local disk and all file access is from an NFS
      version 4 server.

   o  The string should be different for each server network address
      that the client accesses, rather than common to all server
      network addresses. The reason is that it may not be possible for
      the client to tell if the same server is listening on multiple
      network addresses. If the client issues SETCLIENTID with the
      same id string to each network address of such a server, the
      server will think it is the same client, and each successive
      SETCLIENTID will cause the server to begin the process of
      removing the client's previous leased state.

   o  The algorithm for generating the string should not assume that
      the client's network address won't change. This includes
      changes between client incarnations and even changes while the
      client is still running in its current incarnation. This
      means that if the client includes just the client's and server's
      network address in the id string, there is a real risk, after
      the client gives up the network address, that another client,
      using a similar algorithm to generate the id string, will
      generate a conflicting id string.

Given the above considerations, an example of a well generated id
string is one that includes:

   o  The server's network address.

   o  The client's network address.

   o  For a user level NFS version 4 client, it should contain
      additional information to distinguish the client from other user
      level clients running on the same host, such as a process id or
      other unique sequence.

   o  Additional information that tends to be unique, such as one or
      more of:

        -  The client machine's serial number (for privacy reasons, it
           is best to perform some one way function on the serial
           number).

        -  A MAC address.

        -  The timestamp of when the NFS version 4 software was first
           installed on the client (though this is subject to the
           previously mentioned caution about using information that
           is stored in a file, because the file might only be
           accessible over NFS version 4).

        -  A true random number. However since this number ought to be
           the same between client incarnations, this shares the same
           problem as using the timestamp of the software
           installation.

As a security measure, the server MUST NOT cancel a client's leased
state if the principal that established the state for a given id
string is not the same as the principal issuing the SETCLIENTID.

Note that SETCLIENTID and SETCLIENTID_CONFIRM have a secondary purpose
of establishing the information the server needs to make callbacks to
the client for the purpose of supporting delegations. It is permitted
to change this information via SETCLIENTID and SETCLIENTID_CONFIRM
within the same incarnation of the client without removing the
client's leased state.

Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has successfully
completed, the client uses the shorthand client identifier, of type
clientid4, instead of the longer and less compact nfs_client_id4
structure. This shorthand client identifier (a clientid) is assigned
by the server and should be chosen so that it will not conflict with
a clientid previously assigned by the server. This applies across
server restarts or reboots.
When a clientid is presented to a server and that clientid is not
recognized, as would happen after a server reboot, the server will
reject the request with the error NFS4ERR_STALE_CLIENTID. When this
happens, the client must obtain a new clientid by use of the
SETCLIENTID operation and then proceed to any other necessary
recovery for the server reboot case (see the section "Server Failure
and Recovery").

The client must also employ the SETCLIENTID operation when it
receives an NFS4ERR_STALE_STATEID error using a stateid derived from
its current clientid, since this also indicates a server reboot which
has invalidated the existing clientid (see the next section
"lock_owner and stateid Definition" for details).

See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM
for a complete specification of the operations.

8.1.2. Server Release of Clientid

If the server determines that the client holds no associated state
for its clientid, the server may choose to release the clientid. The
server may make this choice for an inactive client so that resources
are not consumed by those intermittently active clients. If the
client contacts the server after this release, the server must ensure
the client receives the appropriate error so that it will use the
SETCLIENTID/SETCLIENTID_CONFIRM sequence to establish a new identity.
It should be clear that the server must be very hesitant to release a
clientid since the resulting work on the client to recover from such
an event will be the same burden as if the server had failed and
restarted. Typically a server would not release a clientid unless
there had been no activity from that client for many minutes.
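
Because a released clientid forces the client back through
SETCLIENTID with its id string, it may help to see what a string
following the "Client ID" considerations could look like. The sketch
below is illustrative only: the function name make_id_string, the "/"
separator, the choice of SHA-256 as the one way function, and the
example addresses are all assumptions, not part of the protocol.

```python
import hashlib


def make_id_string(server_addr, client_addr, serial_number, instance=""):
    """Build a hypothetical nfs_client_id4 id string.

    The string differs per server address, is stable across client
    reboots (nothing is read from a local file), and hides the raw
    serial number behind a one way function.
    """
    # One way function over the serial number, for privacy.
    hashed_serial = hashlib.sha256(serial_number.encode("utf-8")).hexdigest()
    parts = [server_addr, client_addr, hashed_serial]
    # A user level client adds something that distinguishes it from
    # other user level clients on the same host (e.g. a process id).
    if instance:
        parts.append(instance)
    return "/".join(parts).encode("utf-8")


# Example addresses are from documentation ranges, not real hosts.
id_a = make_id_string("192.0.2.1", "198.51.100.7", "SN-12345")
id_b = make_id_string("192.0.2.2", "198.51.100.7", "SN-12345")
```

Here id_a and id_b differ because the server address differs, while
repeated calls with the same arguments yield the same string across
client incarnations.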

Note that if the id string in a SETCLIENTID request is properly
constructed, and if the client takes care to use the same principal
for each successive use of SETCLIENTID, then, barring an active
denial of service attack, NFS4ERR_CLID_INUSE should never be
returned.

However, client bugs, server bugs, or perhaps a deliberate change of
the principal owner of the id string (such as the case of a client
that changes security flavors, and under the new flavor, there is no
mapping to the previous owner) will in rare cases result in
NFS4ERR_CLID_INUSE.

In that event, when the server gets a SETCLIENTID for a client id
that currently has no state, or it has state, but the lease has
expired, rather than returning NFS4ERR_CLID_INUSE, the server MUST
allow the SETCLIENTID, and confirm the new clientid if followed by
the appropriate SETCLIENTID_CONFIRM.

8.1.3. lock_owner and stateid Definition

When requesting a lock, the client must present to the server the
clientid and an identifier for the owner of the requested lock.
These two fields are referred to as the lock_owner and the
definitions of those fields are:

   o  A clientid returned by the server as part of the client's use of
      the SETCLIENTID operation.

   o  A variable length opaque array used to uniquely define the owner
      of a lock managed by the client.

      This may be a thread id, process id, or other unique value.

When the server grants the lock, it responds with a unique stateid.
The stateid is used as a shorthand reference to the lock_owner, since
the server will be maintaining the correspondence between them.

The server is free to form the stateid in any manner that it chooses
as long as it is able to recognize invalid and out-of-date stateids.
This requirement includes those stateids generated by earlier
instances of the server. From this, the client can be properly
notified of a server restart. This notification will occur when the
client presents a stateid to the server from a previous
instantiation.

The server must be able to distinguish the following situations and
return the error as specified:

   o  The stateid was generated by an earlier server instance (i.e.
      before a server reboot). The error NFS4ERR_STALE_STATEID should
      be returned.

   o  The stateid was generated by the current server instance but the
      stateid no longer designates the current locking state for the
      lockowner-file pair in question (i.e. one or more locking
      operations have occurred). The error NFS4ERR_OLD_STATEID should
      be returned.

      This error condition will only occur when the client issues a
      locking request which changes a stateid while an I/O request
      that uses that stateid is outstanding.

   o  The stateid was generated by the current server instance but the
      stateid does not designate a locking state for any active
      lockowner-file pair. The error NFS4ERR_BAD_STATEID should be
      returned.

      This error condition will occur when there has been a logic
      error on the part of the client or server. This should not
      happen.

One mechanism that may be used to satisfy these requirements is for
the server to,

   o  divide the "other" field of each stateid into two fields:

        -  A server verifier which uniquely designates a particular
           server instantiation.

        -  An index into a table of locking-state structures.

   o  utilize the "seqid" field of each stateid, such that seqid is
      monotonically incremented for each stateid that is associated
      with the same index into the locking-state table.
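
The mechanism above can be sketched as follows. This is a minimal
illustration, not a definitive implementation: the class and method
names are hypothetical, a stateid is modelled as a (seqid, other)
pair, and the "other" field is assumed to be 12 bytes split into an
8-byte server verifier and a 4-byte table index. Error names are
returned as strings for readability.

```python
import struct

# Assumed split of a 12-byte "other" field: 8-byte verifier, 4-byte index.
OTHER_FORMAT = ">QI"
SEQID_MOD = 2 ** 32


class LockStateTable:
    """Hypothetical per-server-instance table of locking state."""

    def __init__(self, boot_verifier):
        self.boot_verifier = boot_verifier  # unique per server instantiation
        self.current_seqid = {}             # table index -> current seqid
        self.next_index = 0

    def grant(self):
        """Create new locking state and return its first stateid."""
        index = self.next_index
        self.next_index += 1
        self.current_seqid[index] = 1
        return (1, struct.pack(OTHER_FORMAT, self.boot_verifier, index))

    def update(self, stateid):
        """A locking operation changed the state: bump the seqid."""
        _, other = stateid
        _, index = struct.unpack(OTHER_FORMAT, other)
        self.current_seqid[index] = (self.current_seqid[index] + 1) % SEQID_MOD
        return (self.current_seqid[index], other)

    def release(self, stateid):
        """Drop the locking state the stateid refers to."""
        _, index = struct.unpack(OTHER_FORMAT, stateid[1])
        del self.current_seqid[index]

    def check(self, stateid):
        """Classify an incoming stateid."""
        seqid, other = stateid
        verifier, index = struct.unpack(OTHER_FORMAT, other)
        if verifier != self.boot_verifier:
            return "NFS4ERR_STALE_STATEID"  # from an earlier server instance
        current = self.current_seqid.get(index)
        if current is None:
            return "NFS4ERR_BAD_STATEID"    # no active locking state
        if seqid != current:
            # Simplification: any non-current seqid is treated as old;
            # never-issued seqids are not distinguished in this sketch.
            return "NFS4ERR_OLD_STATEID"
        return "NFS4_OK"
```

A server built this way needs only the verifier comparison to detect
stateids from a previous instantiation, without retaining any table
contents across a restart.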

By matching the incoming stateid and its field values with the state
held at the server, the server is able to easily determine if a
stateid is valid for its current instantiation and state. If the
stateid is not valid, the appropriate error can be supplied to the
client.

8.1.4. Use of the stateid and Locking

All READ, WRITE and SETATTR operations contain a stateid. For the
purposes of this section, SETATTR operations which change the size
attribute of a file are treated as if they are writing the area
between the old and new size (i.e. the range truncated or added to
the file by means of the SETATTR), even where SETATTR is not
explicitly mentioned in the text.

If the lock_owner performs a READ or WRITE in a situation in which it
has established a lock or share reservation on the server (any OPEN
constitutes a share reservation) the stateid (previously returned by
the server) must be used to indicate what locks, including both
record locks and share reservations, are held by the lockowner. If
no state is established by the client, either record lock or share
reservation, a stateid of all bits 0 is used. Regardless of whether
a stateid of all bits 0, or a stateid returned by the server is used,
if there is a conflicting share reservation or mandatory record lock
held on the file, the server MUST refuse to service the READ or WRITE
operation.

Share reservations are established by OPEN operations and by their
nature are mandatory in that when the OPEN denies READ or WRITE
operations, that denial results in such operations being rejected
with error NFS4ERR_LOCKED.
Record locks may be implemented by the server as either mandatory or
advisory, or the choice of mandatory or advisory behavior may be
determined by the server on the basis of the file being accessed (for
example, some UNIX-based servers support a "mandatory lock bit" on
the mode attribute such that if set, record locks are required on the
file before I/O is possible). When record locks are advisory, they
only prevent the granting of conflicting lock requests and have no
effect on READ's or WRITE's. Mandatory record locks, however,
prevent conflicting I/O operations. When they are attempted, they
are rejected with NFS4ERR_LOCKED. Assuming an operating environment
like UNIX that requires it, when the client gets NFS4ERR_LOCKED on a
file it knows it has the proper share reservation for, it will need
to issue a LOCK request on the region of the file that includes the
region the I/O was to be performed on, with an appropriate locktype
(i.e. READ*_LT for a READ operation, WRITE*_LT for a WRITE
operation).

With NFS version 3, there was no notion of a stateid so there was no
way to tell if the application process of the client sending the READ
or WRITE operation had also acquired the appropriate record lock on
the file. Thus there was no way to implement mandatory locking. With
the stateid construct, this barrier has been removed.

Note that for UNIX environments that support mandatory file locking,
the distinction between advisory and mandatory locking is subtle. In
fact, advisory and mandatory record locks are exactly the same in so
far as the APIs and requirements on implementation. If the mandatory
lock attribute is set on the file, the server checks to see if the
lockowner has an appropriate shared (read) or exclusive (write)
record lock on the region it wishes to read or write to.
If there is no appropriate lock, the server checks if there is a
conflicting lock (which can be done by attempting to acquire the
conflicting lock on the behalf of the lockowner, and if successful,
release the lock after the READ or WRITE is done), and if there is,
the server returns NFS4ERR_LOCKED.

For Windows environments, there are no advisory record locks, so the
server always checks for record locks during I/O requests.

Thus, the NFS version 4 LOCK operation does not need to distinguish
between advisory and mandatory record locks. It is the NFS version 4
server's processing of the READ and WRITE operations that introduces
the distinction.

Every stateid other than the special stateid values noted in this
section, whether returned by an OPEN-type operation (i.e. OPEN,
OPEN_DOWNGRADE), or by a LOCK-type operation (i.e. LOCK or LOCKU),
defines an access mode for the file (i.e. READ, WRITE, or READ-WRITE)
as established by the original OPEN which began the stateid sequence,
and as modified by subsequent OPEN's and OPEN_DOWNGRADE's within that
stateid sequence. When a READ, WRITE, or SETATTR which specifies the
size attribute, is done, the operation is subject to checking against
the access mode to verify that the operation is appropriate given the
OPEN with which the operation is associated.

In the case of WRITE-type operations (i.e. WRITE's and SETATTR's
which set size), the server must verify that the access mode allows
writing and return an NFS4ERR_OPENMODE error if it does not. In the
case of READ, the server may perform the corresponding check on the
access mode, or it may choose to allow READ on opens for WRITE only,
to accommodate clients whose write implementation may unavoidably do
reads (e.g. due to buffer cache constraints).
However, even if READ's are allowed in these circumstances, the
server MUST still check for locks that conflict with the READ (e.g.
another open specifying denial of READ's). Note that a server which
does enforce the access mode check on READ's need not explicitly
check for conflicting share reservations since the existence of OPEN
for read access guarantees that no conflicting share reservation can
exist.

A stateid of all bits 1 (one) MAY allow READ operations to bypass
locking checks at the server. However, WRITE operations with a
stateid with bits all 1 (one) MUST NOT bypass locking checks and are
treated exactly the same as if a stateid of all bits 0 were used.

A lock may not be granted while a READ or WRITE operation using one
of the special stateids is being performed and the range of the lock
request conflicts with the range of the READ or WRITE operation. For
the purposes of this paragraph, a conflict occurs when a shared lock
is requested and a WRITE operation is being performed, or an
exclusive lock is requested and either a READ or a WRITE operation is
being performed. A SETATTR that sets size is treated similarly to a
WRITE as discussed above.

8.1.5. Sequencing of Lock Requests

Locking is different from most NFS operations as it requires
"at-most-one" semantics that are not provided by ONCRPC. ONCRPC over
a reliable transport is not sufficient because a sequence of locking
requests may span multiple TCP connections. In the face of
retransmission or reordering, lock or unlock requests must have a
well defined and consistent behavior. To accomplish this, each lock
request contains a sequence number that is a consecutively increasing
integer. Different lock_owners have different sequences. The server
maintains the last sequence number (L) received and the response that
was returned.
The first request issued for any given lock_owner is issued with a
sequence number of zero.

Note that for requests that contain a sequence number, for each
lock_owner, there should be no more than one outstanding request.

If a request (r) with a previous sequence number (r < L) is received,
it is rejected with the return of error NFS4ERR_BAD_SEQID. Given a
properly-functioning client, the response to (r) must have been
received before the last request (L) was sent. If a duplicate of the
last request (r == L) is received, the stored response is returned.
If a request beyond the next sequence (r == L + 2) is received, it is
rejected with the return of error NFS4ERR_BAD_SEQID. Sequence
history is reinitialized whenever the SETCLIENTID/SETCLIENTID_CONFIRM
sequence changes the client verifier.

Since the sequence number is represented with an unsigned 32-bit
integer, the arithmetic involved with the sequence number is mod
2^32.

It is critical that the server maintain the last response sent to the
client to provide a more reliable cache of duplicate non-idempotent
requests than that of the traditional cache described in [Juszczak].
The traditional duplicate request cache uses a least recently used
algorithm for removing unneeded requests. However, the last lock
request and response on a given lock_owner must be cached as long as
the lock state exists on the server.

The client MUST monotonically increment the sequence number for the
CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE
operations. This is true even in the event that the previous
operation that used the sequence number received an error.
The only exception to this rule is if the previous operation received one of the following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID, NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID.

8.1.6.  Recovery from Replayed Requests

As described above, the sequence number is per lock_owner. As long as the server maintains the last sequence number received and follows the methods described above, there are no risks of a Byzantine router re-sending old requests. The server need only maintain the (lock_owner, sequence number) state as long as there are open files or closed files with locks outstanding.

LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain a sequence number and therefore the risk of the replay of these operations resulting in undesired effects is non-existent while the server maintains the lock_owner state.

8.1.7.  Releasing lock_owner State

When a particular lock_owner no longer holds open or file locking state at the server, the server may choose to release the sequence number state associated with the lock_owner. The server may make this choice based on lease expiration, for the reclamation of server memory, or other implementation specific details. In any event, the server is able to do this safely only when the lock_owner no longer is being utilized by the client. The server may choose to hold the lock_owner state in the event that retransmitted requests are received. However, the period to hold this state is implementation specific.

In the case that a LOCK, LOCKU, OPEN_DOWNGRADE, or CLOSE is retransmitted after the server has previously released the lock_owner state, the server will find that the lock_owner has no files open and an error will be returned to the client.
If the lock_owner does have a file open, the stateid will not match and again an error is returned to the client.

8.1.8.  Use of Open Confirmation

In the case that an OPEN is retransmitted and the lock_owner is being used for the first time or the lock_owner state has been previously released by the server, the use of the OPEN_CONFIRM operation will prevent incorrect behavior. When the server observes the use of the lock_owner for the first time, it will direct the client to perform the OPEN_CONFIRM for the corresponding OPEN. This sequence establishes the use of a lock_owner and associated sequence number. Since the OPEN_CONFIRM sequence connects a new open_owner on the server with an existing open_owner on a client, the sequence number may have any value. The OPEN_CONFIRM step assures the server that the value received is the correct one. See the section "OPEN_CONFIRM - Confirm Open" for further details.

There are a number of situations in which the requirement to confirm an OPEN would pose difficulties for the client and server, in that they would be prevented from acting in a timely fashion on information received, because that information would be provisional, subject to deletion upon non-confirmation. Fortunately, these are situations in which the server can avoid the need for confirmation when responding to open requests. The two constraints are:

o  The server must not bestow a delegation for any open which would require confirmation.

o  The server MUST NOT require confirmation on a reclaim-type open (i.e. one specifying claim type CLAIM_PREVIOUS or CLAIM_DELEGATE_PREV).

   These constraints are related in that reclaim-type opens are the only ones in which the server may be required to send a delegation. For CLAIM_NULL, sending the delegation is optional while for CLAIM_DELEGATE_CUR, no delegation is sent.
   Delegations being sent with an open requiring confirmation are troublesome because recovering from non-confirmation adds undue complexity to the protocol while requiring confirmation on reclaim-type opens poses difficulties in that the inability to resolve the status of the reclaim until lease expiration may make it difficult to have timely determination of the set of locks being reclaimed (since the grace period may expire).

   Requiring open confirmation on reclaim-type opens is avoidable because of the nature of the environments in which such opens are done. For CLAIM_PREVIOUS opens, this is immediately after server reboot, so there should be no time for lockowners to be created, found to be unused, and recycled. For CLAIM_DELEGATE_PREV opens, we are dealing with a client reboot situation. A server which supports delegation can be sure that no lockowners for that client have been recycled since client initialization and thus can ensure that confirmation will not be required.

8.2.  Lock Ranges

The protocol allows a lock owner to request a lock with a byte range and then either upgrade or unlock a sub-range of the initial lock. It is expected that this will be an uncommon type of request. In any case, servers or server filesystems may not be able to support sub-range lock semantics. In the event that a server receives a locking request that represents a sub-range of current locking state for the lock owner, the server is allowed to return the error NFS4ERR_LOCK_RANGE to signify that it does not support sub-range lock operations. Therefore, the client should be prepared to receive this error and, if appropriate, report the error to the requesting application.
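As an illustration of the sub-range case described above, the following Python sketch shows how a server without sub-range support might classify a request against a range already held by the same lock owner. The function names and the (offset, length) tuple layout are hypothetical and chosen only for this example; NFS4ERR_LOCK_RANGE is the only name taken from the text, and real servers may apply a broader test.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_LOCK_RANGE = "NFS4ERR_LOCK_RANGE"


def is_proper_subrange(held, requested):
    """Return True when `requested` lies inside `held` but is not identical.

    Both arguments are hypothetical (offset, length) pairs in bytes.
    """
    held_start, held_len = held
    req_start, req_len = requested
    inside = (held_start <= req_start
              and req_start + req_len <= held_start + held_len)
    return inside and requested != held


def check_lock_range(held, requested, supports_subrange):
    """Decide the status for a request from the owner that holds `held`."""
    if not supports_subrange and is_proper_subrange(held, requested):
        # The server cannot split or partially upgrade the existing lock.
        return NFS4ERR_LOCK_RANGE
    return NFS4_OK
```

A client that receives NFS4ERR_LOCK_RANGE from such a server would, as the text says, report the error to the application where appropriate.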
The client is discouraged from combining multiple independent locking ranges that happen to be adjacent into a single request since the server may not support sub-range requests and for reasons related to the recovery of file locking state in the event of server failure. As discussed in the section "Server Failure and Recovery" below, the server may employ certain optimizations during recovery that work effectively only when the client's behavior during lock recovery is similar to the client's locking behavior prior to server failure.

8.3.  Upgrading and Downgrading Locks

If a client has a write lock on a record, it can request an atomic downgrade of the lock to a read lock via the LOCK request, by setting the type to READ_LT. If the server supports atomic downgrade, the request will succeed. If not, it will return NFS4ERR_LOCK_NOTSUPP. The client should be prepared to receive this error, and if appropriate, report the error to the requesting application.

If a client has a read lock on a record, it can request an atomic upgrade of the lock to a write lock via the LOCK request by setting the type to WRITE_LT or WRITEW_LT. If the server does not support atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP. If the upgrade can be achieved without an existing conflict, the request will succeed. Otherwise, the server will return either NFS4ERR_DENIED or NFS4ERR_DEADLOCK. The error NFS4ERR_DEADLOCK is returned if the client issued the LOCK request with the type set to WRITEW_LT and the server has detected a deadlock. The client should be prepared to receive such errors and if appropriate, report the error to the requesting application.

8.4.  Blocking Locks

Some clients require the support of blocking locks.
The NFS version 4 protocol must not rely on a callback mechanism and therefore is unable to notify a client when a previously denied lock has been granted. Clients have no choice but to continually poll for the lock. This presents a fairness problem. Two new lock types are added, READW and WRITEW, and are used to indicate to the server that the client is requesting a blocking lock. The server should maintain an ordered list of pending blocking locks. When the conflicting lock is released, the server may wait the lease period for the first waiting client to re-request the lock. After the lease period expires the next waiting client request is allowed the lock. Clients are required to poll at an interval sufficiently small that they are likely to acquire the lock in a timely manner. The server is not required to maintain a list of pending blocked locks as it is used to increase fairness and not correct operation. Because of the unordered nature of crash recovery, storing of lock state to stable storage would be required to guarantee ordered granting of blocking locks.

Servers may also note the lock types and delay returning denial of the request to allow extra time for a conflicting lock to be released, allowing a successful return. In this way, clients can avoid the burden of needlessly frequent polling for blocking locks. The server should take care in the length of delay in the event the client retransmits the request.

8.5.  Lease Renewal

The purpose of a lease is to allow a server to remove stale locks that are held by a client that has crashed or is otherwise unreachable. It is not a mechanism for cache consistency and lease renewals may not be denied if the lease interval has not expired.

The following events cause implicit renewal of all of the leases for a given client (i.e. all those sharing a given clientid).
Each of these is a positive indication that the client is still active and that the associated state held at the server, for the client, is still valid.

o  An OPEN with a valid clientid.

o  Any operation made with a valid stateid (CLOSE, DELEGPURGE, DELEGRETURN, LOCK, LOCKU, OPEN, OPEN_CONFIRM, OPEN_DOWNGRADE, READ, RENEW, SETATTR, WRITE). This does not include the special stateids of all bits 0 or all bits 1.

      Note that if the client had restarted or rebooted, the client would not be making these requests without issuing the SETCLIENTID/SETCLIENTID_CONFIRM sequence. The use of the SETCLIENTID/SETCLIENTID_CONFIRM sequence (one that changes the client verifier) notifies the server to drop the locking state associated with the client. SETCLIENTID/SETCLIENTID_CONFIRM never renews a lease.

      If the server has rebooted, the stateids (NFS4ERR_STALE_STATEID error) or the clientid (NFS4ERR_STALE_CLIENTID error) will not be valid hence preventing spurious renewals.

This approach allows for low overhead lease renewal which scales well. In the typical case no extra RPC calls are required for lease renewal and in the worst case one RPC is required every lease period (i.e. a RENEW operation). The number of locks held by the client is not a factor since all state for the client is involved with the lease renewal action.

Since all operations that create a new lease also renew existing leases, the server must maintain a common lease expiration time for all valid leases for a given client. This lease time can then be easily updated upon implicit lease renewal actions.

8.6.  Crash Recovery

The important requirement in crash recovery is that both the client and the server know when the other has failed.
Additionally, it is required that a client see a consistent view of data across server restarts or reboots. All READ and WRITE operations that may have been queued within the client or network buffers must wait until the client has successfully recovered the locks protecting the READ and WRITE operations.

8.6.1.  Client Failure and Recovery

In the event that a client fails, the server may recover the client's locks when the associated leases have expired. Conflicting locks from another client may only be granted after this lease expiration. If the client is able to restart or reinitialize within the lease period the client may be forced to wait the remainder of the lease period before obtaining new locks.

To minimize client delay upon restart, lock requests are associated with an instance of the client by a client supplied verifier. This verifier is part of the initial SETCLIENTID call made by the client. The server returns a clientid as a result of the SETCLIENTID operation. The client then confirms the use of the clientid with SETCLIENTID_CONFIRM. The clientid in combination with an opaque owner field is then used by the client to identify the lock owner for OPEN. This chain of associations is then used to identify all locks for a particular client.

Since the verifier will be changed by the client upon each initialization, the server can compare a new verifier to the verifier associated with currently held locks and determine that they do not match. This signifies the client's new instantiation and subsequent loss of locking state. As a result, the server is free to release all locks held which are associated with the old clientid which was derived from the old verifier.
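The verifier comparison described above can be sketched as follows. This is a simplified illustration, not the protocol's actual procedure: it collapses the SETCLIENTID/SETCLIENTID_CONFIRM exchange into a single step, and the ClientRecord class, the handle_setclientid function, and the dictionary keyed by a client name are all hypothetical.

```python
class ClientRecord:
    """Hypothetical per-client state kept by the server."""

    def __init__(self, verifier):
        self.verifier = verifier
        self.locks = set()


def handle_setclientid(records, client_name, verifier):
    """Register a client instance and return how many old locks were released.

    `records` maps a client name to its ClientRecord (an assumption of this
    sketch). A changed verifier signals a new client instantiation, so the
    state tied to the old verifier is dropped.
    """
    old = records.get(client_name)
    if old is not None and old.verifier == verifier:
        # Same client instance: existing locking state is kept.
        return 0
    released = len(old.locks) if old is not None else 0
    records[client_name] = ClientRecord(verifier)
    return released
```

Because the stale state is released as soon as the new verifier is seen, a restarted client in this sketch does not have to wait for its old lease to expire before obtaining new locks.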
Note that the verifier must have the same uniqueness properties as the verifier for the COMMIT operation.

8.6.2.  Server Failure and Recovery

If the server loses locking state (usually as a result of a restart or reboot), it must allow clients time to discover this fact and re-establish the lost locking state. The client must be able to re-establish the locking state without having the server deny valid requests because the server has granted conflicting access to another client. Likewise, if there is the possibility that clients have not yet re-established their locking state for a file, the server must disallow READ and WRITE operations for that file. The duration of this recovery period is equal to the duration of the lease period.

A client can determine that server failure (and thus loss of locking state) has occurred, when it receives one of two errors. The NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a reboot or restart. The NFS4ERR_STALE_CLIENTID error indicates a clientid invalidated by reboot or restart. When either of these is received, the client must establish a new clientid (see the section "Client ID") and re-establish the locking state as discussed below.

The period of special handling of locking and READs and WRITEs, equal in duration to the lease period, is referred to as the "grace period". During the grace period, clients recover locks and the associated state by reclaim-type locking requests (i.e. LOCK requests with reclaim set to true and OPEN operations with a claim type of CLAIM_PREVIOUS). During the grace period, the server must reject READ and WRITE operations and non-reclaim locking requests (i.e. other LOCK and OPEN operations) with an error of NFS4ERR_GRACE.
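The strictest reading of the grace period rule above can be sketched in a few lines of Python. The function name, its parameters, and the string status values are assumptions made for this example; the sketch deliberately makes no decision once the grace period is over, since that case is governed by normal processing.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_GRACE = "NFS4ERR_GRACE"

IO_OPS = {"READ", "WRITE"}
LOCKING_OPS = {"LOCK", "OPEN"}


def grace_period_status(op, is_reclaim, seconds_since_restart, grace_seconds):
    """Return the grace-period verdict for one request, or None after grace.

    `is_reclaim` is True for a LOCK with reclaim set or an OPEN with claim
    type CLAIM_PREVIOUS (hypothetical flag for this sketch).
    """
    if seconds_since_restart >= grace_seconds:
        return None  # grace period over; normal processing applies
    if op in IO_OPS:
        return NFS4ERR_GRACE
    if op in LOCKING_OPS and not is_reclaim:
        return NFS4ERR_GRACE
    return NFS4_OK
```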
If the server can reliably determine that granting a non-reclaim request will not conflict with reclamation of locks by other clients, the NFS4ERR_GRACE error does not have to be returned and the non-reclaim client request can be serviced. For the server to be able to service READ and WRITE operations during the grace period, it must again be able to guarantee that no possible conflict could arise between an impending reclaim locking request and the READ or WRITE operation. If the server is unable to offer that guarantee, the NFS4ERR_GRACE error must be returned to the client.

For a server to provide simple, valid handling during the grace period, the easiest method is to simply reject all non-reclaim locking requests and READ and WRITE operations by returning the NFS4ERR_GRACE error. However, a server may keep information about granted locks in stable storage. With this information, the server could determine if a regular lock or READ or WRITE operation can be safely processed.

For example, if a count of locks on a given file is available in stable storage, the server can track reclaimed locks for the file and when all reclaims have been processed, non-reclaim locking requests may be processed. This way the server can ensure that non-reclaim locking requests will not conflict with potential reclaim requests. With respect to I/O requests, if the server is able to determine that there are no outstanding reclaim requests for a file by information from stable storage or another similar mechanism, the processing of I/O requests could proceed normally for the file.
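The lock-count example above might be realized along the following lines. This is only a sketch under stated assumptions: the ReclaimTracker class is hypothetical, the per-file counts are assumed to have been read back from stable storage after restart, and a file absent from those counts is assumed to have had no locks before the failure.

```python
class ReclaimTracker:
    """Track how many pre-restart locks per file are still unreclaimed."""

    def __init__(self, persisted_counts):
        # persisted_counts: {filehandle: number of locks held before restart},
        # assumed to come from stable storage.
        self.remaining = dict(persisted_counts)

    def record_reclaim(self, filehandle):
        """Note one successful reclaim-type lock for the file."""
        if self.remaining.get(filehandle, 0) > 0:
            self.remaining[filehandle] -= 1

    def may_process_non_reclaim(self, filehandle):
        """True once every lock known for the file has been reclaimed."""
        return self.remaining.get(filehandle, 0) == 0
```

While may_process_non_reclaim returns False for a file, the server in this sketch would continue to answer non-reclaim requests for that file with NFS4ERR_GRACE.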
To reiterate, for a server that allows non-reclaim lock and I/O requests to be processed during the grace period, it MUST determine that no lock subsequently reclaimed will be rejected and that no lock subsequently reclaimed would have prevented any I/O operation processed during the grace period.

Clients should be prepared for the return of NFS4ERR_GRACE errors for non-reclaim lock and I/O requests. In this case the client should employ a retry mechanism for the request. A delay (on the order of several seconds) between retries should be used to avoid overwhelming the server. Further discussion of the general issue is included in [Floyd]. The client must account for servers that are able to perform I/O and non-reclaim locking requests within the grace period as well as those that cannot do so.

A reclaim-type locking request outside the server's grace period can only succeed if the server can guarantee that no conflicting lock or I/O request has been granted since reboot or restart.

A server may, upon restart, establish a new value for the lease period. Therefore, clients should, once a new clientid is established, refetch the lease_time attribute and use it as the basis for lease renewal for the lease associated with that server. However, the server must establish, for this restart event, a grace period at least as long as the lease period for the previous server instantiation. This allows the client state obtained during the previous server instance to be reliably re-established.

8.6.3.  Network Partitions and Recovery

If the duration of a network partition is greater than the lease period provided by the server, the server will not have received a lease renewal from the client. If this occurs, the server may free all locks held for the client.
As a result, all stateids held by the client will become invalid or stale. Once the client is able to reach the server after such a network partition, all I/O submitted by the client with the now invalid stateids will fail with the server returning the error NFS4ERR_EXPIRED. Once this error is received, the client will suitably notify the application that held the lock.

As a courtesy to the client or as an optimization, the server may continue to hold locks on behalf of a client whose most recent communication is older than the lease period. If the server receives a lock or I/O request that conflicts with one of these courtesy locks, the server must free the courtesy lock and grant the new request.

If the server continues to hold locks beyond the expiration of a client's lease, the server MUST employ a method of recording this fact in its stable storage. Conflicting lock requests from another client may be serviced after the lease expiration. There are various scenarios involving server failure after such an event that require the storage of these lease expirations or network partitions. One scenario is as follows:

      A client holds a lock at the server and encounters a network partition and is unable to renew the associated lease. A second client obtains a conflicting lock and then frees the lock. After the unlock request by the second client, the server reboots or reinitializes. Once the server recovers, the network partition heals and the original client attempts to reclaim the original lock.

In this scenario and without any state information, the server will allow the reclaim and the client will be in an inconsistent state because the server or the client has no knowledge of the conflicting lock.
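A minimal sketch of the record keeping that would prevent the inconsistency in this scenario follows. It assumes the coarsest possible granularity, one entry per client, and models stable storage as a Python set that survives the simulated reboot; the function names are hypothetical.

```python
def note_conflicting_grant(stable_store, expired_client):
    """Record that a conflicting lock was granted after this client's lease
    expired. `stable_store` stands in for the server's stable storage."""
    stable_store.add(expired_client)


def reclaim_permitted(stable_store, client):
    """After a restart, allow reclaims only for clients with no such record."""
    return client not in stable_store
```

In the scenario above, the server would call note_conflicting_grant for the first client when the second client obtains its lock, and the first client's later reclaim would then be refused.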
The server may choose to store this lease expiration or network partitioning state in a way that will only identify the client as a whole. Note that this may potentially lead to lock reclaims being denied unnecessarily because of a mix of conflicting and non-conflicting locks. The server may also choose to store information about each lock that has an expired lease with an associated conflicting lock. The choice of the amount and type of state information that is stored is left to the implementor. In any case, the server must have enough state information to enable correct recovery from multiple partitions and multiple server failures.

For further discussion of revocation of locks see the section "Server Revocation of Locks".

8.7.  Recovery from a Lock Request Timeout or Abort

In the event a lock request times out, a client may decide to not retry the request. The client may also abort the request when the process for which it was issued is terminated (e.g. in UNIX due to a signal). It is possible though that the server received the request and acted upon it. This would change the state on the server without the client being aware of the change. It is paramount that the client re-synchronize state with the server before it attempts any other operation that takes a seqid and/or a stateid with the same lock_owner. This is straightforward to do without a special re-synchronize operation.

Since the server maintains the last lock request and response received on the lock_owner, the client should cache, for each lock_owner, the last lock request it sent that did not receive a response.
From this, the next time the client does a lock operation for the lock_owner, it can send the cached request, if there is one, and if the request was one that established state (e.g. a LOCK or OPEN operation), the server will return the cached result or, if it never saw the request, perform it. The client can follow up with a request to remove the state (e.g. a LOCKU or CLOSE operation). With this approach, the sequencing and stateid information on the client and server for the given lock_owner will re-synchronize and in turn the lock state will re-synchronize.

8.8.  Server Revocation of Locks

At any point, the server can revoke locks held by a client and the client must be prepared for this event. When the client detects that its locks have been or may have been revoked, the client is responsible for validating the state information between itself and the server. Validating locking state for the client means that it must verify or reclaim state for each lock currently held.

The first instance of lock revocation is upon server reboot or re-initialization. In this instance the client will receive an error (NFS4ERR_STALE_STATEID or NFS4ERR_STALE_CLIENTID) and the client will proceed with normal crash recovery as described in the previous section.

The second lock revocation event is the inability to renew the lease before expiration. While this is considered a rare or unusual event, the client must be prepared to recover. Both the server and client will be able to detect the failure to renew the lease and are capable of recovering without data corruption. For the server, it tracks the last renewal event serviced for the client and knows when the lease will expire. Similarly, the client must track operations which will renew the lease period.
Using the time that each such request was sent and the time that the corresponding reply was received, the client should bound the time that the corresponding renewal could have occurred on the server and thus determine if it is possible that a lease period expiration could have occurred.

The third lock revocation event can occur as a result of administrative intervention within the lease period. While this is considered a rare event, it is possible that the server's administrator has decided to release or revoke a particular lock held by the client. As a result of revocation, the client will receive an error of NFS4ERR_EXPIRED and the error is received within the lease period for the lock. In this instance the client may assume that only the lock_owner's locks have been lost. The client notifies the lock holder appropriately. The client may not assume the lease period has been renewed as a result of a failed operation.

When the client determines the lease period may have expired, the client must mark all locks held for the associated lease as "unvalidated". This means the client has been unable to re-establish or confirm the appropriate lock state with the server. As described in the previous section on crash recovery, there are scenarios in which the server may grant conflicting locks after the lease period has expired for a client. When it is possible that the lease period has expired, the client must validate each lock currently held to ensure that a conflicting lock has not been granted. The client may accomplish this task by issuing an I/O request, either a pending I/O or a zero-length read, specifying the stateid associated with the lock in question.
If the response to the request is success, the client has validated all of the locks governed by that stateid and re-established the appropriate state between itself and the server. If the I/O request is not successful, then one or more of the locks associated with the stateid was revoked by the server and the client must notify the owner.

8.9.  Share Reservations

A share reservation is a mechanism to control access to a file. It is a separate and independent mechanism from record locking. When a client opens a file, it issues an OPEN operation to the server specifying the type of access required (READ, WRITE, or BOTH) and the type of access to deny others (deny NONE, READ, WRITE, or BOTH). If the OPEN fails the client will fail the application's open request.

Pseudo-code definition of the semantics:

      if ((request.access & file_state.deny) ||
          (request.deny & file_state.access))
              return (NFS4ERR_DENIED)

This checking of share reservations on OPEN is done with no exception for an existing OPEN for the same open_owner.

The constants used for the OPEN and OPEN_DOWNGRADE operations for the access and deny fields are as follows:

      const OPEN4_SHARE_ACCESS_READ   = 0x00000001;
      const OPEN4_SHARE_ACCESS_WRITE  = 0x00000002;
      const OPEN4_SHARE_ACCESS_BOTH   = 0x00000003;

      const OPEN4_SHARE_DENY_NONE     = 0x00000000;
      const OPEN4_SHARE_DENY_READ     = 0x00000001;
      const OPEN4_SHARE_DENY_WRITE    = 0x00000002;
      const OPEN4_SHARE_DENY_BOTH     = 0x00000003;

8.10.  OPEN/CLOSE Operations

To provide correct share semantics, a client MUST use the OPEN operation to obtain the initial filehandle and indicate the desired access and what if any access to deny.
Even if the client intends to use a stateid of all 0's or all 1's, it must still obtain the filehandle for the regular file with the OPEN operation so the appropriate share semantics can be applied. For clients that do not have a deny mode built into their open programming interfaces, deny equal to NONE should be used.

The OPEN operation with the CREATE flag also subsumes the CREATE operation for regular files as used in previous versions of the NFS protocol. This allows a create with a share to be done atomically.

The CLOSE operation removes all share reservations held by the lock_owner on that file. If record locks are held, the client SHOULD release all locks before issuing a CLOSE. The server MAY free all outstanding locks on CLOSE but some servers may not support the CLOSE of a file that still has record locks held. The server MUST return failure, NFS4ERR_LOCKS_HELD, if any locks would exist after the CLOSE.

The LOOKUP operation will return a filehandle without establishing any lock state on the server. Without a valid stateid, the server will assume the client has the least access. For example, a file opened with deny READ/WRITE cannot be accessed using a filehandle obtained through LOOKUP because it would not have a valid stateid (i.e. using a stateid of all bits 0 or all bits 1).

8.10.1.  Close and Retention of State Information

Since a CLOSE operation requests deallocation of a stateid, dealing with retransmission of the CLOSE may pose special difficulties, since the state information, which normally would be used to determine the state of the open file being designated, might be deallocated, resulting in an NFS4ERR_BAD_STATEID error.

Servers may deal with this problem in a number of ways.
To provide the greatest degree of assurance that the protocol is being used properly, a server should, rather than deallocate the stateid, mark it as close-pending, and retain the stateid with this status, until later deallocation. In this way, a retransmitted CLOSE can be recognized since the stateid points to state information with this distinctive status, so that it can be handled without error.

When adopting this strategy, a server should retain the state information until the earliest of:

o  Another validly sequenced request for the same lockowner, that is not a retransmission.

o  The time that a lockowner is freed by the server due to a period with no activity.

o  All locks for the client are freed as a result of a SETCLIENTID.

Servers may avoid this complexity, at the cost of less complete protocol error checking, by simply responding NFS4_OK in the event of a CLOSE for a deallocated stateid, on the assumption that this case must be caused by a retransmitted close. When adopting this approach, it is desirable to at least log an error when returning a no-error indication in this situation. If the server maintains a reply-cache mechanism, it can verify the CLOSE is indeed a retransmission and avoid error logging in most cases.

8.11.  Open Upgrade and Downgrade

When an OPEN is done for a file and the lockowner for which the open is being done already has the file open, the result is to upgrade the open file status maintained on the server to include the access and deny bits specified by the new OPEN as well as those for the existing OPEN. The result is that there is one open file, as far as the protocol is concerned, and it includes the union of the access and deny bits for all of the OPEN requests completed. Only a single CLOSE will be done to reset the effects of both OPENs.
Note that the client, when issuing the OPEN, may not know that the
same file is in fact being opened. The above only applies if both
OPEN's result in the OPEN'ed object being designated by the same
filehandle.

When the server chooses to export multiple filehandles corresponding
to the same file object and returns different filehandles on two
different OPEN's of the same file object, the server MUST NOT "OR"
together the access and deny bits and coalesce the two open files.
Instead the server must maintain separate OPEN's with separate
stateid's and will require separate CLOSE's to free them.

When multiple open files on the client are merged into a single open
file object on the server, the close of one of the open files (on the
client) may necessitate change of the access and deny status of the
open file on the server. This is because the union of the access and
deny bits for the remaining open's may be smaller (i.e. a proper
subset) than previously. The OPEN_DOWNGRADE operation is used to
make the necessary change and the client should use it to update the
server so that share reservation requests by other clients are
handled properly.

8.12. Short and Long Leases

When determining the time period for the server lease, the usual
lease tradeoffs apply. Short leases are good for fast server
recovery at a cost of increased RENEW or READ (with zero length)
requests. Longer leases are certainly kinder and gentler to servers
trying to handle very large numbers of clients. The number of RENEW
requests drops in proportion to the lease time.
The disadvantages of long leases are slower recovery after server
failure (the server must wait for leases to expire and for the grace
period to end before granting new lock requests) and increased file
contention (if a client fails to transmit an unlock request then the
server must wait for lease expiration before granting new locks).

Long leases are usable if the server is able to store lease state in
non-volatile memory. Upon recovery, the server can reconstruct the
lease state from its non-volatile memory and continue operation with
its clients and therefore long leases would not be an issue.

8.13. Clocks, Propagation Delay, and Calculating Lease Expiration

To avoid the need for synchronized clocks, lease times are granted by
the server as a time delta. However, there is a requirement that the
client and server clocks do not drift excessively over the duration
of the lock. There is also the issue of propagation delay across the
network which could easily be several hundred milliseconds as well as
the possibility that requests will be lost and need to be
retransmitted.

To take propagation delay into account, the client should subtract it
from lease times (e.g. if the client estimates the one-way
propagation delay as 200 msec, then it can assume that the lease is
already 200 msec old when it gets it). In addition, it will take
another 200 msec for the client's renewal to reach the server. So the
client must send a lock renewal or write data back to the server 400
msec before the lease would expire.

The server's lease period configuration should take into account the
network distance of the clients that will be accessing the server's
resources. It is expected that the lease period will take into
account the network propagation delays and other network delay
factors for the client population.
Since the protocol does not allow for an automatic method to determine
an appropriate lease period, the server's administrator may have to
tune the lease period.

8.14. Migration, Replication and State

When responsibility for handling a given file system is transferred
to a new server (migration) or the client chooses to use an alternate
server (e.g. in response to server unresponsiveness) in the context
of file system replication, the appropriate handling of state shared
between the client and server (i.e. locks, leases, stateid's, and
clientid's) is as described below. The handling differs between
migration and replication. For related discussion of file server
state and recovery of such, see the sections under "File Locking and
Share Reservations".

If a server replica or a server immigrating a filesystem agrees to, or
is expected to, accept opaque values from the client that originated
from another server, then it is a wise implementation practice for
the servers to encode the "opaque" values in network byte order. This
way, servers acting as replicas or immigrating filesystems will be
able to parse values like stateids, directory cookies, filehandles,
etc. even if their native byte order is different from other servers
cooperating in the replication and migration of the filesystem.

8.14.1. Migration and State

In the case of migration, the servers involved in the migration of a
filesystem SHOULD transfer all server state from the original to the
new server. This must be done in a way that is transparent to the
client. This state transfer will ease the client's transition when a
filesystem migration occurs. If the servers are successful in
transferring all state, the client will continue to use stateid's
assigned by the original server. Therefore the new server must
recognize these stateid's as valid.
This holds true for the clientid as well. Since responsibility for an
entire filesystem is transferred with a migration event, there is no
possibility that conflicts will arise on the new server as a result of
the transfer of locks.

As part of the transfer of information between servers, leases would
be transferred as well. The leases being transferred to the new
server will typically have a different expiration time from those for
the same client, previously on the old server. To maintain the
property that all leases on a given server for a given client expire
at the same time, the server should advance the expiration time to
the later of the leases being transferred or the leases already
present. This allows the client to maintain lease renewal of both
classes without special effort.

The servers may choose not to transfer the state information upon
migration. However, this choice is discouraged. In this case, when
the client presents state information from the original server, the
client must be prepared to receive either NFS4ERR_STALE_CLIENTID or
NFS4ERR_STALE_STATEID from the new server. The client should then
recover its state information as it normally would in response to a
server failure. The new server must take care to allow for the
recovery of state information as it would in the event of server
restart.

8.14.2. Replication and State

Since client switch-over in the case of replication is not under
server control, the handling of state is different. In this case,
leases, stateid's and clientid's do not have validity across a
transition from one server to another. The client must re-establish
its locks on the new server. This can be compared to the
re-establishment of locks by means of reclaim-type requests after a
server reboot.
The difference is that the server has no provision to distinguish
requests reclaiming locks from those obtaining new locks or to defer
the latter. Thus, a client re-establishing a lock on the new server
(by means of a LOCK or OPEN request) may have the requests denied due
to a conflicting lock. Since replication is intended for read-only
use of filesystems, such denial of locks should not pose large
difficulties in practice. When an attempt to re-establish a lock on a
new server is denied, the client should treat the situation as if its
original lock had been revoked.

8.14.3. Notification of Migrated Lease

In the case of lease renewal, the client may not be submitting
requests for a filesystem that has been migrated to another server.
This can occur because of the implicit lease renewal mechanism. The
client renews leases for all filesystems when submitting a request to
any one filesystem at the server.

In order for the client to schedule renewal of leases that may have
been relocated to the new server, the client must find out about
lease relocation before those leases expire. To accomplish this, all
operations which implicitly renew leases for a client (i.e. OPEN,
CLOSE, READ, WRITE, RENEW, LOCK, LOCKT, LOCKU) will return the error
NFS4ERR_LEASE_MOVED if responsibility for any of the leases to be
renewed has been transferred to a new server. This condition will
continue until the client receives an NFS4ERR_MOVED error and the
server receives the subsequent GETATTR(fs_locations) for an access to
each filesystem for which a lease has been moved to a new server.

When a client receives an NFS4ERR_LEASE_MOVED error, it should
perform some operation, such as a RENEW, on each filesystem
associated with the server in question.
When the client receives an NFS4ERR_MOVED error, the client can follow
the normal process to obtain the new server information (through the
fs_locations attribute) and perform renewal of those leases on the new
server. If the server has not had state transferred to it
transparently, the client will receive either NFS4ERR_STALE_CLIENTID
or NFS4ERR_STALE_STATEID from the new server, as described above, and
the client can then recover state information as it does in the event
of server failure.

8.14.4. Migration and the Lease_time Attribute

In order that the client may appropriately manage its leases in the
case of migration, the destination server must establish proper
values for the lease_time attribute.

When state is transferred transparently, that state should include
the correct value of the lease_time attribute. The lease_time
attribute on the destination server must never be less than that on
the source since this would result in premature expiration of leases
granted by the source server. Upon migration in which state is
transferred transparently, the client is under no obligation to
re-fetch the lease_time attribute and may continue to use the value
previously fetched (on the source server).

If state has not been transferred transparently (i.e. the client sees
a real or simulated server reboot), the client should fetch the value
of lease_time on the new (i.e. destination) server, and use it for
subsequent locking requests. However the server must respect a grace
period at least as long as the lease_time on the source server, in
order to ensure that clients have ample time to reclaim their locks
before potentially conflicting non-reclaimed locks are granted. The
means by which the new server obtains the value of lease_time on the
old server is left to the server implementations.
It is not specified by the NFS version 4 protocol.

9. Client-Side Caching

Client-side caching of data, of file attributes, and of file names is
essential to providing good performance with the NFS protocol.
Providing distributed cache coherence is a difficult problem and
previous versions of the NFS protocol have not attempted it.
Instead, several NFS client implementation techniques have been used
to reduce the problems that a lack of coherence poses for users.
These techniques have not been clearly defined by earlier protocol
specifications and it is often unclear what is valid or invalid
client behavior.

The NFS version 4 protocol uses many techniques similar to those that
have been used in previous protocol versions. The NFS version 4
protocol does not provide distributed cache coherence. However, it
defines a more limited set of caching guarantees to allow locks and
share reservations to be used without destructive interference from
client side caching.

In addition, the NFS version 4 protocol introduces a delegation
mechanism which allows many decisions normally made by the server to
be made locally by clients. This mechanism provides efficient
support of the common cases where sharing is infrequent or where
sharing is read-only.

9.1. Performance Challenges for Client-Side Caching

Caching techniques used in previous versions of the NFS protocol have
been successful in providing good performance. However, several
scalability challenges can arise when those techniques are used with
very large numbers of clients. This is particularly true when
clients are geographically distributed which classically increases
the latency for cache revalidation requests.

The previous versions of the NFS protocol repeat their file data
cache validation requests at the time the file is opened. This
behavior can have serious performance drawbacks. A common case is
one in which a file is only accessed by a single client. Therefore,
sharing is infrequent.

In this case, repeated reference to the server to find that no
conflicts exist is expensive. A better option with regard to
performance is to allow a client that repeatedly opens a file to do
so without reference to the server. This is done until potentially
conflicting operations from another client actually occur.

A similar situation arises in connection with file locking. Sending
file lock and unlock requests to the server as well as the read and
write requests necessary to make data caching consistent with the
locking semantics (see the section "Data Caching and File Locking")
can severely limit performance. When locking is used to provide
protection against infrequent conflicts, a large penalty is incurred.
This penalty may discourage the use of file locking by applications.

The NFS version 4 protocol provides more aggressive caching
strategies with the following design goals:

o  Compatibility with a large range of server semantics.

o  Provide the same caching benefits as previous versions of the
   NFS protocol when unable to provide the more aggressive model.

o  Requirements for aggressive caching are organized so that a
   large portion of the benefit can be obtained even when not all
   of the requirements can be met.

The appropriate requirements for the server are discussed in later
sections in which specific forms of caching are covered (see the
section "Open Delegation").

9.2. Delegation and Callbacks

Recallable delegation of server responsibilities for a file to a
client improves performance by avoiding repeated requests to the
server in the absence of inter-client conflict. With the use of a
"callback" RPC from server to client, a server recalls delegated
responsibilities when another client engages in sharing of a
delegated file.

A delegation is passed from the server to the client, specifying the
object of the delegation and the type of delegation. There are
different types of delegations but each type contains a stateid to be
used to represent the delegation when performing operations that
depend on the delegation. This stateid is similar to those
associated with locks and share reservations but differs in that the
stateid for a delegation is associated with a clientid and may be
used on behalf of all the open_owners for the given client. A
delegation is made to the client as a whole and not to any specific
process or thread of control within it.

Because callback RPCs may not work in all environments (due to
firewalls, for example), correct protocol operation does not depend
on them. Preliminary testing of callback functionality by means of a
CB_NULL procedure determines whether callbacks can be supported. The
CB_NULL procedure checks the continuity of the callback path. A
server makes a preliminary assessment of callback availability to a
given client and avoids delegating responsibilities until it has
determined that callbacks are supported. Because the granting of a
delegation is always conditional upon the absence of conflicting
access, clients must not assume that a delegation will be granted and
they must always be prepared for OPENs to be processed without any
delegations being granted.
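
The grant policy described above can be sketched as follows. This is
a minimal illustration only and not part of the protocol: the
ClientRecord type, the probe_callback_path and may_grant_delegation
helpers, and the send_cb_null callable (an assumed stand-in for
issuing a real CB_NULL RPC) are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ClientRecord:
    """Hypothetical per-client bookkeeping kept by a server."""
    clientid: int
    callback_verified: bool = False  # True once a CB_NULL probe has succeeded


def probe_callback_path(client: ClientRecord,
                        send_cb_null: Callable[[int], bool]) -> bool:
    """Record whether the callback path to this client currently works.

    send_cb_null is an assumed stand-in for issuing CB_NULL; it returns
    True when the client replies and may raise OSError if the path is down.
    """
    try:
        client.callback_verified = bool(send_cb_null(client.clientid))
    except OSError:
        client.callback_verified = False
    return client.callback_verified


def may_grant_delegation(client: ClientRecord,
                         conflicting_access: bool) -> bool:
    """Delegate only with a verified callback path and no conflicting access."""
    return client.callback_verified and not conflicting_access
```

Whatever this check returns, the OPEN itself is still processed; a
negative result only means that no delegation accompanies it.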

Once granted, a delegation behaves in most ways like a lock. There
is an associated lease that is subject to renewal together with all
of the other leases held by that client.

Unlike locks, an operation by a second client to a delegated file
will cause the server to recall a delegation through a callback.

On recall, the client holding the delegation must flush modified
state (such as modified data) to the server and return the
delegation. The conflicting request will not receive a response
until the recall is complete. The recall is considered complete when
the client returns the delegation or the server times out on the
recall and revokes the delegation as a result of the timeout.
Following the resolution of the recall, the server has the
information necessary to grant or deny the second client's request.

At the time the client receives a delegation recall, it may have
substantial state that needs to be flushed to the server. Therefore,
the server should allow sufficient time for the delegation to be
returned since it may involve numerous RPCs to the server. If the
server is able to determine that the client is diligently flushing
state to the server as a result of the recall, the server may extend
the usual time allowed for a recall. However, the time allowed for
recall completion should not be unbounded.

An example of this is when responsibility to mediate opens on a given
file is delegated to a client (see the section "Open Delegation").
The server will not know what opens are in effect on the client.
Without this knowledge the server will be unable to determine if the
access and deny state for the file allows any particular open until
the delegation for the file has been returned.

A client failure or a network partition can result in failure to
respond to a recall callback.
In this case, the server will revoke the delegation which in turn will
render useless any modified state still on the client.

9.2.1. Delegation Recovery

There are three situations that delegation recovery must deal with:

o  Client reboot or restart

o  Server reboot or restart

o  Network partition (full or callback-only)

In the event the client reboots or restarts, the failure to renew
leases will result in the revocation of record locks and share
reservations. Delegations, however, may be treated a bit
differently.

There will be situations in which delegations will need to be
reestablished after a client reboots or restarts. The reason for
this is the client may have file data stored locally and this data
was associated with the previously held delegations. The client will
need to reestablish the appropriate file state on the server.

To allow for this type of client recovery, the server MAY extend the
period for delegation recovery beyond the typical lease expiration
period. This implies that requests from other clients that conflict
with these delegations will need to wait. Because the normal recall
process may require significant time for the client to flush changed
state to the server, other clients need to be prepared for delays that
occur because of a conflicting delegation. This longer interval
would increase the window for clients to reboot and consult stable
storage so that the delegations can be reclaimed. For open
delegations, such delegations are reclaimed using OPEN with a claim
type of CLAIM_DELEGATE_PREV. (See the sections on "Data Caching and
Revocation" and "Operation 18: OPEN" for discussion of open
delegation and the details of OPEN respectively).
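
As a rough illustration of the client side of this recovery, the
sketch below turns delegation records that a restarted client found
in its stable storage into a list of reclaim requests. The record
layout (an opaque file reference plus a delegation type) and the
plan_delegation_reclaims helper are hypothetical; only the OPEN
operation and the CLAIM_DELEGATE_PREV claim type come from the
protocol text.

```python
from typing import Iterable, List, Tuple

CLAIM_DELEGATE_PREV = "CLAIM_DELEGATE_PREV"  # claim type named by the protocol


def plan_delegation_reclaims(
    saved: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str, str, str]]:
    """Build one reclaim request per saved delegation record.

    Each input record is (file_ref, delegation_type), where file_ref is
    whatever the client persisted to identify the file (an assumption of
    this sketch). Each result is ("OPEN", file_ref, claim_type,
    delegation_type), in the order the records were supplied.
    """
    return [("OPEN", file_ref, CLAIM_DELEGATE_PREV, dtype)
            for file_ref, dtype in saved]
```

A client with nothing recorded in stable storage simply produces an
empty plan and has no delegations to reclaim.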

A server MAY support a claim type of CLAIM_DELEGATE_PREV, but if it
does, it MUST NOT remove delegations upon SETCLIENTID_CONFIRM, and
instead MUST, for a period of time no less than that of the value of
the lease_time attribute, maintain the client's delegations to allow
time for the client to issue CLAIM_DELEGATE_PREV requests. The server
that supports CLAIM_DELEGATE_PREV MUST support the DELEGPURGE
operation.

When the server reboots or restarts, delegations are reclaimed (using
the OPEN operation with CLAIM_PREVIOUS) in a similar fashion to
record locks and share reservations. However, there is a slight
semantic difference. In the normal case if the server decides that a
delegation should not be granted, it performs the requested action
(e.g. OPEN) without granting any delegation. For reclaim, the server
grants the delegation but a special designation is applied so that
the client treats the delegation as having been granted but recalled
by the server. Because of this, the client has the duty to write all
modified state to the server and then return the delegation. This
process of handling delegation reclaim reconciles three principles of
the NFS version 4 protocol:

o  Upon reclaim, a client reporting resources assigned to it by an
   earlier server instance must be granted those resources.

o  The server has unquestionable authority to determine whether
   delegations are to be granted and, once granted, whether they
   are to be continued.

o  The use of callbacks is not to be depended upon until the client
   has proven its ability to receive them.

When a network partition occurs, delegations are subject to freeing
by the server when the lease renewal period expires. This is similar
to the behavior for locks and share reservations.
For delegations, however, the server may extend the period in which
conflicting requests are held off. Eventually the occurrence of a
conflicting request from another client will cause revocation of the
delegation. A loss of the callback path (e.g. by later network
configuration change) will have the same effect. A recall request
will fail and revocation of the delegation will result.

A client normally finds out about revocation of a delegation when it
uses a stateid associated with a delegation and receives the error
NFS4ERR_EXPIRED. It also may find out about delegation revocation
after a client reboot when it attempts to reclaim a delegation and
receives that same error. Note that in the case of a revoked write
open delegation, there are issues because data may have been modified
by the client whose delegation is revoked and separately by other
clients. See the section "Revocation Recovery for Write Open
Delegation" for a discussion of such issues. Note also that when
delegations are revoked, information about the revoked delegation
will be written by the server to stable storage (as described in the
section "Crash Recovery"). This is done to deal with the case in
which a server reboots after revoking a delegation but before the
client holding the revoked delegation is notified about the
revocation.

9.3. Data Caching

When applications share access to a set of files, they need to be
implemented so as to take account of the possibility of conflicting
access by another application. This is true whether the applications
in question execute on different clients or reside on the same
client.

Share reservations and record locks are the facilities the NFS
version 4 protocol provides to allow applications to coordinate
access by providing mutual exclusion facilities.
The NFS version 4 protocol's data caching must be implemented such
that it does not invalidate the assumptions that those using these
facilities depend upon.

9.3.1. Data Caching and OPENs

In order to avoid invalidating the sharing assumptions that
applications rely on, NFS version 4 clients should not provide cached
data to applications or modify it on behalf of an application when it
would not be valid to obtain or modify that same data via a READ or
WRITE operation.

Furthermore, in the absence of open delegation (see the section "Open
Delegation") two additional rules apply. Note that these rules are
obeyed in practice by many NFS version 2 and version 3 clients.

o  First, cached data present on a client must be revalidated after
   doing an OPEN. Revalidating means that the client fetches the
   change attribute from the server, compares it with the cached
   change attribute, and if different, declares the cached data (as
   well as the cached attributes) as invalid. This is to ensure
   that the data for the OPENed file is still correctly reflected
   in the client's cache. This validation must be done at least
   when the client's OPEN operation includes DENY=WRITE or BOTH
   thus terminating a period in which other clients may have had
   the opportunity to open the file with WRITE access. Clients may
   choose to do the revalidation more often (i.e. at OPENs
   specifying DENY=NONE) to parallel the NFS version 3 protocol's
   practice for the benefit of users assuming this degree of cache
   revalidation.

   Since the change attribute is updated for data and metadata
   modifications, some client implementors may be tempted to use
   the time_modify attribute and not change to validate cached
   data, so that metadata changes do not spuriously invalidate
   clean data.
   The implementor is cautioned against this approach. The
   change attribute is guaranteed to change for each update to the
   file, whereas time_modify is guaranteed to change only at the
   granularity of the time_delta attribute. Use by the client's
   data cache validation logic of time_modify and not change runs
   the risk of the client incorrectly marking stale data as valid.

o  Second, modified data must be flushed to the server before
   closing a file OPENed for write. This is complementary to the
   first rule. If the data is not flushed at CLOSE, the
   revalidation done after the client OPENs a file is unable to
   achieve its purpose. The other aspect to flushing the data
   before close is that the data must be committed to stable
   storage, at the server, before the CLOSE operation is requested
   by the client. In the case of a server reboot or restart and a
   CLOSEd file, it may not be possible to retransmit the data to be
   written to the file. Hence, this requirement.

9.3.2. Data Caching and File Locking

For those applications that choose to use file locking instead of
share reservations to exclude inconsistent file access, there is an
analogous set of constraints that apply to client side data caching.
These rules are effective only if the file locking is used in a way
that matches in an equivalent way the actual READ and WRITE
operations executed. This is as opposed to file locking that is
based on pure convention. For example, it is possible to manipulate
a two-megabyte file by dividing the file into two one-megabyte
regions and protecting access to the two regions by file locks on
bytes zero and one. A lock for write on byte zero of the file would
represent the right to do READ and WRITE operations on the first
region.
A lock for write on byte one of the file would represent the right to
do READ and WRITE operations on the second region. As long as all
applications manipulating the file obey this convention, they will
work on a local filesystem. However, they may not work with the NFS
version 4 protocol unless clients refrain from data caching.

The rules for data caching in the file locking environment are:

o  First, when a client obtains a file lock for a particular
   region, the data cache corresponding to that region (if any
   cache data exists) must be revalidated. If the change attribute
   indicates that the file may have been updated since the cached
   data was obtained, the client must flush or invalidate the
   cached data for the newly locked region. A client might choose
   to invalidate all of the non-modified cached data that it has for
   the file but the only requirement for correct operation is to
   invalidate all of the data in the newly locked region.

o  Second, before releasing a write lock for a region, all modified
   data for that region must be flushed to the server. The
   modified data must also be written to stable storage.

Note that flushing data to the server and the invalidation of cached
data must reflect the actual byte ranges locked or unlocked.
Rounding these up or down to reflect client cache block boundaries
will cause problems if not carefully done. For example, writing a
modified block when only half of that block is within an area being
unlocked may cause invalid modification to the region outside the
unlocked area. This, in turn, may be part of a region locked by
another client. Clients can avoid this situation by synchronously
performing portions of write operations that overlap that portion
(initial or final) that is not a full block.
Similarly, invalidating a locked area which is not an integral number
of full buffer blocks would require the client to read one or two
partial blocks from the server if the revalidation procedure shows
that the data which the client possesses may not be valid.

The data that is written to the server as a pre-requisite to the
unlocking of a region must be written, at the server, to stable
storage. The client may accomplish this either with synchronous
writes or by following asynchronous writes with a COMMIT operation.
This is required because retransmission of the modified data after a
server reboot might conflict with a lock held by another client.

A client implementation may choose to accommodate applications which
use record locking in non-standard ways (e.g. using a record lock as
a global semaphore) by flushing to the server more data upon a LOCKU
than is covered by the locked range. This may include modified data
within files other than the one for which the unlocks are being done.
In such cases, the client must not interfere with applications whose
READs and WRITEs are being done only within the bounds of record
locks which the application holds. For example, an application locks
a single byte of a file and proceeds to write that single byte. A
client that chose to handle a LOCKU by flushing all modified data to
the server could validly write that single byte in response to an
unrelated unlock. However, it would not be valid to write the entire
block in which that single written byte was located since it includes
an area that is not locked and might be locked by another client.
   Client implementations can avoid this problem by dividing files with
   modified data into those for which all modifications are done to
   areas covered by an appropriate record lock and those for which there
   are modifications not covered by a record lock. Any writes done for
   the former class of files must not include areas not locked and thus
   not modified on the client.

9.3.3.  Data Caching and Mandatory File Locking

   Client side data caching needs to respect mandatory file locking when
   it is in effect. The presence of mandatory file locking for a given
   file is indicated when the client gets back NFS4ERR_LOCKED from a
   READ or WRITE on a file it has an appropriate share reservation for.
   When mandatory locking is in effect for a file, the client must check
   for an appropriate file lock for data being read or written. If a
   lock exists for the range being read or written, the client may
   satisfy the request using the client's validated cache. If an
   appropriate file lock is not held for the range of the read or write,
   the read or write request must not be satisfied by the client's cache
   and the request must be sent to the server for processing. When a
   read or write request partially overlaps a locked region, the request
   should be subdivided into multiple pieces with each region (locked or
   not) treated appropriately.

9.3.4.  Data Caching and File Identity

   When clients cache data, the file data needs to be organized
   according to the filesystem object to which the data belongs. For
   NFS version 3 clients, the typical practice has been to assume for
   the purpose of caching that distinct filehandles represent distinct
   filesystem objects. The client then has the choice to organize and
   maintain the data cache on this basis.
   In the NFS version 4 protocol, there is now the possibility to have
   significant deviations from a "one filehandle per object" model
   because a filehandle may be constructed on the basis of the object's
   pathname. Therefore, clients need a reliable method to determine if
   two filehandles designate the same filesystem object. If clients
   were simply to assume that all distinct filehandles denote distinct
   objects and proceed to do data caching on this basis, caching
   inconsistencies would arise between the distinct client side objects
   which mapped to the same server side object.

   By providing a method to differentiate filehandles, the NFS version 4
   protocol alleviates a potential functional regression in comparison
   with the NFS version 3 protocol. Without this method, caching
   inconsistencies within the same client could occur and this has not
   been present in previous versions of the NFS protocol. Note that it
   is possible to have such inconsistencies with applications executing
   on multiple clients but that is not the issue being addressed here.

   For the purposes of data caching, the following steps allow an NFS
   version 4 client to determine whether two distinct filehandles denote
   the same server side object:

   o  If GETATTR directed to the two filehandles returns different
      values of the fsid attribute, then the filehandles represent
      distinct objects.

   o  If GETATTR for any file with an fsid that matches the fsid of the
      two filehandles in question returns a unique_handles attribute
      with a value of TRUE, then the two objects are distinct.

   o  If GETATTR directed to the two filehandles does not return the
      fileid attribute for one or both of the handles, then it cannot
      be determined whether the two objects are the same. Therefore,
      operations which depend on that knowledge (e.g. client side data
      caching) cannot be done reliably.

   o  If GETATTR directed to the two filehandles returns different
      values for the fileid attribute, then they are distinct objects.

   o  Otherwise they are the same object.

9.4.  Open Delegation

   When a file is being OPENed, the server may delegate further handling
   of opens and closes for that file to the opening client. Any such
   delegation is recallable, since the circumstances that allowed for
   the delegation are subject to change. In particular, when the server
   receives a conflicting OPEN from another client, the server must
   recall the delegation before deciding whether the OPEN from the other
   client may be granted. Making a delegation is up to the server and
   clients should not assume that any particular OPEN either will or
   will not result in an open delegation. The following is a typical
   set of conditions that servers might use in deciding whether OPEN
   should be delegated:

   o  The client must be able to respond to the server's callback
      requests. The server will use the CB_NULL procedure for a test
      of callback ability.

   o  The client must have responded properly to previous recalls.

   o  There must be no current open conflicting with the requested
      delegation.

   o  There should be no current delegation that conflicts with the
      delegation being requested.

   o  The probability of future conflicting open requests should be
      low based on the recent history of the file.

   o  The existence of any server-specific semantics of OPEN/CLOSE
      that would make the required handling incompatible with the
      prescribed handling that the delegated client would apply (see
      below).

   There are two types of open delegations, read and write.
   A read open delegation allows a client to handle, on its own,
   requests to open a file for reading that do not deny read access to
   others. Multiple read open delegations may be outstanding
   simultaneously and do not conflict. A write open delegation allows
   the client to handle, on its own, all opens. Only one write open
   delegation may exist for a given file at a given time and it is
   inconsistent with any read open delegations.

   When a client has a read open delegation, it may not make any changes
   to the contents or attributes of the file but it is assured that no
   other client may do so. When a client has a write open delegation,
   it may modify the file data since no other client will be accessing
   the file's data. The client holding a write delegation may only
   affect file attributes which are intimately connected with the file
   data: size, time_modify, change.

   When a client has an open delegation, it does not send OPENs or
   CLOSEs to the server but updates the appropriate status internally.
   For a read open delegation, opens that cannot be handled locally
   (opens for write or that deny read access) must be sent to the
   server.

   When an open delegation is made, the response to the OPEN contains an
   open delegation structure which specifies the following:

   o  the type of delegation (read or write)

   o  space limitation information to control flushing of data on
      close (write open delegation only, see the section "Open
      Delegation and Data Caching")

   o  an nfsace4 specifying read and write permissions

   o  a stateid to represent the delegation for READ and WRITE

   The delegation stateid is separate and distinct from the stateid for
   the OPEN proper.
   The standard stateid, unlike the delegation stateid, is associated
   with a particular lock_owner and will continue to be valid after the
   delegation is recalled and the file remains open.

   When a request internal to the client is made to open a file and open
   delegation is in effect, it will be accepted or rejected solely on
   the basis of the following conditions. Any requirement for other
   checks to be made by the delegate should result in open delegation
   being denied so that the checks can be made by the server itself.

   o  The access and deny bits for the request and the file as
      described in the section "Share Reservations".

   o  The read and write permissions as determined below.

   The nfsace4 passed with delegation can be used to avoid frequent
   ACCESS calls. The permission check should be as follows:

   o  If the nfsace4 indicates that the open may be done, then it
      should be granted without reference to the server.

   o  If the nfsace4 indicates that the open may not be done, then an
      ACCESS request must be sent to the server to obtain the
      definitive answer.

   The server may return an nfsace4 that is more restrictive than the
   actual ACL of the file. This includes an nfsace4 that specifies
   denial of all access. Note that some common practices such as
   mapping the traditional user "root" to the user "nobody" may make it
   incorrect to return the actual ACL of the file in the delegation
   response.

   The use of delegation together with various other forms of caching
   creates the possibility that no server authentication will ever be
   performed for a given user since all of the user's requests might be
   satisfied locally. Where the client is depending on the server for
   authentication, the client should be sure authentication occurs for
   each user by use of the ACCESS operation.
   This should be the case even if an ACCESS operation would not be
   required otherwise. As mentioned before, the server may enforce
   frequent authentication by returning an nfsace4 denying all access
   with every open delegation.

9.4.1.  Open Delegation and Data Caching

   OPEN delegation allows much of the message overhead associated with
   the opening and closing of files to be eliminated. An open when an
   open delegation is in effect does not require that a validation
   message be sent to the server. The continued endurance of the "read
   open delegation" provides a guarantee that no OPEN for write and thus
   no write has occurred. Similarly, when closing a file opened for
   write and if write open delegation is in effect, the data written
   does not have to be flushed to the server until the open delegation
   is recalled. The continued endurance of the open delegation provides
   a guarantee that no open and thus no read or write has been done by
   another client.

   For the purposes of open delegation, READs and WRITEs done without an
   OPEN are treated as the functional equivalents of a corresponding
   type of OPEN. This refers to the READs and WRITEs that use the
   special stateids consisting of all zero bits or all one bits.
   Therefore, READs or WRITEs with a special stateid done by another
   client will force the server to recall a write open delegation. A
   WRITE with a special stateid done by another client will force a
   recall of read open delegations.

   With delegations, a client is able to avoid writing data to the
   server when the CLOSE of a file is serviced. The file close system
   call is the usual point at which the client is notified of a lack of
   stable storage for the modified file data generated by the
   application.
   At the close, file data is written to the server and through normal
   accounting the server is able to determine if the available
   filesystem space for the data has been exceeded (i.e. server returns
   NFS4ERR_NOSPC or NFS4ERR_DQUOT). This accounting includes quotas.
   The introduction of delegations requires that an alternative method
   be in place for the same type of communication to occur between
   client and server.

   In the delegation response, the server provides either the limit of
   the size of the file or the number of modified blocks and associated
   block size. The server must ensure that the client will be able to
   flush data to the server of a size equal to that provided in the
   original delegation. The server must make this assurance for all
   outstanding delegations. Therefore, the server must be careful in
   its management of available space for new or modified data taking
   into account available filesystem space and any applicable quotas.
   The server can recall delegations as a result of managing the
   available filesystem space. The client should abide by the server's
   stated space limits for delegations. If the client exceeds the
   stated limits for the delegation, the server's behavior is undefined.

   Based on server conditions, quotas or available filesystem space, the
   server may grant write open delegations with very restrictive space
   limitations. The limitations may be defined in a way that will
   always force modified data to be flushed to the server on close.

   With respect to authentication, flushing modified data to the server
   after a CLOSE has occurred may be problematic. For example, the user
   of the application may have logged off the client and unexpired
   authentication credentials may not be present.
   In this case, the client may need to take special care to ensure that
   local unexpired credentials will in fact be available. This may be
   accomplished by tracking the expiration time of credentials and
   flushing data well in advance of their expiration or by making
   private copies of credentials to assure their availability when
   needed.

9.4.2.  Open Delegation and File Locks

   When a client holds a write open delegation, lock operations are
   performed locally. This includes those required for mandatory file
   locking. This can be done since the delegation implies that there
   can be no conflicting locks. Similarly, all of the revalidations
   that would normally be associated with obtaining locks and the
   flushing of data associated with the releasing of locks need not be
   done.

   When a client holds a read open delegation, lock operations are not
   performed locally. All lock operations, including those requesting
   non-exclusive locks, are sent to the server for resolution.

9.4.3.  Handling of CB_GETATTR

   The server needs to employ special handling for a GETATTR where the
   target is a file that has a write open delegation in effect. The
   reason for this is that the client holding the write delegation may
   have modified the data and the server needs to reflect this change to
   the second client that submitted the GETATTR. Therefore, the client
   holding the write delegation needs to be interrogated. The server
   will use the CB_GETATTR operation. The only attributes that the
   server can reliably query via CB_GETATTR are size and change.

   Since CB_GETATTR is being used to satisfy another client's GETATTR
   request, the server only needs to know if the client holding the
   delegation has a modified version of the file.
   If the client's copy of the delegated file is not modified (data or
   size), the server can satisfy the second client's GETATTR request
   from the attributes stored locally at the server. If the file is
   modified, the server only needs to know about this modified state.
   If the server determines that the file is currently modified, it will
   respond to the second client's GETATTR as if the file had been
   modified locally at the server. This means that the server will take
   the current time and apply it to the construction of attributes like
   change and time_modify.

   Since the form of the change attribute is determined by the server
   and is opaque to the client, the client and server need to agree on a
   method of communicating the modified state of the file. For the size
   attribute, the client will report its current view of the file size.
   For the change attribute, the handling is more involved.

   For the client, the following steps will be taken when receiving a
   write delegation:

   o  The value of the change attribute will be obtained from the
      server and cached. Let this value be represented by c.

   o  The client will create a value greater than c that will be used
      for communicating that modified data is held at the client. Let
      this value be represented by d.

   o  When the client is queried via CB_GETATTR for the change
      attribute, it checks to see if it holds modified data. If the
      file is modified, the value d is returned for the change
      attribute value. If this file is not currently modified, the
      client returns the value c for the change attribute.
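
   The client steps above can be sketched as follows. This is only an
   illustration, not part of the protocol definition: the class and
   method names are hypothetical, the choice d = c + 1 is merely one way
   to obtain a value greater than c, and wraparound of the 64-bit change
   value is ignored.

```python
class WriteDelegationState:
    """Hypothetical client-side bookkeeping for one write delegation."""

    def __init__(self, change_from_server: int, size: int):
        # c: change attribute obtained from the server and cached.
        self.c = change_from_server
        # d: any value greater than c signals "modified data held here".
        # Using c + 1 is an assumption of this sketch.
        self.d = change_from_server + 1
        self.size = size
        self.modified = False

    def local_write(self, offset: int, length: int) -> None:
        # A write satisfied from the local cache marks the file modified
        # and may extend the client's view of the file size.
        self.modified = True
        self.size = max(self.size, offset + length)

    def on_cb_getattr(self) -> dict:
        # Return d if modified data is held, otherwise the cached c.
        change = self.d if self.modified else self.c
        return {"change": change, "size": self.size}
```

   For example, a client that cached c = 100 reports 100 until its
   first local write and a larger value afterwards, together with its
   current view of the file size.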
   While the change attribute is opaque to the client in the sense that
   it has no idea what units of time, if any, the server is counting
   change with, it is not opaque in that the client has to treat it as
   an integer, and the server has to be able to see the results of the
   client's changes to that integer. Therefore, the server MUST encode
   the change attribute in network order when sending it to the client,
   the client MUST decode it from network order to its native order when
   receiving it, and the client MUST encode it in network order when
   sending it to the server. For this reason, change is defined as an
   integer, rather than an opaque array of octets.

   For the server, the following steps will be taken when providing a
   write delegation:

   o  On providing a write delegation, the server will cache a copy of
      the change attribute. Let this value be represented by sc.

   o  The server obtains the change attribute from the client. Let
      this value be cc.

   o  If the value cc is equal to sc, the file is not modified and the
      server returns the current values for change and time_modify
      (for example) to the client requesting GETATTR.

   o  If the value cc is NOT equal to sc, the file is currently
      modified at the client and most likely will be modified at the
      server at a future time. The server then uses the current time
      to construct attribute values for change and time_modify and
      returns those values to the requestor.

   o  In the case that the file attribute size is different than the
      server's current value, the server treats this as a modification
      regardless of the value of the change attribute retrieved via
      CB_GETATTR and responds to the second client as in the last
      step.
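
   The server steps above can be sketched in the same way. The names
   Attrs and answer_getattr are hypothetical. The sketch assumes that
   the server derives change from a nanosecond clock and that it
   reports the delegate's view of the file size when the file is
   modified; the text above does not mandate either choice.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attrs:
    change: int
    size: int
    time_modify_ns: int


def answer_getattr(stored: Attrs, sc: int, cc: int, client_size: int,
                   now_ns: Optional[int] = None) -> Attrs:
    """Build the GETATTR reply for a second client.

    stored      -- attributes held locally at the server
    sc          -- change value the server cached when delegating
    cc          -- change value returned by the delegate via CB_GETATTR
    client_size -- size returned by the delegate via CB_GETATTR
    """
    if now_ns is None:
        now_ns = time.time_ns()
    # A differing size counts as a modification regardless of cc.
    modified = (cc != sc) or (client_size != stored.size)
    if not modified:
        return stored
    # Assumption: keep change moving forward even if the clock lags.
    new_change = max(now_ns, stored.change + 1)
    return Attrs(change=new_change, size=client_size, time_modify_ns=now_ns)
```

   Because the server only compares cc with sc for equality, it does
   not need to know how the client chose its modified value.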
   This methodology resolves issues of clock differences between client
   and server and other scenarios where the use of CB_GETATTR breaks
   down.

9.4.4.  Recall of Open Delegation

   The following events necessitate recall of an open delegation:

   o  Potentially conflicting OPEN request (or READ/WRITE done with
      "special" stateid)

   o  SETATTR issued by another client

   o  REMOVE request for the file

   o  RENAME request for the file as either source or target of the
      RENAME

   Whether a RENAME of a directory in the path leading to the file
   results in recall of an open delegation depends on the semantics of
   the server filesystem. If that filesystem denies such RENAMEs when a
   file is open, the recall must be performed to determine whether the
   file in question is, in fact, open.

   In addition to the situations above, the server may choose to recall
   open delegations at any time if resource constraints make it
   advisable to do so. Clients should always be prepared for the
   possibility of recall.

   When a client receives a recall for an open delegation, it needs to
   update state on the server before returning the delegation. These
   same updates must be done whenever a client chooses to return a
   delegation voluntarily. The following items of state need to be
   dealt with:

   o  If the file associated with the delegation is no longer open and
      no previous CLOSE operation has been sent to the server, a CLOSE
      operation must be sent to the server.

   o  If a file has other open references at the client, then OPEN
      operations must be sent to the server. The appropriate stateids
      will be provided by the server for subsequent use by the client
      since the delegation stateid will no longer be valid. These
      OPEN requests are done with the claim type of
      CLAIM_DELEGATE_CUR.
      This will allow the presentation of the delegation stateid so
      that the client can establish the appropriate rights to perform
      the OPEN. (See the section "Operation 18: OPEN" for details.)

   o  If there are granted file locks, the corresponding LOCK
      operations need to be performed. This applies to the write open
      delegation case only.

   o  For a write open delegation, if at the time of recall the file
      is not open for write, all modified data for the file must be
      flushed to the server. If the delegation had not existed, the
      client would have done this data flush before the CLOSE
      operation.

   o  For a write open delegation when a file is still open at the
      time of recall, any modified data for the file needs to be
      flushed to the server.

   o  With the write open delegation in place, it is possible that the
      file was truncated during the duration of the delegation. For
      example, the truncation could have occurred as a result of an
      OPEN UNCHECKED with a size attribute value of zero. Therefore,
      if a truncation of the file has occurred and this operation has
      not been propagated to the server, the truncation must occur
      before any modified data is written to the server.

   In the case of write open delegation, file locking imposes some
   additional requirements. To precisely maintain the associated
   invariant, it is required to flush any modified data in any region
   for which a write lock was released while the write delegation was in
   effect. However, because the write open delegation implies no other
   locking by other clients, a simpler implementation is to flush all
   modified data for the file (as described just above) if any write
   lock has been released while the write open delegation was in effect.
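
   The state updates listed above can be sketched as a planning
   function that emits the steps a client performs before returning a
   delegation. The record type, the function name, and the step labels
   are hypothetical. The only orderings taken from the text are that a
   pending truncation precedes any flush of modified data and that
   every step precedes the return of the delegation (shown here as
   DELEGRETURN); the relative order of the remaining steps is a choice
   of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DelegatedFile:
    """Hypothetical client-side record for a file with an open delegation."""
    write_delegation: bool
    local_opens: int = 0              # opens handled locally, unknown to server
    close_already_sent: bool = False
    truncated_locally: bool = False   # truncation not yet propagated
    has_modified_data: bool = False
    granted_locks: List[Tuple[int, int]] = field(default_factory=list)


def plan_delegation_return(f: DelegatedFile) -> List[str]:
    steps: List[str] = []
    if f.write_delegation:
        # Truncation must reach the server before any modified data.
        if f.truncated_locally:
            steps.append("TRUNCATE")
        if f.has_modified_data:
            steps.append("FLUSH")
    # Re-establish each local open using the delegation stateid.
    for _ in range(f.local_opens):
        steps.append("OPEN(CLAIM_DELEGATE_CUR)")
    # Locks granted locally exist only under a write delegation.
    if f.write_delegation:
        for offset, length in f.granted_locks:
            steps.append(f"LOCK({offset},{length})")
    # File no longer open and no CLOSE sent yet: send one now.
    if f.local_opens == 0 and not f.close_already_sent:
        steps.append("CLOSE")
    steps.append("DELEGRETURN")
    return steps
```

   A client that follows the simpler locking strategy described above
   would set has_modified_data whenever a write lock was released
   during the delegation, so that the whole file is flushed.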
   An implementation need not wait until delegation recall (or deciding
   to voluntarily return a delegation) to perform any of the above
   actions, if implementation considerations (e.g. resource availability
   constraints) make that desirable. Generally, however, the fact that
   the actual open state of the file may continue to change makes it not
   worthwhile to send information about opens and closes to the server,
   except as part of delegation return. Only in the case of closing the
   open that resulted in obtaining the delegation would clients be
   likely to do this early, since, in that case, the close once done
   will not be undone. Regardless of the client's choices on scheduling
   these actions, all must be performed before the delegation is
   returned, including (when applicable) the close that corresponds to
   the open that resulted in the delegation. These actions can be
   performed either in previous requests or in previous operations in
   the same COMPOUND request.

9.4.5.  Delegation Revocation

   At the point a delegation is revoked, if there are associated opens
   on the client, the applications holding these opens need to be
   notified. This notification usually occurs by returning errors for
   READ/WRITE operations or when a close is attempted for the open file.

   If no opens exist for the file at the point the delegation is
   revoked, then notification of the revocation is unnecessary.
   However, if there is modified data present at the client for the
   file, the user of the application should be notified. Unfortunately,
   it may not be possible to notify the user since active applications
   may not be present at the client. See the section "Revocation
   Recovery for Write Open Delegation" for additional details.

9.5.  Data Caching and Revocation

   When locks and delegations are revoked, the assumptions upon which
   successful caching depend are no longer guaranteed. For any locks or
   share reservations that have been revoked, the corresponding owner
   needs to be notified. This notification includes applications with a
   file open that has a corresponding delegation which has been revoked.
   Cached data associated with the revocation must be removed from the
   client. In the case of modified data existing in the client's cache,
   that data must be removed from the client without it being written to
   the server. As mentioned, the assumptions made by the client are no
   longer valid at the point when a lock or delegation has been revoked.
   For example, another client may have been granted a conflicting lock
   after the revocation of the lock at the first client. Therefore, the
   data within the lock range may have been modified by the other
   client. Obviously, the first client is unable to guarantee to the
   application what has occurred to the file in the case of revocation.

   Notification to a lock owner will in many cases consist of simply
   returning an error on the next and all subsequent READs/WRITEs to the
   open file or on the close. Where the methods available to a client
   make such notification impossible because errors for certain
   operations may not be returned, more drastic action such as signals
   or process termination may be appropriate. The justification for
   this is that an invariant on which an application depends may be
   violated. Depending on how errors are typically treated for the
   client operating environment, further levels of notification
   including logging, console messages, and GUI pop-ups may be
   appropriate.

9.5.1.  Revocation Recovery for Write Open Delegation

   Revocation recovery for a write open delegation poses the special
   issue of modified data in the client cache while the file is not
   open. In this situation, any client which does not flush modified
   data to the server on each close must ensure that the user receives
   appropriate notification of the failure as a result of the
   revocation. Since such situations may require human action to
   correct problems, notification schemes in which the appropriate user
   or administrator is notified may be necessary. Logging and console
   messages are typical examples.

   If there is modified data on the client, it must not be flushed
   normally to the server. A client may attempt to provide a copy of
   the file data as modified during the delegation under a different
   name in the filesystem name space to ease recovery. Note that when
   the client can determine that the file has not been modified by any
   other client, or when the client has a complete cached copy of the
   file in question, such a saved copy of the client's view of the file
   may be of particular value for recovery. In other cases, recovery
   using a copy of the file based partially on the client's cached data
   and partially on the server copy as modified by other clients will be
   anything but straightforward, so clients may avoid saving file
   contents in these situations or mark the results specially to warn
   users of possible problems.

   Saving of such modified data in delegation revocation situations may
   be limited to files of a certain size or might be used only when
   sufficient disk space is available within the target filesystem.
   Such saving may also be restricted to situations when the client has
   sufficient buffering resources to keep the cached copy available
   until it is properly stored to the target filesystem.

9.6.  Attribute Caching

   The attributes discussed in this section do not include named
   attributes. Individual named attributes are analogous to files and
   caching of the data for these needs to be handled just as data
   caching is for ordinary files. Similarly, LOOKUP results from an
   OPENATTR directory are to be cached on the same basis as any other
   pathnames and similarly for directory contents.

   Clients may cache file attributes obtained from the server and use
   them to avoid subsequent GETATTR requests. Such caching is write
   through in that modification to file attributes is always done by
   means of requests to the server and should not be done locally and
   cached. The exceptions to this are modifications to attributes that
   are intimately connected with data caching. Therefore, extending a
   file by writing data to the local data cache is reflected immediately
   in the size as seen on the client without this change being
   immediately reflected on the server. Normally such changes are not
   propagated directly to the server but when the modified data is
   flushed to the server, analogous attribute changes are made on the
   server. When open delegation is in effect, the modified attributes
   may be returned to the server in the response to a CB_RECALL call.

   The result of local caching of attributes is that the attribute
   caches maintained on individual clients will not be coherent. Changes
   made in one order on the server may be seen in a different order on
   one client and in a third order on a different client.
   The typical filesystem application programming interfaces do not
   provide means to atomically modify or interrogate attributes for
   multiple files at the same time. The following rules provide an
   environment where the potential incoherences mentioned above can be
   reasonably managed. These rules are derived from the practice of
   previous NFS protocols.

   o  All attributes for a given file (per-fsid attributes excepted)
      are cached as a unit at the client so that no
      non-serializability can arise within the context of a single
      file.

   o  An upper time boundary is maintained on how long a client cache
      entry can be kept without being refreshed from the server.

   o  When operations are performed that change attributes at the
      server, the updated attribute set is requested as part of the
      containing RPC. This includes directory operations that update
      attributes indirectly. This is accomplished by following the
      modifying operation with a GETATTR operation and then using the
      results of the GETATTR to update the client's cached attributes.

   Note that if the full set of attributes to be cached is requested by
   READDIR, the results can be cached by the client on the same basis as
   attributes obtained via GETATTR.

   A client may validate its cached version of attributes for a file by
   fetching both the change and time_access attributes and assuming
   that if the change attribute has the same value as it did when the
   attributes were cached, then no attributes other than time_access
   have changed. The time_access attribute is also fetched because
   many servers operate in environments where the operation that updates
   change does not update time_access. For example, POSIX file
   semantics do not update access time when a file is modified by the
   write system call.
Therefore, the client that wants a current
time_access value should fetch it with change during the attribute
cache validation processing and update its cached time_access.

The client may maintain a cache of modified attributes for those
attributes intimately connected with data of modified regular files
(size, time_modify, and change). Other than those three attributes,
the client MUST NOT maintain a cache of modified attributes. Instead,
attribute changes are immediately sent to the server.

In some operating environments, the equivalent to time_access is
expected to be implicitly updated by each read of the content of the
file object. If an NFS client is caching the content of a file
object, whether it is a regular file, directory, or symbolic link,
the client SHOULD NOT update the time_access attribute (via SETATTR
or a small READ or READDIR request) on the server with each read that
is satisfied from cache. The reason is that this can defeat the
performance benefits of caching content, especially since an explicit
SETATTR of time_access may alter the change attribute on the server.
If the change attribute changes, clients that are caching the content
will think the content has changed, and will re-read unmodified data
from the server. Nor is the client encouraged to maintain a modified
version of time_access in its cache, since this would mean that the
client will either eventually have to write the access time to the
server with bad performance effects, or it would never update the
server's time_access, thereby resulting in a situation where an
application that caches access time between a close and open of the
same file observes the access time oscillating between the past and
present.
The time_access attribute always means the time of last
access to a file by a read that was satisfied by the server. This way
clients will tend to see only time_access changes that go forward in
time.

9.7. Name Caching

The results of LOOKUP and READDIR operations may be cached to avoid
the cost of subsequent LOOKUP operations. Just as in the case of
attribute caching, inconsistencies may arise among the various client
caches. To mitigate the effects of these inconsistencies and given
the context of typical filesystem APIs, an upper time boundary is
maintained on how long a client name cache entry can be kept without
verifying that the entry has not been made invalid by a directory
change operation performed by another client.

When a client is not making changes to a directory for which there
exist name cache entries, the client needs to periodically fetch
attributes for that directory to ensure that it is not being
modified. After determining that no modification has occurred, the
expiration time for the associated name cache entries may be updated
to be the current time plus the name cache staleness bound.

When a client is making changes to a given directory, it needs to
determine whether there have been changes made to the directory by
other clients. It does this by using the change attribute as
reported before and after the directory operation in the associated
change_info4 value returned for the operation. The server is able to
communicate to the client whether the change_info4 data is provided
atomically with respect to the directory operation. If the change
values are provided atomically, the client is then able to compare
the pre-operation change value with the change value in the client's
name cache.
If the comparison indicates that the directory was
updated by another client, the name cache associated with the
modified directory is purged from the client. If the comparison
indicates no modification, the name cache can be updated on the
client to reflect the directory operation and the associated timeout
extended. The post-operation change value needs to be saved as the
basis for future change_info4 comparisons.

As demonstrated by the scenario above, name caching requires that the
client revalidate name cache data by inspecting the change attribute
of a directory at the point when the name cache item was cached.
This requires that the server update the change attribute for
directories when the contents of the corresponding directory are
modified. For a client to use the change_info4 information
appropriately and correctly, the server must report the pre and post
operation change attribute values atomically. When the server is
unable to report the before and after values atomically with respect
to the directory operation, the server must indicate that fact in the
change_info4 return value. When the information is not atomically
reported, the client should not assume that other clients have not
changed the directory.

9.8. Directory Caching

The results of READDIR operations may be used to avoid subsequent
READDIR operations. Just as in the cases of attribute and name
caching, inconsistencies may arise among the various client caches.
To mitigate the effects of these inconsistencies, and given the
context of typical filesystem APIs, the following rules should be
followed:

o  Cached READDIR information for a directory which is not obtained
   in a single READDIR operation must always be a consistent
   snapshot of directory contents.
   This is determined by using a
   GETATTR before the first READDIR and after the last READDIR
   that contributes to the cache.

o  An upper time boundary is maintained to indicate the length of
   time a directory cache entry is considered valid before the
   client must revalidate the cached information.

The revalidation technique parallels that discussed in the case of
name caching. When the client is not changing the directory in
question, checking the change attribute of the directory with GETATTR
is adequate. The lifetime of the cache entry can be extended at
these checkpoints. When a client is modifying the directory, the
client needs to use the change_info4 data to determine whether there
are other clients modifying the directory. If it is determined that
no other client modifications are occurring, the client may update
its directory cache to reflect its own changes.

As demonstrated previously, directory caching requires that the
client revalidate directory cache data by inspecting the change
attribute of a directory at the point when the directory was cached.
This requires that the server update the change attribute for
directories when the contents of the corresponding directory are
modified. For a client to use the change_info4 information
appropriately and correctly, the server must report the pre and post
operation change attribute values atomically. When the server is
unable to report the before and after values atomically with respect
to the directory operation, the server must indicate that fact in the
change_info4 return value. When the information is not atomically
reported, the client should not assume that other clients have not
changed the directory.

10. Minor Versioning

To address the requirement of an NFS protocol that can evolve as the
need arises, the NFS version 4 protocol contains the rules and
framework to allow for future minor changes or versioning.

The base assumption with respect to minor versioning is that any
future accepted minor version must follow the IETF process and be
documented in a standards track RFC. Therefore, each minor version
number will correspond to an RFC. Minor version zero of the NFS
version 4 protocol is represented by this RFC. The COMPOUND
procedure will support the encoding of the minor version being
requested by the client.

The following items represent the basic rules for the development of
minor versions. Note that a future minor version may decide to
modify or add to the following rules as part of the minor version
definition.

1    Procedures are not added or deleted

     To maintain the general RPC model, NFS version 4 minor versions
     will not add to or delete procedures from the NFS program.

2    Minor versions may add operations to the COMPOUND and
     CB_COMPOUND procedures.

     The addition of operations to the COMPOUND and CB_COMPOUND
     procedures does not affect the RPC model.

2.1  Minor versions may append attributes to GETATTR4args, bitmap4,
     and GETATTR4res.

     This allows for the expansion of the attribute model to allow
     for future growth or adaptation.

2.2  Minor version X must append any new attributes after the last
     documented attribute.

     Since attribute results are specified as an opaque array of
     per-attribute XDR encoded results, the complexity of adding new
     attributes in the midst of the current definitions will be too
     burdensome.

3    Minor versions must not modify the structure of an existing
     operation's arguments or results.

     Again the complexity of handling multiple structure definitions
     for a single operation is too burdensome. New operations should
     be added instead of modifying existing structures for a minor
     version.

     This rule does not preclude the following adaptations in a minor
     version.

     o  adding bits to flag fields such as new attributes to
        GETATTR's bitmap4 data type

     o  adding bits to existing attributes like ACLs that have flag
        words

     o  extending enumerated types (including NFS4ERR_*) with new
        values

4    Minor versions may not modify the structure of existing
     attributes.

5    Minor versions may not delete operations.

     This prevents the potential reuse of a particular operation
     "slot" in a future minor version.

6    Minor versions may not delete attributes.

7    Minor versions may not delete flag bits or enumeration values.

8    Minor versions may declare an operation as mandatory to NOT
     implement.

     Specifying an operation as "mandatory to not implement" is
     equivalent to obsoleting an operation. For the client, it means
     that the operation should not be sent to the server. For the
     server, an NFS error can be returned as opposed to "dropping"
     the request as an XDR decode error. This approach allows for
     the obsolescence of an operation while maintaining its structure
     so that a future minor version can reintroduce the operation.

8.1  Minor versions may declare attributes mandatory to NOT
     implement.

8.2  Minor versions may declare flag bits or enumeration values as
     mandatory to NOT implement.

9    Minor versions may downgrade features from mandatory to
     recommended, or recommended to optional.

10   Minor versions may upgrade features from optional to recommended
     or recommended to mandatory.

11   A client and server that support minor version X must support
     minor versions 0 (zero) through X-1 as well.

12   No new features may be introduced as mandatory in a minor
     version.

     This rule allows for the introduction of new functionality and
     forces the use of implementation experience before designating a
     feature as mandatory.

13   A client MUST NOT attempt to use a stateid, filehandle, or
     similar returned object from the COMPOUND procedure with minor
     version X for another COMPOUND procedure with minor version Y,
     where X != Y.

11. Internationalization

The primary issue in which NFS needs to deal with
internationalization, or I18N, is with respect to file names and
other strings as used within the protocol. The choice of string
representation must allow reasonable name/string access to clients
which use various languages. The UTF-8 encoding of the UCS as
defined by [ISO10646] allows for this type of access and follows the
policy described in "IETF Policy on Character Sets and Languages",
[RFC2277]. This choice is explained further in the following.

11.1. Universal Versus Local Character Sets

[RFC1345] describes a table of 16 bit characters for many different
languages (the bit encodings match Unicode, though of course
[RFC1345] is somewhat out of date with respect to current Unicode
assignments). Each character from each language has a unique 16 bit
value in the 16 bit character set. Thus this table can be thought of
as a universal character set. [RFC1345] then talks about groupings
of subsets of the entire 16 bit character set into "Charset Tables".
For example one might take all the Greek characters from the 16 bit
table (which are consecutively allocated), and normalize their
offsets to a table that fits in 7 bits.
Thus it is determined that
"lower case alpha" is in the same position as "upper case a" in the
US-ASCII table, and "upper case alpha" is in the same position as
"lower case a" in the US-ASCII table.

These normalized subset character sets can be thought of as "local
character sets", suitable for an operating system locale.

Local character sets are not suitable for the NFS protocol. Consider
someone who creates a file with a name in a Swedish character set.
If someone else later goes to access the file with their locale set
to the Swedish language, then there are no problems. But if someone
in say the US-ASCII locale goes to access the file, the file name
will look very different, because the Swedish characters in the 7 bit
table will now be represented in US-ASCII characters on the display.
It would be preferable to give the US-ASCII user a way to display the
file name using Swedish glyphs. In order to do that, the NFS protocol
would have to include the locale with the file name on each operation
to create a file.

However, the complexity burden of defining such locales in a way that
could be understood by all clients and servers, and maintaining them
in the face of changes would be considerable. A better solution is
desirable.

If the NFS version 4 protocol used a universal 16 bit or 32 bit
character set (or an encoding of a 16 bit or 32 bit character set
into octets), then the server and client need not care if the locale
of the user accessing the file is different than the locale of the
user who created the file. The unique 16 bit or 32 bit encoding of
the character allows for determination of what language the character
is from and also how to display that character on the client. The
server need not know what locales are used.

11.2. Overview of Universal Character Set Standards

The previous section makes a case for using a universal character
set. This section makes the case for using UTF-8 as the specific
universal character set for the NFS version 4 protocol.

[RFC2279] discusses UTF-* (UTF-8 and other UTF-XXX encodings),
Unicode, and UCS-*. There are two standards bodies managing
universal code sets:

o  ISO/IEC which has the standard 10646-1

o  Unicode which has the Unicode standard

Both standards bodies have pledged to track each other's assignments
of character codes.

The following is a brief analysis of the various standards.

UCS       Universal Character Set. This is ISO/IEC 10646-1: "a
          multi-octet character set called the Universal Character
          Set (UCS), which encompasses most of the world's writing
          systems."

UCS-2     a two octet per character encoding that addresses the first
          2^16 characters of UCS. Currently there are no UCS
          characters beyond that range.

UCS-4     a four octet per character encoding that permits the
          encoding of up to 2^31 characters.

UTF       UTF is an abbreviation of the term "UCS transformation
          format" and is used in the naming of various standards for
          encoding of UCS characters as described below.

UTF-1     Only historical interest; it has been removed from 10646-1

UTF-7     Encodes the entire "repertoire" of UCS "characters using
          only octets with the higher order bit clear". [RFC2152]
          describes UTF-7. UTF-7 accomplishes this by reserving one
          of the 7bit US-ASCII characters as a "shift" character to
          indicate non-US-ASCII characters.

UTF-8     Unlike UTF-7, uses all 8 bits of the octets. US-ASCII
          characters are encoded as before unchanged. Any octet with
          the high bit cleared can only mean a US-ASCII character.
          The high bit set means that a UCS character is being
          encoded.

UTF-16    Encodes UCS-4 characters into UCS-2 characters using a
          reserved range in UCS-2.

Unicode   Unicode and UCS-2 are the same; [RFC2279] states:

             Up to the present time, changes in Unicode and amendments
             to ISO/IEC 10646 have tracked each other, so that the
             character repertoires and code point assignments have
             remained in sync. The relevant standardization committees
             have committed to maintain this very useful synchronism.

11.3. Difficulties with UCS-4, UCS-2, Unicode

Adapting existing applications, and filesystems to multi-octet
schemes like UCS and Unicode can be difficult. A significant amount
of code has been written to process streams of bytes. Also there are
many existing stored objects described with 7 bit or 8 bit
characters. Doubling or quadrupling the bandwidth and storage
requirements seems like an expensive way to accomplish I18N.

UCS-2 and Unicode are "only" 16 bits long. That might seem to be
enough but, according to [Unicode1], 49,194 Unicode characters are
already assigned. According to [Unicode2] there are still more
languages that need to be added.

11.4. UTF-8 and its solutions

UTF-8 solves problems for NFS that exist with the use of UCS and
Unicode. UTF-8 will encode 16 bit and 32 bit characters in a way
that will be compact for most users. The encoding table from UCS-4 to
UTF-8, as copied from [RFC2279]:

   UCS-4 range (hex.)    UTF-8 octet sequence (binary)
   0000 0000-0000 007F   0xxxxxxx
   0000 0080-0000 07FF   110xxxxx 10xxxxxx
   0000 0800-0000 FFFF   1110xxxx 10xxxxxx 10xxxxxx

   0001 0000-001F FFFF   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
   0020 0000-03FF FFFF   111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
   0400 0000-7FFF FFFF   1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
                         10xxxxxx

See [RFC2279] for precise encoding and decoding rules.
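
The table can be read as a small algorithm: pick the row by range,
fill the low-order bits into continuation octets of the form
10xxxxxx, and place the remaining bits behind the row's leading
marker. The following is a minimal sketch of that reading, not a
replacement for the rules in [RFC2279]. The function name
ucs4_to_utf8 is hypothetical, and rejecting the values D800 through
DFFF (which UTF-16 reserves) is a simplifying assumption of this
sketch.

```python
def ucs4_to_utf8(code_point):
    """Encode one UCS-4 value following the octet-sequence table."""
    if code_point < 0 or code_point > 0x7FFFFFFF:
        raise ValueError("value is outside the 31-bit UCS-4 range")
    if 0xD800 <= code_point <= 0xDFFF:
        # Assumption of this sketch: values reserved for UTF-16 are rejected.
        raise ValueError("value lies in the range reserved for UTF-16")
    if code_point <= 0x7F:
        # First table row: US-ASCII is encoded unchanged.
        return bytes([code_point])

    # Remaining rows: (upper bound of range, octet count, leading marker).
    rows = [
        (0x7FF, 2, 0xC0),        # 110xxxxx
        (0xFFFF, 3, 0xE0),       # 1110xxxx
        (0x1FFFFF, 4, 0xF0),     # 11110xxx
        (0x3FFFFFF, 5, 0xF8),    # 111110xx
        (0x7FFFFFFF, 6, 0xFC),   # 1111110x
    ]
    for upper, length, marker in rows:
        if code_point <= upper:
            break

    octets = []
    value = code_point
    # Emit continuation octets from the low-order end, 6 bits at a time.
    for _ in range(length - 1):
        octets.append(0x80 | (value & 0x3F))
        value >>= 6
    # Whatever bits remain fit behind the leading marker.
    octets.append(marker | value)
    return bytes(reversed(octets))


if __name__ == "__main__":
    # LATIN SMALL LETTER A WITH RING ABOVE, common in Swedish file names.
    print(ucs4_to_utf8(0xE5).hex())  # c3a5
```

For values that Python itself can represent as characters, the result
agrees with the built-in str.encode("utf-8"); the five and six octet
rows exist only in the [RFC2279] table.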
Note that, because
of UTF-16, the algorithm from Unicode/UCS-2 to UTF-8 needs to account
for the reserved range between D800 and DFFF.

Note that the 16 bit UCS or Unicode characters require no more than 3
octets to encode into UTF-8.

Interestingly, UTF-8 has room to handle characters larger than 31
bits, because the leading octet of form:

   1111111x

is not defined. If needed, ISO could either use that octet to
indicate a sequence of an encoded 8 octet character, or perhaps use
11111110 to permit the next octet to indicate an even more expandable
character set.

So using UTF-8 to represent character encodings means never having to
run out of room.

11.5. Normalization

The client and server operating environments may differ in their
policies and operational methods with respect to character
normalization (See [Unicode1] for a discussion of normalization
forms). This difference may also exist between applications on the
same client. This adds to the difficulty of providing a single
normalization policy for the protocol that allows for maximal
interoperability. This issue is similar to the character case issues
where the server may or may not support case insensitive file name
matching and may or may not preserve the character case when storing
file names. The protocol does not mandate a particular behavior but
allows for the various permutations.

The NFS version 4 protocol does not mandate the use of a particular
normalization form at this time. A later revision of this
specification may specify a particular normalization form.
Therefore, the server and client can expect that they may receive
unnormalized characters within protocol requests and responses.
If
the operating environment requires normalization, then the
implementation must normalize the various UTF-8 encoded strings
within the protocol before presenting the information to an
application (at the client) or local filesystem (at the server).

11.6. UTF-8 Related Errors

Where the client sends an invalid UTF-8 string, the server should
return an NFS4ERR_INVAL error. This includes cases in which
inappropriate prefixes are detected and where the count includes
trailing bytes that do not constitute a full UCS character.

Where the client supplied string is valid UTF-8 but contains
characters that are not supported by the server as a value for that
string (e.g. names containing characters that have more than two
octets on a filesystem that supports Unicode characters only), the
server should return an NFS4ERR_BADCHAR error.

Where a UTF-8 string is used as a file name, and the filesystem,
while supporting all of the characters within the name, does not
allow that particular name to be used, the server should return the
error NFS4ERR_BADNAME. This includes situations in which the server
filesystem imposes a normalization constraint on name strings, but
will also include such situations as filesystem prohibitions of "."
and ".." as file names for certain operations, and other such
constraints.

12. Error Definitions

NFS error numbers are assigned to failed operations within a compound
request. A compound request contains a number of NFS operations that
have their results encoded in sequence in a compound reply. The
results of successful operations will consist of an NFS4_OK status
followed by the encoded results of the operation.
If an NFS
operation fails, an error status will be entered in the reply and the
compound request will be terminated.

A description of each defined error follows:

NFS4_OK                 Indicates the operation completed successfully.

NFS4ERR_ACCESS          Permission denied. The caller does not have the
                        correct permission to perform the requested
                        operation. Contrast this with NFS4ERR_PERM,
                        which restricts itself to owner or privileged
                        user permission failures.

NFS4ERR_ATTRNOTSUPP     An attribute specified is not supported by the
                        server. Does not apply to the GETATTR
                        operation.

NFS4ERR_BADCHAR         A UTF-8 string contains a character which is
                        not supported by the server in the context in
                        which it is being used.

NFS4ERR_BAD_COOKIE      READDIR cookie is stale.

NFS4ERR_BADHANDLE       Illegal NFS filehandle. The filehandle failed
                        internal consistency checks.

NFS4ERR_BADNAME         A name string in a request consists of valid
                        UTF-8 characters supported by the server but
                        the name is not supported by the server as a
                        valid name for current operation.

NFS4ERR_BADOWNER        An owner, owner_group, or ACL attribute value
                        can not be translated to local representation.

NFS4ERR_BADTYPE         An attempt was made to create an object of a
                        type not supported by the server.

NFS4ERR_BAD_RANGE       The range for a LOCK, LOCKT, or LOCKU operation
                        is not appropriate to the allowable range of
                        offsets for the server.

NFS4ERR_BAD_SEQID       The sequence number in a locking request is
                        neither the next expected number nor the last
                        number processed.

NFS4ERR_BAD_STATEID     A stateid generated by the current server
                        instance, but which does not designate any
                        locking state (either current or superseded)
                        for a current lockowner-file pair, was used.

NFS4ERR_BADXDR          The server encountered an XDR decoding error
                        while processing an operation.

NFS4ERR_CLID_INUSE      The SETCLIENTID operation has found that a
                        client id is already in use by another client.

NFS4ERR_DEADLOCK        The server has been able to determine a file
                        locking deadlock condition for a blocking lock
                        request.

NFS4ERR_DELAY           The server initiated the request, but was not
                        able to complete it in a timely fashion. The
                        client should wait and then try the request
                        with a new RPC transaction ID. For example,
                        this error should be returned from a server
                        that supports hierarchical storage and receives
                        a request to process a file that has been
                        migrated. In this case, the server should start
                        the immigration process and respond to the
                        client with this error. This error may also
                        occur when a necessary delegation recall makes
                        processing a request in a timely fashion
                        impossible.

NFS4ERR_DENIED          An attempt to lock a file is denied. Since
                        this may be a temporary condition, the client
                        is encouraged to retry the lock request until
                        the lock is accepted.

NFS4ERR_DQUOT           Resource (quota) hard limit exceeded. The
                        user's resource limit on the server has been
                        exceeded.

NFS4ERR_EXIST           File exists. The file specified already exists.

NFS4ERR_EXPIRED         A lease has expired that is being used in the
                        current operation.

NFS4ERR_FBIG            File too large. The operation would have caused
                        a file to grow beyond the server's limit.

NFS4ERR_FHEXPIRED       The filehandle provided is volatile and has
                        expired at the server.

NFS4ERR_FILE_OPEN       The operation can not be successfully processed
                        because a file involved in the operation is
                        currently open.

NFS4ERR_GRACE           The server is in its recovery or grace period
                        which should match the lease period of the
                        server.

NFS4ERR_INVAL           Invalid argument or unsupported argument for an
                        operation. Two examples are attempting a
                        READLINK on an object other than a symbolic
                        link or attempting to SETATTR a time field on a
                        server that does not support this operation.

NFS4ERR_IO              I/O error. A hard error (for example, a disk
                        error) occurred while processing the requested
                        operation.

NFS4ERR_ISDIR           Is a directory. The caller specified a
                        directory in a non-directory operation.

NFS4ERR_LEASE_MOVED     A lease being renewed is associated with a
                        filesystem that has been migrated to a new
                        server.

NFS4ERR_LOCKED          A read or write operation was attempted on a
                        locked file.

NFS4ERR_LOCK_NOTSUPP    Server does not support atomic upgrade or
                        downgrade of locks.

NFS4ERR_LOCK_RANGE      A lock request is operating on a sub-range of a
                        current lock for the lock owner and the server
                        does not support this type of request.

NFS4ERR_LOCKS_HELD      A CLOSE was attempted and file locks would
                        exist after the CLOSE.

NFS4ERR_MINOR_VERS_MISMATCH
                        The server has received a request that
                        specifies an unsupported minor version. The
                        server must return a COMPOUND4res with a zero
                        length operations result array.

NFS4ERR_MLINK           Too many hard links.

NFS4ERR_MOVED           The filesystem which contains the current
                        filehandle object has been relocated or
                        migrated to another server. The client may
                        obtain the new filesystem location by obtaining
                        the "fs_locations" attribute for the current
                        filehandle. For further discussion, refer to
                        the section "Filesystem Migration or
                        Relocation".

NFS4ERR_NAMETOOLONG     The filename in an operation was too long.

NFS4ERR_NODEV           No such device.

NFS4ERR_NOENT           No such file or directory. The file or
                        directory name specified does not exist.

NFS4ERR_NOFILEHANDLE    The logical current filehandle value (or, in
                        the case of RESTOREFH, the saved filehandle
                        value) has not been set properly. This may be
                        a result of a malformed COMPOUND operation
                        (i.e. no PUTFH or PUTROOTFH before an operation
                        that requires the current filehandle be set).

NFS4ERR_NO_GRACE        A reclaim of client state has fallen outside of
                        the grace period of the server. As a result,
                        the server can not guarantee that conflicting
                        state has not been provided to another client.

NFS4ERR_NOSPC           No space left on device. The operation would
                        have caused the server's filesystem to exceed
                        its limit.

NFS4ERR_NOTDIR          Not a directory. The caller specified a
                        non-directory in a directory operation.

NFS4ERR_NOTEMPTY        An attempt was made to remove a directory that
                        was not empty.

NFS4ERR_NOTSUPP         Operation is not supported.

NFS4ERR_NOT_SAME        This error is returned by the VERIFY operation
                        to signify that the attributes compared were
                        not the same as provided in the client's
                        request.

NFS4ERR_NXIO            I/O error. No such device or address.

NFS4ERR_OLD_STATEID     A stateid which designates the locking state
                        for a lockowner-file at an earlier time was
                        used.

NFS4ERR_OPENMODE        The client attempted a READ, WRITE, LOCK or
                        SETATTR operation not sanctioned by the stateid
                        passed (e.g. writing to a file opened only for
                        read).

NFS4ERR_OP_ILLEGAL      An illegal operation value has been specified
                        in the argop field of a COMPOUND or CB_COMPOUND
                        procedure.

NFS4ERR_PERM            Not owner. The operation was not allowed
                        because the caller is either not a privileged
                        user (root) or not the owner of the target of
                        the operation.

NFS4ERR_READDIR_NOSPC   The encoded response to a READDIR request
                        exceeds the size limit set by the initial
                        request.

NFS4ERR_RECLAIM_BAD     The reclaim provided by the client does not
                        match any of the server's state consistency
                        checks and is bad.

NFS4ERR_RECLAIM_CONFLICT
                        The reclaim provided by the client has
                        encountered a conflict and can not be provided.
                        Potentially indicates a misbehaving client.

NFS4ERR_RESOURCE        For the processing of the COMPOUND procedure,
                        the server may exhaust available resources and
                        can not continue processing operations within
                        the COMPOUND procedure. This error will be
                        returned from the server in those instances of
                        resource exhaustion related to the processing
                        of the COMPOUND procedure.

NFS4ERR_ROFS            Read-only filesystem. A modifying operation was
                        attempted on a read-only filesystem.

NFS4ERR_SAME            This error is returned by the NVERIFY operation
                        to signify that the attributes compared were
                        the same as provided in the client's request.

NFS4ERR_SERVERFAULT     An error occurred on the server which does not
                        map to any of the legal NFS version 4 protocol
                        error values. The client should translate this
                        into an appropriate error. UNIX clients may
                        choose to translate this to EIO.

NFS4ERR_SHARE_DENIED    An attempt to OPEN a file with a share
                        reservation has failed because of a share
                        conflict.

NFS4ERR_STALE           Invalid filehandle. The filehandle given in the
                        arguments was invalid. The file referred to by
                        that filehandle no longer exists or access to
                        it has been revoked.

NFS4ERR_STALE_CLIENTID  A clientid not recognized by the server was
                        used in a locking or SETCLIENTID_CONFIRM
                        request.

NFS4ERR_STALE_STATEID   A stateid generated by an earlier server
                        instance was used.

NFS4ERR_SYMLINK         The current filehandle provided for a LOOKUP is
                        not a directory but a symbolic link.
                           Also used if the final component of the OPEN
                           path is a symbolic link.

   NFS4ERR_TOOSMALL        Buffer or request is too small.

   NFS4ERR_WRONGSEC        The security mechanism being used by the client
                           for the operation does not match the server's
                           security policy. The client should change the
                           security mechanism being used and retry the
                           operation.

   NFS4ERR_XDEV            Attempt to do an operation between different
                           fsids.

13.  NFS version 4 Requests

   For the NFS version 4 RPC program, there are two traditional RPC
   procedures: NULL and COMPOUND. All other functionality is defined as
   a set of operations and these operations are defined in normal
   XDR/RPC syntax and semantics. However, these operations are
   encapsulated within the COMPOUND procedure. This requires that the
   client combine one or more of the NFS version 4 operations into a
   single request.

   The NFS4_CALLBACK program is used to provide server to client
   signaling and is constructed in a fashion similar to the NFS version
   4 program. The procedures CB_NULL and CB_COMPOUND are defined in the
   same way as NULL and COMPOUND are within the NFS program. The
   CB_COMPOUND request also encapsulates the remaining operations of the
   NFS4_CALLBACK program. There is no predefined RPC program number for
   the NFS4_CALLBACK program. It is up to the client to specify a
   program number in the "transient" program range. The program and
   port number of the NFS4_CALLBACK program are provided by the client
   as part of the SETCLIENTID/SETCLIENTID_CONFIRM sequence. The program
   and port can be changed by another SETCLIENTID/SETCLIENTID_CONFIRM
   sequence, and it is possible to use the sequence to change them
   within a client incarnation without removing relevant leased client
   state.

13.1.  Compound Procedure

   The COMPOUND procedure provides the opportunity for better
   performance within high latency networks. The client can avoid
   cumulative latency of multiple RPCs by combining multiple dependent
   operations into a single COMPOUND procedure. A compound operation
   may provide for protocol simplification by allowing the client to
   combine basic procedures into a single request that is customized for
   the client's environment.

   The CB_COMPOUND procedure precisely parallels the features of
   COMPOUND as described above.

   The basic structure of the COMPOUND procedure is:

   +-----+--------------+--------+-----------+-----------+-----------+--
   | tag | minorversion | numops | op + args | op + args | op + args |
   +-----+--------------+--------+-----------+-----------+-----------+--

   and the reply's structure is:

   +------------+-----+--------+-----------------------+--
   |last status | tag | numres | status + op + results |
   +------------+-----+--------+-----------------------+--

   The numops and numres fields, used in the depiction above, represent
   the count for the counted array encoding used to signify the number
   of arguments or results encoded in the request and response. As per
   the XDR encoding, these counts must match exactly the number of
   operation arguments or results encoded.

13.2.  Evaluation of a Compound Request

   The server will process the COMPOUND procedure by evaluating each of
   the operations within the COMPOUND procedure in order. Each
   component operation consists of a 32 bit operation code, followed by
   the argument of length determined by the type of operation. The
   results of each operation are encoded in sequence into a reply
   buffer. The results of each operation are preceded by the opcode and
   a status code (normally zero).
   If an operation results in a non-zero status code, the status will
   be encoded and evaluation of the compound sequence will halt and the
   reply will be returned. Note that evaluation stops even in the event
   of "non error" conditions such as NFS4ERR_SAME.

   There are no atomicity requirements for the operations contained
   within the COMPOUND procedure. The operations being evaluated as
   part of a COMPOUND request may be evaluated simultaneously with other
   COMPOUND requests that the server receives.

   The client is responsible for recovering from any partially
   completed COMPOUND procedure. Partially completed COMPOUND
   procedures may occur at any point due to errors such as
   NFS4ERR_RESOURCE and NFS4ERR_DELAY. This may occur even given an
   otherwise valid operation string. Further, a server reboot which
   occurs in the middle of processing a COMPOUND procedure may leave the
   client with the difficult task of determining how far COMPOUND
   processing has proceeded. Therefore, the client should avoid overly
   complex COMPOUND procedures, which complicate recovery when an
   operation within the procedure fails.

   Each operation assumes a "current" and "saved" filehandle that are
   available as part of the execution context of the compound request.
   Operations may set, change, or return the current filehandle. The
   "saved" filehandle is used for temporary storage of a filehandle
   value and as operands for the RENAME and LINK operations.

13.3.  Synchronous Modifying Operations

   NFS version 4 operations that modify the filesystem are synchronous.
   When an operation is successfully completed at the server, the client
   can rely on any data associated with the request being on stable
   storage (the one exception is in the case of the file data in a WRITE
   operation with the UNSTABLE option specified).
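
   As an illustration of the evaluation rules in Section 13.2, the
   following Python sketch shows a server loop that evaluates operations
   in order and halts at the first non-zero status. It is not part of
   the protocol definition. The names evaluate_compound and OpHandler,
   the handler signature, and the context dictionary are hypothetical,
   and the numeric value given for NFS4ERR_SAME is assumed for the
   example only.

```python
from typing import Callable, Dict, List, Tuple

NFS4_OK = 0
NFS4ERR_SAME = 10009  # assumed value, used here only as a non-zero status

# Hypothetical handler type: receives the execution context, returns a status.
OpHandler = Callable[[Dict], int]


def evaluate_compound(
    ops: List[Tuple[str, OpHandler]]
) -> Tuple[int, List[Tuple[str, int]]]:
    """Evaluate operations in order; stop at the first non-zero status."""
    # The "current" and "saved" filehandles form the execution context.
    ctx: Dict = {"current_fh": None, "saved_fh": None}
    results: List[Tuple[str, int]] = []
    status = NFS4_OK
    for opname, handler in ops:
        status = handler(ctx)
        # Each result is recorded with its opcode name and status.
        results.append((opname, status))
        if status != NFS4_OK:
            # Halt even for "non error" conditions such as NFS4ERR_SAME.
            break
    # The overall status equals the status of the last operation executed.
    return status, results
```

   In this sketch the operations after the failing one are never
   invoked, so the reply carries results only for the operations that
   were actually evaluated.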

   Because modifying operations are synchronous, any previous operations
   within the same compound request are also reflected in stable
   storage. This behavior enables the client to recover from a
   partially executed compound request which may have resulted from the
   failure of the server. For example, if a compound request contains
   operations A and B and the server is unable to send a response to the
   client, depending on the progress the server made in servicing the
   request the result of both operations may be reflected in stable
   storage or just operation A may be reflected. The server must not
   have just the results of operation B in stable storage.

13.4.  Operation Values

   The operations encoded in the COMPOUND procedure are identified by
   operation values. To avoid overlap with the RPC procedure numbers,
   operations 0 (zero) and 1 are not defined. Operation 2 is not
   defined but reserved for future use with minor versioning.

14.  NFS version 4 Procedures

14.1.  Procedure 0: NULL - No Operation

   SYNOPSIS

     <null>

   ARGUMENT

     void;

   RESULT

     void;

   DESCRIPTION

     Standard NULL procedure. Void argument, void response. This
     procedure has no functionality associated with it. Because of this
     it is sometimes used to measure the overhead of processing a
     service request. Therefore, the server should ensure that no
     unnecessary work is done in servicing this procedure.

   ERRORS

     None.

14.2.  Procedure 1: COMPOUND - Compound Operations

   SYNOPSIS

     compoundargs -> compoundres

   ARGUMENT

     union nfs_argop4 switch (nfs_opnum4 argop) {
             case <OPCODE>: <argument>;
             ...
     };

     struct COMPOUND4args {
             utf8string      tag;
             uint32_t        minorversion;
             nfs_argop4      argarray<>;
     };

   RESULT

     union nfs_resop4 switch (nfs_opnum4 resop){
             case <OPCODE>: <result>;
             ...
     };

     struct COMPOUND4res {
             nfsstat4        status;
             utf8string      tag;
             nfs_resop4      resarray<>;
     };

   DESCRIPTION

     The COMPOUND procedure is used to combine one or more of the NFS
     operations into a single RPC request. The main NFS RPC program has
     two main procedures: NULL and COMPOUND. All other operations use
     the COMPOUND procedure as a wrapper.

     The COMPOUND procedure is used to combine individual operations
     into a single RPC request. The server interprets each of the
     operations in turn. If an operation is executed by the server and
     the status of that operation is NFS4_OK, then the next operation in
     the COMPOUND procedure is executed. The server continues this
     process until there are no more operations to be executed or one of
     the operations has a status value other than NFS4_OK.

     In the processing of the COMPOUND procedure, the server may find
     that it does not have the available resources to execute any or all
     of the operations within the COMPOUND sequence. In this case, the
     error NFS4ERR_RESOURCE will be returned for the particular
     operation within the COMPOUND procedure where the resource
     exhaustion occurred. This assumes that all previous operations
     within the COMPOUND sequence have been evaluated successfully. The
     results for all of the evaluated operations must be returned to the
     client.

     The server will generally choose between two methods of decoding
     the client's request. The first would be the traditional one pass
     XDR decode. If there is an XDR decoding error in this case, the
     RPC XDR decode error would be returned.
     The second method would be to make an initial pass to decode the
     basic COMPOUND request and then to XDR decode the individual
     operations; the most interesting is the decode of attributes. In
     this case, the server may encounter an XDR decode error during the
     second pass. The server would then return the error NFS4ERR_BADXDR
     to signify the decode error.

     The COMPOUND arguments contain a "minorversion" field. The initial
     and default value for this field is 0 (zero). This field will be
     used by future minor versions such that the client can communicate
     to the server what minor version is being requested. If the server
     receives a COMPOUND procedure with a minorversion field value that
     it does not support, the server MUST return an error of
     NFS4ERR_MINOR_VERS_MISMATCH and a zero length resultdata array.

     Contained within the COMPOUND results is a "status" field. If the
     results array length is non-zero, this status must be equivalent to
     the status of the last operation that was executed within the
     COMPOUND procedure. Therefore, if an operation incurred an error
     then the "status" value will be the same error value as is being
     returned for the operation that failed.

     Note that operations 0 (zero) and 1 (one) are not defined for the
     COMPOUND procedure. Operation 2 is not defined but reserved for
     future definition and use with minor versioning. If the server
     receives an operation array that contains operation 2 and the
     minorversion field has a value of 0 (zero), an error of
     NFS4ERR_OP_ILLEGAL, as described in the next paragraph, is returned
     to the client. If an operation array contains an operation 2 and
     the minorversion field is non-zero and the server does not support
     the minor version, the server returns an error of
     NFS4ERR_MINOR_VERS_MISMATCH.
     Therefore, the NFS4ERR_MINOR_VERS_MISMATCH error takes precedence
     over all other errors.

     It is possible that the server receives a request that contains an
     operation that is less than the first legal operation (OP_ACCESS)
     or greater than the last legal operation (OP_RELEASE_LOCKOWNER).
     In this case, the server's response will encode the opcode
     OP_ILLEGAL rather than the illegal opcode of the request. The
     status field in the ILLEGAL return results will be set to
     NFS4ERR_OP_ILLEGAL. The COMPOUND procedure's return results will
     also be NFS4ERR_OP_ILLEGAL.

     The definition of the "tag" in the request is left to the
     implementor. It may be used to summarize the content of the
     compound request for the benefit of packet sniffers and engineers
     debugging implementations. However, the value of "tag" in the
     response SHOULD be the same value as provided in the request. This
     applies to the tag field of the CB_COMPOUND procedure as well.

   IMPLEMENTATION

     Since an error of any type may occur after only a portion of the
     operations have been evaluated, the client must be prepared to
     recover from any failure. If the source of an NFS4ERR_RESOURCE
     error was a complex or lengthy set of operations, it is likely that
     if the number of operations were reduced the server would be able
     to evaluate them successfully. Therefore, the client is
     responsible for dealing with this type of complexity in recovery.

   ERRORS

     All errors defined in the protocol

14.2.1.  Operation 3: ACCESS - Check Access Rights

   SYNOPSIS

     (cfh), accessreq -> supported, accessrights

   ARGUMENT

     const ACCESS4_READ      = 0x00000001;
     const ACCESS4_LOOKUP    = 0x00000002;
     const ACCESS4_MODIFY    = 0x00000004;
     const ACCESS4_EXTEND    = 0x00000008;
     const ACCESS4_DELETE    = 0x00000010;
     const ACCESS4_EXECUTE   = 0x00000020;

     struct ACCESS4args {
             /* CURRENT_FH: object */
             uint32_t        access;
     };

   RESULT

     struct ACCESS4resok {
             uint32_t        supported;
             uint32_t        access;
     };

     union ACCESS4res switch (nfsstat4 status) {
      case NFS4_OK:
              ACCESS4resok   resok4;
      default:
              void;
     };

   DESCRIPTION

     ACCESS determines the access rights that a user, as identified by
     the credentials in the RPC request, has with respect to the file
     system object specified by the current filehandle. The client
     encodes the set of access rights that are to be checked in the bit
     mask "access". The server checks the permissions encoded in the
     bit mask. If a status of NFS4_OK is returned, two bit masks are
     included in the response. The first, "supported", represents the
     access rights which the server can verify reliably. The second,
     "access", represents the access rights available to the user for
     the filehandle provided. On success, the current filehandle
     retains its value.

     Note that the supported field will contain only as many values as
     were originally sent in the arguments. For example, if the client
     sends an ACCESS operation with only the ACCESS4_READ value set and
     the server supports this value, the server will return only
     ACCESS4_READ even if it could have reliably checked other values.

     The results of this operation are necessarily advisory in nature.
     A return status of NFS4_OK and the appropriate bit set in the bit
     mask does not imply that such access will be allowed to the file
     system object in the future. This is because access rights can be
     revoked by the server at any time.

     The following access permissions may be requested:

     ACCESS4_READ      Read data from file or read a directory.

     ACCESS4_LOOKUP    Look up a name in a directory (no meaning for
                       non-directory objects).

     ACCESS4_MODIFY    Rewrite existing file data or modify existing
                       directory entries.

     ACCESS4_EXTEND    Write new data or add directory entries.

     ACCESS4_DELETE    Delete an existing directory entry (no meaning for
                       non-directory objects).

     ACCESS4_EXECUTE   Execute file (no meaning for a directory).

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     In general, it is not sufficient for the client to attempt to
     deduce access permissions by inspecting the uid, gid, and mode
     fields in the file attributes or by attempting to interpret the
     contents of the ACL attribute. This is because the server may
     perform uid or gid mapping or enforce additional access control
     restrictions. It is also possible that the server may not be in
     the same ID space as the client. In these cases (and perhaps
     others), the client can not reliably perform an access check with
     only current file attributes.

     In the NFS version 2 protocol, the only reliable way to determine
     whether an operation was allowed was to try it and see if it
     succeeded or failed. Using the ACCESS operation in the NFS version
     4 protocol, the client can ask the server to indicate whether or
     not one or more classes of operations are permitted. The ACCESS
     operation is provided to allow clients to check before doing a
     series of operations which will result in an access failure.
     The OPEN operation provides a point where the server can verify
     access to the file object and a method to return that information
     to the client. The ACCESS operation is still useful for directory
     operations or for use in the case the UNIX API "access" is used on
     the client.

     The information returned by the server in response to an ACCESS
     call is not permanent. It was correct at the exact time that the
     server performed the checks, but not necessarily afterwards. The
     server can revoke access permission at any time.

     The client should use the effective credentials of the user to
     build the authentication information in the ACCESS request used to
     determine access rights. It is the effective user and group
     credentials that are used in subsequent read and write operations.

     Many implementations do not directly support the ACCESS4_DELETE
     permission. Operating systems like UNIX will ignore the
     ACCESS4_DELETE bit if set on an access request on a non-directory
     object. In these systems, delete permission on a file is
     determined by the access permissions on the directory in which the
     file resides, instead of being determined by the permissions of the
     file itself. Therefore, the mask returned enumerating which access
     rights can be determined will have the ACCESS4_DELETE value set to
     0. This indicates to the client that the server was unable to
     check that particular access right. The ACCESS4_DELETE bit in the
     access mask returned will then be ignored by the client.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.2.  Operation 4: CLOSE - Close File

   SYNOPSIS

     (cfh), seqid, open_stateid -> open_stateid

   ARGUMENT

     struct CLOSE4args {
             /* CURRENT_FH: object */
             seqid4          seqid;
             stateid4        open_stateid;
     };

   RESULT

     union CLOSE4res switch (nfsstat4 status) {
      case NFS4_OK:
              stateid4       open_stateid;
      default:
              void;
     };

   DESCRIPTION

     The CLOSE operation releases share reservations for the regular or
     named attribute file as specified by the current filehandle. The
     share reservations and other state information released at the
     server as a result of this CLOSE are associated only with the
     supplied stateid. The sequence id provides for the correct
     ordering. State associated with other OPENs is not affected.

     If record locks are held, the client SHOULD release all locks
     before issuing a CLOSE. The server MAY free all outstanding locks
     on CLOSE but some servers may not support the CLOSE of a file that
     still has record locks held. The server MUST return failure if any
     locks would exist after the CLOSE.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     Even though CLOSE returns a stateid, this stateid is not useful to
     the client and should be treated as deprecated. CLOSE "shuts down"
     the state associated with all OPENs for the file by a single
     open_owner. As noted above, CLOSE will either release all file
     locking state or return an error. Therefore, the stateid returned
     by CLOSE is not useful for operations that follow.
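
     The rule that CLOSE either releases all record locks or fails can
     be sketched as follows. This is an illustrative Python sketch, not
     a server implementation: the OpenState class, the close_file
     function, and the free_locks_on_close policy flag are hypothetical
     names, and the numeric value given for NFS4ERR_LOCKS_HELD is
     assumed for the example only.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

NFS4_OK = 0
NFS4ERR_LOCKS_HELD = 10037  # assumed value, for illustration only


@dataclass
class OpenState:
    """Hypothetical state held for one open_owner on one file."""
    share_reserved: bool = True
    # Record locks as (offset, length) pairs.
    record_locks: Set[Tuple[int, int]] = field(default_factory=set)


def close_file(state: OpenState, free_locks_on_close: bool) -> int:
    """Release the share reservation, or fail if locks would remain."""
    if state.record_locks:
        if not free_locks_on_close:
            # This server does not free locks on CLOSE, so locks would
            # still exist afterwards: the operation must fail.
            return NFS4ERR_LOCKS_HELD
        # This server chooses to free all outstanding locks on CLOSE.
        state.record_locks.clear()
    state.share_reserved = False
    return NFS4_OK
```

     In both outcomes no lock survives a successful CLOSE, which is why
     the stateid returned by CLOSE has no further use.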

   ERRORS

     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_SEQID
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_EXPIRED
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCKS_HELD
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_OLD_STATEID
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_STATEID

14.2.3.  Operation 5: COMMIT - Commit Cached Data

   SYNOPSIS

     (cfh), offset, count -> verifier

   ARGUMENT

     struct COMMIT4args {
             /* CURRENT_FH: file */
             offset4         offset;
             count4          count;
     };

   RESULT

     struct COMMIT4resok {
             verifier4       writeverf;
     };

     union COMMIT4res switch (nfsstat4 status) {
      case NFS4_OK:
              COMMIT4resok   resok4;
      default:
              void;
     };

   DESCRIPTION

     The COMMIT operation forces or flushes data to stable storage for
     the file specified by the current filehandle. The flushed data is
     that which was previously written with a WRITE operation which had
     the stable field set to UNSTABLE4.

     The offset specifies the position within the file where the flush
     is to begin. An offset value of 0 (zero) means to flush data
     starting at the beginning of the file. The count specifies the
     number of bytes of data to flush. If count is 0 (zero), a flush
     from offset to the end of the file is done.

     The server returns a write verifier upon successful completion of
     the COMMIT. The write verifier is used by the client to determine
     if the server has restarted or rebooted between the initial
     WRITE(s) and the COMMIT. The client does this by comparing the
     write verifier returned from the initial writes and the verifier
     returned by the COMMIT operation.
     The server must vary the value of the write verifier at each server
     event or instantiation that may lead to a loss of uncommitted data.
     Most commonly this occurs when the server is rebooted; however,
     other events at the server may result in uncommitted data loss as
     well.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     The COMMIT operation is similar in operation and semantics to the
     POSIX fsync(2) system call that synchronizes a file's state with
     the disk (file data and metadata are flushed to disk or stable
     storage). COMMIT performs the same operation for a client, flushing
     any unsynchronized data and metadata on the server to the server's
     disk or stable storage for the specified file. Like fsync(2), it
     may be that there is some modified data or no modified data to
     synchronize. The data may have been synchronized by the server's
     normal periodic buffer synchronization activity. COMMIT should
     return NFS4_OK, unless there has been an unexpected error.

     COMMIT differs from fsync(2) in that it is possible for the client
     to flush a range of the file (most likely triggered by a
     buffer-reclamation scheme on the client before the file has been
     completely written).

     The server implementation of COMMIT is reasonably simple. If the
     server receives a full file COMMIT request, that is starting at
     offset 0 and count 0, it should do the equivalent of fsync()'ing
     the file. Otherwise, it should arrange to have the cached data in
     the range specified by offset and count flushed to stable storage.
     In both cases, any metadata associated with the file must be
     flushed to stable storage before returning. It is not an error
     for there to be nothing to flush on the server.
     This means that the data and metadata that needed to be flushed
     have already been flushed or lost during the last server failure.

     The client implementation of COMMIT is a little more complex.
     There are two reasons for wanting to commit a client buffer to
     stable storage. The first is that the client wants to reuse a
     buffer. In this case, the offset and count of the buffer are sent
     to the server in the COMMIT request. The server then flushes any
     cached data based on the offset and count, and flushes any metadata
     associated with the file. It then returns the status of the flush
     and the write verifier. The other reason for the client to
     generate a COMMIT is for a full file flush, such as may be done at
     close. In this case, the client would gather all of the buffers
     for this file that contain uncommitted data, do the COMMIT
     operation with an offset of 0 and count of 0, and then free all of
     those buffers. Any other dirty buffers would be sent to the server
     in the normal fashion.

     After a buffer is written by the client with the stable parameter
     set to UNSTABLE4, the buffer must be considered as modified by the
     client until the buffer has either been flushed via a COMMIT
     operation or written via a WRITE operation with stable parameter
     set to FILE_SYNC4 or DATA_SYNC4. This is done to prevent the buffer
     from being freed and reused before the data can be flushed to
     stable storage on the server.

     When a response is returned from either a WRITE or a COMMIT
     operation and it contains a write verifier that is different from
     that previously returned by the server, the client will need to
     retransmit all of the buffers containing uncommitted cached data to
     the server. How this is to be done is up to the implementor.
     If there is only one buffer of interest, then it should probably be
     sent back over in a WRITE request with the appropriate stable
     parameter. If there is more than one buffer, it might be
     worthwhile retransmitting all of the buffers in WRITE requests with
     the stable parameter set to UNSTABLE4 and then retransmitting the
     COMMIT operation to flush all of the data on the server to stable
     storage. The timing of these retransmissions is left to the
     implementor.

     The above description applies to page-cache-based systems as well
     as buffer-cache-based systems. In those systems, the virtual
     memory system will need to be modified instead of the buffer cache.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_ISDIR
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.4.  Operation 6: CREATE - Create a Non-Regular File Object

   SYNOPSIS

     (cfh), name, type, attrs -> (cfh), change_info, attrs_set

   ARGUMENT

     union createtype4 switch (nfs_ftype4 type) {
      case NF4LNK:
              linktext4      linkdata;
      case NF4BLK:
      case NF4CHR:
              specdata4      devdata;
      case NF4SOCK:
      case NF4FIFO:
      case NF4DIR:
              void;
     };

     struct CREATE4args {
             /* CURRENT_FH: directory for creation */
             createtype4     objtype;
             component4      objname;
             fattr4          createattrs;
     };

   RESULT

     struct CREATE4resok {
             change_info4    cinfo;
             bitmap4         attrset;        /* attributes set */
     };

     union CREATE4res switch (nfsstat4 status) {
      case NFS4_OK:
              CREATE4resok   resok4;
      default:
              void;
     };

   DESCRIPTION

     The CREATE operation creates a non-regular file object in a
     directory with a given name.
     The OPEN operation MUST be used to create a regular file.

     The objname specifies the name for the new object. The objtype
     determines the type of object to be created: directory, symlink,
     etc.

     If an object of the same name already exists in the directory, the
     server will return the error NFS4ERR_EXIST.

     For the directory where the new file object was created, the server
     returns change_info4 information in cinfo. With the atomic field
     of the change_info4 struct, the server will indicate if the before
     and after change attributes were obtained atomically with respect
     to the file object creation.

     If the objname has a length of 0 (zero), or if objname does not
     obey the UTF-8 definition, the error NFS4ERR_INVAL will be
     returned.

     The current filehandle is replaced by that of the new object.

     The createattrs specifies the initial set of attributes for the
     object. The set of attributes may include any writable attribute
     valid for the object type. When the operation is successful, the
     server will return to the client an attribute mask signifying which
     attributes were successfully set for the object.

     If createattrs includes neither the owner attribute nor an ACL with
     an ACE for the owner, and if the server's filesystem both supports
     and requires an owner attribute (or an owner ACE) then the server
     MUST derive the owner (or the owner ACE). This would typically be
     from the principal indicated in the RPC credentials of the call,
     but the server's operating environment or filesystem semantics may
     dictate other methods of derivation.
Similarly, if createattrs 6806 includes neither the group attribute nor a group ACE, and if the 6807 server's filesystem both supports and requires the notion of a 6808 group attribute (or group ACE), the server MUST derive the group 6809 attribute (or the corresponding owner ACE) for the file. This could 6810 be from the RPC call's credentials, such as the group principal if 6811 the credentials include it (such as with AUTH_SYS), from the group 6812 identifier associated with the principal in the credentials (for 6813 e.g., POSIX systems have a passwd database that has the group 6814 identifier for every user identifier), inherited from directory the 6815 object is created in, or whatever else the server's operating 6816 environment or filesystem semantics dictate. This applies to the 6817 OPEN operation too. 6819 Conversely, it is possible the client will specify in createattrs 6820 an owner attribute or group attribute or ACL that the principal 6821 indicated the RPC call's credentials does not have permissions to 6822 create files for. The error to be returned in this instance is 6823 NFS4ERR_PERM. This applies to the OPEN operation too. 6825 IMPLEMENTATION 6827 Draft Specification NFS version 4 Protocol August 2002 6829 If the client desires to set attribute values after the create, a 6830 SETATTR operation can be added to the COMPOUND request so that the 6831 appropriate attributes will be set. 6833 ERRORS 6835 NFS4ERR_ACCESS 6836 NFS4ERR_ATTRNOTSUPP 6837 NFS4ERR_BADCHAR 6838 NFS4ERR_BADHANDLE 6839 NFS4ERR_BADNAME 6840 NFS4ERR_BADOWNER 6841 NFS4ERR_BADTYPE 6842 NFS4ERR_BADXDR 6843 NFS4ERR_DQUOT 6844 NFS4ERR_EXIST 6845 NFS4ERR_FHEXPIRED 6846 NFS4ERR_INVAL 6847 NFS4ERR_IO 6848 NFS4ERR_MOVED 6849 NFS4ERR_NAMETOOLONG 6850 NFS4ERR_NOFILEHANDLE 6851 NFS4ERR_NOSPC 6852 NFS4ERR_NOTDIR 6853 NFS4ERR_NOTSUPP 6854 NFS4ERR_RESOURCE 6855 NFS4ERR_ROFS 6856 NFS4ERR_SERVERFAULT 6857 NFS4ERR_STALE 6859 Draft Specification NFS version 4 Protocol August 2002 6861 14.2.5. 
Operation 7: DELEGPURGE - Purge Delegations Awaiting Recovery

   SYNOPSIS

     clientid ->

   ARGUMENT

     struct DELEGPURGE4args {
             clientid4       clientid;
     };

   RESULT

     struct DELEGPURGE4res {
             nfsstat4        status;
     };

   DESCRIPTION

     Purges all of the delegations awaiting recovery for a given client.
     This is useful for clients which do not commit delegation
     information to stable storage to indicate that conflicting requests
     need not be delayed by the server awaiting recovery of delegation
     information.

     This operation should be used by clients that record delegation
     information on stable storage on the client. In this case,
     DELEGPURGE should be issued immediately after doing delegation
     recovery on all delegations known to the client. Doing so will
     notify the server that no additional delegations for the client
     will be recovered, allowing it to free resources and to avoid
     delaying other clients who make requests that conflict with the
     unrecovered delegations. The set of delegations known to the server
     and the client may be different. The reason for this is that a
     client may fail after making a request which resulted in delegation
     but before it received the results and committed them to the
     client's stable storage.

     The server MAY support DELEGPURGE, but if it does not, it MUST NOT
     support CLAIM_DELEGATE_PREV.

   ERRORS

     NFS4ERR_BADXDR
     NFS4ERR_NOTSUPP
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE_CLIENTID

14.2.6.
Operation 8: DELEGRETURN - Return Delegation

   SYNOPSIS

     (cfh), stateid ->

   ARGUMENT

     struct DELEGRETURN4args {
             /* CURRENT_FH: delegated file */
             stateid4        stateid;
     };

   RESULT

     struct DELEGRETURN4res {
             nfsstat4        status;
     };

   DESCRIPTION

     Returns the delegation represented by the current filehandle and
     stateid.

     Delegations may be returned when recalled or voluntarily (i.e.,
     before the server has recalled them). In either case the client
     must properly propagate state changed under the context of the
     delegation to the server before returning the delegation.

   ERRORS

     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_EXPIRED
     NFS4ERR_OLD_STATEID
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE_STATEID

14.2.7. Operation 9: GETATTR - Get Attributes

   SYNOPSIS

     (cfh), attrbits -> attrbits, attrvals

   ARGUMENT

     struct GETATTR4args {
             /* CURRENT_FH: directory or file */
             bitmap4         attr_request;
     };

   RESULT

     struct GETATTR4resok {
             fattr4          obj_attributes;
     };

     union GETATTR4res switch (nfsstat4 status) {
      case NFS4_OK:
             GETATTR4resok   resok4;
      default:
             void;
     };

   DESCRIPTION

     The GETATTR operation will obtain attributes for the filesystem
     object specified by the current filehandle. The client sets a bit
     in the bitmap argument for each attribute value that it would like
     the server to return. The server returns an attribute bitmap that
     indicates the attribute values it was able to return, followed by
     the attribute values ordered lowest attribute number first.

     The server must return a value for each attribute that the client
     requests if the attribute is supported by the server.
     If the server does not support an attribute or cannot approximate a
     useful value then it must not return the attribute value and must
     not set the attribute bit in the result bitmap. The server must
     return an error if it supports an attribute but cannot obtain its
     value. In that case no attribute values will be returned.

     All servers must support the mandatory attributes as specified in
     the section "File Attributes".

     On success, the current filehandle retains its value.

   IMPLEMENTATION

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.8. Operation 10: GETFH - Get Current Filehandle

   SYNOPSIS

     (cfh) -> filehandle

   ARGUMENT

     /* CURRENT_FH: */
     void;

   RESULT

     struct GETFH4resok {
             nfs_fh4         object;
     };

     union GETFH4res switch (nfsstat4 status) {
      case NFS4_OK:
             GETFH4resok     resok4;
      default:
             void;
     };

   DESCRIPTION

     This operation returns the current filehandle value.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     Operations that change the current filehandle like LOOKUP or CREATE
     do not automatically return the new filehandle as a result. For
     instance, if a client needs to look up a directory entry and obtain
     its filehandle then the following request is needed:

               PUTFH (directory filehandle)
               LOOKUP (entry name)
               GETFH

   ERRORS

     NFS4ERR_BADHANDLE
     NFS4ERR_FHEXPIRED
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.9. Operation 11: LINK - Create Link to a File

   SYNOPSIS

     (sfh), (cfh), newname -> (cfh), change_info

   ARGUMENT

     struct LINK4args {
             /* SAVED_FH: source object */
             /* CURRENT_FH: target directory */
             component4      newname;
     };

   RESULT

     struct LINK4resok {
             change_info4    cinfo;
     };

     union LINK4res switch (nfsstat4 status) {
      case NFS4_OK:
             LINK4resok      resok4;
      default:
             void;
     };

   DESCRIPTION

     The LINK operation creates an additional newname for the file
     represented by the saved filehandle, as set by the SAVEFH
     operation, in the directory represented by the current filehandle.
     The existing file and the target directory must reside within the
     same filesystem on the server. On success, the current filehandle
     will continue to be the target directory. If an object exists in
     the target directory with the same name as newname, the server must
     return NFS4ERR_EXIST.

     For the target directory, the server returns change_info4
     information in cinfo. With the atomic field of the change_info4
     struct, the server will indicate if the before and after change
     attributes were obtained atomically with respect to the link
     creation.

     If the newname has a length of 0 (zero), or if newname does not
     obey the UTF-8 definition, the error NFS4ERR_INVAL will be
     returned.

   IMPLEMENTATION

     Changes to any property of the "hard" linked files are reflected in
     all of the linked files.
     When a link is made to a file, the attributes for the file should
     have a value for numlinks that is one greater than the value before
     the LINK operation.

     The statement "file and the target directory must reside within the
     same filesystem on the server" means that the fsid fields in the
     attributes for the objects are the same. If they reside on
     different filesystems, the error, NFS4ERR_XDEV, is returned. On
     some servers, the filenames, "." and "..", are illegal as newname.

     In the case that newname is already linked to the file represented
     by the saved filehandle, the server will return NFS4ERR_EXIST.

     Note that symbolic links are created with the CREATE operation.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADCHAR
     NFS4ERR_BADHANDLE
     NFS4ERR_BADNAME
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DQUOT
     NFS4ERR_EXIST
     NFS4ERR_FHEXPIRED
     NFS4ERR_FILE_OPEN
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_ISDIR
     NFS4ERR_MLINK
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOSPC
     NFS4ERR_NOTDIR
     NFS4ERR_NOTSUPP
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_WRONGSEC
     NFS4ERR_XDEV

14.2.10.
Operation 12: LOCK - Create Lock

   SYNOPSIS

     (cfh) locktype, reclaim, offset, length, locker -> stateid

   ARGUMENT

     struct open_to_lock_owner4 {
             seqid4          open_seqid;
             stateid4        open_stateid;
             seqid4          lock_seqid;
             lock_owner4     lock_owner;
     };

     struct exist_lock_owner4 {
             stateid4        lock_stateid;
             seqid4          lock_seqid;
     };

     union locker4 switch (bool new_lock_owner) {
      case TRUE:
             open_to_lock_owner4     open_owner;
      case FALSE:
             exist_lock_owner4       lock_owner;
     };

     enum nfs_lock_type4 {
             READ_LT         = 1,
             WRITE_LT        = 2,
             READW_LT        = 3,    /* blocking read */
             WRITEW_LT       = 4     /* blocking write */
     };

     struct LOCK4args {
             /* CURRENT_FH: file */
             nfs_lock_type4  locktype;
             bool            reclaim;
             offset4         offset;
             length4         length;
             locker4         locker;
     };

   RESULT

     struct LOCK4denied {
             offset4         offset;
             length4         length;
             nfs_lock_type4  locktype;
             lock_owner4     owner;
     };

     struct LOCK4resok {
             stateid4        lock_stateid;
     };

     union LOCK4res switch (nfsstat4 status) {
      case NFS4_OK:
             LOCK4resok      resok4;
      case NFS4ERR_DENIED:
             LOCK4denied     denied;
      default:
             void;
     };

   DESCRIPTION

     The LOCK operation requests a record lock for the byte range
     specified by the offset and length parameters. The lock type is
     also specified to be one of the nfs_lock_type4s. If this is a
     reclaim request, the reclaim parameter will be TRUE.

     Bytes in a file may be locked even if those bytes are not currently
     allocated to the file. To lock the file from a specific offset
     through the end-of-file (no matter how long the file actually is)
     use a length field with all bits set to 1 (one).
     If the length is zero, or if a length that is not all bits set to
     one is specified and that length, when added to the offset, exceeds
     the maximum 64-bit unsigned integer value, the error NFS4ERR_INVAL
     will result.

     Some servers may only support locking for byte offsets that fit
     within 32 bits. If the client specifies a range that includes a
     byte beyond the last byte offset of the 32-bit range, but does not
     include the last byte offset of the 32-bit range and all of the
     byte offsets beyond it, up to the end of the valid 64-bit range,
     such a 32-bit server MUST return the error NFS4ERR_BAD_RANGE.

     In the case that the lock is denied, the owner, offset, and length
     of a conflicting lock are returned.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     If the server is unable to determine the exact offset and length of
     the conflicting lock, the same offset and length that were provided
     in the arguments should be returned in the denied results. The
     File Locking section contains a full description of this and the
     other file locking operations.

     LOCK operations are subject to permission checks and to checks
     against the access type of the associated file. However, the
     specific rights and modes required for various types of locks
     reflect the semantics of the server-exported filesystem, and are
     not specified by the protocol. For example, Windows 2000 allows a
     write lock of a file open for READ, while a POSIX-compliant system
     does not.
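
     The range rules given in the LOCK DESCRIPTION can be illustrated
     with a small, non-normative sketch. The function name
     check_lock_range and the server_is_32bit flag are hypothetical and
     are not part of the protocol; the sketch only restates the
     NFS4ERR_INVAL and NFS4ERR_BAD_RANGE conditions as read literally
     from the text above.

```python
NFS4_UINT64_MAX = 2**64 - 1   # a length with all bits set means "to end-of-file"
NFS4_UINT32_MAX = 2**32 - 1   # last byte offset a 32-bit-only server can lock


def check_lock_range(offset, length, server_is_32bit=False):
    """Return the status a server might give for a LOCK byte range.

    Hypothetical helper: the name and the boolean flag are assumptions
    made for illustration only.
    """
    to_eof = length == NFS4_UINT64_MAX

    # A zero length is always invalid.
    if length == 0:
        return "NFS4ERR_INVAL"

    # A finite length must not push offset + length past the 64-bit maximum.
    if not to_eof and offset + length > NFS4_UINT64_MAX:
        return "NFS4ERR_INVAL"

    # Last byte covered by the request.
    last = NFS4_UINT64_MAX if to_eof else offset + length - 1

    if server_is_32bit and last > NFS4_UINT32_MAX:
        # Allowed only when the range includes the last 32-bit offset and
        # every offset beyond it, up to the end of the 64-bit range.
        covers_tail = offset <= NFS4_UINT32_MAX and last == NFS4_UINT64_MAX
        if not covers_tail:
            return "NFS4ERR_BAD_RANGE"

    return "NFS4_OK"
```

     Under these assumptions, a 32-bit server accepts a request such as
     offset 100 with an all-ones length, and rejects offset 0 with
     length 2^32 + 1, since the latter reaches past the 32-bit range
     without extending to the end of the 64-bit range.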

     When the client makes a lock request that corresponds to a range
     that the lockowner has locked already (with the same or different
     lock type), or to a sub-region of such a range, or to a region
     which includes multiple locks already granted to that lockowner, in
     whole or in part, and the server does not support such locking
     operations (i.e., does not support POSIX locking semantics), the
     server will return the error NFS4ERR_LOCK_RANGE. In that case, the
     client may return an error, or it may emulate the required
     operations, using only LOCK for ranges that do not include any
     bytes already locked by that lock_owner and LOCKU of locks held by
     that lock_owner (specifying an exactly-matching range and type).
     Similarly, when the client makes a lock request that amounts to
     upgrading (changing from a read lock to a write lock) or
     downgrading (changing from write lock to a read lock) an existing
     record lock, and the server does not support such a lock, the
     server will return NFS4ERR_LOCK_NOTSUPP. Such operations may not
     perfectly reflect the required semantics in the face of conflicting
     lock requests from other clients.
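
     One way a client might emulate a lock request that overlaps ranges
     it already holds is to issue LOCK only for the bytes not yet
     covered. The following is a minimal sketch of that bookkeeping,
     assuming finite lengths, a single lock type, and a client-side list
     of held (offset, length) pairs; the function name uncovered_ranges
     is hypothetical. It does not address upgrades or downgrades.

```python
def uncovered_ranges(held, offset, length):
    """Return the (offset, length) pieces of a request not already locked.

    `held` is a list of (offset, length) pairs the lock_owner already
    holds.  Hypothetical client-side helper, for illustration only.
    """
    end = offset + length          # half-open interval [offset, end)
    cursor = offset                # first byte not yet accounted for
    gaps = []

    for h_off, h_len in sorted(held):
        h_end = h_off + h_len
        if h_end <= cursor:
            continue               # held lock lies entirely before the cursor
        if h_off >= end:
            break                  # held lock lies entirely after the request
        if h_off > cursor:
            gaps.append((cursor, h_off - cursor))
        cursor = max(cursor, h_end)
        if cursor >= end:
            break

    if cursor < end:
        gaps.append((cursor, end - cursor))
    return gaps
```

     Each returned pair can be sent as a separate LOCK request. As the
     text notes, another client may acquire a conflicting lock between
     these requests, so the emulation is not atomic.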

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_RANGE
     NFS4ERR_BAD_SEQID
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_DEADLOCK
     NFS4ERR_DELAY
     NFS4ERR_DENIED
     NFS4ERR_EXPIRED
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCK_NOTSUPP
     NFS4ERR_LOCK_RANGE
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NO_GRACE
     NFS4ERR_OLD_STATEID
     NFS4ERR_OPENMODE
     NFS4ERR_RECLAIM_BAD
     NFS4ERR_RECLAIM_CONFLICT
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_CLIENTID
     NFS4ERR_STALE_STATEID

14.2.11. Operation 13: LOCKT - Test For Lock

   SYNOPSIS

     (cfh) locktype, offset, length, owner -> {void, NFS4ERR_DENIED ->
     owner}

   ARGUMENT

     struct LOCKT4args {
             /* CURRENT_FH: file */
             nfs_lock_type4  locktype;
             offset4         offset;
             length4         length;
             lock_owner4     owner;
     };

   RESULT

     struct LOCK4denied {
             offset4         offset;
             length4         length;
             nfs_lock_type4  locktype;
             lock_owner4     owner;
     };

     union LOCKT4res switch (nfsstat4 status) {
      case NFS4ERR_DENIED:
             LOCK4denied     denied;
      case NFS4_OK:
             void;
      default:
             void;
     };

   DESCRIPTION

     The LOCKT operation tests the lock as specified in the arguments.
     If a conflicting lock exists, the owner, offset, length, and type
     of the conflicting lock are returned; if no lock is held, nothing
     other than NFS4_OK is returned. Lock types READ_LT and READW_LT
     are processed in the same way in that a conflicting lock test is
     done without regard to blocking or non-blocking. The same is true
     for WRITE_LT and WRITEW_LT.

     The ranges are specified as for LOCK.
     The NFS4ERR_INVAL and NFS4ERR_BAD_RANGE errors are returned under
     the same circumstances as for LOCK.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     If the server is unable to determine the exact offset and length of
     the conflicting lock, the same offset and length that were provided
     in the arguments should be returned in the denied results. The
     File Locking section contains further discussion of the file
     locking mechanisms.

     LOCKT uses a lock_owner4 rather than a stateid4, as is used in
     LOCK, to identify the owner. This is because the client does not
     have to open the file to test for the existence of a lock, so a
     stateid may not be available.

     The test for conflicting locks should exclude locks for the current
     lockowner. Note that since such locks are not examined, the
     possible existence of overlapping ranges may not affect the results
     of LOCKT. If the server does examine locks that match the
     lockowner for the purpose of range checking, NFS4ERR_LOCK_RANGE may
     be returned. In the event that it returns NFS4_OK, clients may do
     a LOCK and receive NFS4ERR_LOCK_RANGE on the LOCK request because
     of the flexibility provided to the server.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_RANGE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DENIED
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCK_RANGE
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_CLIENTID

14.2.12.
Operation 14: LOCKU - Unlock File

   SYNOPSIS

     (cfh) type, seqid, stateid, offset, length -> stateid

   ARGUMENT

     struct LOCKU4args {
             /* CURRENT_FH: file */
             nfs_lock_type4  locktype;
             seqid4          seqid;
             stateid4        stateid;
             offset4         offset;
             length4         length;
     };

   RESULT

     union LOCKU4res switch (nfsstat4 status) {
      case NFS4_OK:
             stateid4        stateid;
      default:
             void;
     };

   DESCRIPTION

     The LOCKU operation unlocks the record lock specified by the
     parameters. The client may set the locktype field to any value that
     is legal for the nfs_lock_type4 enumerated type, and the server
     MUST accept any legal value for locktype. Any legal value for
     locktype has no effect on the success or failure of the LOCKU
     operation.

     The ranges are specified as for LOCK. The NFS4ERR_INVAL and
     NFS4ERR_BAD_RANGE errors are returned under the same circumstances
     as for LOCK.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     If the area to be unlocked does not correspond exactly to a lock
     actually held by the lockowner, the server may return the error
     NFS4ERR_LOCK_RANGE. This includes the case in which the area is
     not locked, where the area is a sub-range of the area locked, where
     it overlaps the area locked without matching exactly, or where the
     area specified includes multiple locks held by the lockowner. In
     all of these cases, which are allowed by POSIX locking semantics, a
     client receiving this error should, if it desires support for such
     operations, simulate the operation using LOCKU on ranges
     corresponding to locks it actually holds, possibly followed by LOCK
     requests for the sub-ranges not being unlocked.
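
     The simulation described above can be sketched as a client-side
     planning step. This is a non-normative illustration: the function
     name plan_unlock, the tuple encoding of operations, and the
     assumption of finite lengths are all hypothetical. For every held
     lock that overlaps the area to be unlocked, the plan releases the
     exact held range with LOCKU and then re-acquires, with LOCK, any
     part of that range that was not meant to be unlocked.

```python
def plan_unlock(held, offset, length):
    """Plan LOCKU/LOCK requests that emulate unlocking [offset, offset+length).

    `held` lists the (offset, length) locks the lockowner actually holds.
    Returns a list of ("LOCKU" | "LOCK", offset, length) tuples.
    Hypothetical helper, for illustration only.
    """
    start, end = offset, offset + length
    ops = []

    for h_off, h_len in sorted(held):
        h_end = h_off + h_len
        if h_end <= start or h_off >= end:
            continue                       # no overlap with the unlock area

        # Release the lock using its exactly-matching range.
        ops.append(("LOCKU", h_off, h_len))

        # Re-lock the part before the unlock area, if any.
        if h_off < start:
            ops.append(("LOCK", h_off, start - h_off))

        # Re-lock the part after the unlock area, if any.
        if h_end > end:
            ops.append(("LOCK", end, h_end - end))

    return ops
```

     For example, unlocking bytes 40 through 59 of a single held lock
     covering bytes 0 through 99 yields one LOCKU for the whole lock
     followed by LOCK requests for bytes 0 through 39 and 60 through 99.
     The re-acquiring LOCK requests can be denied if another client
     obtains a conflicting lock in the interval, so a client using this
     approach needs to be prepared for that outcome.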

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_RANGE
     NFS4ERR_BAD_SEQID
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_EXPIRED
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCK_RANGE
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_OLD_STATEID
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_CLIENTID
     NFS4ERR_STALE_STATEID

14.2.13. Operation 15: LOOKUP - Lookup Filename

   SYNOPSIS

     (cfh), component -> (cfh)

   ARGUMENT

     struct LOOKUP4args {
             /* CURRENT_FH: directory */
             component4      objname;
     };

   RESULT

     struct LOOKUP4res {
             /* CURRENT_FH: object */
             nfsstat4        status;
     };

   DESCRIPTION

     This operation looks up or finds a filesystem object using the
     directory specified by the current filehandle. LOOKUP evaluates
     the component and if the object exists the current filehandle is
     replaced with the component's filehandle.

     If the component cannot be evaluated either because it does not
     exist or because the client does not have permission to evaluate
     the component, then an error will be returned and the current
     filehandle will be unchanged.

     If the component is a zero length string or if any component does
     not obey the UTF-8 definition, the error NFS4ERR_INVAL will be
     returned.

   IMPLEMENTATION

     If the client wants to achieve the effect of a multi-component
     lookup, it may construct a COMPOUND request such as (and obtain
     each filehandle):

               PUTFH (directory filehandle)
               LOOKUP "pub"
               GETFH
               LOOKUP "foo"
               GETFH
               LOOKUP "bar"
               GETFH

     NFS version 4 servers depart from the semantics of previous NFS
     versions in allowing LOOKUP requests to cross mountpoints on the
     server. The client can detect a mountpoint crossing by comparing
     the fsid attribute of the directory with the fsid attribute of the
     directory looked up. If the fsids are different then the new
     directory is a server mountpoint. UNIX clients that detect a
     mountpoint crossing will need to mount the server's filesystem.
     This needs to be done to maintain the file object identity checking
     mechanisms common to UNIX clients.

     Servers that limit NFS access to "shares" or "exported" filesystems
     should provide a pseudo-filesystem into which the exported
     filesystems can be integrated, so that clients can browse the
     server's name space. The client's view of a pseudo-filesystem will
     be limited to paths that lead to exported filesystems.

     Note: previous versions of the protocol assigned special semantics
     to the names "." and "..". NFS version 4 assigns no special
     semantics to these names. The LOOKUPP operator must be used to
     look up a parent directory.

     Note that this operation does not follow symbolic links. The
     client is responsible for all parsing of filenames including
     filenames that are modified by symbolic links encountered during
     the lookup process.

     If the current filehandle supplied is not a directory but a
     symbolic link, the error NFS4ERR_SYMLINK is returned. For all
     other non-directory file types, the error NFS4ERR_NOTDIR is
     returned.
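
     The multi-component COMPOUND shown earlier in this section can be
     generated mechanically from a client-side path. The sketch below is
     non-normative. The function name build_lookup_compound, the tuple
     encoding of operations, the use of "/" as a separator, and the
     client-side choice to map ".." to LOOKUPP and to skip "." are all
     assumptions made for illustration; the protocol itself only defines
     the individual operations.

```python
def build_lookup_compound(dir_fh, path, want_each_fh=True):
    """Build a list of operations that walks `path` from `dir_fh`.

    Hypothetical client-side helper.  Operations are encoded as tuples
    such as ("LOOKUP", "pub") or ("GETFH",).
    """
    ops = [("PUTFH", dir_fh)]

    # The client, not the server, parses the path into components.
    components = [c for c in path.split("/") if c and c != "."]

    for name in components:
        if name == "..":
            # ".." has no special meaning to LOOKUP; use LOOKUPP instead.
            ops.append(("LOOKUPP",))
        else:
            ops.append(("LOOKUP", name))
        if want_each_fh:
            ops.append(("GETFH",))

    # When intermediate filehandles are not wanted, fetch only the last one.
    if not want_each_fh:
        ops.append(("GETFH",))
    return ops
```

     Called with the path "pub/foo/bar", this produces the same sequence
     as the example request above. Symbolic links are not handled: if a
     LOOKUP in the sequence fails with NFS4ERR_SYMLINK, the client has
     to read the link and rebuild the remaining path itself.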

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADCHAR
     NFS4ERR_BADHANDLE
     NFS4ERR_BADNAME
     NFS4ERR_BADXDR
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NAMETOOLONG
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTDIR
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_SYMLINK
     NFS4ERR_WRONGSEC

14.2.14. Operation 16: LOOKUPP - Lookup Parent Directory

   SYNOPSIS

     (cfh) -> (cfh)

   ARGUMENT

     /* CURRENT_FH: object */
     void;

   RESULT

     struct LOOKUPP4res {
             /* CURRENT_FH: directory */
             nfsstat4        status;
     };

   DESCRIPTION

     The current filehandle is assumed to refer to a regular directory
     or a named attribute directory. LOOKUPP assigns the filehandle for
     its parent directory to be the current filehandle. If there is no
     parent directory an NFS4ERR_NOENT error must be returned.
     Therefore, NFS4ERR_NOENT will be returned by the server when the
     current filehandle is at the root or top of the server's file tree.

   IMPLEMENTATION

     As for LOOKUP, LOOKUPP will also cross mountpoints.

     If the current filehandle is not a directory or named attribute
     directory, the error NFS4ERR_NOTDIR is returned.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOENT
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTDIR
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.15.
Operation 17: NVERIFY - Verify Difference in Attributes

   SYNOPSIS

     (cfh), fattr -> -

   ARGUMENT

     struct NVERIFY4args {
             /* CURRENT_FH: object */
             fattr4          obj_attributes;
     };

   RESULT

     struct NVERIFY4res {
             nfsstat4        status;
     };

   DESCRIPTION

     This operation is used to prefix a sequence of operations to be
     performed if one or more attributes have changed on some filesystem
     object. If all the attributes match then the error NFS4ERR_SAME
     must be returned.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     This operation is useful as a cache validation operator. If the
     object to which the attributes belong has changed then the
     following operations may obtain new data associated with that
     object. For instance, to check if a file has been changed and
     obtain new data if it has:

               PUTFH (public)
               LOOKUP "foobar"
               NVERIFY attrbits attrs
               READ 0 32767

     In the case that a recommended attribute is specified in the
     NVERIFY operation and the server does not support that attribute
     for the filesystem object, the error NFS4ERR_NOTSUPP is returned to
     the client.

     When the attribute rdattr_error or any write-only attribute (e.g.,
     time_modify_set) is specified, the error NFS4ERR_INVAL is returned
     to the client. If both of these conditions apply, the server is
     free to return either error.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_ATTRNOTSUPP
     NFS4ERR_BADCHAR
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTSUPP
     NFS4ERR_RESOURCE
     NFS4ERR_SAME
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.16.
Operation 18: OPEN - Open a Regular File

   SYNOPSIS

     (cfh), seqid, share_access, share_deny, owner, openhow, claim ->
     (cfh), stateid, cinfo, rflags, open_confirm, attrset, delegation

   ARGUMENT

     struct OPEN4args {
             seqid4          seqid;
             uint32_t        share_access;
             uint32_t        share_deny;
             open_owner4     owner;
             openflag4       openhow;
             open_claim4     claim;
     };

     enum createmode4 {
             UNCHECKED4      = 0,
             GUARDED4        = 1,
             EXCLUSIVE4      = 2
     };

     union createhow4 switch (createmode4 mode) {
      case UNCHECKED4:
      case GUARDED4:
             fattr4          createattrs;
      case EXCLUSIVE4:
             verifier4       createverf;
     };

     enum opentype4 {
             OPEN4_NOCREATE  = 0,
             OPEN4_CREATE    = 1
     };

     union openflag4 switch (opentype4 opentype) {
      case OPEN4_CREATE:
             createhow4      how;
      default:
             void;
     };

     /* Next definitions used for OPEN delegation */
     enum limit_by4 {
             NFS_LIMIT_SIZE          = 1,
             NFS_LIMIT_BLOCKS        = 2
             /* others as needed */
     };

     struct nfs_modified_limit4 {
             uint32_t        num_blocks;
             uint32_t        bytes_per_block;
     };

     union nfs_space_limit4 switch (limit_by4 limitby) {
      /* limit specified as file size */
      case NFS_LIMIT_SIZE:
             uint64_t                filesize;
      /* limit specified by number of blocks */
      case NFS_LIMIT_BLOCKS:
             nfs_modified_limit4     mod_blocks;
     };

     enum open_delegation_type4 {
             OPEN_DELEGATE_NONE      = 0,
             OPEN_DELEGATE_READ      = 1,
             OPEN_DELEGATE_WRITE     = 2
     };

     enum open_claim_type4 {
             CLAIM_NULL              = 0,
             CLAIM_PREVIOUS          = 1,
             CLAIM_DELEGATE_CUR      = 2,
             CLAIM_DELEGATE_PREV     = 3
     };

     struct open_claim_delegate_cur4 {
             stateid4        delegate_stateid;
             component4      file;
     };

     union open_claim4 switch (open_claim_type4 claim) {
      /*
       * No special rights to file. Ordinary OPEN of the specified file.
       */
      case CLAIM_NULL:
             /* CURRENT_FH: directory */
             component4      file;

      /*
       * Right to the file established by an open previous to server
       * reboot. File identified by filehandle obtained at that time
       * rather than by name.
       */
      case CLAIM_PREVIOUS:
             /* CURRENT_FH: file being reclaimed */
             open_delegation_type4   delegate_type;

      /*
       * Right to file based on a delegation granted by the server.
       * File is specified by name.
       */
      case CLAIM_DELEGATE_CUR:
             /* CURRENT_FH: directory */
             open_claim_delegate_cur4        delegate_cur_info;

      /* Right to file based on a delegation granted to a previous boot
       * instance of the client. File is specified by name.
       */
      case CLAIM_DELEGATE_PREV:
             /* CURRENT_FH: directory */
             component4      file_delegate_prev;
     };

   RESULT

     struct open_read_delegation4 {
             stateid4        stateid;        /* Stateid for delegation*/
             bool            recall;         /* Pre-recalled flag for
                                                delegations obtained
                                                by reclaim
                                                (CLAIM_PREVIOUS) */
             nfsace4         permissions;    /* Defines users who don't
                                                need an ACCESS call to
                                                open for read */
     };

     struct open_write_delegation4 {
             stateid4        stateid;        /* Stateid for delegation*/
             bool            recall;         /* Pre-recalled flag for
                                                delegations obtained
                                                by reclaim
                                                (CLAIM_PREVIOUS) */
             nfs_space_limit4 space_limit;   /* Defines condition that
                                                the client must check to
                                                determine whether the
                                                file needs to be flushed
                                                to the server on close.
                                                */
             nfsace4         permissions;    /* Defines users who don't
                                                need an ACCESS call as
                                                part of a delegated
                                                open.
                                                */
     };

     union open_delegation4
     switch (open_delegation_type4 delegation_type) {
      case OPEN_DELEGATE_NONE:
             void;
      case OPEN_DELEGATE_READ:
             open_read_delegation4 read;
      case OPEN_DELEGATE_WRITE:
             open_write_delegation4 write;
     };

     const OPEN4_RESULT_CONFIRM              = 0x00000002;
     const OPEN4_RESULT_LOCKTYPE_POSIX       = 0x00000004;

     struct OPEN4resok {
             stateid4        stateid;        /* Stateid for open */
             change_info4    cinfo;          /* Directory Change Info */
             uint32_t        rflags;         /* Result flags */
             bitmap4         attrset;        /* attributes on create */
             open_delegation4 delegation;    /* Info on any open
                                                delegation */
     };

     union OPEN4res switch (nfsstat4 status) {
      case NFS4_OK:
             /* CURRENT_FH: opened file */
             OPEN4resok      resok4;
      default:
             void;
     };

   WARNING TO CLIENT IMPLEMENTORS

     OPEN resembles LOOKUP in that it generates a filehandle for the
     client to use. Unlike LOOKUP though, OPEN creates server state on
     the filehandle. In normal circumstances, the client can only
     release this state with a CLOSE operation. CLOSE uses the current
     filehandle to determine which file to close. Therefore the client
     MUST follow every OPEN operation with a GETFH operation in the same
     COMPOUND procedure. This will supply the client with the
     filehandle such that CLOSE can be used appropriately.

     Simply waiting for the lease on the file to expire is insufficient
     because the server may maintain the state indefinitely as long as
     another client does not attempt to make a conflicting access to the
     same file.

   DESCRIPTION

     The OPEN operation creates and/or opens a regular file in a
     directory with the provided name. If the file does not exist at
     the server and creation is desired, specification of the method of
     creation is provided by the openhow parameter.
     The client has the choice of three creation methods: UNCHECKED,
     GUARDED, or EXCLUSIVE.

     If the current filehandle is a named attribute directory, OPEN will
     then create or open a named attribute file. Note that exclusive
     create of a named attribute is not supported. If the createmode is
     EXCLUSIVE4 and the current filehandle is a named attribute
     directory, the server will return NFS4ERR_INVAL.

     UNCHECKED means that the file should be created if a file of that
     name does not exist and encountering an existing regular file of
     that name is not an error. For this type of create, createattrs
     specifies the initial set of attributes for the file. The set of
     attributes may include any writable attribute valid for regular
     files. When an UNCHECKED create encounters an existing file, the
     attributes specified by createattrs are not used, except that when
     a size of zero is specified, the existing file is truncated. If
     GUARDED is specified, the server checks for the presence of a
     duplicate object by name before performing the create. If a
     duplicate exists, an error of NFS4ERR_EXIST is returned as the
     status. If the object does not exist, the request is performed as
     described for UNCHECKED. For each of these cases (UNCHECKED and
     GUARDED) where the operation is successful, the server will return
     to the client an attribute mask signifying which attributes were
     successfully set for the object.

     EXCLUSIVE specifies that the server is to follow exclusive creation
     semantics, using the verifier to ensure exclusive creation of the
     target. The server should check for the presence of a duplicate
     object by name. If the object does not exist, the server creates
     the object and stores the verifier with the object.
If the object 8006 does exist and the stored verifier matches the client-provided 8007 verifier, the server uses the existing object as the newly created 8008 object. If the stored verifier does not match, then an error of 8009 NFS4ERR_EXIST is returned. No attributes may be provided in this 8010 case, since the server may use an attribute of the target object to 8011 store the verifier. If the server uses an attribute to store the 8012 exclusive create verifier, it will signify which attribute by 8013 setting the appropriate bit in the attribute mask that is returned 8014 in the results. 8016 For the target directory, the server returns change_info4 8017 information in cinfo. With the atomic field of the change_info4 8018 struct, the server will indicate if the before and after change 8019 attributes were obtained atomically with respect to the link 8020 creation. 8022 Upon successful creation, the current filehandle is replaced by 8023 that of the new object. 8025 The OPEN operation provides for Windows share reservation 8026 capability with the use of the access and deny fields of the OPEN 8027 arguments. The client specifies at OPEN the required access and 8028 deny modes. For clients that do not directly support SHAREs (i.e. 8029 UNIX), the expected deny value is DENY_NONE. In the case that 8030 there is an existing SHARE reservation that conflicts with the OPEN 8031 request, the server returns the error NFS4ERR_SHARE_DENIED. For a 8035 complete SHARE request, the client must provide values for the 8036 owner and seqid fields for the OPEN argument. For additional 8037 discussion of SHARE semantics see the section on 'Share 8038 Reservations'. 8040 In the case that the client is recovering state from a server 8041 failure, the claim field of the OPEN argument is used to signify 8042 that the request is meant to reclaim state previously held.
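The three creation methods described earlier in this DESCRIPTION (UNCHECKED, GUARDED, EXCLUSIVE) can be summarized as a small decision function. The following is only an illustrative sketch, not part of the protocol definition: the enum names and decide_create() are hypothetical, the verifier is modeled as a 64-bit integer for brevity, and the named attribute directory restriction and attribute handling are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration only; they are not protocol XDR. */
enum create_mode   { MODE_UNCHECKED, MODE_GUARDED, MODE_EXCLUSIVE };
enum create_action { ACT_CREATE_NEW, ACT_USE_EXISTING, ACT_ERR_EXIST };

/*
 * Decide what the server does with a create request.
 *   exists:      an object of that name is already present
 *   stored_verf: verifier saved with the existing object (EXCLUSIVE only)
 *   client_verf: verifier supplied in the request (EXCLUSIVE only)
 * Assumption: the verifier is modeled as a uint64_t.
 */
enum create_action
decide_create(enum create_mode mode, bool exists,
              uint64_t stored_verf, uint64_t client_verf)
{
    if (!exists)
        return ACT_CREATE_NEW;      /* every mode creates a missing file */

    switch (mode) {
    case MODE_UNCHECKED:
        return ACT_USE_EXISTING;    /* an existing file is not an error */
    case MODE_GUARDED:
        return ACT_ERR_EXIST;       /* duplicate name: NFS4ERR_EXIST */
    case MODE_EXCLUSIVE:
        /* A matching verifier means the existing object is used as the
         * newly created one; a mismatch is NFS4ERR_EXIST. */
        return stored_verf == client_verf ? ACT_USE_EXISTING
                                          : ACT_ERR_EXIST;
    }
    return ACT_ERR_EXIST;           /* not reached for valid modes */
}
```

In this sketch an EXCLUSIVE request that finds its own verifier already stored is handled like a successful create, which is the case the verifier exists to cover.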
8044 The "claim" field of the OPEN argument is used to specify the file 8045 to be opened and the state information which the client claims to 8046 possess. There are four basic claim types which cover the various 8047 situations for an OPEN. They are as follows: 8049 CLAIM_NULL 8050 For the client, this is a new OPEN 8051 request and there is no previous state 8052 associated with the file for the client. 8054 CLAIM_PREVIOUS 8055 The client is claiming basic OPEN state 8056 for a file that was held prior to a 8057 server reboot. Generally used when a 8058 server is returning persistent 8059 filehandles; the client may not have the 8060 file name to reclaim the OPEN. 8062 CLAIM_DELEGATE_CUR 8063 The client is claiming a delegation for 8064 OPEN as granted by the server. 8065 Generally this is done as part of 8066 recalling a delegation. 8068 CLAIM_DELEGATE_PREV 8069 The client is claiming a delegation 8070 granted to a previous client instance; 8071 used after the client reboots. The 8072 server MAY support CLAIM_DELEGATE_PREV. 8073 If it does support CLAIM_DELEGATE_PREV, 8074 SETCLIENTID_CONFIRM MUST NOT remove the 8075 client's delegation state, and the 8076 server MUST support the DELEGPURGE 8077 operation. 8079 For OPEN requests whose claim type is other than CLAIM_PREVIOUS 8080 (i.e. requests other than those devoted to reclaiming opens after a 8081 server reboot) that reach the server during its grace or lease 8082 expiration period, the server returns an error of NFS4ERR_GRACE. 8084 For any OPEN request, the server may return an open delegation, 8085 which allows further opens and closes to be handled locally on the 8086 client as described in the section "Open Delegation". Note that 8087 delegation is up to the server to decide. The client should never 8088 assume that delegation will or will not be granted in a particular 8089 instance. It should always be prepared for either case.
A partial 8091 Draft Specification NFS version 4 Protocol August 2002 8093 exception is the reclaim (CLAIM_PREVIOUS) case, in which a 8094 delegation type is claimed. In this case, delegation will always 8095 be granted, although the server may specify an immediate recall in 8096 the delegation structure. 8098 The rflags returned by a successful OPEN allow the server to return 8099 information governing how the open file is to be handled. 8100 OPEN4_RESULT_CONFIRM indicates that the client MUST execute an 8101 OPEN_CONFIRM operation before using the open file. 8102 OPEN4_RESULT_LOCKTYPE_POSIX indicates the server's file locking 8103 behavior is Posix like with respect to lock range coalescing. From 8104 this the client can choose to manage file locking state in a way to 8105 handle a mis-match of file locking management. 8107 If the component is of zero length, NFS4ERR_INVAL will be returned. 8108 The component is also subject to the normal UTF-8, character 8109 support, and name checks. See the section "UTF-8 Related Errors" 8110 for further discussion. 8112 When an OPEN is done and the specified lockowner already has the 8113 resulting filehandle open, the result is to "OR" together the new 8114 share and deny status together with the existing status. In this 8115 case, only a single CLOSE need be done, even though multiple OPEN's 8116 were completed. When such an OPEN is done, checking of share 8117 reservations for the new OPEN proceeds normally, with no exception 8118 for the existing OPEN held by the same lockowner. 8120 If the underlying filesystem at the server is only accessible in a 8121 read-only mode and the OPEN request has specified ACCESS_WRITE or 8122 ACCESS_BOTH, the server will return NFS4ERR_ROFS to indicate a 8123 read-only filesystem. 8125 As with the CREATE operation, the server MUST derive the owner, 8126 owner ACE, group, or group ACE if any of the four attributes are 8127 required and supported by the server's filesystem. 
For an OPEN 8128 with the EXCLUSIVE4 createmode, the server has no choice, since 8129 such OPEN calls do not include the createattrs field. Conversely, 8130 if createattrs is specified, and includes owner or group (or 8131 corresponding ACEs) that the principal in the RPC call's 8132 credentials does not have authorization to create files for, then 8133 the server may return NFS4ERR_PERM. 8135 In the case of an OPEN which specifies a size of zero (i.e. 8136 truncation) and the file has named attributes, the named attributes 8137 are left as is. They are not removed. 8139 IMPLEMENTATION 8141 The OPEN operation contains support for EXCLUSIVE create. The 8142 mechanism is similar to the support in NFS version 3 [RFC1813]. As 8143 in NFS version 3, this mechanism provides reliable exclusive 8147 creation. Exclusive create is invoked when the how parameter is 8148 EXCLUSIVE. In this case, the client provides a verifier that can 8149 reasonably be expected to be unique. A combination of a client 8150 identifier, perhaps the client network address, and a unique number 8151 generated by the client, perhaps the RPC transaction identifier, 8152 may be appropriate. 8154 If the object does not exist, the server creates the object and 8155 stores the verifier in stable storage. For filesystems that do not 8156 provide a mechanism for the storage of arbitrary file attributes, 8157 the server may use one or more elements of the object meta-data to 8158 store the verifier. The verifier must be stored in stable storage 8159 to prevent erroneous failure on retransmission of the request. It 8160 is assumed that an exclusive create is being performed because 8161 exclusive semantics are critical to the application. Because of the 8162 expected usage, exclusive CREATE does not rely solely on the 8163 normally volatile duplicate request cache for storage of the 8164 verifier.
The duplicate request cache in volatile storage does not 8165 survive a crash and may actually flush on a long network partition, 8166 opening failure windows. In the UNIX local filesystem environment, 8167 the expected storage location for the verifier on creation is the 8168 meta-data (time stamps) of the object. For this reason, an 8169 exclusive object create may not include initial attributes because 8170 the server would have nowhere to store the verifier. 8172 If the server can not support these exclusive create semantics, 8173 possibly because of the requirement to commit the verifier to 8174 stable storage, it should fail the OPEN request with the error, 8175 NFS4ERR_NOTSUPP. 8177 During an exclusive CREATE request, if the object already exists, 8178 the server reconstructs the object's verifier and compares it with 8179 the verifier in the request. If they match, the server treats the 8180 request as a success. The request is presumed to be a duplicate of 8181 an earlier, successful request for which the reply was lost and 8182 that the server duplicate request cache mechanism did not detect. 8183 If the verifiers do not match, the request is rejected with the 8184 status, NFS4ERR_EXIST. 8186 Once the client has performed a successful exclusive create, it 8187 must issue a SETATTR to set the correct object attributes. Until 8188 it does so, it should not rely upon any of the object attributes, 8189 since the server implementation may need to overload object meta- 8190 data to store the verifier. The subsequent SETATTR must not occur 8191 in the same COMPOUND request as the OPEN. This separation will 8192 guarantee that the exclusive create mechanism will continue to 8193 function properly in the face of retransmission of the request. 8195 Use of the GUARDED attribute does not provide exactly-once 8196 semantics. 
In particular, if a reply is lost and the server does 8197 not detect the retransmission of the request, the operation can 8198 fail with NFS4ERR_EXIST, even though the create was performed 8200 Draft Specification NFS version 4 Protocol August 2002 8202 successfully. The client would use this behavior in the case that 8203 the application has not requested an exclusive create but has asked 8204 to have the file truncated when the file is opened. In the case of 8205 the client timing out and retransmitting the create request, the 8206 client can use GUARDED to prevent against a sequence like: create, 8207 write, create (retransmitted) from occurring. 8209 For SHARE reservations, the client must specify a value for access 8210 that is one of READ, WRITE, or BOTH. For deny, the client must 8211 specify one of NONE, READ, WRITE, or BOTH. If the client fails to 8212 do this, the server must return NFS4ERR_INVAL. 8214 Based on the access value (READ, WRITE, or BOTH) the client should 8215 check that the requestor has the proper access rights to perform 8216 the specified operation. This would generally be the results of 8217 applying the ACL access rules to the file for the current 8218 requestor. However, just as with the ACCESS operation, the client 8219 should not attempt to second-guess the server's decisions, as 8220 access rights may change and may be subject to server 8221 administrative controls outside the ACL framework. If the 8222 requestor is not authorized to READ or WRITE (depending on the 8223 access value), the server must return NFS4ERR_ACCESS. Note that 8224 since the NFS version 4 protocol does not impose any requirement 8225 that READ's and WRITE's issued for an open file have the same 8226 credentials as the OPEN itself, the server still must do 8227 appropriate access checking on the READ's and WRITE's themselves. 8229 If the component provided to OPEN is a symbolic link, the error 8230 NFS4ERR_SYMLINK will be returned to the client. 
If the current 8231 filehandle is not a directory, the error NFS4ERR_NOTDIR will be 8232 returned. 8234 ERRORS 8236 NFS4ERR_ACCESS 8237 NFS4ERR_ATTRNOTSUPP 8238 NFS4ERR_BADCHAR 8239 NFS4ERR_BADHANDLE 8240 NFS4ERR_BADNAME 8241 NFS4ERR_BADOWNER 8242 NFS4ERR_BAD_SEQID 8243 NFS4ERR_BADXDR 8244 NFS4ERR_DELAY 8245 NFS4ERR_DQUOT 8246 NFS4ERR_EXIST 8247 NFS4ERR_EXPIRED 8248 NFS4ERR_FHEXPIRED 8249 NFS4ERR_GRACE 8250 NFS4ERR_IO 8251 NFS4ERR_ISDIR 8252 NFS4ERR_LEASE_MOVED 8254 Draft Specification NFS version 4 Protocol August 2002 8256 NFS4ERR_MOVED 8257 NFS4ERR_NAMETOOLONG 8258 NFS4ERR_NOENT 8259 NFS4ERR_NOFILEHANDLE 8260 NFS4ERR_NOSPC 8261 NFS4ERR_NOTDIR 8262 NFS4ERR_NOTSUPP 8263 NFS4ERR_NO_GRACE 8264 NFS4ERR_RECLAIM_BAD 8265 NFS4ERR_RECLAIM_CONFLICT 8266 NFS4ERR_RESOURCE 8267 NFS4ERR_ROFS 8268 NFS4ERR_SERVERFAULT 8269 NFS4ERR_SHARE_DENIED 8270 NFS4ERR_STALE_CLIENTID 8271 NFS4ERR_SYMLINK 8272 NFS4ERR_WRONGSEC 8274 Draft Specification NFS version 4 Protocol August 2002 8276 14.2.17. Operation 19: OPENATTR - Open Named Attribute Directory 8278 SYNOPSIS 8280 (cfh) createdir -> (cfh) 8282 ARGUMENT 8284 struct OPENATTR4args { 8285 /* CURRENT_FH: object */ 8286 bool createdir; 8287 }; 8289 RESULT 8291 struct OPENATTR4res { 8292 /* CURRENT_FH: named attr directory*/ 8293 nfsstat4 status; 8294 }; 8296 DESCRIPTION 8298 The OPENATTR operation is used to obtain the filehandle of the 8299 named attribute directory associated with the current filehandle. 8300 The result of the OPENATTR will be a filehandle to an object of 8301 type NF4ATTRDIR. From this filehandle, READDIR and LOOKUP 8302 operations can be used to obtain filehandles for the various named 8303 attributes associated with the original filesystem object. 8304 Filehandles returned within the named attribute directory will have 8305 a type of NF4NAMEDATTR. 8307 The createdir argument allows the client to signify if a named 8308 attribute directory should be created as a result of the OPENATTR 8309 operation. 
Some clients may use the OPENATTR operation with a 8310 value of FALSE for createdir to determine if any named attributes 8311 exist for the object. If none exist, then NFS4ERR_NOENT will be 8312 returned. If createdir has a value of TRUE and no named attribute 8313 directory exists, one is created. The creation of a named 8314 attribute directory assumes that the server has implemented named 8315 attribute support in this fashion and is not required to do so by 8316 this definition. 8318 IMPLEMENTATION 8320 If the server does not support named attributes for the current 8321 filehandle, an error of NFS4ERR_NOTSUPP will be returned to the 8322 client. 8326 ERRORS 8328 NFS4ERR_ACCESS 8329 NFS4ERR_BADHANDLE 8330 NFS4ERR_BADXDR 8331 NFS4ERR_DELAY 8332 NFS4ERR_FHEXPIRED 8333 NFS4ERR_INVAL 8334 NFS4ERR_IO 8335 NFS4ERR_MOVED 8336 NFS4ERR_NOENT 8337 NFS4ERR_NOFILEHANDLE 8338 NFS4ERR_NOSPC 8339 NFS4ERR_NOTSUPP 8340 NFS4ERR_RESOURCE 8341 NFS4ERR_ROFS 8342 NFS4ERR_SERVERFAULT 8343 NFS4ERR_STALE 8344 NFS4ERR_WRONGSEC 8348 14.2.18. Operation 20: OPEN_CONFIRM - Confirm Open 8350 SYNOPSIS 8352 (cfh), seqid, stateid -> stateid 8354 ARGUMENT 8356 struct OPEN_CONFIRM4args { 8357 /* CURRENT_FH: opened file */ 8358 seqid4 seqid; 8359 stateid4 stateid; 8360 }; 8362 RESULT 8364 struct OPEN_CONFIRM4resok { 8365 stateid4 stateid; 8366 }; 8368 union OPEN_CONFIRM4res switch (nfsstat4 status) { 8369 case NFS4_OK: 8370 OPEN_CONFIRM4resok resok4; 8371 default: 8372 void; 8373 }; 8375 DESCRIPTION 8377 This operation is used to confirm the sequence id usage for the 8378 first time that an open_owner is used by a client. The stateid 8379 returned from the OPEN operation is used as the argument for this 8380 operation along with the next sequence id for the open_owner.
The 8381 sequence id passed to the OPEN_CONFIRM must be 1 (one) greater than 8382 the seqid passed to the OPEN operation from which the open_confirm 8383 value was obtained. If the server receives an unexpected sequence 8384 id with respect to the original open, then the server assumes that 8385 the client will not confirm the original OPEN and all state 8386 associated with the original OPEN is released by the server. 8388 On success, the current filehandle retains its value. 8390 IMPLEMENTATION 8392 A given client might generate many open_owner4 data structures for 8393 a given clientid. The client will periodically either dispose of 8394 its open_owner4s or stop using them for indefinite periods of time. 8395 The latter situation is why the NFS version 4 protocol does not 8399 have an explicit operation to exit an open_owner4: such an 8400 operation is of no use in that situation. Instead, to avoid 8401 unbounded memory use, the server needs to implement a strategy for 8402 disposing of open_owner4s that have no current lock, open, or 8403 delegation state for any files and have not been used recently. 8404 The time period used to determine when to dispose of open_owner4s 8405 is an implementation choice. The time period should certainly be 8406 no less than the lease time plus any grace period the server wishes 8407 to implement beyond a lease time. The OPEN_CONFIRM operation 8408 allows the server to safely dispose of unused open_owner4 data 8409 structures. 8411 In the case that a client issues an OPEN operation and the server 8412 no longer has a record of the open_owner4, the server needs to ensure 8413 that this is a new OPEN and not a replay or retransmission. 8415 Servers must not require confirmation on OPEN's that grant 8416 delegations or are doing reclaim operations. See section "Use of 8417 Open Confirmation" for details.
The server can easily avoid this 8418 by noting whether it has disposed of one open_owner4 for the given 8419 clientid. If the server does not support delegation, it might 8420 simply maintain a single bit that notes whether any open_owner4 8421 (for any client) has been disposed of. 8423 The server must hold unconfirmed OPEN state until one of three 8424 events occur. First, the client sends an OPEN_CONFIRM request with 8425 the appropriate sequence id and stateid within the lease period. 8426 In this case, the OPEN state on the server goes to confirmed, and 8427 the open_owner4 on the server is fully established. 8429 Second, the client sends another OPEN request with a sequence id 8430 that is incorrect for the open_owner4 (out of sequence). In this 8431 case, the server assumes the second OPEN request is valid and the 8432 first one is a replay. The server cancels the OPEN state of the 8433 first OPEN request, establishes an unconfirmed OPEN state for the 8434 second OPEN request, and responds to the second OPEN request with 8435 an indication that an OPEN_CONFIRM is needed. The process then 8436 repeats itself. While there is a potential for a denial of service 8437 attack on the client, it is mitigated if the client and server 8438 require the use of a security flavor based on Kerberos V5, LIPKEY, 8439 or some other flavor that uses cryptography. 8441 What if the server is in the unconfirmed OPEN state for a given 8442 open_owner4, and it receives an operation on the open_owner4 that 8443 has a stateid but the operation is not OPEN, or it is OPEN_CONFIRM 8444 but with the wrong stateid? Then, even if the seqid is correct, 8445 the server returns NFS4ERR_BAD_STATEID, because the server assumes 8446 the operation is a replay: if the server has no established OPEN 8447 state, then there is no way, for example, a LOCK operation could be 8448 valid. 
8450 Third, neither of the two aforementioned events occur for the 8452 Draft Specification NFS version 4 Protocol August 2002 8454 open_owner4 within the lease period. In this case, the OPEN state 8455 is cancelled and disposal of the open_owner4 can occur. 8457 ERRORS 8459 NFS4ERR_BADHANDLE 8460 NFS4ERR_BAD_SEQID 8461 NFS4ERR_BADXDR 8462 NFS4ERR_EXPIRED 8463 NFS4ERR_FHEXPIRED 8464 NFS4ERR_GRACE 8465 NFS4ERR_INVAL 8466 NFS4ERR_ISDIR 8467 NFS4ERR_MOVED 8468 NFS4ERR_NOENT 8469 NFS4ERR_NOFILEHANDLE 8470 NFS4ERR_NOTSUPP 8471 NFS4ERR_RESOURCE 8472 NFS4ERR_SERVERFAULT 8473 NFS4ERR_STALE 8475 Draft Specification NFS version 4 Protocol August 2002 8477 14.2.19. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access 8479 SYNOPSIS 8481 (cfh), stateid, seqid, access, deny -> stateid 8483 ARGUMENT 8485 struct OPEN_DOWNGRADE4args { 8486 /* CURRENT_FH: opened file */ 8487 stateid4 stateid; 8488 seqid4 seqid; 8489 uint32_t share_access; 8490 uint32_t share_deny; 8491 }; 8493 RESULT 8495 struct OPEN_DOWNGRADE4resok { 8496 stateid4 stateid; 8497 }; 8499 union OPEN_DOWNGRADE4res switch(nfsstat4 status) { 8500 case NFS4_OK: 8501 OPEN_DOWNGRADE4resok resok4; 8502 default: 8503 void; 8504 }; 8506 DESCRIPTION 8508 This operation is used to adjust the access and deny bits for a 8509 given open. This is necessary when a given lockowner opens the 8510 same file multiple times with different access and deny flags. In 8511 this situation, a close of one of the open's may change the 8512 appropriate access and deny flags to remove bits associated with 8513 open's no longer in effect. 8515 The access and deny bits specified in this operation replace the 8516 current ones for the specified open file. The access and deny bits 8517 specified must be exactly equal to the union of the access and deny 8518 bits specified for some subset of the OPEN's in effect for current 8519 openowner on the current file. If that constraint is not 8520 respected, the error NFS4ERR_INVAL should be returned. 
Since 8521 access and deny bits are subsets of those already granted, it is 8522 not possible for this request to be denied because of conflicting 8523 share reservations. 8525 On success, the current filehandle retains its value. 8527 Draft Specification NFS version 4 Protocol August 2002 8529 ERRORS 8531 NFS4ERR_BADHANDLE 8532 NFS4ERR_BAD_SEQID 8533 NFS4ERR_BAD_STATEID 8534 NFS4ERR_BADXDR 8535 NFS4ERR_EXPIRED 8536 NFS4ERR_FHEXPIRED 8537 NFS4ERR_INVAL 8538 NFS4ERR_MOVED 8539 NFS4ERR_NOFILEHANDLE 8540 NFS4ERR_OLD_STATEID 8541 NFS4ERR_RESOURCE 8542 NFS4ERR_SERVERFAULT 8543 NFS4ERR_STALE 8544 NFS4ERR_STALE_STATEID 8546 Draft Specification NFS version 4 Protocol August 2002 8548 14.2.20. Operation 22: PUTFH - Set Current Filehandle 8550 SYNOPSIS 8552 filehandle -> (cfh) 8554 ARGUMENT 8556 struct PUTFH4args { 8557 nfs_fh4 object; 8558 }; 8560 RESULT 8562 struct PUTFH4res { 8563 /* CURRENT_FH: */ 8564 nfsstat4 status; 8565 }; 8567 DESCRIPTION 8569 Replaces the current filehandle with the filehandle provided as an 8570 argument. 8572 If the security mechanism used by the requestor does not meet the 8573 requirements of the filehandle provided to this operation, the 8574 server MUST return NFS4ERR_WRONGSEC. 8576 IMPLEMENTATION 8578 Commonly used as the first operator in an NFS request to set the 8579 context for following operations. 8581 ERRORS 8583 NFS4ERR_BADHANDLE 8584 NFS4ERR_BADXDR 8585 NFS4ERR_FHEXPIRED 8586 NFS4ERR_MOVED 8587 NFS4ERR_RESOURCE 8588 NFS4ERR_SERVERFAULT 8589 NFS4ERR_STALE 8590 NFS4ERR_WRONGSEC 8592 Draft Specification NFS version 4 Protocol August 2002 8594 14.2.21. Operation 23: PUTPUBFH - Set Public Filehandle 8596 SYNOPSIS 8598 - -> (cfh) 8600 ARGUMENT 8602 void; 8604 RESULT 8606 struct PUTPUBFH4res { 8607 /* CURRENT_FH: public fh */ 8608 nfsstat4 status; 8609 }; 8611 DESCRIPTION 8613 Replaces the current filehandle with the filehandle that represents 8614 the public filehandle of the server's name space. 
This filehandle 8615 may be different from the "root" filehandle which may be associated 8616 with some other directory on the server. 8618 The public filehandle represents the concepts embodied in 8619 [RFC2054], [RFC2055], [RFC2224]. The intent for NFS version 4 is 8620 that the public filehandle (represented by the PUTPUBFH operation) 8621 be used as a method of providing WebNFS server compatibility with 8622 NFS versions 2 and 3. 8624 The public filehandle and the root filehandle (represented by the 8625 PUTROOTFH operation) should be equivalent. If the public and root 8626 filehandles are not equivalent, then the public filehandle MUST be 8627 a descendant of the root filehandle. 8629 IMPLEMENTATION 8631 Used as the first operator in an NFS request to set the context for 8632 following operations. 8634 With the NFS version 2 and 3 public filehandle, the client is able 8635 to specify whether the path name provided in the LOOKUP should be 8636 evaluated as either an absolute path relative to the server's root 8637 or relative to the public filehandle. [RFC2224] contains further 8638 discussion of the functionality. With NFS version 4, that type of 8639 specification is not directly available in the LOOKUP operation. 8640 The reason for this is because the component separators needed to 8641 specify absolute vs. relative are not allowed in NFS version 4. 8643 Draft Specification NFS version 4 Protocol August 2002 8645 Therefore, the client is responsible for constructing its request 8646 such that the use of either PUTROOTFH or PUTPUBFH are used to 8647 signify absolute or relative evaluation of an NFS URL respectively. 8649 Note that there are warnings mentioned in [RFC2224] with respect to 8650 the use of absolute evaluation and the restrictions the server may 8651 place on that evaluation with respect to how much of its namespace 8652 has been made available. These same warnings apply to NFS version 8653 4. 
It is likely, therefore, that because of server implementation 8654 details, an NFS version 3 absolute public filehandle lookup may 8655 behave differently than an NFS version 4 absolute resolution. 8657 There is a form of security negotiation as described in [RFC2755] 8658 that uses the public filehandle as a method of employing SNEGO. This 8659 method is not available with NFS version 4 as filehandles are not 8660 overloaded with special meaning and therefore do not provide the 8661 same framework as NFS versions 2 and 3. Clients should therefore 8662 use the security negotiation mechanisms described in this RFC. 8664 ERRORS 8666 NFS4ERR_RESOURCE 8667 NFS4ERR_SERVERFAULT 8668 NFS4ERR_WRONGSEC 8672 14.2.22. Operation 24: PUTROOTFH - Set Root Filehandle 8674 SYNOPSIS 8676 - -> (cfh) 8678 ARGUMENT 8680 void; 8682 RESULT 8684 struct PUTROOTFH4res { 8685 /* CURRENT_FH: root fh */ 8686 nfsstat4 status; 8687 }; 8689 DESCRIPTION 8691 Replaces the current filehandle with the filehandle that represents 8692 the root of the server's name space. From this filehandle a LOOKUP 8693 operation can locate any other filehandle on the server. This 8694 filehandle may be different from the "public" filehandle which may 8695 be associated with some other directory on the server. 8697 IMPLEMENTATION 8699 Commonly used as the first operator in an NFS request to set the 8700 context for following operations. 8702 ERRORS 8704 NFS4ERR_RESOURCE 8705 NFS4ERR_SERVERFAULT 8706 NFS4ERR_WRONGSEC 8710 14.2.23.
Operation 25: READ - Read from File 8712 SYNOPSIS 8714 (cfh), stateid, offset, count -> eof, data 8716 ARGUMENT 8718 struct READ4args { 8719 /* CURRENT_FH: file */ 8720 stateid4 stateid; 8721 offset4 offset; 8722 count4 count; 8723 }; 8725 RESULT 8727 struct READ4resok { 8728 bool eof; 8729 opaque data<>; 8730 }; 8732 union READ4res switch (nfsstat4 status) { 8733 case NFS4_OK: 8734 READ4resok resok4; 8735 default: 8736 void; 8737 }; 8739 DESCRIPTION 8741 The READ operation reads data from the regular file identified by 8742 the current filehandle. 8744 The client provides an offset of where the READ is to start and a 8745 count of how many bytes are to be read. An offset of 0 (zero) 8746 means to read data starting at the beginning of the file. If 8747 offset is greater than or equal to the size of the file, the 8748 status, NFS4_OK, is returned with a data length set to 0 (zero) and 8749 eof is set to TRUE. The READ is subject to access permissions 8750 checking. 8752 If the client specifies a count value of 0 (zero), the READ 8753 succeeds and returns 0 (zero) bytes of data again subject to access 8754 permissions checking. The server may choose to return fewer bytes 8755 than specified by the client. The client needs to check for this 8756 condition and handle the condition appropriately. 8758 The stateid value for a READ request represents a value returned 8760 Draft Specification NFS version 4 Protocol August 2002 8762 from a previous record lock or share reservation request. The 8763 stateid is used by the server to verify that the associated share 8764 reservation and any record locks are still valid and to update 8765 lease timeouts for the client. 
8767 If the read ended at the end-of-file (formally, in a correctly 8768 formed READ request, if offset + count is equal to the size of the 8769 file), or the read request extends beyond the size of the file (if 8770 offset + count is greater than the size of the file), eof is 8771 returned as TRUE; otherwise it is FALSE. A successful READ of an 8772 empty file will always return eof as TRUE. 8774 If the current filehandle is not a regular file, an error will be 8775 returned to the client. In the case that the current filehandle 8776 represents a directory, NFS4ERR_ISDIR is returned; otherwise, 8777 NFS4ERR_INVAL is returned. 8779 For a READ with a stateid value of all bits 0, the server MAY allow 8780 the READ to be serviced subject to mandatory file locks or the 8781 current share deny modes for the file. For a READ with a stateid 8782 value of all bits 1, the server MAY allow READ operations to bypass 8783 locking checks at the server. 8785 On success, the current filehandle retains its value. 8787 IMPLEMENTATION 8789 It is possible for the server to return fewer than count bytes of 8790 data. If the server returns less than the count requested and eof 8791 is set to FALSE, the client should issue another READ to get the 8792 remaining data. A server may return less data than requested under 8793 several circumstances. The file may have been truncated by another 8794 client or perhaps on the server itself, changing the file size from 8795 what the requesting client believes to be the case. This would 8796 reduce the actual amount of data available to the client. It is 8797 possible that the server may back off the transfer size and reduce 8798 the read request return. Server resource exhaustion may also occur, 8799 necessitating a smaller read return.
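The eof rule and the short-read handling described above can be illustrated with a small sketch. It is not taken from the specification: struct read_reply, read_reply_for() and need_another_read() are hypothetical names, and the server side assumes the simplest case in which the server returns every byte available for the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reply shape; the real READ4resok carries opaque data. */
struct read_reply {
    uint32_t len;   /* number of bytes returned */
    bool     eof;   /* TRUE when the request reaches or passes end-of-file */
};

/*
 * Server side: compute the reply for (offset, count) against a file of
 * file_size bytes. Assumption: the server returns all bytes it can.
 */
struct read_reply
read_reply_for(uint64_t file_size, uint64_t offset, uint32_t count)
{
    struct read_reply r = { 0, true };

    if (offset >= file_size)
        return r;                           /* zero bytes, eof TRUE */

    uint64_t avail = file_size - offset;    /* > 0 here, no underflow */
    r.len = count < avail ? count : (uint32_t)avail;
    r.eof = count >= avail;                 /* offset + count >= file size */
    return r;
}

/* Client side: a short reply without eof means another READ is needed. */
bool
need_another_read(uint32_t requested, struct read_reply r)
{
    return r.len < requested && !r.eof;
}
```

Comparing count against the bytes remaining, rather than adding offset and count, avoids overflow of the 64-bit sum for offsets near the top of the range.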
8801 If mandatory file locking is on for the file, and if the region 8802 corresponding to the data to be read from the file is write locked by 8803 an owner not associated with the stateid, the server will return an 8804 NFS4ERR_LOCKED error. The client should try to get the appropriate 8805 read record lock via the LOCK operation before re-attempting the 8806 READ. When the READ completes, the client should release the record 8807 lock via LOCKU. 8809 ERRORS 8811 NFS4ERR_ACCESS 8815 NFS4ERR_BADHANDLE 8816 NFS4ERR_BAD_STATEID 8817 NFS4ERR_BADXDR 8818 NFS4ERR_DELAY 8819 NFS4ERR_EXPIRED 8820 NFS4ERR_FHEXPIRED 8821 NFS4ERR_GRACE 8822 NFS4ERR_INVAL 8823 NFS4ERR_IO 8824 NFS4ERR_ISDIR 8825 NFS4ERR_LEASE_MOVED 8826 NFS4ERR_LOCKED 8827 NFS4ERR_MOVED 8828 NFS4ERR_NOFILEHANDLE 8829 NFS4ERR_NXIO 8830 NFS4ERR_OLD_STATEID 8831 NFS4ERR_OPENMODE 8832 NFS4ERR_RESOURCE 8833 NFS4ERR_SERVERFAULT 8834 NFS4ERR_STALE 8835 NFS4ERR_STALE_STATEID 8839 14.2.24.
Operation 26: READDIR - Read Directory 8841 SYNOPSIS 8842 (cfh), cookie, cookieverf, dircount, maxcount, attr_request -> 8843 cookieverf { cookie, name, attrs } 8845 ARGUMENT 8847 struct READDIR4args { 8848 /* CURRENT_FH: directory */ 8849 nfs_cookie4 cookie; 8850 verifier4 cookieverf; 8851 count4 dircount; 8852 count4 maxcount; 8853 bitmap4 attr_request; 8854 }; 8856 RESULT 8858 struct entry4 { 8859 nfs_cookie4 cookie; 8860 component4 name; 8861 fattr4 attrs; 8862 entry4 *nextentry; 8863 }; 8865 struct dirlist4 { 8866 entry4 *entries; 8867 bool eof; 8868 }; 8870 struct READDIR4resok { 8871 verifier4 cookieverf; 8872 dirlist4 reply; 8873 }; 8875 union READDIR4res switch (nfsstat4 status) { 8876 case NFS4_OK: 8877 READDIR4resok resok4; 8878 default: 8879 void; 8880 }; 8882 DESCRIPTION 8884 The READDIR operation retrieves a variable number of entries from a 8885 filesystem directory and returns client requested attributes for 8886 each entry along with information to allow the client to request 8888 Draft Specification NFS version 4 Protocol August 2002 8890 additional directory entries in a subsequent READDIR. 8892 The arguments contain a cookie value that represents where the 8893 READDIR should start within the directory. A value of 0 (zero) for 8894 the cookie is used to start reading at the beginning of the 8895 directory. For subsequent READDIR requests, the client specifies a 8896 cookie value that is provided by the server on a previous READDIR 8897 request. 8899 The cookieverf value should be set to 0 (zero) when the cookie 8900 value is 0 (zero) (first directory read). On subsequent requests, 8901 it should be a cookieverf as returned by the server. The 8902 cookieverf must match that returned by the READDIR in which the 8903 cookie was acquired. If the server determines that the cookieverf 8904 is no longer valid for the directory, the error NFS4ERR_NOT_SAME 8905 must be returned. 
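The cookie and cookieverf rules above can be sketched as a server-side check. This is only an illustration under stated assumptions: enum rd_status and check_cookieverf() are hypothetical names, the verifier is modeled as a 64-bit integer rather than an opaque array, and the sketch assumes the server does not compare the verifier on the first read (cookie of zero).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status codes; not the protocol's numeric values. */
enum rd_status { RD_OK, RD_NOT_SAME };

/*
 * Server-side check of the (cookie, cookieverf) pair in a READDIR request.
 *   client_verf:  cookieverf sent by the client
 *   current_verf: verifier the server currently associates with the directory
 * Assumption: verifiers are modeled as uint64_t.
 */
enum rd_status
check_cookieverf(uint64_t cookie, uint64_t client_verf, uint64_t current_verf)
{
    if (cookie == 0)
        return RD_OK;           /* first directory read: nothing to compare */

    if (client_verf != current_verf)
        return RD_NOT_SAME;     /* verifier no longer valid: NFS4ERR_NOT_SAME */

    return RD_OK;
}
```

A client that receives the error in this sketch would typically restart the directory read from a cookie of zero, though that recovery policy is a client choice and not something the text above mandates.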
8907 The dircount portion of the argument is a hint of the maximum 8908 number of bytes of directory information that should be returned. 8909 This value represents the length of the names of the directory 8910 entries and the cookie values for these entries. This length 8911 represents the XDR encoding of the data (names and cookies) and not 8912 the length in the native format of the server. 8914 The maxcount value of the argument is the maximum number of bytes 8915 for the result. This maximum size represents all of the data being 8916 returned within the READDIR4resok structure and includes the XDR 8917 overhead. The server may return less data. If the server is 8918 unable to return a single directory entry within the maxcount 8919 limit, the error NFS4ERR_READDIR_NOSPC will be returned to the 8920 client. 8922 Finally, attr_request represents the list of attributes to be returned 8923 for each directory entry supplied by the server. 8925 On successful return, the server's response will provide a list of 8926 directory entries. Each of these entries contains the name of the 8927 directory entry, a cookie value for that entry, and the associated 8928 attributes as requested. The "eof" flag has a value of TRUE if 8929 there are no more entries in the directory. 8931 The cookie value is only meaningful to the server and is used as a 8932 "bookmark" for the directory entry. As mentioned, this cookie is 8933 used by the client for subsequent READDIR operations so that it may 8934 continue reading a directory. The cookie is similar in concept to 8935 a READ offset but should not be interpreted as such by the client. 8936 Ideally, the cookie value should not change if the directory is 8937 modified since the client may be caching these values. 8939 In some cases, the server may encounter an error while obtaining 8940 the attributes for a directory entry.
Rather than returning an 8941 error for the entire READDIR operation, the server can instead 8945 return the attribute 'fattr4_rdattr_error'. With this, the server 8946 is able to communicate the failure to the client and not fail the 8947 entire operation in the instance of what might be a transient 8948 failure. Obviously, the client must request the 8949 fattr4_rdattr_error attribute for this method to work properly. If 8950 the client does not request the attribute, the server has no choice 8951 but to return failure for the entire READDIR operation. 8953 For some filesystem environments, the directory entries "." and 8954 ".." have special meaning and in other environments, they may not. 8955 If the server supports these special entries within a directory, 8956 they should not be returned to the client as part of the READDIR 8957 response. To enable some client environments, the cookie values of 8958 0, 1, and 2 are to be considered reserved. Note that the UNIX 8959 client will use these values when combining the server's response 8960 and local representations to enable a fully formed UNIX directory 8961 presentation to the application. 8963 For READDIR arguments, cookie values of 1 and 2 should not be used 8964 and for READDIR results cookie values of 0, 1, and 2 should not 8965 be returned. 8967 On success, the current filehandle retains its value. 8969 IMPLEMENTATION 8971 The server's filesystem directory representations can differ 8972 greatly. A client's programming interfaces may also be bound to 8973 the local operating environment in a way that does not translate 8974 well into the NFS protocol. Therefore the dircount and 8975 maxcount fields are provided to allow the client to 8976 give guidelines to the server. If the client is aggressive 8977 about attribute collection during a READDIR, the server has an idea 8978 of how to limit the encoded response.
The dircount field provides 8979 a hint on the number of entries based solely on the names of the 8980 directory entries. Since it is a hint, it may be possible that a 8981 dircount value is zero. In this case, the server is free to ignore 8982 the dircount value and return directory information based on the 8983 specified maxcount value. 8985 The cookieverf may be used by the server to help manage cookie 8986 values that may become stale. It should be a rare occurrence that 8987 a server is unable to continue properly reading a directory with 8988 the provided cookie/cookieverf pair. The server should make every 8989 effort to avoid this condition since the application at the client 8990 may not be able to properly handle this type of failure. 8992 The use of the cookieverf will also protect the client from using 8993 READDIR cookie values that may be stale. For example, if the file 8994 system has been migrated, the server may or may not be able to use 8995 the same cookie values to service READDIR as the previous server 8997 Draft Specification NFS version 4 Protocol August 2002 8999 used. With the client providing the cookieverf, the server is able 9000 to provide the appropriate response to the client. This prevents 9001 the case where the server may accept a cookie value but the 9002 underlying directory has changed and the response is invalid from 9003 the client's context of its previous READDIR. 9005 Since some servers will not be returning "." and ".." entries as 9006 has been done with previous versions of the NFS protocol, the 9007 client that requires these entries be present in READDIR responses 9008 must fabricate them. 
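Because the server omits "." and "..", a client that needs them has to synthesize them locally. The sketch below shows one way a client might do this, assuming it assigns the reserved cookie values 1 and 2 to the fabricated entries. That assignment, the function name present_directory, and the (cookie, name) tuple layout are illustrative assumptions rather than requirements of the protocol.

```python
DOT_COOKIE = 1      # assumed local cookie for "."
DOTDOT_COOKIE = 2   # assumed local cookie for ".."


def present_directory(server_entries):
    """Prepend client-fabricated "." and ".." to server-provided entries.

    server_entries is a list of (cookie, name) pairs as returned by READDIR;
    the server is expected never to use cookies 0, 1 or 2 in results.
    """
    for cookie, name in server_entries:
        if cookie in (0, 1, 2):
            raise ValueError(f"server returned reserved cookie {cookie}")
    local = [(DOT_COOKIE, "."), (DOTDOT_COOKIE, "..")]
    # Drop any "." or ".." the server sent anyway, so they are not duplicated.
    remote = [(c, n) for c, n in server_entries if n not in (".", "..")]
    return local + remote
```

Keeping the fabricated entries on cookies the server never returns lets the client tell local entries from server entries when an application resumes a directory read.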
9010 ERRORS 9012 NFS4ERR_ACCESS 9013 NFS4ERR_BADHANDLE 9014 NFS4ERR_BAD_COOKIE 9015 NFS4ERR_BADXDR 9016 NFS4ERR_DELAY 9017 NFS4ERR_FHEXPIRED 9018 NFS4ERR_INVAL 9019 NFS4ERR_IO 9020 NFS4ERR_MOVED 9021 NFS4ERR_NOFILEHANDLE 9022 NFS4ERR_NOTDIR 9023 NFS4ERR_NOTSUPP 9024 NFS4ERR_NOT_SAME 9025 NFS4ERR_READDIR_NOSPC 9026 NFS4ERR_RESOURCE 9027 NFS4ERR_SERVERFAULT 9028 NFS4ERR_STALE 9029 NFS4ERR_TOOSMALL 9031 Draft Specification NFS version 4 Protocol August 2002 9033 14.2.25. Operation 27: READLINK - Read Symbolic Link 9035 SYNOPSIS 9037 (cfh) -> linktext 9039 ARGUMENT 9041 /* CURRENT_FH: symlink */ 9042 void; 9044 RESULT 9046 struct READLINK4resok { 9047 linktext4 link; 9048 }; 9050 union READLINK4res switch (nfsstat4 status) { 9051 case NFS4_OK: 9052 READLINK4resok resok4; 9053 default: 9054 void; 9055 }; 9057 DESCRIPTION 9059 READLINK reads the data associated with a symbolic link. The data 9060 is a UTF-8 string that is opaque to the server. That is, whether 9061 created by an NFS client or created locally on the server, the data 9062 in a symbolic link is not interpreted when created, but is simply 9063 stored. 9065 On success, the current filehandle retains its value. 9067 IMPLEMENTATION 9069 A symbolic link is nominally a pointer to another file. The data 9070 is not necessarily interpreted by the server, just stored in the 9071 file. It is possible for a client implementation to store a path 9072 name that is not meaningful to the server operating system in a 9073 symbolic link. A READLINK operation returns the data to the client 9074 for interpretation. If different implementations want to share 9075 access to symbolic links, then they must agree on the 9076 interpretation of the data in the symbolic link. 9078 The READLINK operation is only allowed on objects of type NF4LNK. 9079 The server should return the error, NFS4ERR_INVAL, if the object is 9080 not of type, NF4LNK. 
9084 ERRORS 9086 NFS4ERR_ACCESS 9087 NFS4ERR_BADHANDLE 9088 NFS4ERR_DELAY 9089 NFS4ERR_FHEXPIRED 9090 NFS4ERR_INVAL 9091 NFS4ERR_IO 9092 NFS4ERR_MOVED 9093 NFS4ERR_NOFILEHANDLE 9094 NFS4ERR_NOTSUPP 9095 NFS4ERR_RESOURCE 9096 NFS4ERR_SERVERFAULT 9097 NFS4ERR_STALE 9101 14.2.26. Operation 28: REMOVE - Remove Filesystem Object 9103 SYNOPSIS 9105 (cfh), filename -> change_info 9107 ARGUMENT 9109 struct REMOVE4args { 9110 /* CURRENT_FH: directory */ 9111 component4 target; 9112 }; 9114 RESULT 9116 struct REMOVE4resok { 9117 change_info4 cinfo; 9118 }; 9120 union REMOVE4res switch (nfsstat4 status) { 9121 case NFS4_OK: 9122 REMOVE4resok resok4; 9123 default: 9124 void; 9125 }; 9127 DESCRIPTION 9129 The REMOVE operation removes (deletes) a directory entry named by 9130 filename from the directory corresponding to the current 9131 filehandle. If the entry in the directory was the last reference 9132 to the corresponding filesystem object, the object may be 9133 destroyed. 9135 For the directory where the filename was removed, the server 9136 returns change_info4 information in cinfo. With the atomic field 9137 of the change_info4 struct, the server will indicate if the before 9138 and after change attributes were obtained atomically with respect 9139 to the removal. 9141 If the target has a length of 0 (zero), or if target does not obey 9142 the UTF-8 definition, the error NFS4ERR_INVAL will be returned. 9144 On success, the current filehandle retains its value. 9146 IMPLEMENTATION 9148 NFS versions 2 and 3 required a different operator RMDIR for 9152 directory removal and REMOVE for non-directory removal. This 9153 allowed clients to skip checking the file type when being passed a 9154 non-directory delete system call (e.g.
unlink() in POSIX) to remove 9155 a directory, as well as the converse (e.g. a rmdir() on a non- 9156 directory) because they knew the server would check the file type. 9157 NFS version 4 REMOVE can be used to delete any directory entry 9158 independent of its file type. The implementor of an NFS version 4 9159 client's entry points from the unlink() and rmdir() system calls 9160 should first check the file type against the types the system call 9161 is allowed to remove before issuing a REMOVE. Alternatively, the 9162 implementor can produce a COMPOUND call that includes a 9163 LOOKUP/VERIFY sequence to verify the file type before a REMOVE 9164 operation in the same COMPOUND call. 9166 The concept of last reference is server specific. However, if the 9167 numlinks field in the previous attributes of the object had the 9168 value 1, the client should not rely on referring to the object via 9169 a filehandle. Likewise, the client should not rely on the resources 9170 (disk space, directory entry, and so on) formerly associated with 9171 the object becoming immediately available. Thus, if a client needs 9172 to be able to continue to access a file after using REMOVE to 9173 remove it, the client should take steps to make sure that the file 9174 will still be accessible. The usual mechanism used is to RENAME 9175 the file from its old name to a new hidden name. 9177 If the server finds that the file is still open when the REMOVE 9178 arrives: 9180 o The server SHOULD NOT delete the file's directory entry if the file 9181 was opened with OPEN4_SHARE_DENY_WRITE or OPEN4_SHARE_DENY_BOTH. 9183 o If the file was not opened with OPEN4_SHARE_DENY_WRITE or 9184 OPEN4_SHARE_DENY_BOTH, the server SHOULD delete the file's 9185 directory entry. However, until the last CLOSE of the file, the server MAY 9186 continue to allow access to the file via its filehandle.
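The two cases above for a file that is still open can be sketched as a small decision function. This is only an illustration: the function name remove_action and its string results are hypothetical, and the sketch does not decide which error a server reports when it keeps the entry. The OPEN4_SHARE_DENY_* values are those of the protocol's share_deny constants.

```python
OPEN4_SHARE_DENY_NONE = 0
OPEN4_SHARE_DENY_READ = 1
OPEN4_SHARE_DENY_WRITE = 2
OPEN4_SHARE_DENY_BOTH = 3


def remove_action(open_deny_modes):
    """Decide what a server does with the directory entry on REMOVE.

    open_deny_modes holds the share_deny value of each current OPEN of the
    file; an empty collection means the file is not open.
    """
    blocking = {OPEN4_SHARE_DENY_WRITE, OPEN4_SHARE_DENY_BOTH}
    if any(mode in blocking for mode in open_deny_modes):
        return "keep_entry"    # SHOULD NOT delete the directory entry
    return "delete_entry"      # SHOULD delete; filehandle MAY stay usable
```

For example, a file opened once with OPEN4_SHARE_DENY_NONE and once with OPEN4_SHARE_DENY_WRITE keeps its directory entry, because one of the opens denies write access.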
9188 ERRORS 9190 NFS4ERR_ACCESS 9191 NFS4ERR_BADCHAR 9192 NFS4ERR_BADHANDLE 9193 NFS4ERR_BADNAME 9194 NFS4ERR_BADXDR 9195 NFS4ERR_DELAY 9196 NFS4ERR_FHEXPIRED 9197 NFS4ERR_FILE_OPEN 9198 NFS4ERR_INVAL 9199 NFS4ERR_IO 9200 NFS4ERR_MOVED 9201 NFS4ERR_NAMETOOLONG 9202 NFS4ERR_NOENT 9204 Draft Specification NFS version 4 Protocol August 2002 9206 NFS4ERR_NOFILEHANDLE 9207 NFS4ERR_NOTDIR 9208 NFS4ERR_NOTEMPTY 9209 NFS4ERR_NOTSUPP 9210 NFS4ERR_RESOURCE 9211 NFS4ERR_ROFS 9212 NFS4ERR_SERVERFAULT 9213 NFS4ERR_STALE 9215 Draft Specification NFS version 4 Protocol August 2002 9217 14.2.27. Operation 29: RENAME - Rename Directory Entry 9219 SYNOPSIS 9221 (sfh), oldname, (cfh), newname -> source_change_info, 9222 target_change_info 9224 ARGUMENT 9226 struct RENAME4args { 9227 /* SAVED_FH: source directory */ 9228 component4 oldname; 9229 /* CURRENT_FH: target directory */ 9230 component4 newname; 9231 }; 9233 RESULT 9235 struct RENAME4resok { 9236 change_info4 source_cinfo; 9237 change_info4 target_cinfo; 9238 }; 9240 union RENAME4res switch (nfsstat4 status) { 9241 case NFS4_OK: 9242 RENAME4resok resok4; 9243 default: 9244 void; 9245 }; 9247 DESCRIPTION 9249 The RENAME operation renames the object identified by oldname in 9250 the source directory corresponding to the saved filehandle, as set 9251 by the SAVEFH operation, to newname in the target directory 9252 corresponding to the current filehandle. The operation is required 9253 to be atomic to the client. Source and target directories must 9254 reside on the same filesystem on the server. On success, the 9255 current filehandle will continue to be the target directory. 9257 If the target directory already contains an entry with the name, 9258 newname, the source object must be compatible with the target: 9259 either both are non-directories or both are directories and the 9260 target must be empty. 
If compatible, the existing target is 9261 removed before the rename occurs (See the IMPLEMENTATION subsection 9262 of the section "Operation 28: REMOVE - Remove Filesystem Object" 9263 for client and server actions whenever a target is removed). If 9264 they are not compatible or if the target is a directory but not 9265 empty, the server will return the error, NFS4ERR_EXIST. 9267 Draft Specification NFS version 4 Protocol August 2002 9269 If oldname and newname both refer to the same file (they might be 9270 hard links of each other), then RENAME should perform no action and 9271 return success. 9273 For both directories involved in the RENAME, the server returns 9274 change_info4 information. With the atomic field of the 9275 change_info4 struct, the server will indicate if the before and 9276 after change attributes were obtained atomically with respect to 9277 the rename. 9279 If the oldname refers to a named attribute and the saved and 9280 current filehandles refer to different filesystem objects, the 9281 server will return NFS4ERR_XDEV just as if the saved and current 9282 filehandles represented directories on different filesystems. 9284 If the oldname or newname has a length of 0 (zero), or if oldname 9285 or newname does not obey the UTF-8 definition, the error 9286 NFS4ERR_INVAL will be returned. 9288 IMPLEMENTATION 9290 The RENAME operation must be atomic to the client. The statement 9291 "source and target directories must reside on the same filesystem 9292 on the server" means that the fsid fields in the attributes for the 9293 directories are the same. If they reside on different filesystems, 9294 the error, NFS4ERR_XDEV, is returned. 9296 Based on the value of the fh_expire_type attribute for the object, 9297 the filehandle may or may not expire on a RENAME. However, server 9298 implementors are strongly encouraged to attempt to keep filehandles 9299 from expiring in this fashion. 9301 On some servers, the file names "." and ".." 
are illegal as either 9302 oldname or newname, and will result in the error NFS4ERR_BADNAME. 9303 In addition, on many servers the case of oldname or newname being 9304 an alias for the source directory will be checked for. Such 9305 servers will return the error NFS4ERR_INVAL in these cases. 9307 If either of the source or target filehandles are not directories, 9308 the server will return NFS4ERR_NOTDIR. 9310 ERRORS 9312 NFS4ERR_ACCESS 9313 NFS4ERR_BADCHAR 9314 NFS4ERR_BADHANDLE 9315 NFS4ERR_BADNAME 9316 NFS4ERR_BADXDR 9317 NFS4ERR_DELAY 9319 Draft Specification NFS version 4 Protocol August 2002 9321 NFS4ERR_DQUOT 9322 NFS4ERR_EXIST 9323 NFS4ERR_FHEXPIRED 9324 NFS4ERR_FILE_OPEN 9325 NFS4ERR_INVAL 9326 NFS4ERR_IO 9327 NFS4ERR_MOVED 9328 NFS4ERR_NAMETOOLONG 9329 NFS4ERR_NOENT 9330 NFS4ERR_NOFILEHANDLE 9331 NFS4ERR_NOSPC 9332 NFS4ERR_NOTDIR 9333 NFS4ERR_NOTEMPTY 9334 NFS4ERR_NOTSUPP 9335 NFS4ERR_RESOURCE 9336 NFS4ERR_ROFS 9337 NFS4ERR_SERVERFAULT 9338 NFS4ERR_STALE 9339 NFS4ERR_WRONGSEC 9341 Draft Specification NFS version 4 Protocol August 2002 9343 14.2.28. Operation 30: RENEW - Renew a Lease 9345 SYNOPSIS 9347 clientid -> () 9349 ARGUMENT 9351 struct RENEW4args { 9352 clientid4 clientid; 9353 }; 9355 RESULT 9357 struct RENEW4res { 9358 nfsstat4 status; 9359 }; 9361 DESCRIPTION 9363 The RENEW operation is used by the client to renew leases which it 9364 currently holds at a server. In processing the RENEW request, the 9365 server renews all leases associated with the client. The 9366 associated leases are determined by the clientid provided via the 9367 SETCLIENTID operation. 9369 IMPLEMENTATION 9371 ERRORS 9373 NFS4ERR_BADXDR 9374 NFS4ERR_EXPIRED 9375 NFS4ERR_INVAL 9376 NFS4ERR_LEASE_MOVED 9377 NFS4ERR_RESOURCE 9378 NFS4ERR_SERVERFAULT 9379 NFS4ERR_STALE_CLIENTID 9381 Draft Specification NFS version 4 Protocol August 2002 9383 14.2.29. 
Operation 31: RESTOREFH - Restore Saved Filehandle 9385 SYNOPSIS 9387 (sfh) -> (cfh) 9389 ARGUMENT 9391 /* SAVED_FH: */ 9392 void; 9394 RESULT 9396 struct RESTOREFH4res { 9397 /* CURRENT_FH: value of saved fh */ 9398 nfsstat4 status; 9399 }; 9401 DESCRIPTION 9403 Set the current filehandle to the value in the saved filehandle. 9404 If there is no saved filehandle then return an error 9405 NFS4ERR_NOFILEHANDLE. 9407 IMPLEMENTATION 9409 Operations like OPEN and LOOKUP use the current filehandle to 9410 represent a directory and replace it with a new filehandle. 9411 Assuming the previous filehandle was saved with a SAVEFH operator, 9412 the previous filehandle can be restored as the current filehandle. 9413 This is commonly used to obtain post-operation attributes for the 9414 directory, e.g. 9416 PUTFH (directory filehandle) 9417 SAVEFH 9418 GETATTR attrbits (pre-op dir attrs) 9419 CREATE optbits "foo" attrs 9420 GETATTR attrbits (file attributes) 9421 RESTOREFH 9422 GETATTR attrbits (post-op dir attrs) 9424 ERRORS 9426 NFS4ERR_BADHANDLE 9427 NFS4ERR_FHEXPIRED 9428 NFS4ERR_MOVED 9430 Draft Specification NFS version 4 Protocol August 2002 9432 NFS4ERR_NOFILEHANDLE 9433 NFS4ERR_RESOURCE 9434 NFS4ERR_RESTOREFH 9435 NFS4ERR_SERVERFAULT 9436 NFS4ERR_STALE 9437 NFS4ERR_WRONGSEC 9439 Draft Specification NFS version 4 Protocol August 2002 9441 14.2.30. Operation 32: SAVEFH - Save Current Filehandle 9443 SYNOPSIS 9445 (cfh) -> (sfh) 9447 ARGUMENT 9449 /* CURRENT_FH: */ 9450 void; 9452 RESULT 9454 struct SAVEFH4res { 9455 /* SAVED_FH: value of current fh */ 9456 nfsstat4 status; 9457 }; 9459 DESCRIPTION 9461 Save the current filehandle. If a previous filehandle was saved 9462 then it is no longer accessible. The saved filehandle can be 9463 restored as the current filehandle with the RESTOREFH operator. 9465 On success, the current filehandle retains its value. 
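The interplay of SAVEFH and RESTOREFH can be modelled with two slots of per-request state. The sketch below is a minimal illustration, not a server implementation: the CompoundState class and its method names are hypothetical, filehandles are represented by plain strings, and returning NFS4ERR_NOFILEHANDLE from SAVEFH when no current filehandle is set is an assumption based on the ERRORS list.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_NOFILEHANDLE = "NFS4ERR_NOFILEHANDLE"


class CompoundState:
    """Hypothetical per-COMPOUND filehandle state kept by a server."""

    def __init__(self):
        self.current_fh = None
        self.saved_fh = None

    def putfh(self, fh):
        self.current_fh = fh
        return NFS4_OK

    def savefh(self):
        if self.current_fh is None:
            return NFS4ERR_NOFILEHANDLE  # assumed error for a missing cfh
        self.saved_fh = self.current_fh  # any earlier saved value is lost
        return NFS4_OK

    def restorefh(self):
        if self.saved_fh is None:
            return NFS4ERR_NOFILEHANDLE
        self.current_fh = self.saved_fh
        return NFS4_OK
```

A sequence such as putfh("dir"), savefh(), putfh("file"), restorefh() leaves "dir" as the current filehandle, which mirrors the post-operation GETATTR pattern shown for RESTOREFH.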
9467 IMPLEMENTATION 9469 ERRORS 9471 NFS4ERR_BADHANDLE 9472 NFS4ERR_FHEXPIRED 9473 NFS4ERR_MOVED 9474 NFS4ERR_NOFILEHANDLE 9475 NFS4ERR_RESOURCE 9476 NFS4ERR_SERVERFAULT 9477 NFS4ERR_STALE 9479 Draft Specification NFS version 4 Protocol August 2002 9481 14.2.31. Operation 33: SECINFO - Obtain Available Security 9483 SYNOPSIS 9485 (cfh), name -> { secinfo } 9487 ARGUMENT 9489 struct SECINFO4args { 9490 /* CURRENT_FH: directory */ 9491 component4 name; 9492 }; 9494 RESULT 9496 enum rpc_gss_svc_t {/* From RFC 2203 */ 9497 RPC_GSS_SVC_NONE = 1, 9498 RPC_GSS_SVC_INTEGRITY = 2, 9499 RPC_GSS_SVC_PRIVACY = 3 9500 }; 9502 struct rpcsec_gss_info { 9503 sec_oid4 oid; 9504 qop4 qop; 9505 rpc_gss_svc_t service; 9506 }; 9508 union secinfo4 switch (uint32_t flavor) { 9509 case RPCSEC_GSS: 9510 rpcsec_gss_info flavor_info; 9511 default: 9512 void; 9513 }; 9515 typedef secinfo4 SECINFO4resok<>; 9517 union SECINFO4res switch (nfsstat4 status) { 9518 case NFS4_OK: 9519 SECINFO4resok resok4; 9520 default: 9521 void; 9522 }; 9524 DESCRIPTION 9526 The SECINFO operation is used by the client to obtain a list of 9527 valid RPC authentication flavors for a specific directory 9528 filehandle, file name pair. SECINFO should apply the same access 9530 Draft Specification NFS version 4 Protocol August 2002 9532 methodology used for LOOKUP when evaluating the name. Therefore, 9533 if the requestor does not have the appropriate access to LOOKUP the 9534 name then SECINFO must behave the same way and return 9535 NFS4ERR_ACCESS. 9537 The result will contain an array which represents the security 9538 mechanisms available, with an order corresponding to server's 9539 preferences, the most preferred being first in the array. The 9540 client is free to pick whatever security mechanism it both desires 9541 and supports, or to pick in the server's preference order the first 9542 one it supports. The array entries are represented by the secinfo4 9543 structure. 
The field 'flavor' will contain a value of AUTH_NONE, 9544 AUTH_SYS (as defined in [RFC1831]), or RPCSEC_GSS (as defined in 9545 [RFC2203]). 9547 For the flavors AUTH_NONE and AUTH_SYS, no additional security 9548 information is returned. For a return value of RPCSEC_GSS, a 9549 security triple is returned that contains the mechanism object id 9550 (as defined in [RFC2743]), the quality of protection (as defined in 9551 [RFC2743]) and the service type (as defined in [RFC2203]). It is 9552 possible for SECINFO to return multiple entries with flavor equal 9553 to RPCSEC_GSS with different security triple values. 9555 On success, the current filehandle retains its value. 9557 If the name has a length of 0 (zero), or if name does not obey the 9558 UTF-8 definition, the error NFS4ERR_INVAL will be returned. 9560 IMPLEMENTATION 9562 The SECINFO operation is expected to be used by the NFS client when 9563 the error value of NFS4ERR_WRONGSEC is returned from another NFS 9564 operation. This signifies to the client that the server's security 9565 policy is different from what the client is currently using. At 9566 this point, the client is expected to obtain a list of possible 9567 security flavors and choose what best suits its policies. 9569 As mentioned, the server's security policies will determine when a 9570 client request receives NFS4ERR_WRONGSEC. The operations which may 9571 receive this error are: LINK, LOOKUP, OPEN, PUTFH, PUTPUBFH, 9572 PUTROOTFH, RESTOREFH, RENAME, and indirectly READDIR. LINK and 9573 RENAME will only receive this error if the security used for the 9574 operation is inappropriate for the saved filehandle. With the exception 9575 of READDIR, these operations 9576 represent the point at which the client can instantiate a 9577 filehandle into the "current filehandle" at the server.
The 9578 filehandle is either provided by the client (PUTFH, PUTPUBFH, 9579 PUTROOTFH) or generated as a result of a name to filehandle 9580 translation (LOOKUP and OPEN). RESTOREFH is different because the 9581 filehandle is a result of a previous SAVEFH. Even though the 9582 filehandle, for RESTOREFH, might have previously passed the 9586 server's inspection for a security match, the server will check it 9587 again on RESTOREFH to ensure that the security policy has not 9588 changed. 9590 If the client wants to resolve an error return of NFS4ERR_WRONGSEC, 9591 the following will occur: 9593 o For LOOKUP and OPEN, the client will use SECINFO with the same 9594 current filehandle and name as provided in the original LOOKUP 9595 or OPEN to enumerate the available security triples. 9597 o For LINK, PUTFH, RENAME, and RESTOREFH, the client will use 9598 SECINFO and provide the parent directory filehandle and object 9599 name which corresponds to the filehandle originally provided by 9600 the PUTFH or RESTOREFH or, for LINK and RENAME, by the SAVEFH. 9602 o For PUTROOTFH and PUTPUBFH, the client will be unable to use 9603 the SECINFO operation since SECINFO requires a current 9604 filehandle and none exists for these two operations. Therefore, 9605 the client must iterate through the security triples available 9606 at the client and reattempt the PUTROOTFH or PUTPUBFH 9607 operation. In the unfortunate event none of the MANDATORY 9608 security triples are supported by the client and server, the 9609 client SHOULD try using others that support integrity. Failing 9610 that, the client can try using AUTH_NONE, but because such 9611 forms lack integrity checks, this puts the client at risk. 9612 Nonetheless, the server SHOULD allow the client to use whatever 9613 security form the client requests and the server supports, 9614 since the risks of doing so are on the client.
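Once a SECINFO reply is in hand, one of the two selection strategies mentioned in the DESCRIPTION is to walk the server's preference order and take the first entry the client supports. The sketch below illustrates that strategy only. The function name choose_security and the (flavor, triple) tuple representation are hypothetical; the flavor numbers are the standard RPC authentication flavor values.

```python
AUTH_NONE = 0
AUTH_SYS = 1
RPCSEC_GSS = 6


def choose_security(server_list, client_supported):
    """Return the first server-preferred entry the client also supports.

    server_list: SECINFO result in server preference order. Each item is
    (flavor, triple), where triple is (oid, qop, service) for RPCSEC_GSS and
    None for AUTH_NONE and AUTH_SYS.
    client_supported: set of (flavor, triple) items the client can use.
    """
    for entry in server_list:
        if entry in client_supported:
            return entry
    return None  # no common security; the caller must give up or probe
```

A client with its own policy could instead scan client_supported in its preferred order and test membership in server_list; the text permits either approach.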
9616 The READDIR operation will not directly return the NFS4ERR_WRONGSEC 9617 error. However, if the READDIR request included a request for 9618 attributes, it is possible that the READDIR request's security 9619 triple does not match that of a directory entry. If this is the 9620 case and the client has requested the rdattr_error attribute, the 9621 server will return the NFS4ERR_WRONGSEC error in rdattr_error for 9622 the entry. 9624 See the section "Security Considerations" for a discussion on the 9625 recommendations for security flavor used by SECINFO. 9627 ERRORS 9629 NFS4ERR_ACCESS 9630 NFS4ERR_BADCHAR 9631 NFS4ERR_BADHANDLE 9632 NFS4ERR_BADNAME 9633 NFS4ERR_BADXDR 9634 NFS4ERR_FHEXPIRED 9635 NFS4ERR_INVAL 9636 NFS4ERR_MOVED 9638 Draft Specification NFS version 4 Protocol August 2002 9640 NFS4ERR_NAMETOOLONG 9641 NFS4ERR_NOENT 9642 NFS4ERR_NOFILEHANDLE 9643 NFS4ERR_NOTDIR 9644 NFS4ERR_RESOURCE 9645 NFS4ERR_SERVERFAULT 9646 NFS4ERR_STALE 9648 Draft Specification NFS version 4 Protocol August 2002 9650 14.2.32. Operation 34: SETATTR - Set Attributes 9652 SYNOPSIS 9654 (cfh), stateid, attrmask, attr_vals -> attrsset 9656 ARGUMENT 9658 struct SETATTR4args { 9659 /* CURRENT_FH: target object */ 9660 stateid4 stateid; 9661 fattr4 obj_attributes; 9662 }; 9664 RESULT 9666 struct SETATTR4res { 9667 nfsstat4 status; 9668 bitmap4 attrsset; 9669 }; 9671 DESCRIPTION 9673 The SETATTR operation changes one or more of the attributes of a 9674 filesystem object. The new attributes are specified with a bitmap 9675 and the attributes that follow the bitmap in bit order. 9677 The stateid argument for SETATTR is used to provide file locking 9678 context that is necessary for SETATTR requests that set the size 9679 attribute. Since setting the size attribute modifies the file's 9680 data, it has the same locking requirements as a corresponding 9681 WRITE. Any SETATTR that sets the size attribute is incompatible 9682 with a share reservation that specifies DENY_WRITE. 
The area 9683 between the old end-of-file and the new end-of-file is considered 9684 to be modified just as would have been the case had the area in 9685 question been specified as the target of WRITE, for the purpose of 9686 checking conflicts with record locks, for those cases in which a 9687 server is implementing mandatory record locking behavior. A valid 9688 stateid should always be specified. When the file size attribute 9689 is not set, the special stateid consisting of all bits zero should 9690 be passed. 9692 On either success or failure of the operation, the server will 9693 return the attrsset bitmask to represent what (if any) attributes 9694 were successfully set. The attrsset in the response is a subset of 9695 the bitmap4 that is part of the obj_attributes in the argument. 9697 On success, the current filehandle retains its value. 9699 Draft Specification NFS version 4 Protocol August 2002 9701 IMPLEMENTATION 9703 If the request specifies the owner attribute to be set, the server 9704 should allow the operation to succeed if the current owner of the 9705 object matches the value specified in the request. Some servers 9706 may be implemented in a way as to prohibit the setting of the owner 9707 attribute unless the requestor has privilege to do so. If the 9708 server is lenient in this one case of matching owner values, the 9709 client implementation may be simplified in cases of creation of an 9710 object followed by a SETATTR. 9712 The file size attribute is used to request changes to the size of a 9713 file. A value of 0 (zero) causes the file to be truncated, a value 9714 less than the current size of the file causes data from new size to 9715 the end of the file to be discarded, and a size greater than the 9716 current size of the file causes logically zeroed data bytes to be 9717 added to the end of the file. Servers are free to implement this 9718 using holes or actual zero data bytes. 
Clients should not make any 9719 assumptions regarding a server's implementation of this feature, 9720 beyond that the bytes returned will be zeroed. Servers must 9721 support extending the file size via SETATTR. 9723 SETATTR is not guaranteed atomic. A failed SETATTR may partially 9724 change a file's attributes. 9726 Changing the size of a file with SETATTR indirectly changes the 9727 time_modify. A client must account for this as size changes can 9728 result in data deletion. 9730 The attributes time_access_set and time_modify_set are write-only 9731 attributes constructed as a switched union so the client can direct 9732 the server in setting the time values. If the switched union 9733 specifies SET_TO_CLIENT_TIME4, the client has provided an nfstime4 9734 to be used for the operation. If the switch union does not specify 9735 SET_TO_CLIENT_TIME4, the server is to use its current time for the 9736 SETATTR operation. 9738 If server and client times differ, programs that compare client 9739 time to file times can break. A time maintenance protocol should be 9740 used to limit client/server time skew. 9742 Use of a COMPOUND containing a VERIFY operation specifying only the 9743 change attribute, immediately followed by a SETATTR, provides a 9744 means whereby a client may specify a request that emulates the 9745 functionality of the SETATTR guard mechanism of NFS version 3. 9746 Since the function of the guard mechanism is to avoid changes to 9747 the file attributes based on stale information, delays between 9748 checking of the guard condition and the setting of the attributes 9749 have the potential to compromise this function, as would the 9750 corresponding delay in the NFS version 4 emulation. Therefore, NFS 9751 version 4 servers should take care to avoid such delays, to the 9752 degree possible, when executing such a request. 
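The VERIFY plus SETATTR emulation of the NFS version 3 guard can be pictured with a toy model of COMPOUND evaluation. The sketch below is illustrative only: the FileObject class, the guarded_setattr function, and the use of an integer counter for the change attribute are assumptions made for the example. It relies on VERIFY failing with NFS4ERR_NOT_SAME when the attribute differs, and on COMPOUND stopping at the first failed operation.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_NOT_SAME = "NFS4ERR_NOT_SAME"


class FileObject:
    """Hypothetical server-side object with a change attribute."""

    def __init__(self, mode):
        self.attrs = {"mode": mode}
        self.change = 1


def guarded_setattr(obj, expected_change, new_attrs):
    """Emulate COMPOUND { VERIFY change; SETATTR new_attrs } on one object.

    Returns the list of per-operation (op, status) results; evaluation stops
    at the first operation that fails, as COMPOUND does.
    """
    results = []
    # VERIFY: succeeds only if the change attribute still has the expected value.
    if obj.change != expected_change:
        results.append(("VERIFY", NFS4ERR_NOT_SAME))
        return results
    results.append(("VERIFY", NFS4_OK))
    # SETATTR: apply the attributes and bump the change attribute.
    obj.attrs.update(new_attrs)
    obj.change += 1
    results.append(("SETATTR", NFS4_OK))
    return results
```

In this model the check and the update run back to back with nothing in between, which is the property a real server is asked to approximate by avoiding delays between the two operations.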
9756 If the server does not support an attribute as requested by the 9757 client, the server should return NFS4ERR_ATTRNOTSUPP. 9759 A mask of the attributes actually set is returned by SETATTR in all 9760 cases. That mask must not include attribute bits not requested to 9761 be set by the client, and must be equal to the mask of attributes 9762 requested to be set only if the SETATTR completes without error. 9764 ERRORS 9766 NFS4ERR_ACCESS 9767 NFS4ERR_ATTRNOTSUPP 9768 NFS4ERR_BADCHAR 9769 NFS4ERR_BADHANDLE 9770 NFS4ERR_BADOWNER 9771 NFS4ERR_BAD_STATEID 9772 NFS4ERR_BADXDR 9773 NFS4ERR_DELAY 9774 NFS4ERR_DQUOT 9775 NFS4ERR_EXPIRED 9776 NFS4ERR_FBIG 9777 NFS4ERR_FHEXPIRED 9778 NFS4ERR_GRACE 9779 NFS4ERR_INVAL 9780 NFS4ERR_IO 9781 NFS4ERR_ISDIR 9782 NFS4ERR_LOCKED 9783 NFS4ERR_MOVED 9784 NFS4ERR_NOFILEHANDLE 9785 NFS4ERR_NOSPC 9786 NFS4ERR_NOTSUPP 9787 NFS4ERR_OLD_STATEID 9788 NFS4ERR_OPENMODE 9789 NFS4ERR_PERM 9790 NFS4ERR_RESOURCE 9791 NFS4ERR_ROFS 9792 NFS4ERR_SERVERFAULT 9793 NFS4ERR_STALE 9794 NFS4ERR_STALE_STATEID 9798 14.2.33.
Operation 35: SETCLIENTID - Negotiate Clientid 9800 SYNOPSIS 9802 client, callback, callback_ident -> clientid, setclientid_confirm 9804 ARGUMENT 9806 struct SETCLIENTID4args { 9807 nfs_client_id4 client; 9808 cb_client4 callback; 9809 uint32_t callback_ident; 9810 }; 9812 RESULT 9814 struct SETCLIENTID4resok { 9815 clientid4 clientid; 9816 verifier4 setclientid_confirm; 9817 }; 9819 union SETCLIENTID4res switch (nfsstat4 status) { 9820 case NFS4_OK: 9821 SETCLIENTID4resok resok4; 9822 case NFS4ERR_CLID_INUSE: 9823 clientaddr4 client_using; 9824 default: 9825 void; 9826 }; 9828 DESCRIPTION 9830 The client uses SETCLIENTID operation to notify the server of its 9831 intention to use a particular client identifier, callback, and 9832 callback_ident for subsequent requests that entail creating lock, 9833 share reservation, and delegation state on the server. Upon 9834 successful completion the server will return a short hand clientid 9835 which, if confirmed via a separate step, will be used in subsequent 9836 file locking and file open requests. Confirmation of the clientid 9837 must be done via the SETCLIENTID_CONFIRM operation to return the 9838 clientid and setclientid_confirm values, as verifiers, to the 9839 server. The reason why two verifiers are necessary is that it is 9840 possible to use SETCLIENTID and SETCLIENTID_CONFIRM to modify the 9841 callback and callback_ident information but not the short hand 9842 clientid. In that event, the setclientid_confirm value is 9843 effectively the only verifier. 9845 The callback information provided in this operation will be used if 9846 the client is provided an open delegation at a future point. 9848 Draft Specification NFS version 4 Protocol August 2002 9850 Therefore, the client must correctly reflect the program and port 9851 numbers for the callback program at the time SETCLIENTID is used. 9853 The callback_ident value is used by the server on the callback. 
     The client can leverage the callback_ident to eliminate the need
     for more than one callback RPC program number while still being
     able to determine which server is initiating the callback.

   IMPLEMENTATION

     To understand how to implement SETCLIENTID, make the following
     notations. Let:

     x be the value of the client.id subfield of the SETCLIENTID4args
       structure.

     v be the value of the client.verifier subfield of the
       SETCLIENTID4args structure.

     c be the value of the clientid field returned in the
       SETCLIENTID4resok structure.

     k represent the value combination of the callback and
       callback_ident fields of the SETCLIENTID4args structure.

     s be the setclientid_confirm value returned in the
       SETCLIENTID4resok structure.

     { x, v, c, k, s }
       be a quintuple for a client record. A client record is
       confirmed if there has been a SETCLIENTID_CONFIRM operation to
       confirm it. Otherwise it is unconfirmed. An unconfirmed
       record is established by a SETCLIENTID call.

     Since SETCLIENTID is a non-idempotent operation, let us assume that
     the server is implementing the duplicate request cache (DRC).

     When the server gets a SETCLIENTID { v, x, k } request, it
     processes it in the following manner.

     o It first looks up the request in the DRC. If there is a hit, it
       returns the result cached in the DRC. The server does NOT remove
       client state (locks, shares, delegations) nor does it modify any
       recorded callback and callback_ident information for client
       { x }.

       For any DRC miss, the server takes the client id string x, and
       searches for client records for x that the server may have
       recorded from previous SETCLIENTID calls.
       For any confirmed record with the same id string x, if the
       recorded principal does not match that of the SETCLIENTID call,
       then the server returns an NFS4ERR_CLID_INUSE error.

       For brevity of discussion, the remaining description of the
       processing assumes that there was a DRC miss, and that where the
       server has previously recorded a confirmed record for client x,
       the aforementioned principal check has successfully passed.

     o The server checks if it has recorded a confirmed record for
       { v, x, c, l, s }, where l may or may not equal k. If so, and
       since the id verifier v of the request matches that which is
       confirmed and recorded, the server treats this as a probable
       callback information update and records an unconfirmed
       { v, x, c, k, t } and leaves the confirmed { v, x, c, l, s } in
       place, such that t != s. It does not matter if k equals l or not.
       Any pre-existing unconfirmed { v, x, c, *, * } is removed.

       The server returns { c, t }. It is indeed returning the old
       clientid4 value c, because the client apparently only wants to
       update the callback value from l to k. It's possible this request
       is one from a Byzantine router that has stale callback
       information, but this is not a problem. The callback information
       update is only confirmed if followed up by a SETCLIENTID_CONFIRM
       { c, t }.

       The server awaits confirmation of k via SETCLIENTID_CONFIRM
       { c, t }.

       The server does NOT remove client (lock/share/delegation) state
       for x.

     o The server has previously recorded a confirmed { u, x, c, l, s }
       record such that v != u, l may or may not equal k, and has not
       recorded any unconfirmed { *, x, *, *, * } record for x. The
       server records an unconfirmed { v, x, d, k, t } (d != c, t != s).

       The server returns { d, t }.
       The server awaits confirmation of { d, k } via
       SETCLIENTID_CONFIRM { d, t }.

       The server does NOT remove client (lock/share/delegation) state
       for x.

     o The server has previously recorded a confirmed { u, x, c, l, s }
       record such that v != u, l may or may not equal k, and recorded
       an unconfirmed { w, x, d, m, t } record such that c != d, t != s,
       m may or may not equal k, m may or may not equal l, and k may or
       may not equal l. Whether w == v or w != v makes no difference.
       The server simply removes the unconfirmed { w, x, d, m, t }
       record and replaces it with an unconfirmed { v, x, e, k, r }
       record, such that e != d, e != c, r != t, r != s.

       The server returns { e, r }.

       The server awaits confirmation of { e, k } via
       SETCLIENTID_CONFIRM { e, r }.

       The server does NOT remove client (lock/share/delegation) state
       for x.

     o The server has no confirmed { *, x, *, *, * } for x. It may or
       may not have recorded an unconfirmed { u, x, c, l, s }, where l
       may or may not equal k, and u may or may not equal v. Any
       unconfirmed record { u, x, c, l, * }, regardless of whether
       u == v or l == k, is replaced with an unconfirmed record
       { v, x, d, k, t } where d != c, t != s.

       The server returns { d, t }.

       The server awaits confirmation of { d, k } via
       SETCLIENTID_CONFIRM { d, t }. The server does NOT remove client
       (lock/share/delegation) state for x.

     The server generates the clientid and setclientid_confirm values
     and must take care to ensure that these values are extremely
     unlikely to ever be regenerated.

   ERRORS

     NFS4ERR_BADXDR
     NFS4ERR_CLID_INUSE
     NFS4ERR_INVAL
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT

14.2.34.
Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid

   SYNOPSIS

     clientid, verifier -> -

   ARGUMENT

     struct SETCLIENTID_CONFIRM4args {
             clientid4       clientid;
             verifier4       setclientid_confirm;
     };

   RESULT

     struct SETCLIENTID_CONFIRM4res {
             nfsstat4        status;
     };

   DESCRIPTION

     This operation is used by the client to confirm the results from a
     previous call to SETCLIENTID. The client provides the
     server-supplied clientid (from a SETCLIENTID response). The server
     responds with a simple status of success or failure.

   IMPLEMENTATION

     The client must use the SETCLIENTID_CONFIRM operation to confirm
     the following two distinct cases:

     o The client's use of a new shorthand client identifier (as
       returned from the server in the response to SETCLIENTID), a new
       callback value (as specified in the arguments to SETCLIENTID) and
       a new callback_ident (as specified in the arguments to
       SETCLIENTID) value. The client's use of SETCLIENTID_CONFIRM in
       this case also confirms the removal of any of the client's
       previous relevant leased state. Relevant leased client state
       includes record locks, share reservations, and where the server
       does not support the CLAIM_DELEGATE_PREV claim type, delegations.
       If the server supports CLAIM_DELEGATE_PREV, then
       SETCLIENTID_CONFIRM MUST NOT remove delegations for this client;
       relevant leased client state would then just include record locks
       and share reservations.

     o The client's re-use of an old, previously confirmed, shorthand
       client identifier, a new callback value, and a new callback_ident
       value.
       The client's use of SETCLIENTID_CONFIRM in this case MUST
       NOT result in the removal of any previous leased state (locks,
       share reservations, and delegations).

     We use the same notation and definitions for v, x, c, k, s, and
     unconfirmed and confirmed client records as introduced in the
     description of the SETCLIENTID operation. The arguments to
     SETCLIENTID_CONFIRM are indicated by the notation { c, s }, where c
     is a value of type clientid4, and s is a value of type verifier4
     corresponding to the setclientid_confirm field.

     As with SETCLIENTID, SETCLIENTID_CONFIRM is a non-idempotent
     operation, and we assume that the server is implementing the
     duplicate request cache (DRC).

     When the server gets a SETCLIENTID_CONFIRM { c, s } request, it
     processes it in the following manner.

     o It first looks up the request in the DRC. If there is a hit, it
       returns the result cached in the DRC. The server does not remove
       any relevant leased client state nor does it modify any recorded
       callback and callback_ident information for client { x } as
       represented by the shorthand value c.

     For a DRC miss, the server checks for client records that match the
     shorthand value c. The processing cases are as follows:

     o The server has recorded an unconfirmed { v, x, c, k, s } record
       and a confirmed { v, x, c, l, t } record, such that s != t. If
       the principals of the records do not match that of the
       SETCLIENTID_CONFIRM, the server returns NFS4ERR_CLID_INUSE, and
       no relevant leased client state is removed and no recorded
       callback and callback_ident information for client { x } is
       changed. Otherwise, the confirmed { v, x, c, l, t } record is
       removed and the unconfirmed { v, x, c, k, s } is marked as
       confirmed, thereby modifying recorded and confirmed callback and
       callback_ident information for client { x }.
       The server does not remove any relevant leased client state.

       The server returns NFS4_OK.

     o The server has not recorded an unconfirmed { v, x, c, *, * } and
       has recorded a confirmed { v, x, c, *, s }. If the principals of
       the record and of SETCLIENTID_CONFIRM do not match, the server
       returns NFS4ERR_CLID_INUSE without removing any relevant leased
       client state and without changing recorded callback and
       callback_ident values for client { x }.

       If the principals match, then what has likely happened is that
       the client never got the response from the SETCLIENTID_CONFIRM,
       and the DRC entry has been purged. Whatever the scenario, since
       the principals match, as well as { c, s } matching a confirmed
       record, the server leaves client x's relevant leased client state
       intact, leaves its callback and callback_ident values unmodified,
       and returns NFS4_OK.

     o The server has not recorded a confirmed { *, *, c, *, * }, and
       has recorded an unconfirmed { *, x, c, k, s }. Even if this is a
       retry from the client, the client's first SETCLIENTID_CONFIRM
       attempt was not received by the server. Retry or not, the server
       doesn't know, but it processes it as if it were a first try. If
       the principal of the unconfirmed { *, x, c, k, s } record
       mismatches that of the SETCLIENTID_CONFIRM request, the server
       returns NFS4ERR_CLID_INUSE without removing any relevant leased
       client state.

       Otherwise, the server records a confirmed { *, x, c, k, s }. If
       there is also a confirmed { *, x, d, *, t }, the server MUST
       remove the client x's relevant leased client state, and overwrite
       the callback state with k. The confirmed record { *, x, d, *, t }
       is removed.

       The server returns NFS4_OK.

     o The server has no record of a confirmed or unconfirmed
       { *, *, c, *, s }.
       The server returns NFS4ERR_STALE_CLIENTID. The server does not
       remove any relevant leased client state, nor does it modify any
       recorded callback and callback_ident information for any client.

     The server needs to cache unconfirmed { v, x, c, k, s } client
     records and wait some time for their confirmation. As should be
     clear from the record processing discussions for SETCLIENTID and
     SETCLIENTID_CONFIRM, there are cases where the server does not
     deterministically remove unconfirmed client records. To avoid
     running out of resources, the server is not required to hold
     unconfirmed records indefinitely. One strategy the server might
     use is to set a limit on how many unconfirmed client records it
     will maintain, and then when the limit would be exceeded, remove
     the oldest record. Another strategy might be to remove an
     unconfirmed record when some amount of time has elapsed. The choice
     of the amount of time is fairly arbitrary but it is surely no
     higher than the server's lease time period. Consider that leases
     need to be renewed before the lease time expires via an operation
     from the client. If the client cannot issue a SETCLIENTID_CONFIRM
     after a SETCLIENTID before a period of time equal to that of a
     lease expires, then the client is unlikely to be able to maintain
     state on the server during steady state operation.

     If the client does send a SETCLIENTID_CONFIRM for an unconfirmed
     record that the server has already deleted, the client will get
     NFS4ERR_STALE_CLIENTID back. If so, the client should then start
     over, and send SETCLIENTID to reestablish an unconfirmed client
     record and get back an unconfirmed clientid and setclientid_confirm
     verifier. The client should then send the SETCLIENTID_CONFIRM to
     confirm the clientid.
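
     The two strategies for discarding unconfirmed client records (a
     limit on the number of records, and a limit on their age) can be
     combined. The following Python sketch is illustrative only and is
     not part of the protocol: the class name UnconfirmedCache, its
     methods, and the injectable clock are all hypothetical, and a real
     server would also need to check principals as described above.

```python
import time
from collections import OrderedDict


class UnconfirmedCache:
    """Hypothetical holder for unconfirmed { v, x, c, k, s } client records."""

    def __init__(self, max_records, max_age_seconds, clock=time.monotonic):
        self.max_records = max_records      # assumed to be at least 1
        self.max_age = max_age_seconds      # suggested: no higher than the lease time
        self.clock = clock
        self.records = OrderedDict()        # clientid c -> (record, time recorded)

    def expire(self):
        """Remove records that have waited max_age or longer, oldest first."""
        now = self.clock()
        while self.records:
            clientid, (_, recorded_at) = next(iter(self.records.items()))
            if now - recorded_at < self.max_age:
                break
            del self.records[clientid]

    def add(self, clientid, record):
        """Record an unconfirmed entry, evicting the oldest if the limit is hit."""
        self.expire()
        self.records.pop(clientid, None)
        while self.records and len(self.records) >= self.max_records:
            self.records.popitem(last=False)
        self.records[clientid] = (record, self.clock())

    def take(self, clientid):
        """Return the record to confirm, or None (NFS4ERR_STALE_CLIENTID case)."""
        self.expire()
        entry = self.records.pop(clientid, None)
        return entry[0] if entry else None
```

     A None result from take() corresponds to the case in which the
     client receives NFS4ERR_STALE_CLIENTID and starts over with
     SETCLIENTID.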

     SETCLIENTID_CONFIRM does not establish or renew a lease. However,
     if SETCLIENTID_CONFIRM removes relevant leased client state, and
     that state does not include existing delegations, the server MUST
     allow the client a period of time no less than the value of the
     lease_time attribute to reclaim (via the CLAIM_DELEGATE_PREV
     claim type of the OPEN operation) its delegations before removing
     unreclaimed delegations.

   ERRORS

     NFS4ERR_BADXDR
     NFS4ERR_CLID_INUSE
     NFS4ERR_INVAL
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE_CLIENTID

14.2.35. Operation 37: VERIFY - Verify Same Attributes

   SYNOPSIS

     (cfh), fattr -> -

   ARGUMENT

     struct VERIFY4args {
             /* CURRENT_FH: object */
             fattr4          obj_attributes;
     };

   RESULT

     struct VERIFY4res {
             nfsstat4        status;
     };

   DESCRIPTION

     The VERIFY operation is used to verify that attributes have a value
     assumed by the client before proceeding with following operations
     in the compound request. If any of the attributes do not match
     then the error NFS4ERR_NOT_SAME must be returned. The current
     filehandle retains its value after successful completion of the
     operation.

   IMPLEMENTATION

     One possible use of the VERIFY operation is the following compound
     sequence. With this the client is attempting to verify that the
     file being removed will match what the client expects to be
     removed. This sequence can help prevent the unintended deletion of
     a file.

          PUTFH (directory filehandle)
          LOOKUP (file name)
          VERIFY (filehandle == fh)
          PUTFH (directory filehandle)
          REMOVE (file name)

     This sequence does not prevent a second client from removing and
     creating a new file in the middle of this sequence but it does help
     avoid the unintended result.
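
     The effect of VERIFY within such a sequence can be illustrated
     with a toy model. The Python sketch below is an assumption-laden
     illustration, not an NFS implementation: the directory is a plain
     dictionary, the two PUTFH steps are omitted, status codes are
     represented by their names, and the helper names (run_compound,
     verify, guarded_remove) are hypothetical.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_NOT_SAME = "NFS4ERR_NOT_SAME"


def run_compound(ops):
    """Evaluate operations in order, stopping at the first non-OK status."""
    results = []
    for name, op in ops:
        status = op()
        results.append((name, status))
        if status != NFS4_OK:
            break
    return results


def verify(actual_attrs, expected_attrs):
    """Return NFS4ERR_NOT_SAME if any expected attribute value differs."""
    for key, value in expected_attrs.items():
        if actual_attrs.get(key) != value:
            return NFS4ERR_NOT_SAME
    return NFS4_OK


def guarded_remove(directory, name, expected_attrs):
    """LOOKUP, VERIFY, REMOVE against a toy directory (name -> attributes)."""
    state = {}

    def lookup():
        state["cfh"] = directory[name]   # toy current filehandle
        return NFS4_OK

    def remove():
        del directory[name]
        return NFS4_OK

    return run_compound([
        ("LOOKUP", lookup),
        ("VERIFY", lambda: verify(state["cfh"], expected_attrs)),
        ("REMOVE", remove),
    ])
```

     When the attributes do not match, the sequence ends at VERIFY and
     the REMOVE step is never evaluated.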

     In the case that a recommended attribute is specified in the VERIFY
     operation and the server does not support that attribute for the
     filesystem object, the error NFS4ERR_NOTSUPP is returned to the
     client.

     When the attribute rdattr_error or any write-only attribute (e.g.
     time_modify_set) is specified, the error NFS4ERR_INVAL is returned
     to the client. If both of these conditions apply, the server is
     free to return either error.

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_ATTRNOTSUPP
     NFS4ERR_BADCHAR
     NFS4ERR_BADHANDLE
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_FHEXPIRED
     NFS4ERR_INVAL
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOTSUPP
     NFS4ERR_NOT_SAME
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE

14.2.36. Operation 38: WRITE - Write to File

   SYNOPSIS

     (cfh), stateid, offset, stable, data -> count, committed, writeverf

   ARGUMENT

     enum stable_how4 {
             UNSTABLE4       = 0,
             DATA_SYNC4      = 1,
             FILE_SYNC4      = 2
     };

     struct WRITE4args {
             /* CURRENT_FH: file */
             stateid4        stateid;
             offset4         offset;
             stable_how4     stable;
             opaque          data<>;
     };

   RESULT

     struct WRITE4resok {
             count4          count;
             stable_how4     committed;
             verifier4       writeverf;
     };

     union WRITE4res switch (nfsstat4 status) {
      case NFS4_OK:
              WRITE4resok    resok4;
      default:
              void;
     };

   DESCRIPTION

     The WRITE operation is used to write data to a regular file. The
     target file is specified by the current filehandle. The offset
     specifies the offset where the data should be written. An offset
     of 0 (zero) specifies that the write should start at the beginning
     of the file.
     The count, as encoded as part of the opaque data
     parameter, represents the number of bytes of data that are to be
     written. If the count is 0 (zero), the WRITE will succeed and
     return a count of 0 (zero) subject to permissions checking. The
     server may choose to write fewer bytes than requested by the
     client.

     Part of the write request is a specification of how the write is to
     be performed. The client specifies with the stable parameter the
     method of how the data is to be processed by the server. If stable
     is FILE_SYNC4, the server must commit the data written plus all
     filesystem metadata to stable storage before returning results.
     This corresponds to the NFS version 2 protocol semantics. Any
     other behavior constitutes a protocol violation. If stable is
     DATA_SYNC4, then the server must commit all of the data to stable
     storage and enough of the metadata to retrieve the data before
     returning. The server implementor is free to implement DATA_SYNC4
     in the same fashion as FILE_SYNC4, but with a possible performance
     drop. If stable is UNSTABLE4, the server is free to commit any
     part of the data and the metadata to stable storage, including all
     or none, before returning a reply to the client. There is no
     guarantee whether or when any uncommitted data will subsequently be
     committed to stable storage. The only guarantees made by the server
     are that it will not destroy any data without changing the value of
     verf and that it will not commit the data and metadata at a level
     less than that requested by the client.

     The stateid value for a WRITE request represents a value returned
     from a previous record lock or share reservation request.
     The stateid is used by the server to verify that the associated
     share reservation and any record locks are still valid and to
     update lease timeouts for the client.

     Upon successful completion, the following results are returned.
     The count result is the number of bytes of data written to the
     file. The server may write fewer bytes than requested. If so, the
     actual number of bytes written, starting at location offset, is
     returned.

     The server also returns an indication of the level of commitment of
     the data and metadata via committed. If the server committed all
     data and metadata to stable storage, committed should be set to
     FILE_SYNC4. If the level of commitment was at least as strong as
     DATA_SYNC4, then committed should be set to DATA_SYNC4. Otherwise,
     committed must be returned as UNSTABLE4. If stable was FILE_SYNC4,
     then committed must also be FILE_SYNC4: anything else constitutes a
     protocol violation. If stable was DATA_SYNC4, then committed may be
     FILE_SYNC4 or DATA_SYNC4: anything else constitutes a protocol
     violation. If stable was UNSTABLE4, then committed may be either
     FILE_SYNC4, DATA_SYNC4, or UNSTABLE4.

     The final portion of the result is the write verifier. The write
     verifier is a cookie that the client can use to determine whether
     the server has changed instance (boot) state between a call to
     WRITE and a subsequent call to either WRITE or COMMIT. This cookie
     must be consistent during a single instance of the NFS version 4
     protocol service and must be unique between instances of the NFS
     version 4 protocol server, where uncommitted data may be lost.
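
     The permitted combinations of the stable argument and the committed
     result reduce to one ordering rule: committed must be at least as
     strong as stable. The Python sketch below states that rule using
     the numeric values of stable_how4 given in the ARGUMENT section;
     the class and function names are hypothetical and chosen only for
     illustration.

```python
from enum import IntEnum


class StableHow4(IntEnum):
    """Mirror of the stable_how4 enumeration from the ARGUMENT section."""
    UNSTABLE4 = 0
    DATA_SYNC4 = 1
    FILE_SYNC4 = 2


def committed_is_valid(stable, committed):
    """True if 'committed' is a permitted reply for the requested 'stable'.

    FILE_SYNC4 requires FILE_SYNC4; DATA_SYNC4 permits DATA_SYNC4 or
    FILE_SYNC4; UNSTABLE4 permits any of the three values.
    """
    return StableHow4(committed) >= StableHow4(stable)
```

     A client could use such a check to detect a server reply that
     constitutes a protocol violation.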

     If a client writes data to the server with the stable argument set
     to UNSTABLE4 and the reply yields a committed response of
     DATA_SYNC4 or UNSTABLE4, the client will follow up some time in the
     future with a COMMIT operation to synchronize outstanding
     asynchronous data and metadata with the server's stable storage,
     barring client error. It is possible that due to client crash or
     other error that a subsequent COMMIT will not be received by the
     server.

     For a WRITE with a stateid value of all bits 0, the server MAY
     allow the WRITE to be serviced subject to mandatory file locks or
     the current share deny modes for the file. For a WRITE with a
     stateid value of all bits 1, the server MUST NOT allow the WRITE
     operation to bypass locking checks at the server; the WRITE is
     treated exactly the same as if a stateid of all bits 0 were used.

     On success, the current filehandle retains its value.

   IMPLEMENTATION

     It is possible for the server to write fewer bytes of data than
     requested by the client. In this case, the server should not
     return an error unless no data was written at all. If the server
     writes less than the number of bytes specified, the client should
     issue another WRITE to write the remaining data.

     It is assumed that the act of writing data to a file will cause the
     time_modified of the file to be updated. However, the
     time_modified of the file should not be changed unless the contents
     of the file are changed. Thus, a WRITE request with count set to 0
     should not cause the time_modified of the file to be updated.

     The definition of stable storage has been historically a point of
     contention. The following expected properties of stable storage
     may help in resolving design issues in the implementation.
     Stable storage is persistent storage that survives:

          1. Repeated power failures.
          2. Hardware failures (of any board, power supply, etc.).
          3. Repeated software crashes, including reboot cycle.

     This definition does not address failure of the stable storage
     module itself.

     The verifier is defined to allow a client to detect different
     instances of an NFS version 4 protocol server over which cached,
     uncommitted data may be lost. In the most likely case, the verifier
     allows the client to detect server reboots. This information is
     required so that the client can safely determine whether the server
     could have lost cached data. If the server fails unexpectedly and
     the client has uncommitted data from previous WRITE requests (done
     with the stable argument set to UNSTABLE4 and in which the result
     committed was returned as UNSTABLE4 as well), the server may not
     have flushed cached data to stable storage. The burden of recovery
     is on the client and the client will need to retransmit the data to
     the server.

     A suggested verifier would be to use the time that the server was
     booted or the time the server was last started (if restarting the
     server without a reboot results in lost buffers).

     The committed field in the results allows the client to do more
     effective caching. If the server is committing all WRITE requests
     to stable storage, then it should return with committed set to
     FILE_SYNC4, regardless of the value of the stable field in the
     arguments. A server that uses an NVRAM accelerator may choose to
     implement this policy. The client can use this to increase the
     effectiveness of the cache by discarding cached data that has
     already been committed on the server.
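
     One way a client might track the write verifier is sketched below
     in Python. This is a simplified illustration under stated
     assumptions, not a prescribed client design: the class
     UncommittedWrites and its methods are hypothetical, committed
     levels are represented by their names, and only data whose
     committed result was UNSTABLE4 is retained for possible
     retransmission.

```python
class UncommittedWrites:
    """Hypothetical client-side record of data the server may still lose."""

    def __init__(self):
        self.verifier = None   # last write verifier seen from the server
        self.pending = []      # (offset, data) pairs returned as UNSTABLE4

    def _note_verifier(self, writeverf):
        """Return the writes to retransmit if the server instance changed."""
        lost = []
        if self.verifier is not None and writeverf != self.verifier:
            lost, self.pending = self.pending, []
        self.verifier = writeverf
        return lost

    def on_write_reply(self, offset, data, committed, writeverf):
        """Process a WRITE reply; returns earlier writes that must be resent."""
        lost = self._note_verifier(writeverf)
        if committed == "UNSTABLE4":
            self.pending.append((offset, data))
        return lost

    def on_commit_reply(self, writeverf):
        """Process a COMMIT reply; returns writes that must be resent."""
        lost = self._note_verifier(writeverf)
        self.pending = []      # either committed now or handed back as lost
        return lost
```

     Data returned with committed set to FILE_SYNC4 never enters the
     pending list, which reflects the caching benefit described above.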

     Some implementations may return NFS4ERR_NOSPC instead of
     NFS4ERR_DQUOT when a user's quota is exceeded. In the case that
     the current filehandle is a directory, the server will return
     NFS4ERR_ISDIR. If the current filehandle is not a regular file or
     a directory, the server will return NFS4ERR_INVAL.

     If mandatory file locking is on for the file, and the corresponding
     record of the file to be written is read or write locked by an
     owner that is not associated with the stateid, the server will
     return NFS4ERR_LOCKED. If so, the client must check if the owner
     corresponding to the stateid used with the WRITE operation has a
     conflicting read lock that overlaps with the region that was to be
     written. If the stateid's owner has no conflicting read lock, then
     the client should try to get the appropriate write record lock via
     the LOCK operation before re-attempting the WRITE. When the WRITE
     completes, the client should release the record lock via LOCKU.

     If the stateid's owner had a conflicting read lock, then the client
     has no choice but to return an error to the application that
     attempted the WRITE. The reason is that since the stateid's owner
     had a read lock, the server either attempted to temporarily
     effectively upgrade this read lock to a write lock, or the server
     has no upgrade capability. If the server attempted to upgrade the
     read lock and failed, it is pointless for the client to re-attempt
     the upgrade via the LOCK operation, because there might be another
     client also trying to upgrade. If two clients are blocked trying to
     upgrade the same lock, the clients deadlock. If the server has no
     upgrade capability, then it is pointless to try a LOCK operation to
     upgrade.
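
     The client's decision after receiving NFS4ERR_LOCKED can be
     summarized in a few lines. The Python sketch below is illustrative
     only: the function name, the representation of locks as (offset,
     length) pairs with non-zero lengths, and the returned action
     strings are all assumptions made for this example.

```python
def action_after_locked(own_read_locks, offset, length):
    """Choose the client's next step after WRITE returns NFS4ERR_LOCKED.

    own_read_locks lists (offset, length) read locks held by the owner
    associated with the stateid used for the WRITE.
    """
    write_end = offset + length
    for lock_offset, lock_length in own_read_locks:
        lock_end = lock_offset + lock_length
        if lock_offset < write_end and offset < lock_end:
            # The owner's own read lock overlaps the region: an upgrade
            # attempt is pointless, so report an error to the application.
            return "return-error"
    # No conflicting read lock: LOCK for write, retry WRITE, then LOCKU.
    return "lock-write-unlock"
```

     For example, an owner holding a read lock on bytes 0 through 9
     cannot retry a WRITE at offset 5, but can lock and retry a WRITE
     that starts at offset 10.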

   ERRORS

     NFS4ERR_ACCESS
     NFS4ERR_BADHANDLE
     NFS4ERR_BAD_STATEID
     NFS4ERR_BADXDR
     NFS4ERR_DELAY
     NFS4ERR_DQUOT
     NFS4ERR_EXPIRED
     NFS4ERR_FBIG
     NFS4ERR_FHEXPIRED
     NFS4ERR_GRACE
     NFS4ERR_INVAL
     NFS4ERR_IO
     NFS4ERR_ISDIR
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCKED
     NFS4ERR_MOVED
     NFS4ERR_NOFILEHANDLE
     NFS4ERR_NOSPC
     NFS4ERR_NXIO
     NFS4ERR_OLD_STATEID
     NFS4ERR_OPENMODE
     NFS4ERR_RESOURCE
     NFS4ERR_ROFS
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE
     NFS4ERR_STALE_STATEID

14.2.37. Operation 39: RELEASE_LOCKOWNER - Release Lockowner State

   SYNOPSIS

     lockowner -> ()

   ARGUMENT

     struct RELEASE_LOCKOWNER4args {
             lock_owner4     lock_owner;
     };

   RESULT

     struct RELEASE_LOCKOWNER4res {
             nfsstat4        status;
     };

   DESCRIPTION

     This operation is used to notify the server that the lock_owner is
     no longer in use by the client. This allows the server to release
     cached state related to the specified lock_owner. If file locks,
     associated with the lock_owner, are held at the server, the error
     NFS4ERR_LOCKS_HELD will be returned and no further action will be
     taken.

   IMPLEMENTATION

     The client may choose to use this operation to ease the amount of
     server state that is held. Depending on behavior of applications
     at the client, it may be important for the client to use this
     operation since the server has certain obligations with respect to
     holding a reference to a lock_owner as long as the associated file
     is open. Therefore, if the client knows for certain that the
     lock_owner will no longer be used under the context of the
     associated open_owner4, it should use RELEASE_LOCKOWNER.

   ERRORS

     NFS4ERR_BADXDR
     NFS4ERR_EXPIRED
     NFS4ERR_GRACE
     NFS4ERR_LEASE_MOVED
     NFS4ERR_LOCKS_HELD
     NFS4ERR_RESOURCE
     NFS4ERR_SERVERFAULT
     NFS4ERR_STALE_CLIENTID

14.2.38. Operation 10044: ILLEGAL - Illegal operation

   SYNOPSIS

   ARGUMENT

     void;

   RESULT

     struct ILLEGAL4res {
             nfsstat4        status;
     };

   DESCRIPTION

     This operation is a placeholder for encoding a result to handle the
     case of the client sending an operation code within COMPOUND that
     is not supported. See the COMPOUND procedure description for more
     details.

     The status field of ILLEGAL4res MUST be set to NFS4ERR_OP_ILLEGAL.

   IMPLEMENTATION

     A client will probably not send an operation with code OP_ILLEGAL
     but if it does, the response will be ILLEGAL4res just as it would
     be with any other invalid operation code. Note that if the server
     gets an illegal operation code that is not OP_ILLEGAL, and if the
     server checks for legal operation codes during the XDR decode
     phase, then the ILLEGAL4res would not be returned.

   ERRORS

     NFS4ERR_OP_ILLEGAL

15. NFS version 4 Callback Procedures

   The procedures used for callbacks are defined in the following
   sections. In the interest of clarity, the terms "client" and
   "server" refer to NFS clients and servers, despite the fact that for
   an individual callback RPC, the sense of these terms would be
   precisely the opposite.

15.1. Procedure 0: CB_NULL - No Operation

   SYNOPSIS

   ARGUMENT

     void;

   RESULT

     void;

   DESCRIPTION

     Standard NULL procedure. Void argument, void response.
     Even though there is no direct functionality associated with this
     procedure, the server will use CB_NULL to confirm the existence of
     a path for RPCs from server to client.

   ERRORS

     None.

15.2. Procedure 1: CB_COMPOUND - Compound Operations

   SYNOPSIS

     compoundargs -> compoundres

   ARGUMENT

     enum nfs_cb_opnum4 {
             OP_CB_GETATTR           = 3,
             OP_CB_RECALL            = 4,
             OP_CB_ILLEGAL           = 10044
     };

     union nfs_cb_argop4 switch (unsigned argop) {
      case OP_CB_GETATTR:    CB_GETATTR4args opcbgetattr;
      case OP_CB_RECALL:     CB_RECALL4args  opcbrecall;
      case OP_CB_ILLEGAL:    void            opcbillegal;
     };

     struct CB_COMPOUND4args {
             utf8string      tag;
             uint32_t        minorversion;
             uint32_t        callback_ident;
             nfs_cb_argop4   argarray<>;
     };

   RESULT

     union nfs_cb_resop4 switch (unsigned resop){
      case OP_CB_GETATTR:    CB_GETATTR4res  opcbgetattr;
      case OP_CB_RECALL:     CB_RECALL4res   opcbrecall;
     };

     struct CB_COMPOUND4res {
             nfsstat4        status;
             utf8string      tag;
             nfs_cb_resop4   resarray<>;
     };

   DESCRIPTION

     The CB_COMPOUND procedure is used to combine one or more of the
     callback procedures into a single RPC request. The main callback
     RPC program has two main procedures: CB_NULL and CB_COMPOUND. All
     other operations use the CB_COMPOUND procedure as a wrapper.

     In the processing of the CB_COMPOUND procedure, the client may find
     that it does not have the available resources to execute any or all
     of the operations within the CB_COMPOUND sequence. In this case,
     the error NFS4ERR_RESOURCE will be returned for the particular
     operation within the CB_COMPOUND procedure where the resource
     exhaustion occurred.
This assumes that all previous operations 10682 within the CB_COMPOUND sequence have been evaluated successfully. 10684 Contained within the CB_COMPOUND results is a 'status' field. This 10685 status must be equivalent to the status of the last operation that 10686 was executed within the CB_COMPOUND procedure. Therefore, if an 10687 operation incurred an error then the 'status' value will be the 10688 same error value as is being returned for the operation that 10689 failed. 10691 For the definition of the "tag" field, see the section "Procedure 10692 1: COMPOUND - Compound Operations". 10694 The value of callback_ident is supplied by the client during 10695 SETCLIENTID. The server must use the client-supplied 10696 callback_ident during the CB_COMPOUND to allow the client to 10697 properly identify the server. 10699 Illegal operation codes are handled in the same way as they are 10700 handled for the COMPOUND procedure. 10702 IMPLEMENTATION 10704 The CB_COMPOUND procedure is used to combine individual operations 10705 into a single RPC request. The client interprets each of the 10706 operations in turn. If an operation is executed by the client and 10707 the status of that operation is NFS4_OK, then the next operation in 10708 the CB_COMPOUND procedure is executed. The client continues this 10709 process until there are no more operations to be executed or one of 10710 the operations has a status value other than NFS4_OK. 10712 ERRORS 10714 NFS4ERR_BADHANDLE 10715 NFS4ERR_BAD_STATEID 10716 NFS4ERR_OP_ILLEGAL 10717 NFS4ERR_RESOURCE 10721 15.2.1.
Operation 3: CB_GETATTR - Get Attributes 10723 SYNOPSIS 10725 fh, attrbits -> attrbits, attrvals 10727 ARGUMENT 10729 struct CB_GETATTR4args { 10730 nfs_fh4 fh; 10731 bitmap4 attr_request; 10732 }; 10734 RESULT 10736 struct CB_GETATTR4resok { 10737 fattr4 obj_attributes; 10738 }; 10740 union CB_GETATTR4res switch (nfsstat4 status) { 10741 case NFS4_OK: 10742 CB_GETATTR4resok resok4; 10743 default: 10744 void; 10745 }; 10747 DESCRIPTION 10749 The CB_GETATTR operation is used by the server to obtain the 10750 current modified state of a file that has been write delegated. 10751 The attributes size and change are the only ones guaranteed to be 10752 serviced by the client. See the section "Handling of CB_GETATTR" 10753 for a full description of how the client and server are to interact 10754 with the use of CB_GETATTR. 10756 If the filehandle specified is not one for which the client holds a 10757 write open delegation, an NFS4ERR_BADHANDLE error is returned. 10759 IMPLEMENTATION 10761 The client returns attrbits and the associated attribute values 10762 only for attributes that it may change (change, time_modify, size). 10764 ERRORS 10766 NFS4ERR_BADHANDLE 10768 Draft Specification NFS version 4 Protocol August 2002 10770 NFS4ERR_BADXDR 10771 NFS4ERR_RESOURCE 10772 NFS4ERR_SERVERFAULT 10774 Draft Specification NFS version 4 Protocol August 2002 10776 15.2.2. Operation 4: CB_RECALL - Recall an Open Delegation 10778 SYNOPSIS 10780 stateid, truncate, fh -> status 10782 ARGUMENT 10784 struct CB_RECALL4args { 10785 stateid4 stateid; 10786 bool truncate; 10787 nfs_fh4 fh; 10788 }; 10790 RESULT 10792 struct CB_RECALL4res { 10793 nfsstat4 status; 10794 }; 10796 DESCRIPTION 10798 The CB_RECALL operation is used to begin the process of recalling 10799 an open delegation and returning it to the server. 10801 The truncate flag is used to optimize recall for a file which is 10802 about to be truncated to zero. 
When it is set, the client is freed 10803 of the obligation to propagate modified data for the file to the 10804 server, since this data is irrelevant. 10806 If the handle specified is not one for which the client holds an 10807 open delegation, an NFS4ERR_BADHANDLE error is returned. 10809 If the stateid specified is not one corresponding to an open 10810 delegation for the file specified by the filehandle, an 10811 NFS4ERR_BAD_STATEID is returned. 10813 IMPLEMENTATION 10815 The client should reply to the callback immediately. Replying does 10816 not complete the recall except when an error was returned. The 10817 recall is not complete until the delegation is returned using a 10818 DELEGRETURN. 10820 ERRORS 10822 NFS4ERR_BADHANDLE 10826 NFS4ERR_BAD_STATEID 10827 NFS4ERR_BADXDR 10828 NFS4ERR_RESOURCE 10829 NFS4ERR_SERVERFAULT 10833 15.2.3. Operation 10044: CB_ILLEGAL - Illegal Callback Operation 10835 SYNOPSIS 10837 10839 ARGUMENT 10841 void; 10843 RESULT 10845 struct CB_ILLEGAL4res { 10846 nfsstat4 status; 10847 }; 10849 DESCRIPTION 10851 This operation is a placeholder for encoding a result to handle the 10852 case of the server sending an operation code within CB_COMPOUND that 10853 is not supported. See the COMPOUND procedure description for more 10854 details. 10856 The status field of CB_ILLEGAL4res MUST be set to 10857 NFS4ERR_OP_ILLEGAL. 10859 IMPLEMENTATION 10861 A server will probably not send an operation with code 10862 OP_CB_ILLEGAL but if it does, the response will be CB_ILLEGAL4res 10863 just as it would be with any other invalid operation code. Note 10864 that if the client gets an illegal operation code that is not 10865 OP_CB_ILLEGAL, and if the client checks for legal operation codes 10866 during the XDR decode phase, then the CB_ILLEGAL4res would not be 10867 returned.
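The following C fragment is a minimal sketch, not normative text, of how a client might process the operations of a CB_COMPOUND and answer an unrecognized operation code. The numeric values are copied from the definitions in this document. The names cb_result, cb_handler, cb_compound_run, cb_always_ok and cb_recall_busy are hypothetical and exist only for illustration. The sketch assumes that the reply for an unrecognized code is encoded under OP_CB_ILLEGAL, by analogy with the COMPOUND procedure, and that the check happens during processing rather than during the XDR decode phase.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the definitions in this document. */
enum { NFS4_OK = 0, NFS4ERR_RESOURCE = 10018, NFS4ERR_OP_ILLEGAL = 10044 };
enum { OP_CB_GETATTR = 3, OP_CB_RECALL = 4, OP_CB_ILLEGAL = 10044 };

/* Hypothetical in-memory form of one result; this is not the XDR layout. */
struct cb_result {
    uint32_t resop;   /* operation code encoded in the reply */
    uint32_t status;  /* nfsstat4 value for this operation */
};

/* Hypothetical per-operation handler supplied by the client. */
typedef uint32_t (*cb_handler)(uint32_t opcode);

/* Sample handler: every known operation succeeds. */
uint32_t cb_always_ok(uint32_t opcode)
{
    (void)opcode;
    return NFS4_OK;
}

/* Sample handler: CB_RECALL fails for lack of resources. */
uint32_t cb_recall_busy(uint32_t opcode)
{
    return opcode == OP_CB_RECALL ? NFS4ERR_RESOURCE : NFS4_OK;
}

/*
 * Evaluate ops[0..nops-1] in order and stop after the first result whose
 * status is not NFS4_OK.  Returns the number of results written to res;
 * *status receives the status of the last operation that was executed.
 */
size_t cb_compound_run(const uint32_t *ops, size_t nops, cb_handler handle,
                       struct cb_result *res, uint32_t *status)
{
    size_t n = 0;

    assert(handle != NULL && res != NULL && status != NULL);
    *status = NFS4_OK;
    while (n < nops) {
        uint32_t op = ops[n];

        if (op == OP_CB_GETATTR || op == OP_CB_RECALL) {
            res[n].resop = op;
            res[n].status = handle(op);
        } else {
            /* Unrecognized code, including OP_CB_ILLEGAL itself. */
            res[n].resop = OP_CB_ILLEGAL;
            res[n].status = NFS4ERR_OP_ILLEGAL;
        }
        *status = res[n].status;
        n++;
        if (*status != NFS4_OK)
            break;
    }
    return n;
}
```

With the operation list { OP_CB_GETATTR, 99, OP_CB_RECALL } and the cb_always_ok handler, this sketch produces two results and never evaluates the third operation.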
10869 ERRORS 10871 NFS4ERR_OP_ILLEGAL 10875 16. Security Considerations 10877 The major security feature to consider is the authentication of the 10878 user making the request for NFS service. Consideration should also be 10879 given to the integrity and privacy of this NFS request. These 10880 specific issues are discussed as part of the section on "RPC and 10881 Security Flavor". 10883 For reasons of reduced administration overhead, better performance 10884 and/or reduction of CPU utilization, users of NFS version 4 10885 implementations may choose not to use security mechanisms that enable 10886 integrity protection on each remote procedure call and response. The 10887 use of mechanisms without integrity leaves the customer vulnerable to 10888 an attacker in between the NFS client and server that modifies the 10889 RPC request and/or the response. While implementations are free to 10890 provide the option to use weaker security mechanisms, there are two 10891 operations in particular that warrant the implementation overriding 10892 user choices. 10894 The first such operation is SECINFO. It is recommended that the 10895 client issue the SECINFO call such that it is protected with a 10896 security flavor that has integrity protection, such as RPCSEC_GSS 10897 with a security triple that uses either rpc_gss_svc_integrity or 10898 rpc_gss_svc_privacy (rpc_gss_svc_privacy includes integrity 10899 protection) service. Without integrity protection encapsulating 10900 SECINFO and therefore its results, an attacker in the middle could 10901 modify results such that the client might select a weaker algorithm 10902 in the set allowed by the server, making the client and/or server 10903 vulnerable to further attacks. 10905 The second operation that should definitely use integrity protection 10906 is any GETATTR for the fs_locations attribute. The attack has two 10907 steps.
First the attacker modifies the unprotected results of some 10908 operation to return NFS4ERR_MOVED. Second, when the client follows up 10909 with a GETATTR for the fs_locations attribute, the attacker modifies 10910 the results to cause the client to migrate its traffic to a server 10911 controlled by the attacker. 10913 Because the operations SETCLIENTID/SETCLIENTID_CONFIRM are 10914 responsible for the release of client state, it is imperative that 10915 the principal used for these operations be checked against, and match, 10916 the previous use of these operations. See the section "Client ID" 10917 for further discussion. 10921 17. IANA Considerations 10923 17.1. Named Attribute Definition 10925 The NFS version 4 protocol provides for the association of named 10926 attributes with files. The name space identifiers for these attributes 10927 are defined as string names. The protocol does not define the 10928 specific assignment of the name space for these file attributes; the 10929 application developer or system vendor is allowed to define the 10930 attribute, its semantics, and the associated name. Even though this 10931 name space will not be specifically controlled to prevent collisions, 10932 the application developer or system vendor is strongly encouraged 10933 to register its named attributes with IANA, and to provide the name 10934 assignment and associated semantics for attributes via an 10935 Informational RFC. This will provide for interoperability where 10936 common interests exist. 10938 17.2. ONC RPC Network Identifiers (netids) 10940 The section "Structured Data Types" discusses the r_netid field and 10941 the corresponding r_addr field of a clientaddr4 structure. There 10942 should be a registry at IANA for netids and the 10943 universal address format that corresponds to the native address format 10944 for the transport represented by each netid.
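As an illustration of the relationship between a netid and its universal address, the following C fragment is a minimal sketch of parsing an r_addr value for an IPv4 transport such as "tcp" or "udp". It assumes the textual form h1.h2.h3.h4.p1.p2, in which the four h fields are the octets of the IPv4 address and the port is p1 * 256 + p2. That form is not defined in this section and is taken here as an assumption. The names uaddr_ipv4 and parse_uaddr_ipv4 are hypothetical and are not part of the protocol definition.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical decoded form of an IPv4 universal address. */
struct uaddr_ipv4 {
    uint8_t  host[4];  /* h1..h4 */
    uint16_t port;     /* p1 * 256 + p2 */
};

/*
 * Parse an r_addr string of the assumed form "h1.h2.h3.h4.p1.p2".
 * Returns 0 on success and -1 if the string is malformed.
 */
int parse_uaddr_ipv4(const char *r_addr, struct uaddr_ipv4 *out)
{
    unsigned int f[6];
    char trailing;
    int i;

    assert(r_addr != NULL && out != NULL);

    /* Exactly six decimal fields; a seventh conversion means trailing text. */
    if (sscanf(r_addr, "%u.%u.%u.%u.%u.%u%c",
               &f[0], &f[1], &f[2], &f[3], &f[4], &f[5], &trailing) != 6)
        return -1;

    /* Every field is a single octet. */
    for (i = 0; i < 6; i++)
        if (f[i] > 255)
            return -1;

    for (i = 0; i < 4; i++)
        out->host[i] = (uint8_t)f[i];
    out->port = (uint16_t)(f[4] * 256 + f[5]);
    return 0;
}
```

Under these assumptions the string "192.0.2.10.8.1" denotes host 192.0.2.10 and port 2049. A registry entry for another netid would need its own parsing rule.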
10946 Draft Specification NFS version 4 Protocol August 2002 10948 18. RPC definition file 10950 /* 10951 * Copyright (C) The Internet Society (1998,1999,2000,2001,2002). 10952 * All Rights Reserved. 10953 */ 10955 /* 10956 * nfs4_prot.x 10957 * 10958 */ 10960 %#pragma ident "@(#)nfs4_prot.x 1.117" 10962 /* 10963 * Basic typedefs for RFC 1832 data type definitions 10964 */ 10965 typedef int int32_t; 10966 typedef unsigned int uint32_t; 10967 typedef hyper int64_t; 10968 typedef unsigned hyper uint64_t; 10970 /* 10971 * Sizes 10972 */ 10973 const NFS4_FHSIZE = 128; 10974 const NFS4_VERIFIER_SIZE = 8; 10975 const NFS4_OPAQUE_LIMIT = 1024; 10977 /* 10978 * File types 10979 */ 10980 enum nfs_ftype4 { 10981 NF4REG = 1, /* Regular File */ 10982 NF4DIR = 2, /* Directory */ 10983 NF4BLK = 3, /* Special File - block device */ 10984 NF4CHR = 4, /* Special File - character device */ 10985 NF4LNK = 5, /* Symbolic Link */ 10986 NF4SOCK = 6, /* Special File - socket */ 10987 NF4FIFO = 7, /* Special File - fifo */ 10988 NF4ATTRDIR = 8, /* Attribute Directory */ 10989 NF4NAMEDATTR = 9 /* Named Attribute */ 10990 }; 10992 /* 10993 * Error status 10994 */ 10995 enum nfsstat4 { 10996 NFS4_OK = 0, /* everything is okay */ 10997 NFS4ERR_PERM = 1, /* caller not privileged */ 10998 NFS4ERR_NOENT = 2, /* no such file/directory */ 10999 NFS4ERR_IO = 5, /* hard I/O error */ 11001 Draft Specification NFS version 4 Protocol August 2002 11003 NFS4ERR_NXIO = 6, /* no such device */ 11004 NFS4ERR_ACCESS = 13, /* access denied */ 11005 NFS4ERR_EXIST = 17, /* file already exists */ 11006 NFS4ERR_XDEV = 18, /* different filesystems */ 11007 NFS4ERR_NODEV = 19, /* no such device */ 11008 NFS4ERR_NOTDIR = 20, /* should be a directory */ 11009 NFS4ERR_ISDIR = 21, /* should not be directory */ 11010 NFS4ERR_INVAL = 22, /* invalid argument */ 11011 NFS4ERR_FBIG = 27, /* file exceeds server max */ 11012 NFS4ERR_NOSPC = 28, /* no space on filesystem */ 11013 NFS4ERR_ROFS = 30, /* read-only filesystem */ 
11014 NFS4ERR_MLINK = 31, /* too many hard links */ 11015 NFS4ERR_NAMETOOLONG = 63, /* name exceeds server max */ 11016 NFS4ERR_NOTEMPTY = 66, /* directory not empty */ 11017 NFS4ERR_DQUOT = 69, /* hard quota limit reached*/ 11018 NFS4ERR_STALE = 70, /* file no longer exists */ 11019 NFS4ERR_BADHANDLE = 10001,/* Illegal filehandle */ 11020 NFS4ERR_BAD_COOKIE = 10003,/* READDIR cookie is stale */ 11021 NFS4ERR_NOTSUPP = 10004,/* operation not supported */ 11022 NFS4ERR_TOOSMALL = 10005,/* buffer too small */ 11023 NFS4ERR_SERVERFAULT = 10006,/* undefined server error */ 11024 NFS4ERR_BADTYPE = 10007,/* type invalid for CREATE */ 11025 NFS4ERR_DELAY = 10008,/* file "busy" - retry */ 11026 NFS4ERR_SAME = 10009,/* nverify says attrs same */ 11027 NFS4ERR_DENIED = 10010,/* lock unavailable */ 11028 NFS4ERR_EXPIRED = 10011,/* lock lease expired */ 11029 NFS4ERR_LOCKED = 10012,/* I/O failed due to lock */ 11030 NFS4ERR_GRACE = 10013,/* in grace period */ 11031 NFS4ERR_FHEXPIRED = 10014,/* filehandle expired */ 11032 NFS4ERR_SHARE_DENIED = 10015,/* share reserve denied */ 11033 NFS4ERR_WRONGSEC = 10016,/* wrong security flavor */ 11034 NFS4ERR_CLID_INUSE = 10017,/* clientid in use */ 11035 NFS4ERR_RESOURCE = 10018,/* resource exhaustion */ 11036 NFS4ERR_MOVED = 10019,/* filesystem relocated */ 11037 NFS4ERR_NOFILEHANDLE = 10020,/* current FH is not set */ 11038 NFS4ERR_MINOR_VERS_MISMATCH = 10021,/* minor vers not supp */ 11039 NFS4ERR_STALE_CLIENTID = 10022,/* server has rebooted */ 11040 NFS4ERR_STALE_STATEID = 10023,/* server has rebooted */ 11041 NFS4ERR_OLD_STATEID = 10024,/* state is out of sync */ 11042 NFS4ERR_BAD_STATEID = 10025,/* incorrect stateid */ 11043 NFS4ERR_BAD_SEQID = 10026,/* request is out of seq.
*/ 11044 NFS4ERR_NOT_SAME = 10027,/* verify - attrs not same */ 11045 NFS4ERR_LOCK_RANGE = 10028,/* lock range not supported*/ 11046 NFS4ERR_SYMLINK = 10029,/* should be file/directory*/ 11047 NFS4ERR_READDIR_NOSPC = 10030,/* response limit exceeded */ 11048 NFS4ERR_LEASE_MOVED = 10031,/* some filesystem moved */ 11049 NFS4ERR_ATTRNOTSUPP = 10032,/* recommended attr not sup*/ 11050 NFS4ERR_NO_GRACE = 10033,/* reclaim outside of grace*/ 11051 NFS4ERR_RECLAIM_BAD = 10034,/* reclaim error at server */ 11052 NFS4ERR_RECLAIM_CONFLICT = 10035,/* conflict on reclaim */ 11053 NFS4ERR_BADXDR = 10036,/* XDR decode failed */ 11054 NFS4ERR_LOCKS_HELD = 10037,/* file locks held at CLOSE*/ 11058 NFS4ERR_OPENMODE = 10038,/* conflict in OPEN and I/O*/ 11059 NFS4ERR_BADOWNER = 10039,/* owner translation bad */ 11060 NFS4ERR_BADCHAR = 10040,/* utf-8 char not supported*/ 11061 NFS4ERR_BADNAME = 10041,/* name not supported */ 11062 NFS4ERR_BAD_RANGE = 10042,/* lock range not supported*/ 11063 NFS4ERR_LOCK_NOTSUPP = 10043,/* no atomic up/downgrade */ 11064 NFS4ERR_OP_ILLEGAL = 10044,/* undefined operation */ 11065 NFS4ERR_DEADLOCK = 10045,/* file locking deadlock */ 11066 NFS4ERR_FILE_OPEN = 10046 /* open file blocks op.
*/ 11067 }; 11069 /* 11070 * Basic data types 11071 */ 11072 typedef uint32_t bitmap4<>; 11073 typedef uint64_t offset4; 11074 typedef uint32_t count4; 11075 typedef uint64_t length4; 11076 typedef uint64_t clientid4; 11077 typedef uint32_t seqid4; 11078 typedef opaque utf8string<>; 11079 typedef utf8string component4; 11080 typedef component4 pathname4<>; 11081 typedef uint64_t nfs_lockid4; 11082 typedef uint64_t nfs_cookie4; 11083 typedef utf8string linktext4; 11084 typedef opaque sec_oid4<>; 11085 typedef uint32_t qop4; 11086 typedef uint32_t mode4; 11087 typedef uint64_t changeid4; 11088 typedef opaque verifier4[NFS4_VERIFIER_SIZE]; 11090 /* 11091 * Timeval 11092 */ 11093 struct nfstime4 { 11094 int64_t seconds; 11095 uint32_t nseconds; 11096 }; 11098 enum time_how4 { 11099 SET_TO_SERVER_TIME4 = 0, 11100 SET_TO_CLIENT_TIME4 = 1 11101 }; 11103 union settime4 switch (time_how4 set_it) { 11104 case SET_TO_CLIENT_TIME4: 11105 nfstime4 time; 11106 default: 11107 void; 11108 }; 11112 /* 11113 * File access handle 11114 */ 11115 typedef opaque nfs_fh4<NFS4_FHSIZE>; 11117 /* 11118 * File attribute definitions 11119 */ 11121 /* 11122 * FSID structure for major/minor 11123 */ 11124 struct fsid4 { 11125 uint64_t major; 11126 uint64_t minor; 11127 }; 11129 /* 11130 * Filesystem locations attribute for relocation/migration 11131 */ 11132 struct fs_location4 { 11133 utf8string server<>; 11134 pathname4 rootpath; 11135 }; 11137 struct fs_locations4 { 11138 pathname4 fs_root; 11139 fs_location4 locations<>; 11140 }; 11142 /* 11143 * Various Access Control Entry definitions 11144 */ 11146 /* 11147 * Mask that indicates which Access Control Entries are supported. 11148 * Values for the fattr4_aclsupport attribute.
11149 */ 11150 const ACL4_SUPPORT_ALLOW_ACL = 0x00000001; 11151 const ACL4_SUPPORT_DENY_ACL = 0x00000002; 11152 const ACL4_SUPPORT_AUDIT_ACL = 0x00000004; 11153 const ACL4_SUPPORT_ALARM_ACL = 0x00000008; 11155 typedef uint32_t acetype4; 11157 /* 11158 * acetype4 values, others can be added as needed. 11159 */ 11160 const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000; 11161 const ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001; 11163 Draft Specification NFS version 4 Protocol August 2002 11165 const ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002; 11166 const ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003; 11168 /* 11169 * ACE flag 11170 */ 11171 typedef uint32_t aceflag4; 11173 /* 11174 * ACE flag values 11175 */ 11176 const ACE4_FILE_INHERIT_ACE = 0x00000001; 11177 const ACE4_DIRECTORY_INHERIT_ACE = 0x00000002; 11178 const ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004; 11179 const ACE4_INHERIT_ONLY_ACE = 0x00000008; 11180 const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG = 0x00000010; 11181 const ACE4_FAILED_ACCESS_ACE_FLAG = 0x00000020; 11182 const ACE4_IDENTIFIER_GROUP = 0x00000040; 11184 /* 11185 * ACE mask 11186 */ 11187 typedef uint32_t acemask4; 11189 /* 11190 * ACE mask values 11191 */ 11192 const ACE4_READ_DATA = 0x00000001; 11193 const ACE4_LIST_DIRECTORY = 0x00000001; 11194 const ACE4_WRITE_DATA = 0x00000002; 11195 const ACE4_ADD_FILE = 0x00000002; 11196 const ACE4_APPEND_DATA = 0x00000004; 11197 const ACE4_ADD_SUBDIRECTORY = 0x00000004; 11198 const ACE4_READ_NAMED_ATTRS = 0x00000008; 11199 const ACE4_WRITE_NAMED_ATTRS = 0x00000010; 11200 const ACE4_EXECUTE = 0x00000020; 11201 const ACE4_DELETE_CHILD = 0x00000040; 11202 const ACE4_READ_ATTRIBUTES = 0x00000080; 11203 const ACE4_WRITE_ATTRIBUTES = 0x00000100; 11205 const ACE4_DELETE = 0x00010000; 11206 const ACE4_READ_ACL = 0x00020000; 11207 const ACE4_WRITE_ACL = 0x00040000; 11208 const ACE4_WRITE_OWNER = 0x00080000; 11209 const ACE4_SYNCHRONIZE = 0x00100000; 11211 /* 11212 * ACE4_GENERIC_READ -- defined as combination of 11213 * ACE4_READ_ACL | 
11214 * ACE4_READ_DATA | 11216 Draft Specification NFS version 4 Protocol August 2002 11218 * ACE4_READ_ATTRIBUTES | 11219 * ACE4_SYNCHRONIZE 11220 */ 11222 const ACE4_GENERIC_READ = 0x00120081; 11224 /* 11225 * ACE4_GENERIC_WRITE -- defined as combination of 11226 * ACE4_READ_ACL | 11227 * ACE4_WRITE_DATA | 11228 * ACE4_WRITE_ATTRIBUTES | 11229 * ACE4_WRITE_ACL | 11230 * ACE4_APPEND_DATA | 11231 * ACE4_SYNCHRONIZE 11232 */ 11233 const ACE4_GENERIC_WRITE = 0x00160106; 11235 /* 11236 * ACE4_GENERIC_EXECUTE -- defined as combination of 11237 * ACE4_READ_ACL 11238 * ACE4_READ_ATTRIBUTES 11239 * ACE4_EXECUTE 11240 * ACE4_SYNCHRONIZE 11241 */ 11242 const ACE4_GENERIC_EXECUTE = 0x001200A0; 11244 /* 11245 * Access Control Entry definition 11246 */ 11247 struct nfsace4 { 11248 acetype4 type; 11249 aceflag4 flag; 11250 acemask4 access_mask; 11251 utf8string who; 11252 }; 11254 /* 11255 * Field definitions for the fattr4_mode attribute 11256 */ 11257 const MODE4_SUID = 0x800; /* set user id on execution */ 11258 const MODE4_SGID = 0x400; /* set group id on execution */ 11259 const MODE4_SVTX = 0x200; /* save text even after use */ 11260 const MODE4_RUSR = 0x100; /* read permission: owner */ 11261 const MODE4_WUSR = 0x080; /* write permission: owner */ 11262 const MODE4_XUSR = 0x040; /* execute permission: owner */ 11263 const MODE4_RGRP = 0x020; /* read permission: group */ 11264 const MODE4_WGRP = 0x010; /* write permission: group */ 11265 const MODE4_XGRP = 0x008; /* execute permission: group */ 11266 const MODE4_ROTH = 0x004; /* read permission: other */ 11267 const MODE4_WOTH = 0x002; /* write permission: other */ 11269 Draft Specification NFS version 4 Protocol August 2002 11271 const MODE4_XOTH = 0x001; /* execute permission: other */ 11273 /* 11274 * Special data/attribute associated with 11275 * file types NF4BLK and NF4CHR. 
11276 */ 11277 struct specdata4 { 11278 uint32_t specdata1; /* major device number */ 11279 uint32_t specdata2; /* minor device number */ 11280 }; 11282 /* 11283 * Values for fattr4_fh_expire_type 11284 */ 11285 const FH4_PERSISTENT = 0x00000000; 11286 const FH4_NOEXPIRE_WITH_OPEN = 0x00000001; 11287 const FH4_VOLATILE_ANY = 0x00000002; 11288 const FH4_VOL_MIGRATION = 0x00000004; 11289 const FH4_VOL_RENAME = 0x00000008; 11291 typedef bitmap4 fattr4_supported_attrs; 11292 typedef nfs_ftype4 fattr4_type; 11293 typedef uint32_t fattr4_fh_expire_type; 11294 typedef changeid4 fattr4_change; 11295 typedef uint64_t fattr4_size; 11296 typedef bool fattr4_link_support; 11297 typedef bool fattr4_symlink_support; 11298 typedef bool fattr4_named_attr; 11299 typedef fsid4 fattr4_fsid; 11300 typedef bool fattr4_unique_handles; 11301 typedef uint32_t fattr4_lease_time; 11302 typedef nfsstat4 fattr4_rdattr_error; 11304 typedef nfsace4 fattr4_acl<>; 11305 typedef uint32_t fattr4_aclsupport; 11306 typedef bool fattr4_archive; 11307 typedef bool fattr4_cansettime; 11308 typedef bool fattr4_case_insensitive; 11309 typedef bool fattr4_case_preserving; 11310 typedef bool fattr4_chown_restricted; 11311 typedef uint64_t fattr4_fileid; 11312 typedef uint64_t fattr4_files_avail; 11313 typedef nfs_fh4 fattr4_filehandle; 11314 typedef uint64_t fattr4_files_free; 11315 typedef uint64_t fattr4_files_total; 11316 typedef fs_locations4 fattr4_fs_locations; 11317 typedef bool fattr4_hidden; 11318 typedef bool fattr4_homogeneous; 11319 typedef uint64_t fattr4_maxfilesize; 11320 typedef uint32_t fattr4_maxlink; 11321 typedef uint32_t fattr4_maxname; 11323 Draft Specification NFS version 4 Protocol August 2002 11325 typedef uint64_t fattr4_maxread; 11326 typedef uint64_t fattr4_maxwrite; 11327 typedef utf8string fattr4_mimetype; 11328 typedef mode4 fattr4_mode; 11329 typedef uint64_t fattr4_mounted_on_fileid; 11330 typedef bool fattr4_no_trunc; 11331 typedef uint32_t fattr4_numlinks; 11332 typedef 
utf8string fattr4_owner; 11333 typedef utf8string fattr4_owner_group; 11334 typedef uint64_t fattr4_quota_avail_hard; 11335 typedef uint64_t fattr4_quota_avail_soft; 11336 typedef uint64_t fattr4_quota_used; 11337 typedef specdata4 fattr4_rawdev; 11338 typedef uint64_t fattr4_space_avail; 11339 typedef uint64_t fattr4_space_free; 11340 typedef uint64_t fattr4_space_total; 11341 typedef uint64_t fattr4_space_used; 11342 typedef bool fattr4_system; 11343 typedef nfstime4 fattr4_time_access; 11344 typedef settime4 fattr4_time_access_set; 11345 typedef nfstime4 fattr4_time_backup; 11346 typedef nfstime4 fattr4_time_create; 11347 typedef nfstime4 fattr4_time_delta; 11348 typedef nfstime4 fattr4_time_metadata; 11349 typedef nfstime4 fattr4_time_modify; 11350 typedef settime4 fattr4_time_modify_set; 11352 /* 11353 * Mandatory Attributes 11354 */ 11355 const FATTR4_SUPPORTED_ATTRS = 0; 11356 const FATTR4_TYPE = 1; 11357 const FATTR4_FH_EXPIRE_TYPE = 2; 11358 const FATTR4_CHANGE = 3; 11359 const FATTR4_SIZE = 4; 11360 const FATTR4_LINK_SUPPORT = 5; 11361 const FATTR4_SYMLINK_SUPPORT = 6; 11362 const FATTR4_NAMED_ATTR = 7; 11363 const FATTR4_FSID = 8; 11364 const FATTR4_UNIQUE_HANDLES = 9; 11365 const FATTR4_LEASE_TIME = 10; 11366 const FATTR4_RDATTR_ERROR = 11; 11367 const FATTR4_FILEHANDLE = 19; 11369 /* 11370 * Recommended Attributes 11371 */ 11372 const FATTR4_ACL = 12; 11373 const FATTR4_ACLSUPPORT = 13; 11374 const FATTR4_ARCHIVE = 14; 11375 const FATTR4_CANSETTIME = 15; 11377 Draft Specification NFS version 4 Protocol August 2002 11379 const FATTR4_CASE_INSENSITIVE = 16; 11380 const FATTR4_CASE_PRESERVING = 17; 11381 const FATTR4_CHOWN_RESTRICTED = 18; 11382 const FATTR4_FILEID = 20; 11383 const FATTR4_FILES_AVAIL = 21; 11384 const FATTR4_FILES_FREE = 22; 11385 const FATTR4_FILES_TOTAL = 23; 11386 const FATTR4_FS_LOCATIONS = 24; 11387 const FATTR4_HIDDEN = 25; 11388 const FATTR4_HOMOGENEOUS = 26; 11389 const FATTR4_MAXFILESIZE = 27; 11390 const FATTR4_MAXLINK = 28; 
11391 const FATTR4_MAXNAME = 29; 11392 const FATTR4_MAXREAD = 30; 11393 const FATTR4_MAXWRITE = 31; 11394 const FATTR4_MIMETYPE = 32; 11395 const FATTR4_MODE = 33; 11396 const FATTR4_NO_TRUNC = 34; 11397 const FATTR4_NUMLINKS = 35; 11398 const FATTR4_OWNER = 36; 11399 const FATTR4_OWNER_GROUP = 37; 11400 const FATTR4_QUOTA_AVAIL_HARD = 38; 11401 const FATTR4_QUOTA_AVAIL_SOFT = 39; 11402 const FATTR4_QUOTA_USED = 40; 11403 const FATTR4_RAWDEV = 41; 11404 const FATTR4_SPACE_AVAIL = 42; 11405 const FATTR4_SPACE_FREE = 43; 11406 const FATTR4_SPACE_TOTAL = 44; 11407 const FATTR4_SPACE_USED = 45; 11408 const FATTR4_SYSTEM = 46; 11409 const FATTR4_TIME_ACCESS = 47; 11410 const FATTR4_TIME_ACCESS_SET = 48; 11411 const FATTR4_TIME_BACKUP = 49; 11412 const FATTR4_TIME_CREATE = 50; 11413 const FATTR4_TIME_DELTA = 51; 11414 const FATTR4_TIME_METADATA = 52; 11415 const FATTR4_TIME_MODIFY = 53; 11416 const FATTR4_TIME_MODIFY_SET = 54; 11417 const FATTR4_MOUNTED_ON_FILEID = 55; 11419 typedef opaque attrlist4<>; 11421 /* 11422 * File attribute container 11423 */ 11424 struct fattr4 { 11425 bitmap4 attrmask; 11426 attrlist4 attr_vals; 11427 }; 11429 /* 11430 * Change info for the client 11434 */ 11435 struct change_info4 { 11436 bool atomic; 11437 changeid4 before; 11438 changeid4 after; 11439 }; 11441 struct clientaddr4 { 11442 /* see struct rpcb in RFC 1833 */ 11443 string r_netid<>; /* network id */ 11444 string r_addr<>; /* universal address */ 11445 }; 11447 /* 11448 * Callback program info as provided by the client 11449 */ 11450 struct cb_client4 { 11451 uint32_t cb_program; 11452 clientaddr4 cb_location; 11453 }; 11455 /* 11456 * Stateid 11457 */ 11458 struct stateid4 { 11459 uint32_t seqid; 11460 opaque other[12]; 11461 }; 11463 /* 11464 * Client ID 11465 */ 11466 struct nfs_client_id4 { 11467 verifier4 verifier; 11468 opaque id<NFS4_OPAQUE_LIMIT>; 11469 }; 11471 struct open_owner4 { 11472 clientid4 clientid; 11473 opaque owner<NFS4_OPAQUE_LIMIT>;
11474 }; 11476 struct lock_owner4 { 11477 clientid4 clientid; 11478 opaque owner<NFS4_OPAQUE_LIMIT>; 11479 }; 11481 enum nfs_lock_type4 { 11482 READ_LT = 1, 11483 WRITE_LT = 2, 11484 READW_LT = 3, /* blocking read */ 11485 WRITEW_LT = 4 /* blocking write */ 11489 }; 11491 /* 11492 * ACCESS: Check access permission 11493 */ 11494 const ACCESS4_READ = 0x00000001; 11495 const ACCESS4_LOOKUP = 0x00000002; 11496 const ACCESS4_MODIFY = 0x00000004; 11497 const ACCESS4_EXTEND = 0x00000008; 11498 const ACCESS4_DELETE = 0x00000010; 11499 const ACCESS4_EXECUTE = 0x00000020; 11501 struct ACCESS4args { 11502 /* CURRENT_FH: object */ 11503 uint32_t access; 11504 }; 11506 struct ACCESS4resok { 11507 uint32_t supported; 11508 uint32_t access; 11509 }; 11511 union ACCESS4res switch (nfsstat4 status) { 11512 case NFS4_OK: 11513 ACCESS4resok resok4; 11514 default: 11515 void; 11516 }; 11518 /* 11519 * CLOSE: Close a file and release share reservations 11520 */ 11521 struct CLOSE4args { 11522 /* CURRENT_FH: object */ 11523 seqid4 seqid; 11524 stateid4 open_stateid; 11525 }; 11527 union CLOSE4res switch (nfsstat4 status) { 11528 case NFS4_OK: 11529 stateid4 open_stateid; 11530 default: 11531 void; 11532 }; 11534 /* 11535 * COMMIT: Commit cached data on server to stable storage 11536 */ 11537 struct COMMIT4args { 11538 /* CURRENT_FH: file */ 11539 offset4 offset; 11540 count4 count; 11544 }; 11546 struct COMMIT4resok { 11547 verifier4 writeverf; 11548 }; 11550 union COMMIT4res switch (nfsstat4 status) { 11551 case NFS4_OK: 11552 COMMIT4resok resok4; 11553 default: 11554 void; 11555 }; 11557 /* 11558 * CREATE: Create a non-regular file 11559 */ 11560 union createtype4 switch (nfs_ftype4 type) { 11561 case NF4LNK: 11562 linktext4 linkdata; 11563 case NF4BLK: 11564 case NF4CHR: 11565 specdata4 devdata; 11566 case NF4SOCK: 11567 case NF4FIFO: 11568 case NF4DIR: 11569 void; 11570 default:
11571 void; /* server should return NFS4ERR_BADTYPE */ 11572 }; 11574 struct CREATE4args { 11575 /* CURRENT_FH: directory for creation */ 11576 createtype4 objtype; 11577 component4 objname; 11578 fattr4 createattrs; 11579 }; 11581 struct CREATE4resok { 11582 change_info4 cinfo; 11583 bitmap4 attrset; /* attributes set */ 11584 }; 11586 union CREATE4res switch (nfsstat4 status) { 11587 case NFS4_OK: 11588 CREATE4resok resok4; 11589 default: 11590 void; 11591 }; 11593 /* 11594 * DELEGPURGE: Purge Delegations Awaiting Recovery 11596 Draft Specification NFS version 4 Protocol August 2002 11598 */ 11599 struct DELEGPURGE4args { 11600 clientid4 clientid; 11601 }; 11603 struct DELEGPURGE4res { 11604 nfsstat4 status; 11605 }; 11607 /* 11608 * DELEGRETURN: Return a delegation 11609 */ 11610 struct DELEGRETURN4args { 11611 /* CURRENT_FH: delegated file */ 11612 stateid4 deleg_stateid; 11613 }; 11615 struct DELEGRETURN4res { 11616 nfsstat4 status; 11617 }; 11619 /* 11620 * GETATTR: Get file attributes 11621 */ 11622 struct GETATTR4args { 11623 /* CURRENT_FH: directory or file */ 11624 bitmap4 attr_request; 11625 }; 11627 struct GETATTR4resok { 11628 fattr4 obj_attributes; 11629 }; 11631 union GETATTR4res switch (nfsstat4 status) { 11632 case NFS4_OK: 11633 GETATTR4resok resok4; 11634 default: 11635 void; 11636 }; 11638 /* 11639 * GETFH: Get current filehandle 11640 */ 11641 struct GETFH4resok { 11642 nfs_fh4 object; 11643 }; 11645 union GETFH4res switch (nfsstat4 status) { 11646 case NFS4_OK: 11647 GETFH4resok resok4; 11648 default: 11649 void; 11651 Draft Specification NFS version 4 Protocol August 2002 11653 }; 11655 /* 11656 * LINK: Create link to an object 11657 */ 11658 struct LINK4args { 11659 /* SAVED_FH: source object */ 11660 /* CURRENT_FH: target directory */ 11661 component4 newname; 11662 }; 11664 struct LINK4resok { 11665 change_info4 cinfo; 11666 }; 11668 union LINK4res switch (nfsstat4 status) { 11669 case NFS4_OK: 11670 LINK4resok resok4; 11671 default: 11672 
void; 11673 }; 11675 /* 11676 * For LOCK, transition from open_owner to new lock_owner 11677 */ 11678 struct open_to_lock_owner4 { 11679 seqid4 open_seqid; 11680 stateid4 open_stateid; 11681 seqid4 lock_seqid; 11682 lock_owner4 lock_owner; 11683 }; 11685 /* 11686 * For LOCK, existing lock_owner continues to request file locks 11687 */ 11688 struct exist_lock_owner4 { 11689 stateid4 lock_stateid; 11690 seqid4 lock_seqid; 11691 }; 11693 union locker4 switch (bool new_lock_owner) { 11694 case TRUE: 11695 open_to_lock_owner4 open_owner; 11696 case FALSE: 11697 exist_lock_owner4 lock_owner; 11698 }; 11700 /* 11701 * LOCK/LOCKT/LOCKU: Record lock management 11702 */ 11703 struct LOCK4args { 11704 /* CURRENT_FH: file */ 11706 Draft Specification NFS version 4 Protocol August 2002 11708 nfs_lock_type4 locktype; 11709 bool reclaim; 11710 offset4 offset; 11711 length4 length; 11712 locker4 locker; 11713 }; 11715 struct LOCK4denied { 11716 offset4 offset; 11717 length4 length; 11718 nfs_lock_type4 locktype; 11719 lock_owner4 owner; 11720 }; 11722 struct LOCK4resok { 11723 stateid4 lock_stateid; 11724 }; 11726 union LOCK4res switch (nfsstat4 status) { 11727 case NFS4_OK: 11728 LOCK4resok resok4; 11729 case NFS4ERR_DENIED: 11730 LOCK4denied denied; 11731 default: 11732 void; 11733 }; 11735 struct LOCKT4args { 11736 /* CURRENT_FH: file */ 11737 nfs_lock_type4 locktype; 11738 offset4 offset; 11739 length4 length; 11740 lock_owner4 owner; 11741 }; 11743 union LOCKT4res switch (nfsstat4 status) { 11744 case NFS4ERR_DENIED: 11745 LOCK4denied denied; 11746 case NFS4_OK: 11747 void; 11748 default: 11749 void; 11750 }; 11752 struct LOCKU4args { 11753 /* CURRENT_FH: file */ 11754 nfs_lock_type4 locktype; 11755 seqid4 seqid; 11756 stateid4 lock_stateid; 11757 offset4 offset; 11758 length4 length; 11759 }; 11761 Draft Specification NFS version 4 Protocol August 2002 11763 union LOCKU4res switch (nfsstat4 status) { 11764 case NFS4_OK: 11765 stateid4 lock_stateid; 11766 default: 11767 void; 
11768  };

11770  /*
11771   * LOOKUP: Lookup filename
11772   */
11773  struct LOOKUP4args {
11774      /* CURRENT_FH: directory */
11775      component4 objname;
11776  };

11778  struct LOOKUP4res {
11779      /* CURRENT_FH: object */
11780      nfsstat4 status;
11781  };

11783  /*
11784   * LOOKUPP: Lookup parent directory
11785   */
11786  struct LOOKUPP4res {
11787      /* CURRENT_FH: directory */
11788      nfsstat4 status;
11789  };

11791  /*
11792   * NVERIFY: Verify attributes different
11793   */
11794  struct NVERIFY4args {
11795      /* CURRENT_FH: object */
11796      fattr4 obj_attributes;
11797  };

11799  struct NVERIFY4res {
11800      nfsstat4 status;
11801  };

11803  /*
11804   * Various definitions for OPEN
11805   */
11806  enum createmode4 {
11807      UNCHECKED4 = 0,
11808      GUARDED4 = 1,
11809      EXCLUSIVE4 = 2
11810  };

11812  union createhow4 switch (createmode4 mode) {
11813  case UNCHECKED4:
11814  case GUARDED4:
11818      fattr4 createattrs;
11819  case EXCLUSIVE4:
11820      verifier4 createverf;
11821  };

11823  enum opentype4 {
11824      OPEN4_NOCREATE = 0,
11825      OPEN4_CREATE = 1
11826  };

11828  union openflag4 switch (opentype4 opentype) {
11829  case OPEN4_CREATE:
11830      createhow4 how;
11831  default:
11832      void;
11833  };

11835  /* Next definitions used for OPEN delegation */
11836  enum limit_by4 {
11837      NFS_LIMIT_SIZE = 1,
11838      NFS_LIMIT_BLOCKS = 2
11839      /* others as needed */
11840  };

11842  struct nfs_modified_limit4 {
11843      uint32_t num_blocks;
11844      uint32_t bytes_per_block;
11845  };

11847  union nfs_space_limit4 switch (limit_by4 limitby) {
11848  /* limit specified as file size */
11849  case NFS_LIMIT_SIZE:
11850      uint64_t filesize;
11851  /* limit specified by number of blocks */
11852  case NFS_LIMIT_BLOCKS:
11853      nfs_modified_limit4 mod_blocks;
11854  };

11856  /*
11857   * Share Access and Deny constants for open argument
11858   */
11859  const OPEN4_SHARE_ACCESS_READ = 0x00000001;
11860  const OPEN4_SHARE_ACCESS_WRITE = 0x00000002;
11861  const OPEN4_SHARE_ACCESS_BOTH = 0x00000003;

11863  const OPEN4_SHARE_DENY_NONE = 0x00000000;
11864  const OPEN4_SHARE_DENY_READ = 0x00000001;
11865  const OPEN4_SHARE_DENY_WRITE = 0x00000002;
11866  const OPEN4_SHARE_DENY_BOTH = 0x00000003;

11868  enum open_delegation_type4 {
11869      OPEN_DELEGATE_NONE = 0,
11873      OPEN_DELEGATE_READ = 1,
11874      OPEN_DELEGATE_WRITE = 2
11875  };

11877  enum open_claim_type4 {
11878      CLAIM_NULL = 0,
11879      CLAIM_PREVIOUS = 1,
11880      CLAIM_DELEGATE_CUR = 2,
11881      CLAIM_DELEGATE_PREV = 3
11882  };

11884  struct open_claim_delegate_cur4 {
11885      stateid4 delegate_stateid;
11886      component4 file;
11887  };

11889  union open_claim4 switch (open_claim_type4 claim) {
11890  /*
11891   * No special rights to file. Ordinary OPEN of the specified file.
11892   */
11893  case CLAIM_NULL:
11894      /* CURRENT_FH: directory */
11895      component4 file;

11897  /*
11898   * Right to the file established by an open previous to server
11899   * reboot. File identified by filehandle obtained at that time
11900   * rather than by name.
11901   */
11902  case CLAIM_PREVIOUS:
11903      /* CURRENT_FH: file being reclaimed */
11904      open_delegation_type4 delegate_type;

11906  /*
11907   * Right to file based on a delegation granted by the server.
11908   * File is specified by name.
11909   */
11910  case CLAIM_DELEGATE_CUR:
11911      /* CURRENT_FH: directory */
11912      open_claim_delegate_cur4 delegate_cur_info;

11914  /* Right to file based on a delegation granted to a previous boot
11915   * instance of the client. File is specified by name.
11916   */
11917  case CLAIM_DELEGATE_PREV:
11918      /* CURRENT_FH: directory */
11919      component4 file_delegate_prev;
11920  };

11922  /*
11923   * OPEN: Open a file, potentially receiving an open delegation
11924   */
11928  struct OPEN4args {
11929      seqid4 seqid;
11930      uint32_t share_access;
11931      uint32_t share_deny;
11932      open_owner4 owner;
11933      openflag4 openhow;
11934      open_claim4 claim;
11935  };

11937  struct open_read_delegation4 {
11938      stateid4 stateid;    /* Stateid for delegation */
11939      bool recall;         /* Pre-recalled flag for
11940                              delegations obtained
11941                              by reclaim
11942                              (CLAIM_PREVIOUS) */
11943      nfsace4 permissions; /* Defines users who don't
11944                              need an ACCESS call to
11945                              open for read */
11946  };

11948  struct open_write_delegation4 {
11949      stateid4 stateid;    /* Stateid for delegation */
11950      bool recall;         /* Pre-recalled flag for
11951                              delegations obtained
11952                              by reclaim
11953                              (CLAIM_PREVIOUS) */
11954      nfs_space_limit4 space_limit; /* Defines condition that
11955                                       the client must check to
11956                                       determine whether the
11957                                       file needs to be flushed
11958                                       to the server on close.
11959                                     */
11960      nfsace4 permissions; /* Defines users who don't
11961                              need an ACCESS call as
11962                              part of a delegated
11963                              open.
*/
11964  };

11966  union open_delegation4
11967  switch (open_delegation_type4 delegation_type) {
11968  case OPEN_DELEGATE_NONE:
11969      void;
11970  case OPEN_DELEGATE_READ:
11971      open_read_delegation4 read;
11972  case OPEN_DELEGATE_WRITE:
11973      open_write_delegation4 write;
11974  };

11976  /*
11977   * Result flags
11978   */
11979  /* Client must confirm open */
11983  const OPEN4_RESULT_CONFIRM = 0x00000002;
11984  /* Type of file locking behavior at the server */
11985  const OPEN4_RESULT_LOCKTYPE_POSIX = 0x00000004;

11987  struct OPEN4resok {
11988      stateid4 stateid;            /* Stateid for open */
11989      change_info4 cinfo;          /* Directory Change Info */
11990      uint32_t rflags;             /* Result flags */
11991      bitmap4 attrset;             /* attribute set for create */
11992      open_delegation4 delegation; /* Info on any open
11993                                      delegation */
11994  };

11996  union OPEN4res switch (nfsstat4 status) {
11997  case NFS4_OK:
11998      /* CURRENT_FH: opened file */
11999      OPEN4resok resok4;
12000  default:
12001      void;
12002  };

12004  /*
12005   * OPENATTR: open named attributes directory
12006   */
12007  struct OPENATTR4args {
12008      /* CURRENT_FH: object */
12009      bool createdir;
12010  };

12012  struct OPENATTR4res {
12013      /* CURRENT_FH: named attr directory */
12014      nfsstat4 status;
12015  };

12017  /*
12018   * OPEN_CONFIRM: confirm the open
12019   */
12020  struct OPEN_CONFIRM4args {
12021      /* CURRENT_FH: opened file */
12022      stateid4 open_stateid;
12023      seqid4 seqid;
12024  };

12026  struct OPEN_CONFIRM4resok {
12027      stateid4 open_stateid;
12028  };

12030  union OPEN_CONFIRM4res switch (nfsstat4 status) {
12031  case NFS4_OK:
12032      OPEN_CONFIRM4resok resok4;
12033  default:
12034      void;
12038  };

12040  /*
12041   * OPEN_DOWNGRADE: downgrade the access/deny for a file
12042   */
12043  struct OPEN_DOWNGRADE4args {
12044      /* CURRENT_FH: opened file */
12045      stateid4 open_stateid;
12046      seqid4 seqid;
12047      uint32_t share_access;
12048      uint32_t share_deny;
12049  };

12051  struct OPEN_DOWNGRADE4resok {
12052      stateid4 open_stateid;
12053  };

12055  union OPEN_DOWNGRADE4res switch (nfsstat4 status) {
12056  case NFS4_OK:
12057      OPEN_DOWNGRADE4resok resok4;
12058  default:
12059      void;
12060  };

12062  /*
12063   * PUTFH: Set current filehandle
12064   */
12065  struct PUTFH4args {
12066      nfs_fh4 object;
12067  };

12069  struct PUTFH4res {
12070      /* CURRENT_FH: */
12071      nfsstat4 status;
12072  };

12074  /*
12075   * PUTPUBFH: Set public filehandle
12076   */
12077  struct PUTPUBFH4res {
12078      /* CURRENT_FH: public fh */
12079      nfsstat4 status;
12080  };

12082  /*
12083   * PUTROOTFH: Set root filehandle
12084   */
12085  struct PUTROOTFH4res {
12086      /* CURRENT_FH: root fh */
12087      nfsstat4 status;
12088  };

12092  /*
12093   * READ: Read from file
12094   */
12095  struct READ4args {
12096      /* CURRENT_FH: file */
12097      stateid4 stateid;
12098      offset4 offset;
12099      count4 count;
12100  };

12102  struct READ4resok {
12103      bool eof;
12104      opaque data<>;
12105  };

12107  union READ4res switch (nfsstat4 status) {
12108  case NFS4_OK:
12109      READ4resok resok4;
12110  default:
12111      void;
12112  };

12114  /*
12115   * READDIR: Read directory
12116   */
12117  struct READDIR4args {
12118      /* CURRENT_FH: directory */
12119      nfs_cookie4 cookie;
12120      verifier4 cookieverf;
12121      count4 dircount;
12122      count4 maxcount;
12123      bitmap4 attr_request;
12124  };

12126  struct entry4 {
12127      nfs_cookie4 cookie;
12128      component4 name;
12129      fattr4 attrs;
12130      entry4 *nextentry;
12131  };

12133  struct dirlist4 {
12134      entry4 *entries;
12135      bool eof;
12136  };

12138  struct READDIR4resok {
12139      verifier4 cookieverf;
12140      dirlist4 reply;
12141  };

12145  union READDIR4res switch (nfsstat4 status) {
12146  case NFS4_OK:
12147      READDIR4resok resok4;
12148  default:
12149      void;
12150  };

12152  /*
12153   * READLINK: Read symbolic link
12154   */
12155  struct
READLINK4resok {
12156      linktext4 link;
12157  };

12159  union READLINK4res switch (nfsstat4 status) {
12160  case NFS4_OK:
12161      READLINK4resok resok4;
12162  default:
12163      void;
12164  };

12166  /*
12167   * REMOVE: Remove filesystem object
12168   */
12169  struct REMOVE4args {
12170      /* CURRENT_FH: directory */
12171      component4 target;
12172  };

12174  struct REMOVE4resok {
12175      change_info4 cinfo;
12176  };

12178  union REMOVE4res switch (nfsstat4 status) {
12179  case NFS4_OK:
12180      REMOVE4resok resok4;
12181  default:
12182      void;
12183  };

12185  /*
12186   * RENAME: Rename directory entry
12187   */
12188  struct RENAME4args {
12189      /* SAVED_FH: source directory */
12190      component4 oldname;
12191      /* CURRENT_FH: target directory */
12192      component4 newname;
12193  };

12195  struct RENAME4resok {
12199      change_info4 source_cinfo;
12200      change_info4 target_cinfo;
12201  };

12203  union RENAME4res switch (nfsstat4 status) {
12204  case NFS4_OK:
12205      RENAME4resok resok4;
12206  default:
12207      void;
12208  };

12210  /*
12211   * RENEW: Renew a Lease
12212   */
12213  struct RENEW4args {
12214      clientid4 clientid;
12215  };

12217  struct RENEW4res {
12218      nfsstat4 status;
12219  };

12221  /*
12222   * RESTOREFH: Restore saved filehandle
12223   */

12225  struct RESTOREFH4res {
12226      /* CURRENT_FH: value of saved fh */
12227      nfsstat4 status;
12228  };

12230  /*
12231   * SAVEFH: Save current filehandle
12232   */
12233  struct SAVEFH4res {
12234      /* SAVED_FH: value of current fh */
12235      nfsstat4 status;
12236  };

12238  /*
12239   * SECINFO: Obtain Available Security Mechanisms
12240   */
12241  struct SECINFO4args {
12242      /* CURRENT_FH: directory */
12243      component4 name;
12244  };

12246  /*
12247   * From RFC 2203
12248   */
12249  enum rpc_gss_svc_t {
12250      RPC_GSS_SVC_NONE = 1,
12254      RPC_GSS_SVC_INTEGRITY = 2,
12255      RPC_GSS_SVC_PRIVACY = 3
12256  };

12258  struct rpcsec_gss_info {
12259      sec_oid4 oid;
12260
           qop4 qop;
12261      rpc_gss_svc_t service;
12262  };

12264  /* RPCSEC_GSS has a value of '6' - See RFC 2203 */
12265  union secinfo4 switch (uint32_t flavor) {
12266  case RPCSEC_GSS:
12267      rpcsec_gss_info flavor_info;
12268  default:
12269      void;
12270  };

12272  typedef secinfo4 SECINFO4resok<>;

12274  union SECINFO4res switch (nfsstat4 status) {
12275  case NFS4_OK:
12276      SECINFO4resok resok4;
12277  default:
12278      void;
12279  };

12281  /*
12282   * SETATTR: Set attributes
12283   */
12284  struct SETATTR4args {
12285      /* CURRENT_FH: target object */
12286      stateid4 stateid;
12287      fattr4 obj_attributes;
12288  };

12290  struct SETATTR4res {
12291      nfsstat4 status;
12292      bitmap4 attrsset;
12293  };

12295  /*
12296   * SETCLIENTID
12297   */
12298  struct SETCLIENTID4args {
12299      nfs_client_id4 client;
12300      cb_client4 callback;
12301      uint32_t callback_ident;
12302  };

12304  struct SETCLIENTID4resok {
12305      clientid4 clientid;
12309      verifier4 setclientid_confirm;
12310  };

12312  union SETCLIENTID4res switch (nfsstat4 status) {
12313  case NFS4_OK:
12314      SETCLIENTID4resok resok4;
12315  case NFS4ERR_CLID_INUSE:
12316      clientaddr4 client_using;
12317  default:
12318      void;
12319  };

12321  struct SETCLIENTID_CONFIRM4args {
12322      clientid4 clientid;
12323      verifier4 setclientid_confirm;
12324  };

12326  struct SETCLIENTID_CONFIRM4res {
12327      nfsstat4 status;
12328  };

12330  /*
12331   * VERIFY: Verify attributes same
12332   */
12333  struct VERIFY4args {
12334      /* CURRENT_FH: object */
12335      fattr4 obj_attributes;
12336  };

12338  struct VERIFY4res {
12339      nfsstat4 status;
12340  };

12342  /*
12343   * WRITE: Write to file
12344   */
12345  enum stable_how4 {
12346      UNSTABLE4 = 0,
12347      DATA_SYNC4 = 1,
12348      FILE_SYNC4 = 2
12349  };

12351  struct WRITE4args {
12352      /* CURRENT_FH: file */
12353      stateid4 stateid;
12354      offset4 offset;
12355      stable_how4 stable;
12356      opaque data<>;
12357  };

12359  struct WRITE4resok {
12360      count4 count;
12364      stable_how4 committed;
12365      verifier4 writeverf;
12366  };

12368  union WRITE4res switch (nfsstat4 status) {
12369  case NFS4_OK:
12370      WRITE4resok resok4;
12371  default:
12372      void;
12373  };

12375  /*
12376   * RELEASE_LOCKOWNER: Notify server to release lockowner
12377   */
12378  struct RELEASE_LOCKOWNER4args {
12379      lock_owner4 lock_owner;
12380  };

12382  struct RELEASE_LOCKOWNER4res {
12383      nfsstat4 status;
12384  };

12386  /*
12387   * ILLEGAL: Response for illegal operation numbers
12388   */
12389  struct ILLEGAL4res {
12390      nfsstat4 status;
12391  };

12393  /*
12394   * Operation arrays
12395   */

12397  enum nfs_opnum4 {
12398      OP_ACCESS = 3,
12399      OP_CLOSE = 4,
12400      OP_COMMIT = 5,
12401      OP_CREATE = 6,
12402      OP_DELEGPURGE = 7,
12403      OP_DELEGRETURN = 8,
12404      OP_GETATTR = 9,
12405      OP_GETFH = 10,
12406      OP_LINK = 11,
12407      OP_LOCK = 12,
12408      OP_LOCKT = 13,
12409      OP_LOCKU = 14,
12410      OP_LOOKUP = 15,
12411      OP_LOOKUPP = 16,
12412      OP_NVERIFY = 17,
12413      OP_OPEN = 18,
12414      OP_OPENATTR = 19,
12415      OP_OPEN_CONFIRM = 20,
12419      OP_OPEN_DOWNGRADE = 21,
12420      OP_PUTFH = 22,
12421      OP_PUTPUBFH = 23,
12422      OP_PUTROOTFH = 24,
12423      OP_READ = 25,
12424      OP_READDIR = 26,
12425      OP_READLINK = 27,
12426      OP_REMOVE = 28,
12427      OP_RENAME = 29,
12428      OP_RENEW = 30,
12429      OP_RESTOREFH = 31,
12430      OP_SAVEFH = 32,
12431      OP_SECINFO = 33,
12432      OP_SETATTR = 34,
12433      OP_SETCLIENTID = 35,
12434      OP_SETCLIENTID_CONFIRM = 36,
12435      OP_VERIFY = 37,
12436      OP_WRITE = 38,
12437      OP_RELEASE_LOCKOWNER = 39,
12438      OP_ILLEGAL = 10044
12439  };

12441  union nfs_argop4 switch (nfs_opnum4 argop) {
12442  case OP_ACCESS: ACCESS4args opaccess;
12443  case OP_CLOSE: CLOSE4args opclose;
12444  case OP_COMMIT: COMMIT4args opcommit;
12445  case OP_CREATE: CREATE4args opcreate;
12446  case OP_DELEGPURGE: DELEGPURGE4args opdelegpurge;
12447  case OP_DELEGRETURN: DELEGRETURN4args opdelegreturn;
12448  case OP_GETATTR: GETATTR4args opgetattr;
12449  case OP_GETFH: void;
12450  case OP_LINK: LINK4args oplink;
12451  case OP_LOCK: LOCK4args oplock;
12452  case OP_LOCKT: LOCKT4args oplockt;
12453  case OP_LOCKU: LOCKU4args oplocku;
12454  case OP_LOOKUP: LOOKUP4args oplookup;
12455  case OP_LOOKUPP: void;
12456  case OP_NVERIFY: NVERIFY4args opnverify;
12457  case OP_OPEN: OPEN4args opopen;
12458  case OP_OPENATTR: OPENATTR4args opopenattr;
12459  case OP_OPEN_CONFIRM: OPEN_CONFIRM4args opopen_confirm;
12460  case OP_OPEN_DOWNGRADE: OPEN_DOWNGRADE4args opopen_downgrade;
12461  case OP_PUTFH: PUTFH4args opputfh;
12462  case OP_PUTPUBFH: void;
12463  case OP_PUTROOTFH: void;
12464  case OP_READ: READ4args opread;
12465  case OP_READDIR: READDIR4args opreaddir;
12466  case OP_READLINK: void;
12467  case OP_REMOVE: REMOVE4args opremove;
12468  case OP_RENAME: RENAME4args oprename;
12469  case OP_RENEW: RENEW4args oprenew;
12470  case OP_RESTOREFH: void;
12474  case OP_SAVEFH: void;
12475  case OP_SECINFO: SECINFO4args opsecinfo;
12476  case OP_SETATTR: SETATTR4args opsetattr;
12477  case OP_SETCLIENTID: SETCLIENTID4args opsetclientid;
12478  case OP_SETCLIENTID_CONFIRM: SETCLIENTID_CONFIRM4args
12479      opsetclientid_confirm;
12480  case OP_VERIFY: VERIFY4args opverify;
12481  case OP_WRITE: WRITE4args opwrite;
12482  case OP_RELEASE_LOCKOWNER: RELEASE_LOCKOWNER4args
12483      oprelease_lockowner;
12484  case OP_ILLEGAL: void;
12485  };

12487  union nfs_resop4 switch (nfs_opnum4 resop) {
12488  case OP_ACCESS: ACCESS4res opaccess;
12489  case OP_CLOSE: CLOSE4res opclose;
12490  case OP_COMMIT: COMMIT4res opcommit;
12491  case OP_CREATE: CREATE4res opcreate;
12492  case OP_DELEGPURGE: DELEGPURGE4res opdelegpurge;
12493  case OP_DELEGRETURN: DELEGRETURN4res opdelegreturn;
12494  case OP_GETATTR: GETATTR4res opgetattr;
12495  case OP_GETFH: GETFH4res opgetfh;
12496  case OP_LINK: LINK4res oplink;
12497  case OP_LOCK: LOCK4res oplock;
12498  case OP_LOCKT: LOCKT4res oplockt;
12499  case OP_LOCKU: LOCKU4res
oplocku;
12500  case OP_LOOKUP: LOOKUP4res oplookup;
12501  case OP_LOOKUPP: LOOKUPP4res oplookupp;
12502  case OP_NVERIFY: NVERIFY4res opnverify;
12503  case OP_OPEN: OPEN4res opopen;
12504  case OP_OPENATTR: OPENATTR4res opopenattr;
12505  case OP_OPEN_CONFIRM: OPEN_CONFIRM4res opopen_confirm;
12506  case OP_OPEN_DOWNGRADE: OPEN_DOWNGRADE4res opopen_downgrade;
12507  case OP_PUTFH: PUTFH4res opputfh;
12508  case OP_PUTPUBFH: PUTPUBFH4res opputpubfh;
12509  case OP_PUTROOTFH: PUTROOTFH4res opputrootfh;
12510  case OP_READ: READ4res opread;
12511  case OP_READDIR: READDIR4res opreaddir;
12512  case OP_READLINK: READLINK4res opreadlink;
12513  case OP_REMOVE: REMOVE4res opremove;
12514  case OP_RENAME: RENAME4res oprename;
12515  case OP_RENEW: RENEW4res oprenew;
12516  case OP_RESTOREFH: RESTOREFH4res oprestorefh;
12517  case OP_SAVEFH: SAVEFH4res opsavefh;
12518  case OP_SECINFO: SECINFO4res opsecinfo;
12519  case OP_SETATTR: SETATTR4res opsetattr;
12520  case OP_SETCLIENTID: SETCLIENTID4res opsetclientid;
12521  case OP_SETCLIENTID_CONFIRM: SETCLIENTID_CONFIRM4res
12522      opsetclientid_confirm;
12523  case OP_VERIFY: VERIFY4res opverify;
12524  case OP_WRITE: WRITE4res opwrite;
12525  case OP_RELEASE_LOCKOWNER: RELEASE_LOCKOWNER4res
12529      oprelease_lockowner;
12530  case OP_ILLEGAL: ILLEGAL4res opillegal;
12531  };

12533  struct COMPOUND4args {
12534      utf8string tag;
12535      uint32_t minorversion;
12536      nfs_argop4 argarray<>;
12537  };

12539  struct COMPOUND4res {
12540      nfsstat4 status;
12541      utf8string tag;
12542      nfs_resop4 resarray<>;
12543  };

12545  /*
12546   * Remote file service routines
12547   */
12548  program NFS4_PROGRAM {
12549      version NFS_V4 {
12550          void
12551              NFSPROC4_NULL(void) = 0;

12553          COMPOUND4res
12554              NFSPROC4_COMPOUND(COMPOUND4args) = 1;

12556      } = 4;
12557  } = 100003;

12559  /*
12560   * NFS4 Callback Procedure Definitions and Program
12561   */

12563  /*
12564   * CB_GETATTR: Get Current Attributes
12565   */
12566  struct
CB_GETATTR4args {
12567      nfs_fh4 fh;
12568      bitmap4 attr_request;
12569  };

12571  struct CB_GETATTR4resok {
12572      fattr4 obj_attributes;
12573  };

12575  union CB_GETATTR4res switch (nfsstat4 status) {
12576  case NFS4_OK:
12577      CB_GETATTR4resok resok4;
12578  default:
12582      void;
12583  };

12585  /*
12586   * CB_RECALL: Recall an Open Delegation
12587   */
12588  struct CB_RECALL4args {
12589      stateid4 stateid;
12590      bool truncate;
12591      nfs_fh4 fh;
12592  };

12594  struct CB_RECALL4res {
12595      nfsstat4 status;
12596  };

12598  /*
12599   * CB_ILLEGAL: Response for illegal operation numbers
12600   */
12601  struct CB_ILLEGAL4res {
12602      nfsstat4 status;
12603  };

12605  /*
12606   * Various definitions for CB_COMPOUND
12607   */
12608  enum nfs_cb_opnum4 {
12609      OP_CB_GETATTR = 3,
12610      OP_CB_RECALL = 4,
12611      OP_CB_ILLEGAL = 10044
12612  };

12614  union nfs_cb_argop4 switch (unsigned argop) {
12615  case OP_CB_GETATTR: CB_GETATTR4args opcbgetattr;
12616  case OP_CB_RECALL: CB_RECALL4args opcbrecall;
12617  case OP_CB_ILLEGAL: void;
12618  };

12620  union nfs_cb_resop4 switch (unsigned resop) {
12621  case OP_CB_GETATTR: CB_GETATTR4res opcbgetattr;
12622  case OP_CB_RECALL: CB_RECALL4res opcbrecall;
12623  case OP_CB_ILLEGAL: CB_ILLEGAL4res opcbillegal;
12624  };

12626  struct CB_COMPOUND4args {
12627      utf8string tag;
12628      uint32_t minorversion;
12629      uint32_t callback_ident;
12630      nfs_cb_argop4 argarray<>;
12631  };

12633  struct CB_COMPOUND4res {
12637      nfsstat4 status;
12638      utf8string tag;
12639      nfs_cb_resop4 resarray<>;
12640  };

12642  /*
12643   * Program number is in the transient range since the client
12644   * will assign the exact transient program number and provide
12645   * that to the server via the SETCLIENTID operation.
12646   */
12647  program NFS4_CALLBACK {
12648      version NFS_CB {
12649          void
12650              CB_NULL(void) = 0;
12651          CB_COMPOUND4res
12652              CB_COMPOUND(CB_COMPOUND4args) = 1;
12653      } = 1;
12654  } = 0x40000000;

12658  19. Bibliography

12660  [Floyd]
12661  S. Floyd, V. Jacobson, "The Synchronization of Periodic Routing
12662  Messages," IEEE/ACM Transactions on Networking, 2(2), pp. 122-136,
12663  April 1994.

12665  [Gray]
12666  C. Gray, D. Cheriton, "Leases: An Efficient Fault-Tolerant Mechanism
12667  for Distributed File Cache Consistency," Proceedings of the Twelfth
12668  Symposium on Operating Systems Principles, pp. 202-210, December 1989.

12670  [ISO10646]
12671  "ISO/IEC 10646-1:1993. International Standard -- Information
12672  technology -- Universal Multiple-Octet Coded Character Set (UCS) --
12673  Part 1: Architecture and Basic Multilingual Plane."

12675  [Juszczak]
12676  Juszczak, Chet, "Improving the Performance and Correctness of an NFS
12677  Server," USENIX Conference Proceedings, USENIX Association, Berkeley,
12678  CA, June 1990, pages 53-63. Describes reply cache implementation
12679  that avoids work in the server by handling duplicate requests. More
12680  important, though listed as a side-effect, the reply cache aids in
12681  the avoidance of destructive non-idempotent operation re-application
12682  -- improving correctness.

12684  [Kazar]
12685  Kazar, Michael Leon, "Synchronization and Caching Issues in the
12686  Andrew File System," USENIX Conference Proceedings, USENIX
12687  Association, Berkeley, CA, Dallas Winter 1988, pages 27-36. A
12688  description of the cache consistency scheme in AFS. Contrasted with
12689  other distributed file systems.

12691  [Macklem]
12692  Macklem, Rick, "Lessons Learned Tuning the 4.3BSD Reno Implementation
12693  of the NFS Protocol," Winter USENIX Conference Proceedings, USENIX
12694  Association, Berkeley, CA, January 1991.
Describes performance work
12695  in tuning the 4.3BSD Reno NFS implementation. Describes performance
12696  improvement (reduced CPU loading) through elimination of data copies.

12698  [Mogul]
12699  Mogul, Jeffrey C., "A Recovery Protocol for Spritely NFS," USENIX
12700  File System Workshop Proceedings, Ann Arbor, MI, USENIX Association,
12701  Berkeley, CA, May 1992. Second paper on Spritely NFS proposes a
12702  lease-based scheme for recovering state of consistency protocol.

12706  [Nowicki]
12707  Nowicki, Bill, "Transport Issues in the Network File System," ACM
12708  SIGCOMM newsletter Computer Communication Review, April 1989. A
12709  brief description of the basis for the dynamic retransmission work.

12711  [Pawlowski]
12712  Pawlowski, Brian, Ron Hixon, Mark Stein, Joseph Tumminaro, "Network
12713  Computing in the UNIX and IBM Mainframe Environment," Uniforum '89
12714  Conf. Proc., (1989). Description of an NFS server implementation for
12715  IBM's MVS operating system.

12717  [RFC1094]
12718  Sun Microsystems, Inc., "NFS: Network File System Protocol
12719  Specification", RFC1094, March 1989.

12721  http://www.ietf.org/rfc/rfc1094.txt

12723  [RFC1345]
12724  Simonsen, K., "Character Mnemonics & Character Sets", RFC1345,
12725  Rationel Almen Planlaegning, June 1992.

12727  http://www.ietf.org/rfc/rfc1345.txt

12729  [RFC1700]
12730  Reynolds, J., Postel, J., "Assigned Numbers", RFC1700, ISI, October
12731  1994.

12733  http://www.ietf.org/rfc/rfc1700.txt

12735  [RFC1813]
12736  Callaghan, B., Pawlowski, B., Staubach, P., "NFS Version 3 Protocol
12737  Specification", RFC1813, Sun Microsystems, Inc., June 1995.

12739  http://www.ietf.org/rfc/rfc1813.txt

12741  [RFC1831]
12742  Srinivasan, R., "RPC: Remote Procedure Call Protocol Specification
12743  Version 2", RFC1831, Sun Microsystems, Inc., August 1995.

12745  http://www.ietf.org/rfc/rfc1831.txt

12747  [RFC1832]
12748  Srinivasan, R., "XDR: External Data Representation Standard",
12749  RFC1832, Sun Microsystems, Inc., August 1995.

12753  http://www.ietf.org/rfc/rfc1832.txt

12755  [RFC1833]
12756  Srinivasan, R., "Binding Protocols for ONC RPC Version 2", RFC1833,
12757  Sun Microsystems, Inc., August 1995.

12759  http://www.ietf.org/rfc/rfc1833.txt

12761  [RFC1884]
12762  Hinden, R., Deering, S., "IP Version 6 Addressing Architecture",
12763  RFC1884, December 1995.

12765  http://www.ietf.org/rfc/rfc1884.txt

12767  [RFC1964]
12768  Linn, J., "The Kerberos Version 5 GSS-API Mechanism", RFC1964,
12769  OpenVision Technologies, June 1996.

12771  http://www.ietf.org/rfc/rfc1964.txt

12773  [RFC2025]
12774  Adams, C., "The Simple Public-Key GSS-API Mechanism (SPKM)", RFC2025,
12775  Bell-Northern Research, October 1996.

12777  http://www.ietf.org/rfc/rfc2025.txt

12779  [RFC2054]
12780  Callaghan, B., "WebNFS Client Specification", RFC2054, Sun
12781  Microsystems, Inc., October 1996.

12783  http://www.ietf.org/rfc/rfc2054.txt

12785  [RFC2055]
12786  Callaghan, B., "WebNFS Server Specification", RFC2055, Sun
12787  Microsystems, Inc., October 1996.

12789  http://www.ietf.org/rfc/rfc2055.txt

12791  [RFC2119]
12792  Bradner, S., "Key words for use in RFCs to Indicate Requirement
12793  Levels", RFC2119, Harvard University, March 1997.

12795  http://www.ietf.org/rfc/rfc2119.txt

12799  [RFC2152]
12800  Goldsmith, D., "UTF-7 A Mail-Safe Transformation Format of Unicode",
12801  RFC2152, Apple Computer, Inc., May 1997.

12803  http://www.ietf.org/rfc/rfc2152.txt

12805  [RFC2203]
12806  Eisler, M., Chiu, A., Ling, L., "RPCSEC_GSS Protocol Specification",
12807  RFC2203, Sun Microsystems, Inc., September 1997.

12809  http://www.ietf.org/rfc/rfc2203.txt

12811  [RFC2224]
12812  Callaghan, B., "NFS URL Scheme", RFC2224, Sun Microsystems, Inc.,
12813  October 1997.

12815  http://www.ietf.org/rfc/rfc2224.txt

12817  [RFC2277]
12818  Alvestrand, H., "IETF Policy on Character Sets and Languages",
12819  RFC2277, UNINETT, January 1998.

12821  http://www.ietf.org/rfc/rfc2277.txt

12823  [RFC2279]
12824  Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC2279,
12825  Alis Technologies, January 1998.

12827  http://www.ietf.org/rfc/rfc2279.txt

12829  [RFC2581]
12830  Allman, M., Paxson, V., Stevens, W., "TCP Congestion Control",
12831  RFC2581, April 1999.

12833  http://www.ietf.org/rfc/rfc2581.txt

12835  [RFC2623]
12836  Eisler, M., "NFS Version 2 and Version 3 Security Issues and the NFS
12837  Protocol's Use of RPCSEC_GSS and Kerberos V5", RFC2623, Sun
12838  Microsystems, June 1999.

12840  http://www.ietf.org/rfc/rfc2623.txt

12842  [RFC2624]
12843  Shepler, S., "NFS Version 4 Design Considerations", RFC2624, Sun
12847  Microsystems, June 1999.

12849  http://www.ietf.org/rfc/rfc2624.txt

12851  [RFC2743]
12852  Linn, J., "Generic Security Service Application Program Interface,
12853  Version 2, Update 1", RFC2743, RSA Laboratories, January 2000.

12855  http://www.ietf.org/rfc/rfc2743.txt

12857  [RFC2755]
12858  Chiu, A., Eisler, M., Callaghan, B., "Security Negotiation for
12859  WebNFS", RFC2755, Sun Microsystems, June 2000.

12861  http://www.ietf.org/rfc/rfc2755.txt

12863  [RFC2847]
12864  Eisler, M., "LIPKEY - A Low Infrastructure Public Key Mechanism Using
12865  SPKM", RFC2847, Zambeel, June 2000.

12867  http://www.ietf.org/rfc/rfc2847.txt

12869  [Sandberg]
12870  Sandberg, R., D. Goldberg, S. Kleiman, D. Walsh, B. Lyon, "Design
12871  and Implementation of the Sun Network Filesystem," USENIX Conference
12872  Proceedings, USENIX Association, Berkeley, CA, Summer 1985.
The
12873  basic paper describing the SunOS implementation of the NFS version 2
12874  protocol; it also discusses the goals, protocol specification, and
12875  trade-offs.

12877  [Srinivasan]
12878  Srinivasan, V., Jeffrey C. Mogul, "Spritely NFS: Implementation and
12879  Performance of Cache Consistency Protocols", WRL Research Report
12880  89/5, Digital Equipment Corporation Western Research Laboratory, 100
12881  Hamilton Ave., Palo Alto, CA, 94301, May 1989. This paper analyzes
12882  the effect of applying a Sprite-like consistency protocol to
12883  standard NFS. The issues of recovery in a stateful environment are
12884  covered in [Mogul].

12886  [Unicode1]
12887  The Unicode Consortium, "The Unicode Standard, Version 3.0",
12888  Addison-Wesley Developers Press, Reading, MA, 2000.
12889  ISBN 0-201-61633-5.

12891  More information available at: http://www.unicode.org/

12895  [Unicode2]
12896  "Unsupported Scripts" Unicode, Inc., The Unicode Consortium, P.O. Box
12897  700519, San Jose, CA 95710-0519 USA, September 1999.

12899  http://www.unicode.org/unicode/standard/unsupported.html

12901  [XNFS]
12902  The Open Group, Protocols for Interworking: XNFS, Version 3W, The
12903  Open Group, 1010 El Camino Real Suite 380, Menlo Park, CA 94025, ISBN
12904  1-85912-184-5, February 1998.

12906  HTML version available: http://www.opengroup.org

12910  20. Authors

12912  20.1. Editor's Address

12914  Spencer Shepler
12915  Sun Microsystems, Inc.
12916  7808 Moonflower Drive
12917  Austin, Texas 78750

12919  Phone: +1 512-349-9376
12920  E-mail: spencer.shepler@sun.com

12922  20.2. Authors' Addresses

12924  Carl Beame
12925  Hummingbird Ltd.

12927  E-mail: beame@bws.com

12929  Brent Callaghan
12930  Sun Microsystems, Inc.
12931  17 Network Circle
12932  Menlo Park, CA 94025

12934  Phone: +1 650-786-5067
12935  E-mail: brent.callaghan@sun.com

12937  Mike Eisler
12938  5765 Chase Point Circle
12939  Colorado Springs, CO 80919

12941  Phone: +1 719-599-9026
12942  E-mail: mike@eisler.com

12944  David Noveck
12945  Network Appliance
12946  375 Totten Pond Road
12947  Waltham, MA 02451

12949  Phone: +1 781-768-5347
12950  E-mail: dnoveck@netapp.com

12952  David Robinson
12953  Sun Microsystems, Inc.
12954  5300 Riata Park Court
12955  Austin, TX 78727

12959  Phone: +1 650-786-5088
12960  E-mail: david.robinson@sun.com

12962  Robert Thurlow
12963  Sun Microsystems, Inc.
12964  500 Eldorado Blvd.
12965  Broomfield, CO 80021

12967  Phone: +1 650-786-5096
12968  E-mail: robert.thurlow@sun.com

12970  20.3. Acknowledgements

12972  The author thanks and acknowledges:

12974  Neil Brown for his extensive review of and comments on various drafts.
12975  Andy Adamson, Jim Rees, and Kendrick Smith from the CITI organization
12976  at the University of Michigan for their implementation efforts and
12977  feedback on the protocol specification. Mike Kupfer for his review
12978  of the file locking and ACL mechanisms. Alan Yoder for his input to
12979  ACL mechanisms. Peter Astrand for his close review of the protocol
12980  specification. Ran Atkinson for his constant reminder that users do
12981  matter.

12985  21. Full Copyright Statement

12987  "Copyright (C) The Internet Society (2000-2002). All Rights
12988  Reserved.

12990  This document and translations of it may be copied and furnished to
12991  others, and derivative works that comment on or otherwise explain it
12992  or assist in its implementation may be prepared, copied, published
12993  and distributed, in whole or in part, without restriction of any
12994  kind, provided that the above copyright notice and this paragraph are
12995  included on all such copies and derivative works. However, this
12996  document itself may not be modified in any way, such as by removing
12997  the copyright notice or references to the Internet Society or other
12998  Internet organizations, except as needed for the purpose of
12999  developing Internet standards in which case the procedures for
13000  copyrights defined in the Internet Standards process must be
13001  followed, or as required to translate it into languages other than
13002  English.

13004  The limited permissions granted above are perpetual and will not be
13005  revoked by the Internet Society or its successors or assigns.

13007  This document and the information contained herein is provided on an
13008  "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
13009  TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
13010  BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
13011  HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
13012  MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."