idnits 2.17.1 draft-ietf-nfsv4-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 146) being 62 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([RFC1094], [RFC1813]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Line 580 has weird spacing: '...ned int uin...' == Line 584 has weird spacing: '...d hyper uint6...' == Line 647 has weird spacing: '...8string typ...' == Line 727 has weird spacing: '...8string ser...' == Line 790 has weird spacing: '...ned int cb_pr...' == (37 more instances...) == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'SHOULD not' in this paragraph: The filehandle in the NFS protocol is a per server unique identifier for a file system object. The contents of the filehandle are opaque to the client. Therefore, the server is responsible for translating the filehandle to an internal representation of the file system object. Since the filehandle is the client's reference to an object and the client may cache this reference, the server SHOULD not reuse a filehandle for another file system object. If the server needs to reuse a filehandle value, the time elapsed before reuse SHOULD be large enough such that it is unlikely the client has a cached copy of the reused filehandle value. Note that a client may cache a filehandle for a very long time. For example, a client may cache NFS data to local storage as a method to expand its effective cache size and as a means to survive client restarts. Therefore, the lifetime of a cached filehandle may be extended. == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: The reader may be wondering why there are three FH4_VOL* bits and why FH4_VOLATILE_ANY is exclusive of FH4_VOL_MIGRATION and FH4_VOL_RENAME. 
If the a filehandle is normally persistent but cannot persist across a file set migration, then the presence of the FH4_VOL_MIGRATION or FH4_VOL_RENAME tells the client that it can treat the file handle as persistent for purposes of maintaining a file name to file handle cache, except for the specific event described by the bit. However, FH4_VOLATILE_ANY tells the client that it should not maintain such a cache for unopened files. A server MUST not present FH4_VOLATILE_ANY with FH4_VOL_MIGRATION or FH4_VOL_RENAME as this will lead to confusion. FH4_VOLATILE_ANY implies that the file handle will expire upon migration or rename, in addition to other events. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (June 2000) is 8709 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? 'RFC1094' on line 9871 looks like a reference -- Missing reference section? 'RFC1813' on line 9889 looks like a reference -- Missing reference section? 'RFC1831' on line 9895 looks like a reference -- Missing reference section? 'RFC1832' on line 9901 looks like a reference -- Missing reference section? 'RFC2623' on line 9961 looks like a reference -- Missing reference section? 'RFC1964' on line 876 looks like a reference -- Missing reference section? 'RFC2847' on line 9974 looks like a reference -- Missing reference section? 'RFC2078' on line 9931 looks like a reference -- Missing reference section? 'RFC2203' on line 9943 looks like a reference -- Missing reference section? 'RFC1700' on line 9883 looks like a reference -- Missing reference section? 'RFC1833' on line 9907 looks like a reference -- Missing reference section? 'RFC2581' on line 839 looks like a reference -- Missing reference section? 'Floyd' on line 9816 looks like a reference -- Missing reference section? 'RFC2025' on line 9913 looks like a reference -- Missing reference section? 'RFC2054' on line 9919 looks like a reference -- Missing reference section? 'RFC2055' on line 9925 looks like a reference -- Missing reference section? 'RFC2624' on line 9968 looks like a reference -- Missing reference section? 'RFC1345' on line 9877 looks like a reference -- Missing reference section? 'XNFS' on line 10010 looks like a reference -- Missing reference section? '4' on line 2427 looks like a reference -- Missing reference section? 'Juszczak' on line 9831 looks like a reference -- Missing reference section? 'ISO10646' on line 9826 looks like a reference -- Missing reference section? 'RFC2277' on line 9949 looks like a reference -- Missing reference section? 'RFC2279' on line 9955 looks like a reference -- Missing reference section? 'RFC2152' on line 9937 looks like a reference -- Missing reference section? 
'Unicode1' on line 9997 looks like a reference -- Missing reference section? 'Unicode2' on line 10004 looks like a reference -- Missing reference section? 'Gray' on line 9821 looks like a reference -- Missing reference section? 'Kazar' on line 9840 looks like a reference -- Missing reference section? 'Macklem' on line 9847 looks like a reference -- Missing reference section? 'Mogul' on line 9995 looks like a reference -- Missing reference section? 'Nowicki' on line 9860 looks like a reference -- Missing reference section? 'Pawlowski' on line 9865 looks like a reference -- Missing reference section? 'Sandberg' on line 9980 looks like a reference -- Missing reference section? 'Srinivasan' on line 9988 looks like a reference Summary: 3 errors (**), 0 flaws (~~), 11 warnings (==), 38 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 NFS Version 4 Working Group S. Shepler 2 INTERNET-DRAFT Sun Microsystems 3 Document: draft-ietf-nfsv4-07.txt C. Beame 4 Hummingbird Ltd 5 B. Callaghan 6 Sun Microsystems 7 M. Eisler 8 Zambeel 9 D. Noveck 10 Network Appliance 11 D. Robinson 12 Sun Microsystems 13 R. Thurlow 14 Sun Microsystems 15 June 2000 17 NFS version 4 Protocol 19 Status of this Memo 21 This document is an Internet-Draft and is in full conformance with 22 all provisions of Section 10 of RFC2026. 24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF), its areas, and its working groups. Note that 26 other groups may also distribute working documents as Internet- 27 Drafts. 29 Internet-Drafts are draft documents valid for a maximum of six months 30 and may be updated, replaced, or obsoleted by other documents at any 31 time. It is inappropriate to use Internet- Drafts as reference 32 material or to cite them other than as "work in progress." 34 The list of current Internet-Drafts can be accessed at 35 http://www.ietf.org/ietf/1id-abstracts.txt 37 The list of Internet-Draft Shadow Directories can be accessed at 38 http://www.ietf.org/shadow.html. 40 Abstract 42 NFS version 4 is a distributed file system protocol which owes 43 heritage to NFS protocol versions 2 [RFC1094] and 3 [RFC1813]. 45 Unlike earlier versions, the NFS version 4 protocol supports 46 traditional file access while integrating support for file locking 47 and the mount protocol. In addition, support for strong security 48 (and its negotiation), compound operations, client caching, and 49 internationalization have been added. Of course, attention has been 50 applied to making NFS version 4 operate well in an Internet 51 environment. 53 Copyright 55 Copyright (C) The Internet Society (2000). All Rights Reserved. 57 Key Words 59 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 60 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 61 document are to be interpreted as described in RFC 2119. 63 Table of Contents 65 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 7 66 1.1. Overview of NFS Version 4 Features . . . . . . . . . . . . 7 67 1.1.1. RPC and Security . . . . . . . . . . . . . . . . . . . . 8 68 1.1.2. Procedure and Operation Structure . . . . . . . . . . . 8 69 1.1.3. File System Model . . . . . . . . . . . . . . . . . . . 9 70 1.1.3.1. Filehandle Types . . . . . . . . . . . . . . . . . . . 9 71 1.1.3.2. Attribute Types . . . . . . . . . . . . . . . . . . . 9 72 1.1.3.3. 
File System Replication and Migration . . . . . . . 10 73 1.1.4. OPEN and CLOSE . . . . . . . . . . . . . . . . . . . . 10 74 1.1.5. File locking . . . . . . . . . . . . . . . . . . . . . 10 75 1.1.6. Client Caching and Delegation . . . . . . . . . . . . 11 76 1.2. General Definitions . . . . . . . . . . . . . . . . . . 12 77 2. Protocol Data Types . . . . . . . . . . . . . . . . . . . 14 78 2.1. Basic Data Types . . . . . . . . . . . . . . . . . . . . 14 79 2.2. Structured Data Types . . . . . . . . . . . . . . . . . 15 80 3. RPC and Security Flavor . . . . . . . . . . . . . . . . . 20 81 3.1. Ports and Transports . . . . . . . . . . . . . . . . . . 20 82 3.2. Security Flavors . . . . . . . . . . . . . . . . . . . . 20 83 3.2.1. Security mechanisms for NFS version 4 . . . . . . . . 20 84 3.2.1.1. Kerberos V5 as security triple . . . . . . . . . . . 21 85 3.2.1.2. LIPKEY as a security triple . . . . . . . . . . . . 21 86 3.2.1.3. SPKM-3 as a security triple . . . . . . . . . . . . 22 87 3.3. Security Negotiation . . . . . . . . . . . . . . . . . . 23 88 3.3.1. Security Error . . . . . . . . . . . . . . . . . . . . 23 89 3.3.2. SECINFO . . . . . . . . . . . . . . . . . . . . . . . 23 90 3.4. Callback RPC Authentication . . . . . . . . . . . . . . 23 91 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . 26 92 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 26 93 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . . 26 94 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . . 27 95 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 27 96 4.2.1. General Properties of a Filehandle . . . . . . . . . . 27 97 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . . 28 98 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . . 28 99 4.2.4. One Method of Constructing a Volatile Filehandle . . . 30 100 4.3. Client Recovery from Filehandle Expiration . . . . . . . 30 101 5. File Attributes . . . . . . . . . . . . . . . . . . . . . 32 102 5.1. Mandatory Attributes . . . . . . . . . . . . . . . . . . 33 103 5.2. Recommended Attributes . . . . . . . . . . . . . . . . . 33 104 5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 33 105 5.4. Mandatory Attributes - Definitions . . . . . . . . . . . 35 106 5.5. Recommended Attributes - Definitions . . . . . . . . . . 37 107 5.6. Interpreting owner and owner_group . . . . . . . . . . . 41 108 5.7. Character Case Attributes . . . . . . . . . . . . . . . 42 109 5.8. Quota Attributes . . . . . . . . . . . . . . . . . . . . 42 110 5.9. Access Control Lists . . . . . . . . . . . . . . . . . . 43 111 5.9.1. ACE type . . . . . . . . . . . . . . . . . . . . . . . 44 112 5.9.2. ACE flag . . . . . . . . . . . . . . . . . . . . . . . 44 113 5.9.3. ACE Access Mask . . . . . . . . . . . . . . . . . . . 46 114 5.9.4. ACE who . . . . . . . . . . . . . . . . . . . . . . . 47 115 6. File System Migration and Replication . . . . . . . . . . 48 116 6.1. Replication . . . . . . . . . . . . . . . . . . . . . . 48 117 6.2. Migration . . . . . . . . . . . . . . . . . . . . . . . 48 118 6.3. Interpretation of the fs_locations Attribute . . . . . . 49 119 6.4. Filehandle Recovery for Migration or Replication . . . . 50 120 7. NFS Server Name Space . . . . . . . . . . . . . . . . . . 51 121 7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 51 122 7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 51 123 7.3. Server Pseudo File System . . . . . . . . . . . . . . . 51 124 7.4. Multiple Roots . . . . 
. . . . . . . . . . . . . . . . . 52 125 7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 52 126 7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 52 127 7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 53 128 7.8. Security Policy and Name Space Presentation . . . . . . 53 129 8. File Locking and Share Reservations . . . . . . . . . . . 54 130 8.1. Locking . . . . . . . . . . . . . . . . . . . . . . . . 54 131 8.1.1. Client ID . . . . . . . . . . . . . . . . . . . . . . 54 132 8.1.2. Server Release of Clientid . . . . . . . . . . . . . . 56 133 8.1.3. nfs_lockowner and stateid Definition . . . . . . . . . 57 134 8.1.4. Use of the stateid . . . . . . . . . . . . . . . . . . 58 135 8.1.5. Sequencing of Lock Requests . . . . . . . . . . . . . 58 136 8.1.6. Recovery from Replayed Requests . . . . . . . . . . . 59 137 8.1.7. Releasing nfs_lockowner State . . . . . . . . . . . . 60 138 8.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 60 139 8.3. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 61 140 8.4. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 61 141 8.5. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 62 142 8.5.1. Client Failure and Recovery . . . . . . . . . . . . . 62 143 8.5.2. Server Failure and Recovery . . . . . . . . . . . . . 63 144 8.5.3. Network Partitions and Recovery . . . . . . . . . . . 64 145 8.6. Recovery from a Lock Request Timeout or Abort . . . . . 65 146 8.7. Server Revocation of Locks . . . . . . . . . . . . . . . 66 147 8.8. Share Reservations . . . . . . . . . . . . . . . . . . . 67 148 8.9. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 68 149 8.10. Open Upgrade and Downgrade . . . . . . . . . . . . . . 68 150 8.11. Short and Long Leases . . . . . . . . . . . . . . . . . 69 151 8.12. Clocks and Calculating Lease Expiration . . . . . . . . 69 152 8.13. Migration, Replication and State . . . . . . . . . . . 70 153 8.13.1. Migration and State . . . . . . . . . . . . . . . . . 70 154 8.13.2. Replication and State . . . . . . . . . . . . . . . . 70 155 8.13.3. Notification of Migrated Lease . . . . . . . . . . . 71 156 9. Client-Side Caching . . . . . . . . . . . . . . . . . . . 72 157 9.1. Performance Challenges for Client-Side Caching . . . . . 72 158 9.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 73 159 9.2.1. Delegation Recovery . . . . . . . . . . . . . . . . . 74 160 9.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 76 161 9.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . . 76 162 9.3.2. Data Caching and File Locking . . . . . . . . . . . . 77 163 9.3.3. Data Caching and Mandatory File Locking . . . . . . . 78 164 9.3.4. Data Caching and File Identity . . . . . . . . . . . . 79 165 9.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 80 166 9.4.1. Open Delegation and Data Caching . . . . . . . . . . . 82 167 9.4.2. Open Delegation and File Locks . . . . . . . . . . . . 83 168 9.4.3. Recall of Open Delegation . . . . . . . . . . . . . . 83 169 9.4.4. Delegation Revocation . . . . . . . . . . . . . . . . 85 170 9.5. Data Caching and Revocation . . . . . . . . . . . . . . 85 171 9.5.1. Revocation Recovery for Write Open Delegation . . . . 86 172 9.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 87 173 9.7. Name Caching . . . . . . . . . . . . . . . . . . . . . . 88 174 9.8. Directory Caching . . . . . . . . . . . . . . . . . . . 89 175 10. Minor Versioning . . . . . . . . . . . . . . . . . . . . 91 176 11. 
Internationalization . . . . . . . . . . . . . . . . . . 94 177 11.1. Universal Versus Local Character Sets . . . . . . . . . 94 178 11.2. Overview of Universal Character Set Standards . . . . . 95 179 11.3. Difficulties with UCS-4, UCS-2, Unicode . . . . . . . . 96 180 11.4. UTF-8 and its solutions . . . . . . . . . . . . . . . . 96 181 11.5. Normalization . . . . . . . . . . . . . . . . . . . . . 97 182 12. Error Definitions . . . . . . . . . . . . . . . . . . . . 98 183 13. NFS Version 4 Requests . . . . . . . . . . . . . . . . . 103 184 13.1. Compound Procedure . . . . . . . . . . . . . . . . . . 103 185 13.2. Evaluation of a Compound Request . . . . . . . . . . . 103 186 13.3. Synchronous Modifying Operations . . . . . . . . . . . 104 187 13.4. Operation Values . . . . . . . . . . . . . . . . . . . 105 188 14. NFS Version 4 Procedures . . . . . . . . . . . . . . . . 106 189 14.1. Procedure 0: NULL - No Operation . . . . . . . . . . . 106 190 14.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 107 191 14.2.1. Operation 3: ACCESS - Check Access Rights . . . . . . 110 192 14.2.2. Operation 4: CLOSE - Close File . . . . . . . . . . . 113 193 14.2.3. Operation 5: COMMIT - Commit Cached Data . . . . . . 115 194 14.2.4. Operation 6: CREATE - Create a Non-Regular File Object 118 195 14.2.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting 196 Recovery . . . . . . . . . . . . . . . . . . . . . . 120 197 14.2.6. Operation 8: DELEGRETURN - Return Delegation . . . . 121 198 14.2.7. Operation 9: GETATTR - Get Attributes . . . . . . . . 122 199 14.2.8. Operation 10: GETFH - Get Current Filehandle . . . . 124 200 14.2.9. Operation 11: LINK - Create Link to a File . . . . . 126 201 14.2.10. Operation 12: LOCK - Create Lock . . . . . . . . . . 128 202 14.2.11. Operation 13: LOCKT - Test For Lock . . . . . . . . 130 203 14.2.12. Operation 14: LOCKU - Unlock File . . . . . . . . . 132 204 14.2.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . 134 205 14.2.14. Operation 16: LOOKUPP - Lookup Parent Directory . . 137 206 14.2.15. Operation 17: NVERIFY - Verify Difference in 207 Attributes . . . . . . . . . . . . . . . . . . . . . 139 208 14.2.16. Operation 18: OPEN - Open a Regular File . . . . . . 141 209 14.2.17. Operation 19: OPENATTR - Open Named Attribute 210 Directory . . . . . . . . . . . . . . . . . . . . . 149 211 14.2.18. Operation 20: OPEN_CONFIRM - Confirm Open . . . . . 151 212 14.2.19. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access154 213 14.2.20. Operation 22: PUTFH - Set Current Filehandle . . . . 156 214 14.2.21. Operation 23: PUTPUBFH - Set Public Filehandle . . . 157 215 14.2.22. Operation 24: PUTROOTFH - Set Root Filehandle . . . 158 216 14.2.23. Operation 25: READ - Read from File . . . . . . . . 159 217 14.2.24. Operation 26: READDIR - Read Directory . . . . . . . 162 218 14.2.25. Operation 27: READLINK - Read Symbolic Link . . . . 166 219 14.2.26. Operation 28: REMOVE - Remove Filesystem Object . . 168 220 14.2.27. Operation 29: RENAME - Rename Directory Entry . . . 170 221 14.2.28. Operation 30: RENEW - Renew a Lease . . . . . . . . 173 222 14.2.29. Operation 31: RESTOREFH - Restore Saved Filehandle . 174 223 14.2.30. Operation 32: SAVEFH - Save Current Filehandle . . . 176 224 14.2.31. Operation 33: SECINFO - Obtain Available Security . 177 225 14.2.32. Operation 34: SETATTR - Set Attributes . . . . . . . 179 226 14.2.33. Operation 35: SETCLIENTID - Negotiate Clientid . . . 181 227 14.2.34. 
Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid 183 228 14.2.35. Operation 37: VERIFY - Verify Same Attributes . . . 184 229 14.2.36. Operation 38: WRITE - Write to File . . . . . . . . 186 230 15. NFS Version 4 Callback Procedures . . . . . . . . . . . . 190 231 15.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 190 232 15.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . 191 233 15.2.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . 193 234 15.2.2. Operation 4: CB_RECALL - Recall an Open Delegation . 194 235 16. Security Considerations . . . . . . . . . . . . . . . . . 196 236 17. IANA Considerations . . . . . . . . . . . . . . . . . . . 197 237 17.1. Named Attribute Definition . . . . . . . . . . . . . . 197 238 18. RPC definition file . . . . . . . . . . . . . . . . . . . 198 239 19. Bibliography . . . . . . . . . . . . . . . . . . . . . . 227 240 20. Authors . . . . . . . . . . . . . . . . . . . . . . . . . 232 241 20.1. Editor's Address . . . . . . . . . . . . . . . . . . . 232 242 20.2. Authors' Addresses . . . . . . . . . . . . . . . . . . 232 243 20.3. Acknowledgements . . . . . . . . . . . . . . . . . . . 233 244 21. Full Copyright Statement . . . . . . . . . . . . . . . . 234 246 1. Introduction 248 The NFS version 4 protocol is a further revision of the NFS protocol 249 defined already by versions 2 [RFC1094] and 3 [RFC1813]. It retains 250 the essential characteristics of previous versions: design for easy 251 recovery, independent of transport protocols, operating systems and 252 filesystems, simplicity, and good performance. The NFS version 4 253 revision has the following goals: 255 o Improved access and good performance on the Internet. 257 The protocol is designed to transit firewalls easily, perform 258 well where latency is high and bandwidth is low, and scale to 259 very large numbers of clients per server. 261 o Strong security with negotiation built into the protocol. 263 The protocol builds on the work of the ONCRPC working group in 264 supporting the RPCSEC_GSS protocol. Additionally, the NFS 265 version 4 protocol provides a mechanism to allow clients and 266 servers the ability to negotiate security and require clients 267 and servers to support a minimal set of security schemes. 269 o Good cross-platform interoperability. 271 The protocol features a file system model that provides a 272 useful, common set of features that does not unduly favor one 273 file system or operating system over another. 275 o Designed for protocol extensions. 277 The protocol is designed to accept standard extensions that do 278 not compromise backward compatibility. 280 1.1. Overview of NFS Version 4 Features 282 To provide a reasonable context for the reader, the major features of 283 NFS version 4 protocol will be reviewed in brief. This will be done 284 to provide an appropriate context for both the reader who is familiar 285 with the previous versions of the NFS protocol and the reader that is 286 new to the NFS protocols. For the reader new to the NFS protocols, 287 there is still a fundamental knowledge that is expected. The reader 288 should be familiar with the XDR and RPC protocols as described in 289 [RFC1831] and [RFC1832]. A basic knowledge of file systems and 290 distributed file systems is expected as well. 292 1.1.1. RPC and Security 294 As with previous versions of NFS, the External Data Representation 295 (XDR) and Remote Procedure Call (RPC) mechanisms used for the NFS 296 version 4 protocol are those defined in [RFC1831] and [RFC1832]. 
To 297 meet end to end security requirements, the RPCSEC_GSS framework 298 [RFC2623] will be used to extend the basic RPC security. With the 299 use of RPCSEC_GSS, various mechanisms can be provided to offer 300 authentication, integrity, and privacy to the NFS version 4 protocol. 301 Kerberos V5 will be used as described in [RFC1964] to provide one 302 security framework. The LIPKEY GSS-API mechanism described in 303 [RFC2847] will be used to provide for the use of user password and 304 server public key by the NFS version 4 protocol. With the use of 305 RPCSEC_GSS, other mechanisms may also be specified and used for NFS 306 version 4 security. 308 To enable in-band security negotiation, the NFS version 4 protocol 309 has added a new operation which provides the client a method of 310 querying the server about its policies regarding which security 311 mechanisms must be used for access to the server's file system 312 resources. With this, the client can securely match the security 313 mechanism that meets the policies specified at both the client and 314 server. 316 1.1.2. Procedure and Operation Structure 318 A significant departure from the previous versions of the NFS 319 protocol is the introduction of the COMPOUND procedure. For the NFS 320 version 4 protocol, there are two RPC procedures, NULL and COMPOUND. 321 The COMPOUND procedure is defined in terms of operations and these 322 operations correspond more closely to the traditional NFS procedures. 323 With the use of the COMPOUND procedure, the client is able to build 324 simple or complex requests. These COMPOUND requests allow for a 325 reduction in the number of RPCs needed for logical file system 326 operations. For example, without previous contact with a server a 327 client will be able to read data from a file in one request by 328 combining LOOKUP, OPEN, and READ operations in a single COMPOUND RPC. 329 With previous versions of the NFS protocol, this type of single 330 request was not possible. 332 The model used for COMPOUND is very simple. There is no logical OR 333 or ANDing of operations. The operations combined within a COMPOUND 334 request are evaluated in order by the server. Once an operation 335 returns a failing result, the evaluation ends and the results of all 336 evaluated operations are returned to the client. 338 The NFS version 4 protocol continues to have the client refer to a 339 file or directory at the server by a "filehandle". The COMPOUND 340 procedure has a method of passing a filehandle from one operation to 341 another within the sequence of operations. There is a concept of a 342 "current filehandle" and "saved filehandle". Most operations use the 343 "current filehandle" as the file system object to operate upon. The 344 "saved filehandle" is used as temporary filehandle storage within a 345 COMPOUND procedure as well as an additional operand for certain 346 operations. 348 1.1.3. File System Model 350 The general file system model used for the NFS version 4 protocol is 351 the same as previous versions. The server file system is 352 hierarchical with the regular files contained within being treated as 353 opaque byte streams. In a slight departure, file and directory names 354 are encoded with UTF-8 to deal with the basics of 355 internationalization. 357 The NFS version 4 protocol does not require a separate protocol to 358 provide for the initial mapping between path name and filehandle. 
359 Instead of using the older MOUNT protocol for this mapping, the 360 server provides a ROOT filehandle that represents the logical root or 361 top of the file system tree provided by the server. The server 362 provides multiple file systems by glueing them together with pseudo 363 file systems. These pseudo file systems provide for potential gaps 364 in the path names between real file systems. 366 1.1.3.1. Filehandle Types 368 In previous versions of the NFS protocol, the filehandle provided by 369 the server was guaranteed to be valid or persistent for the lifetime 370 of the file system object to which it referred. For some server 371 implementations, this persistence requirement has been difficult to 372 meet. For the NFS version 4 protocol, this requirement has been 373 relaxed by introducing another type of filehandle, volatile. With 374 persistent and volatile filehandle types, the server implementation 375 can match the abilities of the file system at the server along with 376 the operating environment. The client will have knowledge of the 377 type of filehandle being provided by the server and can be prepared 378 to deal with the semantics of each. 380 1.1.3.2. Attribute Types 382 The NFS version 4 protocol introduces three classes of file system or 383 file attributes. Like the additional filehandle type, the 384 classification of file attributes has been done to ease server 385 implementations along with extending the overall functionality of the 386 NFS protocol. This attribute model is structured to be extensible 387 such that new attributes can be introduced in minor revisions of the 388 protocol without requiring significant rework. 390 The three classifications are: mandatory, recommended and named 391 attributes. This is a significant departure from the previous 392 attribute model used in the NFS protocol. Previously, the attributes 393 for the file system and file objects were a fixed set of mainly Unix 394 attributes. If the server or client did not support a particular 395 attribute, it would have to simulate the attribute the best it could. 397 Mandatory attributes are the minimal set of file or file system 398 attributes that must be provided by the server and must be properly 399 represented by the server. Recommended attributes represent 400 different file system types and operating environments. The 401 recommended attributes will allow for better interoperability and the 402 inclusion of more operating environments. The mandatory and 403 recommended attribute sets are traditional file or file system 404 attributes. The third type of attribute is the named attribute. A 405 named attribute is an opaque byte stream that is associated with a 406 directory or file and referred to by a string name. Named attributes 407 are meant to be used by client applications as a method to associate 408 application specific data with a regular file or directory. 410 One significant addition to the recommended set of file attributes is 411 the Access Control List (ACL) attribute. This attribute provides for 412 directory and file access control beyond the model used in previous 413 versions of the NFS protocol. The ACL definition allows for 414 specification of user and group level access control. 416 1.1.3.3. File System Replication and Migration 418 With the use of a special file attribute, the ability to migrate or 419 replicate server file systems is enabled within the protocol. 
The file system locations attribute provides a method for the client to probe the server about the location of a file system. In the event of a migration of a file system, the client will receive an error when operating on the file system and it can then query the server as to the new file system location. Similar steps are used for replication: the client is able to query the server for the multiple available locations of a particular file system. From this information, the client can use its own policies to access the appropriate file system location.

1.1.4.  OPEN and CLOSE

The NFS version 4 protocol introduces OPEN and CLOSE operations. The OPEN operation provides a single point where file lookup, creation, and share semantics can be combined. The CLOSE operation also provides for the release of state accumulated by OPEN.

1.1.5.  File locking

With the NFS version 4 protocol, the support for byte range file locking is part of the NFS protocol. The file locking support is structured so that an RPC callback mechanism is not required. This is a departure from the previous versions of the NFS file locking protocol, Network Lock Manager (NLM). The state associated with file locks is maintained at the server under a lease-based model. The server defines a single lease period for all state held by an NFS client. If the client does not renew its lease within the defined period, all state associated with the client's lease may be released by the server. The client may renew its lease with use of the RENEW operation or implicitly by use of other operations (primarily READ).

1.1.6.  Client Caching and Delegation

The file, attribute, and directory caching for the NFS version 4 protocol is similar to previous versions. Attributes and directory information are cached for a duration determined by the client. At the end of a predefined timeout, the client will query the server to see if the related file system object has been updated.

For file data, the client checks its cache validity when the file is opened. A query is sent to the server to determine if the file has been changed. Based on this information, the client determines if the data cache for the file should be kept or released. Also, when the file is closed, any modified data is written to the server.

If an application wants to serialize access to file data, file locking of the file data ranges in question should be used.

The major addition to NFS version 4 in the area of caching is the ability of the server to delegate certain responsibilities to the client. When the server grants a delegation for a file to a client, the client is guaranteed certain semantics with respect to the sharing of that file with other clients. At OPEN, the server may provide the client either a read or write delegation for the file. If the client is granted a read delegation, it is assured that no other client has the ability to write to the file for the duration of the delegation. If the client is granted a write delegation, the client is assured that no other client has read or write access to the file.

Delegations can be recalled by the server. If another client requests access to the file in such a way that the access conflicts with the granted delegation, the server is able to notify the initial client and recall the delegation. This requires that a callback path exist between the server and client. If this callback path does not exist, then delegations cannot be granted. The essence of a delegation is that it allows the client to locally service operations such as OPEN, CLOSE, LOCK, LOCKU, READ, and WRITE without immediate interaction with the server.
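As a non-normative illustration of these delegation semantics, the sketch below (in Python) shows one way a client implementation might decide whether an application open can be serviced locally under a read or write delegation. The class and function names are hypothetical and are not part of the protocol.

   # Non-normative sketch: client-side use of a granted delegation.
   READ_DELEG, WRITE_DELEG, NO_DELEG = "read", "write", "none"

   class CachedFile:
       def __init__(self, delegation=NO_DELEG):
           self.delegation = delegation

       def can_open_locally(self, want_write):
           # A write delegation assures the client that no other client
           # has read or write access, so any open can be handled locally.
           if self.delegation == WRITE_DELEG:
               return True
           # A read delegation only assures that no other client can
           # write, so only a read-only open can be handled locally.
           if self.delegation == READ_DELEG:
               return not want_write
           # Without a delegation the OPEN must be sent to the server.
           return False

   f = CachedFile(delegation=READ_DELEG)
   print(f.can_open_locally(want_write=False))   # True
   print(f.can_open_locally(want_write=True))    # False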
1.2.  General Definitions

The following definitions are provided for the purpose of providing an appropriate context for the reader.

Client      The "client" is the entity that accesses the NFS server's resources. The client may be an application which contains the logic to access the NFS server directly. The client may also be the traditional operating system client that provides remote file system services for a set of applications.

            In the case of file locking, the client is the entity that maintains a set of locks on behalf of one or more applications. This client is responsible for crash or failure recovery for those locks it manages.

            Note that multiple clients may share the same transport and multiple clients may exist on the same network node.

Clientid    A 64-bit quantity used as a unique, short-hand reference to a client supplied Verifier and ID. The server is responsible for supplying the Clientid.

Lease       An interval of time defined by the server for which the client is irrevocably granted a lock. At the end of a lease period the lock may be revoked if the lease has not been extended. The lock must be revoked if a conflicting lock has been granted after the lease interval.

            All leases granted by a server have the same fixed interval. Note that the fixed interval was chosen to alleviate the expense a server would have in maintaining state about variable length leases across server failures.

Lock        The term "lock" is used to refer to both record (byte-range) locks as well as file (share) locks unless specifically stated otherwise.

Server      The "Server" is the entity responsible for coordinating client access to a set of file systems.

Stable Storage
            NFS version 4 servers must be able to recover without data loss from multiple power failures (including cascading power failures, that is, several power failures in quick succession), operating system failures, and hardware failure of components other than the storage medium itself (for example, disk, nonvolatile RAM).

            Some examples of stable storage that are allowable for an NFS server include:

            1. Media commit of data, that is, the modified data has been successfully written to the disk media, for example, the disk platter.

            2. An immediate reply disk drive with battery-backed on-drive intermediate storage or uninterruptible power system (UPS).

            3. Server commit of data with battery-backed intermediate storage and recovery software.

            4. Cache commit with uninterruptible power system (UPS) and recovery software.

Stateid     A 64-bit quantity returned by a server that uniquely defines the locking state granted by the server for a specific lock owner for a specific file.

            Stateids composed of all bits 0 or all bits 1 have special meaning and are reserved values.

Verifier    A 64-bit quantity generated by the client that the server can use to determine if the client has restarted and lost all previous lock state.
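As an informal illustration of the Lease definition above, the following sketch shows one conservative way a client might track when its lease-based state could expire and when renewal is due. It is not a normative algorithm; the lease interval and renewal margin shown are assumptions.

   # Non-normative sketch: conservative client-side lease tracking.
   import time

   class LeaseTracker:
       def __init__(self, lease_time):
           self.lease_time = lease_time    # server's fixed lease interval
           self.expires_at = 0.0

       def request_sent(self):
           # Measure from the time the lease-renewing request (RENEW or
           # another operation such as READ) is sent; this is conservative
           # because the server starts its timer no earlier than this.
           return time.monotonic()

       def reply_received(self, sent_at):
           # A successful reply means the lease was valid at the server,
           # so assume it runs lease_time seconds from the send time.
           self.expires_at = sent_at + self.lease_time

       def needs_renewal(self, margin=5.0):
           return time.monotonic() >= self.expires_at - margin

   tracker = LeaseTracker(lease_time=90)
   sent = tracker.request_sent()
   tracker.reply_received(sent)
   print(tracker.needs_renewal())    # False immediately after renewal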
2.  Protocol Data Types

The syntax and semantics to describe the data types of the NFS version 4 protocol are defined in the XDR [RFC1832] and RPC [RFC1831] documents. The next sections build upon the XDR data types to define types and structures specific to this protocol.

2.1.  Basic Data Types

   Data Type     Definition
   _____________________________________________________________________
   int32_t       typedef int int32_t;

   uint32_t      typedef unsigned int uint32_t;

   int64_t       typedef hyper int64_t;

   uint64_t      typedef unsigned hyper uint64_t;

   attrlist4     typedef opaque attrlist4<>;
                 Used for file/directory attributes

   bitmap4       typedef uint32_t bitmap4<>;
                 Used in attribute array encoding.

   changeid4     typedef uint64_t changeid4;
                 Used in definition of change_info

   clientid4     typedef uint64_t clientid4;
                 Shorthand reference to client identification

   component4    typedef utf8string component4;
                 Represents path name components

   count4        typedef uint32_t count4;
                 Various count parameters (READ, WRITE, COMMIT)

   length4       typedef uint64_t length4;
                 Describes LOCK lengths

   linktext4     typedef utf8string linktext4;
                 Symbolic link contents

   mode4         typedef uint32_t mode4;
                 Mode attribute data type

   nfs_cookie4   typedef uint64_t nfs_cookie4;
                 Opaque cookie value for READDIR

   nfs_fh4       typedef opaque nfs_fh4<NFS4_FHSIZE>;
                 Filehandle definition; NFS4_FHSIZE is defined as 128

   nfs_ftype4    enum nfs_ftype4;
                 Various defined file types

   nfsstat4      enum nfsstat4;
                 Return value for operations

   offset4       typedef uint64_t offset4;
                 Various offset designations (READ, WRITE, LOCK, COMMIT)

   pathname4     typedef component4 pathname4<>;
                 Represents path name for LOOKUP, OPEN and others

   qop4          typedef uint32_t qop4;
                 Quality of protection designation in SECINFO

   sec_oid4      typedef opaque sec_oid4<>;
                 Security Object Identifier.  The sec_oid4 data type is
                 not really opaque.  Instead, it contains an ASN.1
                 OBJECT IDENTIFIER as used by GSS-API in the mech_type
                 argument to GSS_Init_sec_context.  See [RFC2078] for
                 details.

   seqid4        typedef uint32_t seqid4;
                 Sequence identifier used for file locking

   stateid4      typedef uint64_t stateid4;
                 State identifier used for file locking and delegation

   utf8string    typedef opaque utf8string<>;
                 UTF-8 encoding for strings

   verifier4     typedef opaque verifier4[NFS4_VERIFIER_SIZE];
                 Verifier used for various operations (COMMIT, CREATE,
                 OPEN, READDIR, SETCLIENTID, WRITE);
                 NFS4_VERIFIER_SIZE is defined as 8

2.2.  Structured Data Types

nfstime4

   struct nfstime4 {
           int64_t         seconds;
           uint32_t        nseconds;
   }

   The nfstime4 structure gives the number of seconds and nanoseconds since midnight or 0 hour January 1, 1970 Coordinated Universal Time (UTC). Values greater than zero for the seconds field denote dates after the 0 hour January 1, 1970. Values less than zero for the seconds field denote dates before the 0 hour January 1, 1970. In both cases, the nseconds field is to be added to the seconds field for the final time representation. For example, if the time to be represented is one-half second before 0 hour January 1, 1970, the seconds field would have a value of negative one (-1) and the nseconds field would have a value of one-half second (500000000). Values greater than 999,999,999 for nseconds are considered invalid.

   This data type is used to pass time and date information. A server converts to and from its local representation of time when processing time values, preserving as much accuracy as possible. If the precision of timestamps stored for a file system object is less than defined, loss of precision can occur. An adjunct time maintenance protocol is recommended to reduce client and server time skew.
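The following non-normative sketch illustrates the nfstime4 encoding rules just described, including the example of one-half second before the epoch; the helper names are illustrative only.

   # Non-normative sketch: nfstime4-style (seconds, nseconds) conversion.
   NSEC_PER_SEC = 1_000_000_000

   def to_nfstime4(t):
       # nseconds is always in the range [0, 999999999] and is added to
       # the (possibly negative) seconds field.
       total_ns = round(t * NSEC_PER_SEC)
       seconds, nseconds = divmod(total_ns, NSEC_PER_SEC)  # floor division
       return seconds, nseconds

   def from_nfstime4(seconds, nseconds):
       if not 0 <= nseconds <= 999_999_999:
           raise ValueError("invalid nseconds value")
       return seconds + nseconds / NSEC_PER_SEC

   # One-half second before the epoch encodes as (-1, 500000000),
   # matching the example in the text above.
   print(to_nfstime4(-0.5))              # (-1, 500000000)
   print(from_nfstime4(-1, 500000000))   # -0.5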
time_how4

   enum time_how4 {
           SET_TO_SERVER_TIME4 = 0,
           SET_TO_CLIENT_TIME4 = 1
   };

settime4

   union settime4 switch (time_how4 set_it) {
    case SET_TO_CLIENT_TIME4:
           nfstime4       time;
    default:
           void;
   };

   The above definitions are used as the attribute definitions to set time values. If set_it is SET_TO_SERVER_TIME4, then the server uses its local representation of time for the time value.

specdata4

   struct specdata4 {
           uint32_t specdata1;
           uint32_t specdata2;
   };

   This data type represents additional information for the device file types NF4CHR and NF4BLK.

fsid4

   struct fsid4 {
           uint64_t        major;
           uint64_t        minor;
   };

   This type is the file system identifier that is used as a mandatory attribute.

fs_location4

   struct fs_location4 {
           utf8string    server<>;
           pathname4     rootpath;
   };

fs_locations4

   struct fs_locations4 {
           pathname4     fs_root;
           fs_location4  locations<>;
   };

   The fs_location4 and fs_locations4 data types are used for the fs_locations recommended attribute which is used for migration and replication support.

fattr4

   struct fattr4 {
           bitmap4       attrmask;
           attrlist4     attr_vals;
   };

   The fattr4 structure is used to represent file and directory attributes.

   The bitmap is a counted array of 32 bit integers used to contain bit values. The position of the integer in the array that contains bit n can be computed from the expression (n / 32) and its bit within that integer is (n mod 32).

                      0            1
   +-----------+-----------+-----------+--
   |   count   | 31  ..  0 | 63  .. 32 |
   +-----------+-----------+-----------+--

change_info4

   struct change_info4 {
           bool          atomic;
           changeid4     before;
           changeid4     after;
   };

   This structure is used with the CREATE, LINK, REMOVE, and RENAME operations to let the client know the value of the change attribute for the directory in which the target file system object resides.

clientaddr4

   struct clientaddr4 {
           /* see struct rpcb in RFC 1833 */
           string r_netid<>;    /* network id */
           string r_addr<>;     /* universal address */
   };

   The clientaddr4 structure is used as part of the SETCLIENTID operation to either specify the address of the client that is using a clientid or as part of the callback registration.

cb_client4

   struct cb_client4 {
           unsigned int  cb_program;
           clientaddr4   cb_location;
   };

   This structure is used by the client to inform the server of its callback address; it includes the program number and client address.

nfs_client_id4

   struct nfs_client_id4 {
           verifier4     verifier;
           opaque        id<>;
   };

   This structure is part of the arguments to the SETCLIENTID operation.

nfs_lockowner4

   struct nfs_lockowner4 {
           clientid4     clientid;
           opaque        owner<>;
   };

   This structure is used to identify the owner of an OPEN share or file lock.
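As an illustration of the bitmap4 layout described for the fattr4 attrmask above (bit n is stored in array word n / 32 at bit position n mod 32), the following non-normative sketch encodes and decodes such a counted array; the attribute numbers used in the usage lines are placeholders, not assignments made by this document.

   # Non-normative sketch: bitmap4 counted array of 32-bit words.
   def encode_bitmap4(attr_numbers):
       words = []
       for n in attr_numbers:
           word, bit = divmod(n, 32)     # word n // 32, bit n % 32
           while len(words) <= word:
               words.append(0)
           words[word] |= 1 << bit
       return words

   def decode_bitmap4(words):
       return [i * 32 + b for i, w in enumerate(words)
               for b in range(32) if w & (1 << b)]

   mask = encode_bitmap4([1, 4, 33])
   print(["0x%08x" % w for w in mask])   # ['0x00000012', '0x00000002']
   print(decode_bitmap4(mask))           # [1, 4, 33]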
3.  RPC and Security Flavor

The NFS version 4 protocol is a Remote Procedure Call (RPC) application that uses RPC version 2 and the corresponding eXternal Data Representation (XDR) as defined in [RFC1831] and [RFC1832]. The RPCSEC_GSS security flavor as defined in [RFC2203] MUST be used as the mechanism to deliver stronger security for the NFS version 4 protocol.

3.1.  Ports and Transports

Historically, NFS version 2 and version 3 servers have resided on port 2049. The registered port 2049 [RFC1700] for the NFS protocol should be the default configuration. Using the registered port for NFS services means the NFS client will not need to use the RPC binding protocols as described in [RFC1833]; this will allow NFS to transit firewalls.

The transport used by the RPC service for the NFS version 4 protocol MUST provide congestion control comparable to that defined for TCP in [RFC2581]. If the operating environment implements TCP, the NFS version 4 protocol SHOULD be supported over TCP. The NFS client and server may use other transports if they support congestion control as defined above, and in those cases a mechanism may be provided to override TCP usage in favor of another transport.

If TCP is used as the transport, the client and server SHOULD use persistent connections. This will prevent the weakening of TCP's congestion control via short lived connections and will improve performance for the WAN environment by eliminating the need for SYN handshakes.

Note that for the various timers that the client and server may keep, the issue of inadvertent synchronization should be avoided. For further discussion of the general issue refer to [Floyd].

3.2.  Security Flavors

Traditional RPC implementations have included AUTH_NONE, AUTH_SYS, AUTH_DH, and AUTH_KRB4 as security flavors. With [RFC2203] an additional security flavor of RPCSEC_GSS has been introduced which uses the functionality of GSS-API [RFC2078]. This allows for the use of varying security mechanisms by the RPC layer without the additional implementation overhead of adding RPC security flavors. For NFS version 4, the RPCSEC_GSS security flavor MUST be used to enable the mandatory security mechanism. Other flavors, such as AUTH_NONE, AUTH_SYS, and AUTH_DH, MAY be implemented as well.

3.2.1.  Security mechanisms for NFS version 4

The use of RPCSEC_GSS requires selection of mechanism, quality of protection, and service (authentication, integrity, privacy). The remainder of this document will refer to these three parameters of the RPCSEC_GSS security as the security triple.

3.2.1.1.  Kerberos V5 as security triple

The Kerberos V5 GSS-API mechanism as described in [RFC1964] MUST be implemented and provide the following security triples.

   column descriptions:

   1 == number of pseudo flavor
   2 == name of pseudo flavor
   3 == mechanism's OID
   4 == mechanism's algorithm(s)
   5 == RPCSEC_GSS service

   1      2     3                    4              5
   -----------------------------------------------------------------------
   390003 krb5  1.2.840.113554.1.2.2 DES MAC MD5    rpc_gss_svc_none
   390004 krb5i 1.2.840.113554.1.2.2 DES MAC MD5    rpc_gss_svc_integrity
   390005 krb5p 1.2.840.113554.1.2.2 DES MAC MD5    rpc_gss_svc_privacy
                                     for integrity,
                                     and 56 bit DES
                                     for privacy.

Note that the pseudo flavor is presented here as a mapping aid to the implementor. Because this NFS protocol includes a method to negotiate security and it understands the GSS-API mechanism, the pseudo flavor is not needed. The pseudo flavor is needed for NFS version 3 since the security negotiation is done via the MOUNT protocol.

For a discussion of NFS' use of RPCSEC_GSS and Kerberos V5, please see [RFC2623].
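Purely as an illustration of the security triple concept, the table above could be carried by an implementation as simple data when deciding which RPCSEC_GSS service to request; the structure and selection policy in the following sketch are assumptions, not requirements of this protocol.

   # Non-normative sketch: Kerberos V5 triples held as data.  The pseudo
   # flavor numbers are only a mapping aid and are not carried in NFS
   # version 4 requests.
   KRB5_OID = "1.2.840.113554.1.2.2"

   KRB5_TRIPLES = {
       390003: {"name": "krb5",  "oid": KRB5_OID,
                "service": "rpc_gss_svc_none"},
       390004: {"name": "krb5i", "oid": KRB5_OID,
                "service": "rpc_gss_svc_integrity"},
       390005: {"name": "krb5p", "oid": KRB5_OID,
                "service": "rpc_gss_svc_privacy"},
   }

   def strongest(triples):
       # Prefer privacy over integrity over authentication only.
       order = ["rpc_gss_svc_privacy", "rpc_gss_svc_integrity",
                "rpc_gss_svc_none"]
       return min(triples.values(), key=lambda t: order.index(t["service"]))

   print(strongest(KRB5_TRIPLES)["name"])   # krb5p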
3.2.1.2.  LIPKEY as a security triple

The LIPKEY GSS-API mechanism as described in [RFC2847] MUST be implemented and provide the following security triples. The definition of the columns matches the previous subsection "Kerberos V5 as security triple".

   1      2        3             4           5
   -----------------------------------------------------------------------
   390006 lipkey   1.3.6.1.5.5.9 negotiated  rpc_gss_svc_none
   390007 lipkey-i 1.3.6.1.5.5.9 negotiated  rpc_gss_svc_integrity
   390008 lipkey-p 1.3.6.1.5.5.9 negotiated  rpc_gss_svc_privacy

The mechanism algorithm is listed as "negotiated". This is because LIPKEY is layered on SPKM-3 and in SPKM-3 [RFC2847] the confidentiality and integrity algorithms are negotiated. Since SPKM-3 specifies HMAC-MD5 for integrity as MANDATORY, 128 bit cast5CBC for confidentiality (privacy) as MANDATORY, and further specifies that HMAC-MD5 and cast5CBC MUST be listed first before weaker algorithms, specifying "negotiated" in column 4 does not impair interoperability. In the event an SPKM-3 peer does not support the mandatory algorithms, the other peer is free to accept or reject the GSS-API context creation.

Because SPKM-3 negotiates the algorithms, subsequent calls to LIPKEY's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality of protection value of 0 (zero). See section 5.2 of [RFC2025] for an explanation.

LIPKEY uses SPKM-3 to create a secure channel in which to pass a user name and password from the client to the server. Once the user name and password have been accepted by the server, calls to the LIPKEY context are redirected to the SPKM-3 context. See [RFC2847] for more details.

3.2.1.3.  SPKM-3 as a security triple

The SPKM-3 GSS-API mechanism as described in [RFC2847] MUST be implemented and provide the following security triples. The definition of the columns matches the previous subsection "Kerberos V5 as security triple".

   1      2      3               4           5
   -----------------------------------------------------------------------
   390009 spkm3  1.3.6.1.5.5.1.3 negotiated  rpc_gss_svc_none
   390010 spkm3i 1.3.6.1.5.5.1.3 negotiated  rpc_gss_svc_integrity
   390011 spkm3p 1.3.6.1.5.5.1.3 negotiated  rpc_gss_svc_privacy

For a discussion as to why the mechanism algorithm is listed as "negotiated", see the previous section "LIPKEY as a security triple."

Because SPKM-3 negotiates the algorithms, subsequent calls to SPKM-3's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality of protection value of 0 (zero). See section 5.2 of [RFC2025] for an explanation.

Even though LIPKEY is layered over SPKM-3, SPKM-3 is specified as a mandatory set of triples to handle the situations where the initiator (the client) is anonymous or where the initiator has its own certificate. If the initiator is anonymous, there will not be a user name and password to send to the target (the server). If the initiator has its own certificate, then using passwords is superfluous.

3.3.
Security Negotiation 972 With the NFS version 4 server potentially offering multiple security 973 mechanisms, the client needs a method to determine or negotiate which 974 mechanism is to be used for its communication with the server. The 975 NFS server may have multiple points within its file system name space 976 that are available for use by NFS clients. In turn the NFS server 977 may be configured such that each of these entry points may have 978 different or multiple security mechanisms in use. 980 The security negotiation between client and server must be done with 981 a secure channel to eliminate the possibility of a third party 982 intercepting the negotiation sequence and forcing the client and 983 server to choose a lower level of security than required or desired. 985 3.3.1. Security Error 987 Based on the assumption that each NFS version 4 client and server 988 must support a minimum set of security (i.e. LIPKEY, SPKM-3, and 989 Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its 990 communication with the server with one of the minimal security 991 triples. During communication with the server, the client may 992 receive an NFS error of NFS4ERR_WRONGSEC. This error allows the 993 server to notify the client that the security triple currently being 994 used is not appropriate for access to the server's file system 995 resources. The client is then responsible for determining what 996 security triples are available at the server and choose one which is 997 appropriate for the client. 999 3.3.2. SECINFO 1001 The new SECINFO operation will allow the client to determine, on a 1002 per filehandle basis, what security triple is to be used for server 1003 access. In general, the client will not have to use the SECINFO 1004 procedure except during initial communication with the server or when 1005 the client crosses policy boundaries at the server. It is possible 1006 that the server's policies change during the client's interaction 1007 therefore forcing the client to negotiate a new security triple. 1009 3.4. Callback RPC Authentication 1011 The callback RPC (described later) must mutually authenticate the NFS 1012 server to the principal that acquired the clientid (also described 1013 later), using the same security flavor the original SETCLIENTID 1014 operation used. Because LIPKEY is layered over SPKM-3, it is 1015 permissible for the server to use SPKM-3 and not LIPKEY for the 1016 callback even if the client used LIPKEY for SETCLIENTID. 1018 For AUTH_NONE, there are no principals, so this is a non-issue. 1020 For AUTH_SYS, the server simply uses the AUTH_SYS credential that the 1021 user used when it set up the delegation. 1023 For AUTH_DH, one commonly used convention is that the server uses the 1024 credential corresponding to this AUTH_DH principal: 1026 unix.host@domain 1028 where host and domain are variables corresponding to the name of 1029 server host and directory services domain in which it lives such as a 1030 Network Information System domain or a DNS domain. 1032 Regardless of what security mechanism under RPCSEC_GSS is being used, 1033 the NFS server, MUST identify itself in GSS-API via a 1034 GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE 1035 names are of the form: 1037 service@hostname 1039 For NFS, the "service" element is 1041 nfs 1043 Implementations of security mechanisms will convert nfs@hostname to 1044 various different forms. 
For Kerberos V5 and LIPKEY, the following form is RECOMMENDED:

   nfs/hostname

For Kerberos V5, nfs/hostname would be a server principal in the Kerberos Key Distribution Center database. For LIPKEY, this would be the username passed to the target (the NFS version 4 client that receives the callback).

It should be noted that LIPKEY may not work for callbacks, since the LIPKEY client uses a user id/password. If the NFS client receiving the callback can authenticate the NFS server's user name/password pair, and if the user that the NFS server is authenticating to has a public key certificate, then it works.

In situations where the NFS client uses LIPKEY and uses a per-host principal for the SETCLIENTID operation, instead of using LIPKEY for SETCLIENTID, it is RECOMMENDED that SPKM-3 with mutual authentication be used. This effectively means that the client will use a certificate to authenticate and identify the initiator to the target on the NFS server. Using SPKM-3 and not LIPKEY has the following advantages:

o  When the server does a callback, it must authenticate to the principal used in the SETCLIENTID. Even if LIPKEY is used, because LIPKEY is layered over SPKM-3, the NFS client will need to have a certificate that corresponds to the principal used in the SETCLIENTID operation. From an administrative perspective, having a user name, password, and certificate for both the client and server is redundant.

o  LIPKEY was intended to minimize additional infrastructure requirements beyond a certificate for the target, and the expectation is that existing password infrastructure can be leveraged for the initiator. In some environments, a per-host password does not exist yet. If certificates are used for any per-host principals, then additional password infrastructure is not needed.

o  In cases when a host is both an NFS client and server, it can share the same per-host certificate.

4.  Filehandles

The filehandle in the NFS protocol is a per-server unique identifier for a file system object. The contents of the filehandle are opaque to the client. Therefore, the server is responsible for translating the filehandle to an internal representation of the file system object. Since the filehandle is the client's reference to an object and the client may cache this reference, the server SHOULD NOT reuse a filehandle for another file system object. If the server needs to reuse a filehandle value, the time elapsed before reuse SHOULD be large enough such that it is unlikely the client has a cached copy of the reused filehandle value. Note that a client may cache a filehandle for a very long time. For example, a client may cache NFS data to local storage as a method to expand its effective cache size and as a means to survive client restarts. Therefore, the lifetime of a cached filehandle may be extended.
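The following non-normative sketch illustrates the caching behavior discussed above: the client keys cached state on the opaque filehandle bytes, never interprets them, and must be prepared to discard an entry when a handle becomes stale. All names are hypothetical.

   # Non-normative sketch: client-side cache keyed by opaque filehandle
   # bytes returned by the server.
   class FilehandleCache:
       def __init__(self):
           self._by_handle = {}       # bytes -> cached attributes/data

       def put(self, fh, entry):
           self._by_handle[fh] = entry

       def get(self, fh):
           return self._by_handle.get(fh)

       def invalidate(self, fh):
           # Called, for example, when the server returns NFS4ERR_STALE
           # for a handle the client had cached.
           self._by_handle.pop(fh, None)

   cache = FilehandleCache()
   cache.put(b"\x01\x7f\x00\x2a", {"size": 512})   # opaque server bytes
   print(cache.get(b"\x01\x7f\x00\x2a"))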
The 1111 MOUNT protocol, RPC program number 100005, provides the mechanism of 1112 translating a string based file system path name to a filehandle 1113 which can then be used by the NFS protocols. 1115 The MOUNT protocol has deficiencies in the area of security and use 1116 via firewalls. This is one reason that the use of the public 1117 filehandle was introduced in [RFC2054] and [RFC2055]. With the use 1118 of the public filehandle in combination with the LOOKUP procedure in 1119 the NFS version 2 and 3 protocols, it has been demonstrated that the 1120 MOUNT protocol is unnecessary for viable interaction between NFS 1121 client and server. 1123 Therefore, the NFS version 4 protocol will not use an ancillary 1124 protocol for translation from string based path names to a 1125 filehandle. Two special filehandles will be used as starting points 1126 for the NFS client. 1128 4.1.1. Root Filehandle 1130 The first of the special filehandles is the ROOT filehandle. The 1131 ROOT filehandle is the "conceptual" root of the file system name 1132 space at the NFS server. The client uses or starts with the ROOT 1133 filehandle by employing the PUTROOTFH operation. The PUTROOTFH 1134 operation instructs the server to set the "current" filehandle to the 1135 ROOT of the server's file tree. Once this PUTROOTFH operation is 1136 used, the client can then traverse the entirety of the server's file 1137 tree with the LOOKUP procedure. A complete discussion of the server 1138 name space is in the section "NFS Server Name Space". 1140 4.1.2. Public Filehandle 1142 The second special filehandle is the PUBLIC filehandle. Unlike the 1143 ROOT filehandle, the PUBLIC filehandle may be bound or represent an 1144 arbitrary file system object at the server. The server is 1145 responsible for this binding. It may be that the PUBLIC filehandle 1146 and the ROOT filehandle refer to the same file system object. 1147 However, it is up to the administrative software at the server and 1148 the policies of the server administrator to define the binding of the 1149 PUBLIC filehandle and server file system object. The client may not 1150 make any assumptions about this binding. 1152 4.2. Filehandle Types 1154 In the NFS version 2 and 3 protocols, there was one type of 1155 filehandle with a single set of semantics. The NFS version 4 1156 protocol introduces a new type of filehandle in an attempt to 1157 accommodate certain server environments. The first type of 1158 filehandle is 'persistent'. The semantics of a persistent filehandle 1159 are the same as the filehandles of the NFS version 2 and 3 protocols. 1160 The second or new type of filehandle is the "volatile" filehandle. 1162 The volatile filehandle type is being introduced to address server 1163 functionality or implementation issues which make correct 1164 implementation of a persistent filehandle infeasible. Some server 1165 environments do not provide a file system level invariant that can be 1166 used to construct a persistent filehandle. The underlying server 1167 file system may not provide the invariant or the server's file system 1168 programming interfaces may not provide access to the needed 1169 invariant. Volatile filehandles may ease the implementation of 1170 server functionality such as hierarchical storage management or file 1171 system reorganization or migration. However, the volatile filehandle 1172 increases the implementation burden for the client. 
However, this 1172 increased burden is deemed acceptable based on the overall gains 1173 achieved by the protocol. 1176 Since the client will need to handle persistent and volatile 1177 filehandles differently, a file attribute is defined which may be used 1178 by the client to determine the filehandle types being returned by the 1179 server. 1181 4.2.1. General Properties of a Filehandle 1183 The filehandle contains all the information the server needs to 1184 distinguish an individual file. To the client, the filehandle is 1185 opaque. The client stores filehandles for use in a later request and 1186 can compare two filehandles from the same server for equality by 1187 doing a byte-by-byte comparison. However, the client MUST NOT 1188 otherwise interpret the contents of filehandles. If two filehandles 1189 from the same server are equal, they MUST refer to the same file. If 1190 they are not equal, the client may use information provided by the 1191 server, in the form of file attributes, to determine whether they 1192 denote the same files or different files. The client would do this 1193 as necessary for client side caching. Servers SHOULD try to maintain 1194 a one-to-one correspondence between filehandles and files but this is 1195 not required. Clients MUST use filehandle comparisons only to 1196 improve performance, not for correct behavior. All clients need to 1197 be prepared for situations in which it cannot be determined whether 1198 two filehandles denote the same object and, in such cases, avoid 1199 making invalid assumptions which might cause incorrect behavior. 1200 Further discussion of filehandle and attribute comparison in the 1201 context of data caching is presented in the section "Data Caching and 1202 File Identity". 1204 As an example, in the case that two different path names when 1205 traversed at the server terminate at the same file system object, the 1206 server SHOULD return the same filehandle for each path. This can 1207 occur if a hard link is used to create two file names which refer to 1208 the same underlying file object and associated data. For example, if 1209 paths /a/b/c and /a/d/c refer to the same file, the server SHOULD 1210 return the same filehandle for both path name traversals. 1212 4.2.2. Persistent Filehandle 1214 A persistent filehandle is defined as having a fixed value for the 1215 lifetime of the file system object to which it refers. Once the 1216 server creates the filehandle for a file system object, the server 1217 MUST accept the same filehandle for the object for the lifetime of 1218 the object. If the server restarts or reboots, the NFS server must 1219 honor the same filehandle value as it did in the server's previous 1220 instantiation. Similarly, if the file system is migrated, the new 1221 NFS server must honor the same file handle as the old NFS server. 1223 The persistent filehandle will become stale or invalid when the 1224 file system object is removed. When the server is presented with a 1225 persistent filehandle that refers to a deleted object, it MUST return 1226 an error of NFS4ERR_STALE. A filehandle may become stale when the 1227 file system containing the object is no longer available. The file 1228 system may become unavailable if it exists on removable media and the 1229 media is no longer available at the server, the file system in 1230 whole has been destroyed, or the file system has simply been removed 1231 from the server's name space (i.e. unmounted in a Unix environment). 1233 4.2.3.
Volatile Filehandle 1235 A volatile filehandle does not share the same longevity 1236 characteristics of a persistent filehandle. The server may determine 1237 that a volatile filehandle is no longer valid at many different 1238 points in time. If the server can definitively determine that a 1239 volatile filehandle refers to an object that has been removed, the 1240 server should return NFS4ERR_STALE to the client (as is the case for 1241 persistent filehandles). In all other cases where the server 1242 determines that a volatile filehandle can no longer be used, it 1243 should return an error of NFS4ERR_FHEXPIRED. 1245 The mandatory attribute "fh_expire_type" is used by the client to 1246 determine what type of filehandle the server is providing for a 1247 particular file system. This attribute is a bitmask with the 1248 following values: 1250 FH4_PERSISTENT 1251 The value of FH4_PERSISTENT is used to indicate a persistent 1252 filehandle, which is valid until the object is removed from the 1253 file system. The server will not return NFS4ERR_FHEXPIRED for 1254 this filehandle. FH4_PERSISTENT is defined as a value in which 1255 none of the bits specified below are set. 1257 FH4_NOEXPIRE_WITH_OPEN 1258 The filehandle will not expire while the client has the file open. 1259 If this bit is set, then the values FH4_VOLATILE_ANY or 1260 FH4_VOL_RENAME do not impact expiration while the file is open. 1261 Once the file is closed or if the FH4_NOEXPIRE_WITH_OPEN bit is 1262 false, the rest of the volatile-related bits apply. 1264 FH4_VOLATILE_ANY 1265 The filehandle may expire at any time and will expire during 1266 system migration and rename. 1268 FH4_VOL_MIGRATION 1269 The filehandle will expire during file system migration. May 1270 only be set if FH4_VOLATILE_ANY is not set. 1272 FH4_VOL_RENAME 1273 The filehandle may expire due to a rename. This includes a 1274 rename by the requesting client or a rename by another client. 1275 May only be set if FH4_VOLATILE_ANY is not set. 1277 Servers which provide volatile filehandles should deny a RENAME or 1278 REMOVE that would affect an OPEN file or any of the components 1279 leading to the OPEN file. In addition, the server should deny all 1280 RENAME or REMOVE requests during the grace or lease period upon 1281 server restart. 1283 The reader may be wondering why there are three FH4_VOL* bits and why 1284 FH4_VOLATILE_ANY is exclusive of FH4_VOL_MIGRATION and 1285 FH4_VOL_RENAME. If a filehandle is normally persistent but 1286 cannot persist across a file set migration, then the presence of the 1287 FH4_VOL_MIGRATION or FH4_VOL_RENAME tells the client that it can 1288 treat the file handle as persistent for purposes of maintaining a 1289 file name to file handle cache, except for the specific event 1290 described by the bit. However, FH4_VOLATILE_ANY tells the client 1291 that it should not maintain such a cache for unopened files. A 1292 server MUST NOT present FH4_VOLATILE_ANY with FH4_VOL_MIGRATION or 1293 FH4_VOL_RENAME as this will lead to confusion. FH4_VOLATILE_ANY 1294 implies that the file handle will expire upon migration or rename, in 1295 addition to other events. 1297 4.2.4. One Method of Constructing a Volatile Filehandle 1299 As mentioned, in some instances a filehandle is stale (no longer 1300 valid; perhaps because the file was removed from the server) or it is 1301 expired (the underlying file is valid but since the filehandle is 1302 volatile, it may have expired).
Thus the server needs to be able to 1303 return NFS4ERR_STALE in the former case and NFS4ERR_FHEXPIRED in the 1304 latter case. This can be done by careful construction of the volatile 1305 filehandle. One possible implementation follows. 1307 A volatile filehandle, while opaque to the client could contain: 1309 [volatile bit = 1 | server boot time | slot | generation number] 1311 o slot is an index in the server volatile filehandle table 1313 o generation number is the generation number for the table 1314 entry/slot 1316 If the server boot time is less than the current server boot time, 1317 return NFS4ERR_FHEXPIRED. If slot is out of range, return 1318 NFS4ERR_BADHANDLE. If the generation number does not match, return 1319 NFS4ERR_FHEXPIRED. 1321 When the server reboots, the table is gone (it is volatile). 1323 If volatile bit is 0, then it is a persistent filehandle with a 1324 different structure following it. 1326 4.3. Client Recovery from Filehandle Expiration 1328 If possible, the client SHOULD recover from the receipt of an 1329 NFS4ERR_FHEXPIRED error. The client must take on additional 1330 responsibility so that it may prepare itself to recover from the 1331 expiration of a volatile filehandle. If the server returns 1332 persistent filehandles, the client does not need these additional 1333 steps. 1335 For volatile filehandles, most commonly the client will need to store 1336 the component names leading up to and including the file system 1337 object in question. With these names, the client should be able to 1338 recover by finding a filehandle in the name space that is still 1339 available or by starting at the root of the server's file system name 1340 space. 1342 If the expired filehandle refers to an object that has been removed 1343 from the file system, obviously the client will not be able to 1344 recover from the expired filehandle. 1346 It is also possible that the expired filehandle refers to a file that 1347 has been renamed. If the file was renamed by another client, again 1348 it is possible that the original client will not be able to recover. 1349 However, in the case that the client itself is renaming the file and 1350 the file is open, it is possible that the client may be able to 1351 recover. The client can determine the new path name based on the 1352 processing of the rename request. The client can then regenerate the 1353 new filehandle based on the new path name. The client could also use 1354 the compound operation mechanism to construct a set of operations 1355 like: 1356 RENAME A B 1357 LOOKUP B 1358 GETFH 1360 5. File Attributes 1362 To meet the requirements of extensibility and increased 1363 interoperability with non-Unix platforms, attributes must be handled 1364 in a flexible manner. The NFS Version 3 fattr3 structure contains a 1365 fixed list of attributes that not all clients and servers are able to 1366 support or care about. The fattr3 structure can not be extended as 1367 new needs arise and it provides no way to indicate non-support. With 1368 the NFS Version 4 protocol, the client will be able to ask what 1369 attributes the server supports and will be able to request only those 1370 attributes in which it is interested. 1372 To this end, attributes will be divided into three groups: mandatory, 1373 recommended, and named. Both mandatory and recommended attributes 1374 are supported in the NFS version 4 protocol by a specific and well- 1375 defined encoding and are identified by number. 
They are requested by 1376 setting a bit in the bit vector sent in the GETATTR request; the 1377 server response includes a bit vector to list what attributes were 1378 returned in the response. New mandatory or recommended attributes 1379 may be added to the NFS protocol between major revisions by 1380 publishing a standards-track RFC which allocates a new attribute 1381 number value and defines the encoding for the attribute. See the 1382 section "Minor Versioning" for further discussion. 1384 Named attributes are accessed by the new OPENATTR operation, which 1385 accesses a hidden directory of attributes associated with a file 1386 system object. OPENATTR takes a filehandle for the object and 1387 returns the filehandle for the attribute hierarchy. The filehandle 1388 for the named attributes is a directory object accessible by LOOKUP 1389 or READDIR and contains files whose names represent the named 1390 attributes and whose data bytes are the value of the attribute. For 1391 example: 1393 LOOKUP "foo" ; look up file 1394 GETATTR attrbits 1395 OPENATTR ; access foo's named attributes 1396 LOOKUP "x11icon" ; look up specific attribute 1397 READ 0,4096 ; read stream of bytes 1399 Named attributes are intended for data needed by applications rather 1400 than by an NFS client implementation. NFS implementors are strongly 1401 encouraged to define their new attributes as recommended attributes 1402 by bringing them to the IETF standards-track process. 1404 The set of attributes which are classified as mandatory is 1405 deliberately small since servers must do whatever it takes to support 1406 them. The recommended attributes may be unsupported; though a server 1407 should support as many as it can. Attributes are deemed mandatory if 1408 the data is both needed by a large number of clients and is not 1409 otherwise reasonably computable by the client when support is not 1410 provided on the server. 1412 5.1. Mandatory Attributes 1414 These MUST be supported by every NFS Version 4 client and server in 1415 order to ensure a minimum level of interoperability. The server must 1416 store and return these attributes and the client must be able to 1417 function with an attribute set limited to these attributes. With 1418 just the mandatory attributes some client functionality may be 1419 impaired or limited in some ways. A client may ask for any of these 1420 attributes to be returned by setting a bit in the GETATTR request and 1421 the server must return their value. 1423 5.2. Recommended Attributes 1425 These attributes are understood well enough to warrant support in the 1426 NFS Version 4 protocol. However, they may not be supported on all 1427 clients and servers. A client may ask for any of these attributes to 1428 be returned by setting a bit in the GETATTR request but must handle 1429 the case where the server does not return them. A client may ask for 1430 the set of attributes the server supports and should not request 1431 attributes the server does not support. A server should be tolerant 1432 of requests for unsupported attributes and simply not return them 1433 rather than considering the request an error. It is expected that 1434 servers will support all attributes they comfortably can and only 1435 fail to support attributes which are difficult to support in their 1436 operating environments. A server should provide attributes whenever 1437 they don't have to "tell lies" to the client. 
For example, a file 1438 modification time should be either an accurate time or should not be 1439 supported by the server. This will not always be comfortable to 1440 clients, but the client is better positioned to 1441 fabricate or construct an attribute or to do without the attribute. 1443 5.3. Named Attributes 1445 These attributes are not supported by direct encoding in the NFS 1446 Version 4 protocol but are accessed by string names rather than 1447 numbers and correspond to an uninterpreted stream of bytes which are 1448 stored with the file system object. The name space for these 1449 attributes may be accessed by using the OPENATTR operation. The 1450 OPENATTR operation returns a filehandle for a virtual "attribute 1451 directory" and further perusal of the name space may be done using 1452 READDIR and LOOKUP operations on this filehandle. Named attributes 1453 may then be examined or changed by normal READ and WRITE and CREATE 1454 operations on the filehandles returned from READDIR and LOOKUP. 1455 Named attributes may have attributes. 1457 It is recommended that servers support arbitrary named attributes. A 1458 client should not depend on the ability to store any named attributes 1459 in the server's file system. If a server does support named 1460 attributes, a client which is also able to handle them should be able 1461 to copy a file's data and meta-data with complete transparency from 1462 one location to another; this would imply that names allowed for 1463 regular directory entries are valid for named attribute names as 1464 well. 1466 Names of attributes will not be controlled by this document or other 1467 IETF standards track documents. See the section "IANA 1468 Considerations" for further discussion. 1470 5.4. Mandatory Attributes - Definitions 1472 Name # DataType Access Description 1473 ___________________________________________________________________ 1474 supp_attr 0 bitmap READ The bit vector which 1475 would retrieve all 1476 mandatory and 1477 recommended attributes 1478 that are supported for 1479 this object. 1481 type 1 nfs4_ftype READ The type of the object 1482 (file, directory, 1483 symlink) 1485 fh_expire_type 2 uint32 READ Server uses this to 1486 specify filehandle 1487 expiration behavior to 1488 the client. See the 1489 section "Filehandles" 1490 for additional 1491 description. 1493 change 3 uint64 READ A value created by the 1494 server that the client 1495 can use to determine 1496 if file data, 1497 directory contents or 1498 attributes of the 1499 object have been 1500 modified. The server 1501 may return the 1502 object's time_modify 1503 attribute for this 1504 attribute's value but 1505 only if the file 1506 system object cannot 1507 be updated more 1508 frequently than the 1509 resolution of 1510 time_modify. 1512 size 4 uint64 R/W The size of the object 1513 in bytes. 1515 link_support 5 boolean READ Does the object's file 1516 system support hard 1517 links? 1519 symlink_support 6 boolean READ Does the object's file 1520 system support 1521 symbolic links? 1523 named_attr 7 boolean READ Does this object have 1524 named attributes? 1526 fsid 8 fsid4 READ Unique file system 1527 identifier for the 1528 file system holding 1529 this object. fsid 1530 contains major and 1531 minor components each 1532 of which is a uint64. 1534 unique_handles 9 boolean READ Are two distinct 1535 filehandles guaranteed 1536 to refer to two 1537 different file system 1538 objects?
1540 lease_time 10 nfs_lease4 READ Duration of leases at 1541 server in seconds. 1543 rdattr_error 11 enum READ Error returned from 1544 getattr during 1545 readdir. 1547 5.5. Recommended Attributes - Definitions 1549 Name # Data Type Access Description 1550 _____________________________________________________________________ 1551 ACL 12 nfsace4<> R/W The access control 1552 list for the object. 1554 aclsupport 13 uint32 READ Indicates what types 1555 of ACLs are supported 1556 on the current file 1557 system. 1559 archive 14 boolean R/W Whether or not this 1560 file has been 1561 archived since the 1562 time of last 1563 modification 1564 (deprecated in favor 1565 of time_backup). 1567 cansettime 15 boolean READ Is the server able to 1568 change the times for 1569 a file system object 1570 as specified in a 1571 SETATTR operation? 1573 case_insensitive 16 boolean READ Are filename 1574 comparisons on this 1575 file system case 1576 insensitive? 1578 case_preserving 17 boolean READ Is filename case on 1579 this file system 1580 preserved? 1582 chown_restricted 18 boolean READ If TRUE, the server 1583 will reject any 1584 request to change 1585 either the owner or 1586 the group associated 1587 with a file if the 1588 caller is not a 1589 privileged user (for 1590 example, "root" in 1591 Unix operating 1592 environments or in NT 1593 the "Take Ownership" 1594 privilege) 1596 filehandle 19 nfs4_fh READ The filehandle of 1597 this object 1598 (primarily for 1599 readdir requests). 1601 fileid 20 uint64 READ A number uniquely 1602 identifying the file 1603 within the file 1604 system. 1606 files_avail 21 uint64 READ File slots available 1607 to this user on the 1608 file system 1609 containing this 1610 object - this should 1611 be the smallest 1612 relevant limit. 1614 files_free 22 uint64 READ Free file slots on 1615 the file system 1616 containing this 1617 object - this should 1618 be the smallest 1619 relevant limit. 1621 files_total 23 uint64 READ Total file slots on 1622 the file system 1623 containing this 1624 object. 1626 fs_locations 24 fs_locations READ Locations where this 1627 file system may be 1628 found. If the server 1629 returns NFS4ERR_MOVED 1630 as an error, this 1631 attribute must be 1632 supported. 1634 hidden 25 boolean R/W Is file considered 1635 hidden with respect 1636 to the WIN32 API? 1638 homogeneous 26 boolean READ Whether or not this 1639 object's file system 1640 is homogeneous, i.e. 1641 are per file system 1642 attributes the same 1643 for all file system's 1644 objects. 1646 maxfilesize 27 uint64 READ Maximum supported 1647 file size for the 1648 file system of this 1649 object. 1651 maxlink 28 uint32 READ Maximum number of 1652 links for this 1653 object. 1655 maxname 29 uint32 READ Maximum filename size 1656 supported for this 1657 object. 1659 maxread 30 uint64 READ Maximum read size 1660 supported for this 1661 object. 1663 maxwrite 31 uint64 READ Maximum write size 1664 supported for this 1665 object. This 1666 attribute SHOULD be 1667 supported if the file 1668 is writable. Lack of 1669 this attribute can 1670 lead to the client 1671 either wasting 1672 bandwidth or not 1673 receiving the best 1674 performance. 1676 mimetype 32 utf8<> R/W MIME body 1677 type/subtype of this 1678 object. 1680 mode 33 mode4 R/W Unix-style permission 1681 bits for this object 1682 (deprecated in favor 1683 of ACLs) 1685 no_trunc 34 boolean READ If a name longer than 1686 name_max is used, 1687 will an error be 1688 returned or will the 1689 name be truncated? 
1691 numlinks 35 uint32 READ Number of hard links 1692 to this object. 1694 owner 36 utf8<> R/W The string name of 1695 the owner of this 1696 object. 1698 owner_group 37 utf8<> R/W The string name of 1699 the group ownership 1700 of this object. 1702 quota_hard 38 uint64 READ For definition see 1703 "Quota Attributes" 1704 section below. 1706 quota_soft 39 uint64 READ For definition see 1707 "Quota Attributes" 1708 section below. 1710 quota_used 40 uint64 READ For definition see 1711 "Quota Attributes" 1712 section below. 1714 rawdev 41 specdata4 READ Raw device 1715 identifier. Unix 1716 device major/minor 1717 node information. 1719 space_avail 42 uint64 READ Disk space in bytes 1720 available to this 1721 user on the file 1722 system containing 1723 this object - this 1724 should be the 1725 smallest relevant 1726 limit. 1728 space_free 43 uint64 READ Free disk space in 1729 bytes on the file 1730 system containing 1731 this object - this 1732 should be the 1733 smallest relevant 1734 limit. 1736 space_total 44 uint64 READ Total disk space in 1737 bytes on the file 1738 system containing 1739 this object. 1741 space_used 45 uint64 READ Number of file system 1742 bytes allocated to 1743 this object. 1745 system 46 boolean R/W Is this file a system 1746 file with respect to 1747 the WIN32 API? 1749 time_access 47 nfstime4 READ The time of last 1750 access to the object. 1752 time_access_set 48 settime4 WRITE Set the time of last 1753 access to the object. 1754 SETATTR use only. 1756 time_backup 49 nfstime4 R/W The time of last 1757 backup of the object. 1759 time_create 50 nfstime4 R/W The time of creation 1760 of the object. This 1761 attribute does not 1762 have any relation to 1763 the traditional Unix 1764 file attribute 1765 "ctime" or "change 1766 time". 1768 time_delta 51 nfstime4 READ Smallest useful 1769 server time 1770 granularity. 1772 time_metadata 52 nfstime4 R/W The time of last 1773 meta-data 1774 modification of the 1775 object. 1777 time_modify 53 nfstime4 READ The time of last 1778 modification to the 1779 object. 1781 time_modify_set 54 settime4 WRITE Set the time of last 1782 modification to the 1783 object. SETATTR use 1784 only. 1786 5.6. Interpreting owner and owner_group 1788 The recommended attributes "owner" and "owner_group" are represented 1789 in terms of a UTF-8 string. To avoid a representation that is tied 1790 to a particular underlying implementation at the client or server, 1791 the use of the UTF-8 string has been chosen. Note that section 6.1 1792 of [RFC2624] provides additional rationale. It is expected that the 1793 client and server will have their own local representation of owner 1794 and owner_group that is used for local storage or presentation to the 1795 end user. Therefore, it is expected that when these attributes are 1796 transferred between the client and server that the local 1797 representation is translated to a syntax of the form 1798 "user@dns_domain". This will allow for a client and server that do 1799 not use the same local representation the ability to translate to a 1800 common syntax that can be interpreted by both. 1802 The translation is not specified as part of the protocol. This 1803 allows various solutions to be employed. For example, a local 1804 translation table may be consulted that maps between a numeric id to 1805 the user@dns_domain syntax. A name service may also be used to 1806 accomplish the translation. The "dns_domain" portion of the owner 1807 string is meant to be a DNS domain name. 
For example, user@ietf.org. 1809 In the case where there is no translation available to the client or 1810 server, the attribute value must be constructed without the "@". 1811 Therefore, the absence of the @ from the owner or owner_group 1812 attribute signifies that no translation was available and the 1813 receiver of the attribute should not attach any special meaning to 1814 the attribute value. Even though the attribute value cannot be 1815 translated, it may still be useful. In the case of a client, the 1816 attribute string may be used for local display of ownership. 1818 5.7. Character Case Attributes 1820 With respect to the case_insensitive and case_preserving attributes, 1821 each UCS-4 character (which UTF-8 encodes) has a "long descriptive 1822 name" [RFC1345] which may or may not include the word "CAPITAL" or 1823 "SMALL". The presence of SMALL or CAPITAL allows an NFS server to 1824 implement unambiguous and efficient table driven mappings for case 1825 insensitive comparisons and non-case-preserving storage. For 1826 general character handling and internationalization issues, see the 1827 section "Internationalization". 1829 5.8. Quota Attributes 1831 For the attributes related to file system quotas, the following 1832 definitions apply: 1834 quota_soft 1835 The value in bytes which represents the amount of additional 1836 disk space that can be allocated to this file or directory 1837 before the user may reasonably be warned. It is understood that 1838 this space may be consumed by allocations to other files or 1839 directories though there is a rule as to which other files or 1840 directories. 1842 quota_hard 1843 The value in bytes which represents the amount of additional disk 1844 space beyond the current allocation that can be allocated to 1845 this file or directory before further allocations will be 1846 refused. It is understood that this space may be consumed by 1847 allocations to other files or directories. 1849 quota_used 1850 The value in bytes which represents the amount of disk space used 1851 by this file or directory and possibly a number of other similar 1852 files or directories, where the set of "similar" meets at least 1853 the criterion that allocating space to any file or directory in 1854 the set will reduce the "quota_hard" of every other file 1855 or directory in the set. 1857 Note that there may be a number of distinct but overlapping sets 1858 of files or directories for which a quota_used value is 1859 maintained. E.g. "all files with a given owner", "all files with 1860 a given group owner", etc. 1862 The server is at liberty to choose any of those sets but should 1863 do so in a repeatable way. The rule may be configured per- 1864 filesystem or may be "choose the set with the smallest quota". 1866 5.9. Access Control Lists 1868 The NFS ACL attribute is an array of access control entries (ACEs). 1869 There are various access control entry types. The server is able to 1870 communicate which ACE types are supported by returning the 1871 appropriate value within the aclsupport attribute. The types of ACEs 1872 are defined as follows: 1874 Type Description 1875 _____________________________________________________ 1876 ALLOW Explicitly grants the access defined in 1877 acemask4 to the file or directory. 1879 DENY Explicitly denies the access defined in 1880 acemask4 to the file or directory. 1882 AUDIT LOG (system dependent) any access 1883 attempt to a file or directory which 1884 uses any of the access methods specified 1885 in acemask4.
1887 ALARM Generate a system ALARM (system 1888 dependent) when any access attempt is 1889 made to a file or directory for the 1890 access methods specified in acemask4. 1892 The NFS ACE attribute is defined as follows: 1894 typedef uint32_t acetype4; 1895 typedef uint32_t aceflag4; 1896 typedef uint32_t acemask4; 1897 struct nfsace4 { 1898 acetype4 type; 1899 aceflag4 flag; 1900 acemask4 access_mask; 1901 utf8string who; 1902 }; 1904 To determine if an ACCESS or OPEN request succeeds, each nfsace4 entry 1905 is processed in order by the server. Only ACEs which have a "who" 1906 that matches the requester are considered. Each ACE is processed 1907 until all of the bits of the requester's access have been ALLOWED. 1908 Once a bit (see below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it 1909 is no longer considered in the processing of later ACEs. If an 1910 ACCESS_DENIED_ACE is encountered where the requester's mode still has 1911 unALLOWED bits in common with the "access_mask" of the ACE, the 1912 request is denied. 1914 The bitmask constants used to represent the above definitions within 1915 the aclsupport attribute are as follows: 1917 const ACL4_SUPPORT_ALLOW_ACL = 0x00000001; 1918 const ACL4_SUPPORT_DENY_ACL = 0x00000002; 1919 const ACL4_SUPPORT_AUDIT_ACL = 0x00000004; 1920 const ACL4_SUPPORT_ALARM_ACL = 0x00000008; 1922 5.9.1. ACE type 1924 The semantics of the "type" field follow the descriptions provided 1925 above. 1927 The bitmask constants used for the type field are as follows: 1929 const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000; 1930 const ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001; 1931 const ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002; 1932 const ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003; 1934 5.9.2. ACE flag 1936 The "flag" field contains values based on the following descriptions. 1938 ACE4_FILE_INHERIT_ACE 1940 Can be placed on a directory and indicates that this ACE should be 1941 added to each new non-directory file created. 1943 ACE4_DIRECTORY_INHERIT_ACE 1944 Can be placed on a directory and indicates that this ACE should be 1945 added to each new directory created. 1947 ACE4_INHERIT_ONLY_ACE 1949 Can be placed on a directory but does not apply to the directory, 1950 only to newly created files/directories as specified by the above two 1951 flags. 1953 ACE4_NO_PROPAGATE_INHERIT_ACE 1955 Can be placed on a directory. Normally when a new directory is 1956 created and an ACE exists on the parent directory which is marked 1957 ACE4_DIRECTORY_INHERIT_ACE, two ACEs are placed on the new directory: 1958 one for the directory itself and one which is an inheritable ACE for 1959 newly created directories. This flag tells the server not to place 1960 an ACE on the newly created directory which is inheritable by 1961 subdirectories of the created directory. 1963 ACE4_SUCCESSFUL_ACCESS_ACE_FLAG 1965 ACE4_FAILED_ACCESS_ACE_FLAG 1967 Both indicate, for AUDIT and ALARM ACEs, the circumstances under which the event is logged. On 1968 every ACCESS or OPEN call which occurs on a file or directory which 1969 has an ACE of type ACE4_SYSTEM_AUDIT_ACE_TYPE or 1970 ACE4_SYSTEM_ALARM_ACE_TYPE, the attempted access is compared to the 1971 acemask4 of these ACEs. If the access is a subset of the acemask4 and the 1972 identifier matches, an AUDIT trail or an ALARM is generated. By 1973 default this happens regardless of the success or failure of the 1974 ACCESS or OPEN call. 1976 The flag ACE4_SUCCESSFUL_ACCESS_ACE_FLAG only produces the AUDIT or 1977 ALARM if the ACCESS or OPEN call is successful.
The 1978 ACE4_FAILED_ACCESS_ACE_FLAG causes the ALARM or AUDIT if the ACCESS 1979 or OPEN call fails. 1981 ACE4_IDENTIFIER_GROUP 1983 Indicates that the "who" refers to a GROUP as defined under Unix. 1985 The bitmask constants used for the flag field are as follows: 1987 const ACE4_FILE_INHERIT_ACE = 0x00000001; 1988 const ACE4_DIRECTORY_INHERIT_ACE = 0x00000002; 1989 const ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004; 1990 const ACE4_INHERIT_ONLY_ACE = 0x00000008; 1991 const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG = 0x00000010; 1992 const ACE4_FAILED_ACCESS_ACE_FLAG = 0x00000020; 1993 const ACE4_IDENTIFIER_GROUP = 0x00000040; 1995 5.9.3. ACE Access Mask 1997 The access_mask field contains values based on the following: 1999 Access Description 2000 _______________________________________________________________ 2001 READ_DATA Permission to read the data of the file 2002 LIST_DIRECTORY Permission to list the contents of a 2003 directory 2004 WRITE_DATA Permission to modify the file's data 2005 ADD_FILE Permission to add a new file to a 2006 directory 2007 APPEND_DATA Permission to append data to a file 2008 ADD_SUBDIRECTORY Permission to create a subdirectory to a 2009 directory 2010 READ_NAMED_ATTRS Permission to read the named attributes 2011 of a file 2012 WRITE_NAMED_ATTRS Permission to write the named attributes 2013 of a file 2014 EXECUTE Permission to execute a file 2015 DELETE_CHILD Permission to delete a file or directory 2016 within a directory 2017 READ_ATTRIBUTES The ability to read basic attributes 2018 (non-acls) of a file 2019 WRITE_ATTRIBUTES Permission to change basic attributes 2020 (non-acls) of a file 2022 DELETE Permission to Delete the file 2023 READ_ACL Permission to Read the ACL 2024 WRITE_ACL Permission to Write the ACL 2025 WRITE_OWNER Permission to change the owner 2026 SYNCHRONIZE Permission to access file locally at the 2027 server with synchronous reads and writes 2029 The bitmask constants used for the access mask field are as follows: 2031 const ACE4_READ_DATA = 0x00000001; 2032 const ACE4_LIST_DIRECTORY = 0x00000001; 2033 const ACE4_WRITE_DATA = 0x00000002; 2034 const ACE4_ADD_FILE = 0x00000002; 2035 const ACE4_APPEND_DATA = 0x00000004; 2036 const ACE4_ADD_SUBDIRECTORY = 0x00000004; 2037 const ACE4_READ_NAMED_ATTRS = 0x00000008; 2038 const ACE4_WRITE_NAMED_ATTRS = 0x00000010; 2039 const ACE4_EXECUTE = 0x00000020; 2040 const ACE4_DELETE_CHILD = 0x00000040; 2041 const ACE4_READ_ATTRIBUTES = 0x00000080; 2042 const ACE4_WRITE_ATTRIBUTES = 0x00000100; 2044 const ACE4_DELETE = 0x00010000; 2045 const ACE4_READ_ACL = 0x00020000; 2046 const ACE4_WRITE_ACL = 0x00040000; 2047 const ACE4_WRITE_OWNER = 0x00080000; 2048 const ACE4_SYNCHRONIZE = 0x00100000; 2050 5.9.4. ACE who 2052 There are several special identifiers ("who") which need to be 2053 understood universally. Some of these identifiers cannot be 2054 understood when an NFS client accesses the server, but have meaning 2055 when a local process accesses the file. The ability to display and 2056 modify these permissions is permitted over NFS. 2058 Who Description 2059 _______________________________________________________________ 2060 "OWNER" The owner of the file. 2061 "GROUP" The group associated with the file. 2062 "EVERYONE" The world. 2063 "INTERACTIVE" Accessed from an interactive terminal. 2064 "NETWORK" Accessed via the network. 2065 "DIALUP" Accessed as a dialup user to the server. 2066 "BATCH" Accessed from a batch job. 2067 "ANONYMOUS" Accessed without any authentication. 
2068 "AUTHENTICATED" Any authenticated user (opposite of 2069 ANONYMOUS) 2070 "SERVICE" Access from a system service. 2072 To avoid conflict, these special identifiers are distinguish by an 2073 appended "@" and should appear in the form "xxxx@" (note: no domain 2074 name after the "@"). For example: ANONYMOUS@. 2076 6. File System Migration and Replication 2078 With the use of the recommended attribute "fs_locations", the NFS 2079 version 4 server has a method of providing file system migration or 2080 replication services. For the purposes of migration and replication, 2081 a file system will be defined as all files that share a given fsid 2082 (both major and minor values are the same). 2084 The fs_locations attribute provides a list of file system locations. 2085 These locations are specified by providing the server name (either 2086 DNS domain or IP address) and the path name representing the root of 2087 the file system. Depending on the type of service being provided, 2088 the list will provide a new location or a set of alternate locations 2089 for the file system. The client will use this information to 2090 redirect its requests to the new server. 2092 6.1. Replication 2094 It is expected that file system replication will be used in the case 2095 of read-only data. Typically, the file system will be replicated on 2096 two or more servers. The fs_locations attribute will provide the 2097 list of these locations to the client. On first access of the file 2098 system, the client should obtain the value of the fs_locations 2099 attribute. If, in the future, the client finds the server 2100 unresponsive, the client may attempt to use another server specified 2101 by fs_locations. 2103 If applicable, the client must take the appropriate steps to recover 2104 valid filehandles from the new server. This is described in more 2105 detail in the following sections. 2107 6.2. Migration 2109 File system migration is used to move a file system from one server 2110 to another. Migration is typically used for a file system that is 2111 writable and has a single copy. The expected use of migration is for 2112 load balancing or general resource reallocation. The protocol does 2113 not specify how the file system will be moved between servers. This 2114 server-to-server transfer mechanism is left to the server 2115 implementor. However, the method used to communicate the migration 2116 event between client and server is specified here. 2118 Once the servers participating in the migration have completed the 2119 move of the file system, the error NFS4ERR_MOVED will be returned for 2120 subsequent requests received by the original server. The 2121 NFS4ERR_MOVED error is returned for all operations except GETATTR. 2122 Upon receiving the NFS4ERR_MOVED error, the client will obtain the 2123 value of the fs_locations attribute. The client will then use the 2124 contents of the attribute to redirect its requests to the specified 2125 server. To facilitate the use of GETATTR, operations such as PUTFH 2126 must also be accepted by the server for the migrated file system's 2127 filehandles. Note that if the server returns NFS4ERR_MOVED, the 2128 server MUST support the fs_locations attribute. 2130 If the client requests more attributes than just fs_locations, the 2131 server may return fs_locations only. This is to be expected since 2132 the server has migrated the file system and may not have a method of 2133 obtaining additional attribute data. 
2135 The server implementor needs to be careful in developing a migration 2136 solution. The server must consider all of the state information 2137 clients may have outstanding at the server. This includes but is not 2138 limited to locking/share state, delegation state, and asynchronous 2139 file writes which are represented by WRITE and COMMIT verifiers. The 2140 server should strive to minimize the impact on its clients during and 2141 after the migration process. 2143 6.3. Interpretation of the fs_locations Attribute 2145 The fs_location attribute is structured in the following way: 2147 struct fs_location { 2148 utf8string server<>; 2149 pathname4 rootpath; 2150 }; 2152 struct fs_locations { 2153 pathname4 fs_root; 2154 fs_location locations<>; 2155 }; 2157 The fs_location struct is used to represent the location of a file 2158 system by providing a server name and the path to the root of the 2159 file system. For a multi-homed server or a set of servers that use 2160 the same rootpath, an array of server names may be provided. An 2161 entry in the server array is an UTF8 string and represents one of a 2162 traditional DNS host name, IPv4 address, or IPv6 address. It is not 2163 a requirement that all servers that share the same rootpath be listed 2164 in one fs_location struct. The array of server names is provided for 2165 convenience. Servers that share the same rootpath may also be listed 2166 in separate fs_location entries in the fs_locations attribute. 2168 The fs_locations struct and attribute then contains an array of 2169 locations. Since the name space of each server may be constructed 2170 differently, the "fs_root" field is provided. The path represented 2171 by fs_root represents the location of the file system in the server's 2172 name space. Therefore, the fs_root path is only associated with the 2173 server from which the fs_locations attribute was obtained. The 2174 fs_root path is meant to aid the client in locating the file system 2175 at the various servers listed. 2177 As an example, there is a replicated file system located at two 2178 servers (servA and servB). At servA the file system is located at 2179 path "/a/b/c". At servB the file system is located at path "/x/y/z". 2180 In this example the client accesses the file system first at servA 2181 with a multi-component lookup path of "/a/b/c/d". Since the client 2182 used a multi-component lookup to obtain the filehandle at "/a/b/c/d", 2183 it is unaware that the file system's root is located in servA's name 2184 space at "/a/b/c". When the client switches to servB, it will need 2185 to determine that the directory it first referenced at servA is now 2186 represented by the path "/x/y/z/d" on servB. To facilitate this, the 2187 fs_locations attribute provided by servA would have a fs_root value 2188 of "/a/b/c" and two entries in fs_location. One entry in fs_location 2189 will be for itself (servA) and the other will be for servB with a 2190 path of "/x/y/z". With this information, the client is able to 2191 substitute "/x/y/z" for the "/a/b/c" at the beginning of its access 2192 path and construct "/x/y/z/d" to use for the new server. 2194 6.4. Filehandle Recovery for Migration or Replication 2196 Filehandles for file systems that are replicated or migrated 2197 generally have the same semantics as for file systems that are not 2198 replicated or migrated. 
For example, if a file system has persistent 2199 filehandles and it is migrated to another server, the filehandle 2200 values for the file system will be valid at the new server. 2202 For volatile filehandles, the servers involved likely do not have a 2203 mechanism to transfer filehandle format and content between 2204 themselves. Therefore, a server may have difficulty in determining 2205 if a volatile filehandle from an old server should return an error of 2206 NFS4ERR_FHEXPIRED. Therefore, the client is informed, with the use 2207 of the fh_expire_type attribute, whether volatile filehandles will 2208 expire at the migration or replication event. If the bit 2209 FH4_VOL_MIGRATION is set in the fh_expire_type attribute, the client 2210 must treat the volatile filehandle as if the server had returned the 2211 NFS4ERR_FHEXPIRED error. At the migration or replication event in 2212 the presence of the FH4_VOL_MIGRATION bit, the client will not 2213 present the original or old volatile file handle to the new server. 2214 The client will start its communication with the new server by 2215 recovering its filehandles using the saved file names. 2217 7. NFS Server Name Space 2219 7.1. Server Exports 2221 On a UNIX server the name space describes all the files reachable by 2222 pathnames under the root directory or "/". On a Windows NT server 2223 the name space constitutes all the files on disks named by mapped 2224 disk letters. NFS server administrators rarely make the entire 2225 server's file system name space available to NFS clients. More often 2226 portions of the name space are made available via an "export" 2227 feature. In previous versions of the NFS protocol, the root 2228 filehandle for each export is obtained through the MOUNT protocol; 2229 the client sends a string that identifies the export of name space 2230 and the server returns the root filehandle for it. The MOUNT 2231 protocol supports an EXPORTS procedure that will enumerate the 2232 server's exports. 2234 7.2. Browsing Exports 2236 The NFS version 4 protocol provides a root filehandle that clients 2237 can use to obtain filehandles for these exports via a multi-component 2238 LOOKUP. A common user experience is to use a graphical user 2239 interface (perhaps a file "Open" dialog window) to find a file via 2240 progressive browsing through a directory tree. The client must be 2241 able to move from one export to another export via single-component, 2242 progressive LOOKUP operations. 2244 This style of browsing is not well supported by the NFS version 2 and 2245 3 protocols. The client expects all LOOKUP operations to remain 2246 within a single server file system. For example, the device 2247 attribute will not change. This prevents a client from taking name 2248 space paths that span exports. 2250 An automounter on the client can obtain a snapshot of the server's 2251 name space using the EXPORTS procedure of the MOUNT protocol. If it 2252 understands the server's pathname syntax, it can create an image of 2253 the server's name space on the client. The parts of the name space 2254 that are not exported by the server are filled in with a "pseudo file 2255 system" that allows the user to browse from one mounted file system 2256 to another. There is a drawback to this representation of the 2257 server's name space on the client: it is static. If the server 2258 administrator adds a new export the client will be unaware of it. 2260 7.3. 
Server Pseudo File System 2262 NFS version 4 servers avoid this name space inconsistency by 2263 presenting all the exports within the framework of a single server 2264 name space. An NFS version 4 client uses LOOKUP and READDIR 2265 operations to browse seamlessly from one export to another. Portions 2266 of the server name space that are not exported are bridged via a 2267 "pseudo file system" that provides a view of exported directories 2268 only. A pseudo file system has a unique fsid and behaves like a 2269 normal, read-only file system. 2271 Based on the construction of the server's name space, it is possible 2272 that multiple pseudo file systems may exist. For example, 2274 /a pseudo file system 2275 /a/b real file system 2276 /a/b/c pseudo file system 2277 /a/b/c/d real file system 2279 Each of the pseudo file systems is considered a separate entity and 2280 therefore will have a unique fsid. 2282 7.4. Multiple Roots 2284 The DOS and Windows operating environments are sometimes described as 2285 having "multiple roots". File systems are commonly represented as 2286 disk letters. MacOS represents file systems as top level names. NFS 2287 version 4 servers for these platforms can construct a pseudo file 2288 system above these root names so that disk letters or volume names 2289 are simply directory names in the pseudo root. 2291 7.5. Filehandle Volatility 2293 The nature of the server's pseudo file system is that it is a logical 2294 representation of file system(s) available from the server. 2295 Therefore, the pseudo file system is most likely constructed 2296 dynamically when the server is first instantiated. It is expected 2297 that the pseudo file system may not have an on-disk counterpart from 2298 which persistent filehandles could be constructed. Even though it is 2299 preferable that the server provide persistent filehandles for the 2300 pseudo file system, the NFS client should expect that pseudo file 2301 system filehandles are volatile. This can be confirmed by checking 2302 the associated "fh_expire_type" attribute for those filehandles in 2303 question. If the filehandles are volatile, the NFS client must be 2304 prepared to recover a filehandle value (e.g. with a multi-component 2305 LOOKUP) when receiving an error of NFS4ERR_FHEXPIRED. 2307 7.6. Exported Root 2309 If the server's root file system is exported, one might conclude that 2310 a pseudo file system is not needed. This would be wrong. Assume the 2311 following file systems on a server: 2313 / disk1 (exported) 2314 /a disk2 (not exported) 2315 /a/b disk3 (exported) 2317 Because disk2 is not exported, disk3 cannot be reached with simple 2318 LOOKUPs. The server must bridge the gap with a pseudo file system. 2320 7.7. Mount Point Crossing 2322 The server file system environment may be constructed in such a way 2323 that one file system contains a directory which is 'covered' or 2324 mounted upon by a second file system. For example: 2326 /a/b (file system 1) 2327 /a/b/c/d (file system 2) 2329 The pseudo file system for this server may be constructed to look 2330 like: 2332 / (place holder/not exported) 2333 /a/b (file system 1) 2334 /a/b/c/d (file system 2) 2336 It is the server's responsibility to present a complete pseudo file system 2337 to the client. If the client sends a lookup request 2338 for the path "/a/b/c/d", the server's response is the filehandle of 2339 the file system "/a/b/c/d".
In previous versions of the NFS 2340 protocol, the server would respond with the directory "/a/b/c/d" 2341 within the file system "/a/b". 2343 The NFS client will be able to determine if it crosses a server mount 2344 point by a change in the value of the "fsid" attribute. 2346 7.8. Security Policy and Name Space Presentation 2348 The application of the server's security policy needs to be carefully 2349 considered by the implementor. One may choose to limit the 2350 viewability of portions of the pseudo file system based on the 2351 server's perception of the client's ability to authenticate itself 2352 properly. However, with the support of multiple security mechanisms 2353 and the ability to negotiate the appropriate use of these mechanisms, 2354 the server is unable to properly determine if a client will be able 2355 to authenticate itself. If, based on its policies, the server 2356 chooses to limit the contents of the pseudo file system, the server 2357 may effectively hide file systems from a client that may otherwise 2358 have legitimate access. 2360 8. File Locking and Share Reservations 2362 Integrating locking into the NFS protocol necessarily causes it to be 2363 stateful. With the inclusion of "share" file locks the protocol 2364 becomes substantially more dependent on state than the traditional 2365 combination of NFS and NLM [XNFS]. There are three components to 2366 making this state manageable: 2368 o Clear division between client and server 2370 o Ability to reliably detect inconsistency in state between client 2371 and server 2373 o Simple and robust recovery mechanisms 2375 In this model, the server owns the state information. The client 2376 communicates its view of this state to the server as needed. The 2377 client is also able to detect inconsistent state before modifying a 2378 file. 2380 To support Win32 "share" locks it is necessary to atomically OPEN or 2381 CREATE files. Having a separate share/unshare operation would not 2382 allow correct implementation of the Win32 OpenFile API. In order to 2383 correctly implement share semantics, the previous NFS protocol 2384 mechanisms used when a file is opened or created (LOOKUP, CREATE, 2385 ACCESS) need to be replaced. The NFS version 4 protocol has an OPEN 2386 operation that subsumes the functionality of LOOKUP, CREATE, and 2387 ACCESS. However, because many operations require a filehandle, the 2388 traditional LOOKUP is preserved to map a file name to a filehandle 2389 without establishing state on the server. The policy of granting 2390 access or modifying files is managed by the server based on the 2391 client's state. These mechanisms can implement policy ranging from 2392 advisory-only locking to full mandatory locking. 2394 8.1. Locking 2396 It is assumed that manipulating a lock is rare when compared to READ 2397 and WRITE operations. It is also assumed that crashes and network 2398 partitions are relatively rare. Therefore it is important that the 2399 READ and WRITE operations have a lightweight mechanism to indicate if 2400 they possess a held lock. A lock request contains the heavyweight 2401 information required to establish a lock and uniquely define the lock 2402 owner. 2404 The following sections describe the transition from the heavyweight 2405 information to the eventual stateid used for most client and server 2406 locking and lease interactions. 2408 8.1.1. Client ID 2410 For each LOCK request, the client must identify itself to the server.
2412 This is done in such a way as to allow for correct lock 2413 identification and crash recovery. Client identification is 2414 accomplished with two values. 2416 o A verifier that is used to detect client reboots. 2418 o A variable length opaque array to uniquely define a client. 2420 For an operating system this may be a fully qualified host 2421 name or IP address. For a user level NFS client it may 2422 additionally contain a process id or other unique sequence. 2424 The data structure for the Client ID would then appear as: 2426 struct nfs_client_id { 2427 opaque verifier[4]; 2428 opaque id<>; 2429 } 2431 It is possible through the mis-configuration of a client or the 2432 existence of a rogue client that two clients end up using the same 2433 nfs_client_id. This situation is avoided by "negotiating" the 2434 nfs_client_id between client and server with the use of the 2435 SETCLIENTID and SETCLIENTID_CONFIRM operations. The following 2436 describes the two scenarios of negotiation. 2438 1 Client has never connected to the server 2440 In this case the client generates an nfs_client_id and 2441 unless another client has the same nfs_client_id.id field, 2442 the server accepts the request. The server also records the 2443 principal (or principal to uid mapping) from the credential 2444 in the RPC request that contains the nfs_client_id 2445 negotiation request (SETCLIENTID operation). 2447 Two clients might still use the same nfs_client_id.id due 2448 to perhaps configuration error. For example, a High 2449 Availability configuration where the nfs_client_id.id is 2450 derived from the ethernet controller address and both 2451 systems have the same address. In this case, the result is 2452 a switched union that returns, in addition to 2453 NFS4ERR_CLID_INUSE, the network address (the rpcbind netid 2454 and universal address) of the client that is using the id. 2456 2 Client is re-connecting to the server after a client reboot 2458 In this case, the client still generates an nfs_client_id 2459 but the nfs_client_id.id field will be the same as the 2460 nfs_client_id.id generated prior to reboot. If the server 2461 finds that the principal/uid is equal to the previously 2462 "registered" nfs_client_id.id, then locks associated with 2463 the old nfs_client_id are immediately released. If the 2464 principal/uid is not equal, then this is a rogue client and 2465 the request is returned in error. For more discussion of 2466 crash recovery semantics, see the section on "Crash 2467 Recovery". 2469 It is possible for a retransmission of request to be 2470 received by the server after the server has acted upon and 2471 responded to the original client request. Therefore to 2472 mitigate effects of the retransmission of the SETCLIENTID 2473 operation, the client and server use a confirmation step. 2474 The server returns a confirmation verifier that the client 2475 then sends to the server in the SETCLIENTID_CONFIRM 2476 operation. Once the server receives the confirmation from 2477 the client, the locking state for the client is released. 2479 In both cases, upon success, NFS4_OK is returned. To help reduce the 2480 amount of data transferred on OPEN and LOCK, the server will also 2481 return a unique 64-bit clientid value that is a shorthand reference 2482 to the nfs_client_id values presented by the client. From this point 2483 forward, the client will use the clientid to refer to itself. 
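The following non-normative sketch shows one way a client might fill in the nfs_client_id presented to SETCLIENTID. The use of the boot time for the verifier, gethostname() for the id, the "nfs4-client/" prefix, and the field sizes shown are illustrative assumptions only; the protocol requires merely that the verifier change across client reboots while the id stays fixed.

      #include <stdio.h>
      #include <string.h>
      #include <time.h>
      #include <unistd.h>

      /* Illustrative in-memory form of the nfs_client_id above. */
      struct nfs_client_id_example {
          unsigned char verifier[4];   /* changes on every client reboot */
          char          id[256];       /* stable across reboots          */
          size_t        id_len;
      };

      void build_client_id(struct nfs_client_id_example *cid, time_t boot_time)
      {
          char host[128];

          /* Verifier: derived from the boot time so that a rebooted
             client presents a different verifier with the same id. */
          memcpy(cid->verifier, &boot_time, sizeof(cid->verifier));

          /* Id: a value that uniquely identifies this client and does
             not change across reboots, e.g. the fully qualified host
             name; a user level client might append a process id. */
          gethostname(host, sizeof(host));
          host[sizeof(host) - 1] = '\0';
          cid->id_len = (size_t) snprintf(cid->id, sizeof(cid->id),
                                          "nfs4-client/%s", host);
      }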
2485 The clientid assigned by the server should be chosen so that it will 2486 not conflict with a clientid previously assigned by the server. This 2487 applies across server restarts or reboots. When a clientid is 2488 presented to a server and that clientid is not recognized, as would 2489 happen after a server reboot, the server will reject the request with 2490 the error NFS4ERR_STALE_CLIENTID. When this happens, the client must 2491 obtain a new clientid by use of the SETCLIENTID operation and then 2492 proceed to any other necessary recovery for the server reboot case 2493 (See the section "Server Failure and Recovery"). 2495 The client must also employ the SETCLIENTID operation when it 2496 receives a NFS4ERR_STALE_STATEID error using a stateid derived from 2497 its current clientid, since this also indicates a server reboot which 2498 has invalidated the existing clientid (see the next section 2499 "nfs_lockowner and stateid Definition" for details). 2501 8.1.2. Server Release of Clientid 2503 If the server determines that the client holds no associated state 2504 for its clientid, the server may choose to release the clientid. The 2505 server may make this choice for an inactive client so that resources 2506 are not consumed by those intermittently active clients. If the 2507 client contacts the server after this release, the server must ensure 2508 the client receives the appropriate error so that it will use the 2509 SETCLIENTID/SETCLIENTID_CONFIRM sequence to establish a new identity. 2510 It should be clear that the server must be very hesitant to release a 2511 clientid since the resulting work on the client to recover from such 2512 an event will be the same burden as if the server had failed and 2513 restarted. Typically a server would not release a clientid unless 2514 there had been no activity from that client for many minutes. 2516 8.1.3. nfs_lockowner and stateid Definition 2518 When requesting a lock, the client must present to the server the 2519 clientid and an identifier for the owner of the requested lock. 2520 These two fields are referred to as the nfs_lockowner and the 2521 definition of those fields are: 2523 o A clientid returned by the server as part of the client's use of 2524 the SETCLIENTID operation. 2526 o A variable length opaque array used to uniquely define the owner 2527 of a lock managed by the client. 2529 This may be a thread id, process id, or other unique value. 2531 When the server grants the lock, it responds with a unique 64-bit 2532 stateid. The stateid is used as a shorthand reference to the 2533 nfs_lockowner, since the server will be maintaining the 2534 correspondence between them. 2536 The server is free to form the stateid in any manner that it chooses 2537 as long as it is able to recognize invalid and out-of-date stateids. 2538 This requirement includes those stateids generated by earlier 2539 instances of the server. From this, the client can be properly 2540 notified of a server restart. This notification will occur when the 2541 client presents a stateid to the server from a previous 2542 instantiation. 2544 The server must be able to distinguish the following situations and 2545 return the error as specified: 2547 o The stateid was generated by an earlier server instance (i.e. 2548 before a server reboot). The error NFS4ERR_STALE_STATEID should 2549 be returned. 
2551 o The stateid was generated by the current server instance but the 2552 stateid no longer designates the current locking state for the 2553 lockowner-file pair in question (i.e. one or more locking 2554 operations has occurred). The error NFS4ERR_OLD_STATEID should 2555 be returned. 2557 This error condition will only occur when the client issues a 2558 locking request which changes a stateid while an I/O request 2559 that uses that stateid is outstanding. 2561 o The stateid was generated by the current server instance but the 2562 stateid does not designate a locking state for any active 2563 lockowner-file pair. The error NFS4ERR_BAD_STATEID should be 2564 returned. 2566 This error condition will occur when there has been a logic 2567 error on the part of the client or server. This should not 2568 happen. 2570 One mechanism that may be used to satisfy these requirements is for 2571 the server to divide stateids into three fields: 2573 o A server verifier which uniquely designates a particular server 2574 instantiation. 2576 o An index into a table of locking-state structures. 2578 o A sequence value which is incremented for each stateid that is 2579 associated with the same index into the locking-state table. 2581 By matching the incoming stateid and its field values with the state 2582 held at the server, the server is able to easily determine if a 2583 stateid is valid for its current instantiation and state. If the 2584 stateid is not valid, the appropriate error can be supplied to the 2585 client. 2587 8.1.4. Use of the stateid 2589 All READ and WRITE operations contain a stateid. If the 2590 nfs_lockowner performs a READ or WRITE on a range of bytes within a 2591 locked range, the stateid (previously returned by the server) must be 2592 used to indicate that the appropriate lock (record or share) is held. 2593 If no state is established by the client, either record lock or share 2594 lock, a stateid of all bits 0 is used. If no conflicting locks are 2595 held on the file, the server may service the READ or WRITE operation. 2596 If a conflict with an explicit lock occurs, an error is returned for 2597 the operation (NFS4ERR_LOCKED). This allows "mandatory locking" to be 2598 implemented. 2600 A stateid of all bits 1 (one) allows READ operations to bypass record 2601 locking checks at the server. However, WRITE operations with stateid 2602 with bits all 1 (one) do not bypass record locking checks. File 2603 locking checks are handled by the OPEN operation (see the section 2604 "OPEN/CLOSE Operations"). 2606 An explicit lock may not be granted while a READ or WRITE operation 2607 with conflicting implicit locking is being performed. 2609 8.1.5. Sequencing of Lock Requests 2611 Locking is different than most NFS operations as it requires "at- 2612 most-one" semantics that are not provided by ONCRPC. ONCRPC over a 2613 reliable transport is not sufficient because a sequence of locking 2614 requests may span multiple TCP connections. In the face of 2615 retransmission or reordering, lock or unlock requests must have a 2616 well defined and consistent behavior. To accomplish this, each lock 2617 request contains a sequence number that is a consecutively increasing 2618 integer. Different nfs_lockowners have different sequences. The 2619 server maintains the last sequence number (L) received and the 2620 response that was returned. 
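   The following C fragment is illustrative only and is not part of the
   protocol specification.  It sketches the per-nfs_lockowner sequence
   state a server might keep, applying the acceptance rules described
   in the paragraphs that follow; the type and function names are
   hypothetical.

      /* Sketch only: per-lockowner sequencing state on the server.  */
      #include <stdint.h>
      #include <stdio.h>

      struct lockowner_seq {
              uint32_t last_seqid;      /* L: last sequence number   */
              const void *cached_resp;  /* response kept for replays */
      };

      enum seq_disposition { SEQ_NEW, SEQ_REPLAY, SEQ_BAD };

      /* Classify an incoming request carrying sequence number r.    */
      static enum seq_disposition check_seqid(const struct lockowner_seq *s,
                                              uint32_t r)
      {
              if (r == s->last_seqid)        /* duplicate request    */
                      return SEQ_REPLAY;     /* return cached_resp   */
              if (r == s->last_seqid + 1u)   /* expected next; note  */
                      return SEQ_NEW;        /* arithmetic mod 2^32  */
              return SEQ_BAD;                /* NFS4ERR_BAD_SEQID    */
      }

      int main(void)
      {
              struct lockowner_seq s = { 7, 0 };
              printf("%d %d %d\n", check_seqid(&s, 7),
                     check_seqid(&s, 8), check_seqid(&s, 9));
              return 0;
      }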
2622 Note that, for requests that contain a sequence number, there
2623 should be no more than one outstanding request for each
2624 nfs_lockowner.

2626 If a request with a previous sequence number (r < L) is received, it
2627 is rejected with the return of error NFS4ERR_BAD_SEQID.  Given a
2628 properly-functioning client, the response to (r) must have been
2629 received before the last request (L) was sent.  If a duplicate of the
2630 last request (r == L) is received, the stored response is returned.
2631 If a request beyond the next sequence (r == L + 2) is received, it is
2632 rejected with the return of error NFS4ERR_BAD_SEQID.  Sequence
2633 history is reinitialized whenever the client verifier changes.

2635 Since the sequence number is represented with an unsigned 32-bit
2636 integer, the arithmetic involved with the sequence number is mod
2637 2^32.

2639 It is critical that the server maintain the last response sent to the
2640 client to provide a more reliable cache of duplicate non-idempotent
2641 requests than that of the traditional cache described in [Juszczak].
2642 The traditional duplicate request cache uses a least recently used
2643 algorithm for removing unneeded requests.  However, the last lock
2644 request and response on a given nfs_lockowner must be cached as long
2645 as the lock state exists on the server.

2647 8.1.6.  Recovery from Replayed Requests

2649 As described above, the sequence number is per nfs_lockowner.  As
2650 long as the server maintains the last sequence number received and
2651 follows the methods described above, there are no risks of a
2652 Byzantine router re-sending old requests.  The server need only
2653 maintain the nfs_lockowner and sequence number state as long as there
2654 are open files or closed files with locks outstanding.

2656 LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain a sequence
2657 number and therefore the risk of the replay of these operations
2658 resulting in undesired effects is non-existent while the server
2659 maintains the nfs_lockowner state.

2661 8.1.7.  Releasing nfs_lockowner State

2663 When a particular nfs_lockowner no longer holds open or file locking
2664 state at the server, the server may choose to release the sequence
2665 number state associated with the nfs_lockowner.  The server may make
2666 this choice based on lease expiration, for the reclamation of server
2667 memory, or other implementation specific details.  In any event, the
2668 server is able to do this safely only when the nfs_lockowner is no
2669 longer being utilized by the client.  The server may choose to
2670 hold the nfs_lockowner state in the event that retransmitted requests
2671 are received.  However, the period to hold this state is
2672 implementation specific.

2674 In the case that a LOCK, LOCKU, OPEN_DOWNGRADE, or CLOSE is
2675 retransmitted after the server has previously released the
2676 nfs_lockowner state, the server will find that the nfs_lockowner has
2677 no files open and an error will be returned to the client.  If the
2678 nfs_lockowner does have a file open, the stateid will not match and
2679 again an error is returned to the client.

2681 In the case that an OPEN is retransmitted and the nfs_lockowner is
2682 being used for the first time or the nfs_lockowner state has been
2683 previously released by the server, the use of the OPEN_CONFIRM
2684 operation will prevent incorrect behavior.
When the server observes 2685 the use of the nfs_lockowner for the first time, it will direct the 2686 client to perform the OPEN_CONFIRM for the corresponding OPEN. This 2687 sequence establishes the use of an nfs_lockowner and associated 2688 sequence number. See the section "OPEN_CONFIRM - Confirm Open" for 2689 further details. 2691 8.2. Lock Ranges 2693 The protocol allows a lock owner to request a lock with one byte 2694 range and then either upgrade or unlock a sub-range of the initial 2695 lock. It is expected that this will be an uncommon type of request. 2696 In any case, servers or server file systems may not be able to 2697 support sub-range lock semantics. In the event that a server 2698 receives a locking request that represents a sub-range of current 2699 locking state for the lock owner, the server is allowed to return the 2700 error NFS4ERR_LOCK_RANGE to signify that it does not support sub- 2701 range lock operations. Therefore, the client should be prepared to 2702 receive this error and, if appropriate, report the error to the 2703 requesting application. 2705 The client is discouraged from combining multiple independent locking 2706 ranges that happen to be adjacent into a single request since the 2707 server may not support sub-range requests and for reasons related to 2708 the recovery of file locking state in the event of server failure. 2709 As discussed in the section "Server Failure and Recovery" below, the 2710 server may employ certain optimizations during recovery that work 2711 effectively only when the client's behavior during lock recovery is 2712 similar to the client's locking behavior prior to server failure. 2714 8.3. Blocking Locks 2716 Some clients require the support of blocking locks. The NFS version 2717 4 protocol must not rely on a callback mechanism and therefore is 2718 unable to notify a client when a previously denied lock has been 2719 granted. Clients have no choice but to continually poll for the 2720 lock. This presents a fairness problem. Two new lock types are 2721 added, READW and WRITEW, and are used to indicate to the server that 2722 the client is requesting a blocking lock. The server should maintain 2723 an ordered list of pending blocking locks. When the conflicting lock 2724 is released, the server may wait the lease period for the first 2725 waiting client to re-request the lock. After the lease period 2726 expires the next waiting client request is allowed the lock. Clients 2727 are required to poll at an interval sufficiently small that it is 2728 likely to acquire the lock in a timely manner. The server is not 2729 required to maintain a list of pending blocked locks as it is used to 2730 increase fairness and not correct operation. Because of the 2731 unordered nature of crash recovery, storing of lock state to stable 2732 storage would be required to guarantee ordered granting of blocking 2733 locks. 2735 Servers may also note the lock types and delay returning denial of 2736 the request to allow extra time for a conflicting lock to be 2737 released, allowing a successful return. In this way, clients can 2738 avoid the burden of needlessly frequent polling for blocking locks. 2739 The server should take care in the length of delay in the event the 2740 client retransmits the request. 2742 8.4. Lease Renewal 2744 The purpose of a lease is to allow a server to remove stale locks 2745 that are held by a client that has crashed or is otherwise 2746 unreachable. 
It is not a mechanism for cache consistency and lease 2747 renewals may not be denied if the lease interval has not expired. 2749 The following events cause implicit renewal of all of the leases for 2750 a given client (i.e. all those sharing a given clientid). Each of 2751 these is a positive indication that the client is still active and 2752 that the associated state held at the server, for the client, is 2753 still valid. 2755 o An OPEN with a valid clientid. 2757 o Any operation made with a valid stateid (CLOSE, DELEGRETURN, 2758 LOCK, LOCKU, OPEN, OPEN_CONFIRM, READ, RENEW, SETATTR, WRITE). 2759 This does not include the special stateids of all bits 0 or all 2760 bits 1. 2762 Note that if the client had restarted or rebooted, the 2763 client would not be making these requests without issuing 2764 the SETCLIENTID operation. The use of the SETCLIENTID 2765 operation (possibly with the addition of the optional 2766 SETCLIENTID_CONFIRM operation) notifies the server to drop 2767 the locking state associated with the client. 2769 If the server has rebooted, the stateids 2770 (NFS4ERR_STALE_STATEID error) or the clientid 2771 (NFS4ERR_STALE_CLIENTID error) will not be valid hence 2772 preventing spurious renewals. 2774 This approach allows for low overhead lease renewal which scales 2775 well. In the typical case no extra RPC calls are required for lease 2776 renewal and in the worst case one RPC is required every lease period 2777 (i.e. a RENEW operation). The number of locks held by the client is 2778 not a factor since all state for the client is involved with the 2779 lease renewal action. 2781 Since all operations that create a new lease also renew existing 2782 leases, the server must maintain a common lease expiration time for 2783 all valid leases for a given client. This lease time can then be 2784 easily updated upon implicit lease renewal actions. 2786 8.5. Crash Recovery 2788 The important requirement in crash recovery is that both the client 2789 and the server know when the other has failed. Additionally, it is 2790 required that a client sees a consistent view of data across server 2791 restarts or reboots. All READ and WRITE operations that may have 2792 been queued within the client or network buffers must wait until the 2793 client has successfully recovered the locks protecting the READ and 2794 WRITE operations. 2796 8.5.1. Client Failure and Recovery 2798 In the event that a client fails, the server may recover the client's 2799 locks when the associated leases have expired. Conflicting locks 2800 from another client may only be granted after this lease expiration. 2801 If the client is able to restart or reinitialize within the lease 2802 period the client may be forced to wait the remainder of the lease 2803 period before obtaining new locks. 2805 To minimize client delay upon restart, lock requests are associated 2806 with an instance of the client by a client supplied verifier. This 2807 verifier is part of the initial SETCLIENTID call made by the client. 2808 The server returns a clientid as a result of the SETCLIENTID 2809 operation. The client then confirms the use of the verifier with 2810 SETCLIENTID_CONFIRM. The clientid in combination with an opaque 2811 owner field is then used by the client to identify the lock owner for 2812 OPEN. This chain of associations is then used to identify all locks 2813 for a particular client. 
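   The following C fragment is illustrative only and is not part of the
   protocol specification.  It sketches the chain of associations just
   described (verifier, clientid, lock owner) and the verifier
   comparison discussed in the next paragraph; the structure and
   function names are hypothetical.

      /* Sketch only: server-side record tying a client instance to
       * its locking state.                                          */
      #include <stdint.h>
      #include <stdio.h>
      #include <string.h>

      struct client_record {
              unsigned char verifier[8];  /* from SETCLIENTID; new on
                                           * each client restart     */
              uint64_t      clientid;     /* shorthand from server   */
              /* ... lock owners, opens, and locks hang off here ... */
      };

      /* A verifier that differs from the one recorded for currently
       * held locks signifies a new client instantiation, so state
       * derived from the old clientid may be released (subject to
       * the security consideration noted below).                    */
      static int client_restarted(const struct client_record *rec,
                                  const unsigned char verifier[8])
      {
              return memcmp(rec->verifier, verifier, 8) != 0;
      }

      int main(void)
      {
              struct client_record rec = { {1,2,3,4,5,6,7,8}, 0x1001 };
              unsigned char fresh[8] = {8,7,6,5,4,3,2,1};
              printf("restarted: %d\n", client_restarted(&rec, fresh));
              return 0;
      }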
2815 Since the verifier will be changed by the client upon each 2816 initialization, the server can compare a new verifier to the verifier 2817 associated with currently held locks and determine that they do not 2818 match. This signifies the client's new instantiation and subsequent 2819 loss of locking state. As a result, the server is free to release 2820 all locks held which are associated with the old clientid which was 2821 derived from the old verifier. 2823 For secure environments, a change in the verifier must only cause the 2824 release of locks associated with the authenticated requester. This 2825 is required to prevent a rogue entity from freeing otherwise valid 2826 locks. 2828 Note that the verifier must have the same uniqueness properties of 2829 the verifier for the COMMIT operation. 2831 8.5.2. Server Failure and Recovery 2833 If the server loses locking state (usually as a result of a restart 2834 or reboot), it must allow clients time to discover this fact and re- 2835 establish the lost locking state. The client must be able to re- 2836 establish the locking state without having the server deny valid 2837 requests because the server has granted conflicting access to another 2838 client. Likewise, if there is the possibility that clients have not 2839 yet re-established their locking state for a file, the server must 2840 disallow READ and WRITE operations for that file. The duration of 2841 this recovery period is equal to the duration of the lease period. 2843 A client can determine that server failure (and thus loss of locking 2844 state) has occurred, when it receives one of two errors. The 2845 NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a 2846 reboot or restart. The NFS4ERR_STALE_CLIENTID error indicates a 2847 clientid invalidated by reboot or restart. When either of these are 2848 received, the client must establish a new clientid (See the section 2849 "Client ID") and re-establish the locking state as discussed below. 2851 The period of special handling of locking and READs and WRITEs, equal 2852 in duration to the lease period, is referred to as the "grace 2853 period". During the grace period, clients recover locks and the 2854 associated state by reclaim-type locking requests (i.e. LOCK requests 2855 with reclaim set to true and OPEN operations with a claim type of 2856 CLAIM_PREVIOUS). During the grace period, the server must reject 2857 READ and WRITE operations and non-reclaim locking requests (i.e. 2858 other LOCK and OPEN operations) with an error of NFS4ERR_GRACE. 2860 If the server can reliably determine that granting a non-reclaim 2861 request will not conflict with reclamation of locks by other clients, 2862 the NFS4ERR_GRACE error does not have to be returned and the non- 2863 reclaim client request can be serviced. For the server to be able to 2864 service READ and WRITE operations during the grace period, it must 2865 again be able to guarantee that no possible conflict could arise 2866 between an impending reclaim locking request and the READ or WRITE 2867 operation. If the server is unable to offer that guarantee, the 2868 NFS4ERR_GRACE error must be returned to the client. 2870 For a server to provide simple, valid handling during the grace 2871 period, the easiest method is to simply reject all non-reclaim 2872 locking requests and READ and WRITE operations by returning the 2873 NFS4ERR_GRACE error. However, a server may keep information about 2874 granted locks in stable storage. 
With this information, the server
2875 could determine if a regular lock or READ or WRITE operation can be
2876 safely processed.

2878 For example, if a count of locks on a given file is available in
2879 stable storage, the server can track reclaimed locks for the file and
2880 when all reclaims have been processed, non-reclaim locking requests
2881 may be processed.  This way the server can ensure that non-reclaim
2882 locking requests will not conflict with potential reclaim requests.
2883 With respect to I/O requests, if the server is able to determine that
2884 there are no outstanding reclaim requests for a file by information
2885 from stable storage or another similar mechanism, the processing of
2886 I/O requests could proceed normally for the file.

2888 To reiterate, for a server that allows non-reclaim lock and I/O
2889 requests to be processed during the grace period, it MUST determine
2890 that no lock subsequently reclaimed will be rejected and that no lock
2891 subsequently reclaimed would have prevented any I/O operation
2892 processed during the grace period.

2894 Clients should be prepared for the return of NFS4ERR_GRACE errors for
2895 non-reclaim lock and I/O requests.  In this case the client should
2896 employ a retry mechanism for the request.  A delay (on the order of
2897 several seconds) between retries should be used to avoid overwhelming
2898 the server.  Further discussion of the general issue is included in
2899 [Floyd].  The client must account for the server that is able to
2900 perform I/O and non-reclaim locking requests within the grace period
2901 as well as those that cannot do so.

2903 A reclaim-type locking request outside the server's grace period can
2904 only succeed if the server can guarantee that no conflicting lock or
2905 I/O request has been granted since reboot or restart.

2907 8.5.3.  Network Partitions and Recovery

2909 If the duration of a network partition is greater than the lease
2910 period provided by the server, the server will not have received a
2911 lease renewal from the client.  If this occurs, the server may free
2912 all locks held for the client.  As a result, all stateids held by the
2913 client will become invalid or stale.  Once the client is able to
2914 reach the server after such a network partition, all I/O submitted by
2915 the client with the now invalid stateids will fail with the server
2916 returning the error NFS4ERR_EXPIRED.  Once this error is received,
2917 the client will suitably notify the application that held the lock.

2919 As a courtesy to the client or as an optimization, the server may
2920 continue to hold locks on behalf of a client for which recent
2921 communication has extended beyond the lease period.  If the server
2922 receives a lock or I/O request that conflicts with one of these
2923 courtesy locks, the server must free the courtesy lock and grant the
2924 new request.

2926 If the server continues to hold locks beyond the expiration of a
2927 client's lease, the server MUST employ a method of recording this
2928 fact in its stable storage.  Conflicting lock requests from another
2929 client may be serviced after the lease expiration.  There are various
2930 scenarios involving server failure after such an event that require
2931 the storage of these lease expirations or network partitions.  One
2932 scenario is as follows:

2934 A client holds a lock at the server and encounters a
2935 network partition and is unable to renew the associated
2936 lease.
A second client obtains a conflicting lock and then
2937 frees the lock.  After the unlock request by the second
2938 client, the server reboots or reinitializes.  Once the
2939 server recovers, the network partition heals and the
2940 original client attempts to reclaim the original lock.

2942 In this scenario and without any state information, the server will
2943 allow the reclaim and the client will be in an inconsistent state
2944 because the server or the client has no knowledge of the conflicting
2945 lock.

2947 The server may choose to store this lease expiration or network
2948 partitioning state in a way that will only identify the client as a
2949 whole.  Note that this may potentially lead to lock reclaims being
2950 denied unnecessarily because of a mix of conflicting and non-
2951 conflicting locks.  The server may also choose to store information
2952 about each lock that has an expired lease with an associated
2953 conflicting lock.  The choice of the amount and type of state
2954 information that is stored is left to the implementor.  In any case,
2955 the server must have enough state information to enable correct
2956 recovery from multiple partitions and multiple server failures.

2958 8.6.  Recovery from a Lock Request Timeout or Abort

2960 In the event a lock request times out, a client may decide to not
2961 retry the request.  The client may also abort the request when the
2962 process for which it was issued is terminated (e.g. in UNIX due to a
2963 signal).  It is possible though that the server received the request
2964 and acted upon it.  This would change the state on the server without
2965 the client being aware of the change.  It is paramount that the
2966 client re-synchronize state with the server before it attempts any
2967 other operation that takes a seqid and/or a stateid with the same
2968 nfs_lockowner.  This is straightforward to do without a special re-
2969 synchronize operation.

2971 Since the server maintains, for each nfs_lockowner, the last lock
2972 request and response received, the client should cache the last lock
2973 request it sent for which it did not receive a response.  From this,
2974 the next time the client does a lock operation for the nfs_lockowner,
2975 it can send the cached request, if there is one, and if the request
2976 was one that established state (e.g. a LOCK or OPEN operation) the
2977 client can follow up with a request to remove the state (e.g. a LOCKU
2978 or CLOSE operation).  With this approach, the sequencing and stateid
2979 information on the client and server for the given nfs_lockowner
2980 will re-synchronize and in turn the lock state will
2981 re-synchronize.

2983 8.7.  Server Revocation of Locks

2985 At any point, the server can revoke locks held by a client and the
2986 client must be prepared for this event.  When the client detects that
2987 its locks have been or may have been revoked, the client is
2988 responsible for validating the state information between itself and
2989 the server.  Validating locking state for the client means that it
2990 must verify or reclaim state for each lock currently held.

2992 The first instance of lock revocation is upon server reboot or re-
2993 initialization.  In this instance the client will receive an error
2994 (NFS4ERR_STALE_STATEID or NFS4ERR_STALE_CLIENTID) and the client will
2995 proceed with normal crash recovery as described in the previous
2996 section.

2998 The second lock revocation event is the inability to renew the lease
2999 period.
While this is considered a rare or unusual event, the client 3000 must be prepared to recover. Both the server and client will be able 3001 to detect the failure to renew the lease and are capable of 3002 recovering without data corruption. For the server, it tracks the 3003 last renewal event serviced for the client and knows when the lease 3004 will expire. Similarly, the client must track operations which will 3005 renew the lease period. Using the time that each such request was 3006 sent and the time that the corresponding reply was received, the 3007 client should bound the time that the corresponding renewal could 3008 have occurred on the server and thus determine if it is possible that 3009 a lease period expiration could have occurred. 3011 The third lock revocation event can occur as a result of 3012 administrative intervention within the lease period. While this is 3013 considered a rare event, it is possible that the server's 3014 administrator has decided to release or revoke a particular lock held 3015 by the client. As a result of revocation, the client will receive an 3016 error of NFS4ERR_EXPIRED and the error is received within the lease 3017 period for the lock. In this instance the client may assume that 3018 only the nfs_lockowner's locks have been lost. The client notifies 3019 the lock holder appropriately. The client may not assume the lease 3020 period has been renewed as a result of failed operation. 3022 When the client determines the lease period may have expired, the 3023 client must mark all locks held for the associated lease as 3024 "unvalidated". This means the client has been unable to re-establish 3025 or confirm the appropriate lock state with the server. As described 3026 in the previous section on crash recovery, there are scenarios in 3027 which the server may grant conflicting locks after the lease period 3028 has expired for a client. When it is possible that the lease period 3029 has expired, the client must validate each lock currently held to 3030 ensure that a conflicting lock has not been granted. The client may 3031 accomplish this task by issuing an I/O request, either a pending I/O 3032 or a zero-length read, specifying the stateid associated with the 3033 lock in question. If the response to the request is success, the 3034 client has validated all of the locks governed by that stateid and 3035 re-established the appropriate state between itself and the server. 3036 If the I/O request is not successful, then one or more of the locks 3037 associated with the stateid was revoked by the server and the client 3038 must notify the owner. 3040 8.8. Share Reservations 3042 A share reservation is a mechanism to control access to a file. It 3043 is a separate and independent mechanism from record locking. When a 3044 client opens a file, it issues an OPEN operation to the server 3045 specifying the type of access required (READ, WRITE, or BOTH) and the 3046 type of access to deny others (deny NONE, READ, WRITE, or BOTH). If 3047 the OPEN fails the client will fail the application's open request. 
3049 Pseudo-code definition of the semantics:

3051     if ((request.access & file_state.deny) ||
3052         (request.deny & file_state.access))
3053             return (NFS4ERR_DENIED)

3055 The constants used for the OPEN and OPEN_DOWNGRADE operations for the
3056 access and deny fields are as follows:

3058 const OPEN4_SHARE_ACCESS_READ = 0x00000001;
3059 const OPEN4_SHARE_ACCESS_WRITE = 0x00000002;
3060 const OPEN4_SHARE_ACCESS_BOTH = 0x00000003;

3062 const OPEN4_SHARE_DENY_NONE = 0x00000000;
3063 const OPEN4_SHARE_DENY_READ = 0x00000001;
3064 const OPEN4_SHARE_DENY_WRITE = 0x00000002;
3065 const OPEN4_SHARE_DENY_BOTH = 0x00000003;

3067 8.9.  OPEN/CLOSE Operations

3069 To provide correct share semantics, a client MUST use the OPEN
3070 operation to obtain the initial filehandle and indicate the desired
3071 access and what, if any, access to deny.  Even if the client intends
3072 to use a stateid of all 0's or all 1's, it must still obtain the
3073 filehandle for the regular file with the OPEN operation so the
3074 appropriate share semantics can be applied.  For clients that do not
3075 have a deny mode built into their open programming interfaces, deny
3076 equal to NONE should be used.

3078 The OPEN operation with the CREATE flag also subsumes the CREATE
3079 operation for regular files as used in previous versions of the NFS
3080 protocol.  This allows a create with a share to be done atomically.

3082 The CLOSE operation removes all share locks held by the nfs_lockowner
3083 on that file.  If record locks are held, the client SHOULD release
3084 all locks before issuing a CLOSE.  The server MAY free all
3085 outstanding locks on CLOSE but some servers may not support the CLOSE
3086 of a file that still has record locks held.  The server MUST return
3087 failure if any locks would exist after the CLOSE.

3089 The LOOKUP operation will return a filehandle without establishing
3090 any lock state on the server.  Without a valid stateid, the server
3091 will assume the client has the least access.  For example, a file
3092 opened with deny READ/WRITE cannot be accessed using a filehandle
3093 obtained through LOOKUP because it would not have a valid stateid
3094 (i.e. using a stateid of all bits 0 or all bits 1).

3096 8.10.  Open Upgrade and Downgrade

3098 When an OPEN is done for a file and the lockowner for which the open
3099 is being done already has the file open, the result is to upgrade the
3100 open file status maintained on the server to include the access and
3101 deny bits specified by the new OPEN as well as those for the existing
3102 OPEN.  The result is that there is one open file, as far as the
3103 protocol is concerned, and it includes the union of the access and
3104 deny bits for all of the OPEN requests completed.  Only a single
3105 CLOSE will be done to reset the effects of both OPEN's.  Note that
3106 the client, when issuing the OPEN, may not know that the same file is
3107 in fact being opened.  The above only applies if both OPEN's result
3108 in the OPEN'ed object being designated by the same filehandle.

3110 When the server chooses to export multiple filehandles corresponding
3111 to the same file object and returns different filehandles on two
3112 different OPEN's of the same file object, the server MUST NOT "OR"
3113 together the access and deny bits and coalesce the two open files.
3114 Instead the server must maintain separate OPEN's with separate
3115 stateid's and will require separate CLOSE's to free them.
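   The following C fragment is illustrative only and is not part of the
   protocol specification.  It applies the conflict test from the
   pseudo-code above, using the OPEN4_SHARE_* constants, and shows the
   effect of the open upgrade just described; the structure and
   function names are hypothetical.

      /* Sketch only: share reservation checking and open upgrade.   */
      #include <stdio.h>

      #define OPEN4_SHARE_ACCESS_READ  0x00000001
      #define OPEN4_SHARE_ACCESS_WRITE 0x00000002
      #define OPEN4_SHARE_DENY_NONE    0x00000000
      #define OPEN4_SHARE_DENY_WRITE   0x00000002

      struct open_state { unsigned access, deny; };

      /* Conflict test for an OPEN by a different lockowner.         */
      static int share_conflict(const struct open_state *f,
                                unsigned req_access, unsigned req_deny)
      {
              return (req_access & f->deny) || (req_deny & f->access);
      }

      /* A second OPEN by the same lockowner on the same filehandle
       * upgrades the open to the union of the access and deny bits. */
      static void open_upgrade(struct open_state *f,
                               unsigned req_access, unsigned req_deny)
      {
              f->access |= req_access;
              f->deny   |= req_deny;
      }

      int main(void)
      {
              struct open_state f = { OPEN4_SHARE_ACCESS_READ,
                                      OPEN4_SHARE_DENY_WRITE };

              /* Another client asking for WRITE access is denied.   */
              printf("conflict: %d\n",
                     share_conflict(&f, OPEN4_SHARE_ACCESS_WRITE,
                                    OPEN4_SHARE_DENY_NONE));

              /* The owning lockowner re-OPENs for WRITE; the server
               * now records access READ|WRITE, deny WRITE.          */
              open_upgrade(&f, OPEN4_SHARE_ACCESS_WRITE,
                           OPEN4_SHARE_DENY_NONE);
              printf("access=0x%x deny=0x%x\n", f.access, f.deny);
              return 0;
      }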
3117 When multiple open files on the client are merged into a single open
3118 file object on the server, the close of one of the open files (on the
3119 client) may necessitate change of the access and deny status of the
3120 open file on the server.  This is because the union of the access and
3121 deny bits for the remaining open's may be smaller (i.e. a proper
3122 subset) than previously.  The OPEN_DOWNGRADE operation is used to
3123 make the necessary change and the client should use it to update the
3124 server so that share reservation requests by other clients are
3125 handled properly.

3127 8.11.  Short and Long Leases

3129 When determining the time period for the server lease, the usual
3130 lease tradeoffs apply.  Short leases are good for fast server
3131 recovery at a cost of increased RENEW or READ (with zero length)
3132 requests.  Longer leases are certainly kinder and gentler to large
3133 internet servers trying to handle a very large number of clients.
3134 The number of RENEW requests drops in proportion to the lease time.
3135 The disadvantages of long leases are slower recovery after server
3136 failure (the server must wait for leases to expire and the grace
3137 period to pass before granting new lock requests) and increased file
3138 contention (if a client fails to transmit an unlock request then the
3139 server must wait for lease expiration before granting new locks).

3141 Long leases are usable if the server is able to store lease state in
3142 non-volatile memory.  Upon recovery, the server can reconstruct the
3143 lease state from its non-volatile memory and continue operation with
3144 its clients and therefore long leases are not an issue.

3146 8.12.  Clocks and Calculating Lease Expiration

3148 To avoid the need for synchronized clocks, lease times are granted by
3149 the server as a time delta.  However, there is a requirement that the
3150 client and server clocks do not drift excessively over the duration
3151 of the lock.  There is also the issue of propagation delay across the
3152 network which could easily be several hundred milliseconds as well as
3153 the possibility that requests will be lost and need to be
3154 retransmitted.

3156 To take propagation delay into account, the client should subtract it
3157 from lease times (e.g. if the client estimates the one-way
3158 propagation delay as 200 msec, then it can assume that the lease is
3159 already 200 msec old when it gets it).  In addition, it will take
3160 another 200 msec to get a response back to the server.  So the client
3161 must send a lock renewal or write data back to the server 400 msec
3162 before the lease would expire.

3164 8.13.  Migration, Replication and State

3166 When responsibility for handling a given file system is transferred
3167 to a new server (migration) or the client chooses to use an alternate
3168 server (e.g. in response to server unresponsiveness) in the context
3169 of file system replication, the appropriate handling of state shared
3170 between the client and server (i.e. locks, leases, stateid's, and
3171 clientid's) is as described below.  The handling differs between
3172 migration and replication.  For related discussion of file server
3173 state and recovery of such, see the sections under "File Locking and
3174 Share Reservations".

3176 8.13.1.  Migration and State

3178 In the case of migration, the servers involved in the migration of a
3179 file system should transfer all server state from the original to the
3180 new server.  This must be done in a way that is transparent to the
3181 client.
This state transfer will ease the client's transition when a
3182 file system migration occurs.  If the servers are successful in
3183 transferring all state, the client will continue to use stateid's
3184 assigned by the original server.  Therefore the new server must
3185 recognize these stateid's as valid.  This holds true for the
3186 clientid as well.  Since responsibility for an entire file system is
3187 transferred with a migration event, there is no possibility that
3188 conflicts will arise on the new server as a result of the transfer of
3189 locks.

3191 As part of the transfer of information between servers, leases would
3192 be transferred as well.  The leases being transferred to the new
3193 server will typically have a different expiration time from those for
3194 the same client, previously on the new server.  To maintain the
3195 property that all leases on a given server for a given client expire
3196 at the same time, the server should advance the expiration time to
3197 the later of the leases being transferred or the leases already
3198 present.  This allows the client to maintain lease renewal of both
3199 classes without special effort.

3201 The servers may choose not to transfer the state information upon
3202 migration.  However, this choice is discouraged.  In this case, when
3203 the client presents state information from the original server, the
3204 client must be prepared to receive either NFS4ERR_STALE_CLIENTID or
3205 NFS4ERR_STALE_STATEID from the new server.  The client should then
3206 recover its state information as it normally would in response to a
3207 server failure.  The new server must take care to allow for the
3208 recovery of state information as it would in the event of server
3209 restart.

3211 8.13.2.  Replication and State

3213 Since client switch-over in the case of replication is not under
3214 server control, the handling of state is different.  In this case,
3215 leases, stateid's and clientid's do not have validity across a
3216 transition from one server to another.  The client must re-establish
3217 its locks on the new server.  This can be compared to the re-
3218 establishment of locks by means of reclaim-type requests after a
3219 server reboot.  The difference is that the server has no provision to
3220 distinguish requests reclaiming locks from those obtaining new locks
3221 or to defer the latter.  Thus, a client re-establishing a lock on the
3222 new server (by means of a LOCK or OPEN request) may have the
3223 request denied due to a conflicting lock.  Since replication is
3224 intended for read-only use of filesystems, such denial of locks
3225 should not pose large difficulties in practice.  When an attempt to
3226 re-establish a lock on a new server is denied, the client should
3227 treat the situation as if its original lock had been revoked.

3229 8.13.3.  Notification of Migrated Lease

3231 In the case of lease renewal, the client may not be submitting
3232 requests for a file system that has been migrated to another server.
3233 This can occur because of the implicit lease renewal mechanism.  The
3234 client renews leases for all file systems when submitting a request
3235 to any one file system at the server.

3237 In order for the client to schedule renewal of leases that may have
3238 been relocated to the new server, the client must find out about
3239 lease relocation before those leases expire.  To accomplish this, all
3240 operations which implicitly renew leases for a client (i.e.
OPEN,
3241 CLOSE, READ, WRITE, RENEW, LOCK, LOCKT, LOCKU) will return the error
3242 NFS4ERR_LEASE_MOVED if responsibility for any of the leases to be
3243 renewed has been transferred to a new server.  This condition will
3244 continue until the client receives an NFS4ERR_MOVED error and the
3245 server receives the subsequent GETATTR(fs_locations) for an access to
3246 each file system for which a lease has been moved to a new server.

3248 When a client receives an NFS4ERR_LEASE_MOVED error, it should
3249 perform some operation, such as a RENEW, on each file system
3250 associated with the server in question.  When the client receives an
3251 NFS4ERR_MOVED error, the client can follow the normal process to
3252 obtain the new server information (through the fs_locations
3253 attribute) and perform renewal of those leases on the new server.  If
3254 the server has not had state transferred to it transparently, the
3255 client will receive either NFS4ERR_STALE_CLIENTID or
3256 NFS4ERR_STALE_STATEID from the new server, as described above, and
3257 can then recover state information as it does in the event of server
3258 failure.

3259 9.  Client-Side Caching

3261 Client-side caching of data, of file attributes, and of file names is
3262 essential to providing good performance with the NFS protocol.
3263 Providing distributed cache coherence is a difficult problem and
3264 previous versions of the NFS protocol have not attempted it.
3265 Instead, several NFS client implementation techniques have been used
3266 to reduce the problems that a lack of coherence poses for users.
3267 These techniques have not been clearly defined by earlier protocol
3268 specifications and it is often unclear what is valid or invalid
3269 client behavior.

3271 The NFS version 4 protocol uses many techniques similar to those that
3272 have been used in previous protocol versions.  The NFS version 4
3273 protocol does not provide distributed cache coherence.  However, it
3274 defines a more limited set of caching guarantees to allow locks and
3275 share reservations to be used without destructive interference from
3276 client side caching.

3278 In addition, the NFS version 4 protocol introduces a delegation
3279 mechanism which allows many decisions normally made by the server to
3280 be made locally by clients.  This mechanism provides efficient
3281 support of the common cases where sharing is infrequent or where
3282 sharing is read-only.

3284 9.1.  Performance Challenges for Client-Side Caching

3286 Caching techniques used in previous versions of the NFS protocol have
3287 been successful in providing good performance.  However, several
3288 scalability challenges can arise when those techniques are used with
3289 very large numbers of clients.  This is particularly true when
3290 clients are geographically distributed which classically increases
3291 the latency for cache revalidation requests.

3293 The previous versions of the NFS protocol repeat their file data
3294 cache validation requests at the time the file is opened.  This
3295 behavior can have serious performance drawbacks.  A common case is
3296 one in which a file is only accessed by a single client.  Therefore,
3297 sharing is infrequent.

3299 In this case, repeated reference to the server to find that no
3300 conflicts exist is expensive.  A better option with regard to
3301 performance is to allow a client that repeatedly opens a file to do
3302 so without reference to the server.
This is done until potentially 3303 conflicting operations from another client actually occur. 3305 A similar situation arises in connection with file locking. Sending 3306 file lock and unlock requests to the server as well as the read and 3307 write requests necessary to make data caching consistent with the 3308 locking semantics (see the section "Data Caching and File Locking") 3309 can severely limit performance. When locking is used to provide 3310 protection against infrequent conflicts, a large penalty is incurred. 3311 This penalty may discourage the use of file locking by applications. 3313 The NFS version 4 protocol provides more aggressive caching 3314 strategies with the following design goals: 3316 o Compatibility with a large range of server semantics. 3318 o Provide the same caching benefits as previous versions of the 3319 NFS protocol when unable to provide the more aggressive model. 3321 o Requirements for aggressive caching are organized so that a 3322 large portion of the benefit can be obtained even when not all 3323 of the requirements can be met. 3325 The appropriate requirements for the server are discussed in later 3326 sections in which specific forms of caching are covered. (see the 3327 section "Open Delegation"). 3329 9.2. Delegation and Callbacks 3331 Recallable delegation of server responsibilities for a file to a 3332 client improves performance by avoiding repeated requests to the 3333 server in the absence of inter-client conflict. With the use of a 3334 "callback" RPC from server to client, a server recalls delegated 3335 responsibilities when another client engages in sharing of a 3336 delegated file. 3338 A delegation is passed from the server to the client, specifying the 3339 object of the delegation and the type of delegation. There are 3340 different types of delegations but each type contains a stateid to be 3341 used to represent the delegation when performing operations that 3342 depend on the delegation. This stateid is similar to those 3343 associated with locks and share reservations but differs in that the 3344 stateid for a delegation is associated with a clientid and may be 3345 used on behalf of all the nfs_lockowners for the given client. A 3346 delegation is made to the client as a whole and not to any specific 3347 process or thread of control within it. 3349 Because callback RPCs may not work in all environments (due to 3350 firewalls, for example), correct protocol operation does not depend 3351 on them. Preliminary testing of callback functionality by means of a 3352 CB_NULL procedure determines whether callbacks can be supported. The 3353 CB_NULL procedure checks the continuity of the callback path. A 3354 server makes a preliminary assessment of callback availability to a 3355 given client and avoids delegating responsibilities until it has 3356 determined that callbacks are supported. Because the granting of a 3357 delegation is always conditional upon the absence of conflicting 3358 access, clients must not assume that a delegation will be granted and 3359 they must always be prepared for OPENs to be processed without any 3360 delegations being granted. 3362 Once granted, a delegation behaves in most ways like a lock. There 3363 is an associated lease that is subject to renewal together with all 3364 of the other leases held by that client. 3366 Unlike locks, an operation by a second client to a delegated file 3367 will cause the server to recall a delegation through a callback. 
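   The following C fragment is illustrative only and is not part of the
   protocol specification.  It sketches the kind of check noted above,
   whereby a server avoids granting a delegation unless the callback
   path has been verified (for example with CB_NULL) and no conflicting
   access exists; all structure and function names are hypothetical.

      /* Sketch only: deciding whether an OPEN may carry a delegation. */
      #include <stdio.h>

      struct client_info {
              int callbacks_verified;       /* CB_NULL probe succeeded */
      };

      struct file_info {
              int opens_by_other_clients;   /* potential conflicts     */
      };

      static int may_delegate(const struct client_info *c,
                              const struct file_info *f)
      {
              /* No delegation unless the callback path is known to
               * work and no conflicting access exists; the OPEN is
               * still processed either way.                          */
              return c->callbacks_verified && !f->opens_by_other_clients;
      }

      int main(void)
      {
              struct client_info c = { 1 };
              struct file_info   f = { 0 };
              printf("delegate: %d\n", may_delegate(&c, &f));
              return 0;
      }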
3369 On recall, the client holding the delegation must flush modified 3370 state (such as modified data) to the server and return the 3371 delegation. The conflicting request will not receive a response 3372 until the recall is complete. The recall is considered complete when 3373 the client returns the delegation or the server times out on the 3374 recall and revokes the delegation as a result of the timeout. 3375 Following the resolution of the recall, the server has the 3376 information necessary to grant or deny the second client's request. 3378 At the time the client receives a delegation recall, it may have 3379 substantial state that needs to be flushed to the server. Therefore, 3380 the server should allow sufficient time for the delegation to be 3381 returned since it may involve numerous RPCs to the server. If the 3382 server is able to determine that the client is diligently flushing 3383 state to the server as a result of the recall, the server may extend 3384 the usual time allowed for a recall. However, the time allowed for 3385 recall completion should not be unbounded. 3387 An example of this is when responsibility to mediate opens on a given 3388 file is delegated to a client (see the section "Open Delegation"). 3389 The server will not know what opens are in effect on the client. 3390 Without this knowledge the server will be unable to determine if the 3391 access and deny state for the file allows any particular open until 3392 the delegation for the file has been returned. 3394 A client failure or a network partition can result in failure to 3395 respond to a recall callback. In this case, the server will revoke 3396 the delegation which in turn will render useless any modified state 3397 still on the client. 3399 9.2.1. Delegation Recovery 3401 There are three situations that delegation recovery must deal with: 3403 o Client reboot or restart 3405 o Server reboot or restart 3407 o Network partition (full or callback-only) 3409 In the event the client reboots or restarts, the failure to renew 3410 leases will result in the revocation of record locks and share 3411 reservations. Delegations, however, may be treated a bit 3412 differently. 3414 There will be situations in which delegations will need to be 3415 reestablished after a client reboots or restarts. The reason for 3416 this is the client may have file data stored locally and this data 3417 was associated with the previously held delegations. The client will 3418 need to reestablish the appropriate file state on the server. 3420 To allow for this type of client recovery, the server may extend the 3421 period for delegation recovery beyond the typical lease expiration 3422 period. This implies that requests from other clients that conflict 3423 with these delegations will need to wait. Because the normal recall 3424 process may require significant time for the client to flush changed 3425 state to the server, other clients need be prepared for delays that 3426 occur because of a conflicting delegation. This longer interval 3427 would increase the window for clients to reboot and consult stable 3428 storage so that the delegations can be reclaimed. For open 3429 delegations, such delegations are reclaimed using OPEN with a claim 3430 type of CLAIM_DELEGATE_PREV. (see the sections on "Data Caching and 3431 Revocation" and "Operation 18: OPEN" for discussion of open 3432 delegation and the details of OPEN respectively). 
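   The following C fragment is illustrative only and is not part of the
   protocol specification.  It sketches a rebooted client walking a
   hypothetical stable-storage record of previously held delegations
   and reclaiming each one with an OPEN whose claim type is
   CLAIM_DELEGATE_PREV, as described above; the storage format and
   helper names are invented for the example.

      /* Sketch only: reclaiming delegations after a client reboot.  */
      #include <stdio.h>

      struct deleg_record {
              const char *path;          /* file covered by the       */
              int         write_deleg;   /* delegation before reboot  */
      };

      /* Stand-in for reading the client's stable-storage journal.   */
      static int read_deleg_records(struct deleg_record *out, int max)
      {
              static const struct deleg_record saved[] = {
                      { "/export/a", 0 },
                      { "/export/b", 1 },
              };
              int i, n = (int)(sizeof(saved) / sizeof(saved[0]));
              for (i = 0; i < n && i < max; i++)
                      out[i] = saved[i];
              return i;
      }

      static void reclaim_delegation(const struct deleg_record *r)
      {
              /* Issue OPEN with claim type CLAIM_DELEGATE_PREV for
               * r->path; on success, associate locally cached data
               * with the newly returned delegation stateid.          */
              printf("reclaim %s (%s delegation)\n", r->path,
                     r->write_deleg ? "write" : "read");
      }

      int main(void)
      {
              struct deleg_record recs[8];
              int n = read_deleg_records(recs, 8);
              while (n-- > 0)
                      reclaim_delegation(&recs[n]);
              return 0;
      }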
3434 When the server reboots or restarts, delegations are reclaimed (using 3435 the OPEN operation with CLAIM_DELEGATE_PREV) in a similar fashion to 3436 record locks and share reservations. However, there is a slight 3437 semantic difference. In the normal case if the server decides that a 3438 delegation should not be granted, it performs the requested action 3439 (e.g. OPEN) without granting any delegation. For reclaim, the server 3440 grants the delegation but a special designation is applied so that 3441 the client treats the delegation as having been granted but recalled 3442 by the server. Because of this, the client has the duty to write all 3443 modified state to the server and then return the delegation. This 3444 process of handling delegation reclaim reconciles three principles of 3445 the NFS Version 4 protocol: 3447 o Upon reclaim, a client reporting resources assigned to it by an 3448 earlier server instance must be granted those resources. 3450 o The server has unquestionable authority to determine whether 3451 delegations are to be granted and, once granted, whether they 3452 are to be continued. 3454 o The use of callbacks is not to be depended upon until the client 3455 has proven its ability to receive them. 3457 When a network partition occurs, delegations are subject to freeing 3458 by the server when the lease renewal period expires. This is similar 3459 to the behavior for locks and share reservations. For delegations, 3460 however, the server may extend the period in which conflicting 3461 requests are held off. Eventually the occurrence of a conflicting 3462 request from another client will cause revocation of the delegation. 3463 A loss of the callback path (e.g. by later network configuration 3464 change) will have the same effect. A recall request will fail and 3465 revocation of the delegation will result. 3467 A client normally finds out about revocation of a delegation when it 3468 uses a stateid associated with a delegation and receives the error 3469 NFS4ERR_EXPIRED. It also may find out about delegation revocation 3470 after a client reboot when it attempts to reclaim a delegation and 3471 receives that same error. Note that in the case of a revoked write 3472 open delegation, there are issues because data may have been modified 3473 by the client whose delegation is revoked and separately by other 3474 clients. See the section "Revocation Recovery for Write Open 3475 Delegation" for a discussion of such issues. Note also that when 3476 delegations are revoked, information about the revoked delegation 3477 will be written by the server to stable storage (as described in the 3478 section "Crash Recovery"). This is done to deal with the case in 3479 which a server reboots after revoking a delegation but before the 3480 client holding the revoked delegation is notified about the 3481 revocation. 3483 9.3. Data Caching 3485 When applications share access to a set of files, they need to be 3486 implemented so as to take account of the possibility of conflicting 3487 access by another application. This is true whether the applications 3488 in question execute on different clients or reside on the same 3489 client. 3491 Share reservations and record locks are the facilities the NFS 3492 version 4 protocol provides to allow applications to coordinate 3493 access by providing mutual exclusion facilities. 
The NFS version 4
3494 protocol's data caching must be implemented such that it does not
3495 invalidate the assumptions that those using these facilities depend
3496 upon.

3498 9.3.1.  Data Caching and OPENs

3500 In order to avoid invalidating the sharing assumptions that
3501 applications rely on, NFS version 4 clients should not provide cached
3502 data to applications or modify it on behalf of an application when it
3503 would not be valid to obtain or modify that same data via a READ or
3504 WRITE operation.

3506 Furthermore, in the absence of open delegation (see the section "Open
3507 Delegation") two additional rules apply.  Note that these rules are
3508 obeyed in practice by many NFS version 2 and version 3 clients.

3510 o First, cached data present on a client must be revalidated after
3511 doing an OPEN.  This is to ensure that the data for the OPENed
3512 file is still correctly reflected in the client's cache.  This
3513 validation must be done at least when the client's OPEN
3514 operation includes DENY=WRITE or BOTH thus terminating a period
3515 in which other clients may have had the opportunity to open the
3516 file with WRITE access.  Clients may choose to do the
3517 revalidation more often (i.e. at OPENs specifying DENY=NONE) to
3518 parallel the NFS version 3 protocol's practice for the benefit
3519 of users assuming this degree of cache revalidation.

3521 o Second, modified data must be flushed to the server before
3522 closing a file OPENed for write.  This is complementary to the
3523 first rule.  If the data is not flushed at CLOSE, the
3524 revalidation done after the client OPENs a file is unable to
3525 achieve its purpose.  The other aspect to flushing the data
3526 before close is that the data must be committed to stable
3527 storage, at the server, before the CLOSE operation is requested
3528 by the client.  In the case of a server reboot or restart and a
3529 CLOSEd file, it may not be possible to retransmit the data to be
3530 written to the file.  Hence, this requirement.

3532 9.3.2.  Data Caching and File Locking

3534 For those applications that choose to use file locking instead of
3535 share reservations to exclude inconsistent file access, there is an
3536 analogous set of constraints that apply to client side data caching.
3537 These rules are effective only if the file locking is used in a way
3538 that matches in an equivalent way the actual READ and WRITE
3539 operations executed.  This is as opposed to file locking that is
3540 based on pure convention.  For example, it is possible to manipulate
3541 a two-megabyte file by dividing the file into two one-megabyte
3542 regions and protecting access to the two regions by file locks on
3543 bytes zero and one.  A lock for write on byte zero of the file would
3544 represent the right to do READ and WRITE operations on the first
3545 region.  A lock for write on byte one of the file would represent the
3546 right to do READ and WRITE operations on the second region.  As long
3547 as all applications manipulating the file obey this convention, they
3548 will work on a local file system.  However, they may not work with
3549 the NFS version 4 protocol unless clients refrain from data caching.

3551 The rules for data caching in the file locking environment are:

3553 o First, when a client obtains a file lock for a particular
3554 region, the data cache corresponding to that region (if any
3555 cache data exists) must be revalidated.
If the change attribute 3556 indicates that the file may have been updated since the cached 3557 data was obtained, the client must flush or invalidate the 3558 cached data for the newly locked region. A client might choose 3559 to invalidate all of the non-modified cached data that it has for 3560 the file but the only requirement for correct operation is to 3561 invalidate all of the data in the newly locked region. 3563 o Second, before releasing a write lock for a region, all modified 3564 data for that region must be flushed to the server. The 3565 modified data must also be written to stable storage. 3567 Note that flushing data to the server and the invalidation of cached 3568 data must reflect the actual byte ranges locked or unlocked. 3569 Rounding these up or down to reflect client cache block boundaries 3570 will cause problems if not carefully done. For example, writing a 3571 modified block when only half of that block is within an area being 3572 unlocked may cause invalid modification to the region outside the 3573 unlocked area. This, in turn, may be part of a region locked by 3574 another client. Clients can avoid this situation by synchronously 3575 performing portions of write operations that overlap that portion 3576 (initial or final) that is not a full block. Similarly, invalidating 3577 a locked area which is not an integral number of full buffer blocks 3578 would require the client to read one or two partial blocks from the 3579 server if the revalidation procedure shows that the data which the 3580 client possesses may not be valid. 3582 The data that is written to the server as a pre-requisite to the 3583 unlocking of a region must be written, at the server, to stable 3584 storage. The client may accomplish this either with synchronous 3585 writes or by following asynchronous writes with a COMMIT operation. 3586 This is required because retransmission of the modified data after a 3587 server reboot might conflict with a lock held by another client. 3589 A client implementation may choose to accommodate applications which 3590 use record locking in non-standard ways (e.g. using a record lock as 3591 a global semaphore) by flushing to the server more data upon a LOCKU 3592 than is covered by the locked range. This may include modified data 3593 within files other than the one for which the unlocks are being done. 3594 In such cases, the client must not interfere with applications whose 3595 READs and WRITEs are being done only within the bounds of record 3596 locks which the application holds. For example, an application locks 3597 a single byte of a file and proceeds to write that single byte. A 3598 client that chose to handle a LOCKU by flushing all modified data to 3599 the server could validly write that single byte in response to an 3600 unrelated unlock. However, it would not be valid to write the entire 3601 block in which that single written byte was located since it includes 3602 an area that is not locked and might be locked by another client. 3603 Client implementations can avoid this problem by dividing files with 3604 modified data into those for which all modifications are done to 3605 areas covered by an appropriate record lock and those for which there 3606 are modifications not covered by a record lock. Any writes done for 3607 the former class of files must not include areas not locked and thus 3608 not modified on the client. 3610 9.3.3.
Data Caching and Mandatory File Locking 3612 Client side data caching needs to respect mandatory file locking when 3613 it is in effect. The presence of mandatory file locking for a given 3614 file is indicated in the result flags for an OPEN. When mandatory 3615 locking is in effect for a file, the client must check for an 3616 appropriate file lock for data being read or written. If a lock 3617 exists for the range being read or written, the client may satisfy 3618 the request using the client's validated cache. If an appropriate 3619 file lock is not held for the range of the read or write, the read or 3620 write request must not be satisfied by the client's cache and the 3621 request must be sent to the server for processing. When a read or 3622 write request partially overlaps a locked region, the request should 3623 be subdivided into multiple pieces with each region (locked or not) 3624 treated appropriately. 3626 9.3.4. Data Caching and File Identity 3628 When clients cache data, the file data needs to be organized according 3629 to the file system object to which the data belongs. For NFS version 3630 3 clients, the typical practice has been to assume for the purpose of 3631 caching that distinct filehandles represent distinct file system 3632 objects. The client then has the choice to organize and maintain the 3633 data cache on this basis. 3635 In the NFS version 4 protocol, there is now the possibility to have 3636 significant deviations from a "one filehandle per object" model 3637 because a filehandle may be constructed on the basis of the object's 3638 pathname. Therefore, clients need a reliable method to determine if 3639 two filehandles designate the same file system object. If clients 3640 were simply to assume that all distinct filehandles denote distinct 3641 objects and proceed to do data caching on this basis, caching 3642 inconsistencies would arise between the distinct client side objects 3643 which mapped to the same server side object. 3645 By providing a method to differentiate filehandles, the NFS version 4 3646 protocol alleviates a potential functional regression in comparison 3647 with the NFS version 3 protocol. Without this method, caching 3648 inconsistencies within the same client could occur, and this possibility was not 3649 present in previous versions of the NFS protocol. Note that it 3650 is possible to have such inconsistencies with applications executing 3651 on multiple clients but that is not the issue being addressed here. 3653 For the purposes of data caching, the following steps allow an NFS 3654 version 4 client to determine whether two distinct filehandles denote 3655 the same server side object: 3657 o If GETATTR directed to two filehandles have different values of 3658 the fsid attribute, then the filehandles represent distinct 3659 objects. 3661 o If GETATTR for any file with an fsid that matches the fsid of 3662 the two filehandles in question returns a unique_handles 3663 attribute with a value of TRUE, then the two objects are 3664 distinct. 3666 o If GETATTR directed to the two filehandles does not return the 3667 fileid attribute for one or both of the handles, then it 3668 cannot be determined whether the two objects are the same. 3669 Therefore, operations which depend on that knowledge (e.g. 3670 client side data caching) cannot be done reliably. 3672 o If GETATTR directed to the two filehandles returns different 3673 values for the fileid attribute, then they are distinct objects. 3675 o Otherwise they are the same object.
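The steps above amount to a simple decision procedure.  The following non-normative C sketch shows one way a client might apply them, assuming it has already fetched the fsid, unique_handles, and (where available) fileid attributes for both filehandles and that fsid4 carries major and minor components; the structure, enumeration, and function names are inventions of this example, not protocol definitions.

   enum fh_cmp { FH_DISTINCT, FH_SAME, FH_UNKNOWN };

   struct cached_attrs {
           fsid4    fsid;            /* fsid attribute */
           bool_t   unique_handles;  /* unique_handles for this fsid */
           bool_t   fileid_present;  /* was the fileid attribute returned? */
           uint64_t fileid;          /* fileid attribute, if present */
   };

   static enum fh_cmp
   fh_same_object(const struct cached_attrs *a, const struct cached_attrs *b)
   {
           /* Different fsid values: the filehandles represent
              distinct objects. */
           if (a->fsid.major != b->fsid.major ||
               a->fsid.minor != b->fsid.minor)
                   return FH_DISTINCT;

           /* unique_handles TRUE for this fsid: distinct filehandles
              imply distinct objects. */
           if (a->unique_handles)
                   return FH_DISTINCT;

           /* fileid unavailable for either handle: identity cannot be
              determined, so per-object data caching cannot rely on it. */
           if (!a->fileid_present || !b->fileid_present)
                   return FH_UNKNOWN;

           /* Different fileid values: distinct objects; otherwise the
              two filehandles denote the same object. */
           return (a->fileid != b->fileid) ? FH_DISTINCT : FH_SAME;
   }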
3677 9.4. Open Delegation 3679 When a file is being OPENed, the server may delegate further handling 3680 of opens and closes for that file to the opening client. Any such 3681 delegation is recallable, since the circumstances that allowed for 3682 the delegation are subject to change. In particular, if the server 3683 receives a conflicting OPEN from another client, the server must 3684 recall the delegation before deciding whether the OPEN from the other 3685 client may be granted. Making a delegation is up to the server and 3686 clients should not assume that any particular OPEN either will or 3687 will not result in an open delegation. The following is a typical 3688 set of conditions that servers might use in deciding whether OPEN 3689 should be delegated: 3691 o The client must be able to respond to the server's callback 3692 requests. The server will use the CB_NULL procedure for a test 3693 of callback ability. 3695 o The client must have responded properly to previous recalls. 3697 o There must be no current open conflicting with the requested 3698 delegation. 3700 o There should be no current delegation that conflicts with the 3701 delegation being requested. 3703 o The probability of future conflicting open requests should be 3704 low based on the recent history of the file. 3706 o There must be no server-specific semantics of OPEN/CLOSE 3707 that would make the required handling incompatible with the 3708 prescribed handling that the delegated client would apply (see 3709 below). 3711 There are two types of open delegations, read and write. A read open 3712 delegation allows a client to handle, on its own, requests to open a 3713 file for reading that do not deny read access to others. Multiple 3714 read open delegations may be outstanding simultaneously and do not 3715 conflict. A write open delegation allows the client to handle, on 3716 its own, all opens. Only one write open delegation may exist for a 3717 given file at a given time and it is inconsistent with any read open 3718 delegations. 3720 When a client has a read open delegation, it may not make any changes 3721 to the contents or attributes of the file but it is assured that no 3722 other client may do so. When a client has a write open delegation, 3723 it may modify the file data since no other client will be accessing 3724 the file's data. The client holding a write delegation may only 3725 affect file attributes which are intimately connected with the file 3726 data: object_size, time_modify, change. 3728 When a client has an open delegation, it does not send OPENs or 3729 CLOSEs to the server but updates the appropriate status internally. 3730 For a read open delegation, opens that cannot be handled locally 3731 (opens for write or that deny read access) must be sent to the 3732 server. 3734 When an open delegation is made, the response to the OPEN contains an 3735 open delegation structure which specifies the following: 3737 o the type of delegation (read or write) 3739 o space limitation information to control flushing of data on 3740 close (write open delegation only, see the section "Open 3741 Delegation and Data Caching") 3743 o an nfsace4 specifying read and write permissions 3745 o a stateid to represent the delegation for READ and WRITE 3747 The stateid is separate and distinct from the stateid for the OPEN 3748 proper.
The standard stateid, unlike the delegation stateid, is 3749 associated with a particular nfs_lockowner and will continue to be 3750 valid after the delegation is recalled and the file remains open. 3752 When a request internal to the client is made to open a file and open 3753 delegation is in effect, it will be accepted or rejected solely on 3754 the basis of the following conditions. Any requirement for other 3755 checks to be made by the delegate should result in open delegation 3756 being denied so that the checks can be made by the server itself. 3758 o The access and deny bits for the request and the file as 3759 described in the section "Share Reservations". 3761 o The read and write permissions as determined below. 3763 The nfsace4 passed with delegation can be used to avoid frequent 3764 ACCESS calls. The permission check should be as follows: 3766 o If the nfsace4 indicates that the open may be done, then it 3767 should be granted without reference to the server. 3769 o If the nfsace4 indicates that the open may not be done, then an 3770 ACCESS request must be sent to the server to obtain the 3771 definitive answer. 3773 The server may return an nfsace4 that is more restrictive than the 3774 actual ACL of the file. This includes an nfsace4 that specifies 3775 denial of all access. Note that some common practices such as 3776 mapping the traditional user "root" to the user "nobody" may make it 3777 incorrect to return the actual ACL of the file in the delegation 3778 response. 3780 The use of delegation together with various other forms of caching 3781 creates the possibility that no server authentication will ever be 3782 performed for a given user since all of the user's requests might be 3783 satisfied locally. Where the client is depending on the server for 3784 authentication, the client should be sure authentication occurs for 3785 each user by use of the ACCESS operation. This should be the case 3786 even if an ACCESS operation would not be required otherwise. As 3787 mentioned before, the server may enforce frequent authentication by 3788 returning an nfsace4 denying all access with every open delegation. 3790 9.4.1. Open Delegation and Data Caching 3792 OPEN delegation allows much of the message overhead associated with 3793 the opening and closing files to be eliminated. An open when an open 3794 delegation is in effect does not require that a validation message be 3795 sent to the server. The continued endurance of the "read open 3796 delegation" provides a guarantee that no OPEN for write and thus no 3797 write has occurred. Similarly, when closing a file opened for write 3798 and if write open delegation is in effect, the data written does not 3799 have to be flushed to the server until the open delegation is 3800 recalled. The continued endurance of the open delegation provides a 3801 guarantee that no open and thus no read or write has been done by 3802 another client. 3804 For the purposes of open delegation, READs and WRITEs done without an 3805 OPEN are treated as the functional equivalents of a corresponding 3806 type of OPEN. This refers to the READs and WRITEs that use the 3807 special stateids consisting of all zero bits or all one bits. 3808 Therefore, READs or WRITEs with a special stateid done by another 3809 client will force the server to recall a write open delegation. A 3810 WRITE with a special stateid done by another client will force a 3811 recall of read open delegations. 
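As a concrete illustration of the rule just stated, the following is a rough, non-normative sketch of how a server might test whether a READ or WRITE carrying one of the special stateids (all zero bits or all one bits) sent by another client forces recall of an outstanding delegation; the enumeration and function names are placeholders invented for this example.

   enum deleg_type { DELEG_NONE, DELEG_READ, DELEG_WRITE };

   /* Returns TRUE if the special-stateid request must trigger recall
      of the delegation(s) held on the file. */
   static bool_t
   special_stateid_forces_recall(enum deleg_type held,
                                 bool_t request_is_write,
                                 bool_t requester_holds_delegation)
   {
           if (held == DELEG_NONE || requester_holds_delegation)
                   return FALSE;

           /* Any READ or WRITE from another client conflicts with a
              write open delegation. */
           if (held == DELEG_WRITE)
                   return TRUE;

           /* Only a WRITE from another client conflicts with
              outstanding read open delegations. */
           return request_is_write;
   }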
3813 With delegations, a client is able to avoid writing data to the 3814 server when the CLOSE of a file is serviced. The CLOSE operation is 3815 the usual point at which the client is notified of a lack of stable 3816 storage for the modified file data generated by the application. At 3817 the CLOSE, file data is written to the server and through normal 3818 accounting the server is able to determine if the available file 3819 system space for the data has been exceeded (i.e. server returns 3820 NFS4ERR_NOSPC or NFS4ERR_DQUOT). This accounting includes quotas. 3821 The introduction of delegations requires that an alternative method be 3822 in place for the same type of communication to occur between client 3823 and server. 3825 In the delegation response, the server provides either the limit of 3826 the size of the file or the number of modified blocks and associated 3827 block size. The server must ensure that the client will be able to 3828 flush data to the server of a size equal to that provided in the 3829 original delegation. The server must make this assurance for all 3830 outstanding delegations. Therefore, the server must be careful in 3831 its management of available space for new or modified data taking 3832 into account available file system space and any applicable quotas. 3833 The server can recall delegations as a result of managing the 3834 available file system space. The client should abide by the server's 3835 stated space limits for delegations. If the client exceeds the stated 3836 limits for the delegation, the server's behavior is undefined. 3838 Based on server conditions, quotas or available file system space, 3839 the server may grant write open delegations with very restrictive 3840 space limitations. The limitations may be defined in a way that will 3841 always force modified data to be flushed to the server on close. 3843 With respect to authentication, flushing modified data to the server 3844 after a CLOSE has occurred may be problematic. For example, the user 3845 of the application may have logged off of the client and unexpired 3846 authentication credentials may not be present. In this case, the 3847 client may need to take special care to ensure that local unexpired 3848 credentials will in fact be available. This may be accomplished by 3849 tracking the expiration time of credentials and flushing data well in 3850 advance of their expiration or by making private copies of 3851 credentials to assure their availability when needed. 3853 9.4.2. Open Delegation and File Locks 3855 When a client holds a write open delegation, lock operations are 3856 performed locally. This includes those required for mandatory file 3857 locking. This can be done since the delegation implies that there 3858 can be no conflicting locks. Similarly, all of the revalidations 3859 that would normally be associated with obtaining locks and the 3860 flushing of data associated with the releasing of locks need not be 3861 done. 3863 9.4.3. Recall of Open Delegation 3865 The following events necessitate recall of an open delegation: 3867 o Potentially conflicting OPEN request (or READ/WRITE done with 3868 "special" stateid) 3870 o SETATTR issued by another client 3872 o REMOVE request for the file 3874 o RENAME request for the file as either source or target of the 3875 RENAME 3877 Whether a RENAME of a directory in the path leading to the file 3878 results in recall of an open delegation depends on the semantics of 3879 the server file system.
If that file system denies such RENAMEs when 3880 a file is open, the recall must be performed to determine whether the 3881 file in question is, in fact, open. 3883 In addition to the situations above, the server may choose to recall 3884 open delegations at any time if resource constraints make it 3885 advisable to do so. Clients should always be prepared for the 3886 possibility of recall. 3888 The server needs to employ special handling for a GETATTR where the 3889 target is a file that has a write open delegation in effect. In this 3890 case, the client holding the delegation needs to be interrogated. 3891 The server will use a CB_GETATTR callback, if the GETATTR attribute 3892 bits include any of the attributes that a write open delegate may 3893 modify (object_size, time_modify, change). 3895 When a client receives a recall for an open delegation, it needs to 3896 update state on the server before returning the delegation. These 3897 same updates must be done whenever a client chooses to return a 3898 delegation voluntarily. The following items of state need to be 3899 dealt with: 3901 o If the file associated with the delegation is no longer open and 3902 no previous CLOSE operation has been sent to the server, a CLOSE 3903 operation must be sent to the server. 3905 o If a file has other open references at the client, then OPEN 3906 operations must be sent to the server. The appropriate stateids 3907 will be provided by the server for subsequent use by the client 3908 since the delegation stateid will no longer be valid. These 3909 OPEN requests are done with the claim type of 3910 CLAIM_DELEGATE_CUR. This will allow the presentation of the 3911 delegation stateid so that the client can establish the 3912 appropriate rights to perform the OPEN. (see the section 3913 "Operation 18: OPEN" for details.) 3915 o If there are granted file locks, the corresponding LOCK 3916 operations need to be performed. This applies to the write open 3917 delegation case only. 3919 o For a write open delegation, if at the time of recall the file 3920 is not open for write, all modified data for the file must be 3921 flushed to the server. If the delegation had not existed, the 3922 client would have done this data flush before the CLOSE 3923 operation. 3925 o For a write open delegation when a file is still open at the 3926 time of recall, any modified data for the file needs to be 3927 flushed to the server. 3929 o With the write open delegation in place, it is possible that the 3930 file was truncated during the duration of the delegation. For 3931 example, the truncation could have occurred as a result of an 3932 OPEN UNCHECKED with an object_size attribute value of zero. 3933 Therefore, if a truncation of the file has occurred and this 3934 operation has not been propagated to the server, the truncation 3935 must occur before any modified data is written to the server. 3937 In the case of write open delegation, file locking imposes some 3938 additional requirements. The flushing of any modified data in any 3939 region for which a write lock was released while the write open 3940 delegation was in effect is what is required to precisely maintain 3941 the associated invariant. However, because the write open delegation 3942 implies no other locking by other clients, a simpler implementation 3943 is to flush all modified data for the file (as described just above) 3944 if any write lock has been released while the write open delegation 3945 was in effect. 3947 9.4.4.
Delegation Revocation 3949 At the point a delegation is revoked, if there are associated opens 3950 on the client, the applications holding these opens need to be 3951 notified. This notification usually occurs by returning errors for 3952 READ/WRITE operations or when a close is attempted for the open file. 3954 If no opens exist for the file at the point the delegation is 3955 revoked, then notification of the revocation is unnecessary. 3956 However, if there is modified data present at the client for the 3957 file, the user of the application should be notified. Unfortunately, 3958 it may not be possible to notify the user since active applications 3959 may not be present at the client. See the section "Revocation 3960 Recovery for Write Open Delegation" for additional details. 3962 9.5. Data Caching and Revocation 3964 When locks and delegations are revoked, the assumptions upon which 3965 successful caching depend are no longer guaranteed. The owner of the 3966 locks or share reservations which have been revoked needs to be 3967 notified. This notification includes applications with a file open 3968 that has a corresponding delegation which has been revoked. Cached 3969 data associated with the revocation must be removed from the client. 3970 In the case of modified data existing in the client's cache, that 3971 data must be removed from the client without it being written to the 3972 server. As mentioned, the assumptions made by the client are no 3973 longer valid at the point when a lock or delegation has been revoked. 3974 For example, another client may have been granted a conflicting lock 3975 after the revocation of the lock at the first client. Therefore, the 3976 data within the lock range may have been modified by the other 3977 client. Obviously, the first client is unable to guarantee to the 3978 application what has occurred to the file in the case of revocation. 3980 Notification to a lock owner will in many cases consist of simply 3981 returning an error on the next and all subsequent READs/WRITEs to the 3982 open file or on the close. Where the methods available to a client 3983 make such notification impossible because errors for certain 3984 operations may not be returned, more drastic action such as signals 3985 or process termination may be appropriate. The justification for 3986 this is that an invariant for which an application depends on may be 3987 violated. Depending on how errors are typically treated for the 3988 client operating environment, further levels of notification 3989 including logging, console messages, and GUI pop-ups may be 3990 appropriate. 3992 9.5.1. Revocation Recovery for Write Open Delegation 3994 Revocation recovery for a write open delegation poses the special 3995 issue of modified data in the client cache while the file is not 3996 open. In this situation, any client which does not flush modified 3997 data to the server on each close must ensure that the user receives 3998 appropriate notification of the failure as a result of the 3999 revocation. Since such situations may require human action to 4000 correct problems, notification schemes in which the appropriate user 4001 or administrator is notified may be necessary. Logging and console 4002 messages are typical examples. 4004 If there is modified data on the client, it must not be flushed 4005 normally to the server. 
A client may attempt to provide a copy of 4006 the file data as modified during the delegation under a different 4007 name in the file system name space to ease recovery. Unless the 4008 client can determine that the file has not been modified by any other 4009 client, this technique must be limited to situations in which a 4010 client has a complete cached copy of the file in question. Use of 4011 such a technique may be limited to files under a certain size or may 4012 only be used when sufficient disk space is guaranteed to be available 4013 within the target file system and when the client has sufficient 4014 buffering resources to keep the cached copy available until it is 4015 properly stored to the target file system. 4017 9.6. Attribute Caching 4019 The attributes discussed in this section do not include named 4020 attributes. Individual named attributes are analogous to files and 4021 caching of the data for these needs to be handled just as data 4022 caching is for ordinary files. Similarly, LOOKUP results from an 4023 OPENATTR directory are to be cached on the same basis as any other 4024 pathnames and similarly for directory contents. 4026 Clients may cache file attributes obtained from the server and use 4027 them to avoid subsequent GETATTR requests. Such caching is write 4028 through in that modification to file attributes is always done by 4029 means of requests to the server and should not be done locally and 4030 cached. The exceptions to this are modifications to attributes that 4031 are intimately connected with data caching. Therefore, extending a 4032 file by writing data to the local data cache is reflected immediately 4033 in the object_size as seen on the client without this change being 4034 immediately reflected on the server. Normally such changes are not 4035 propagated directly to the server but when the modified data is 4036 flushed to the server, analogous attribute changes are made on the 4037 server. When open delegation is in effect, the modified attributes 4038 may be returned to the server in the response to a CB_RECALL call. 4040 The result of local caching of attributes is that the attribute 4041 caches maintained on individual clients will not be coherent. Changes 4042 made in one order on the server may be seen in a different order on 4043 one client and in a third order on a different client. 4045 The typical file system application programming interfaces do not 4046 provide means to atomically modify or interrogate attributes for 4047 multiple files at the same time. The following rules provide an 4048 environment where the potential incoherences mentioned above can be 4049 reasonably managed. These rules are derived from the practice of 4050 previous NFS protocols. 4052 o All attributes for a given file (per-fsid attributes excepted) 4053 are cached as a unit at the client so that no non- 4054 serializability can arise within the context of a single file. 4056 o An upper time boundary is maintained on how long a client cache 4057 entry can be kept without being refreshed from the server. 4059 o When operations are performed that change attributes at the 4060 server, the updated attribute set is requested as part of the 4061 containing RPC. This includes directory operations that update 4062 attributes indirectly. This is accomplished by following the 4063 modifying operation with a GETATTR operation and then using the 4064 results of the GETATTR to update the client's cached attributes.
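The three rules above can be realized with a very small amount of client state.  The fragment below is an illustrative, non-normative sketch only; the structure, the particular staleness bound, and the helper names are assumptions of this example rather than protocol requirements.

   #define ATTR_CACHE_MAX_AGE 30   /* seconds; client policy, not protocol */

   struct attr_cache_entry {
           fattr4 attrs;           /* complete attribute set, cached as a unit */
           time_t fetched_at;      /* when the attributes were obtained */
   };

   /* Second rule: an entry may be used only within the staleness bound. */
   static bool_t
   attr_cache_fresh(const struct attr_cache_entry *e, time_t now)
   {
           return (now - e->fetched_at) <= ATTR_CACHE_MAX_AGE;
   }

   /* First and third rules: the GETATTR that follows a modifying
      operation in the same COMPOUND replaces the cached attribute set
      wholesale (shallow copy shown for brevity). */
   static void
   attr_cache_update(struct attr_cache_entry *e,
                     const fattr4 *getattr_result, time_t now)
   {
           e->attrs = *getattr_result;
           e->fetched_at = now;
   }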
4066 Note that if the full set of attributes to be cached is requested by 4067 READDIR, the results can be cached by the client on the same basis as 4068 attributes obtained via GETATTR. 4070 A client may validate its cached version of attributes for a file by 4071 fetching only the change attribute and assuming that if the change 4072 attribute has the same value as it did when the attributes were 4073 cached, then no attributes have changed. The possible exception is 4074 the attribute time_access. 4076 9.7. Name Caching 4078 The results of LOOKUP and READDIR operations may be cached to avoid 4079 the cost of subsequent LOOKUP operations. Just as in the case of 4080 attribute caching, inconsistencies may arise among the various client 4081 caches. To mitigate the effects of these inconsistencies and given 4082 the context of typical file system APIs, the following rules should 4083 be followed: 4085 o The results of unsuccessful LOOKUPs should not be cached, unless 4086 they are specifically reverified at the point of use. 4088 o An upper time boundary is maintained on how long a client name 4089 cache entry can be kept without verifying that the entry has not 4090 been made invalid by a directory change operation performed by 4091 another client. 4093 When a client is not making changes to a directory for which there 4094 exist name cache entries, the client needs to periodically fetch 4095 attributes for that directory to ensure that it is not being 4096 modified. After determining that no modification has occurred, the 4097 expiration time for the associated name cache entries may be updated 4098 to be the current time plus the name cache staleness bound. 4100 When a client is making changes to a given directory, it needs to 4101 determine whether there have been changes made to the directory by 4102 other clients. It does this by using the change attribute as 4103 reported before and after the directory operation in the associated 4104 change_info4 value returned for the operation. The server is able to 4105 communicate to the client whether the change_info4 data is provided 4106 atomically with respect to the directory operation. If the change 4107 values are provided atomically, the client is then able to compare 4108 the pre-operation change value with the change value in the client's 4109 name cache. If the comparison indicates that the directory was 4110 updated by another client, the name cache associated with the 4111 modified directory is purged from the client. If the comparison 4112 indicates no modification, the name cache can be updated on the 4113 client to reflect the directory operation and the associated timeout 4114 extended. The post-operation change value needs to be saved as the 4115 basis for future change_info4 comparisons. 4117 As demonstrated by the scenario above, name caching requires that the 4118 client revalidate name cache data by inspecting the change attribute 4119 of a directory at the point when the name cache item was cached. 4120 This requires that the server update the change attribute for 4121 directories when the contents of the corresponding directory is 4122 modified. For a client to use the change_info4 information 4123 appropriately and correctly, the server must report the pre and post 4124 operation change attribute values atomically. When the server is 4125 unable to report the before and after values atomically with respect 4126 to the directory operation, the server must indicate that fact in the 4127 change_info4 return value. 
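A minimal sketch of the comparison just described follows, using the atomic, before, and after fields of change_info4; the directory cache structure and the helper functions are hypothetical and exist only to make the sequence of checks concrete.

   static void
   name_cache_after_dir_op(struct dir_name_cache *dc,
                           const change_info4 *ci, time_t now)
   {
           if (ci->atomic && ci->before == dc->cached_change) {
                   /* No other client modified the directory: fold in
                      our own change and extend the cache lifetime. */
                   name_cache_apply_local_change(dc);
                   dc->cached_change = ci->after;
                   dc->expires_at = now + NAME_CACHE_STALENESS_BOUND;
           } else {
                   /* The values were not reported atomically, or the
                      directory was changed by another client: purge
                      the name cache for this directory.  Repopulating
                      the cache re-establishes the change baseline. */
                   name_cache_purge(dc);
           }
   }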
When the information is not atomically 4128 reported, the client should not assume that other clients have not 4129 changed the directory. 4131 9.8. Directory Caching 4133 The results of READDIR operations may be used to avoid subsequent 4134 READDIR operations. Just as in the cases of attribute and name 4135 caching, inconsistencies may arise among the various client caches. 4136 To mitigate the effects of these inconsistencies, and given the 4137 context of typical file system APIs, the following rules should be 4138 followed: 4140 o Cached READDIR information for a directory which is not obtained 4141 in a single READDIR operation must always be a consistent 4142 snapshot of directory contents. This is determined by using a 4143 GETATTR before the first READDIR and after the last of READDIR 4144 that contributes to the cache. 4146 o An upper time boundary is maintained to indicate the length of 4147 time a directory cache entry is considered valid before the 4148 client must revalidate the cached information. 4150 The revalidation technique parallels that discussed in the case of 4151 name caching. When the client is not changing the directory in 4152 question, checking the change attribute of the directory with GETATTR 4153 is adequate. The lifetime of the cache entry can be extended at 4154 these checkpoints. When a client is modifying the directory, the 4155 client needs to use the change_info4 data to determine whether there 4156 are other clients modifying the directory. If it is determined that 4157 no other client modifications are occurring, the client may update 4158 its directory cache to reflect its own changes. 4160 As demonstrated previously, directory caching requires that the 4161 client revalidate directory cache data by inspecting the change 4162 attribute of a directory at the point when the directory was cached. 4163 This requires that the server update the change attribute for 4164 directories when the contents of the corresponding directory is 4165 modified. For a client to use the change_info4 information 4166 appropriately and correctly, the server must report the pre and post 4167 operation change attribute values atomically. When the server is 4168 unable to report the before and after values atomically with respect 4169 to the directory operation, the server must indicate that fact in the 4170 change_info4 return value. When the information is not atomically 4171 reported, the client should not assume that other clients have not 4172 changed the directory. 4174 10. Minor Versioning 4176 To address the requirement of an NFS protocol that can evolve as the 4177 need arises, the NFS version 4 protocol contains the rules and 4178 framework to allow for future minor changes or versioning. 4180 The base assumption with respect to minor versioning is that any 4181 future accepted minor version must follow the IETF process and be 4182 documented in a standards track RFC. Therefore, each minor version 4183 number will correspond to an RFC. Minor version zero of the NFS 4184 version 4 protocol is represented by this RFC. The COMPOUND 4185 procedure will support the encoding of the minor version being 4186 requested by the client. 4188 The following items represent the basic rules for the development of 4189 minor versions. Note that a future minor version may decide to 4190 modify or add to the following rules as part of the minor version 4191 definition. 
4193 1 Procedures are not added or deleted 4195 To maintain the general RPC model, NFS version 4 minor versions 4196 will not add or delete procedures from the NFS program. 4198 2 Minor versions may add operations to the COMPOUND and 4199 CB_COMPOUND procedures. 4201 The addition of operations to the COMPOUND and CB_COMPOUND 4202 procedures does not affect the RPC model. 4204 2.1 Minor versions may append attributes to GETATTR4args, bitmap4, 4205 and GETATTR4res. 4207 This allows for the expansion of the attribute model to allow 4208 for future growth or adaptation. 4210 2.2 Minor version X must append any new attributes after the last 4211 documented attribute. 4213 Since attribute results are specified as an opaque array of 4214 per-attribute XDR encoded results, the complexity of adding new 4215 attributes in the midst of the current definitions will be too 4216 burdensome. 4218 3 Minor versions must not modify the structure of an existing 4219 operation's arguments or results. 4221 Again the complexity of handling multiple structure definitions 4222 for a single operation is too burdensome. New operations should 4223 be added instead of modifying existing structures for a minor 4224 version. 4226 This rule does not preclude the following adaptations in a minor 4227 version. 4229 o adding bits to flag fields such as new attributes to 4230 GETATTR's bitmap4 data type 4232 o adding bits to existing attributes like ACLs that have flag 4233 words 4235 o extending enumerated types (including NFS4ERR_*) with new 4236 values 4238 4 Minor versions may not modify the structure of existing 4239 attributes. 4241 5 Minor versions may not delete operations. 4243 This prevents the potential reuse of a particular operation 4244 "slot" in a future minor version. 4246 6 Minor versions may not delete attributes. 4248 7 Minor versions may not delete flag bits or enumeration values. 4250 8 Minor versions may declare an operation as mandatory to NOT 4251 implement. 4253 Specifying an operation as "mandatory to not implement" is 4254 equivalent to obsoleting an operation. For the client, it means 4255 that the operation should not be sent to the server. For the 4256 server, an NFS error can be returned as opposed to "dropping" 4257 the request as an XDR decode error. This approach allows for 4258 the obsolescence of an operation while maintaining its structure 4259 so that a future minor version can reintroduce the operation. 4261 8.1 Minor versions may declare attributes mandatory to NOT 4262 implement. 4264 8.2 Minor versions may declare flag bits or enumeration values as 4265 mandatory to NOT implement. 4267 9 Minor versions may downgrade features from mandatory to 4268 recommended, or recommended to optional. 4270 10 Minor versions may upgrade features from optional to recommended 4271 or recommended to mandatory. 4273 11 A client and server that support minor version X must support 4274 minor versions 0 (zero) through X-1 as well. 4276 12 No new features may be introduced as mandatory in a minor 4277 version. 4279 This rule allows for the introduction of new functionality and 4280 forces the use of implementation experience before designating a 4281 feature as mandatory. 4283 13 A client MUST NOT attempt to use a stateid, file handle, or 4284 similar returned object from the COMPOUND procedure with minor 4285 version X for another COMPOUND procedure with minor version Y, 4286 where X != Y. 4288 11. 
Internationalization 4290 The primary area in which NFS needs to deal with 4291 internationalization, or i18n, is file names and 4292 other strings as used within the protocol. NFS' choice of string 4293 representation must allow reasonable name/string access to clients 4294 which use various languages. The UTF-8 encoding of the UCS as 4295 defined by [ISO10646] allows for this type of access and follows the 4296 policy described in "IETF Policy on Character Sets and Languages", 4297 [RFC2277]. This choice is explained further in the following. 4299 11.1. Universal Versus Local Character Sets 4301 [RFC1345] describes a table of 16 bit characters for many different 4302 languages (the bit encodings match Unicode, though of course RFC1345 4303 is somewhat out of date with respect to current Unicode assignments). 4304 Each character from each language has a unique 16 bit value in the 16 4305 bit character set. Thus this table can be thought of as a universal 4306 character set. [RFC1345] then talks about groupings of subsets of 4307 the entire 16 bit character set into "Charset Tables". For example 4308 one might take all the Greek characters from the 16 bit table (which 4309 are consecutively allocated), and normalize their offsets to a table 4310 that fits in 7 bits. Thus it is determined that "lower case alpha" 4311 is in the same position as "upper case a" in the US-ASCII table, and 4312 "upper case alpha" is in the same position as "lower case a" in the 4313 US-ASCII table. 4315 These normalized subset character sets can be thought of as "local 4316 character sets", suitable for an operating system locale. 4318 Local character sets are not suitable for the NFS protocol. Consider 4319 someone who creates a file with a name in a Swedish character set. 4320 If someone else later goes to access the file with their locale set 4321 to the Swedish language, then there are no problems. But if someone 4322 in, say, the US-ASCII locale goes to access the file, the file name 4323 will look very different, because the Swedish characters in the 7 bit 4324 table will now be represented in US-ASCII characters on the display. 4325 It would be preferable to give the US-ASCII user a way to display the 4326 file name using Swedish glyphs. In order to do that, the NFS protocol 4327 would have to include the locale with the file name on each operation 4328 to create a file. 4330 But then what of the situation when there is a path name on the 4331 server like: 4333 /component-1/component-2/component-3 4335 Each component could have been created with a different locale. If 4336 one issues CREATE with a multi-component path name, and if some of the 4337 leading components already exist, what is to be done with the 4338 existing components? Is the current locale attribute replaced with 4339 the user's current one? These types of situations quickly become too 4340 complex when there is an alternate solution. 4342 If the NFS version 4 protocol used a universal 16 bit or 32 bit 4343 character set (or an encoding of a 16 bit or 32 bit character set 4344 into octets), then the server and client need not care if the locale 4345 of the user accessing the file is different than the locale of the 4346 user who created the file. The unique 16 bit or 32 bit encoding of 4347 the character allows for determination of what language the character 4348 is from and also how to display that character on the client. The 4349 server need not know what locales are used. 4351 11.2.
Overview of Universal Character Set Standards 4353 The previous section makes a case for using a universal character 4354 set. This section makes the case for using UTF-8 as the specific 4355 universal character set for the NFS version 4 protocol. 4357 [RFC2279] discusses UTF-* (UTF-8 and other UTF-XXX encodings), 4358 Unicode, and UCS-*. There are two standards bodies managing 4359 universal code sets: 4361 o ISO/IEC which has the standard 10646-1 4363 o Unicode which has the Unicode standard 4365 Both standards bodies have pledged to track each other's assignments 4366 of character codes. 4368 The following is a brief analysis of the various standards. 4370 UCS Universal Character Set. This is ISO/IEC 10646-1: "a 4371 multi-octet character set called the Universal Character 4372 Set (UCS), which encompasses most of the world's writing 4373 systems." 4375 UCS-2 a two octet per character encoding that addresses the first 4376 2^16 characters of UCS. Currently there are no UCS 4377 characters beyond that range. 4379 UCS-4 a four octet per character encoding that permits the 4380 encoding of up to 2^31 characters. 4382 UTF UTF is an abbreviation of the term "UCS transformation 4383 format" and is used in the naming of various standards for 4384 encoding of UCS characters as described below. 4386 UTF-1 Only historical interest; it has been removed from 10646-1 4387 UTF-7 Encodes the entire "repertoire" of UCS "characters using 4388 only octets with the higher order bit clear". [RFC2152] 4389 describes UTF-7. UTF-7 accomplishes this by reserving one 4390 of the 7bit US-ASCII characters as a "shift" character to 4391 indicate non-US-ASCII characters. 4393 UTF-8 Unlike UTF-7, uses all 8 bits of the octets. US-ASCII 4394 characters are encoded as before unchanged. Any octet with 4395 the high bit cleared can only mean a US-ASCII character. 4396 The high bit set means that a UCS character is being 4397 encoded. 4399 UTF-16 Encodes UCS-4 characters into UCS-2 characters using a 4400 reserved range in UCS-2. 4402 Unicode Unicode and UCS-2 are the same; [RFC2279] states: 4404 Up to the present time, changes in Unicode and amendments 4405 to ISO/IEC 10646 have tracked each other, so that the 4406 character repertoires and code point assignments have 4407 remained in sync. The relevant standardization committees 4408 have committed to maintain this very useful synchronism. 4410 11.3. Difficulties with UCS-4, UCS-2, Unicode 4412 Adapting existing applications, and file systems to multi-octet 4413 schemes like UCS and Unicode can be difficult. A significant amount 4414 of code has been written to process streams of bytes. Also there are 4415 many existing stored objects described with 7 bit or 8 bit 4416 characters. Doubling or quadrupling the bandwidth and storage 4417 requirements seems like an expensive way to accomplish I18N. 4419 UCS-2 and Unicode are "only" 16 bits long. That might seem to be 4420 enough but, according to [Unicode1], 49,194 Unicode characters are 4421 already assigned. According to [Unicode2] there are still more 4422 languages that need to be added. 4424 11.4. UTF-8 and its solutions 4426 UTF-8 solves problems for NFS that exist with the use of UCS and 4427 Unicode. UTF-8 will encode 16 bit and 32 bit characters in a way 4428 that will be compact for most users. The encoding table from UCS-4 to 4429 UTF-8, as copied from [RFC2279]: 4431 UCS-4 range (hex.) 
UTF-8 octet sequence (binary) 4432 0000 0000-0000 007F 0xxxxxxx 4433 0000 0080-0000 07FF 110xxxxx 10xxxxxx 4434 0000 0800-0000 FFFF 1110xxxx 10xxxxxx 10xxxxxx 4435 0001 0000-001F FFFF 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx 4436 0020 0000-03FF FFFF 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 4437 0400 0000-7FFF FFFF 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 4438 10xxxxxx 4440 See [RFC2279] for precise encoding and decoding rules. Note because 4441 of UTF-16, the algorithm from Unicode/UCS-2 to UTF-8 needs to account 4442 for the reserved range between D800 and DFFF. 4444 Note that the 16 bit UCS or Unicode characters require no more than 3 4445 octets to encode into UTF-8 4447 Interestingly, UTF-8 has room to handle characters larger than 31 4448 bits, because the leading octet of form: 4450 1111111x 4452 is not defined. If needed, ISO could either use that octet to 4453 indicate a sequence of an encoded 8 octet character, or perhaps use 4454 11111110 to permit the next octet to indicate an even more expandable 4455 character set. 4457 So using UTF-8 to represent character encodings means never having to 4458 run out of room. 4460 11.5. Normalization 4462 The client and server operating environments may differ in their 4463 policies and operational methods with respect to character 4464 normalization (See [Unicode1] for a discussion of normalization 4465 forms). This difference may also exist between applications on the 4466 same client. This adds to the difficulty of providing a single 4467 normalization policy for the protocol that allows for maximal 4468 interoperability. This issue is similar to the character case issues 4469 where the server may or may not support case insensitive file name 4470 matching and may or may not preserve the character case when storing 4471 file names. The protocol does not mandate a particular behavior but 4472 allows for the various permutations. 4474 The NFS version 4 protocol does not mandate the use of a particular 4475 normalization form at this time. A later revision of this 4476 specification may specify a particular normalization form. 4477 Therefore, the server and client can expect that they may receive 4478 unnormalized characters within protocol requests and responses. If 4479 the operating environment requires normalization, then the 4480 implementation must normalize the various UTF-8 encoded strings 4481 within the protocol before presenting the information to an 4482 application (at the client) or local file system (at the server). 4484 12. Error Definitions 4486 NFS error numbers are assigned to failed operations within a compound 4487 request. A compound request contains a number of NFS operations that 4488 have their results encoded in sequence in a compound reply. The 4489 results of successful operations will consist of an NFS4_OK status 4490 followed by the encoded results of the operation. If an NFS 4491 operation fails, an error status will be entered in the reply and the 4492 compound request will be terminated. 4494 A description of each defined error follows: 4496 NFS4_OK Indicates the operation completed successfully. 4498 NFS4ERR_ACCES Permission denied. The caller does not have the 4499 correct permission to perform the requested 4500 operation. Contrast this with NFS4ERR_PERM, 4501 which restricts itself to owner or privileged 4502 user permission failures. 4504 NFS4ERR_BADHANDLE Illegal NFS file handle. The file handle failed 4505 internal consistency checks. 
4507 NFS4ERR_BADTYPE An attempt was made to create an object of a 4508 type not supported by the server. 4510 NFS4ERR_BAD_COOKIE READDIR cookie is stale. 4512 NFS4ERR_BAD_SEQID The sequence number in a locking request is 4513 neither the next expected number or the last 4514 number processed. 4516 NFS4ERR_BAD_STATEID A stateid generated by the current server 4517 instance, but which does not designate any 4518 locking state (either current or superseded) 4519 for a current lockowner-file pair, was used. 4521 NFS4ERR_CLID_INUSE The SETCLIENTID procedure has found that a 4522 client id is already in use by another client. 4524 NFS4ERR_DELAY The server initiated the request, but was not 4525 able to complete it in a timely fashion. The 4526 client should wait and then try the request 4527 with a new RPC transaction ID. For example, 4528 this error should be returned from a server 4529 that supports hierarchical storage and receives 4530 a request to process a file that has been 4531 migrated. In this case, the server should start 4532 the immigration process and respond to client 4533 with this error. This error may also occur 4534 when a necessary delegation recall makes 4535 processing a request in a timely fashion 4536 impossible. 4538 NFS4ERR_DENIED An attempt to lock a file is denied. Since 4539 this may be a temporary condition, the client 4540 is encouraged to retry the lock request until 4541 the lock is accepted. 4543 NFS4ERR_DQUOT Resource (quota) hard limit exceeded. The 4544 user's resource limit on the server has been 4545 exceeded. 4547 NFS4ERR_EXIST File exists. The file specified already exists. 4549 NFS4ERR_EXPIRED A lease has expired that is being used in the 4550 current procedure. 4552 NFS4ERR_FBIG File too large. The operation would have caused 4553 a file to grow beyond the server's limit. 4555 NFS4ERR_FHEXPIRED The file handle provided is volatile and has 4556 expired at the server. 4558 NFS4ERR_GRACE The server is in its recovery or grace period 4559 which should match the lease period of the 4560 server. 4562 NFS4ERR_INVAL Invalid argument or unsupported argument for an 4563 operation. Two examples are attempting a 4564 READLINK on an object other than a symbolic 4565 link or attempting to SETATTR a time field on a 4566 server that does not support this operation. 4568 NFS4ERR_IO I/O error. A hard error (for example, a disk 4569 error) occurred while processing the requested 4570 operation. 4572 NFS4ERR_ISDIR Is a directory. The caller specified a 4573 directory in a non-directory operation. 4575 NFS4ERR_LEASE_MOVED A lease being renewed is associated with a file 4576 system that has been migrated to a new server. 4578 NFS4ERR_LOCKED A read or write operation was attempted on a 4579 locked file. 4581 NFS4ERR_LOCK_RANGE A lock request is operating on a sub-range of a 4582 current lock for the lock owner and the server 4583 does not support this type of request. 4585 NFS4ERR_MINOR_VERS_MISMATCH 4586 The server has received a request that 4587 specifies an unsupported minor version. The 4588 server must return a COMPOUND4res with a zero 4589 length operations result array. 4591 NFS4ERR_MLINK Too many hard links. 4593 NFS4ERR_MOVED The filesystem which contains the current 4594 filehandle object has been relocated or 4595 migrated to another server. The client may 4596 obtain the new filesystem location by obtaining 4597 the "fs_locations" attribute for the current 4598 filehandle. 
For further discussion, refer to 4599 the section "Filesystem Migration or 4600 Relocation". 4602 NFS4ERR_NAMETOOLONG The filename in an operation was too long. 4604 NFS4ERR_NODEV No such device. 4606 NFS4ERR_NOENT No such file or directory. The file or 4607 directory name specified does not exist. 4609 NFS4ERR_NOFILEHANDLE The logical current file handle value has not 4610 been set properly. This may be a result of a 4611 malformed COMPOUND operation (i.e. no PUTFH or 4612 PUTROOTFH before an operation that requires the 4613 current file handle be set). 4615 NFS4ERR_NOSPC No space left on device. The operation would 4616 have caused the server's file system to exceed 4617 its limit. 4619 NFS4ERR_NOTDIR Not a directory. The caller specified a non- 4620 directory in a directory operation. 4622 NFS4ERR_NOTEMPTY An attempt was made to remove a directory that 4623 was not empty. 4625 NFS4ERR_NOTSUPP Operation is not supported. 4627 NFS4ERR_NOT_SAME This error is returned by the VERIFY operation 4628 to signify that the attributes compared were 4629 not the same as provided in the client's 4630 request. 4632 NFS4ERR_NXIO I/O error. No such device or address. 4634 NFS4ERR_OLD_STATEID A stateid which designates the locking state 4635 for a lockowner-file at an earlier time was 4636 used. 4638 NFS4ERR_PERM Not owner. The operation was not allowed 4639 because the caller is either not a privileged 4640 user (root) or not the owner of the target of 4641 the operation. 4643 NFS4ERR_READDIR_NOSPC The encoded response to a READDIR request 4644 exceeds the size limit set by the initial 4645 request. 4647 NFS4ERR_RESOURCE For the processing of the COMPOUND procedure, 4648 the server may exhaust available resources and 4649 can not continue processing procedures within 4650 the COMPOUND operation. This error will be 4651 returned from the server in those instances of 4652 resource exhaustion related to the processing 4653 of the COMPOUND procedure. 4655 NFS4ERR_ROFS Read-only file system. A modifying operation 4656 was attempted on a read-only file system. 4658 NFS4ERR_SAME This error is returned by the NVERIFY operation 4659 to signify that the attributes compared were 4660 the same as provided in the client's request. 4662 NFS4ERR_SERVERFAULT An error occurred on the server which does not 4663 map to any of the legal NFS version 4 protocol 4664 error values. The client should translate this 4665 into an appropriate error. UNIX clients may 4666 choose to translate this to EIO. 4668 NFS4ERR_SHARE_DENIED An attempt to OPEN a file with a share 4669 reservation has failed because of a share 4670 conflict. 4672 NFS4ERR_STALE Invalid file handle. The file handle given in 4673 the arguments was invalid. The file referred to 4674 by that file handle no longer exists or access 4675 to it has been revoked. 4677 NFS4ERR_STALE_CLIENTID A clientid not recognized by the server was 4678 used in a locking or SETCLIENTID_CONFIRM 4679 request. 4681 NFS4ERR_STALE_STATEID A stateid generated by an earlier server 4682 instance was used. 4684 NFS4ERR_SYMLINK The current file handle provided for a LOOKUP 4685 is not a directory but a symbolic link. Also 4686 used if the final component of the OPEN path is 4687 a symbolic link. 4689 NFS4ERR_TOOSMALL Buffer or request is too small. 4691 NFS4ERR_WRONGSEC The security mechanism being used by the client 4692 for the procedure does not match the server's 4693 security policy. The client should change the 4694 security mechanism being used and retry the 4695 operation. 
4697 NFS4ERR_XDEV Attempt to do a cross-device hard link. 4699 13. NFS Version 4 Requests 4701 For the NFS version 4 RPC program, there are two traditional RPC 4702 procedures: NULL and COMPOUND. All other functionality is defined as 4703 a set of operations and these operations are defined in normal 4704 XDR/RPC syntax and semantics. However, these operations are 4705 encapsulated within the COMPOUND procedure. This requires that the 4706 client combine one or more of the NFS version 4 operations into a 4707 single request. 4709 The NFS4_CALLBACK program is used to provide server to client 4710 signaling and is constructed in a similar fashion as the NFS version 4711 4 program. The procedures CB_NULL and CB_COMPOUND are defined in the 4712 same way as NULL and COMPOUND are within the NFS program. The 4713 CB_COMPOUND request also encapsulates the remaining operations of the 4714 NFS4_CALLBACK program. There is no predefined RPC program number for 4715 the NFS4_CALLBACK program. It is up to the client to specify a 4716 program number in the "transient" program range. The program and 4717 port number of the NFS4_CALLBACK program are provided by the client 4718 as part of the SETCLIENTID operation and are therefore fixed for the 4719 life of the client instantiation. 4721 13.1. Compound Procedure 4723 The COMPOUND procedure provides the opportunity for better 4724 performance within high latency networks. The client can avoid 4725 cumulative latency of multiple RPCs by combining multiple dependent 4726 operations into a single COMPOUND procedure. A compound operation 4727 may provide for protocol simplification by allowing the client to 4728 combine basic procedures into a single request that is customized for 4729 the client's environment. 4731 The CB_COMPOUND procedure precisely parallels the features of 4732 COMPOUND as described above. 4734 The basics of the COMPOUND procedure's construction are: 4736 +-----------+-----------+-----------+-- 4737 | op + args | op + args | op + args | 4738 +-----------+-----------+-----------+-- 4740 and the reply looks like this: 4742 +------------+-----------------------+-----------------------+-- 4743 |last status | status + op + results | status + op + results | 4744 +------------+-----------------------+-----------------------+-- 4746 13.2. Evaluation of a Compound Request 4748 The server will process the COMPOUND procedure by evaluating each of 4749 the operations within the COMPOUND procedure in order. Each 4750 component operation consists of a 32 bit operation code, followed by 4751 the argument of length determined by the type of operation. The 4752 results of each operation are encoded in sequence into a reply 4753 buffer. The results of each operation are preceded by the opcode and 4754 a status code (normally zero). If an operation results in a non-zero 4755 status code, the status will be encoded and evaluation of the 4756 compound sequence will halt and the reply will be returned. Note 4757 that evaluation stops even in the event of "non error" conditions 4758 such as NFS4ERR_SAME. 4760 There are no atomicity requirements for the operations contained 4761 within the COMPOUND procedure. The operations being evaluated as 4762 part of a COMPOUND request may be evaluated simultaneously with other 4763 COMPOUND requests that the server receives. 4765 It is the client's responsibility to recover from any partially 4766 completed COMPOUND procedure.
It is the client's responsibility to recover from any partially completed COMPOUND procedure.  Partially completed COMPOUND procedures may occur at any point due to errors such as NFS4ERR_RESOURCE and NFS4ERR_LONG_DELAY.  This may occur even given an otherwise valid sequence of operations.  Further, a server reboot which occurs in the middle of processing a COMPOUND procedure may leave the client with the difficult task of determining how far COMPOUND processing has proceeded.  Therefore, to keep recovery from the failure of an operation within the procedure manageable, the client should avoid overly complex COMPOUND procedures.

Each operation assumes a "current" and "saved" filehandle that are available as part of the execution context of the compound request.  Operations may set, change, or return the current filehandle.  The "saved" filehandle is used for temporary storage of a filehandle value and as an operand for the RENAME and LINK operations.

13.3. Synchronous Modifying Operations

NFS version 4 operations that modify the file system are synchronous.  When an operation is successfully completed at the server, the client can depend on any data associated with the request being on stable storage (the one exception is the file data in a WRITE operation with the UNSTABLE option specified).

This implies that any previous operations within the same compound request are also reflected in stable storage.  This behavior enables the client to recover from a partially executed compound request which may have resulted from a failure of the server.  For example, if a compound request contains operations A and B and the server is unable to send a response to the client, then depending on the progress the server made in servicing the request, the results of both operations may be reflected in stable storage, or just the result of operation A may be reflected.  The server must not have just the results of operation B in stable storage.

13.4. Operation Values

The operations encoded in the COMPOUND procedure are identified by operation values.  To avoid overlap with the RPC procedure numbers, operations 0 (zero) and 1 are not defined.  Operation 2 is not defined but reserved for future use with minor versioning.

14. NFS Version 4 Procedures

14.1. Procedure 0: NULL - No Operation

SYNOPSIS

   <null>

ARGUMENT

   void;

RESULT

   void;

DESCRIPTION

Standard NULL procedure.  Void argument, void response.  This procedure has no functionality associated with it.  Because of this, it is sometimes used to measure the overhead of processing a service request.  Therefore, the server should ensure that no unnecessary work is done in servicing this procedure.

ERRORS

   None.

14.2. Procedure 1: COMPOUND - Compound Operations

SYNOPSIS

   compoundargs -> compoundres

ARGUMENT

   union nfs_argop4 switch (nfs_opnum4 argop) {
           case <OPCODE>: <argument>;
           ...
   };

   struct COMPOUND4args {
           utf8string      tag;
           uint32_t        minorversion;
           nfs_argop4      argarray<>;
   };

RESULT

   union nfs_resop4 switch (nfs_opnum4 resop) {
           case <OPCODE>: <result>;
           ...
   };

   struct COMPOUND4res {
           nfsstat4        status;
           utf8string      tag;
           nfs_resop4      resarray<>;
   };

DESCRIPTION

The COMPOUND procedure is used to combine one or more of the NFS operations into a single RPC request.  The main NFS RPC program has two main procedures: NULL and COMPOUND.
All other operations use the COMPOUND procedure as a wrapper.

The COMPOUND procedure is used to combine individual operations into a single RPC request.  The server interprets each of the operations in turn.  If an operation is executed by the server and the status of that operation is NFS4_OK, then the next operation in the COMPOUND procedure is executed.  The server continues this process until there are no more operations to be executed or one of the operations has a status value other than NFS4_OK.

In the processing of the COMPOUND procedure, the server may find that it does not have the available resources to execute any or all of the operations within the COMPOUND sequence.  In this case, the error NFS4ERR_RESOURCE will be returned for the particular operation within the COMPOUND procedure where the resource exhaustion occurred.  This assumes that all previous operations within the COMPOUND sequence have been evaluated successfully.  The results for all of the evaluated operations must be returned to the client.

The COMPOUND arguments contain a "minorversion" field.  The initial and default value for this field is 0 (zero).  This field will be used by future minor versions such that the client can communicate to the server what minor version is being requested.  If the server receives a COMPOUND procedure with a minorversion field value that it does not support, the server MUST return an error of NFS4ERR_MINOR_VERS_MISMATCH and a zero length resultdata array.

Contained within the COMPOUND results is a "status" field.  If the results array length is non-zero, this status must be equivalent to the status of the last operation that was executed within the COMPOUND procedure.  Therefore, if an operation incurred an error then the "status" value will be the same error value as is being returned for the operation that failed.

Note that operations 0 (zero) and 1 (one) are not defined for the COMPOUND procedure.  If the server receives an operation array with either of these included, an error of NFS4ERR_NOTSUPP must be returned.  Operation 2 is not defined but reserved for future definition and use with minor versioning.  If the server receives an operation array that contains operation 2 and the minorversion field has a value of 0 (zero), an error of NFS4ERR_NOTSUPP is returned.  If an operation array contains an operation 2 and the minorversion field is non-zero and the server does not support the minor version, the server returns an error of NFS4ERR_MINOR_VERS_MISMATCH.  Therefore, the NFS4ERR_MINOR_VERS_MISMATCH error takes precedence over all other errors.

IMPLEMENTATION

Note that the definition of the "tag" in both the request and response is left to the implementor.  It may be used to summarize the content of the compound request for the benefit of packet sniffers and engineers debugging implementations.

Since an error of any type may occur after only a portion of the operations have been evaluated, the client must be prepared to recover from any failure.  If the source of an NFS4ERR_RESOURCE error was a complex or lengthy set of operations, it is likely that if the number of operations were reduced the server would be able to evaluate them successfully.  Therefore, the client is responsible for dealing with this type of complexity in recovery.
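As a non-normative illustration of how a client combines operations, a single COMPOUND that opens a file by name and reads its first 4096 bytes might be built as follows (arguments abbreviated).  If any operation fails, the operations after it are not evaluated; note that GETFH follows OPEN so that the client obtains the filehandle it will later need for CLOSE.

   PUTFH  (directory filehandle)
   OPEN   "foo" (CLAIM_NULL; openhow, owner, seqid, access, deny)
   GETFH
   READ   (stateid, offset=0, count=4096)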
ERRORS

   All errors defined in the protocol

14.2.1. Operation 3: ACCESS - Check Access Rights

SYNOPSIS

   (cfh), accessreq -> supported, accessrights

ARGUMENT

   const ACCESS4_READ      = 0x00000001;
   const ACCESS4_LOOKUP    = 0x00000002;
   const ACCESS4_MODIFY    = 0x00000004;
   const ACCESS4_EXTEND    = 0x00000008;
   const ACCESS4_DELETE    = 0x00000010;
   const ACCESS4_EXECUTE   = 0x00000020;

   struct ACCESS4args {
           /* CURRENT_FH: object */
           uint32_t        access;
   };

RESULT

   struct ACCESS4resok {
           uint32_t        supported;
           uint32_t        access;
   };

   union ACCESS4res switch (nfsstat4 status) {
   case NFS4_OK:
           ACCESS4resok    resok4;
   default:
           void;
   };

DESCRIPTION

ACCESS determines the access rights that a user, as identified by the credentials in the RPC request, has with respect to the file system object specified by the current filehandle.  The client encodes the set of access rights that are to be checked in the bit mask "access".  The server checks the permissions encoded in the bit mask.  If a status of NFS4_OK is returned, two bit masks are included in the response.  The first, "supported", represents the access rights which the server can verify reliably.  The second, "access", represents the access rights available to the user for the filehandle provided.  On success, the current filehandle retains its value.

Note that the supported field will contain only as many values as were originally sent in the arguments.  For example, if the client sends an ACCESS operation with only the ACCESS4_READ value set and the server supports this value, the server will return only ACCESS4_READ even if it could have reliably checked other values.

The results of this operation are necessarily advisory in nature.  A return status of NFS4_OK and the appropriate bit set in the bit mask does not imply that such access will be allowed to the file system object in the future.  This is because access rights can be revoked by the server at any time.

The following access permissions may be requested:

   ACCESS4_READ     Read data from file or read a directory.

   ACCESS4_LOOKUP   Look up a name in a directory (no meaning for non-directory objects).

   ACCESS4_MODIFY   Rewrite existing file data or modify existing directory entries.

   ACCESS4_EXTEND   Write new data or add directory entries.

   ACCESS4_DELETE   Delete an existing directory entry (no meaning for non-directory objects).

   ACCESS4_EXECUTE  Execute file (no meaning for a directory).

On success, the current filehandle retains its value.

IMPLEMENTATION

For the NFS version 4 protocol, the use of the ACCESS procedure when opening a regular file is deprecated in favor of using OPEN.

In general, it is not sufficient for the client to attempt to deduce access permissions by inspecting the uid, gid, and mode fields in the file attributes or by attempting to interpret the contents of the ACL attribute.  This is because the server may perform uid or gid mapping or enforce additional access control restrictions.  It is also possible that the server may not be in the same ID space as the client.  In these cases (and perhaps others), the client cannot reliably perform an access check with only current file attributes.
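For example (illustrative only), a client that needs to know whether the user may read and modify a file could send:

   PUTFH  (file filehandle)
   ACCESS (ACCESS4_READ | ACCESS4_MODIFY)

The server replies with the subset of the requested bits it could check reliably in "supported" and the rights actually available to the user in "access".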
In the NFS version 2 protocol, the only reliable way to determine whether an operation was allowed was to try it and see if it succeeded or failed.  Using the ACCESS procedure in the NFS version 4 protocol, the client can ask the server to indicate whether or not one or more classes of operations are permitted.  The ACCESS operation is provided to allow clients to check before doing a series of operations which will result in an access failure.  The OPEN operation provides a point where the server can verify access to the file object and a method to return that information to the client.  The ACCESS operation is still useful for directory operations or for use in the case that the UNIX API "access" is used on the client.

The information returned by the server in response to an ACCESS call is not permanent.  It was correct at the exact time that the server performed the checks, but not necessarily afterwards.  The server can revoke access permission at any time.

The client should use the effective credentials of the user to build the authentication information in the ACCESS request used to determine access rights.  It is the effective user and group credentials that are used in subsequent read and write operations.

Many implementations do not directly support the ACCESS4_DELETE permission.  Operating systems like UNIX will ignore the ACCESS4_DELETE bit if set on an access request on a non-directory object.  In these systems, delete permission on a file is determined by the access permissions on the directory in which the file resides, instead of being determined by the permissions of the file itself.  Therefore, the mask returned enumerating which access rights can be determined will have the ACCESS4_DELETE value set to 0.  This indicates to the client that the server was unable to check that particular access right.  The ACCESS4_DELETE bit in the access mask returned will then be ignored by the client.

ERRORS

   NFS4ERR_ACCES
   NFS4ERR_BADHANDLE
   NFS4ERR_DELAY
   NFS4ERR_FHEXPIRED
   NFS4ERR_IO
   NFS4ERR_MOVED
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_RESOURCE
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_WRONGSEC

14.2.2. Operation 4: CLOSE - Close File

SYNOPSIS

   (cfh), seqid, stateid -> stateid

ARGUMENT

   struct CLOSE4args {
           /* CURRENT_FH: object */
           seqid4          seqid;
           stateid4        stateid;
   };

RESULT

   union CLOSE4res switch (nfsstat4 status) {
   case NFS4_OK:
           stateid4        stateid;
   default:
           void;
   };

DESCRIPTION

The CLOSE operation releases share reservations for the file as specified by the current filehandle.  The share reservations and other state information released at the server as a result of this CLOSE are only those associated with the supplied stateid.  The sequence id provides for the correct ordering.  State associated with other OPENs is not affected.

If record locks are held, the client SHOULD release all locks before issuing a CLOSE.  The server MAY free all outstanding locks on CLOSE, but some servers may not support the CLOSE of a file that still has record locks held.  The server MUST return failure if any locks would exist after the CLOSE.
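As a non-normative illustration, a client that still holds a record lock would typically release it and then close the file within one COMPOUND (arguments abbreviated):

   PUTFH (file filehandle)
   LOCKU (locktype, seqid, lock stateid, offset, length)
   CLOSE (seqid, open stateid)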
On success, the current filehandle retains its value.

IMPLEMENTATION

ERRORS

   NFS4ERR_BADHANDLE
   NFS4ERR_BAD_SEQID
   NFS4ERR_BAD_STATEID
   NFS4ERR_DELAY
   NFS4ERR_EXPIRED
   NFS4ERR_FHEXPIRED
   NFS4ERR_GRACE
   NFS4ERR_INVAL
   NFS4ERR_LEASE_MOVED
   NFS4ERR_MOVED
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_OLD_STATEID
   NFS4ERR_RESOURCE
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_STALE_STATEID

14.2.3. Operation 5: COMMIT - Commit Cached Data

SYNOPSIS

   (cfh), offset, count -> verifier

ARGUMENT

   struct COMMIT4args {
           /* CURRENT_FH: file */
           offset4         offset;
           count4          count;
   };

RESULT

   struct COMMIT4resok {
           verifier4       writeverf;
   };

   union COMMIT4res switch (nfsstat4 status) {
   case NFS4_OK:
           COMMIT4resok    resok4;
   default:
           void;
   };

DESCRIPTION

The COMMIT operation forces or flushes data to stable storage for the file specified by the current file handle.  The flushed data is that which was previously written with a WRITE operation which had the stable field set to UNSTABLE4.

The offset specifies the position within the file where the flush is to begin.  An offset value of 0 (zero) means to flush data starting at the beginning of the file.  The count specifies the number of bytes of data to flush.  If count is 0 (zero), a flush from offset to the end of the file is done.

The server returns a write verifier upon successful completion of the COMMIT.  The write verifier is used by the client to determine if the server has restarted or rebooted between the initial WRITE(s) and the COMMIT.  The client does this by comparing the write verifier returned from the initial writes and the verifier returned by the COMMIT procedure.  The server must vary the value of the write verifier at each server event or instantiation that may lead to a loss of uncommitted data.  Most commonly this occurs when the server is rebooted; however, other events at the server may result in uncommitted data loss as well.

On success, the current filehandle retains its value.

IMPLEMENTATION

The COMMIT procedure is similar in operation and semantics to the POSIX fsync(2) system call that synchronizes a file's state with the disk (file data and metadata is flushed to disk or stable storage).  COMMIT performs the same operation for a client, flushing any unsynchronized data and metadata on the server to the server's disk or stable storage for the specified file.  As with fsync(2), there may be some modified data to synchronize, or there may be none.  The data may have been synchronized by the server's normal periodic buffer synchronization activity.  COMMIT should return NFS4_OK, unless there has been an unexpected error.

COMMIT differs from fsync(2) in that it is possible for the client to flush a range of the file (most likely triggered by a buffer-reclamation scheme on the client before the file has been completely written).

The server implementation of COMMIT is reasonably simple.  If the server receives a full file COMMIT request, that is, one starting at offset 0 with a count of 0, it should do the equivalent of fsync()'ing the file.  Otherwise, it should arrange to have the cached data in the range specified by offset and count flushed to stable storage.  In both cases, any metadata associated with the file must be flushed to stable storage before returning.  It is not an error for there to be nothing to flush on the server.  This means that the data and metadata that needed to be flushed have already been flushed or lost during the last server failure.
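A minimal sketch of this server-side behavior, assuming a POSIX-style local file system, is shown below.  It is illustrative only; the boot_verifier variable and the error mapping are hypothetical, and a real server may flush only the requested range where the platform allows it.

   #include <string.h>
   #include <unistd.h>

   typedef unsigned char verifier4[8];   /* mirrors the protocol XDR      */
   extern verifier4 boot_verifier;       /* hypothetical; changed on every
                                            server restart                */

   /* Returns 0 (NFS4_OK) or -1 (map to NFS4ERR_IO in a real server).     */
   int do_commit(int fd, unsigned long long offset, unsigned int count,
                 verifier4 writeverf)
   {
       /* offset == 0 and count == 0 means flush the whole file.  Flushing
        * more than the requested range (here, the whole file via fsync)
        * is always safe, if less efficient.                              */
       (void)offset; (void)count;
       if (fsync(fd) < 0)
           return -1;

       memcpy(writeverf, boot_verifier, sizeof(verifier4));
       return 0;
   }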
The client implementation of COMMIT is a little more complex.  There are two reasons for wanting to commit a client buffer to stable storage.  The first is that the client wants to reuse a buffer.  In this case, the offset and count of the buffer are sent to the server in the COMMIT request.  The server then flushes any cached data based on the offset and count, and flushes any metadata associated with the file.  It then returns the status of the flush and the write verifier.  The other reason for the client to generate a COMMIT is for a full file flush, such as may be done at close.  In this case, the client would gather all of the buffers for this file that contain uncommitted data, do the COMMIT operation with an offset of 0 and count of 0, and then free all of those buffers.  Any other dirty buffers would be sent to the server in the normal fashion.

After a buffer is written by the client with the stable parameter set to UNSTABLE4, the buffer must be considered as modified by the client until the buffer has either been flushed via a COMMIT operation or written via a WRITE operation with the stable parameter set to FILE_SYNC4 or DATA_SYNC4.  This is done to prevent the buffer from being freed and reused before the data can be flushed to stable storage on the server.

When a response is returned from either a WRITE or a COMMIT operation and it contains a write verifier that is different than previously returned by the server, the client will need to retransmit all of the buffers containing uncommitted cached data to the server.  How this is to be done is up to the implementor.  If there is only one buffer of interest, then it should probably be sent back over in a WRITE request with the appropriate stable parameter.  If there is more than one buffer, it might be worthwhile retransmitting all of the buffers in WRITE requests with the stable parameter set to UNSTABLE4 and then retransmitting the COMMIT operation to flush all of the data on the server to stable storage.  The timing of these retransmissions is left to the implementor.

The above description applies to page-cache-based systems as well as buffer-cache-based systems.  In those systems, the virtual memory system will need to be modified instead of the buffer cache.

ERRORS

   NFS4ERR_ACCES
   NFS4ERR_BADHANDLE
   NFS4ERR_FHEXPIRED
   NFS4ERR_IO
   NFS4ERR_ISDIR
   NFS4ERR_LOCKED
   NFS4ERR_MOVED
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_RESOURCE
   NFS4ERR_ROFS
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_WRONGSEC
14.2.4. Operation 6: CREATE - Create a Non-Regular File Object

SYNOPSIS

   (cfh), name, type -> (cfh), change_info

ARGUMENT

   union createtype4 switch (nfs_ftype4 type) {
   case NF4LNK:
           linktext4       linkdata;
   case NF4BLK:
   case NF4CHR:
           specdata4       devdata;
   case NF4SOCK:
   case NF4FIFO:
   case NF4DIR:
           void;
   };

   struct CREATE4args {
           /* CURRENT_FH: directory for creation */
           component4      objname;
           createtype4     objtype;
   };

RESULT

   struct CREATE4resok {
           change_info4    cinfo;
   };

   union CREATE4res switch (nfsstat4 status) {
   case NFS4_OK:
           CREATE4resok    resok4;
   default:
           void;
   };

DESCRIPTION

The CREATE operation creates a non-regular file object in a directory with a given name.  The OPEN procedure MUST be used to create a regular file.

The objname specifies the name for the new object.  If the objname has a length of 0 (zero), the error NFS4ERR_INVAL will be returned.  The objtype determines the type of object to be created: directory, symlink, etc.

If an object of the same name already exists in the directory, the server will return the error NFS4ERR_EXIST.

For the directory where the new file object was created, the server returns change_info4 information in cinfo.  With the atomic field of the change_info4 struct, the server will indicate if the before and after change attributes were obtained atomically with respect to the file object creation.

If the objname has a length of 0 (zero), or if objname does not obey the UTF-8 definition, the error NFS4ERR_INVAL will be returned.

The current filehandle is replaced by that of the new object.

IMPLEMENTATION

If the client desires to set attribute values after the create, a SETATTR operation can be added to the COMPOUND request so that the appropriate attributes will be set.

ERRORS

   NFS4ERR_ACCES
   NFS4ERR_BADHANDLE
   NFS4ERR_BADTYPE
   NFS4ERR_DQUOT
   NFS4ERR_EXIST
   NFS4ERR_FHEXPIRED
   NFS4ERR_INVAL
   NFS4ERR_IO
   NFS4ERR_MOVED
   NFS4ERR_NAMETOOLONG
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_NOSPC
   NFS4ERR_NOTDIR
   NFS4ERR_NOTSUPP
   NFS4ERR_RESOURCE
   NFS4ERR_ROFS
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_WRONGSEC

14.2.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting Recovery

SYNOPSIS

   clientid ->

ARGUMENT

   struct DELEGPURGE4args {
           clientid4       clientid;
   };

RESULT

   struct DELEGPURGE4res {
           nfsstat4        status;
   };

DESCRIPTION

Purges all of the delegations awaiting recovery for a given client.  This is useful for clients which do not commit delegation information to stable storage, to indicate that conflicting requests need not be delayed by the server awaiting recovery of delegation information.

This operation should be used by clients that record delegation information on stable storage on the client.  In this case, DELEGPURGE should be issued immediately after doing delegation recovery on all delegations known to the client.  Doing so will notify the server that no additional delegations for the client will be recovered, allowing it to free resources and avoid delaying other clients which make requests that conflict with the unrecovered delegations.  The set of delegations known to the server and the client may be different.
The reason for this is that a client may fail after making a request which resulted in delegation but before it received the results and committed them to the client's stable storage.

ERRORS

   NFS4ERR_RESOURCE
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE_CLIENTID

14.2.6. Operation 8: DELEGRETURN - Return Delegation

SYNOPSIS

   stateid ->

ARGUMENT

   struct DELEGRETURN4args {
           stateid4        stateid;
   };

RESULT

   struct DELEGRETURN4res {
           nfsstat4        status;
   };

DESCRIPTION

Returns the delegation represented by the given stateid.

ERRORS

   NFS4ERR_BAD_STATEID
   NFS4ERR_OLD_STATEID
   NFS4ERR_RESOURCE
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE_STATEID

14.2.7. Operation 9: GETATTR - Get Attributes

SYNOPSIS

   (cfh), attrbits -> attrbits, attrvals

ARGUMENT

   struct GETATTR4args {
           /* CURRENT_FH: directory or file */
           bitmap4         attr_request;
   };

RESULT

   struct GETATTR4resok {
           fattr4          obj_attributes;
   };

   union GETATTR4res switch (nfsstat4 status) {
   case NFS4_OK:
           GETATTR4resok   resok4;
   default:
           void;
   };

DESCRIPTION

The GETATTR operation will obtain attributes for the file system object specified by the current filehandle.  The client sets a bit in the bitmap argument for each attribute value that it would like the server to return.  The server returns an attribute bitmap that indicates the attribute values it was able to return, followed by the attribute values ordered lowest attribute number first.

The server must return a value for each attribute that the client requests if the attribute is supported by the server.  If the server does not support an attribute or cannot approximate a useful value then it must not return the attribute value and must not set the attribute bit in the result bitmap.  The server must return an error if it supports an attribute but cannot obtain its value.  In that case no attribute values will be returned.

All servers must support the mandatory attributes as specified in the section "File Attributes".

On success, the current filehandle retains its value.

IMPLEMENTATION

ERRORS

   NFS4ERR_ACCES
   NFS4ERR_BADHANDLE
   NFS4ERR_DELAY
   NFS4ERR_FHEXPIRED
   NFS4ERR_INVAL
   NFS4ERR_IO
   NFS4ERR_MOVED
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_RESOURCE
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_WRONGSEC

14.2.8. Operation 10: GETFH - Get Current Filehandle

SYNOPSIS

   (cfh) -> filehandle

ARGUMENT

   /* CURRENT_FH: */
   void;

RESULT

   struct GETFH4resok {
           nfs_fh4         object;
   };

   union GETFH4res switch (nfsstat4 status) {
   case NFS4_OK:
           GETFH4resok     resok4;
   default:
           void;
   };

DESCRIPTION

This operation returns the current filehandle value.

On success, the current filehandle retains its value.

IMPLEMENTATION

Operations that change the current filehandle, like LOOKUP or CREATE, do not automatically return the new filehandle as a result.  For instance, if a client needs to look up a directory entry and obtain its filehandle then the following request is needed.
   PUTFH  (directory filehandle)
   LOOKUP (entry name)
   GETFH

ERRORS

   NFS4ERR_BADHANDLE
   NFS4ERR_FHEXPIRED
   NFS4ERR_MOVED
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_RESOURCE
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_WRONGSEC

14.2.9. Operation 11: LINK - Create Link to a File

SYNOPSIS

   (sfh), (cfh), newname -> (cfh), change_info

ARGUMENT

   struct LINK4args {
           /* SAVED_FH: source object */
           /* CURRENT_FH: target directory */
           component4      newname;
   };

RESULT

   struct LINK4resok {
           change_info4    cinfo;
   };

   union LINK4res switch (nfsstat4 status) {
   case NFS4_OK:
           LINK4resok      resok4;
   default:
           void;
   };

DESCRIPTION

The LINK operation creates an additional newname for the file represented by the saved filehandle, as set by the SAVEFH operation, in the directory represented by the current filehandle.  The existing file and the target directory must reside within the same file system on the server.  On success, the current filehandle will continue to be the target directory.

For the target directory, the server returns change_info4 information in cinfo.  With the atomic field of the change_info4 struct, the server will indicate if the before and after change attributes were obtained atomically with respect to the link creation.

If the newname has a length of 0 (zero), or if newname does not obey the UTF-8 definition, the error NFS4ERR_INVAL will be returned.

IMPLEMENTATION

Changes to any property of the "hard" linked files are reflected in all of the linked files.  When a link is made to a file, the attributes for the file should have a value for numlinks that is one greater than the value before the LINK operation.

The comments under RENAME regarding object and target residing on the same file system apply here as well.  The comments regarding the target name apply as well.

Note that symbolic links are created with the CREATE operation.

ERRORS

   NFS4ERR_ACCES
   NFS4ERR_BADHANDLE
   NFS4ERR_DELAY
   NFS4ERR_DQUOT
   NFS4ERR_EXIST
   NFS4ERR_FHEXPIRED
   NFS4ERR_INVAL
   NFS4ERR_IO
   NFS4ERR_ISDIR
   NFS4ERR_MLINK
   NFS4ERR_MOVED
   NFS4ERR_NAMETOOLONG
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_NOSPC
   NFS4ERR_NOTDIR
   NFS4ERR_NOTSUPP
   NFS4ERR_RESOURCE
   NFS4ERR_ROFS
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_WRONGSEC
   NFS4ERR_XDEV

14.2.10. Operation 12: LOCK - Create Lock

SYNOPSIS

   (cfh) type, seqid, reclaim, stateid, offset, length -> stateid, access

ARGUMENT

   enum nfs4_lock_type {
           READ_LT         = 1,
           WRITE_LT        = 2,
           READW_LT        = 3,    /* blocking read */
           WRITEW_LT       = 4     /* blocking write */
   };

   struct LOCK4args {
           /* CURRENT_FH: file */
           nfs_lock_type4  locktype;
           seqid4          seqid;
           bool            reclaim;
           stateid4        stateid;
           offset4         offset;
           length4         length;
   };

RESULT

   struct LOCK4denied {
           nfs_lockowner4  owner;
           offset4         offset;
           length4         length;
   };

   union LOCK4res switch (nfsstat4 status) {
   case NFS4_OK:
           stateid4        stateid;
   case NFS4ERR_DENIED:
           LOCK4denied     denied;
   default:
           void;
   };

DESCRIPTION

The LOCK operation requests a record lock for the byte range specified by the offset and length parameters.  The lock type is also specified to be one of the nfs4_lock_types.
If this is a reclaim request, the reclaim parameter will be TRUE.

Bytes in a file may be locked even if those bytes are not currently allocated to the file.  To lock the file from a specific offset through the end-of-file (no matter how long the file actually is) use a length field with all bits set to 1 (one).  To lock the entire file, use an offset of 0 (zero) and a length with all bits set to 1.  A length of 0 is reserved and should not be used.

In the case that the lock is denied, the owner, offset, and length of a conflicting lock are returned.

On success, the current filehandle retains its value.

IMPLEMENTATION

If the server is unable to determine the exact offset and length of the conflicting lock, the same offset and length that were provided in the arguments should be returned in the denied results.  The File Locking section contains a full description of this and the other file locking operations.

ERRORS

   NFS4ERR_ACCES
   NFS4ERR_BADHANDLE
   NFS4ERR_BAD_SEQID
   NFS4ERR_BAD_STATEID
   NFS4ERR_DELAY
   NFS4ERR_DENIED
   NFS4ERR_EXPIRED
   NFS4ERR_FHEXPIRED
   NFS4ERR_GRACE
   NFS4ERR_INVAL
   NFS4ERR_ISDIR
   NFS4ERR_LEASE_MOVED
   NFS4ERR_LOCK_RANGE
   NFS4ERR_MOVED
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_OLD_STATEID
   NFS4ERR_RESOURCE
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_STALE_CLIENTID
   NFS4ERR_STALE_STATEID
   NFS4ERR_WRONGSEC

14.2.11. Operation 13: LOCKT - Test For Lock

SYNOPSIS

   (cfh) type, owner, offset, length -> {void, NFS4ERR_DENIED -> owner}

ARGUMENT

   struct LOCKT4args {
           /* CURRENT_FH: file */
           nfs_lock_type4  locktype;
           nfs_lockowner4  owner;
           offset4         offset;
           length4         length;
   };

RESULT

   union LOCKT4res switch (nfsstat4 status) {
   case NFS4ERR_DENIED:
           LOCK4denied     denied;
   case NFS4_OK:
           void;
   default:
           void;
   };

DESCRIPTION

The LOCKT operation tests the lock as specified in the arguments.  If a conflicting lock exists, the owner, offset, and length of the conflicting lock are returned; if no lock is held, nothing other than NFS4_OK is returned.

On success, the current filehandle retains its value.

IMPLEMENTATION

If the server is unable to determine the exact offset and length of the conflicting lock, the same offset and length that were provided in the arguments should be returned in the denied results.  The File Locking section contains further discussion of the file locking mechanisms.

LOCKT uses an nfs_lockowner4, rather than the stateid4 used by LOCK, to identify the owner so that the client does not have to open the file to test for the existence of a lock.

ERRORS

   NFS4ERR_ACCES
   NFS4ERR_BADHANDLE
   NFS4ERR_DELAY
   NFS4ERR_DENIED
   NFS4ERR_FHEXPIRED
   NFS4ERR_GRACE
   NFS4ERR_INVAL
   NFS4ERR_ISDIR
   NFS4ERR_LEASE_MOVED
   NFS4ERR_LOCK_RANGE
   NFS4ERR_MOVED
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_RESOURCE
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_STALE_CLIENTID
   NFS4ERR_WRONGSEC
14.2.12. Operation 14: LOCKU - Unlock File

SYNOPSIS

   (cfh) type, seqid, stateid, offset, length -> stateid

ARGUMENT

   struct LOCKU4args {
           /* CURRENT_FH: file */
           nfs_lock_type4  type;
           seqid4          seqid;
           stateid4        stateid;
           offset4         offset;
           length4         length;
   };

RESULT

   union LOCKU4res switch (nfsstat4 status) {
   case NFS4_OK:
           stateid4        stateid;
   default:
           void;
   };

DESCRIPTION

The LOCKU operation unlocks the record lock specified by the parameters.

On success, the current filehandle retains its value.

IMPLEMENTATION

The File Locking section contains a full description of this and the other file locking procedures.

ERRORS

   NFS4ERR_ACCES
   NFS4ERR_BADHANDLE
   NFS4ERR_BAD_SEQID
   NFS4ERR_BAD_STATEID
   NFS4ERR_EXPIRED
   NFS4ERR_FHEXPIRED
   NFS4ERR_GRACE
   NFS4ERR_INVAL
   NFS4ERR_LOCK_RANGE
   NFS4ERR_LEASE_MOVED
   NFS4ERR_MOVED
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_OLD_STATEID
   NFS4ERR_RESOURCE
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_STALE_CLIENTID
   NFS4ERR_STALE_STATEID

14.2.13. Operation 15: LOOKUP - Lookup Filename

SYNOPSIS

   (cfh), filenames -> (cfh)

ARGUMENT

   struct LOOKUP4args {
           /* CURRENT_FH: directory */
           pathname4       path;
   };

RESULT

   struct LOOKUP4res {
           /* CURRENT_FH: object */
           nfsstat4        status;
   };

DESCRIPTION

This operation LOOKUPs or finds a file system object starting from the directory specified by the current filehandle.  LOOKUP evaluates the pathname contained in the array of names and obtains a new current filehandle from the final name.  All but the final name in the list must be the names of directories.

If the pathname cannot be evaluated either because a component does not exist or because the client does not have permission to evaluate a component of the path, then an error will be returned and the current filehandle will be unchanged.

If the path is a zero length array, if any component does not obey the UTF-8 definition, or if any component in the path is of zero length, the error NFS4ERR_INVAL will be returned.

IMPLEMENTATION

If the client prefers a partial evaluation of the path then a sequence of LOOKUP operations can be substituted, e.g.:

   PUTFH  (directory filehandle)
   LOOKUP "pub" "foo" "bar"
   GETFH

or, if the client wishes to obtain the intermediate filehandles:

   PUTFH  (directory filehandle)
   LOOKUP "pub"
   GETFH
   LOOKUP "foo"
   GETFH
   LOOKUP "bar"
   GETFH

NFS version 4 servers depart from the semantics of previous NFS versions in allowing LOOKUP requests to cross mountpoints on the server.  The client can detect a mountpoint crossing by comparing the fsid attribute of the directory with the fsid attribute of the directory looked up.  If the fsids are different then the new directory is a server mountpoint.  Unix clients that detect a mountpoint crossing will need to mount the server's filesystem.  This needs to be done to maintain the file object identity checking mechanisms common to Unix clients.

Servers that limit NFS access to "shares" or "exported" filesystems should provide a pseudo-filesystem into which the exported filesystems can be integrated, so that clients can browse the server's name space.  The client's view of a pseudo filesystem will be limited to paths that lead to exported filesystems.
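As a non-normative illustration of the mountpoint detection described above, a client can bracket the LOOKUP with fsid queries and compare the two values returned:

   PUTFH   (directory filehandle)
   GETATTR (fsid)
   LOOKUP  "component"
   GETATTR (fsid)

If the two fsid values differ, the object looked up resides in a different filesystem on the server.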
Note: previous versions of the protocol assigned special semantics to the names "." and "..".  NFS version 4 assigns no special semantics to these names.  The LOOKUPP operator must be used to look up a parent directory.

Note that this procedure does not follow symbolic links.  The client is responsible for all parsing of filenames including filenames that are modified by symbolic links encountered during the lookup process.

If the current file handle supplied is not a directory but a symbolic link, the error NFS4ERR_SYMLINK is returned as the error.  For all other non-directory file types, the error NFS4ERR_NOTDIR is returned.

ERRORS

   NFS4ERR_ACCES
   NFS4ERR_BADHANDLE
   NFS4ERR_FHEXPIRED
   NFS4ERR_INVAL
   NFS4ERR_IO
   NFS4ERR_MOVED
   NFS4ERR_NAMETOOLONG
   NFS4ERR_NOENT
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_NOTDIR
   NFS4ERR_RESOURCE
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_SYMLINK
   NFS4ERR_WRONGSEC

14.2.14. Operation 16: LOOKUPP - Lookup Parent Directory

SYNOPSIS

   (cfh) -> (cfh)

ARGUMENT

   /* CURRENT_FH: object */
   void;

RESULT

   struct LOOKUPP4res {
           /* CURRENT_FH: directory */
           nfsstat4        status;
   };

DESCRIPTION

The current filehandle is assumed to refer to a regular directory or a named attribute directory.  LOOKUPP assigns the filehandle for its parent directory to be the current filehandle.  If there is no parent directory, an NFS4ERR_NOENT error must be returned.  Therefore, NFS4ERR_NOENT will be returned by the server when the current filehandle is at the root or top of the server's file tree.

IMPLEMENTATION

As for LOOKUP, LOOKUPP will also cross mountpoints.

If the current filehandle is not a directory or named attribute directory, the error NFS4ERR_NOTDIR is returned.

ERRORS

   NFS4ERR_ACCES
   NFS4ERR_BADHANDLE
   NFS4ERR_FHEXPIRED
   NFS4ERR_INVAL
   NFS4ERR_IO
   NFS4ERR_MOVED
   NFS4ERR_NOENT
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_NOTDIR
   NFS4ERR_RESOURCE
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_WRONGSEC

14.2.15. Operation 17: NVERIFY - Verify Difference in Attributes

SYNOPSIS

   (cfh), fattr -> -

ARGUMENT

   struct NVERIFY4args {
           /* CURRENT_FH: object */
           fattr4          obj_attributes;
   };

RESULT

   struct NVERIFY4res {
           nfsstat4        status;
   };

DESCRIPTION

This operation is used to prefix a sequence of operations to be performed if one or more attributes have changed on some filesystem object.  If all the attributes match then the error NFS4ERR_SAME must be returned.

On success, the current filehandle retains its value.

IMPLEMENTATION

This operation is useful as a cache validation operator.  If the object to which the attributes belong has changed then the following operations may obtain new data associated with that object.  For instance, to check if a file has been changed and obtain new data if it has:

   PUTFH   (public)
   LOOKUP  "pub" "foo" "bar"
   NVERIFY attrbits attrs
   READ    0 32767

In the case that a recommended attribute is specified in the NVERIFY operation and the server does not support that attribute for the file system object, the error NFS4ERR_NOTSUPP is returned to the client.
ERRORS

   NFS4ERR_ACCES
   NFS4ERR_BADHANDLE
   NFS4ERR_DELAY
   NFS4ERR_FHEXPIRED
   NFS4ERR_INVAL
   NFS4ERR_IO
   NFS4ERR_MOVED
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_NOTSUPP
   NFS4ERR_RESOURCE
   NFS4ERR_SAME
   NFS4ERR_SERVERFAULT
   NFS4ERR_STALE
   NFS4ERR_WRONGSEC

14.2.16. Operation 18: OPEN - Open a Regular File

SYNOPSIS

   (cfh), claim, openhow, owner, seqid, access, deny -> (cfh),
   stateid, cinfo, rflags, open_confirm, delegation

ARGUMENT

   struct OPEN4args {
           open_claim4     claim;
           openflag4       openhow;
           nfs_lockowner4  owner;
           seqid4          seqid;
           uint32_t        share_access;
           uint32_t        share_deny;
   };

   enum createmode4 {
           UNCHECKED4      = 0,
           GUARDED4        = 1,
           EXCLUSIVE4      = 2
   };

   union createhow4 switch (createmode4 mode) {
   case UNCHECKED4:
   case GUARDED4:
           fattr4          createattrs;
   case EXCLUSIVE4:
           verifier4       createverf;
   };

   enum opentype4 {
           OPEN4_NOCREATE  = 0,
           OPEN4_CREATE    = 1
   };

   union openflag4 switch (opentype4 opentype) {
   case OPEN4_CREATE:
           createhow4      how;
   default:
           void;
   };

   /* Next definitions used for OPEN delegation */
   enum limit_by4 {
           NFS_LIMIT_SIZE          = 1,
           NFS_LIMIT_BLOCKS        = 2
           /* others as needed */
   };

   struct nfs_modified_limit4 {
           uint32_t        num_blocks;
           uint32_t        bytes_per_block;
   };

   union nfs_space_limit4 switch (limit_by4 limitby) {
   /* limit specified as file size */
   case NFS_LIMIT_SIZE:
           uint64_t        filesize;
   /* limit specified by number of blocks */
   case NFS_LIMIT_BLOCKS:
           nfs_modified_limit4 mod_blocks;
   };

   enum open_delegation_type4 {
           OPEN_DELEGATE_NONE      = 0,
           OPEN_DELEGATE_READ      = 1,
           OPEN_DELEGATE_WRITE     = 2
   };

   enum open_claim_type4 {
           CLAIM_NULL              = 0,
           CLAIM_PREVIOUS          = 1,
           CLAIM_DELEGATE_CUR      = 2,
           CLAIM_DELEGATE_PREV     = 3
   };

   struct open_claim_delegate_cur4 {
           pathname4       file;
           stateid4        delegate_stateid;
   };

   union open_claim4 switch (open_claim_type4 claim) {
   /*
    * No special rights to file.  Ordinary OPEN of the specified file.
    */
   case CLAIM_NULL:
           /* CURRENT_FH: directory */
           pathname4       file;

   /*
    * Right to the file established by an open previous to server
    * reboot.  File identified by filehandle obtained at that time
    * rather than by name.
    */
   case CLAIM_PREVIOUS:
           /* CURRENT_FH: file being reclaimed */
           uint32_t        delegate_type;

   /*
    * Right to file based on a delegation granted by the server.
    * File is specified by name.
    */
   case CLAIM_DELEGATE_CUR:
           /* CURRENT_FH: directory */
           open_claim_delegate_cur4 delegate_cur_info;

   /*
    * Right to file based on a delegation granted to a previous boot
    * instance of the client.  File is specified by name.
    */
   case CLAIM_DELEGATE_PREV:
           /* CURRENT_FH: directory */
           pathname4       file_delegate_prev;
   };

RESULT

   struct open_read_delegation4 {
           stateid4        stateid;        /* Stateid for delegation */
           bool            recall;         /* Pre-recalled flag for
                                              delegations obtained
                                              by reclaim
                                              (CLAIM_PREVIOUS) */
           nfsace4         permissions;    /* Defines users who don't
                                              need an ACCESS call to
                                              open for read */
   };

   struct open_write_delegation4 {
           stateid4        stateid;        /* Stateid for delegation */
           bool            recall;         /* Pre-recalled flag for
                                              delegations obtained
                                              by reclaim
                                              (CLAIM_PREVIOUS) */
           nfs_space_limit4 space_limit;   /* Defines condition that
                                              the client must check to
                                              determine whether the
                                              file needs to be flushed
                                              to the server on close. */
           nfsace4         permissions;    /* Defines users who don't
                                              need an ACCESS call as
                                              part of a delegated
                                              open. */
   };

   union open_delegation4
   switch (open_delegation_type4 delegation_type) {
   case OPEN_DELEGATE_NONE:
           void;
   case OPEN_DELEGATE_READ:
           open_read_delegation4 read;
   case OPEN_DELEGATE_WRITE:
           open_write_delegation4 write;
   };

   const OPEN4_RESULT_MLOCK   = 0x00000001;
   const OPEN4_RESULT_CONFIRM = 0x00000002;

   struct OPEN4resok {
           stateid4        stateid;        /* Stateid for open */
           change_info4    cinfo;          /* Directory Change Info */
           uint32_t        rflags;         /* Result flags */
           verifier4       open_confirm;   /* OPEN_CONFIRM verifier */
           open_delegation4 delegation;    /* Info on any open
                                              delegation */
   };

   union OPEN4res switch (nfsstat4 status) {
   case NFS4_OK:
           /* CURRENT_FH: opened file */
           OPEN4resok      result;
   default:
           void;
   };

WARNING TO CLIENT IMPLEMENTORS

OPEN resembles LOOKUP in that it generates a filehandle for the client to use.  Unlike LOOKUP though, OPEN creates server state on the filehandle.  In normal circumstances, the client can only release this state with a CLOSE operation.  CLOSE uses the current filehandle to determine which file to close.  Therefore the client MUST follow every OPEN operation with a GETFH operation in the same COMPOUND procedure.  This will supply the client with the filehandle such that CLOSE can be used appropriately.

Simply waiting for the lease on the file to expire is insufficient because the server may maintain the state indefinitely as long as another client does not attempt to make a conflicting access to the same file.

DESCRIPTION

The OPEN operation creates and/or opens a regular file in a directory with the provided name.  If the file does not exist at the server and creation is desired, specification of the method of creation is provided by the openhow parameter.  The client has the choice of three creation methods: UNCHECKED, GUARDED, or EXCLUSIVE.

UNCHECKED means that the file should be created if a file of that name does not exist and encountering an existing regular file of that name is not an error.  For this type of create, createattrs specifies the initial set of attributes for the file.  The set of attributes may include any writable attribute valid for regular files.  When an UNCHECKED create encounters an existing file, the attributes specified by createattrs are not used, except that when an object_size of zero is specified, the existing file is truncated.
If GUARDED is specified, the server checks for the presence of a duplicate object by name before performing the create.  If a duplicate exists, an error of NFS4ERR_EXIST is returned as the status.  If the object does not exist, the request is performed as described for UNCHECKED.

EXCLUSIVE specifies that the server is to follow exclusive creation semantics, using the verifier to ensure exclusive creation of the target.  The server should check for the presence of a duplicate object by name.  If the object does not exist, the server creates the object and stores the verifier with the object.  If the object does exist and the stored verifier matches the client provided verifier, the server uses the existing object as the newly created object.  If the stored verifier does not match, then an error of NFS4ERR_EXIST is returned.  No attributes may be provided in this case, since the server may use an attribute of the target object to store the verifier.

For the target directory, the server returns change_info4 information in cinfo.  With the atomic field of the change_info4 struct, the server will indicate if the before and after change attributes were obtained atomically with respect to the file creation.

Upon successful creation, the current filehandle is replaced by that of the new object.

The OPEN procedure provides for DOS SHARE capability with the use of the access and deny fields of the OPEN arguments.  The client specifies at OPEN the required access and deny modes.  For clients that do not directly support SHAREs (i.e., UNIX), the expected deny value is DENY_NONE.  In the case that there is an existing SHARE reservation that conflicts with the OPEN request, the server returns the error NFS4ERR_DENIED.  For a complete SHARE request, the client must provide values for the owner and seqid fields for the OPEN argument.  For additional discussion of SHARE semantics see the section on 'Share Reservations'.

In the case that the client is recovering state from a server failure, the reclaim field of the OPEN argument is used to signify that the request is meant to reclaim state previously held.

The "claim" field of the OPEN argument is used to specify the file to be opened and the state information which the client claims to possess.  There are four basic claim types which cover the various situations for an OPEN.  They are as follows:

   CLAIM_NULL            For the client, this is a new OPEN request and
                         there is no previous state associated with the
                         file for the client.

   CLAIM_PREVIOUS        The client is claiming basic OPEN state for a
                         file that was held previous to a server reboot.
                         Generally used when a server is returning
                         persistent file handles; the client may not
                         have the file name to reclaim the OPEN.

   CLAIM_DELEGATE_CUR    The client is claiming a delegation for OPEN as
                         granted by the server.  Generally this is done
                         as part of recalling a delegation.

   CLAIM_DELEGATE_PREV   The client is claiming a delegation granted to
                         a previous client instance; used after the
                         client reboots.

For OPEN requests whose claim type is other than CLAIM_PREVIOUS (i.e., requests other than those devoted to reclaiming opens after a server reboot) that reach the server during its grace or lease expiration period, the server returns an error of NFS4ERR_GRACE.
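As a non-normative illustration, a client reclaiming an open it held before a server reboot (and not claiming any delegation) might send, during the server's grace period and using the filehandle it obtained before the reboot:

   PUTFH (filehandle held from before the server reboot)
   OPEN  (CLAIM_PREVIOUS, delegate_type = OPEN_DELEGATE_NONE,
          access, deny, owner, seqid)
   GETFH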
For any OPEN request, the server may return an open delegation, which allows further opens and closes to be handled locally on the client as described in the section Open Delegation.  Note that whether a delegation is granted is up to the server to decide.  The client should never assume that delegation will or will not be granted in a particular instance.  It should always be prepared for either case.  A partial exception is the reclaim (CLAIM_PREVIOUS) case, in which a delegation type is claimed.  In this case, delegation will always be granted, although the server may specify an immediate recall in the delegation structure.

The rflags returned by a successful OPEN allow the server to return information governing how the open file is to be handled.  OPEN4_RESULT_MLOCK indicates to the caller that mandatory locking is in effect for this file and the client should act appropriately with regard to data cached on the client.  OPEN4_RESULT_CONFIRM indicates that the client MUST execute an OPEN_CONFIRM operation before using the open file.

If the file pathname is a zero length array, if any component does not obey the UTF-8 definition, or if any component in the path is of zero length, the error NFS4ERR_INVAL will be returned.

When an OPEN is done and the specified lockowner already has the resulting filehandle open, the result is to "OR" together the new share and deny status with the existing status.  In this case, only a single CLOSE need be done, even though multiple OPENs were completed.

IMPLEMENTATION

The OPEN procedure contains support for EXCLUSIVE create.  The mechanism is similar to the support in NFS version 3 [RFC1813].  As in NFS version 3, this mechanism provides reliable exclusive creation.  Exclusive create is invoked when the how parameter is EXCLUSIVE.  In this case, the client provides a verifier that can reasonably be expected to be unique.  A combination of a client identifier, perhaps the client network address, and a unique number generated by the client, perhaps the RPC transaction identifier, may be appropriate.

If the object does not exist, the server creates the object and stores the verifier in stable storage.  For file systems that do not provide a mechanism for the storage of arbitrary file attributes, the server may use one or more elements of the object meta-data to store the verifier.  The verifier must be stored in stable storage to prevent erroneous failure on retransmission of the request.  It is assumed that an exclusive create is being performed because exclusive semantics are critical to the application.  Because of the expected usage, exclusive CREATE does not rely solely on the normally volatile duplicate request cache for storage of the verifier.  The duplicate request cache in volatile storage does not survive a crash and may actually flush on a long network partition, opening failure windows.  In the UNIX local file system environment, the expected storage location for the verifier on creation is the meta-data (time stamps) of the object.  For this reason, an exclusive object create may not include initial attributes because the server would have nowhere to store the verifier.
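As a non-normative illustration of how this is typically used (the requirement that the SETATTR be sent separately is discussed below), an exclusive create is usually performed as two COMPOUND requests:

   First COMPOUND:
      PUTFH (directory filehandle)
      OPEN  "newfile" (OPEN4_CREATE, EXCLUSIVE4, createverf,
                       access, deny, owner, seqid)
      GETFH

   Second COMPOUND, after a successful reply:
      PUTFH   (new file filehandle)
      SETATTR (desired initial attributes)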
If the server cannot support these exclusive create semantics, possibly because of the requirement to commit the verifier to stable storage, it should fail the OPEN request with the error NFS4ERR_NOTSUPP.

During an exclusive CREATE request, if the object already exists, the server reconstructs the object's verifier and compares it with the verifier in the request.  If they match, the server treats the request as a success.  The request is presumed to be a duplicate of an earlier, successful request for which the reply was lost and that the server duplicate request cache mechanism did not detect.  If the verifiers do not match, the request is rejected with the status NFS4ERR_EXIST.

Once the client has performed a successful exclusive create, it must issue a SETATTR to set the correct object attributes.  Until it does so, it should not rely upon any of the object attributes, since the server implementation may need to overload object meta-data to store the verifier.  The subsequent SETATTR must not occur in the same COMPOUND request as the OPEN.  This separation will guarantee that the exclusive create mechanism will continue to function properly in the face of retransmission of the request.

Use of the GUARDED attribute does not provide exactly-once semantics.  In particular, if a reply is lost and the server does not detect the retransmission of the request, the procedure can fail with NFS4ERR_EXIST, even though the create was performed successfully.

For SHARE reservations, the client must specify a value for access that is one of READ, WRITE, or BOTH.  For deny, the client must specify one of NONE, READ, WRITE, or BOTH.  If the client fails to do this, the server must return NFS4ERR_INVAL.

If the final component provided to OPEN is a symbolic link, the error NFS4ERR_SYMLINK will be returned to the client.  If an intermediate component of the pathname provided to OPEN is a symbolic link, the error NFS4ERR_NOTDIR will be returned to the client.

ERRORS

   NFS4ERR_ACCES
   NFS4ERR_BAD_SEQID
   NFS4ERR_DELAY
   NFS4ERR_DQUOT
   NFS4ERR_EXIST
   NFS4ERR_FHEXPIRED
   NFS4ERR_GRACE
   NFS4ERR_IO
   NFS4ERR_LEASE_MOVED
   NFS4ERR_MOVED
   NFS4ERR_NAMETOOLONG
   NFS4ERR_NOFILEHANDLE
   NFS4ERR_NOSPC
   NFS4ERR_NOTDIR
   NFS4ERR_NOTSUPP
   NFS4ERR_RESOURCE
   NFS4ERR_ROFS
   NFS4ERR_SERVERFAULT
   NFS4ERR_SHARE_DENIED
   NFS4ERR_STALE_CLIENTID
   NFS4ERR_SYMLINK

14.2.17. Operation 19: OPENATTR - Open Named Attribute Directory

SYNOPSIS

   (cfh) -> (cfh)

ARGUMENT

   /* CURRENT_FH: file or directory */
   void;

RESULT

   struct OPENATTR4res {
           /* CURRENT_FH: name attr directory */
           nfsstat4        status;
   };

DESCRIPTION

The OPENATTR operation is used to obtain the filehandle of the named attribute directory associated with the current filehandle.  The result of the OPENATTR will be a filehandle to an object of type NF4ATTRDIR.  From this filehandle, READDIR and LOOKUP procedures can be used to obtain filehandles for the various named attributes associated with the original file system object.  Filehandles returned within the named attribute directory will have a type of NF4NAMEDATTR.
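For example (illustrative only), a client could enumerate the named attributes of a file with:

   PUTFH    (file filehandle)
   OPENATTR
   READDIR

A LOOKUP of a particular attribute name within the named attribute directory can then be used to obtain that attribute's filehandle.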
6529 Filehandles returned within the named attribute directory will have 6530 a type of NF4NAMEDATTR.

6532 IMPLEMENTATION

6534 If the server does not support named attributes for the current 6535 filehandle, an error of NFS4ERR_NOTSUPP will be returned to the 6536 client.

6538 ERRORS

6540 NFS4ERR_ACCES 6541 NFS4ERR_BADHANDLE 6542 NFS4ERR_DELAY 6543 NFS4ERR_FHEXPIRED 6544 NFS4ERR_INVAL 6545 NFS4ERR_IO 6546 NFS4ERR_MOVED 6547 NFS4ERR_NOENT 6548 NFS4ERR_NOFILEHANDLE 6549 NFS4ERR_NOTSUPP 6550 NFS4ERR_RESOURCE 6551 NFS4ERR_SERVERFAULT 6552 NFS4ERR_STALE 6553 NFS4ERR_WRONGSEC

6555 14.2.18. Operation 20: OPEN_CONFIRM - Confirm Open

6557 SYNOPSIS

6559 (cfh), seqid, open_confirm -> stateid

6561 ARGUMENT

6563 struct OPEN_CONFIRM4args { 6564 /* CURRENT_FH: opened file */ 6565 seqid4 seqid; 6566 verifier4 open_confirm; /* OPEN_CONFIRM verifier */ 6567 };

6569 RESULT

6571 struct OPEN_CONFIRM4resok { 6572 stateid4 stateid; 6573 };

6575 union OPEN_CONFIRM4res switch (nfsstat4 status) { 6576 case NFS4_OK: 6577 OPEN_CONFIRM4resok resok4; 6578 default: 6579 void; 6580 };

6582 DESCRIPTION

6584 This operation is used to confirm the sequence id usage for the 6585 first time that an nfs_lockowner is used by a client. The OPEN 6586 operation returns an opaque confirmation verifier that is then 6587 passed to this operation along with the next sequence id for the 6588 nfs_lockowner. The sequence id passed to the OPEN_CONFIRM must be 6589 1 (one) greater than the seqid passed to the OPEN operation from 6590 which the open_confirm value was obtained. If the server receives 6591 an unexpected sequence id with respect to the original open, then 6592 the server assumes that the client will not confirm the original 6593 OPEN and all state associated with the original OPEN is released by 6594 the server.

6596 On success, the current filehandle retains its value.

6598 IMPLEMENTATION

6600 A given client might generate many nfs_lockowner data structures 6601 for a given clientid. The client will periodically either dispose 6602 of its nfs_lockowners or stop using them for indefinite periods of 6603 time. The latter situation is why the NFS version 4 protocol does 6604 not have an explicit operation to exit an nfs_lockowner: such an 6605 operation is of no use in that situation. Instead, to avoid 6606 unbounded memory use, the server needs to implement a strategy for 6607 disposing of nfs_lockowners that have no current lock, open, or 6608 delegation state for any files and have not been used recently. 6609 The time period used to determine when to dispose of nfs_lockowners 6610 is an implementation choice. The time period should certainly be 6611 no less than the lease time plus any grace period the server wishes 6612 to implement beyond a lease time. The OPEN_CONFIRM operation 6613 allows the server to safely dispose of unused nfs_lockowner data 6614 structures.

6616 In the case that a client issues an OPEN operation and the server 6617 no longer has a record of the nfs_lockowner, the server needs to 6618 ensure that this is a new OPEN and not a replay or retransmission.

6620 A lazy server implementation might require confirmation for every 6621 nfs_lockowner for which it has no record. However, this is not 6622 necessary until the server records the fact that it has disposed of 6623 one nfs_lockowner for the given clientid.

6625 The server must hold unconfirmed OPEN state until one of three 6626 events occurs.
First, the client sends an OPEN_CONFIRM request with 6627 the appropriate sequence id and confirmation verifier within the 6628 lease period. In this case, the OPEN state on the server goes to 6629 confirmed, and the nfs_lockowner on the server is fully 6630 established.

6632 Second, the client sends another OPEN request with a sequence id 6633 that is incorrect for the nfs_lockowner (out of sequence). In this 6634 case, the server assumes the second OPEN request is valid and the 6635 first one is a replay. The server cancels the OPEN state of the 6636 first OPEN request, establishes an unconfirmed OPEN state for the 6637 second OPEN request, and responds to the second OPEN request with 6638 an indication that an OPEN_CONFIRM is needed. The process then 6639 repeats itself. While there is a potential for a denial of service 6640 attack on the client, it is mitigated if the client and server 6641 require the use of a security flavor based on Kerberos V5, LIPKEY, 6642 or some other flavor that uses cryptography.

6644 What if the server is in the unconfirmed OPEN state for a given 6645 nfs_lockowner, and it receives an operation on the nfs_lockowner 6646 that has a stateid but the operation is not OPEN, or it is 6647 OPEN_CONFIRM but with the wrong confirmation verifier? Then, even 6648 if the seqid is correct, the server returns NFS4ERR_BAD_STATEID, 6649 because the server assumes the operation is a replay: if the server 6650 has no established OPEN state, then there is no way, for example, a 6651 LOCK operation could be valid.

6653 Third, neither of the two aforementioned events occurs for the 6654 nfs_lockowner within the lease period. In this case, the OPEN 6655 state is cancelled and disposal of the nfs_lockowner can occur.

6657 ERRORS

6659 NFS4ERR_BADHANDLE 6660 NFS4ERR_BAD_SEQID 6661 NFS4ERR_EXPIRED 6662 NFS4ERR_FHEXPIRED 6663 NFS4ERR_GRACE 6664 NFS4ERR_INVAL 6665 NFS4ERR_MOVED 6666 NFS4ERR_NOENT 6667 NFS4ERR_NOFILEHANDLE 6668 NFS4ERR_NOTSUPP 6669 NFS4ERR_RESOURCE 6670 NFS4ERR_SERVERFAULT 6671 NFS4ERR_STALE 6672 NFS4ERR_WRONGSEC

6674 14.2.19. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access

6676 SYNOPSIS

6678 (cfh), stateid, seqid, access, deny -> stateid

6680 ARGUMENT

6682 struct OPEN_DOWNGRADE4args { 6683 /* CURRENT_FH: opened file */ 6684 stateid4 stateid; 6685 seqid4 seqid; 6686 uint32_t share_access; 6687 uint32_t share_deny; 6688 };

6690 RESULT

6692 struct OPEN_DOWNGRADE4resok { 6693 stateid4 stateid; 6694 };

6696 union OPEN_DOWNGRADE4res switch(nfsstat4 status) { 6697 case NFS4_OK: 6698 OPEN_DOWNGRADE4resok resok4; 6699 default: 6700 void; 6701 };

DESCRIPTION

6703 This operation is used to adjust the access and deny bits for a given 6704 open. This is necessary when a given lockowner opens the same file 6705 multiple times with different access and deny flags. In this 6706 situation, a close of one of the opens may change the appropriate 6707 access and deny flags to remove bits associated with opens no longer 6708 in effect.

6710 The access and deny bits specified in this operation replace the 6711 current ones for the specified open file. If either the access or 6712 the deny mode specified includes bits not in effect for the open, the 6713 error NFS4ERR_INVAL should be returned. Since access and deny bits 6714 are subsets of those already granted, it is not possible for this 6715 request to be denied because of conflicting share reservations.

6717 On success, the current filehandle retains its value.
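As a non-normative illustration (the filehandle, stateid, and seqid values shown are placeholders, not values defined by this specification), a lockowner that holds both READ and WRITE access as the result of two OPENs of the same file, and that wishes to retain only READ access, might issue:

     PUTFH (file filehandle)
     OPEN_DOWNGRADE stateid, seqid, access=READ, deny=NONE

As described above, the access and deny values supplied must be subsets of those currently in effect for the open file; otherwise the server returns NFS4ERR_INVAL.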
6719 ERRORS 6721 NFS4ERR_BADHANDLE 6722 NFS4ERR_BAD_SEQID 6723 NFS4ERR_BAD_STATEID 6724 NFS4ERR_EXPIRED 6725 NFS4ERR_FHEXPIRED 6726 NFS4ERR_INVAL 6727 NFS4ERR_MOVED 6728 NFS4ERR_NOFILEHANDLE 6729 NFS4ERR_OLD_STATEID 6730 NFS4ERR_RESOURCE 6731 NFS4ERR_SERVERFAULT 6732 NFS4ERR_STALE 6733 NFS4ERR_STALE_STATEID 6735 14.2.20. Operation 22: PUTFH - Set Current Filehandle 6737 SYNOPSIS 6739 filehandle -> (cfh) 6741 ARGUMENT 6743 struct PUTFH4args { 6744 nfs4_fh object; 6745 }; 6747 RESULT 6749 struct PUTFH4res { 6750 /* CURRENT_FH: */ 6751 nfsstat4 status; 6752 }; 6754 DESCRIPTION 6756 Replaces the current filehandle with the filehandle provided as an 6757 argument. 6759 IMPLEMENTATION 6761 Commonly used as the first operator in an NFS request to set the 6762 context for following operations. 6764 ERRORS 6766 NFS4ERR_BADHANDLE 6767 NFS4ERR_FHEXPIRED 6768 NFS4ERR_MOVED 6769 NFS4ERR_RESOURCE 6770 NFS4ERR_SERVERFAULT 6771 NFS4ERR_STALE 6772 NFS4ERR_WRONGSEC 6774 14.2.21. Operation 23: PUTPUBFH - Set Public Filehandle 6776 SYNOPSIS 6778 - -> (cfh) 6780 ARGUMENT 6782 void; 6784 RESULT 6786 struct PUTPUBFH4res { 6787 /* CURRENT_FH: public fh */ 6788 nfsstat4 status; 6789 }; 6791 DESCRIPTION 6793 Replaces the current filehandle with the filehandle that represents 6794 the public filehandle of the server's name space. This filehandle 6795 may be different from the "root" filehandle which may be associated 6796 with some other directory on the server. 6798 IMPLEMENTATION 6800 Used as the first operator in an NFS request to set the context for 6801 following operations. 6803 ERRORS 6805 NFS4ERR_RESOURCE 6806 NFS4ERR_SERVERFAULT 6807 NFS4ERR_WRONGSEC 6809 14.2.22. Operation 24: PUTROOTFH - Set Root Filehandle 6811 SYNOPSIS 6813 - -> (cfh) 6815 ARGUMENT 6817 void; 6819 RESULT 6821 struct PUTROOTFH4res { 6822 /* CURRENT_FH: root fh */ 6823 nfsstat4 status; 6824 }; 6826 DESCRIPTION 6828 Replaces the current filehandle with the filehandle that represents 6829 the root of the server's name space. From this filehandle a LOOKUP 6830 operation can locate any other filehandle on the server. This 6831 filehandle may be different from the "public" filehandle which may 6832 be associated with some other directory on the server. 6834 IMPLEMENTATION 6836 Commonly used as the first operator in an NFS request to set the 6837 context for following operations. 6839 ERRORS 6841 NFS4ERR_RESOURCE 6842 NFS4ERR_SERVERFAULT 6843 NFS4ERR_WRONGSEC 6845 14.2.23. Operation 25: READ - Read from File 6847 SYNOPSIS 6849 (cfh), offset, count, stateid -> eof, data 6851 ARGUMENT 6853 struct READ4args { 6854 /* CURRENT_FH: file */ 6855 stateid4 stateid; 6856 offset4 offset; 6857 count4 count; 6858 }; 6860 RESULT 6862 struct READ4resok { 6863 bool eof; 6864 opaque data<>; 6865 }; 6867 union READ4res switch (nfsstat4 status) { 6868 case NFS4_OK: 6869 READ4resok resok4; 6870 default: 6871 void; 6872 }; 6874 DESCRIPTION 6876 The READ operation reads data from the regular file identified by 6877 the current filehandle. 6879 The client provides an offset of where the READ is to start and a 6880 count of how many bytes are to be read. An offset of 0 (zero) 6881 means to read data starting at the beginning of the file. If 6882 offset is greater than or equal to the size of the file, the 6883 status, NFS4_OK, is returned with a data length set to 0 (zero) and 6884 eof is set to TRUE. The READ is subject to access permissions 6885 checking. 
6887 If the client specifies a count value of 0 (zero), the READ 6888 succeeds and returns 0 (zero) bytes of data again subject to access 6889 permissions checking. The server may choose to return fewer bytes 6890 than specified by the client. The client needs to check for this 6891 condition and handle the condition appropriately. 6893 The stateid value for a READ request represents a value returned 6894 from a previous record lock or share reservation request. Used by 6895 the server to verify that the associated lock is still valid and to 6896 update lease timeouts for the client. 6898 If the read ended at the end-of-file (formally, in a correctly 6899 formed READ request, if offset + count is equal to the size of the 6900 file), or the read request extends beyond the size of the file (if 6901 offset + count is greater than the size of the file), eof is 6902 returned as TRUE; otherwise it is FALSE. A successful READ of an 6903 empty file will always return eof as TRUE. 6905 On success, the current filehandle retains its value. 6907 IMPLEMENTATION 6909 It is possible for the server to return fewer than count bytes of 6910 data. If the server returns less than the count requested and eof 6911 set to FALSE, the client should issue another READ to get the 6912 remaining data. A server may return less data than requested under 6913 several circumstances. The file may have been truncated by another 6914 client or perhaps on the server itself, changing the file size from 6915 what the requesting client believes to be the case. This would 6916 reduce the actual amount of data available to the client. It is 6917 possible that the server may back off the transfer size and reduce 6918 the read request return. Server resource exhaustion may also occur 6919 necessitating a smaller read return. 6921 If the file is locked the server will return an NFS4ERR_LOCKED 6922 error. Since the lock may be of short duration, the client may 6923 choose to retransmit the READ request (with exponential backoff) 6924 until the operation succeeds. 6926 ERRORS 6928 NFS4ERR_ACCES 6929 NFS4ERR_BADHANDLE 6930 NFS4ERR_BAD_STATEID 6931 NFS4ERR_DELAY 6932 NFS4ERR_DENIED 6933 NFS4ERR_EXPIRED 6934 NFS4ERR_FHEXPIRED 6935 NFS4ERR_GRACE 6936 NFS4ERR_INVAL 6937 NFS4ERR_IO 6938 NFS4ERR_LOCKED 6939 NFS4ERR_LEASE_MOVED 6940 NFS4ERR_MOVED 6941 NFS4ERR_NOFILEHANDLE 6942 NFS4ERR_NXIO 6943 NFS4ERR_OLD_STATEID 6944 NFS4ERR_RESOURCE 6945 NFS4ERR_SERVERFAULT 6946 NFS4ERR_STALE 6947 NFS4ERR_STALE_STATEID 6948 NFS4ERR_WRONGSEC 6950 14.2.24. 
Operation 26: READDIR - Read Directory 6952 SYNOPSIS 6953 (cfh), cookie, cookieverf, dircount, maxcount, attrbits -> 6954 cookieverf { cookie, filename, attrbits, attributes } 6956 ARGUMENT 6958 struct READDIR4args { 6959 /* CURRENT_FH: directory */ 6960 nfs_cookie4 cookie; 6961 verifier4 cookieverf; 6962 count4 dircount; 6963 count4 maxcount; 6964 bitmap4 attr_request; 6965 }; 6967 RESULT 6969 struct entry4 { 6970 nfs_cookie4 cookie; 6971 component4 name; 6972 fattr4 attrs; 6973 entry4 *nextentry; 6974 }; 6976 struct dirlist4 { 6977 entry4 *entries; 6978 bool eof; 6979 }; 6981 struct READDIR4resok { 6982 verifier4 cookieverf; 6983 dirlist4 reply; 6984 }; 6986 union READDIR4res switch (nfsstat4 status) { 6987 case NFS4_OK: 6988 READDIR4resok resok4; 6989 default: 6990 void; 6991 }; 6993 DESCRIPTION 6995 The READDIR operation retrieves a variable number of entries from a 6996 file system directory and returns client requested attributes for 6997 each entry along with information to allow the client to request 6998 additional directory entries in a subsequent READDIR. 7000 The arguments contain a cookie value that represents where the 7001 READDIR should start within the directory. A value of 0 (zero) for 7002 the cookie is used to start reading at the beginning of the 7003 directory. For subsequent READDIR requests, the client specifies a 7004 cookie value that is provided by the server on a previous READDIR 7005 request. 7007 The cookieverf value should be set to 0 (zero) when the cookie 7008 value is 0 (zero) (first directory read). On subsequent requests, 7009 it should be a cookieverf as returned by the server. The 7010 cookieverf must match that returned by the READDIR in which the 7011 cookie was acquired. 7013 The dircount portion of the argument is a hint of the maximum 7014 number of bytes of directory information that should be returned. 7015 This value represents the length of the names of the directory 7016 entries and the cookie value for these entries. This length 7017 represents the XDR encoding of the data (names and cookies) and not 7018 the length in the native format of the server. The server may 7019 return less data. 7021 The maxcount value of the argument is the maximum number of bytes 7022 for the result. This maximum size represents all of the data being 7023 returned and includes the XDR overhead. The server may return less 7024 data. If the server is unable to return a single directory entry 7025 within the maxcount limit, the error NFS4ERR_READDIR_NOSPC will be 7026 returned to the client. 7028 Finally, attrbits represents the list of attributes to be returned 7029 for each directory entry supplied by the server. 7031 On successful return, the server's response will provide a list of 7032 directory entries. Each of these entries contains the name of the 7033 directory entry, a cookie value for that entry, and the associated 7034 attributes as requested. 7036 The cookie value is only meaningful to the server and is used as a 7037 "bookmark" for the directory entry. As mentioned, this cookie is 7038 used by the client for subsequent READDIR operations so that it may 7039 continue reading a directory. The cookie is similar in concept to 7040 a READ offset but should not be interpreted as such by the client. 7041 Ideally, the cookie value should not change if the directory is 7042 modified since the client may be caching these values. 7044 In some cases, the server may encounter an error while obtaining 7045 the attributes for a directory entry. 
Instead of returning an 7046 error for the entire READDIR operation, the server can instead 7047 return the attribute 'fattr4_rdattr_error'. With this, the server 7048 is able to communicate the failure to the client and not fail the 7049 entire operation in the instance of what might be a transient 7050 failure. Obviously, the client must request the 7051 fattr4_rdattr_error attribute for this method to work properly. If 7052 the client does not request the attribute, the server has no choice 7053 but to return failure for the entire READDIR operation.

7055 For some file system environments, the directory entries "." and 7056 ".." have special meaning and in other environments, they may not. 7057 If the server supports these special entries within a directory, 7058 they should not be returned to the client as part of the READDIR 7059 response. To enable some client environments, the cookie values of 7060 0, 1, and 2 are to be considered reserved. Note that the Unix 7061 client will use these values when combining the server's response 7062 and local representations to enable a fully formed Unix directory 7063 presentation to the application.

7065 For READDIR arguments, cookie values of 1 and 2 should not be used 7066 and for READDIR results cookie values of 0, 1, and 2 should not be 7067 returned.

7069 On success, the current filehandle retains its value.

7071 IMPLEMENTATION

7073 The server's file system directory representations can differ 7074 greatly. A client's programming interfaces may also be bound to 7075 the local operating environment in a way that does not translate 7076 well into the NFS protocol. Therefore the dircount and 7077 maxcount fields are provided to allow the client to 7078 provide guidelines to the server. If the client is aggressive 7079 about attribute collection during a READDIR, the server has an idea 7080 of how to limit the encoded response. The dircount field provides 7081 a hint on the number of entries based solely on the names of the 7082 directory entries. Since it is a hint, it may be possible that a 7083 dircount value is zero. In this case, the server is free to ignore 7084 the dircount value and return directory information based on the 7085 specified maxcount value.

7087 The cookieverf may be used by the server to help manage cookie 7088 values that may become stale. It should be a rare occurrence that 7089 a server is unable to continue properly reading a directory with 7090 the provided cookie/cookieverf pair. The server should make every 7091 effort to avoid this condition since the application at the client 7092 may not be able to properly handle this type of failure.

7094 The use of the cookieverf will also protect the client from using 7095 READDIR cookie values that may be stale. For example, if the file 7096 system has been migrated, the server may or may not be able to use 7097 the same cookie values to service READDIR as the previous server 7098 used. With the client providing the cookieverf, the server is able 7099 to provide the appropriate response to the client. This prevents 7100 the case where the server may accept a cookie value but the 7101 underlying directory has changed and the response is invalid from 7102 the client's context of its previous READDIR.

7104 Since some servers will not be returning "." and ".." entries as 7105 has been done with previous versions of the NFS protocol, a 7106 client that requires these entries to be present in READDIR responses 7107 must fabricate them.
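As a non-normative sketch of directory enumeration (the filehandle and attribute bit values are placeholders), a client might issue an initial request and then continue using the cookie and cookieverf returned by the server:

     PUTFH (directory filehandle)
     READDIR cookie=0, cookieverf=0, dircount, maxcount, attrbits

     PUTFH (directory filehandle)
     READDIR cookie=(last cookie returned), cookieverf=(cookieverf returned), dircount, maxcount, attrbits

Including fattr4_rdattr_error in attrbits allows the server to report a per-entry attribute failure without failing the entire READDIR, as described above.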
7109 ERRORS

7111 NFS4ERR_ACCES 7112 NFS4ERR_BADHANDLE 7113 NFS4ERR_BAD_COOKIE 7114 NFS4ERR_DELAY 7115 NFS4ERR_FHEXPIRED 7116 NFS4ERR_INVAL 7117 NFS4ERR_IO 7118 NFS4ERR_MOVED 7119 NFS4ERR_NOFILEHANDLE 7120 NFS4ERR_NOTDIR 7121 NFS4ERR_NOTSUPP 7122 NFS4ERR_READDIR_NOSPC 7123 NFS4ERR_RESOURCE 7124 NFS4ERR_SERVERFAULT 7125 NFS4ERR_STALE 7126 NFS4ERR_TOOSMALL 7127 NFS4ERR_WRONGSEC

7129 14.2.25. Operation 27: READLINK - Read Symbolic Link

7131 SYNOPSIS

7133 (cfh) -> linktext

7135 ARGUMENT

7137 /* CURRENT_FH: symlink */ 7138 void;

7140 RESULT

7142 struct READLINK4resok { 7143 linktext4 link; 7144 };

7146 union READLINK4res switch (nfsstat4 status) { 7147 case NFS4_OK: 7148 READLINK4resok resok4; 7149 default: 7150 void; 7151 };

7153 DESCRIPTION

7155 READLINK reads the data associated with a symbolic link. The data 7156 is a UTF-8 string that is opaque to the server. That is, whether 7157 created by an NFS client or created locally on the server, the data 7158 in a symbolic link is not interpreted when created, but is simply 7159 stored.

7161 On success, the current filehandle retains its value.

7163 IMPLEMENTATION

7165 A symbolic link is nominally a pointer to another file. The data 7166 is not necessarily interpreted by the server, just stored in the 7167 file. It is possible for a client implementation to store a path 7168 name that is not meaningful to the server operating system in a 7169 symbolic link. A READLINK operation returns the data to the client 7170 for interpretation. If different implementations want to share 7171 access to symbolic links, then they must agree on the 7172 interpretation of the data in the symbolic link.

7174 The READLINK operation is only allowed on objects of type NF4LNK. 7175 The server should return the error, NFS4ERR_INVAL, if the object is 7176 not of type, NF4LNK.

7178 ERRORS

7180 NFS4ERR_ACCES 7181 NFS4ERR_BADHANDLE 7182 NFS4ERR_DELAY 7183 NFS4ERR_FHEXPIRED 7184 NFS4ERR_INVAL 7185 NFS4ERR_IO 7186 NFS4ERR_MOVED 7187 NFS4ERR_NOFILEHANDLE 7188 NFS4ERR_NOTSUPP 7189 NFS4ERR_RESOURCE 7190 NFS4ERR_SERVERFAULT 7191 NFS4ERR_STALE 7192 NFS4ERR_WRONGSEC

7194 14.2.26. Operation 28: REMOVE - Remove Filesystem Object

7196 SYNOPSIS

7198 (cfh), filename -> change_info

7200 ARGUMENT

7202 struct REMOVE4args { 7203 /* CURRENT_FH: directory */ 7204 component4 target; 7205 };

7207 RESULT

7209 struct REMOVE4resok { 7210 change_info4 cinfo; 7211 };

7213 union REMOVE4res switch (nfsstat4 status) { 7214 case NFS4_OK: 7215 REMOVE4resok resok4; 7216 default: 7217 void; 7218 };

7220 DESCRIPTION

7222 The REMOVE operation removes (deletes) a directory entry named by 7223 filename from the directory corresponding to the current 7224 filehandle. If the entry in the directory was the last reference 7225 to the corresponding file system object, the object may be 7226 destroyed.

7228 For the directory where the filename was removed, the server 7229 returns change_info4 information in cinfo. With the atomic field 7230 of the change_info4 struct, the server will indicate if the before 7231 and after change attributes were obtained atomically with respect 7232 to the removal.

7234 If the target has a length of 0 (zero), or if target does not obey 7235 the UTF-8 definition, the error NFS4ERR_INVAL will be returned.

7237 On success, the current filehandle retains its value.

7239 IMPLEMENTATION

7241 NFS versions 2 and 3 required a different operator RMDIR for 7242 directory removal. NFS version 4 REMOVE can be used to delete any 7243 directory entry independent of its file type.
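As a non-normative sketch (the filehandle, name, and attribute bits are placeholders), a client that wishes to remove an entry and refresh the parent directory's attributes in a single request might issue:

     PUTFH (directory filehandle)
     REMOVE "name"
     GETATTR attrbits (post-remove directory attributes)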
7245 The concept of last reference is server specific. However, if the 7246 numlinks field in the previous attributes of the object had the 7247 value 1, the client should not rely on referring to the object via 7248 a file handle. Likewise, the client should not rely on the 7249 resources (disk space, directory entry, and so on) formerly 7250 associated with the object becoming immediately available. Thus, if 7251 a client needs to be able to continue to access a file after using 7252 REMOVE to remove it, the client should take steps to make sure that 7253 the file will still be accessible. The usual mechanism used is to 7254 RENAME the file from its old name to a new hidden name. 7256 ERRORS 7258 NFS4ERR_ACCES 7259 NFS4ERR_BADHANDLE 7260 NFS4ERR_DELAY 7261 NFS4ERR_FHEXPIRED 7262 NFS4ERR_IO 7263 NFS4ERR_MOVED 7264 NFS4ERR_NAMETOOLONG 7265 NFS4ERR_NOENT 7266 NFS4ERR_NOFILEHANDLE 7267 NFS4ERR_NOTDIR 7268 NFS4ERR_NOTEMPTY 7269 NFS4ERR_NOTSUPP 7270 NFS4ERR_RESOURCE 7271 NFS4ERR_ROFS 7272 NFS4ERR_SERVERFAULT 7273 NFS4ERR_STALE 7274 NFS4ERR_WRONGSEC 7276 14.2.27. Operation 29: RENAME - Rename Directory Entry 7278 SYNOPSIS 7280 (sfh), oldname (cfh), newname -> source_change_info, 7281 target_change_info 7283 ARGUMENT 7285 struct RENAME4args { 7286 /* SAVED_FH: source directory */ 7287 component4 oldname; 7288 /* CURRENT_FH: target directory */ 7289 component4 newname; 7290 }; 7292 RESULT 7294 struct RENAME4resok { 7295 change_info4 source_cinfo; 7296 change_info4 target_cinfo; 7297 }; 7299 union RENAME4res switch (nfsstat4 status) { 7300 case NFS4_OK: 7301 RENAME4resok resok4; 7302 default: 7303 void; 7304 }; 7306 DESCRIPTION 7308 The RENAME operation renames the object identified by oldname in 7309 the source directory corresponding to the saved filehandle, as set 7310 by the SAVEFH operation, to newname in the target directory 7311 corresponding to the current filehandle. The operation is required 7312 to be atomic to the client. Source and target directories must 7313 reside on the same file system on the server. On success, the 7314 current filehandle will continue to be the target directory. 7316 If the target directory already contains an entry with the name, 7317 newname, the source object must be compatible with the target: 7318 either both are non-directories or both are directories and the 7319 target must be empty. If compatible, the existing target is 7320 removed before the rename occurs. If they are not compatible or if 7321 the target is a directory but not empty, the server will return the 7322 error, NFS4ERR_EXIST. 7324 If oldname and newname both refer to the same file (they might be 7325 hard links of each other), then RENAME should perform no action and 7326 return success. 7328 For both directories involved in the RENAME, the server returns 7329 change_info4 information. With the atomic field of the 7330 change_info4 struct, the server will indicate if the before and 7331 after change attributes were obtained atomically with respect to 7332 the rename. 7334 If the oldname or newname has a length of 0 (zero), or if oldname 7335 or newname does not obey the UTF-8 definition, the error 7336 NFS4ERR_INVAL will be returned. 7338 IMPLEMENTATION 7340 The RENAME operation must be atomic to the client. The statement 7341 "source and target directories must reside on the same file system 7342 on the server" means that the fsid fields in the attributes for the 7343 directories are the same. If they reside on different file systems, 7344 the error, NFS4ERR_XDEV, is returned. 
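As a non-normative sketch (the filehandles and names are placeholders), a rename between two directories on the same file system might be expressed as:

     PUTFH (source directory filehandle)
     SAVEFH
     PUTFH (target directory filehandle)
     RENAME "oldname" "newname"

The saved filehandle designates the source directory and the current filehandle designates the target directory, as required by the definition of the operation.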
7346 A filehandle may or may not become stale or expire on a rename. 7347 However, server implementors are strongly encouraged to attempt to 7348 keep file handles from becoming stale or expiring in this fashion. 7350 On some servers, the filenames, "." and "..", are illegal as either 7351 oldname or newname. In addition, neither oldname nor newname can 7352 be an alias for the source directory. These servers will return 7353 the error, NFS4ERR_INVAL, in these cases. 7355 ERRORS 7357 NFS4ERR_ACCES 7358 NFS4ERR_BADHANDLE 7359 NFS4ERR_DELAY 7360 NFS4ERR_DQUOT 7361 NFS4ERR_EXIST 7362 NFS4ERR_FHEXPIRED 7363 NFS4ERR_INVAL 7364 NFS4ERR_IO 7365 NFS4ERR_ISDIR 7366 NFS4ERR_MOVED 7367 NFS4ERR_NAMETOOLONG 7368 NFS4ERR_NOENT 7369 NFS4ERR_NOFILEHANDLE 7370 NFS4ERR_NOSPC 7371 NFS4ERR_NOTDIR 7372 NFS4ERR_NOTEMPTY 7373 NFS4ERR_NOTSUPP 7374 NFS4ERR_RESOURCE 7375 NFS4ERR_ROFS 7376 NFS4ERR_SERVERFAULT 7377 NFS4ERR_STALE 7378 NFS4ERR_WRONGSEC 7379 NFS4ERR_XDEV 7381 14.2.28. Operation 30: RENEW - Renew a Lease 7383 SYNOPSIS 7385 stateid -> () 7387 ARGUMENT 7389 struct RENEW4args { 7390 stateid4 stateid; 7391 }; 7393 RESULT 7395 struct RENEW4res { 7396 nfsstat4 status; 7397 }; 7399 DESCRIPTION 7401 The RENEW operation is used by the client to renew leases which it 7402 currently holds at a server. In processing the RENEW request, the 7403 server renews all leases associated with the client. The 7404 associated leases are determined by the client id provided via the 7405 SETCLIENTID procedure. 7407 The stateid for RENEW may not be one of the special stateids 7408 consisting of all bits 0 (zero) or all bits 1. 7410 IMPLEMENTATION 7412 ERRORS 7414 NFS4ERR_BAD_STATEID 7415 NFS4ERR_EXPIRED 7416 NFS4ERR_GRACE 7417 NFS4ERR_INVAL 7418 NFS4ERR_LEASE_MOVED 7419 NFS4ERR_MOVED 7420 NFS4ERR_OLD_STATEID 7421 NFS4ERR_RESOURCE 7422 NFS4ERR_SERVERFAULT 7423 NFS4ERR_STALE_STATEID 7424 NFS4ERR_WRONGSEC 7426 14.2.29. Operation 31: RESTOREFH - Restore Saved Filehandle 7428 SYNOPSIS 7430 (sfh) -> (cfh) 7432 ARGUMENT 7434 /* SAVED_FH: */ 7435 void; 7437 RESULT 7439 struct RESTOREFH4res { 7440 /* CURRENT_FH: value of saved fh */ 7441 nfsstat4 status; 7442 }; 7444 DESCRIPTION 7446 Set the current filehandle to the value in the saved filehandle. 7447 If there is no saved filehandle then return an error 7448 NFS4ERR_NOFILEHANDLE. 7450 IMPLEMENTATION 7452 Operations like OPEN and LOOKUP use the current filehandle to 7453 represent a directory and replace it with a new filehandle. 7454 Assuming the previous filehandle was saved with a SAVEFH operator, 7455 the previous filehandle can be restored as the current filehandle. 7456 This is commonly used to obtain post-operation attributes for the 7457 directory, e.g. 7459 PUTFH (directory filehandle) 7460 SAVEFH 7461 GETATTR attrbits (pre-op dir attrs) 7462 CREATE optbits "foo" attrs 7463 GETATTR attrbits (file attributes) 7464 RESTOREFH 7465 GETATTR attrbits (post-op dir attrs) 7467 ERRORS 7469 NFS4ERR_BADHANDLE 7470 NFS4ERR_FHEXPIRED 7471 NFS4ERR_MOVED 7472 NFS4ERR_NOFILEHANDLE 7473 NFS4ERR_RESOURCE 7474 NFS4ERR_SERVERFAULT 7475 NFS4ERR_STALE 7476 NFS4ERR_WRONGSEC 7478 14.2.30. Operation 32: SAVEFH - Save Current Filehandle 7480 SYNOPSIS 7482 (cfh) -> (sfh) 7484 ARGUMENT 7486 /* CURRENT_FH: */ 7487 void; 7489 RESULT 7491 struct SAVEFH4res { 7492 /* SAVED_FH: value of current fh */ 7493 nfsstat4 status; 7494 }; 7496 DESCRIPTION 7498 Save the current filehandle. If a previous filehandle was saved 7499 then it is no longer accessible. 
The saved filehandle can be 7500 restored as the current filehandle with the RESTOREFH operator. 7502 On success, the current filehandle retains its value. 7504 IMPLEMENTATION 7506 ERRORS 7508 NFS4ERR_BADHANDLE 7509 NFS4ERR_FHEXPIRED 7510 NFS4ERR_MOVED 7511 NFS4ERR_NOFILEHANDLE 7512 NFS4ERR_RESOURCE 7513 NFS4ERR_SERVERFAULT 7514 NFS4ERR_STALE 7515 NFS4ERR_WRONGSEC 7517 14.2.31. Operation 33: SECINFO - Obtain Available Security 7519 SYNOPSIS 7521 (cfh), name -> { secinfo } 7523 ARGUMENT 7525 struct SECINFO4args { 7526 /* CURRENT_FH: */ 7527 component4 name; 7528 }; 7530 RESULT 7532 enum rpc_gss_svc_t { 7533 RPC_GSS_SVC_NONE = 1, 7534 RPC_GSS_SVC_INTEGRITY = 2, 7535 RPC_GSS_SVC_PRIVACY = 3 7536 }; 7538 struct rpcsec_gss_info { 7539 sec_oid4 oid; 7540 qop4 qop; 7541 rpc_gss_svc_t service; 7542 }; 7544 struct secinfo4 { 7545 uint32_t flavor; 7546 opaque flavor_info<>; /* null for AUTH_SYS, AUTH_NONE; 7547 contains rpcsec_gss_info for 7548 RPCSEC_GSS. */ 7549 }; 7551 typedef secinfo4 SECINFO4resok<>; 7553 union SECINFO4res switch (nfsstat4 status) { 7554 case NFS4_OK: 7555 SECINFO4resok resok4; 7556 default: 7557 void; 7558 }; 7560 DESCRIPTION 7562 The SECINFO operation is used by the client to obtain a list of 7563 valid RPC authentication flavors for a specific file handle, file 7564 name pair. The result will contain an array which represents the 7565 security mechanisms available. The array entries are represented 7566 by the secinfo4 structure. The field 'flavor' will contain a value 7567 of AUTH_NONE, AUTH_SYS (as defined in [RFC1831]), or RPCSEC_GSS (as 7568 defined in [RFC2203]). 7570 For the flavors, AUTH_NONE, and AUTH_SYS no additional security 7571 information is returned. For a return value of RPCSEC_GSS, a 7572 security triple is returned that contains the mechanism object id 7573 (as defined in [RFC2078]), the quality of protection (as defined in 7574 [RFC2078]) and the service type (as defined in [RFC2203]). It is 7575 possible for SECINFO to return multiple entries with flavor equal 7576 to RPCSEC_GSS with different security triple values. 7578 On success, the current filehandle retains its value. 7580 IMPLEMENTATION 7582 The SECINFO operation is expected to be used by the NFS client when 7583 the error value of NFS4ERR_WRONGSEC is returned from another NFS 7584 operation. This signifies to the client that the server's security 7585 policy is different from what the client is currently using. At 7586 this point, the client is expected to obtain a list of possible 7587 security flavors and choose what best suits its policies. 7589 It is recommended that the client issue the SECINFO call protected 7590 by a security triple that uses either rpc_gss_svc_integrity or 7591 rpc_gss_svc_privacy service. The use of rpc_gss_svc_none would 7592 allow an attacker in the middle to modify the SECINFO results such 7593 that the client might select a weaker algorithm in the set allowed 7594 by server, making the client and/or server vulnerable to further 7595 attacks. 7597 ERRORS 7599 NFS4ERR_BADHANDLE 7600 NFS4ERR_FHEXPIRED 7601 NFS4ERR_MOVED 7602 NFS4ERR_NAMETOOLONG 7603 NFS4ERR_NOENT 7604 NFS4ERR_NOFILEHANDLE 7605 NFS4ERR_NOTDIR 7606 NFS4ERR_RESOURCE 7607 NFS4ERR_SERVERFAULT 7608 NFS4ERR_STALE 7609 NFS4ERR_WRONGSEC 7611 14.2.32. 
Operation 34: SETATTR - Set Attributes 7613 SYNOPSIS 7615 (cfh), attrbits, attrvals -> - 7617 ARGUMENT 7619 struct SETATTR4args { 7620 /* CURRENT_FH: target object */ 7621 stateid4 stateid; 7622 fattr4 obj_attributes; 7623 }; 7625 RESULT 7627 struct SETATTR4res { 7628 nfsstat4 status; 7629 bitmap4 attrsset; 7630 }; 7632 DESCRIPTION 7634 The SETATTR operation changes one or more of the attributes of a 7635 file system object. The new attributes are specified with a bitmap 7636 and the attributes that follow the bitmap in bit order. 7638 The stateid is necessary for SETATTRs that change the size of a 7639 file (modify the attribute object_size). This stateid represents a 7640 record lock, share reservation, or delegation which must be valid 7641 for the SETATTR to modify the file data. A valid stateid would 7642 always be specified. When the file size is not changed, the 7643 special stateid consisting of all bits 0 (zero) should be used. 7645 On either success or failure of the operation, the server will 7646 return the attrsset bitmask to represent what (if any) attributes 7647 were successfully set. 7649 On success, the current filehandle retains its value. 7651 IMPLEMENTATION 7653 The file size attribute is used to request changes to the size of a 7654 file. A value of 0 (zero) causes the file to be truncated, a value 7655 less than the current size of the file causes data from new size to 7656 the end of the file to be discarded, and a size greater than the 7657 current size of the file causes logically zeroed data bytes to be 7658 added to the end of the file. Servers are free to implement this 7659 using holes or actual zero data bytes. Clients should not make any 7660 assumptions regarding a server's implementation of this feature, 7661 beyond that the bytes returned will be zeroed. Servers must 7662 support extending the file size via SETATTR. 7664 SETATTR is not guaranteed atomic. A failed SETATTR may partially 7665 change a file's attributes. 7667 Changing the size of a file with SETATTR indirectly changes the 7668 time_modify. A client must account for this as size changes can 7669 result in data deletion. 7671 If server and client times differ, programs that compare client 7672 time to file times can break. A time maintenance protocol should be 7673 used to limit client/server time skew. 7675 If the server cannot successfully set all the attributes it must 7676 return an NFS4ERR_INVAL error. If the server can only support 32 7677 bit offsets and sizes, a SETATTR request to set the size of a file 7678 to larger than can be represented in 32 bits will be rejected with 7679 this same error. 7681 ERRORS 7683 NFS4ERR_ACCES 7684 NFS4ERR_BADHANDLE 7685 NFS4ERR_BAD_STATEID 7686 NFS4ERR_DELAY 7687 NFS4ERR_DENIED 7688 NFS4ERR_DQUOT 7689 NFS4ERR_EXPIRED 7690 NFS4ERR_FBIG 7691 NFS4ERR_FHEXPIRED 7692 NFS4ERR_GRACE 7693 NFS4ERR_INVAL 7694 NFS4ERR_IO 7695 NFS4ERR_MOVED 7696 NFS4ERR_NOFILEHANDLE 7697 NFS4ERR_NOSPC 7698 NFS4ERR_NOTSUPP 7699 NFS4ERR_OLD_STATEID 7700 NFS4ERR_PERM 7701 NFS4ERR_RESOURCE 7702 NFS4ERR_ROFS 7703 NFS4ERR_SERVERFAULT 7704 NFS4ERR_STALE 7705 NFS4ERR_STALE_STATEID 7706 NFS4ERR_WRONGSEC 7708 14.2.33. 
Operation 35: SETCLIENTID - Negotiate Clientid 7710 SYNOPSIS 7712 client, callback -> clientid, setclientid_confirm 7714 ARGUMENT 7716 struct SETCLIENTID4args { 7717 nfs_client_id4 client; 7718 cb_client4 callback; 7719 }; 7721 RESULT 7723 struct SETCLIENTID4resok { 7724 clientid4 clientid; 7725 verifier4 setclientid_confirm; 7726 }; 7728 union SETCLIENTID4res switch (nfsstat4 status) { 7729 case NFS4_OK: 7730 SETCLIENTID4resok resok4; 7731 case NFS4ERR_CLID_INUSE: 7732 clientaddr4 client_using; 7733 default: 7734 void; 7735 }; 7737 DESCRIPTION 7739 The SETCLIENTID operation introduces the ability of the client to 7740 notify the server of its intention to use a particular client 7741 identifier and verifier pair. Upon successful completion the 7742 server will return a clientid which is used in subsequent file 7743 locking requests and a confirmation verifier. The client will use 7744 the SETCLIENTID_CONFIRM operation to return the verifier to the 7745 server. At that point, the client may use the clientid in 7746 subsequent operations that require an nfs_lockowner. 7748 The callback information provided in this operation will be used if 7749 the client is provided an open delegation at a future point. 7750 Therefore, the client must correctly reflect the program and port 7751 numbers for the callback program at the time SETCLIENTID is used. 7753 IMPLEMENTATION 7755 The server takes the verifier and client identification supplied in 7756 the nfs_client_id4 and searches for a match of the client 7757 identification. If no match is found the server saves the 7758 principal/uid information along with the verifier and client 7759 identification and returns a unique clientid that is used as a 7760 shorthand reference to the supplied information. 7762 If the server finds matching client identification and a 7763 corresponding match in principal/uid, the server releases all 7764 locking state for the client and returns a new clientid. 7766 The principal, or principal to user-identifier mapping is taken 7767 from the credential presented in the RPC. As mentioned, the server 7768 will use the credential and associated principal for the matching 7769 with existing clientids. If the client is a traditional host-based 7770 client like a Unix NFS client, then the credential presented may be 7771 the host credential. If the client is a user level client or 7772 lightweight client, the credential used may be the end user's 7773 credential. The client should take care in choosing an appropriate 7774 credential since denial of service attacks could be attempted by a 7775 rogue client that has access to the credential. 7777 ERRORS 7779 NFS4ERR_CLID_INUSE 7780 NFS4ERR_INVAL 7781 NFS4ERR_RESOURCE 7782 NFS4ERR_SERVERFAULT 7784 14.2.34. Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid 7786 SYNOPSIS 7788 setclientid_confirm -> - 7790 ARGUMENT 7792 struct SETCLIENTID_CONFIRM4args { 7793 verifier4 setclientid_confirm; 7794 }; 7796 RESULT 7798 struct SETCLIENTID_CONFIRM4res { 7799 nfsstat4 status; 7800 }; 7802 DESCRIPTION 7804 This operation is used by the client to confirm the results from a 7805 previous call to SETCLIENTID. The client provides the server 7806 supplied (from a SETCLIENTID response) opaque confirmation 7807 verifier. The server responds with a simple status of success or 7808 failure. 7810 IMPLEMENTATION 7812 The client must use the SETCLIENTID_CONFIRM operation to confirm 7813 its use of client identifier. 
If the server is holding state for a 7814 client which has presented a new verifier via SETCLIENTID, then the 7815 state will not be released, as described in the section "Client 7816 Failure and Recovery", until a valid SETCLIENTID_CONFIRM is 7817 received. Upon successful confirmation the server will release the 7818 previous state held on behalf of the client. The server should 7819 choose a confirmation cookie value that is reasonably unique for 7820 the client. 7822 ERRORS 7824 NFS4ERR_CLID_INUSE 7825 NFS4ERR_INVAL 7826 NFS4ERR_RESOURCE 7827 NFS4ERR_SERVERFAULT 7828 NFS4ERR_STALE_CLIENTID 7830 14.2.35. Operation 37: VERIFY - Verify Same Attributes 7832 SYNOPSIS 7834 (cfh), fattr -> - 7836 ARGUMENT 7838 struct VERIFY4args { 7839 /* CURRENT_FH: object */ 7840 fattr4 obj_attributes; 7841 }; 7843 RESULT 7845 struct VERIFY4res { 7846 nfsstat4 status; 7847 }; 7849 DESCRIPTION 7851 The VERIFY operation is used to verify that attributes have a value 7852 assumed by the client before proceeding with following operations 7853 in the compound request. If any of the attributes do not match 7854 then the error NFS4ERR_NOT_SAME must be returned. The current 7855 filehandle retains its value after successful completion of the 7856 operation. 7858 IMPLEMENTATION 7860 One possible use of the VERIFY operation is the following compound 7861 sequence. With this the client is attempting to verify that the 7862 file being removed will match what the client expects to be 7863 removed. This sequence can help prevent the unintended deletion of 7864 a file. 7866 PUTFH (directory filehandle) 7867 LOOKUP (file name) 7868 VERIFY (filehandle == fh) 7869 PUTFH (directory filehandle) 7870 REMOVE (file name) 7872 This sequence does not prevent a second client from removing and 7873 creating a new file in the middle of this sequence but it does help 7874 avoid the unintended result. 7876 In the case that a recommended attribute is specified in the VERIFY 7877 operation and the server does not support that attribute for the 7878 file system object, the error NFS4ERR_NOTSUPP is returned to the 7879 client. 7881 ERRORS 7883 NFS4ERR_ACCES 7884 NFS4ERR_BADHANDLE 7885 NFS4ERR_DELAY 7886 NFS4ERR_FHEXPIRED 7887 NFS4ERR_INVAL 7888 NFS4ERR_MOVED 7889 NFS4ERR_NOFILEHANDLE 7890 NFS4ERR_NOTSUPP 7891 NFS4ERR_NOT_SAME 7892 NFS4ERR_RESOURCE 7893 NFS4ERR_SERVERFAULT 7894 NFS4ERR_STALE 7895 NFS4ERR_WRONGSEC 7897 14.2.36. Operation 38: WRITE - Write to File 7899 SYNOPSIS 7901 (cfh), offset, count, stability, stateid, data -> count, committed, 7902 verifier 7904 ARGUMENT 7906 enum stable_how4 { 7907 UNSTABLE4 = 0, 7908 DATA_SYNC4 = 1, 7909 FILE_SYNC4 = 2 7910 }; 7912 struct WRITE4args { 7913 /* CURRENT_FH: file */ 7914 stateid4 stateid; 7915 offset4 offset; 7916 stable_how4 stable; 7917 opaque data<>; 7918 }; 7920 RESULT 7922 struct WRITE4resok { 7923 count4 count; 7924 stable_how4 committed; 7925 verifier4 writeverf; 7926 }; 7928 union WRITE4res switch (nfsstat4 status) { 7929 case NFS4_OK: 7930 WRITE4resok resok4; 7931 default: 7932 void; 7933 }; 7935 DESCRIPTION 7937 The WRITE operation is used to write data to a regular file. The 7938 target file is specified by the current filehandle. The offset 7939 specifies the offset where the data should be written. An offset 7940 of 0 (zero) specifies that the write should start at the beginning 7941 of the file. The count represents the number of bytes of data that 7942 are to be written. 
If the count is 0 (zero), the WRITE will 7943 succeed and return a count of 0 (zero) subject to permissions 7944 checking. The server may choose to write fewer bytes than 7945 requested by the client.

7947 Part of the write request is a specification of how the write is to 7948 be performed. The client specifies with the stable parameter the 7949 method of how the data is to be processed by the server. If stable 7950 is FILE_SYNC4, the server must commit the data written plus all 7951 file system metadata to stable storage before returning results. 7952 This corresponds to the NFS version 2 protocol semantics. Any 7953 other behavior constitutes a protocol violation. If stable is 7954 DATA_SYNC4, then the server must commit all of the data to stable 7955 storage and enough of the metadata to retrieve the data before 7956 returning. The server implementor is free to implement DATA_SYNC4 7957 in the same fashion as FILE_SYNC4, but with a possible performance 7958 drop. If stable is UNSTABLE4, the server is free to commit any 7959 part of the data and the metadata to stable storage, including all 7960 or none, before returning a reply to the client. There is no 7961 guarantee whether or when any uncommitted data will subsequently be 7962 committed to stable storage. The only guarantees made by the server 7963 are that it will not destroy any data without changing the value of 7964 writeverf and that it will not commit the data and metadata at a level 7965 less than that requested by the client.

7967 The stateid returned from a previous record lock or share 7968 reservation request is provided as part of the argument. The 7969 stateid is used by the server to verify that the associated lock is 7970 still valid and to update lease timeouts for the client.

7972 Upon successful completion, the following results are returned. 7973 The count result is the number of bytes of data written to the 7974 file. The server may write fewer bytes than requested. If so, the 7975 actual number of bytes written starting at location, offset, is 7976 returned.

7978 The server also returns an indication of the level of commitment of 7979 the data and metadata via committed. If the server committed all 7980 data and metadata to stable storage, committed should be set to 7981 FILE_SYNC4. If the level of commitment was at least as strong as 7982 DATA_SYNC4, then committed should be set to DATA_SYNC4. Otherwise, 7983 committed must be returned as UNSTABLE4. If stable was FILE_SYNC4, 7984 then committed must also be FILE_SYNC4: anything else constitutes a 7985 protocol violation. If stable was DATA_SYNC4, then committed may be 7986 FILE_SYNC4 or DATA_SYNC4: anything else constitutes a protocol 7987 violation. If stable was UNSTABLE4, then committed may be either 7988 FILE_SYNC4, DATA_SYNC4, or UNSTABLE4.

7990 The final portion of the result is the write verifier, writeverf. The 7991 write verifier is a cookie that the client can use to determine 7992 whether the server has changed state between a call to WRITE and a 7993 subsequent call to either WRITE or COMMIT. This cookie must be 7994 consistent during a single instance of the NFS version 4 protocol 7995 service and must be unique between instances of the NFS version 4 7996 protocol server, where uncommitted data may be lost.
7998 If a client writes data to the server with the stable argument set 7999 to UNSTABLE4 and the reply yields a committed response of 8000 DATA_SYNC4 or UNSTABLE4, the client will follow up some time in the 8001 future with a COMMIT operation to synchronize outstanding 8002 asynchronous data and metadata with the server's stable storage, 8003 barring client error. It is possible that due to client crash or 8004 other error that a subsequent COMMIT will not be received by the 8005 server. 8007 On success, the current filehandle retains its value. 8009 IMPLEMENTATION 8011 It is possible for the server to write fewer than count bytes of 8012 data. In this case, the server should not return an error unless 8013 no data was written at all. If the server writes less than count 8014 bytes, the client should issue another WRITE to write the remaining 8015 data. 8017 It is assumed that the act of writing data to a file will cause the 8018 time_modified of the file to be updated. However, the 8019 time_modified of the file should not be changed unless the contents 8020 of the file are changed. Thus, a WRITE request with count set to 0 8021 should not cause the time_modified of the file to be updated. 8023 The definition of stable storage has been historically a point of 8024 contention. The following expected properties of stable storage 8025 may help in resolving design issues in the implementation. Stable 8026 storage is persistent storage that survives: 8028 1. Repeated power failures. 8029 2. Hardware failures (of any board, power supply, etc.). 8030 3. Repeated software crashes, including reboot cycle. 8032 This definition does not address failure of the stable storage 8033 module itself. 8035 The verifier is defined to allow a client to detect different 8036 instances of an NFS version 4 protocol server over which cached, 8037 uncommitted data may be lost. In the most likely case, the verifier 8038 allows the client to detect server reboots. This information is 8039 required so that the client can safely determine whether the server 8040 could have lost cached data. If the server fails unexpectedly and 8041 the client has uncommitted data from previous WRITE requests (done 8042 with the stable argument set to UNSTABLE4 and in which the result 8043 committed was returned as UNSTABLE4 as well) it may not have 8044 flushed cached data to stable storage. The burden of recovery is on 8045 the client and the client will need to retransmit the data to the 8046 server. 8048 A suggested verifier would be to use the time that the server was 8049 booted or the time the server was last started (if restarting the 8050 server without a reboot results in lost buffers). 8052 The committed field in the results allows the client to do more 8053 effective caching. If the server is committing all WRITE requests 8054 to stable storage, then it should return with committed set to 8055 FILE_SYNC4, regardless of the value of the stable field in the 8056 arguments. A server that uses an NVRAM accelerator may choose to 8057 implement this policy. The client can use this to increase the 8058 effectiveness of the cache by discarding cached data that has 8059 already been committed on the server. 8061 Some implementations may return NFS4ERR_NOSPC instead of 8062 NFS4ERR_DQUOT when a user's quota is exceeded. 
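As a non-normative sketch of the unstable write and commit sequence described above (the filehandle, stateid, offset, and count values are placeholders), a client might issue:

     PUTFH (file filehandle)
     WRITE stateid, offset, stable=UNSTABLE4, data
     ...
     COMMIT offset, count

If the verifier returned by COMMIT does not match the writeverf returned by the earlier WRITE requests, the client assumes that the server may have lost the uncommitted data and retransmits the affected WRITE requests.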
8064 ERRORS 8066 NFS4ERR_ACCES 8067 NFS4ERR_BADHANDLE 8068 NFS4ERR_BAD_STATEID 8069 NFS4ERR_DELAY 8070 NFS4ERR_DENIED 8071 NFS4ERR_DQUOT 8072 NFS4ERR_EXPIRED 8073 NFS4ERR_FBIG 8074 NFS4ERR_FHEXPIRED 8075 NFS4ERR_GRACE 8076 NFS4ERR_INVAL 8077 NFS4ERR_IO 8078 NFS4ERR_LEASE_MOVED 8079 NFS4ERR_LOCKED 8080 NFS4ERR_MOVED 8081 NFS4ERR_NOFILEHANDLE 8082 NFS4ERR_NOSPC 8083 NFS4ERR_OLD_STATEID 8084 NFS4ERR_RESOURCE 8085 NFS4ERR_ROFS 8086 NFS4ERR_SERVERFAULT 8087 NFS4ERR_STALE 8088 NFS4ERR_STALE_STATEID 8089 NFS4ERR_WRONGSEC 8091 15. NFS Version 4 Callback Procedures 8093 The procedures used for callbacks are defined in the following 8094 sections. In the interest of clarity, the terms "client" and 8095 "server" refer to NFS clients and servers, despite the fact that for 8096 an individual callback RPC, the sense of these terms would be 8097 precisely the opposite. 8099 15.1. Procedure 0: CB_NULL - No Operation 8101 SYNOPSIS 8103 8105 ARGUMENT 8107 void; 8109 RESULT 8111 void; 8113 DESCRIPTION 8115 Standard NULL procedure. Void argument, void response. Even 8116 though there is no direct functionality associated with this 8117 procedure, the server will use CB_NULL to confirm the existence of 8118 a path for RPCs from server to client. 8120 ERRORS 8122 None. 8124 15.2. Procedure 1: CB_COMPOUND - Compound Operations 8126 SYNOPSIS 8128 compoundargs -> compoundres 8130 ARGUMENT 8132 enum nfs_cb_opnum4 { 8133 OP_CB_GETATTR = 3, 8134 OP_CB_RECALL = 4 8135 }; 8137 union nfs_cb_argop4 switch (unsigned argop) { 8138 case OP_CB_GETATTR: CB_GETATTR4args opcbgetattr; 8139 case OP_CB_RECALL: CB_RECALL4args opcbrecall; 8140 }; 8142 struct CB_COMPOUND4args { 8143 utf8string tag; 8144 uint32_t minorversion; 8145 nfs_cb_argop4 argarray<>; 8146 }; 8148 RESULT 8150 union nfs_cb_resop4 switch (unsigned resop){ 8151 case OP_CB_GETATTR: CB_GETATTR4res opcbgetattr; 8152 case OP_CB_RECALL: CB_RECALL4res opcbrecall; 8153 }; 8155 struct CB_COMPOUND4res { 8156 nfsstat4 status; 8157 utf8string tag; 8158 nfs_cb_resop4 resarray<>; 8159 }; 8161 DESCRIPTION 8163 The CB_COMPOUND procedure is used to combine one or more of the 8164 callback procedures into a single RPC request. The main callback 8165 RPC program has two main procedures: CB_NULL and CB_COMPOUND. All 8166 other operations use the CB_COMPOUND procedure as a wrapper. 8168 In the processing of the CB_COMPOUND procedure, the client may find 8169 that it does not have the available resources to execute any or all 8170 of the operations within the CB_COMPOUND sequence. In this case, 8171 the error NFS4ERR_RESOURCE will be returned for the particular 8172 operation within the CB_COMPOUND procedure where the resource 8173 exhaustion occurred. This assumes that all previous operations 8174 within the CB_COMPOUND sequence have been evaluated successfully. 8176 Contained within the CB_COMPOUND results is a 'status' field. This 8177 status must be equivalent to the status of the last operation that 8178 was executed within the CB_COMPOUND procedure. Therefore, if an 8179 operation incurred an error then the 'status' value will be the 8180 same error value as is being returned for the operation that 8181 failed. 8183 IMPLEMENTATION 8185 The CB_COMPOUND procedure is used to combine individual operations 8186 into a single RPC request. The client interprets each of the 8187 operations in turn. If an operation is executed by the client and 8188 the status of that operation is NFS4_OK, then the next operation in 8189 the CB_COMPOUND procedure is executed. 
The client continues this 8190 process until there are no more operations to be executed or one of 8191 the operations has a status value other than NFS4_OK. 8193 ERRORS 8195 NFS4ERR_BADHANDLE 8196 NFS4ERR_BAD_STATEID 8197 NFS4ERR_RESOURCE 8199 15.2.1. Operation 3: CB_GETATTR - Get Attributes 8201 SYNOPSIS 8203 fh, attrbits -> attrbits, attrvals 8205 ARGUMENT 8207 struct CB_GETATTR4args { 8208 nfs_fh4 fh; 8209 bitmap4 attr_request; 8210 }; 8212 RESULT 8214 struct CB_GETATTR4resok { 8215 fattr4 obj_attributes; 8216 }; 8218 union CB_GETATTR4res switch (nfsstat4 status) { 8219 case NFS4_OK: 8220 CB_GETATTR4resok resok4; 8221 default: 8222 void; 8223 }; 8225 DESCRIPTION 8227 The CB_GETATTR operation is used to obtain the attributes modified 8228 by an open delegate to allow the server to respond to GETATTR 8229 requests for a file which is the subject of an open delegation. 8231 If the handle specified is not one for which the client holds a 8232 write open delegation, an NFS4ERR_BADHANDLE error is returned. 8234 IMPLEMENTATION 8236 The client returns attrbits and the associated attribute values 8237 only for attributes that it may change (change, time_modify, 8238 object_size). 8240 ERRORS 8242 NFS4ERR_BADHANDLE 8243 NFS4ERR_RESOURCE 8245 15.2.2. Operation 4: CB_RECALL - Recall an Open Delegation 8247 SYNOPSIS 8249 stateid, truncate, fh -> status 8251 ARGUMENT 8253 struct CB_RECALL4args { 8254 stateid4 stateid; 8255 bool truncate; 8256 nfs_fh4 fh; 8257 }; 8259 RESULT 8261 struct CB_RECALL4res { 8262 nfsstat4 status; 8263 }; 8265 DESCRIPTION 8267 The CB_RECALL operation is used to begin the process of recalling 8268 an open delegation and returning it to the server. 8270 The truncate flag is used to optimize recall for a file which is 8271 about to be truncated to zero. When it is set, the client is freed 8272 of obligation to propagate modified data for the file to the 8273 server, since this data is irrelevant. 8275 If the handle specified is not one for which the client holds an 8276 open delegation, an NFS4ERR_BADHANDLE error is returned. 8278 If the stateid specified is not one corresponding to an open 8279 delegation for the file specified by the filehandle, an 8280 NFS4ERR_BAD_STATEID is returned. 8282 IMPLEMENTATION 8284 The client should reply to the callback immediately. Replying does 8285 not complete the recall. The recall is not complete until the 8286 delegation is returned using a DELEGRETURN. 8288 ERRORS 8290 NFS4ERR_BADHANDLE 8291 NFS4ERR_BAD_STATEID 8292 NFS4ERR_RESOURCE 8294 16. Security Considerations 8296 The major security feature to consider is the authentication of the 8297 user making the request of NFS service. Consideration should also be 8298 given to the integrity and privacy of this NFS request. These 8299 specific issues are discussed as part of the section on "RPC and 8300 Security Flavor". 8302 17. IANA Considerations 8304 17.1. Named Attribute Definition 8306 The NFS version 4 protocol provides for the association of named 8307 attributes to files. The name space identifiers for these attributes 8308 are defined as string names. The protocol does not define the 8309 specific assignment of the name space for these file attributes; the 8310 application developer or system vendor is allowed to define the 8311 attribute, its semantics, and the associated name. 
Even though this 8312 name space will not be specifically controlled to prevent collisions, 8313 the application developer or system vendor is strongly encouraged to 8314 provide the name assignment and associated semantics for attributes 8315 via an Informational RFC. This will provide for interoperability 8316 where common interests exist. 8318 18. RPC definition file 8320 /* 8321 * Copyright (C) The Internet Society (1998,1999,2000). 8322 * All Rights Reserved. 8323 */ 8325 /* 8326 * nfs4_prot.x 8327 * 8328 */ 8330 %#pragma ident "@(#)nfs4_prot.x 1.97 00/06/12" 8332 /* 8333 * Basic typedefs for RFC 1832 data type definitions 8334 */ 8335 typedef int int32_t; 8336 typedef unsigned int uint32_t; 8337 typedef hyper int64_t; 8338 typedef unsigned hyper uint64_t; 8340 /* 8341 * Sizes 8342 */ 8343 const NFS4_FHSIZE = 128; 8344 const NFS4_VERIFIER_SIZE = 8; 8346 /* 8347 * File types 8348 */ 8349 enum nfs_ftype4 { 8350 NF4REG = 1, /* Regular File */ 8351 NF4DIR = 2, /* Directory */ 8352 NF4BLK = 3, /* Special File - block device */ 8353 NF4CHR = 4, /* Special File - character device */ 8354 NF4LNK = 5, /* Symbolic Link */ 8355 NF4SOCK = 6, /* Special File - socket */ 8356 NF4FIFO = 7, /* Special File - fifo */ 8357 NF4ATTRDIR = 8, /* Attribute Directory */ 8358 NF4NAMEDATTR = 9 /* Named Attribute */ 8359 }; 8361 /* 8362 * Error status 8363 */ 8364 enum nfsstat4 { 8365 NFS4_OK = 0, 8366 NFS4ERR_PERM = 1, 8367 NFS4ERR_NOENT = 2, 8368 NFS4ERR_IO = 5, 8369 NFS4ERR_NXIO = 6, 8370 NFS4ERR_ACCES = 13, 8371 NFS4ERR_EXIST = 17, 8372 NFS4ERR_XDEV = 18, 8373 NFS4ERR_NODEV = 19, 8374 NFS4ERR_NOTDIR = 20, 8375 NFS4ERR_ISDIR = 21, 8376 NFS4ERR_INVAL = 22, 8377 NFS4ERR_FBIG = 27, 8378 NFS4ERR_NOSPC = 28, 8379 NFS4ERR_ROFS = 30, 8380 NFS4ERR_MLINK = 31, 8381 NFS4ERR_NAMETOOLONG = 63, 8382 NFS4ERR_NOTEMPTY = 66, 8383 NFS4ERR_DQUOT = 69, 8384 NFS4ERR_STALE = 70, 8385 NFS4ERR_BADHANDLE = 10001, 8386 NFS4ERR_BAD_COOKIE = 10003, 8387 NFS4ERR_NOTSUPP = 10004, 8388 NFS4ERR_TOOSMALL = 10005, 8389 NFS4ERR_SERVERFAULT = 10006, 8390 NFS4ERR_BADTYPE = 10007, 8391 NFS4ERR_DELAY = 10008, 8392 NFS4ERR_SAME = 10009,/* nverify says attrs same */ 8393 NFS4ERR_DENIED = 10010,/* lock unavailable */ 8394 NFS4ERR_EXPIRED = 10011,/* lock lease expired */ 8395 NFS4ERR_LOCKED = 10012,/* I/O failed due to lock */ 8396 NFS4ERR_GRACE = 10013,/* in grace period */ 8397 NFS4ERR_FHEXPIRED = 10014,/* file handle expired */ 8398 NFS4ERR_SHARE_DENIED = 10015,/* share reserve denied */ 8399 NFS4ERR_WRONGSEC = 10016,/* wrong security flavor */ 8400 NFS4ERR_CLID_INUSE = 10017,/* clientid in use */ 8401 NFS4ERR_RESOURCE = 10018,/* resource exhaustion */ 8402 NFS4ERR_MOVED = 10019,/* filesystem relocated */ 8403 NFS4ERR_NOFILEHANDLE = 10020,/* current FH is not set */ 8404 NFS4ERR_MINOR_VERS_MISMATCH = 10021,/* minor vers not supp */ 8405 NFS4ERR_STALE_CLIENTID = 10022, 8406 NFS4ERR_STALE_STATEID = 10023, 8407 NFS4ERR_OLD_STATEID = 10024, 8408 NFS4ERR_BAD_STATEID = 10025, 8409 NFS4ERR_BAD_SEQID = 10026, 8410 NFS4ERR_NOT_SAME = 10027,/* verify - attrs not same */ 8411 NFS4ERR_LOCK_RANGE = 10028, 8412 NFS4ERR_SYMLINK = 10029, 8413 NFS4ERR_READDIR_NOSPC = 10030, 8414 NFS4ERR_LEASE_MOVED = 10031 8415 }; 8417 /* 8418 * Basic data types 8419 */ 8420 typedef uint32_t bitmap4<>; 8421 typedef uint64_t offset4; 8422 typedef uint32_t count4; 8423 typedef uint64_t length4; 8424 typedef uint64_t clientid4; 8425 typedef uint64_t stateid4; 8426 typedef uint32_t seqid4; 8427 typedef opaque utf8string<>; 8428 typedef utf8string component4; 8429 typedef component4 
pathname4<>; 8430 typedef uint64_t nfs_lockid4; 8431 typedef uint64_t nfs_cookie4; 8432 typedef utf8string linktext4; 8433 typedef opaque sec_oid4<>; 8434 typedef uint32_t qop4; 8435 typedef uint32_t mode4; 8436 typedef uint64_t changeid4; 8437 typedef opaque verifier4[NFS4_VERIFIER_SIZE]; 8439 /* 8440 * Timeval 8441 */ 8442 struct nfstime4 { 8443 int64_t seconds; 8444 uint32_t nseconds; 8445 }; 8447 enum time_how4 { 8448 SET_TO_SERVER_TIME4 = 0, 8449 SET_TO_CLIENT_TIME4 = 1 8450 }; 8452 union settime4 switch (time_how4 set_it) { 8453 case SET_TO_CLIENT_TIME4: 8454 nfstime4 time; 8455 default: 8456 void; 8457 }; 8459 /* 8460 * File access handle 8461 */ 8462 typedef opaque nfs_fh4; 8464 /* 8465 * File attribute definitions 8466 */ 8468 /* 8469 * FSID structure for major/minor 8470 */ 8471 struct fsid4 { 8472 uint64_t major; 8473 uint64_t minor; 8474 }; 8476 /* 8477 * Filesystem locations attribute for relocation/migration 8478 */ 8479 struct fs_location4 { 8480 utf8string server<>; 8481 pathname4 rootpath; 8482 }; 8484 struct fs_locations4 { 8485 pathname4 fs_root; 8486 fs_location4 locations<>; 8487 }; 8489 /* 8490 * Various Access Control Entry definitions 8491 */ 8493 /* 8494 * Mask that indicates which Access Control Entries are supported. 8495 * Values for the fattr4_aclsupport attribute. 8496 */ 8497 const ACL4_SUPPORT_ALLOW_ACL = 0x00000001; 8498 const ACL4_SUPPORT_DENY_ACL = 0x00000002; 8499 const ACL4_SUPPORT_AUDIT_ACL = 0x00000004; 8500 const ACL4_SUPPORT_ALARM_ACL = 0x00000008; 8502 typedef uint32_t acetype4; 8504 /* 8505 * acetype4 values, others can be added as needed. 8506 */ 8507 const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000; 8508 const ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001; 8509 const ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002; 8510 const ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003; 8512 /* 8513 * ACE flag 8514 */ 8515 typedef uint32_t aceflag4; 8517 /* 8518 * ACE flag values 8519 */ 8520 const ACE4_FILE_INHERIT_ACE = 0x00000001; 8521 const ACE4_DIRECTORY_INHERIT_ACE = 0x00000002; 8522 const ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004; 8523 const ACE4_INHERIT_ONLY_ACE = 0x00000008; 8524 const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG = 0x00000010; 8525 const ACE4_FAILED_ACCESS_ACE_FLAG = 0x00000020; 8526 const ACE4_IDENTIFIER_GROUP = 0x00000040; 8528 /* 8529 * ACE mask 8530 */ 8531 typedef uint32_t acemask4; 8533 /* 8534 * ACE mask values 8535 */ 8536 const ACE4_READ_DATA = 0x00000001; 8537 const ACE4_LIST_DIRECTORY = 0x00000001; 8538 const ACE4_WRITE_DATA = 0x00000002; 8539 const ACE4_ADD_FILE = 0x00000002; 8540 const ACE4_APPEND_DATA = 0x00000004; 8541 const ACE4_ADD_SUBDIRECTORY = 0x00000004; 8542 const ACE4_READ_NAMED_ATTRS = 0x00000008; 8543 const ACE4_WRITE_NAMED_ATTRS = 0x00000010; 8544 const ACE4_EXECUTE = 0x00000020; 8545 const ACE4_DELETE_CHILD = 0x00000040; 8546 const ACE4_READ_ATTRIBUTES = 0x00000080; 8547 const ACE4_WRITE_ATTRIBUTES = 0x00000100; 8549 const ACE4_DELETE = 0x00010000; 8550 const ACE4_READ_ACL = 0x00020000; 8551 const ACE4_WRITE_ACL = 0x00040000; 8552 const ACE4_WRITE_OWNER = 0x00080000; 8553 const ACE4_SYNCHRONIZE = 0x00100000; 8555 /* 8556 * ACE4_GENERIC_READ -- defined as combination of 8557 * ACE4_READ_ACL | 8558 * ACE4_READ_DATA | 8559 * ACE4_READ_ATTRIBUTES | 8560 * ACE4_SYNCHRONIZE 8561 */ 8563 const ACE4_GENERIC_READ = 0x00120081; 8565 /* 8566 * ACE4_GENERIC_WRITE -- defined as combination of 8567 * ACE4_READ_ACL | 8568 * ACE4_WRITE_DATA | 8569 * ACE4_WRITE_ATTRIBUTES | 8570 * ACE4_WRITE_ACL | 8571 * ACE4_APPEND_DATA | 8572 * ACE4_SYNCHRONIZE 8573 */ 
8575 const ACE4_GENERIC_WRITE = 0x00160106; 8577 /* 8578 * ACE4_GENERIC_EXECUTE -- defined as combination of 8579 * ACE4_READ_ACL 8580 * ACE4_READ_ATTRIBUTES 8581 * ACE4_EXECUTE 8582 * ACE4_SYNCHRONIZE 8583 */ 8584 const ACE4_GENERIC_EXECUTE = 0x001200A0; 8586 /* 8587 * Access Control Entry definition 8588 */ 8589 struct nfsace4 { 8590 acetype4 type; 8591 aceflag4 flag; 8592 acemask4 access_mask; 8593 utf8string who; 8594 }; 8596 /* 8597 * Special data/attribute associated with 8598 * file types NF4BLK and NF4CHR. 8599 */ 8600 struct specdata4 { 8601 uint32_t specdata1; 8602 uint32_t specdata2; 8603 }; 8605 /* 8606 * Values for fattr4_fh_expire_type 8607 */ 8608 const FH4_PERSISTENT = 0x00000000; 8609 const FH4_NOEXPIRE_WITH_OPEN = 0x00000001; 8610 const FH4_VOLATILE_ANY = 0x00000002; 8611 const FH4_VOL_MIGRATION = 0x00000004; 8612 const FH4_VOL_RENAME = 0x00000008; 8614 typedef bitmap4 fattr4_supported_attrs; 8615 typedef nfs_ftype4 fattr4_type; 8616 typedef uint32_t fattr4_fh_expire_type; 8617 typedef changeid4 fattr4_change; 8618 typedef uint64_t fattr4_size; 8619 typedef bool fattr4_link_support; 8620 typedef bool fattr4_symlink_support; 8621 typedef bool fattr4_named_attr; 8622 typedef fsid4 fattr4_fsid; 8623 typedef bool fattr4_unique_handles; 8624 typedef uint32_t fattr4_lease_time; 8625 typedef nfsstat4 fattr4_rdattr_error; 8627 typedef nfsace4 fattr4_acl<>; 8628 typedef uint32_t fattr4_aclsupport; 8629 typedef bool fattr4_archive; 8630 typedef bool fattr4_cansettime; 8631 typedef bool fattr4_case_insensitive; 8632 typedef bool fattr4_case_preserving; 8633 typedef bool fattr4_chown_restricted; 8634 typedef uint64_t fattr4_fileid; 8635 typedef uint64_t fattr4_files_avail; 8636 typedef nfs_fh4 fattr4_filehandle; 8637 typedef uint64_t fattr4_files_free; 8638 typedef uint64_t fattr4_files_total; 8639 typedef fs_locations4 fattr4_fs_locations; 8640 typedef bool fattr4_hidden; 8641 typedef bool fattr4_homogeneous; 8642 typedef uint64_t fattr4_maxfilesize; 8643 typedef uint32_t fattr4_maxlink; 8644 typedef uint32_t fattr4_maxname; 8645 typedef uint64_t fattr4_maxread; 8646 typedef uint64_t fattr4_maxwrite; 8647 typedef utf8string fattr4_mimetype; 8648 typedef mode4 fattr4_mode; 8649 typedef bool fattr4_no_trunc; 8650 typedef uint32_t fattr4_numlinks; 8651 typedef utf8string fattr4_owner; 8652 typedef utf8string fattr4_owner_group; 8653 typedef uint64_t fattr4_quota_hard; 8654 typedef uint64_t fattr4_quota_soft; 8655 typedef uint64_t fattr4_quota_used; 8656 typedef specdata4 fattr4_rawdev; 8657 typedef uint64_t fattr4_space_avail; 8658 typedef uint64_t fattr4_space_free; 8659 typedef uint64_t fattr4_space_total; 8660 typedef uint64_t fattr4_space_used; 8661 typedef bool fattr4_system; 8662 typedef nfstime4 fattr4_time_access; 8663 typedef settime4 fattr4_time_access_set; 8664 typedef nfstime4 fattr4_time_backup; 8665 typedef nfstime4 fattr4_time_create; 8666 typedef nfstime4 fattr4_time_delta; 8667 typedef nfstime4 fattr4_time_metadata; 8668 typedef nfstime4 fattr4_time_modify; 8669 typedef settime4 fattr4_time_modify_set; 8671 /* 8672 * Mandatory Attributes 8673 */ 8674 const FATTR4_SUPPORTED_ATTRS = 0; 8675 const FATTR4_TYPE = 1; 8676 const FATTR4_FH_EXPIRE_TYPE = 2; 8677 const FATTR4_CHANGE = 3; 8678 const FATTR4_SIZE = 4; 8679 const FATTR4_LINK_SUPPORT = 5; 8680 const FATTR4_SYMLINK_SUPPORT = 6; 8681 const FATTR4_NAMED_ATTR = 7; 8682 const FATTR4_FSID = 8; 8683 const FATTR4_UNIQUE_HANDLES = 9; 8684 const FATTR4_LEASE_TIME = 10; 8685 const FATTR4_RDATTR_ERROR = 11; 8687 /* 8688 * 
Recommended Attributes 8689 */ 8690 const FATTR4_ACL = 12; 8691 const FATTR4_ACLSUPPORT = 13; 8692 const FATTR4_ARCHIVE = 14; 8693 const FATTR4_CANSETTIME = 15; 8694 const FATTR4_CASE_INSENSITIVE = 16; 8695 const FATTR4_CASE_PRESERVING = 17; 8696 const FATTR4_CHOWN_RESTRICTED = 18; 8697 const FATTR4_FILEHANDLE = 19; 8698 const FATTR4_FILEID = 20; 8699 const FATTR4_FILES_AVAIL = 21; 8700 const FATTR4_FILES_FREE = 22; 8701 const FATTR4_FILES_TOTAL = 23; 8702 const FATTR4_FS_LOCATIONS = 24; 8703 const FATTR4_HIDDEN = 25; 8704 const FATTR4_HOMOGENEOUS = 26; 8705 const FATTR4_MAXFILESIZE = 27; 8706 const FATTR4_MAXLINK = 28; 8707 const FATTR4_MAXNAME = 29; 8708 const FATTR4_MAXREAD = 30; 8709 const FATTR4_MAXWRITE = 31; 8710 const FATTR4_MIMETYPE = 32; 8711 const FATTR4_MODE = 33; 8712 const FATTR4_NO_TRUNC = 34; 8713 const FATTR4_NUMLINKS = 35; 8714 const FATTR4_OWNER = 36; 8715 const FATTR4_OWNER_GROUP = 37; 8716 const FATTR4_QUOTA_HARD = 38; 8717 const FATTR4_QUOTA_SOFT = 39; 8718 const FATTR4_QUOTA_USED = 40; 8719 const FATTR4_RAWDEV = 41; 8720 const FATTR4_SPACE_AVAIL = 42; 8721 const FATTR4_SPACE_FREE = 43; 8722 const FATTR4_SPACE_TOTAL = 44; 8723 const FATTR4_SPACE_USED = 45; 8724 const FATTR4_SYSTEM = 46; 8725 const FATTR4_TIME_ACCESS = 47; 8726 const FATTR4_TIME_ACCESS_SET = 48; 8727 const FATTR4_TIME_BACKUP = 49; 8728 const FATTR4_TIME_CREATE = 50; 8729 const FATTR4_TIME_DELTA = 51; 8730 const FATTR4_TIME_METADATA = 52; 8731 const FATTR4_TIME_MODIFY = 53; 8732 const FATTR4_TIME_MODIFY_SET = 54; 8734 typedef opaque attrlist4<>; 8736 /* 8737 * File attribute container 8738 */ 8739 struct fattr4 { 8740 bitmap4 attrmask; 8741 attrlist4 attr_vals; 8742 }; 8744 /* 8745 * Change info for the client 8746 */ 8747 struct change_info4 { 8748 bool atomic; 8749 changeid4 before; 8750 changeid4 after; 8751 }; 8753 struct clientaddr4 { 8754 /* see struct rpcb in RFC 1833 */ 8755 string r_netid<>; /* network id */ 8756 string r_addr<>; /* universal address */ 8757 }; 8759 /* 8760 * Callback program info as provided by the client 8761 */ 8762 struct cb_client4 { 8763 unsigned int cb_program; 8764 clientaddr4 cb_location; 8765 }; 8767 /* 8768 * Client ID 8769 */ 8770 struct nfs_client_id4 { 8771 verifier4 verifier; 8772 opaque id<>; 8773 }; 8775 struct nfs_lockowner4 { 8776 clientid4 clientid; 8777 opaque owner<>; 8779 }; 8781 enum nfs_lock_type4 { 8782 READ_LT = 1, 8783 WRITE_LT = 2, 8784 READW_LT = 3, /* blocking read */ 8785 WRITEW_LT = 4 /* blocking write */ 8786 }; 8788 /* 8789 * ACCESS: Check access permission 8790 */ 8791 const ACCESS4_READ = 0x00000001; 8792 const ACCESS4_LOOKUP = 0x00000002; 8793 const ACCESS4_MODIFY = 0x00000004; 8794 const ACCESS4_EXTEND = 0x00000008; 8795 const ACCESS4_DELETE = 0x00000010; 8796 const ACCESS4_EXECUTE = 0x00000020; 8798 struct ACCESS4args { 8799 /* CURRENT_FH: object */ 8800 uint32_t access; 8801 }; 8803 struct ACCESS4resok { 8804 uint32_t supported; 8805 uint32_t access; 8806 }; 8808 union ACCESS4res switch (nfsstat4 status) { 8809 case NFS4_OK: 8810 ACCESS4resok resok4; 8811 default: 8812 void; 8813 }; 8815 /* 8816 * CLOSE: Close a file and release share locks 8817 */ 8818 struct CLOSE4args { 8819 /* CURRENT_FH: object */ 8820 seqid4 seqid; 8821 stateid4 stateid; 8822 }; 8824 union CLOSE4res switch (nfsstat4 status) { 8825 case NFS4_OK: 8826 stateid4 stateid; 8827 default: 8828 void; 8829 }; 8830 /* 8831 * COMMIT: Commit cached data on server to stable storage 8832 */ 8833 struct COMMIT4args { 8834 /* CURRENT_FH: file */ 8835 offset4 offset; 8836 count4 
count; 8837 }; 8839 struct COMMIT4resok { 8840 verifier4 writeverf; 8841 }; 8843 union COMMIT4res switch (nfsstat4 status) { 8844 case NFS4_OK: 8845 COMMIT4resok resok4; 8846 default: 8847 void; 8848 }; 8850 /* 8851 * CREATE: Create a file 8852 */ 8853 union createtype4 switch (nfs_ftype4 type) { 8854 case NF4LNK: 8855 linktext4 linkdata; 8856 case NF4BLK: 8857 case NF4CHR: 8858 specdata4 devdata; 8859 case NF4SOCK: 8860 case NF4FIFO: 8861 case NF4DIR: 8862 void; 8863 }; 8865 struct CREATE4args { 8866 /* CURRENT_FH: directory for creation */ 8867 component4 objname; 8868 createtype4 objtype; 8869 }; 8871 struct CREATE4resok { 8872 change_info4 cinfo; 8873 }; 8875 union CREATE4res switch (nfsstat4 status) { 8876 case NFS4_OK: 8877 CREATE4resok resok4; 8878 default: 8879 void; 8880 }; 8881 /* 8882 * DELEGPURGE: Purge Delegations Awaiting Recovery 8883 */ 8884 struct DELEGPURGE4args { 8885 clientid4 clientid; 8886 }; 8888 struct DELEGPURGE4res { 8889 nfsstat4 status; 8890 }; 8892 /* 8893 * DELEGRETURN: Return a delegation 8894 */ 8895 struct DELEGRETURN4args { 8896 stateid4 stateid; 8897 }; 8899 struct DELEGRETURN4res { 8900 nfsstat4 status; 8901 }; 8903 /* 8904 * GETATTR: Get file attributes 8905 */ 8906 struct GETATTR4args { 8907 /* CURRENT_FH: directory or file */ 8908 bitmap4 attr_request; 8909 }; 8911 struct GETATTR4resok { 8912 fattr4 obj_attributes; 8913 }; 8915 union GETATTR4res switch (nfsstat4 status) { 8916 case NFS4_OK: 8917 GETATTR4resok resok4; 8918 default: 8919 void; 8920 }; 8922 /* 8923 * GETFH: Get current filehandle 8924 */ 8925 struct GETFH4resok { 8926 nfs_fh4 object; 8927 }; 8929 union GETFH4res switch (nfsstat4 status) { 8930 case NFS4_OK: 8931 GETFH4resok resok4; 8932 default: 8934 void; 8935 }; 8937 /* 8938 * LINK: Create link to an object 8939 */ 8940 struct LINK4args { 8941 /* SAVED_FH: source object */ 8942 /* CURRENT_FH: target directory */ 8943 component4 newname; 8944 }; 8946 struct LINK4resok { 8947 change_info4 cinfo; 8948 }; 8950 union LINK4res switch (nfsstat4 status) { 8951 case NFS4_OK: 8952 LINK4resok resok4; 8953 default: 8954 void; 8955 }; 8957 /* 8958 * LOCK/LOCKT/LOCKU: Record lock management 8959 */ 8960 struct LOCK4args { 8961 /* CURRENT_FH: file */ 8962 nfs_lock_type4 locktype; 8963 seqid4 seqid; 8964 bool reclaim; 8965 stateid4 stateid; 8966 offset4 offset; 8967 length4 length; 8968 }; 8970 struct LOCK4denied { 8971 nfs_lockowner4 owner; 8972 offset4 offset; 8973 length4 length; 8974 }; 8976 union LOCK4res switch (nfsstat4 status) { 8977 case NFS4_OK: 8978 stateid4 stateid; 8979 case NFS4ERR_DENIED: 8980 LOCK4denied denied; 8981 default: 8982 void; 8983 }; 8985 struct LOCKT4args { 8986 /* CURRENT_FH: file */ 8987 nfs_lock_type4 locktype; 8988 nfs_lockowner4 owner; 8989 offset4 offset; 8990 length4 length; 8991 }; 8993 union LOCKT4res switch (nfsstat4 status) { 8994 case NFS4ERR_DENIED: 8995 LOCK4denied denied; 8996 case NFS4_OK: 8997 void; 8998 default: 8999 void; 9000 }; 9002 struct LOCKU4args { 9003 /* CURRENT_FH: file */ 9004 nfs_lock_type4 type; 9005 seqid4 seqid; 9006 stateid4 stateid; 9007 offset4 offset; 9008 length4 length; 9009 }; 9011 union LOCKU4res switch (nfsstat4 status) { 9012 case NFS4_OK: 9013 stateid4 stateid; 9014 default: 9015 void; 9016 }; 9018 /* 9019 * LOOKUP: Lookup filename 9020 */ 9021 struct LOOKUP4args { 9022 /* CURRENT_FH: directory */ 9023 pathname4 path; 9024 }; 9026 struct LOOKUP4res { 9027 /* CURRENT_FH: object */ 9028 nfsstat4 status; 9029 }; 9031 /* 9032 * LOOKUPP: Lookup parent directory 9033 */ 9034 struct 
LOOKUPP4res { 9035 /* CURRENT_FH: directory */ 9036 nfsstat4 status; 9037 }; 9038 /* 9039 * NVERIFY: Verify attributes different 9040 */ 9041 struct NVERIFY4args { 9042 /* CURRENT_FH: object */ 9043 fattr4 obj_attributes; 9044 }; 9046 struct NVERIFY4res { 9047 nfsstat4 status; 9048 }; 9050 /* 9051 * Various definitions for OPEN 9052 */ 9053 enum createmode4 { 9054 UNCHECKED4 = 0, 9055 GUARDED4 = 1, 9056 EXCLUSIVE4 = 2 9057 }; 9059 union createhow4 switch (createmode4 mode) { 9060 case UNCHECKED4: 9061 case GUARDED4: 9062 fattr4 createattrs; 9063 case EXCLUSIVE4: 9064 verifier4 createverf; 9065 }; 9067 enum opentype4 { 9068 OPEN4_NOCREATE = 0, 9069 OPEN4_CREATE = 1 9070 }; 9072 union openflag4 switch (opentype4 opentype) { 9073 case OPEN4_CREATE: 9074 createhow4 how; 9075 default: 9076 void; 9077 }; 9079 /* Next definitions used for OPEN delegation */ 9080 enum limit_by4 { 9081 NFS_LIMIT_SIZE = 1, 9082 NFS_LIMIT_BLOCKS = 2 9083 /* others as needed */ 9084 }; 9086 struct nfs_modified_limit4 { 9087 uint32_t num_blocks; 9088 uint32_t bytes_per_block; 9089 }; 9090 union nfs_space_limit4 switch (limit_by4 limitby) { 9091 /* limit specified as file size */ 9092 case NFS_LIMIT_SIZE: 9093 uint64_t filesize; 9094 /* limit specified by number of blocks */ 9095 case NFS_LIMIT_BLOCKS: 9096 nfs_modified_limit4 mod_blocks; 9097 } ; 9099 /* 9100 * Share Access and Deny constants for open argument 9101 */ 9102 const OPEN4_SHARE_ACCESS_READ = 0x00000001; 9103 const OPEN4_SHARE_ACCESS_WRITE = 0x00000002; 9104 const OPEN4_SHARE_ACCESS_BOTH = 0x00000003; 9106 const OPEN4_SHARE_DENY_NONE = 0x00000000; 9107 const OPEN4_SHARE_DENY_READ = 0x00000001; 9108 const OPEN4_SHARE_DENY_WRITE = 0x00000002; 9109 const OPEN4_SHARE_DENY_BOTH = 0x00000003; 9111 enum open_delegation_type4 { 9112 OPEN_DELEGATE_NONE = 0, 9113 OPEN_DELEGATE_READ = 1, 9114 OPEN_DELEGATE_WRITE = 2 9115 }; 9117 enum open_claim_type4 { 9118 CLAIM_NULL = 0, 9119 CLAIM_PREVIOUS = 1, 9120 CLAIM_DELEGATE_CUR = 2, 9121 CLAIM_DELEGATE_PREV = 3 9122 }; 9124 struct open_claim_delegate_cur4 { 9125 pathname4 file; 9126 stateid4 delegate_stateid; 9127 }; 9129 union open_claim4 switch (open_claim_type4 claim) { 9130 /* 9131 * No special rights to file. Ordinary OPEN of the specified file. 9132 */ 9133 case CLAIM_NULL: 9134 /* CURRENT_FH: directory */ 9135 pathname4 file; 9137 /* 9138 * Right to the file established by an open previous to server 9139 * reboot. File identified by filehandle obtained at that time 9140 * rather than by name. 9141 */ 9143 case CLAIM_PREVIOUS: 9144 /* CURRENT_FH: file being reclaimed */ 9145 uint32_t delegate_type; 9147 /* 9148 * Right to file based on a delegation granted by the server. 9149 * File is specified by name. 9150 */ 9151 case CLAIM_DELEGATE_CUR: 9152 /* CURRENT_FH: directory */ 9153 open_claim_delegate_cur4 delegate_cur_info; 9155 /* Right to file based on a delegation granted to a previous boot 9156 * instance of the client. File is specified by name. 
9157 */ 9158 case CLAIM_DELEGATE_PREV: 9159 /* CURRENT_FH: directory */ 9160 pathname4 file_delegate_prev; 9161 }; 9163 /* 9164 * OPEN: Open a file, potentially receiving an open delegation 9165 */ 9166 struct OPEN4args { 9167 open_claim4 claim; 9168 openflag4 openhow; 9169 nfs_lockowner4 owner; 9170 seqid4 seqid; 9171 uint32_t share_access; 9172 uint32_t share_deny; 9173 }; 9175 struct open_read_delegation4 { 9176 stateid4 stateid; /* Stateid for delegation*/ 9177 bool recall; /* Pre-recalled flag for 9178 delegations obtained 9179 by reclaim 9180 (CLAIM_PREVIOUS) */ 9181 nfsace4 permissions; /* Defines users who don't 9182 need an ACCESS call to 9183 open for read */ 9184 }; 9186 struct open_write_delegation4 { 9187 stateid4 stateid; /* Stateid for delegation */ 9188 bool recall; /* Pre-recalled flag for 9189 delegations obtained 9190 by reclaim 9191 (CLAIM_PREVIOUS) */ 9192 nfs_space_limit4 space_limit; /* Defines condition that 9193 the client must check to 9194 determine whether the 9195 file needs to be flushed 9196 to the server on close. 9197 */ 9198 nfsace4 permissions; /* Defines users who don't 9199 need an ACCESS call as 9200 part of a delegated 9201 open. */ 9202 }; 9204 union open_delegation4 9205 switch (open_delegation_type4 delegation_type) { 9206 case OPEN_DELEGATE_NONE: 9207 void; 9208 case OPEN_DELEGATE_READ: 9209 open_read_delegation4 read; 9210 case OPEN_DELEGATE_WRITE: 9211 open_write_delegation4 write; 9212 }; 9214 /* 9215 * Result flags 9216 */ 9217 /* Mandatory locking is in effect for this file. */ 9218 const OPEN4_RESULT_MLOCK = 0x00000001; 9219 /* Client must confirm open */ 9220 const OPEN4_RESULT_CONFIRM = 0x00000002; 9222 struct OPEN4resok { 9223 stateid4 stateid; /* Stateid for open */ 9224 change_info4 cinfo; /* Directory Change Info */ 9225 uint32_t rflags; /* Result flags */ 9226 verifier4 open_confirm; /* OPEN_CONFIRM verifier */ 9227 open_delegation4 delegation; /* Info on any open 9228 delegation */ 9229 }; 9231 union OPEN4res switch (nfsstat4 status) { 9232 case NFS4_OK: 9233 /* CURRENT_FH: opened file */ 9234 OPEN4resok result; 9235 default: 9236 void; 9237 }; 9239 /* 9240 * OPENATTR: open named attributes directory 9241 */ 9242 struct OPENATTR4res { 9243 /* CURRENT_FH: name attr directory*/ 9244 nfsstat4 status; 9245 }; 9246 /* 9247 * OPEN_CONFIRM: confirm the open 9248 */ 9249 struct OPEN_CONFIRM4args { 9250 /* CURRENT_FH: opened file */ 9251 seqid4 seqid; 9252 verifier4 open_confirm; /* OPEN_CONFIRM verifier */ 9253 }; 9255 struct OPEN_CONFIRM4resok { 9256 stateid4 stateid; 9257 }; 9259 union OPEN_CONFIRM4res switch (nfsstat4 status) { 9260 case NFS4_OK: 9261 OPEN_CONFIRM4resok resok4; 9262 default: 9263 void; 9264 }; 9266 /* 9267 * OPEN_DOWNGRADE: downgrade the access/deny for a file 9268 */ 9269 struct OPEN_DOWNGRADE4args { 9270 /* CURRENT_FH: opened file */ 9271 stateid4 stateid; 9272 seqid4 seqid; 9273 uint32_t share_access; 9274 uint32_t share_deny; 9275 }; 9277 struct OPEN_DOWNGRADE4resok { 9278 stateid4 stateid; 9279 }; 9281 union OPEN_DOWNGRADE4res switch(nfsstat4 status) { 9282 case NFS4_OK: 9283 OPEN_DOWNGRADE4resok resok4; 9284 default: 9285 void; 9286 }; 9288 /* 9289 * PUTFH: Set current filehandle 9290 */ 9291 struct PUTFH4args { 9292 nfs_fh4 object; 9293 }; 9295 struct PUTFH4res { 9296 /* CURRENT_FH: */ 9297 nfsstat4 status; 9299 }; 9301 /* 9302 * PUTPUBFH: Set public filehandle 9303 */ 9304 struct PUTPUBFH4res { 9305 /* CURRENT_FH: public fh */ 9306 nfsstat4 status; 9307 }; 9309 /* 9310 * PUTROOTFH: Set root filehandle 9311 */ 9312 
struct PUTROOTFH4res { 9313 /* CURRENT_FH: root fh */ 9314 nfsstat4 status; 9315 }; 9317 /* 9318 * READ: Read from file 9319 */ 9320 struct READ4args { 9321 /* CURRENT_FH: file */ 9322 stateid4 stateid; 9323 offset4 offset; 9324 count4 count; 9325 }; 9327 struct READ4resok { 9328 bool eof; 9329 opaque data<>; 9330 }; 9332 union READ4res switch (nfsstat4 status) { 9333 case NFS4_OK: 9334 READ4resok resok4; 9335 default: 9336 void; 9337 }; 9339 /* 9340 * READDIR: Read directory 9341 */ 9342 struct READDIR4args { 9343 /* CURRENT_FH: directory */ 9344 nfs_cookie4 cookie; 9345 verifier4 cookieverf; 9346 count4 dircount; 9347 count4 maxcount; 9348 bitmap4 attr_request; 9349 }; 9350 struct entry4 { 9351 nfs_cookie4 cookie; 9352 component4 name; 9353 fattr4 attrs; 9354 entry4 *nextentry; 9355 }; 9357 struct dirlist4 { 9358 entry4 *entries; 9359 bool eof; 9360 }; 9362 struct READDIR4resok { 9363 verifier4 cookieverf; 9364 dirlist4 reply; 9365 }; 9367 union READDIR4res switch (nfsstat4 status) { 9368 case NFS4_OK: 9369 READDIR4resok resok4; 9370 default: 9371 void; 9372 }; 9374 /* 9375 * READLINK: Read symbolic link 9376 */ 9377 struct READLINK4resok { 9378 linktext4 link; 9379 }; 9381 union READLINK4res switch (nfsstat4 status) { 9382 case NFS4_OK: 9383 READLINK4resok resok4; 9384 default: 9385 void; 9386 }; 9388 /* 9389 * REMOVE: Remove filesystem object 9390 */ 9391 struct REMOVE4args { 9392 /* CURRENT_FH: directory */ 9393 component4 target; 9394 }; 9396 struct REMOVE4resok { 9397 change_info4 cinfo; 9398 }; 9399 union REMOVE4res switch (nfsstat4 status) { 9400 case NFS4_OK: 9401 REMOVE4resok resok4; 9402 default: 9403 void; 9404 }; 9406 /* 9407 * RENAME: Rename directory entry 9408 */ 9409 struct RENAME4args { 9410 /* SAVED_FH: source directory */ 9411 component4 oldname; 9412 /* CURRENT_FH: target directory */ 9413 component4 newname; 9414 }; 9416 struct RENAME4resok { 9417 change_info4 source_cinfo; 9418 change_info4 target_cinfo; 9419 }; 9421 union RENAME4res switch (nfsstat4 status) { 9422 case NFS4_OK: 9423 RENAME4resok resok4; 9424 default: 9425 void; 9426 }; 9428 /* 9429 * RENEW: Renew a Lease 9430 */ 9431 struct RENEW4args { 9432 stateid4 stateid; 9433 }; 9435 struct RENEW4res { 9436 nfsstat4 status; 9437 }; 9439 /* 9440 * RESTOREFH: Restore saved filehandle 9441 */ 9443 struct RESTOREFH4res { 9444 /* CURRENT_FH: value of saved fh */ 9445 nfsstat4 status; 9446 }; 9448 /* 9449 * SAVEFH: Save current filehandle 9450 */ 9452 struct SAVEFH4res { 9453 /* SAVED_FH: value of current fh */ 9454 nfsstat4 status; 9455 }; 9457 /* 9458 * SECINFO: Obtain Available Security Mechanisms 9459 */ 9460 struct SECINFO4args { 9461 /* CURRENT_FH: */ 9462 component4 name; 9463 }; 9465 /* 9466 * From RFC 2203 9467 */ 9468 enum rpc_gss_svc_t { 9469 RPC_GSS_SVC_NONE = 1, 9470 RPC_GSS_SVC_INTEGRITY = 2, 9471 RPC_GSS_SVC_PRIVACY = 3 9472 }; 9474 struct rpcsec_gss_info { 9475 sec_oid4 oid; 9476 qop4 qop; 9477 rpc_gss_svc_t service; 9478 }; 9480 struct secinfo4 { 9481 uint32_t flavor; 9482 /* null for AUTH_SYS, AUTH_NONE; 9483 contains rpcsec_gss_info for 9484 RPCSEC_GSS. 
*/ 9485 opaque flavor_info<>; 9486 }; 9488 typedef secinfo4 SECINFO4resok<>; 9490 union SECINFO4res switch (nfsstat4 status) { 9491 case NFS4_OK: 9492 SECINFO4resok resok4; 9493 default: 9494 void; 9495 }; 9497 /* 9498 * SETATTR: Set attributes 9499 */ 9500 struct SETATTR4args { 9501 /* CURRENT_FH: target object */ 9502 stateid4 stateid; 9503 fattr4 obj_attributes; 9505 }; 9507 struct SETATTR4res { 9508 nfsstat4 status; 9509 bitmap4 attrsset; 9510 }; 9512 /* 9513 * SETCLIENTID 9514 */ 9515 struct SETCLIENTID4args { 9516 nfs_client_id4 client; 9517 cb_client4 callback; 9518 }; 9520 struct SETCLIENTID4resok { 9521 clientid4 clientid; 9522 verifier4 setclientid_confirm; 9523 }; 9525 union SETCLIENTID4res switch (nfsstat4 status) { 9526 case NFS4_OK: 9527 SETCLIENTID4resok resok4; 9528 case NFS4ERR_CLID_INUSE: 9529 clientaddr4 client_using; 9530 default: 9531 void; 9532 }; 9534 struct SETCLIENTID_CONFIRM4args { 9535 verifier4 setclientid_confirm; 9536 }; 9538 struct SETCLIENTID_CONFIRM4res { 9539 nfsstat4 status; 9540 }; 9542 /* 9543 * VERIFY: Verify attributes same 9544 */ 9545 struct VERIFY4args { 9546 /* CURRENT_FH: object */ 9547 fattr4 obj_attributes; 9548 }; 9550 struct VERIFY4res { 9551 nfsstat4 status; 9552 }; 9554 /* 9555 * WRITE: Write to file 9556 */ 9558 enum stable_how4 { 9559 UNSTABLE4 = 0, 9560 DATA_SYNC4 = 1, 9561 FILE_SYNC4 = 2 9562 }; 9564 struct WRITE4args { 9565 /* CURRENT_FH: file */ 9566 stateid4 stateid; 9567 offset4 offset; 9568 stable_how4 stable; 9569 opaque data<>; 9570 }; 9572 struct WRITE4resok { 9573 count4 count; 9574 stable_how4 committed; 9575 verifier4 writeverf; 9576 }; 9578 union WRITE4res switch (nfsstat4 status) { 9579 case NFS4_OK: 9580 WRITE4resok resok4; 9581 default: 9582 void; 9583 }; 9585 /* 9586 * Operation arrays 9587 */ 9589 enum nfs_opnum4 { 9590 OP_ACCESS = 3, 9591 OP_CLOSE = 4, 9592 OP_COMMIT = 5, 9593 OP_CREATE = 6, 9594 OP_DELEGPURGE = 7, 9595 OP_DELEGRETURN = 8, 9596 OP_GETATTR = 9, 9597 OP_GETFH = 10, 9598 OP_LINK = 11, 9599 OP_LOCK = 12, 9600 OP_LOCKT = 13, 9601 OP_LOCKU = 14, 9602 OP_LOOKUP = 15, 9603 OP_LOOKUPP = 16, 9604 OP_NVERIFY = 17, 9605 OP_OPEN = 18, 9606 OP_OPENATTR = 19, 9607 OP_OPEN_CONFIRM = 20, 9608 OP_OPEN_DOWNGRADE = 21, 9609 OP_PUTFH = 22, 9610 OP_PUTPUBFH = 23, 9611 OP_PUTROOTFH = 24, 9612 OP_READ = 25, 9613 OP_READDIR = 26, 9614 OP_READLINK = 27, 9615 OP_REMOVE = 28, 9616 OP_RENAME = 29, 9617 OP_RENEW = 30, 9618 OP_RESTOREFH = 31, 9619 OP_SAVEFH = 32, 9620 OP_SECINFO = 33, 9621 OP_SETATTR = 34, 9622 OP_SETCLIENTID = 35, 9623 OP_SETCLIENTID_CONFIRM = 36, 9624 OP_VERIFY = 37, 9625 OP_WRITE = 38 9626 }; 9628 union nfs_argop4 switch (nfs_opnum4 argop) { 9629 case OP_ACCESS: ACCESS4args opaccess; 9630 case OP_CLOSE: CLOSE4args opclose; 9631 case OP_COMMIT: COMMIT4args opcommit; 9632 case OP_CREATE: CREATE4args opcreate; 9633 case OP_DELEGPURGE: DELEGPURGE4args opdelegpurge; 9634 case OP_DELEGRETURN: DELEGRETURN4args opdelegreturn; 9635 case OP_GETATTR: GETATTR4args opgetattr; 9636 case OP_GETFH: void; 9637 case OP_LINK: LINK4args oplink; 9638 case OP_LOCK: LOCK4args oplock; 9639 case OP_LOCKT: LOCK4args oplockt; 9640 case OP_LOCKU: LOCK4args oplocku; 9641 case OP_LOOKUP: LOOKUP4args oplookup; 9642 case OP_LOOKUPP: void; 9643 case OP_NVERIFY: NVERIFY4args opnverify; 9644 case OP_OPEN: OPEN4args opopen; 9645 case OP_OPENATTR: void; 9646 case OP_OPEN_CONFIRM: OPEN_CONFIRM4args opopen_confirm; 9647 case OP_OPEN_DOWNGRADE: OPEN_DOWNGRADE4args opopen_downgrade; 9648 case OP_PUTFH: PUTFH4args opputfh; 9649 case OP_PUTPUBFH: void; 
9650 case OP_PUTROOTFH: void; 9651 case OP_READ: READ4args opread; 9652 case OP_READDIR: READDIR4args opreaddir; 9653 case OP_READLINK: void; 9654 case OP_REMOVE: REMOVE4args opremove; 9655 case OP_RENAME: RENAME4args oprename; 9656 case OP_RENEW: RENEW4args oprenew; 9657 case OP_RESTOREFH: void; 9658 case OP_SAVEFH: void; 9659 case OP_SECINFO: SECINFO4args opsecinfo; 9660 case OP_SETATTR: SETATTR4args opsetattr; 9661 case OP_SETCLIENTID: SETCLIENTID4args opsetclientid; 9662 case OP_SETCLIENTID_CONFIRM: SETCLIENTID_CONFIRM4args 9663 opsetclientid_confirm; 9664 case OP_VERIFY: VERIFY4args opverify; 9665 case OP_WRITE: WRITE4args opwrite; 9666 }; 9668 union nfs_resop4 switch (nfs_opnum4 resop){ 9669 case OP_ACCESS: ACCESS4res opaccess; 9670 case OP_CLOSE: CLOSE4res opclose; 9671 case OP_COMMIT: COMMIT4res opcommit; 9672 case OP_CREATE: CREATE4res opcreate; 9673 case OP_DELEGPURGE: DELEGPURGE4res opdelegpurge; 9674 case OP_DELEGRETURN: DELEGRETURN4res opdelegreturn; 9675 case OP_GETATTR: GETATTR4res opgetattr; 9676 case OP_GETFH: GETFH4res opgetfh; 9677 case OP_LINK: LINK4res oplink; 9678 case OP_LOCK: LOCK4res oplock; 9679 case OP_LOCKT: LOCKT4res oplockt; 9680 case OP_LOCKU: LOCKU4res oplocku; 9681 case OP_LOOKUP: LOOKUP4res oplookup; 9682 case OP_LOOKUPP: LOOKUPP4res oplookupp; 9683 case OP_NVERIFY: NVERIFY4res opnverify; 9684 case OP_OPEN: OPEN4res opopen; 9685 case OP_OPENATTR: OPENATTR4res opopenattr; 9686 case OP_OPEN_CONFIRM: OPEN_CONFIRM4res opopen_confirm; 9687 case OP_OPEN_DOWNGRADE: OPEN_DOWNGRADE4res opopen_downgrade; 9688 case OP_PUTFH: PUTFH4res opputfh; 9689 case OP_PUTPUBFH: PUTPUBFH4res opputpubfh; 9690 case OP_PUTROOTFH: PUTROOTFH4res opputrootfh; 9691 case OP_READ: READ4res opread; 9692 case OP_READDIR: READDIR4res opreaddir; 9693 case OP_READLINK: READLINK4res opreadlink; 9694 case OP_REMOVE: REMOVE4res opremove; 9695 case OP_RENAME: RENAME4res oprename; 9696 case OP_RENEW: RENEW4res oprenew; 9697 case OP_RESTOREFH: RESTOREFH4res oprestorefh; 9698 case OP_SAVEFH: SAVEFH4res opsavefh; 9699 case OP_SECINFO: SECINFO4res opsecinfo; 9700 case OP_SETATTR: SETATTR4res opsetattr; 9701 case OP_SETCLIENTID: SETCLIENTID4res opsetclientid; 9702 case OP_SETCLIENTID_CONFIRM: SETCLIENTID_CONFIRM4res 9703 opsetclientid_confirm; 9704 case OP_VERIFY: VERIFY4res opverify; 9705 case OP_WRITE: WRITE4res opwrite; 9706 }; 9708 struct COMPOUND4args { 9709 utf8string tag; 9710 uint32_t minorversion; 9711 nfs_argop4 argarray<>; 9712 }; 9713 struct COMPOUND4res { 9714 nfsstat4 status; 9715 utf8string tag; 9716 nfs_resop4 resarray<>; 9717 }; 9719 /* 9720 * Remote file service routines 9721 */ 9722 program NFS4_PROGRAM { 9723 version NFS_V4 { 9724 void 9725 NFSPROC4_NULL(void) = 0; 9727 COMPOUND4res 9728 NFSPROC4_COMPOUND(COMPOUND4args) = 1; 9730 } = 4; 9731 } = 100003; 9733 /* 9734 * NFS4 Callback Procedure Definitions and Program 9735 */ 9737 /* 9738 * CB_GETATTR: Get Current Attributes 9739 */ 9740 struct CB_GETATTR4args { 9741 nfs_fh4 fh; 9742 bitmap4 attr_request; 9743 }; 9745 struct CB_GETATTR4resok { 9746 fattr4 obj_attributes; 9747 }; 9749 union CB_GETATTR4res switch (nfsstat4 status) { 9750 case NFS4_OK: 9751 CB_GETATTR4resok resok4; 9752 default: 9753 void; 9754 }; 9756 /* 9757 * CB_RECALL: Recall an Open Delegation 9758 */ 9759 struct CB_RECALL4args { 9760 stateid4 stateid; 9761 bool truncate; 9762 nfs_fh4 fh; 9764 }; 9766 struct CB_RECALL4res { 9767 nfsstat4 status; 9768 }; 9770 /* 9771 * Various definitions for CB_COMPOUND 9772 */ 9773 enum nfs_cb_opnum4 { 9774 OP_CB_GETATTR = 3, 9775 
OP_CB_RECALL = 4 9776 }; 9778 union nfs_cb_argop4 switch (unsigned argop) { 9779 case OP_CB_GETATTR: CB_GETATTR4args opcbgetattr; 9780 case OP_CB_RECALL: CB_RECALL4args opcbrecall; 9781 }; 9783 union nfs_cb_resop4 switch (unsigned resop){ 9784 case OP_CB_GETATTR: CB_GETATTR4res opcbgetattr; 9785 case OP_CB_RECALL: CB_RECALL4res opcbrecall; 9786 }; 9788 struct CB_COMPOUND4args { 9789 utf8string tag; 9790 uint32_t minorversion; 9791 nfs_cb_argop4 argarray<>; 9792 }; 9794 struct CB_COMPOUND4res { 9795 nfsstat4 status; 9796 utf8string tag; 9797 nfs_cb_resop4 resarray<>; 9798 }; 9800 /* 9801 * Program number is in the transient range since the client 9802 * will assign the exact transient program number and provide 9803 * that to the server via the SETCLIENTID operation. 9804 */ 9805 program NFS4_CALLBACK { 9806 version NFS_CB { 9807 void 9808 CB_NULL(void) = 0; 9809 CB_COMPOUND4res 9810 CB_COMPOUND(CB_COMPOUND4args) = 1; 9811 } = 1; 9812 } = 40000000; 9814 19. Bibliography 9816 [Floyd] 9817 S. Floyd, V. Jacobson, "The Synchronization of Periodic Routing 9818 Messages," IEEE/ACM Transactions on Networking, 2(2), pp. 122-136, 9819 April 1994. 9821 [Gray] 9822 C. Gray, D. Cheriton, "Leases: An Efficient Fault-Tolerant Mechanism 9823 for Distributed File Cache Consistency," Proceedings of the Twelfth 9824 Symposium on Operating Systems Principles, p. 202-210, December 1989. 9826 [ISO10646] 9827 "ISO/IEC 10646-1:1993. International Standard -- Information 9828 technology -- Universal Multiple-Octet Coded Character Set (UCS) -- 9829 Part 1: Architecture and Basic Multilingual Plane." 9831 [Juszczak] 9832 Juszczak, Chet, "Improving the Performance and Correctness of an NFS 9833 Server," USENIX Conference Proceedings, USENIX Association, Berkeley, 9834 CA, June 1990, pages 53-63. Describes reply cache implementation 9835 that avoids work in the server by handling duplicate requests. More 9836 important, though listed as a side-effect, the reply cache aids in 9837 the avoidance of destructive non-idempotent operation re-application 9838 -- improving correctness. 9840 [Kazar] 9841 Kazar, Michael Leon, "Synchronization and Caching Issues in the 9842 Andrew File System," USENIX Conference Proceedings, USENIX 9843 Association, Berkeley, CA, Dallas Winter 1988, pages 27-36. A 9844 description of the cache consistency scheme in AFS. Contrasted with 9845 other distributed file systems. 9847 [Macklem] 9848 Macklem, Rick, "Lessons Learned Tuning the 4.3BSD Reno Implementation 9849 of the NFS Protocol," Winter USENIX Conference Proceedings, USENIX 9850 Association, Berkeley, CA, January 1991. Describes performance work 9851 in tuning the 4.3BSD Reno NFS implementation. Describes performance 9852 improvement (reduced CPU loading) through elimination of data copies. 9854 [Mogul] 9855 Mogul, Jeffrey C., "A Recovery Protocol for Spritely NFS," USENIX 9856 File System Workshop Proceedings, Ann Arbor, MI, USENIX Association, 9857 Berkeley, CA, May 1992. Second paper on Spritely NFS proposes a 9858 lease-based scheme for recovering state of consistency protocol. 9860 [Nowicki] 9861 Nowicki, Bill, "Transport Issues in the Network File System," ACM 9862 SIGCOMM newsletter Computer Communication Review, April 1989. A 9863 brief description of the basis for the dynamic retransmission work. 9865 [Pawlowski] 9866 Pawlowski, Brian, Ron Hixon, Mark Stein, Joseph Tumminaro, "Network 9867 Computing in the UNIX and IBM Mainframe Environment," Uniforum `89 9868 Conf. 
Proc., (1989). Description of an NFS server implementation for 9869 IBM's MVS operating system. 9871 [RFC1094] 9872 Sun Microsystems, Inc., "NFS: Network File System Protocol 9873 Specification", RFC1094, March 1989. 9875 http://www.ietf.org/rfc/rfc1094.txt 9877 [RFC1345] 9878 Simonsen, K., "Character Mnemonics & Character Sets", RFC1345, 9879 Rationel Almen Planlaegning, June 1992. 9881 http://www.ietf.org/rfc/rfc1345.txt 9883 [RFC1700] 9884 Reynolds, J., Postel, J., "Assigned Numbers", RFC1700, ISI, October 9885 1994. 9887 http://www.ietf.org/rfc/rfc1700.txt 9889 [RFC1813] 9890 Callaghan, B., Pawlowski, B., Staubach, P., "NFS Version 3 Protocol 9891 Specification", RFC1813, Sun Microsystems, Inc., June 1995. 9893 http://www.ietf.org/rfc/rfc1813.txt 9895 [RFC1831] 9896 Srinivasan, R., "RPC: Remote Procedure Call Protocol Specification 9897 Version 2", RFC1831, Sun Microsystems, Inc., August 1995. 9899 http://www.ietf.org/rfc/rfc1831.txt 9901 [RFC1832] 9902 Srinivasan, R., "XDR: External Data Representation Standard", 9903 RFC1832, Sun Microsystems, Inc., August 1995. 9905 http://www.ietf.org/rfc/rfc1832.txt 9907 [RFC1833] 9908 Srinivasan, R., "Binding Protocols for ONC RPC Version 2", RFC1833, 9909 Sun Microsystems, Inc., August 1995. 9911 http://www.ietf.org/rfc/rfc1833.txt 9913 [RFC2025] 9914 Adams, C., "The Simple Public-Key GSS-API Mechanism (SPKM)", RFC2025, 9915 Bell-Northern Research, October 1996. 9917 http://www.ietf.org/rfc/rfc2025.txt 9919 [RFC2054] 9920 Callaghan, B., "WebNFS Client Specification", RFC2054, Sun 9921 Microsystems, Inc., October 1996. 9923 http://www.ietf.org/rfc/rfc2054.txt 9925 [RFC2055] 9926 Callaghan, B., "WebNFS Server Specification", RFC2055, Sun 9927 Microsystems, Inc., October 1996. 9929 http://www.ietf.org/rfc/rfc2055.txt 9931 [RFC2078] 9932 Linn, J., "Generic Security Service Application Program Interface, 9933 Version 2", RFC2078, OpenVision Technologies, January 1997. 9935 http://www.ietf.org/rfc/rfc2078.txt 9937 [RFC2152] 9938 Goldsmith, D., "UTF-7 A Mail-Safe Transformation Format of Unicode", 9939 RFC2152, Apple Computer, Inc., May 1997. 9941 http://www.ietf.org/rfc/rfc2152.txt 9943 [RFC2203] 9944 Eisler, M., Chiu, A., Ling, L., "RPCSEC_GSS Protocol Specification", 9945 RFC2203, Sun Microsystems, Inc., September 1997. 9947 http://www.ietf.org/rfc/rfc2203.txt 9949 [RFC2277] 9950 Alvestrand, H., "IETF Policy on Character Sets and Languages", 9951 RFC2277, UNINETT, January 1998. 9953 http://www.ietf.org/rfc/rfc2277.txt 9955 [RFC2279] 9956 Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC2279, 9957 Alis Technologies, January 1998. 9959 http://www.ietf.org/rfc/rfc2279.txt 9961 [RFC2623] 9962 Eisler, M., "NFS Version 2 and Version 3 Security Issues and the NFS 9963 Protocol's Use of RPCSEC_GSS and Kerberos V5", RFC2623, Sun 9964 Microsystems, June 1999. 9966 http://www.ietf.org/rfc/rfc2623.txt 9968 [RFC2624] 9969 Shepler, S., "NFS Version 4 Design Considerations", RFC2624, Sun 9970 Microsystems, June 1999. 9972 http://www.ietf.org/rfc/rfc2624.txt 9974 [RFC2847] 9975 Eisler, M., "LIPKEY - A Low Infrastructure Public Key Mechanism Using 9976 SPKM", RFC2847, Sun Microsystems, June 2000. 9978 http://www.ietf.org/rfc/rfc2847.txt 9980 [Sandberg] 9981 Sandberg, R., D. Goldberg, S. Kleiman, D. Walsh, B. Lyon, "Design 9982 and Implementation of the Sun Network Filesystem," USENIX Conference 9983 Proceedings, USENIX Association, Berkeley, CA, Summer 1985.
The 9984 basic paper describing the SunOS implementation of the NFS version 2 9985 protocol; it discusses the goals, protocol specification and trade- 9986 offs. 9988 [Srinivasan] 9989 Srinivasan, V., Jeffrey C. Mogul, "Spritely NFS: Implementation and 9990 Performance of Cache Consistency Protocols", WRL Research Report 9991 89/5, Digital Equipment Corporation Western Research Laboratory, 100 9992 Hamilton Ave., Palo Alto, CA, 94301, May 1989. This paper analyzes 9993 the effect of applying a Sprite-like consistency protocol to 9994 standard NFS. The issues of recovery in a stateful environment are 9995 covered in [Mogul]. 9997 [Unicode1] 9998 The Unicode Consortium, "The Unicode Standard, Version 3.0", 9999 Addison-Wesley Developers Press, Reading, MA, 2000. ISBN 0-201- 10000 61633-5. 10002 More information available at: http://www.unicode.org/ 10004 [Unicode2] 10005 "Unsupported Scripts", Unicode, Inc., The Unicode Consortium, P.O. Box 10006 700519, San Jose, CA 95710-0519 USA, September 1999. 10008 http://www.unicode.org/unicode/standard/unsupported.html 10010 [XNFS] 10011 The Open Group, Protocols for Interworking: XNFS, Version 3W, The 10012 Open Group, 1010 El Camino Real Suite 380, Menlo Park, CA 94025, ISBN 10013 1-85912-184-5, February 1998. 10015 HTML version available: http://www.opengroup.org 10017 20. Authors 10019 20.1. Editor's Address 10021 Spencer Shepler 10022 Sun Microsystems, Inc. 10023 7808 Moonflower Drive 10024 Austin, Texas 78750 10026 Phone: +1 512-349-9376 10027 E-mail: shepler@eng.sun.com 10029 20.2. Authors' Addresses 10031 Carl Beame 10032 Hummingbird Ltd. 10034 E-mail: beame@bws.com 10036 Brent Callaghan 10037 Sun Microsystems, Inc. 10038 901 San Antonio Road 10039 Palo Alto, CA 94303 10041 Phone: +1 650-786-5067 10042 E-mail: brent.callaghan@eng.sun.com 10044 Mike Eisler 10045 5565 Wilson Road 10046 Colorado Springs, CO 80919 10048 Phone: +1 719-599-9026 10049 E-mail: mike@eisler.com 10051 Dave Noveck 10052 Network Appliance 10053 200 West Street 10054 Waltham, MA 02451 10056 Phone: +1 781-861-9291 10057 E-mail: dave.noveck@netapp.com 10059 David Robinson 10060 Sun Microsystems, Inc. 10061 901 San Antonio Road 10062 Palo Alto, CA 94303 10063 Phone: +1 650-786-5088 10064 E-mail: david.robinson@eng.sun.com 10066 Robert Thurlow 10067 Sun Microsystems, Inc. 10068 901 San Antonio Road 10069 Palo Alto, CA 94303 10071 Phone: +1 650-786-5096 10072 E-mail: robert.thurlow@eng.sun.com 10074 20.3. Acknowledgements 10076 The author thanks and acknowledges: 10078 Neil Brown for his extensive review and comments on various drafts. 10080 21. Full Copyright Statement 10082 "Copyright (C) The Internet Society (2000). All Rights Reserved. 10084 This document and translations of it may be copied and furnished to 10085 others, and derivative works that comment on or otherwise explain it 10086 or assist in its implementation may be prepared, copied, published 10087 and distributed, in whole or in part, without restriction of any 10088 kind, provided that the above copyright notice and this paragraph are 10089 included on all such copies and derivative works.
However, this 10090 document itself may not be modified in any way, such as by removing 10091 the copyright notice or references to the Internet Society or other 10092 Internet organizations, except as needed for the purpose of 10093 developing Internet standards in which case the procedures for 10094 copyrights defined in the Internet Standards process must be 10095 followed, or as required to translate it into languages other than 10096 English. 10098 The limited permissions granted above are perpetual and will not be 10099 revoked by the Internet Society or its successors or assigns. 10101 This document and the information contained herein is provided on an 10102 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 10103 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 10104 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 10105 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF 10106 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."