
Re: [nfsv4] [FedFS] Meeting Minutes, 11/5/2009



On Thu, 5 Nov 2009, Nicolas Williams wrote:

> On Thu, Nov 05, 2009 at 03:58:25PM -0500, James Lentini wrote:
> > + Admin FEDFS_LOOKUP_FSN resolve parameter
> > 
> >   We reviewed the updates posted here:
> > 
> >   http://jlentini.users.sourceforge.net/draft-ietf-nfsv4-federated-fs-admin-03bis.txt
> > 
> >   and diff here:
> > 
> >   http://jlentini.users.sourceforge.net/draft-ietf-nfsv4-federated-fs-admin-rfcdiff.html
> > 
> >   The first parts of the diff should be ignored. The xml2rfc tool has 
> >   been updated to place the abstract first in the document.
> > 
> >   The real changes start on page 6.
> > 
> >   Craig suggested adding an unstructured text field to the results to 
> >   indicate the source of a failure.
> 
> Beware of unstructured text fields!  You can't localize them _unless_
> you have the client send language tags so the client and server can
> negotiate a language to localize to.  See RFC5646, a.k.a. BCP0047.
> 
> For G11N (globalization) reasons you must support localization...
> 
> If you can come up with a complete list of errors, then it's better to
> have error codes for all of them.  You can always punt and say "go look
> at logs" (logs are never localized).

Thank you for making us aware of that. Including a text field would be 
in the "bells and whistles" category. If it would require a 
significant increase in complexity, it probably isn't worth it.
 
> >   We discussed the NSDB error. There is only one error being proposed without 
> >   distinction between NSDB node is down, the NSDB service is not listening, 
> >   the fileserver can't negotiate an LDAP connection, etc. There was consensus 
> >   that a coarse-grained error for failure to establish an end-to-end NSDB 
> >   connection was all that should be provided.
> 
> A coarse error is a lot better than nothing.  But there's no reason that
> we couldn't specify a finer-grained set of errors and let
> implementations use them that can.  The errors I can think of:
> 
>  - NSDB name NXDOMAIN (not resolvable)
> 
>  - NSDB name resolution timeout (e.g., DNS servers not reachable)
> 
>  - NSDB name partially resolvable (e.g., can find SRV RRs but not A
>    RRs; this would be misconfiguration)
> 
>  - NSDB hosts all not reachable / connect timeout
> 
>  - NSDB hosts' server certificates cannot be validated
> 
>  - NSDB LDAP SASL/GSSAPI bind failed (should probably have some
>    sub-errors for this)
> 
>     - Unfortunately the set of errors for this could be enormous, and
>       because SASL and the GSS-API are open plug-in systems,
>       indeterminate.
> 
>       Here an unstructured text field would be nice.  (Unfortunately at
>       least one very common LDAP library does a very poor job of
>       reporting SASL/GSSAPI bind errors, and this is probably true of
>       other LDAP client implementations.)
> 
>  - NSDB LDAP protocol error N

We could add more error codes. They won't eliminate the need for an 
administrator to diagnose the problem. 
 
>  - NSDB object not found

The FEDFS_ERR_NOFSN and FEDFS_ERR_NOFSL error codes in the update are 
for this case.

>  - NSDB object missing required attributes

Rob suggested the same thing below (see note about malformed response 
error). That is on the list of error codes to add.
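
To make the coarse versus fine-grained trade-off concrete, here is a rough 
sketch (not text from the draft) of how an implementation might track a 
detailed cause internally and still report a coarse code. Apart from 
FEDFS_ERR_NOFSN and FEDFS_ERR_NOFSL, every name below is made up for 
illustration, including the labels for the connection and malformed 
response errors.

```python
from enum import Enum, auto


class NsdbFailure(Enum):
    """Hypothetical fine-grained causes, following the list quoted above."""
    NAME_NXDOMAIN = auto()
    NAME_TIMEOUT = auto()
    NAME_PARTIAL = auto()
    HOSTS_UNREACHABLE = auto()
    CERT_INVALID = auto()
    BIND_FAILED = auto()
    LDAP_PROTOCOL = auto()
    FSN_NOT_FOUND = auto()
    FSL_NOT_FOUND = auto()
    MALFORMED_RESPONSE = auto()


# Causes that already have (or are planned to get) their own result code.
# "MALFORMED_RESPONSE" is a placeholder label, not a name from the draft.
SPECIFIC = {
    NsdbFailure.FSN_NOT_FOUND: "FEDFS_ERR_NOFSN",
    NsdbFailure.FSL_NOT_FOUND: "FEDFS_ERR_NOFSL",
    NsdbFailure.MALFORMED_RESPONSE: "MALFORMED_RESPONSE",
}


def coarse_error(cause):
    """Collapse a detailed cause to the code reported to the administrator.

    Everything without a specific code falls back to one coarse
    "could not establish an end-to-end NSDB connection" label, which is
    also a placeholder name.
    """
    return SPECIFIC.get(cause, "NSDB_CONNECTION_FAILURE")
```

The detailed cause would still be available for the fileserver's logs, 
which is where an administrator would look in any case.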

> >   We also discussed reflecting LDAP error codes back through the procedure's 
> >   results. We agreed that the free form string field that will be added is 
> >   the appropriate place for this.
> 
> See above.
> 
> >   Rob suggested that a malformed response be added as a fourth type of
> >   resolution error.
> > 
> >   As an aside, we discussed what the fileserver's behavior would be 
> >   when a junction resolution error occurred when a client was traversing
> >   the junction. What should the client see? Should the fileserver wait or 
> >   return an error? It seemed prudent for the fileserver to return a DELAY
> >   error if the problem was transient. For persistent errors the fileserver 
> >   should indicate a hard error.
> 
> How long to delay?  Users will want to know.  Shouldn't NFSv4.1 have a
> way to say "this is a junction, but I can't figure out where to refer
> you to"??

This was a side discussion. The NFS GETATTR operation, which is used 
to fetch the fs_locations or fs_locations_info data, may return the 
error NFS4ERR_DELAY. We were discussing times when it would be 
appropriate for the fileserver to return NFS4ERR_DELAY. Obviously, this isn't 
part of FedFS. It has already been described in NFSv4.
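
As an illustration of the transient versus persistent distinction only, a 
fileserver's choice might look roughly like the sketch below. Which 
causes count as transient is an assumption here and would be a local 
policy decision; the cause strings and the "HARD_ERROR" label are 
placeholders, not names from NFSv4 or FedFS.

```python
# Assumption: the set of causes treated as transient is a policy choice.
TRANSIENT_CAUSES = {"name_resolution_timeout", "hosts_unreachable"}


def getattr_status(cause):
    """Pick the status a client might see when junction resolution fails."""
    if cause in TRANSIENT_CAUSES:
        # NFS4ERR_DELAY is the existing NFSv4 error; the client retries later.
        return "NFS4ERR_DELAY"
    # Placeholder for whatever hard error the fileserver chooses to return.
    return "HARD_ERROR"
```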

> >   The admin protocol should return an error on junction resolution failure 
> >   immediately. This will allow the administrator to diagnose problems 
> >   immediately.
> 
> Sure.
> 
> >   Craig suggested that the resolve parameter be a ternary instead of binary
> >   value: 0 = don't resolve, 1 = resolve using cache value, and 2 = resolve 
> >   using nsdb.
> 
> What cached value?

The fileserver may have cached FSL values. See Section 2.4.2 "Caching 
of Fileset Locations" in 

http://www.ietf.org/id/draft-ietf-nfsv4-federated-fs-protocol-04.txt
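
For what it's worth, the three-valued resolve parameter Craig suggested 
could behave along these lines. This is only a sketch under assumptions: 
the function and parameter names are hypothetical, a cache miss simply 
returns nothing rather than falling through to the NSDB, and a fresh NSDB 
answer is assumed to replace the cached one. None of that has been 
decided.

```python
from enum import IntEnum


class Resolve(IntEnum):
    """Values as proposed in the minutes; the member names are made up."""
    NONE = 0   # don't resolve
    CACHE = 1  # resolve using the cached value
    NSDB = 2   # resolve using the NSDB


def lookup_fsn(fsn, resolve, cache, query_nsdb):
    """Return FSLs for an FSN according to the resolve mode.

    cache is a plain dict standing in for the fileserver's FSL cache;
    query_nsdb is a caller-supplied function standing in for the LDAP query.
    """
    if resolve == Resolve.NONE:
        return None
    if resolve == Resolve.CACHE:
        # Assumption: a miss is reported as None, with no NSDB fallback.
        return cache.get(fsn)
    fsls = query_nsdb(fsn)
    # Assumption: a fresh NSDB answer overwrites the cached entry.
    cache[fsn] = fsls
    return fsls
```

With resolve = 1 the administrator sees what the fileserver would hand 
out from its cache right now; with resolve = 2 the NSDB is consulted.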