[nfsv4] When a server's network address moves
Suppose an NFSv4.1 server has two network interfaces, each with
a distinct network address, say 10.0.0.1 and 10.0.0.2. When the
client sends an EXCHANGE_ID over connections to 10.0.0.1 and 10.0.0.2,
the indication from the results is that session trunking is supported.
(The indication would be equal server owners, equal server scopes,
and equal client IDs.)
So the client sends a CREATE_SESSION, and successfully associates both
connections with the same session.
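
As a rough illustration of the comparison described above, here is a
minimal Python sketch. The ExchangeIdResult type and its field names are
hypothetical stand-ins for the values a client would pull out of each
EXCHANGE_ID reply; they are not taken from any real client implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExchangeIdResult:
    """Hypothetical holder for the EXCHANGE_ID result fields compared here."""
    client_id: int
    so_major_id: bytes   # server owner, major part
    so_minor_id: int     # server owner, minor part
    server_scope: bytes


def session_trunking_indicated(a: ExchangeIdResult, b: ExchangeIdResult) -> bool:
    """True when two replies show equal server owners, equal server
    scopes, and equal client IDs, as described in the text above."""
    same_owner = (a.so_major_id == b.so_major_id
                  and a.so_minor_id == b.so_minor_id)
    return (same_owner
            and a.server_scope == b.server_scope
            and a.client_id == b.client_id)
```

A client would run this check on the replies received over 10.0.0.1 and
10.0.0.2 before deciding to bind both connections to one session.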
Now the site administrator does some re-configuration, and
IP address 10.0.0.2 is re-assigned to another NFSv4.1 server.
The TCP connection does not survive the move, and after a reconnect,
the client's attempts to use SEQUENCE or BIND_CONN_TO_SESSION
with the session ID fail with NFS4ERR_BADSESSION.
However, the session is still valid on the connection
to 10.0.0.1.
The current text in the NFSv4.1 specification is not
clear on this point (I will make sure the text
is clear before an RFC is published), but I wanted
to give client implementors a heads up that
they have to deal with this scenario. Or if I
am incorrect, then start a dialog about what we
should do. What follows is my opinion on what
client implementations should do.
Essentially, receipt of NFS4ERR_BADSESSION does not
mean the session is gone. It just means that
the NFSv4.1 server listening at that network address
no longer knows about the session.
Here is one suggested algorithm for the client
when it gets NFS4ERR_BADSESSION.
If the client has other connections to
other server network addresses
associated with the same session, attempt
a COMPOUND with a single operation, SEQUENCE,
on each of the other connections.
If they succeed, the session is still alive,
and this is a strong indicator the server's
network address has moved.
The client might send an EXCHANGE_ID on the
connection that returned NFS4ERR_BADSESSION
to see if there are opportunities for client ID
trunking (i.e. the same client ID and so_major_id are
returned). The client might use DNS to see if
the moved network address was replaced with another,
so that the performance and availability benefits of
session trunking can continue.
If the SEQUENCE requests fail with NFS4ERR_BADSESSION
then the session no longer exists on any of the
server network addresses the client has connections
associated with the session ID. It is possible the
session is still alive and available on other
network addresses. The client sends an EXCHANGE_ID
on all the connections to see if the server owner
is still listening on those network addresses.
If the same server owner is returned, but a new
client ID is returned, this is a strong
indicator of a server restart. If the same
server owner is returned, and the same client ID
is returned, then this is a strong indication
that the server did delete the session, and the
client will need to send a CREATE_SESSION if it
has no other sessions for that client ID.
If a different server owner is returned,
the client can use DNS to find
other network addresses. If it does not, or if
DNS does not find any other addresses for the server,
then the client will be unable to provide NFSv4.1
service, and fatal errors should be returned
to processes that were using the server. If the
client is using a "mount" paradigm, unmounting
the server is advised.
If the client knows of no other connections associated
with the session ID, and of no other server network addresses
that are, or have been, associated with the session ID,
then the client can use DNS to find
other network addresses. If it does not, or if
DNS does not find any other addresses for the server,
then the client will be unable to provide NFSv4.1
service, and fatal errors should be returned
to processes that were using the server. If the
client is using a "mount" paradigm, unmounting
the server is advised.
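
The decision steps of the suggested algorithm can be sketched as a small
pure function. This is only one reading of the text, under stated
assumptions: the function name, the Outcome enum, and the shape of the
inputs are all hypothetical, "they succeed" is taken to mean that at
least one other connection succeeds, and the first EXCHANGE_ID reply
that carries the known server owner decides the outcome.

```python
from enum import Enum, auto


class Outcome(Enum):
    ADDRESS_MOVED = auto()     # session still alive on another connection
    SERVER_RESTARTED = auto()  # same server owner, new client ID
    SESSION_DELETED = auto()   # same server owner, same client ID
    TRY_DNS = auto()           # no known address still reaches the server


def classify_badsession(other_sequence_ok, exchange_id_replies,
                        known_owner, known_client_id):
    """Decide what an NFS4ERR_BADSESSION most likely means.

    other_sequence_ok: list of bools, one per *other* connection
        associated with the session; True if a COMPOUND holding only
        SEQUENCE succeeded there.
    exchange_id_replies: list of (server_owner, client_id) pairs from
        EXCHANGE_ID sent on the connections (consulted only when every
        SEQUENCE probe failed).
    """
    # No other connection is associated with the session: fall back to DNS.
    if not other_sequence_ok:
        return Outcome.TRY_DNS

    # Assumption: one surviving connection is enough to call the session alive.
    if any(other_sequence_ok):
        return Outcome.ADDRESS_MOVED

    # Every probe failed, so look at what EXCHANGE_ID reported.
    for owner, client_id in exchange_id_replies:
        if owner == known_owner:
            if client_id == known_client_id:
                return Outcome.SESSION_DELETED   # CREATE_SESSION is next
            return Outcome.SERVER_RESTARTED
    # Only different server owners answered (or none did).
    return Outcome.TRY_DNS
```

Keeping the decision separate from the network I/O means the probes
(SEQUENCE, EXCHANGE_ID, DNS lookups) can be issued by whatever transport
code the client already has, and the policy can be tested on its own.
If TRY_DNS turns up no other addresses, the text above calls for
returning fatal errors to the processes using the server.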
A variation on the above is that after a server's
network address moves, there is no NFSv4.1 server
listening. E.g. there is no listener on port 2049, the NFSv4
server returns NFS4ERR_MINOR_VERS_MISMATCH, the NFS
server returns a PROG_MISMATCH error, the
listener on 2049 returns PROG_MISMATCH, or attempts to
re-connect to the network address time out. These
are equivalent to SEQUENCE returning NFS4ERR_BADSESSION.
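
One way a client might fold these cases into the same code path is to
normalize every probe result first. The ProbeResult names below are
hypothetical labels for the conditions listed above, not identifiers
from any RPC library.

```python
from enum import Enum, auto


class ProbeResult(Enum):
    OK = auto()
    BADSESSION = auto()            # NFS4ERR_BADSESSION from the server
    NO_LISTENER = auto()           # nothing listening on port 2049
    MINOR_VERS_MISMATCH = auto()   # NFS4ERR_MINOR_VERS_MISMATCH
    PROG_MISMATCH = auto()         # RPC-level PROG_MISMATCH
    CONNECT_TIMEOUT = auto()       # re-connect attempts timed out


# Conditions the text above treats as equivalent to NFS4ERR_BADSESSION.
_BADSESSION_EQUIVALENTS = frozenset({
    ProbeResult.BADSESSION,
    ProbeResult.NO_LISTENER,
    ProbeResult.MINOR_VERS_MISMATCH,
    ProbeResult.PROG_MISMATCH,
    ProbeResult.CONNECT_TIMEOUT,
})


def treat_as_badsession(result: ProbeResult) -> bool:
    """True if the client should handle this result like NFS4ERR_BADSESSION."""
    return result in _BADSESSION_EQUIVALENTS
```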
--
Mike Eisler, Senior Technical Director, NetApp, 719 599 9026,
http://blogs.netapp.com/eislers_nfs_blog/
_______________________________________________
nfsv4 mailing list
nfsv4 at ietf.org
https://www.ietf.org/mailman/listinfo/nfsv4