From time to time we hit a connection problem when trying to connect to the master:
110908 09:24:11 001 Proofx-E: Conn::Connect: failed to connect to proof://user@proof.ifca.es:1093//
110908 09:24:11 001 Proofx-E: XrdProofConn: XrdProofConn: severe error occurred while opening a connection to server [proof.ifca.es:1093]
It seems to be a network connection problem. I am not sure why it appears, but it can be solved by restarting xrootd (/etc/init.d/xrootd restart).
When the user gets the above error, the following message appears in the master log:
We know that unfortunately there are still cases in which the daemon becomes unresponsive to a connection attempt. It is not clear to me whether in this case it affected all users or only a specific one.
We are working on a modification of the connection setup which should improve the stability and speed of connections.
As for automatic restarts: if the daemon is up but unresponsive, you can use a script plus binary that I recently provided to other admins for similar purposes. It is standalone, i.e. it does not depend on ROOT or xrootd. I have just committed it to our SVN repository for convenience (it may also go into the next ROOT distribution).
To try it out, do the following:
$ cd somedir
$ svn co http://root.cern.ch/svn/root/branches/dev/xrdping
$ cd xrdping
$ make
$ ./xrdping myhost
There is a README describing the usage, and an example script.
This should let you quickly test whether the service is up and responding where you expect it to.
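For the automatic-restart case, a small wrapper run periodically (e.g. from cron) could probe the daemon and restart it only when the probe fails. The following is only a sketch, not the example script shipped with xrdping: the function name probe_or_restart is made up for illustration, and it assumes that the probe command exits with status 0 when the daemon responds and non-zero otherwise. Please check the xrdping README to confirm how it actually reports failure.

```sh
#!/bin/sh
# Watchdog sketch: run a probe command and restart the service if it fails.
# ASSUMPTION: the probe (e.g. "./xrdping myhost") exits with status 0 when
# the daemon responds and non-zero otherwise; verify this in the README.

# probe_or_restart PROBE_CMD RESTART_CMD   (hypothetical helper)
# Both arguments are command strings run through "sh -c".
probe_or_restart() {
    probe_cmd=$1
    restart_cmd=$2
    if sh -c "$probe_cmd" >/dev/null 2>&1; then
        # Probe succeeded: leave the daemon alone.
        echo "ok"
    else
        # Probe failed: report it and run the restart command.
        echo "restarting"
        sh -c "$restart_cmd"
    fi
}

# Example invocation (paths are illustrative; uncomment to use):
# probe_or_restart "./xrdping proof.ifca.es" "/etc/init.d/xrootd restart"
```

Keeping the probe and the restart command as arguments means the same wrapper can be tried out with harmless commands before pointing it at the real service.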