Missing events in PROOF because of file access problems

Hello Rooters,

While testing PROOF with slaves and files spread over two facilities, I get a strange error:

When I start two slaves at my local facility and access files via rootd’s at the other location, one slave is able to authenticate (via GLOBUS) to the rootd but the other slave fails (this error is only the trigger for the real problem and isn’t interesting in itself).

The analysis runs as normal, since I still have one slave that is able to access the remote file, but near the end of the analysis the PROOF progress bar stops. Pressing the CLOSE button closes the progress dialog. Looking at the results, I notice that some (ca. 2%) of my events are missing.

This behaviour is not caused by the authentication error itself but by the fact that one of the slaves is not able to access all files.

Does anybody know this problem or have an idea how to solve it?

Cheers Carsten.

Hi Carsten,

There is actually a bug when the two slaves are on the same machine: the second one does not correctly access the delegated credentials. This will soon be fixed in CVS. You can work around the problem by adding the following line

default:proofd method globus ru:0

in the system.rootauthrc file seen by the master.

This should solve your first problem. Please try.
Then we will investigate the second one.
Please also tell us which version(s) of ROOT you are using.

Cheers, Gerri

Hello Gerri,

the authentication error is not really a problem. I only tested a PROOF session with two different hosts (one with access to /etc/grid-security/hostcert.pem, the other without). The proofd running on the host without /etc/… can’t authenticate to the rootd.
At GSI we use an LSF cluster with a dedicated PROOF queue, where this problem is solved.
But thanks for your suggestion, I will try it as soon as possible and give you feedback on it.

The real problem is the aborted analysis and the fact that the failure only shows up at the end of the analysis.

It’s very disappointing to run an analysis on 15 hosts for 12 hours and then see the error in the last few minutes before it finishes.

If the file(s) are not accessible at all, OK, but I don’t understand why the analysis crashes if only one slave is not able to access the file(s).

Cheers Carsten.

P.S. We are using ROOT v4-00-08.
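
For anyone who wants to quantify such a loss after a run, here is a minimal sketch in plain C++ that compares, per input file, the number of entries expected with the number actually processed. It deliberately uses no ROOT or PROOF calls: the FileCount struct and both function names are hypothetical, and how the two counts are collected (for example from a counter filled during the analysis) is an assumption left to the user.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical bookkeeping record, one per input file (not a ROOT class).
struct FileCount {
    std::string name;        // file name or URL, for reporting only
    std::int64_t expected;   // entries the file is known to contain
    std::int64_t processed;  // entries the analysis actually saw
};

// Return the files for which fewer entries were processed than expected.
std::vector<FileCount> findIncompleteFiles(const std::vector<FileCount>& files) {
    std::vector<FileCount> incomplete;
    for (const auto& f : files) {
        assert(f.expected >= 0 && f.processed >= 0);  // counts cannot be negative
        if (f.processed < f.expected) {
            incomplete.push_back(f);
        }
    }
    return incomplete;
}

// Fraction of all expected entries that were never processed.
// Returns 0.0 when nothing was expected, to avoid dividing by zero.
double missingFraction(const std::vector<FileCount>& files) {
    std::int64_t expected = 0;
    std::int64_t processed = 0;
    for (const auto& f : files) {
        expected += f.expected;
        processed += f.processed;
    }
    if (expected <= 0) {
        return 0.0;
    }
    return static_cast<double>(expected - processed) /
           static_cast<double>(expected);
}
```

Feeding in the counts for every file of the data set would show whether the missing events all come from the remote files that one slave could not open, or are spread over the whole set.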