Error: ProofServMgr::Create: problems accepting callback: ti

Dear all,

We have a PROOF cluster installation at our site that has been performing quite well on average. However, for the past week I have not been able to start any new PROOF session:

root [0] TProof *p = TProof::Open("arodrig@proof.ifca.es:1093")
Starting master: opening connection …
Starting master: connection open: setting up server …
| Error condition occured: message from server:
| timeout: process killed
Info in TXSlave::HandleError: 0x303d6e0:proof.ifca.es:0 got called … fProof: 0x2f91ab0, fSocket: 0x303db50 (valid: 1)
Info in TXSlave::HandleError: 0x303d6e0: proof: 0x2f91ab0
TXSlave::HandleError: 0x303d6e0: DONE …
Starting master: OK
Info in TProof::Collect: proof.ifca.es
Error in TProof::StartSlaves: setting up master
Error in TProof::Open: new session could not be created
Error: illegal pointer to class object proofSession 0x0 1512 scripts/PAFUtils.C:964:
*** Interpreter error recovered ***

The master log shows:

xpd-E: ProofServMgr::Create: problems accepting callback: timeout: process killed

I checked the connections between the client and the master, and between the master and the worker nodes of the cluster; they are allowed in all cases.

We are using ROOT 5.28.00a.

Apparently nothing has changed at our site. This is the first time I have faced this issue.

Do you have any idea what could be going wrong?

Thank you very much,
Ana Rodríguez.

Hi Ana,

Do you mean that you get this problem systematically?
Is there nothing in the session log?
You can try to get it from the ROOT prompt with TProof::LogViewer("arodrig@proof.ifca.es:1093") …
You can perhaps add some verbosity with

root [0] TProof *p = TProof::Open("arodrig@proof.ifca.es:1093", 0, 0, 4)

Gerri

Hi Gerri,

Yes, I get this error systematically.
With the higher verbosity setting I get the same amount of information:

root [0] TProof *p = TProof::Open("arodrig@proof.ifca.es:1093", 0, 0, 4)
Starting master: opening connection …
Starting master: connection open: setting up server …
| Error condition occured: message from server:
| timeout: process killed
Info in TXSlave::HandleError: 0x198b79a0:proof.ifca.es:0 got called … fProof: 0x1980bd70, fSocket: 0x198b7e10 (valid: 1)
Info in TXSlave::HandleError: 0x198b79a0: proof: 0x1980bd70
Starting master: OK
TXSlave::HandleError: 0x198b79a0: DONE …

From the log I got this error:

xpd-E: ProofServMgr::SetProofServEnv: problems creating symlink to 'session.rootrc'

Retrieving logs: 1 ok, 0 not ok (100 % processed)

// --------- Start of element log -----------------

// Ordinal: 0 (role: master)

// Path: proof://proof.ifca.es:1093//pool/proofb … -20614.log
// # of retrieved lines: 13

// ------------------------------------------------

110325 16:32:02 6617 xpd-I: ProofServMgr::Create: srvtype = 2
110325 16:32:02 6617 xpd-E: Aux::SymLink: problems creating symlink session.rootrc (errno: 28)
110325 16:32:02 6617 xpd-E: ProofServMgr::SetProofServEnv: problems creating symlink to 'session.rootrc' (errno: 28)
proofserv: starting /opt/root5.28a/root/bin/proofserv.exe
proofserv: redirecting output to /pool/proofbox/arodrig/session-proof-1301067122-20614/master-0-proof-1301067122-20614.log
proofserv: RedirectOutput: enter: /pool/proofbox/arodrig/session-proof-1301067122-20614/master-0-proof-1301067122-20614.log
proofserv: RedirectOutput: reopen /pool/proofbox/arodrig/session-proof-1301067122-20614/master-0-proof-1301067122-20614.log
proofserv: RedirectOutput: dup2 …
proofserv: RedirectOutput: read open …
proofserv: RedirectOutput: done!
proofserv: output redirected to: /pool/proofbox/arodrig/session-proof-1301067122-20614/master-0-proof-1301067122-20614.log
proofserv: running the TProofServ application
16:32:02 20614 Mst- | Error in TXProofServ::CreateServer: Socket setup by xpd undefined
// --------- End of element log -------------------

Best,
Ana.

Hi Ana,

There is a problem with the path for the unix socket used internally.
Now, you have errno=28 while creating files or links in the sandbox:

This usually means ‘No space left on device’.
Can you check the status of the device on which the sandboxes are located?
Also, on the master sandbox there should be two files, one with extension “.env”, the other with extension “.rootrc”, something like

and

Can you post those files for a failing session?

Gerri
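
For reference, the errno value in the xpd log lines can be decoded programmatically. The following is a minimal C++ sketch and not part of PROOF or xrootd: the helper names extract_errno, describe_errno and is_disk_full are hypothetical, and it assumes the log line carries the code in the form "(errno: N)" as in the session log above. On Linux, errno 28 is ENOSPC.

```cpp
#include <cerrno>
#include <cstring>
#include <regex>
#include <string>

// Hypothetical helper: returns the errno value embedded in a log line
// as "(errno: N)", or -1 if the line contains no such marker.
int extract_errno(const std::string& line) {
    static const std::regex pattern(R"(\(errno:\s*(\d{1,9})\))");
    std::smatch match;
    if (!std::regex_search(line, match, pattern)) return -1;
    return std::stoi(match[1].str());
}

// Hypothetical helper: translates an errno value into the message text
// provided by the C library on the machine where this runs.
std::string describe_errno(int err) {
    return std::strerror(err);
}

// Hypothetical helper: true when the value is ENOSPC
// ("No space left on device").
bool is_disk_full(int err) {
    return err == ENOSPC;
}
```

Passing the "Aux::SymLink" line from the log above to extract_errno yields 28, which is_disk_full identifies as ENOSPC on Linux.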

Hi Gerri,

> This usually means ‘No space left on device’.
> Can you check the status of the device on which the sandboxes are located?

You are right, 100% of the space was used. The *.env and *.rootrc files were empty, maybe for the same reason.

I freed some space, tried again, and it works now:

gridui01:~/TopAnalysis/PAF$ root -l
root [0] TProof *p = TProof::Open("arodrig@proof.ifca.es:1093");
Starting master: opening connection …
Starting master: OK
Opening connections to workers: OK (6 workers)
Setting up worker servers: OK (6 workers)
PROOF set to parallel mode (6 workers)
root [1]

Thank you very much.

Ana.
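
The disk check that resolved this issue can also be scripted. Below is a minimal sketch, assuming a POSIX system where statvfs is available; the function names free_fraction and free_inode_fraction are made up for illustration and are not a ROOT or PROOF API.

```cpp
#include <sys/statvfs.h>

// Hypothetical helper: fraction (0.0 to 1.0) of data blocks still available
// to unprivileged users on the file system that contains `path`.
// Returns -1.0 if the file system cannot be queried.
double free_fraction(const char* path) {
    struct statvfs vfs;
    if (statvfs(path, &vfs) != 0 || vfs.f_blocks == 0) return -1.0;
    return static_cast<double>(vfs.f_bavail) / static_cast<double>(vfs.f_blocks);
}

// Hypothetical helper: same idea for inodes. A file system can report
// "No space left on device" when inodes run out even if blocks remain.
// Returns -1.0 if the file system cannot be queried or reports no inode count.
double free_inode_fraction(const char* path) {
    struct statvfs vfs;
    if (statvfs(path, &vfs) != 0 || vfs.f_files == 0) return -1.0;
    return static_cast<double>(vfs.f_favail) / static_cast<double>(vfs.f_files);
}
```

Run on the master against the sandbox location that appears in the log (/pool/proofbox), free_fraction would be expected to return a value close to 0 while the device is full.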