Problems with simple PROOF-Setup

Hello Everybody,

I’m trying to set up a very simple PROOF environment on my desktop computer with two slaves. I followed the description in the README.PROOF file. Currently I’m using ROOT version 3.10/00 with gcc 3.3. All data files and classes I have been using can be found under http://iktp.tu-dresden.de/~jsunderm/prooftest.tar.gz. The setup of my system is as follows:

file: proof.conf
node localhost
slave localhost usrpwd
slave localhost usrpwd
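
As a cross-check of the configuration above, the two "slave" entries are what one would expect to show up later as "parallel mode (2 slaves)". A minimal C++ sketch of counting them follows. It assumes only the whitespace-separated layout shown in this listing, treats lines starting with '#' as comments (an assumption), and uses a hypothetical function name `countSlaves`. It is not how PROOF itself reads the file.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: count the "slave" entries in proof.conf-style text.
// Assumes one whitespace-separated entry per line, as in the listing above.
// Skipping lines that start with '#' is an assumption about comments.
int countSlaves(const std::string& confText) {
    std::istringstream in(confText);
    std::string line;
    int slaves = 0;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string keyword;
        if (!(fields >> keyword)) continue;   // blank line
        if (keyword[0] == '#') continue;      // comment line (assumed syntax)
        if (keyword == "slave") ++slaves;     // "node" and anything else is ignored
    }
    return slaves;
}
```

Called on the three lines of the proof.conf shown here, the function counts the two `slave localhost usrpwd` entries and ignores the `node` line.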

file /var/log/messages:
Oct 10 18:26:49 jsunderm rootd[6245]: main: no config directory specified
Oct 10 18:26:49 jsunderm rootd[6245]: Rootd: file closed, rd=0, wr=0, rx=0, tx=0
Oct 10 18:26:49 jsunderm rootd[6249]: main: no config directory specified
Oct 10 18:26:49 jsunderm rootd[6249]: Rootd: file closed, rd=0, wr=0, rx=0, tx=0
Oct 10 18:26:50 jsunderm sshd[6217]: Did not receive identification string from ::ffff:127.0.0.1
Oct 10 18:26:50 jsunderm sshd[6212]: Did not receive identification string from ::ffff:127.0.0.1
Oct 10 18:28:19 jsunderm sshd[6207]: fatal: Timeout before authentication for ::ffff:127.0.0.1

file proof.log:
Oct 10 18:26:49 jsunderm proofslave[6208]: jsunderm:slave 0:SysError:TUnixSystem::UnixRecv:recv (Connection reset by peer)
Oct 10 18:26:49 jsunderm proofslave[6213]: jsunderm:slave 1:SysError:TUnixSystem::UnixRecv:recv (Connection reset by peer)
Oct 10 18:26:49 jsunderm proofserv[6203]: jsunderm:master:Error:TPacketizer2::TPacketizer2:kPROOF_FATAL from slave-0 (localhost)
Oct 10 18:26:49 jsunderm proofserv[6203]: jsunderm:master:*** Break ***:segmentation violation

My root-session looks like this:
root [0] gROOT->Proof()
Name (localhost:jsunderm): jsunderm
Password:
PROOF set to parallel mode (2 slaves)
root [1] TDSet *set;
root [2] set = new TDSet("TTree","h42");
root [3] set->Add("root://localhost/temp/prooftest/data1.root");
root [4] set->Add("root://localhost/temp/prooftest/data2.root");
root [5] set->Process("h42.C++")
Info in TUnixSystem::ACLiC: creating shared library /home/jsunderm/temp/prooftest/./h42_C.so
In file included from /home/jsunderm/temp/prooftest/fileZZD7SS.h:29,
from /home/jsunderm/temp/prooftest/fileZZD7SS.cxx:13:
/home/jsunderm/temp/prooftest/h42.C: In member function `virtual Bool_t h42::ProcessCut(int)':
/home/jsunderm/temp/prooftest/h42.C:66: warning: unused parameter `Int_t entry'
Initializing with tree 0

The TDSet class always gets a NULL-pointer for initialization which causes the segmentation violation.

When using ssh for authentication, even the call to gROOT->Proof() fails:

Oct 10 17:59:17 jsunderm sshd[5780]: Failed password for jsunderm from ::1 port 1395 ssh2
Oct 10 17:59:17 jsunderm last message repeated 2 times
Oct 10 17:59:17 jsunderm sshd[5780]: Connection closed by ::1
Oct 10 17:59:17 jsunderm proofd[5781]: SshToolNotifyFailure: cannot connect socket: exiting
Oct 10 17:59:17 jsunderm proofd[5781]: RpdSshAuth: failure notification perhaps unsuccessful …
Oct 10 17:59:17 jsunderm sshd[5785]: Did not receive identification string from ::ffff:127.0.0.1

Best regards,

Jan Erik.

Hello Jan Erik,

Thanks for reporting the problem. I am looking at your files. In the meantime, it
would be great if you could repeat the PROOF test run, but do a:

root[] gProof->SetLogLevel(3)

after you have started the PROOF session. If you could then attach the master and slave
log files (from the ~/proof directory), that will give a detailed log of what the
system was doing.

As far as the ssh problem goes, does ssh authentication work for rootd?

Regards,

Maarten.

Hi Maarten,

I retried with logging level set to 3.

output of the root-session
root [0] gROOT->Proof()
Name (localhost:jsunderm): jsunderm
Password:
PROOF set to parallel mode (2 slaves)
root [1] gProof->SetLogLevel(3)
root [2] .x ./getmyset.C
root [3] set->Process("h42.C++")
Info in TProofPlayerRemote::Process: Enter
Info in TProofPlayerRemote::Process: Sendfile: h42.C
Info in TProof::SendFile: sending file h42.C to:
slave = localhost:0
Info in TProofPlayerRemote::Process: SendFile: h42.h
Info in TProof::SendFile: sending file h42.h to:
slave = localhost:0
Info in TUnixSystem::ACLiC: creating shared library /home/jsunderm/temp/prooftest/./h42_C.so
In file included from /home/jsunderm/temp/prooftest/fileHY0bsr.h:29,
from /home/jsunderm/temp/prooftest/fileHY0bsr.cxx:13:
/home/jsunderm/temp/prooftest/h42.C: In member function `virtual Bool_t h42::ProcessCut(int)':
/home/jsunderm/temp/prooftest/h42.C:66: warning: unused parameter `Int_t entry'
Initializing with tree 0
Info in TProofPlayerRemote::Process: Calling Broadcast
Info in TProofPlayerRemote::Process: Calling Collect
*** Break *** keyboard interrupt FILE:(tmpfile) LINE:1
Error in TProof::Interrupt: server 0 does not respond

/var/log/messages
Oct 10 21:28:09 jsunderm rootd[3445]: main: no config directory specified
Oct 10 21:28:09 jsunderm rootd[3445]: Rootd: file closed, rd=0, wr=0, rx=0, tx=0
Oct 10 21:28:09 jsunderm rootd[3446]: main: no config directory specified
Oct 10 21:28:09 jsunderm rootd[3446]: Rootd: file closed, rd=0, wr=0, rx=0, tx=0
Oct 10 21:28:12 jsunderm sshd[3413]: Did not receive identification string from ::ffff:127.0.0.1
Oct 10 21:28:12 jsunderm sshd[3418]: Did not receive identification string from ::ffff:127.0.0.1
Oct 10 21:28:59 jsunderm sshd[3408]: fatal: Timeout before authentication for ::ffff:127.0.0.1

proof.log
Oct 10 21:27:47 jsunderm proofserv[3404]: jsunderm:master:Info:TProofServ::LockDir:file /tmp/proof-cache-lock-jsunderm locked
Oct 10 21:27:47 jsunderm proofserv[3404]: jsunderm:master:Info:TProofServ::HandleSocketInput:kPROOF_CHECKFILE:file h42.C not yet on node
Oct 10 21:27:47 jsunderm proofserv[3404]: jsunderm:master:Info:TProofServ::UnlockDir:file /tmp/proof-cache-lock-jsunderm unlocked
Oct 10 21:27:47 jsunderm proofserv[3404]: jsunderm:master:Info:TProofServ::LockDir:file /tmp/proof-cache-lock-jsunderm locked
Oct 10 21:27:47 jsunderm proofserv[3404]: jsunderm:master:Info:TProofServ::UnlockDir:file /tmp/proof-cache-lock-jsunderm unlocked
Oct 10 21:27:47 jsunderm proofslave[3409]: jsunderm:slave 0:Info:TProofServ::LockDir:file /tmp/proof-cache-lock-jsunderm locked
Oct 10 21:27:47 jsunderm proofslave[3409]: jsunderm:slave 0:Info:TProofServ::HandleSocketInput:kPROOF_CHECKFILE:file h42.C already on node
Oct 10 21:27:47 jsunderm proofslave[3409]: jsunderm:slave 0:Info:TProofServ::UnlockDir:file /tmp/proof-cache-lock-jsunderm unlocked
Oct 10 21:27:47 jsunderm proofslave[3414]: jsunderm:slave 1:Info:TProofServ::LockDir:file /tmp/proof-cache-lock-jsunderm locked
Oct 10 21:27:47 jsunderm proofslave[3414]: jsunderm:slave 1:Info:TProofServ::HandleSocketInput:kPROOF_CHECKFILE:file h42.C already on node
Oct 10 21:27:47 jsunderm proofslave[3414]: jsunderm:slave 1:Info:TProofServ::UnlockDir:file /tmp/proof-cache-lock-jsunderm unlocked
Oct 10 21:27:47 jsunderm proofserv[3404]: jsunderm:master:Info:TProofServ::LockDir:file /tmp/proof-cache-lock-jsunderm locked
Oct 10 21:27:47 jsunderm proofserv[3404]: jsunderm:master:Info:TProofServ::HandleSocketInput:kPROOF_CHECKFILE:file h42.h not yet on node
Oct 10 21:27:47 jsunderm proofserv[3404]: jsunderm:master:Info:TProofServ::UnlockDir:file /tmp/proof-cache-lock-jsunderm unlocked
Oct 10 21:27:48 jsunderm proofserv[3404]: jsunderm:master:Info:TProofServ::LockDir:file /tmp/proof-cache-lock-jsunderm locked
Oct 10 21:27:48 jsunderm proofserv[3404]: jsunderm:master:Info:TProofServ::UnlockDir:file /tmp/proof-cache-lock-jsunderm unlocked
Oct 10 21:27:48 jsunderm proofslave[3409]: jsunderm:slave 0:Info:TProofServ::LockDir:file /tmp/proof-cache-lock-jsunderm locked
Oct 10 21:27:48 jsunderm proofslave[3409]: jsunderm:slave 0:Info:TProofServ::HandleSocketInput:kPROOF_CHECKFILE:file h42.h already on node
Oct 10 21:27:48 jsunderm proofslave[3409]: jsunderm:slave 0:Info:TProofServ::UnlockDir:file /tmp/proof-cache-lock-jsunderm unlocked
Oct 10 21:27:48 jsunderm proofslave[3414]: jsunderm:slave 1:Info:TProofServ::LockDir:file /tmp/proof-cache-lock-jsunderm locked
Oct 10 21:27:48 jsunderm proofslave[3414]: jsunderm:slave 1:Info:TProofServ::HandleSocketInput:kPROOF_CHECKFILE:file h42.h already on node
Oct 10 21:27:48 jsunderm proofslave[3414]: jsunderm:slave 1:Info:TProofServ::UnlockDir:file /tmp/proof-cache-lock-jsunderm unlocked
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TProofServ::HandleSocketInput:kPROOF_PROCESS:Enter
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TProofPlayerRemote::Process:Enter
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TProofPlayerRemote::Process:Sendfile: h42.C
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TProofPlayerRemote::Process:SendFile: h42.h
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TProofPlayerRemote::Process:Create Proxy TDSet
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TPacketizer2::TPacketizer2:Enter
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TPacketizer2::TPacketizer2:Socket added to monitor: 0x86aab88 (localhost)
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TPacketizer2::TPacketizer2:Socket added to monitor: 0x86ab878 (localhost)
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TPacketizer2::TPacketizer2:sent to slave-0 (localhost) via 0x86aab88 reportsize on tree root://localhost/temp/prooftest/data1.root / h42
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TPacketizer2::TPacketizer2:sent to slave-1 (localhost) via 0x86ab878 reportsize on tree root://localhost/temp/prooftest/data2.root / h42
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TPacketizer2::TPacketizer2:waiting for 2 slaves:
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TPacketizer2::TPacketizer2: slave-0 (localhost)
Oct 10 21:28:08 jsunderm proofserv[3404]: jsunderm:master:Info:TPacketizer2::TPacketizer2: slave-1 (localhost)
Oct 10 21:28:08 jsunderm proofslave[3409]: jsunderm:slave 0:Info:TProofServ::HandleSocketInput:kPROOF_REPORTSIZE:Enter
Oct 10 21:28:08 jsunderm proofslave[3414]: jsunderm:slave 1:Info:TProofServ::HandleSocketInput:kPROOF_REPORTSIZE:Enter
Oct 10 21:28:08 jsunderm proofslave[3409]: jsunderm:slave 0:Info:TProofServ::HandleSocketInput:kPROOF_REPORTSIZE:Report size of object h42 (T) in dir / in file root://localhost/temp/prooftest/data1.root
Oct 10 21:28:08 jsunderm proofslave[3414]: jsunderm:slave 1:Info:TProofServ::HandleSocketInput:kPROOF_REPORTSIZE:Report size of object h42 (T) in dir / in file root://localhost/temp/prooftest/data2.root
Oct 10 21:28:09 jsunderm proofslave[3409]: jsunderm:slave 0:SysError:TUnixSystem::UnixRecv:recv (Connection reset by peer)
Oct 10 21:28:09 jsunderm proofserv[3404]: jsunderm:master:Info:TPacketizer2::TPacketizer2:Select returned: 0x86aab88
Oct 10 21:28:09 jsunderm proofslave[3414]: jsunderm:slave 1:SysError:TUnixSystem::UnixRecv:recv (Connection reset by peer)
Oct 10 21:28:09 jsunderm proofserv[3404]: jsunderm:master:Error:TPacketizer2::TPacketizer2:kPROOF_FATAL from slave-0 (localhost)
Oct 10 21:28:09 jsunderm proofserv[3404]: jsunderm:master:Info:TPacketizer2::TPacketizer2:waiting for 1 slaves:
Oct 10 21:28:09 jsunderm proofserv[3404]: jsunderm:master:*** Break ***:segmentation violation
Oct 10 21:39:45 jsunderm proofserv[3404]: jsunderm:master:Info:TProofServ::HandleUrgentData:
Hard Interrupt
Oct 10 21:39:55 jsunderm proofserv[3404]: jsunderm:master:Error:TProof::Interrupt:server 1 does not respond
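
With the log level set to 3, most of the proof.log lines are Info messages, and the few SysError, Error and Break lines are easy to miss. A minimal C++ sketch for filtering them out follows. It assumes that every entry has the layout seen in this log (a syslog prefix ending in "]: ", then name:role:level:message), and the struct and function names are hypothetical, not part of ROOT.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One parsed proof.log entry (field names are illustrative only).
struct ProofLogEntry {
    std::string role;     // e.g. "master", "slave 0"
    std::string level;    // e.g. "Info", "Error", "SysError"
    std::string message;  // remainder of the line
};

// Parse the part after "]: " as name:role:level:message.
// Returns false for lines that do not follow that layout,
// such as the wrapped "Hard Interrupt" continuation line.
bool parseProofLogLine(const std::string& line, ProofLogEntry& out) {
    const std::size_t npos = std::string::npos;
    std::size_t start = line.find("]: ");
    if (start == npos) return false;
    start += 3;                                   // skip past "]: "
    const std::size_t c1 = line.find(':', start); // end of leading name field
    if (c1 == npos) return false;
    const std::size_t c2 = line.find(':', c1 + 1); // end of role
    if (c2 == npos) return false;
    const std::size_t c3 = line.find(':', c2 + 1); // end of level
    if (c3 == npos) return false;
    out.role = line.substr(c1 + 1, c2 - c1 - 1);
    out.level = line.substr(c2 + 1, c3 - c2 - 1);
    out.message = line.substr(c3 + 1);
    return true;
}

// Keep every parsable entry whose level is not "Info".
std::vector<ProofLogEntry> problemsOnly(const std::string& logText) {
    std::vector<ProofLogEntry> result;
    std::istringstream in(logText);
    std::string line;
    while (std::getline(in, line)) {
        ProofLogEntry entry;
        if (parseProofLogLine(line, entry) && entry.level != "Info")
            result.push_back(entry);
    }
    return result;
}
```

Applied to the log above, this would leave the two "Connection reset by peer" lines from the slaves, the kPROOF_FATAL error and the segmentation violation on the master, and the final "server 1 does not respond" error.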

Find the content of the ~/proof directory attached to this message.

I also tried to open the .root files directly via rootd:

[localhost] ~/temp/prooftest >rootd -p 20213

output of the root-session
root [0] TNetFile fnet("root://localhost:20213/temp/prooftest/data1.root")
Name (localhost:jsunderm): jsunderm
root [1] fnet.ls()
TNetFile** root://localhost:20213/temp/prooftest/data1.root HBOOK file: data1.hbook converted to ROOT
TNetFile* root://localhost:20213/temp/prooftest/data1.root HBOOK file: data1.hbook converted to ROOT
KEY: TTree h42;1 dstar

… this seems to work without problems.

Best regards,

Jan Erik.
proof_dir.tar.gz (21.6 KB)

Thanks,

I think I’ve located the problem. I overlooked a backward compatibility issue with the old TSelectors. A fix for the problem should be in CVS shortly.

Were you able to look at the ssh authentication and rootd?

Cheers,

Maarten.

Hi,

the ssh authentication seems to work without problems (have a look at the end of my previous posting).

Best regards,

Jan Erik.

Hello,

the ssh authentication with PROOF works now. I think the reported problems were due to a mistake in my local ssh configuration. I tried the ssh authentication before I had entered my public key into the authorized_keys file, assuming that gROOT->Proof() would ask me for my password.
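
For anyone hitting the same "Failed password" messages, one thing to check before starting PROOF is that the public key is really listed in authorized_keys. A minimal C++ sketch of that check follows. It assumes protocol-2 style key lines of the form "type blob comment", compares only the key blob, and uses hypothetical function names. It is an illustration of the check, not anything ROOT or ssh provides.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: return the key blob (second whitespace-separated
// field) of a public key line such as "ssh-rsa AAAA... user@host".
// Returns an empty string if the line has fewer than two fields.
std::string keyBlob(const std::string& publicKeyLine) {
    std::istringstream fields(publicKeyLine);
    std::string type, blob;
    fields >> type >> blob;
    return blob;
}

// True if any line of the authorized_keys text contains the blob of the
// given public key as a whole whitespace-separated token. Comparing
// tokens rather than whole lines tolerates differing comments or
// leading options on the authorized_keys side.
bool isKeyAuthorized(const std::string& authorizedKeysText,
                     const std::string& publicKeyLine) {
    const std::string blob = keyBlob(publicKeyLine);
    if (blob.empty()) return false;
    std::istringstream in(authorizedKeysText);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string token;
        while (fields >> token) {
            if (token == blob) return true;
        }
    }
    return false;
}
```

The two arguments would be the contents of the authorized_keys file and of the public key file, read into strings by the caller.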

Regards,

Jan Erik.

Hello,

The fix is in CVS now. Your problem should be solved.

Cheers,

Maarten.