Proof crashed with submergers

Hi,

We see a crash when using submergers in PROOF. I tried the tutorial examples: submergers work only for the non-data-driven example “simple”, while the data-driven example “ntuple” crashes.

The following example “simple” ran successfully:

root [] .L runProof.C+
root [] runProof("simple(hist=1000,submergers=0)");

While the following example “ntuple” crashed:

root [] .L runProof.C+
root [] runProof("ntuple(submergers)");

The error I got is:

[quote]Starting master: opening connection …
Starting master: OK
Opening connections to workers: OK (8 workers)
Setting up worker servers: OK (8 workers)
PROOF set to parallel mode (8 workers)
getProof: WARNING: started/attached a session on external cluster (proof://localhost:11093): ‘dir="/tmp/yesw2000/.proof-tutorial"’ ignored
runProof: ntuple: ACLiC mode: '+'
runProof: ntuple: enabling merging via sub-mergers (optimal number)

runProof: running “ntuple” with nevt= 1000

Mst-0: Number of mergers set dynamically to 3 (for 8 workers)
Worker ‘localhost.localdomain-0.2’ has been removed from the active list

+++ Message from top master at acas0251.usatlas.bnl.gov:11093 : marking localhost.localdomain:11093 (0.2) as bad
+++ Reason: received kPROOF_FATAL

+++ Most likely your code crashed on worker 0.2 at localhost.localdomain:11093.
+++ Please check the session logs for error messages either using
+++ the ‘Show logs’ button or executing
+++
+++ root [] TProof::Mgr("acas0251.usatlas.bnl.gov:11093")->GetSessionLogs()->Display("0.2",0)

Mst-0: merging output objects … done
Mst-0: grand total: sent 9 objects, size: 245991 bytes )
Error in ProofNtuple::Terminate: TProofOutputFile not found
[/quote]

With submergers turned off, the “ntuple” example runs successfully.

The ROOT version I tried is 5.27.04, and the OS is SL5.3.

Any idea?

–Shuwei

Hi Shuwei,

I could reproduce the problem. There is a missing protection affecting the case when the output list contains TProofOutputFile objects. This is the reason why it shows up in ‘ntuple’ (but it should not show up in other data-driven examples, like ‘eventproc’: can you confirm?).
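For readers following along, here is a small illustration of the symptom in the log ("TProofOutputFile not found"). It is not the actual patch and not ROOT code: it is a plain C++ stand-in in which a terminate step looks for a file-type object in an output list and reports an error when that object is absent. All type and function names (OutputObject, Histogram, OutputFile, findOutputFile, terminate) and the file name used below are hypothetical.

```cpp
#include <iostream>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for entries of an output list (NOT ROOT classes).
struct OutputObject {
    explicit OutputObject(std::string n) : name(std::move(n)) {}
    virtual ~OutputObject() = default;
    std::string name;
};
struct Histogram : OutputObject { using OutputObject::OutputObject; };
struct OutputFile : OutputObject { using OutputObject::OutputObject; };

using OutputList = std::vector<std::unique_ptr<OutputObject>>;

// Return the first file-type entry in the list, or nullptr if there is none.
const OutputFile* findOutputFile(const OutputList& list) {
    for (const auto& obj : list) {
        if (const auto* f = dynamic_cast<const OutputFile*>(obj.get())) {
            return f;
        }
    }
    return nullptr;
}

// Guarded terminate step: report a missing file object instead of
// dereferencing a null pointer.
bool terminate(const OutputList& list) {
    const OutputFile* f = findOutputFile(list);
    if (f == nullptr) {
        std::cerr << "Error in terminate: output file object not found\n";
        return false;
    }
    std::cout << "Reading results from " << f->name << "\n";
    return true;
}
```

In this model, a list that still holds an OutputFile entry (say one named "ntuple.root") makes terminate() return true, while a list holding only Histogram entries makes it print the error and return false.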

I have added the protection in the trunk, in 5-27-06-patches and 5-26-00-patches.

Thanks for reporting.

Gerri

Hi Gerri,

I confirm that “eventproc” does work with submergers in ROOT-5.27.04. In addition, I tried ROOT-5.27.06b and the trunk; both work fine with submergers. Many thanks for the fix.

–Shuwei

Hi Gerri,

I just found a bug in ROOT-5.27.06 preventing us from running our python analysis code. At BNL most users are using ROOT-5.27.04. If convenient, please help make a patch with the TProofOutputFile bugfix for ROOT-5.27.04. Thanks.

–Shuwei

[quote]I just found a bug in ROOT-5.27.06 preventing us from running our python analysis code.[/quote]If this is unrelated to this topic, can you open a new topic or, better yet, if it is a bug, report it through savannah.cern.ch?

[quote]I just found a bug in ROOT-5.27.06 preventing us from running our python analysis code. At BNL most users are using ROOT-5.27.04. If convenient, please help make a patch with the TProofOutputFile bugfix for ROOT-5.27.04[/quote]I am confused … is the bug in v5.27/04 or v5.27/06? Are you requesting a bug fix for /04 or /06? Can you try the head of the svn repository?

Philippe.

Hi,

Sorry for the confusion. There was a bug related to TProofOutputFile. Gerri fixed it in the trunk and in the 5.27.06 and 5.26.00 patches. I tried 5.27.06b and the trunk with a ROOT macro, and both work as far as TProofOutputFile is concerned. But there is [color=#FF0000]another bug in ROOT which shows up only in python[/color], preventing us from using those ROOT versions in python, as I reported:

https://savannah.cern.ch/bugs/index.php?75406

So my request is to make a patch on top of 5.27.04 with Gerri’s bugfix for TProofOutputFile, so our python analysis code can make use of submergers in Proof.

-Shuwei