Thanks to Anar and our sysadmin, we have set up xrootd on our DPM storage element (for local ATLAS use) and gLitePROOF. I am now trying to run some benchmarks using root.cern.ch/twiki/bin/view/ROOT/ProofBench
I use 10 worker nodes and 20 input files of 10 000 events each (the same file 20 times). If you look at the logfiles, PROOF “sees” 10 slaves, but the job seems to run on only 2-3 of them (see also the processing rate plot).
[quote]PROOF_MaxSlavesPerNode
Type: int
Description: Parameter for the packetizers. Limits the number of slaves accessing data on any single node.
Default value: In TPacketizer the default value is 4. In TPacketizerAdaptive and TPacketizerProgressive it is 2.
Example: proof->SetParameter("PROOF_MaxSlavesPerNode", 2); [/quote]
Please try it and let me know. I will also think about how to automate it in gLitePROOF.
I have just checked the nodes carefully.
In fact, for one job with 10 workers, the slaves are spread over, for example, 5 different nodes:
3 slaves on node wn43
2 on node wn48
2 on node wn46
2 on wn45
1 on wn44
[quote=“karim”]I have just checked carefully the nodes
in fact for one job with 10 workers, these are for example 5 different nodes :
3 slaves on node wn43
2 on node wn48
2 on node wn46
2 on wn45
1 on wn44
this can explain the xrootd logfile[/quote]
If I understood your post correctly, I think you are slightly mistaken.
gLitePROOF makes the PROOF server think that ALL its workers are on the localhost. The PROOF server doesn't actually know that the real workers are on nodes wn48-wnXX or wherever.
Look at what it writes: sent to slave-0.0 (localhost.localdomain).
gLitePROOF hides remote PROOF workers from the PROOF server and acts as a “proxy”.
And since the default value of PROOF_MaxSlavesPerNode is 2, only 2 slaves get packets. Since all slaves are on the localhost as far as the PROOF server is concerned, the other 8 workers won't get any packets.
What I meant in my previous post is that, for the problem in the subject, I don't think it really matters how gLitePROOF distributes the workers. What matters is how these workers look to the PROOF server, and to the server they are local workers. It is just as if you had started a PROOF session with 10 workers on a single machine.
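To make the argument above concrete, here is a minimal toy sketch in plain C++. It is not PROOF code and does not reproduce the real packetizer logic; it only assumes, as described in this post, that a worker is served while fewer than the per-node limit of workers with the same host name are already being served. The function name countActiveWorkers is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model (an assumption for illustration, not the PROOF packetizer):
// walk over the host name reported for each worker and admit a worker
// only while fewer than maxPerNode workers on that host are admitted.
int countActiveWorkers(const std::vector<std::string>& hosts, int maxPerNode) {
    assert(maxPerNode >= 0);
    std::map<std::string, int> admittedPerHost;  // host name -> admitted workers
    int active = 0;
    for (const std::string& host : hosts) {
        if (admittedPerHost[host] < maxPerNode) {
            ++admittedPerHost[host];
            ++active;
        }
    }
    return active;
}
```

In this toy model, 10 workers that all report localhost.localdomain give 2 active workers with a limit of 2, and 10 active workers once the limit is raised to 10. With the real host names from this thread (wn43 three times, wn48, wn46 and wn45 twice each, wn44 once) a limit of 2 would admit 9 of the 10 workers.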
BTW, I hope you remember that for this kind of change you don't need to resubmit your gLitePROOF jobs.
gLitePROOF supports reconnection: if you just need to change something in your analysis script or restart your local ROOT session, make your changes, restart ROOT and connect to your old workers. gLitePROOF will manage that automatically.
I am really glad to read that. But of course most of the credit should go to the ROOT/PROOF and XROOTD developers and managers.
BTW, please install the new version of gLitePROOF. As I already told you, it contains several important fixes. Anyway, since the subject of this topic is resolved, we can continue by email.
All the best!