Only 2 of N>2 available workers are used

I have a cluster of machines running PROOF. All of them seem to be working and can run simple commands like gProof->Exec(".!uname -a"). But whenever I run my analysis, only 2 workers are used, even when 10 are available. The memory plots show activity on only 2 of them.
I tried to force the use of all the workers by setting wmx:-1, which is documented as

<max_workers> Maximum number of workers to be assigned to user session [-1, i.e. all]

but it is not working.
Is there any way to force all the workers to be used? Or any place where I can debug why only 2 workers are used?
Ana Rodríguez.

Dear Ana,

The only reason I can think of is that all the workers are reading from the same server. In such a case there is a limit on the number of workers accessing that server. The limit can be lifted by setting:

gProof->SetParameter("Packetizer.MaxWorkersPerNode", 9999)

If this does not work, could you say a bit more about your hardware and configuration, and specify the ROOT version?

G. Ganis

I tried the gProof statement you suggested but it did not solve anything.

I am using ROOT Version 5.25/02 29 September 2009.

I start the xroot workers through an SGE batch system; the master runs on a separate, fixed machine. All of them run SL5 64-bit.
They are 8-core machines with 16 GB RAM; usually I get 1 core per machine as a worker.

Thanks for your help.

How many files are you processing?
Are the files located on the workers?


I have tried reading from 1 to 5 files, each of them ~11 GB with ~2e6 events.
The workers have access to the files through GPFS.
Only 3 processes (1 master, 2 workers) are ever used.
The “show logs” dialog shows “// # of retrieved lines: 0” for all the idle workers, and a number greater than 0 for the two that are used.

root [0] TProof *p = TProof::Open("")
Starting master: opening connection …
Starting master: OK
Opening connections to workers: OK (10 workers)
Setting up worker servers: OK (10 workers)
PROOF set to parallel mode (10 workers)
root [1] gProof->SetParameter("Packetizer.MaxWorkersPerNode", 9999)
root [2] TDSet *set = new TDSet("TTree", "Tree")
root [3] set->Add("/gpfs/csic_projects/cms/PROOF_data/data/minitree_Wjets_IC.root")
root [4] p->Process(set, "myselector.C")
Looking up for exact location of files: OK (1 files)
Looking up for exact location of files: OK (1 files)
Validating files: OK (1 files)
Mst-0: merging output objects … / (3 workers still sending)

Ok, I still believe that it comes from the limitation I mentioned, but by mistake I swapped the name of the related rootrc env variable with the name of the parameter to be used in SetParameter. Sorry.

Could you try setting:

p->SetParameter("PROOF_MaxSlavesPerNode", 9999)


Now it is working.