PROOF-Lite performance (or lack of)

jade2 · August 22, 2012, 3:00pm

Hi,
I want to speed up the analysis of a large data set on a single 8-core machine. I read up on PROOF-Lite and followed the instructions; starting it up shows this:

root [0] TProof::Open("")
+++ Starting PROOF-Lite with 8 workers +++
Opening connections to workers: OK (8 workers)
Setting up worker servers: OK (8 workers)
PROOF set to parallel mode (8 workers)
(class TProof*)0x2c51960

Then I loaded my ROOT file and, from the TreeViewer, did some simple histogram plotting of about 300 million entries. But I didn’t really see any performance improvement compared to running it without PROOF. While it was plotting, I ran top and could see only one core being used:

top - 10:46:46 up 55 min, 4 users, load average: 0.89, 0.59, 0.47
Tasks: 223 total, 2 running, 221 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 99.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.7%hi, 0.3%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8175500k total, 3566016k used, 4609484k free, 405772k buffers
Swap: 2064380k total, 0k used, 2064380k free, 2051880k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9429 jade 20 0 377m 128m 27m R 99.4 1.6 1:36.37 root.exe
2354 jade 20 0 654m 17m 9m S 0.3 0.2 0:02.54 gnome-terminal
9592 jade 20 0 15372 1404 1008 R 0.3 0.0 0:00.01 top
1 root 20 0 39588 4828 2044 S 0.0 0.1 0:03.05 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
5 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/u:0
6 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
7 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
8 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/1

As you can see, root.exe is the only process with significant CPU activity at this point. Isn’t PROOF supposed to make use of all cores? Judging by the top output above, only one core (Cpu1, at 99%) is busy while the others sit idle. Is there something else I need to do to get the other cores engaged in the process?

ganis · August 22, 2012, 4:55pm

Hi,

There is no direct interface from TreeViewer to PROOF.
You need to create a TChain with your file and run TChain::Draw:

  root [] p = TProof::Open("")
  root [] TChain c("treename")
  root [] c.Add("myfile.root")
  root [] c.SetProof()
  root [] c.Draw("varx","cut1")  // Same syntax as TTree::Draw ...

Note, however, that unless your file is on fast storage (e.g. SSD or memory), you will be limited by disk access; i.e., most likely you won’t get a factor-of-8 speed-up.
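
As a rough illustration of why disk access caps the gain, here is a minimal sketch using Amdahl's law. It assumes that a fixed fraction of the single-process run time is spent waiting on disk and that this part does not parallelize at all, which is a simplification. The function names and the I/O fractions are made up for the example; they are not measurements and not part of ROOT or PROOF.

```cpp
#include <cassert>
#include <cstdio>

// Amdahl's-law estimate of the speed-up with nWorkers workers.
// ioFraction is the ASSUMED share of the single-process run time spent
// waiting on disk, taken here as completely serial (a simplification).
double estimatedSpeedup(double ioFraction, int nWorkers) {
    assert(ioFraction >= 0.0 && ioFraction <= 1.0);
    assert(nWorkers >= 1);
    return 1.0 / (ioFraction + (1.0 - ioFraction) / nWorkers);
}

// Print the estimate for a few illustrative (hypothetical) I/O fractions.
void printSpeedupTable(int nWorkers) {
    const double fractions[] = {0.0, 0.1, 0.25, 0.5, 0.9};
    for (double f : fractions) {
        std::printf("I/O fraction %.2f -> speed-up %.2fx with %d workers\n",
                    f, estimatedSpeedup(f, nWorkers), nWorkers);
    }
}
```

Under this model, `estimatedSpeedup(0.5, 8)` comes out at about 1.78: if half of the time goes to disk access, 8 workers give well under a factor of 2. Only when the I/O fraction is close to zero does the estimate approach 8.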

G. Ganis