Naive question about CPU usage in ROOT

Dear experts,

I have noticed a trend in the time consumption of some ROOT jobs and I wish to understand it.

Say I execute a ROOT macro which performs a job on a dataset of size N, and that when I execute it locally on the command line of my cluster, it takes half an hour.

Now say I have 100 datasets, all of a size similar to N (within statistical fluctuations), and I submit 100 parallel instances of my ROOT job to HTCondor, which delegates the jobs to various worker nodes on my cluster. I now see that many jobs take a lot longer than half an hour… something in the neighbourhood of 2 hours, for example. A few complete more quickly (say in an hour), but none as quickly as the local execution.

The only explanation I can come up with is that when Condor submits jobs to these worker nodes, it often grabs all the cores on a particular machine, so there might be 10 jobs running on the 10 cores of one machine at 1 job/core.

Is it possible that when I execute a single job locally, presumably without grabbing all the cores, ROOT, unknown to me, shares the job among a few cores and runs faster? Or is there some other explanation for this?

Thanks in advance for your help,
Arvind.

Hi @avenkate ,
ROOT only runs on multiple cores “under the hood” if you activate implicit multi-threading by calling ROOT::EnableImplicitMT() somewhere in your code. Running your program under /usr/bin/time should give you a percentage CPU utilization that will be 100% for a single core and e.g. 200% for an average usage of 2 cores. You can use that to check whether multiple cores are being used.
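For reference, here is a minimal sketch of where that call would go. The macro name, tree name, file name and branch name are placeholders for illustration, not taken from your job:

```cpp
// analysis.C -- hypothetical macro; adapt the names to your actual job.
#include <ROOT/RDataFrame.hxx>
#include <TROOT.h>
#include <iostream>

void analysis()
{
   // Without this call ROOT stays single-threaded.
   // Uncomment to let e.g. RDataFrame use all available cores,
   // or pass a number to cap the thread count (useful on shared batch nodes).
   // ROOT::EnableImplicitMT();    // all cores
   // ROOT::EnableImplicitMT(4);   // at most 4 threads

   ROOT::RDataFrame df("Events", "data.root");  // placeholder tree and file
   auto h = df.Histo1D("pt");                   // placeholder branch
   std::cout << "Entries: " << h->GetEntries() << std::endl;
}
```

To see how many cores are actually used, run it under /usr/bin/time, e.g. `/usr/bin/time root -l -b -q analysis.C`, and look at the %CPU field (roughly 100% per fully used core).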

Could it also just be that the local machine is more powerful and less busy, while the cluster nodes are less powerful (i.e. slower CPUs) and more busy (i.e. you don’t get as much CPU power per second for your jobs)?

Cheers,
Enrico
