pyRoot very slow for "uwfunc"-like plotting

oleroy · December 14, 2006, 5:21pm

Dear pyRoot experts,

I compare PAW, pyRoot and ROOT.
I plot for example “the highest pT electron candidate” of my tuple.root (cwn-like ntuple).

PAW-speaking, I use “n/pl 1.mypT.f”, where mypT.f is a short Fortran function starting with “include ?”.
With pyRoot, the commands are rather simple too:
"python Myplot.py"
where Myplot.py is the following file:

[code]from ROOT import TFile, TH1D, TCanvas

f = TFile("jpsikstBrunelv30r14.root")
t = f.Get("1")
h1 = TH1D("P_{T} MCID=11", "", 100, 0, 5)
nentries = t.GetEntries()
for i in range(nentries):
    t.GetEntry(i)
    # highest pT among the electron candidates (|MCID| == 11) of this entry
    ptmax = 0.
    for j in range(t.N):
        if abs(t.MCID[j]) == 11 and t.P[j] > 5. and t.Pt[j] > 1.2:
            if t.Pt[j] > ptmax:
                ptmax = t.Pt[j]
    if ptmax > 0.:
        h1.Fill(ptmax)
h1.Draw()[/code]

With ROOT, I use MakeClass.

pyRoot is nice and elegant, but 7 times slower than ROOT to make the same plot!
It’s a pity that a plot as simple as the one above takes 40 seconds to make.
Is there a way to speed up the above code?
Do I have to abandon pyRoot for ROOT?

I have already read the suggestions on this forum, but they do not solve this CPU problem.

Thanks a lot,
Cheers,

Olivier

wlav · December 15, 2006, 5:09am

Olivier,

not sure which other things you have already tried, nor which ROOT release this is, but member lookups in Python are slow in inner loops, and if this is a recent version of PyROOT, the results of the lookups are not cached.

So, a line like:

MCID, P, Pt = t.MCID, t.P, t.Pt

just in front of the “for j in range(t.N):” loop, with those variables used inside the loop instead of a lookup every time, should help quite a bit.
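A minimal sketch of the suggestion above, runnable without ROOT. `FakeTree`, `run`, `max_pt_naive` and `max_pt_cached` are hypothetical names made up for illustration; `FakeTree` only mimics the on-the-fly branch lookup by counting calls to `__getattr__`, and says nothing about how PyROOT implements it.

```python
class FakeTree:
    """Hypothetical stand-in for a TTree: counts dynamic branch lookups."""

    def __init__(self, events):
        self._events = events
        self._current = {}
        self.lookups = 0

    def GetEntries(self):
        return len(self._events)

    def GetEntry(self, i):
        self._current = self._events[i]

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. the "branches".
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            value = self._current[name]
        except KeyError:
            raise AttributeError(name)
        self.lookups += 1
        return value


def max_pt_naive(t):
    """Look the branches up on the tree at every use inside the loop."""
    ptmax = 0.
    for j in range(t.N):
        if abs(t.MCID[j]) == 11 and t.P[j] > 5. and t.Pt[j] > 1.2:
            if t.Pt[j] > ptmax:
                ptmax = t.Pt[j]
    return ptmax


def max_pt_cached(t):
    """Look the branches up once per entry, just in front of the inner loop."""
    MCID, P, Pt = t.MCID, t.P, t.Pt
    ptmax = 0.
    for j in range(t.N):
        if abs(MCID[j]) == 11 and P[j] > 5. and Pt[j] > 1.2:
            if Pt[j] > ptmax:
                ptmax = Pt[j]
    return ptmax


def run(select, events):
    """Loop over all entries; return per-entry results and the lookup count."""
    t = FakeTree(events)
    results = []
    for i in range(t.GetEntries()):
        t.GetEntry(i)
        results.append(select(t))
    return results, t.lookups


# Made-up toy data, two entries.
events = [
    {"N": 3, "MCID": [11, -11, 13], "P": [6.0, 7.0, 9.0], "Pt": [1.5, 2.5, 3.0]},
    {"N": 2, "MCID": [211, 11], "P": [8.0, 4.0], "Pt": [2.0, 1.3]},
]

naive_results, naive_lookups = run(max_pt_naive, events)
cached_results, cached_lookups = run(max_pt_cached, events)
print(naive_results == cached_results, naive_lookups, cached_lookups)
```

Both versions select the same candidates; the cached one does a fixed number of lookups per entry, whereas the naive one grows with the number of candidates. How much time this saves with a real tree depends on the PyROOT version.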

I have to find a way of caching, but what I had originally did not suffice in all situations. As well, I should reorder the way that index checking is done. In both cases there are today too many function calls for what in C++ or Fortran is basically direct addressing.

Cheers,
Wim

oleroy · December 15, 2006, 8:38am

Hi Wim,

Thanks for your suggestion: caching the lookups with

MCID, P, Pt = t.MCID, t.P, t.Pt

gains a factor of 4 in execution time.

I use ROOT 5.13/02 and Python 2.4.1.

Is there a Python equivalent to the ROOT command “MakeClass”?
Or is there an easy way to automatically write the above code for a given ntuple?

Thanks again,
Cheers
Olivier

wlav · December 15, 2006, 8:54am

Olivier,

that’s on the one hand good, but on the other hand this is performance that I’m leaving (unnecessarily) on the table. In other words, I have my work cut out for me.

PI had something in the past: a wrapper function that, given a TTree instance, would create all the proper accessors in Python up front, rather than the on-the-fly creation that I’m doing now. That earlier approach avoided caching problems, but I’m not sure of its performance. An old example is here: http://cern.ch/PI/Examples/PyAIDAProxy/examples/readTree.py (see PyListOfLeaves), which can be further prettified using properties.
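To illustrate the idea of creating accessors up front with properties, here is a rough sketch. It is not the PI code: `make_tree_wrapper` and `TreeWrapper` are hypothetical names, and a plain dict stands in for the current tree entry. With a real tree, the leaf names could presumably be taken from `t.GetListOfLeaves()`.

```python
def make_tree_wrapper(leaf_names):
    """Build a class with one read-only property per leaf name.

    Hypothetical helper: the properties are created once, here,
    instead of being resolved on the fly at each attribute access.
    """
    def make_property(name):
        # Bind 'name' per property; reads from the wrapped source object.
        return property(lambda self: self._source[name])

    def init(self, source):
        self._source = source

    namespace = {name: make_property(name) for name in leaf_names}
    namespace["__init__"] = init
    return type("TreeWrapper", (object,), namespace)


# A dict stands in for the current entry of a tree (made-up values).
entry = {"N": 2, "MCID": [11, 211], "Pt": [1.5, 2.5]}

TreeWrapper = make_tree_wrapper(["N", "MCID", "Pt"])
w = TreeWrapper(entry)
print(w.N, w.MCID, w.Pt)
```

The accessors then read like plain attributes (`w.Pt[0]`), while the set of available names is fixed when the wrapper class is built.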

In practice, I prefer on-the-fly access and lookups, because once I have the performance sorted out, that approach allows for automatically reading only those branches that are actually used.

As for an easy way to write such code automatically for a given ntuple: not that I’m aware of, but I can imagine a toolkit being built for that purpose, using overloading of | and &, along the lines of TCut.
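As a rough idea of what such a toolkit might look like (it does not exist as far as this thread says), here is a small sketch. The `Cut` class and all the cut names are hypothetical, and a `SimpleNamespace` with made-up values stands in for a tree entry.

```python
from types import SimpleNamespace


class Cut:
    """Hypothetical composable selection wrapping a predicate on (tree, index)."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __call__(self, t, j):
        return self.predicate(t, j)

    def __and__(self, other):
        # cut_a & cut_b passes only if both cuts pass
        return Cut(lambda t, j: self(t, j) and other(t, j))

    def __or__(self, other):
        # cut_a | cut_b passes if at least one cut passes
        return Cut(lambda t, j: self(t, j) or other(t, j))


# Elementary cuts, written once and reused.
is_electron = Cut(lambda t, j: abs(t.MCID[j]) == 11)
is_muon = Cut(lambda t, j: abs(t.MCID[j]) == 13)
hard = Cut(lambda t, j: t.P[j] > 5.)
high_pt = Cut(lambda t, j: t.Pt[j] > 1.2)

# Composite selections built with & and |.
electron_sel = is_electron & hard & high_pt
lepton_sel = (is_electron | is_muon) & hard

# Stand-in for one tree entry (made-up values).
entry = SimpleNamespace(N=3, MCID=[11, -11, 13],
                        P=[6.0, 7.0, 9.0], Pt=[1.5, 2.5, 3.0])

electrons = [j for j in range(entry.N) if electron_sel(entry, j)]
leptons = [j for j in range(entry.N) if lepton_sel(entry, j)]
print(electrons, leptons)
```

Python cannot overload `and` and `or`, which is why `&` and `|` are used here. They bind more tightly than comparisons, so mixed expressions need explicit parentheses.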

Cheers,
Wim

pcanal · December 15, 2006, 12:32pm

As a note, this feature is also available in ROOT:

t->Draw("mypT.C")

where mypT.C is a short C++ file that can use the tree variable names directly and whose return value will (by default) be histogrammed. This uses TTree::MakeProxy, which is similar to MakeClass/MakeSelector but supports accessing C++ objects (in the Event in the TTree) and supports autoloading.

Cheers,
Philippe.