Prescaling TTree

Peter_Thomassen · May 20, 2015, 4:05pm

Hi,

Is there a way to read only a fraction x of the events in a TTree to decrease run time?

I would like to avoid reading only the first x*N events, because the tree might have been written according to some order. It would be nice if I could read a tree in a way such that each event is read with probability x, maybe starting out from some seed in order to ensure reproducibility.

Ideally, this would be applicable to Draw(), too.

Thanks a lot for your comments!

Best,
Peter

couet · May 20, 2015, 4:12pm

See the last two parameters of TTree::Draw():
root.cern.ch/root/htmldoc/TTree … ree:Draw@1

Peter_Thomassen · May 20, 2015, 4:14pm

Hi,

Sure, that’s an option. However, I then have to call Draw() for each event after deciding whether I want to read it (with nentries = 1 always). In this case, the selection expression would be parsed for every event, unnecessarily.

I was looking for a way to have nentries > 1 and still do some prescaling. Is the approach with nentries = 1 the best (= lowest runtime) approach to do this?

Best,
Peter

couet · May 20, 2015, 4:23pm

Your question was:

… the last two parameters do that: nentries gives the number of entries you want to read, and firstentry gives the first entry from which they are read.

So I guess that’s the “way to read only a fraction of the events”.

Peter_Thomassen · May 20, 2015, 4:29pm

Hi,

I know and understand that this is the standard solution to the broad question I asked. However, I cannot conclude from it whether one can set a prescale flag or something that would make Draw() skip events with a certain probability by itself, without having to call Draw() multiple times, for the reasons I’ve outlined (e.g. having to parse the cut string multiple times).

However, I take your second reply as a hint that my detailed question did not make you think of such an option, so I infer that the answer is no.

Thanks,
Peter

jfcaron · May 20, 2015, 5:34pm

Unless you expect correlations between events (the data for event N somehow determines or influences the data for event N+1), skipping events should be equivalent to reading only a partial block.

Say you have N=100000 events and you want to only read 20% of them. If you were coding your own loop over the entries, you could skip entries where ( i % 5 != 0 ), so you’d only really read every 5th event. But if the events are not correlated, that should be completely equivalent to reading only events [1 … 20000] or any other block with the same number of entries.

If you really insist on reading only every 5th entry but want to use the TTree::Draw method, you could first create a TEntryList with a suitable selection in the 2nd TTree::Draw argument, then use that entry list for looping over your tree. I don’t think there is a way to do it probabilistically (i.e. with random numbers) using TTree::Draw, but you could just fill a TEntryList manually with random entry numbers.
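
A minimal sketch of the "random entry numbers" part, written in plain C++ so it does not depend on ROOT. The function name select_entries and its parameters are made up for illustration. It keeps each entry with probability `fraction` and gives the same selection for the same seed.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Pick each entry number in [0, n_entries) independently with probability
// `fraction`. The same seed always yields the same selection.
// Raw std::mt19937 output is used instead of a std:: distribution because the
// engine's sequence is fixed by the C++ standard, while distributions may
// differ between standard-library implementations.
std::vector<long long> select_entries(long long n_entries, double fraction,
                                      std::uint32_t seed) {
    std::mt19937 gen(seed);
    // gen() is uniform on [0, 2^32), so this threshold accepts with
    // probability `fraction` (always for 1.0, never for 0.0).
    const double threshold = fraction * 4294967296.0;
    std::vector<long long> selected;
    for (long long i = 0; i < n_entries; ++i) {
        if (static_cast<double>(gen()) < threshold) {
            selected.push_back(i);
        }
    }
    return selected;
}
```

In a ROOT session, each returned number could then be passed to TEntryList::Enter() and the list attached to the tree with TTree::SetEntryList() before calling Draw(). That wiring is not shown here and is untested in this sketch.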

Jean-François

Peter_Thomassen · May 20, 2015, 5:50pm

Hi Jean-François,

Thank you for your reply. The problem is that my code is going to be used for all sorts of trees, and I don’t know ahead of time whether there will be correlations. It could be that in an MC signal sample, Z bosons in events with even event number decay to ee, and with odd event number they decay to mumu … (If that were not the case, I agree that reading the first x*N events is equivalent.)

Your idea with the randomly filled entry list is great. Thanks!

Best,
Peter

Wile_E_Coyote · May 20, 2015, 6:15pm

Actually, you should be able to use:
tree->Draw("something", "Entry$ % 5 != 0"); // skip every 5th entry
and/or:
tree->Draw("something", "Entry$ % 5 == 0"); // draw every 5th entry
and/or:
tree->Draw("something", "rndm() < 20.0/100.0"); // draw random sample of about 20% of entries
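
For code that has to work with arbitrary user cuts, the rndm() condition can be attached to an existing selection string. The following is only a sketch in plain C++; the helper names prescale_cut and with_prescale are hypothetical, and "pt > 20" in the usage note stands for any user selection. The two parts are multiplied rather than joined with &&, since TTree::Draw treats the selection value as a weight and && would reduce a weight expression to 0 or 1.

```cpp
#include <sstream>
#include <string>

// Build a TTree::Draw selection that keeps roughly `fraction` of the entries.
std::string prescale_cut(double fraction) {
    std::ostringstream out;
    out << "rndm() < " << fraction;
    return out.str();
}

// Combine the prescale with a user selection by multiplication, so that a
// weight expression in `user_cut` keeps its value for the accepted entries.
std::string with_prescale(const std::string& user_cut, double fraction) {
    const std::string prescale = prescale_cut(fraction);
    if (user_cut.empty()) {
        return prescale;
    }
    return "(" + user_cut + ") * (" + prescale + ")";
}
```

Usage would look like tree->Draw("something", with_prescale("pt > 20", 0.2).c_str()). Assuming rndm() draws from the global gRandom generator, calling gRandom->SetSeed() with a fixed value before Draw() should make the selection reproducible; that assumption is worth checking against your ROOT version.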

Peter_Thomassen · May 20, 2015, 8:02pm

Cool, Wile! Thanks.

jfcaron · May 20, 2015, 10:33pm

Ah cool, I looked in TMath to see if there was a random-number generator in there (which you can use in TTree::Draw formulas), but couldn’t find anything.

Where is this “rndm” function from? It’s not in cmath or anything. Is it the same as TRandom::Rndm()?

You can make a TF1 with the formula “rndm()” and it re-generates the points each time you draw it (e.g. when you click on the canvas). Pretty nifty, and solving the OP’s problem might be the only reasonable application… = )

Jean-François

Wile_E_Coyote · May 21, 2015, 6:36am

Entry$ -> TTree::Draw
rndm -> TFormula::Analyze