Cloning tree

Hi,

I am trying to slim an ntuple as follows, in analogy to how I would do this in C++ using a class generated via TChain::MakeClass():

[code]ch = TChain("qcd")
ch.Add("*qcd.root*")

nEntries = ch.GetEntries()

#--- Throw away some branches
ch.SetBranchStatus("el_*",    0)
ch.SetBranchStatus("mu_*",    0)

ch_new = ch.CloneTree(0)

for i in range(nEntries):
    ch.GetEntry(i)
    #--- Only write out certain events that pass some cut
    if passCut(ch.RunNumber, ch.lbn):
        ch_new.Fill()

#--- Write to new file
outFile = "qcd.slim.root"
newFile = TFile(outFile,"RECREATE")
newFile.cd()
ch_new.Write()
newFile.Close()[/code]

Unfortunately this leaves me with a new chain ch_new that has no events. I guess when I call TChain::Fill() its branches are not associated with addresses of the variables whose values to fill, unlike the C++ case. How can I resolve this?

I have verified btw that TChain::CloneTree() is working properly, e.g. if I just do:

ch_new = ch.CloneTree(nEntries)

then I get a new tree with nEntries as expected.

Thanks,
Eric

Hi,

You need to make sure the new TTree is associated either with no file or, better yet, with the output file before you start filling it. The following should work nicely:
[code]from ROOT import TChain, TFile

ch = TChain("qcd")
ch.Add("*qcd.root*")

nEntries = ch.GetEntries()

#--- Throw away some branches
ch.SetBranchStatus("el_*", 0)
ch.SetBranchStatus("mu_*", 0)

#--- Open the output file first, so the cloned tree is attached to it
outFile = "qcd.slim.root"
newFile = TFile(outFile, "RECREATE")
ch_new = ch.CloneTree(0)

for i in range(nEntries):
    ch.GetEntry(i)
    #--- Only write out certain events that pass some cut
    if passCut(ch.RunNumber, ch.lbn):
        ch_new.Fill()

# use GetCurrentFile just in case we went over the
# (customizable) maximum file size
ch_new.GetCurrentFile().Write()
ch_new.GetCurrentFile().Close()[/code]

Cheers,
Philippe.

Hi Philippe,

I tried what you suggested (in fact I was already doing it in this order), but it did not work for me: I get the same problem, namely zero events.

Any idea what might be wrong? If it helps, I could post the actual code and a ROOT file (30 MB) to test with.

Eric

Hi,

I assume you double-checked that the condition [code]if passCut(ch.RunNumber, ch.lbn):[/code] was sometimes true ...

If it is, then we would need a complete running example reproducing the problem.

Cheers,
Philippe
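
One quick way to run the check suggested above is to count how many entries pass the selection before worrying about the tree cloning. The following is only a minimal sketch: `count_passing`, `example_cut`, and the toy run/lumi-block values are hypothetical and stand in for the real `passCut` and the real chain.

```python
def count_passing(events, pass_cut):
    """Return (n_pass, n_total) for an iterable of (run_number, lbn) pairs."""
    n_pass = 0
    n_total = 0
    for run_number, lbn in events:
        n_total += 1
        if pass_cut(run_number, lbn):
            n_pass += 1
    return n_pass, n_total


# Hypothetical cut and toy input, for illustration only.
def example_cut(run_number, lbn):
    return run_number == 152166 and 200 <= lbn <= 300


toy_events = [(152166, 250), (152166, 10), (999999, 250)]
n_pass, n_total = count_passing(toy_events, example_cut)
print("%d of %d events pass the cut" % (n_pass, n_total))
```

With the chain from this thread, the pairs would presumably be read inside the event loop as `(ch.RunNumber, ch.lbn)` after each `ch.GetEntry(i)`. If the count is zero, an empty output tree is the expected result.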

I apologize Philippe, this function works properly but was not selecting anything because I was using a special ntuple for testing. It works now after correcting this – sorry about this dumb mistake.

Thanks a lot for your help.

Eric

Hello,

Along similar lines, I am trying to slim my data, and I see the following problem:

I get 2 trees with the same name, and both are slimmed.

One tree has slightly fewer events than the other.

Any suggestion as to why I get 2 trees with the same name?

In principle, one would expect a single slimmed version of the tree.

Thanks in advance,

—Saleem

Here is the relevant piece of code:

[code]# get main tree
ch = TChain("physics")
# set max size = 10 times the default, ~20 GB
ch.SetMaxTreeSize(1900000000*10)
for file in inputFiles:
    ch.Add(file)

nEntries = ch.GetEntries()
print("nEntries = %d" % nEntries)

# set branches

# set branch status: at first, all off
ch.SetBranchStatus("*", 0)

# event information
ch.SetBranchStatus("RunNumber", 1)
ch.SetBranchStatus("EventNumber", 1)

# new tree
ch_new = ch.CloneTree(0)

ch_new = ch.CloneTree(0)

if (maxEntries != -1 and nEntries > maxEntries): nEntries = maxEntries

for i in range(nEntries):
    ch.GetEntry(i)
    if i % 10000 == 0:
        print("Processing event nr. %i of %i" % (i, nEntries))
    ch_new.Fill()
ch_new.Print()

# use GetCurrentFile just in case we went over the
# (customizable) maximum file size
ch_new.GetCurrentFile().Write()
ch_new.GetCurrentFile().Close()[/code]

Hi,

Isn’t the issue that CloneTree is called twice?

Philippe.

Hi Philippe,

In my code this is called only once.

Sorry, I cut and pasted from the code, and the line appeared twice in my message.

So it is not called twice in my code, and it is not the source of the issue.

Thanks for further help in advance,

—Saleem

Hi,

Alright, what do you mean by “get the 2 trees with same name”? Aren’t those 2 ‘cycles’ of the same key (See User’s Guide chapter on the I/O)?

Philippe.

Hi,

[quote]Alright, what do you mean by “get the 2 trees with same name”? Aren’t those 2 ‘cycles’ of the same key (See User’s Guide chapter on the I/O)?[/quote]

OK, these are 2 cycles (but cycles should not have different numbers of entries). Calling them cycles does not solve the problem anyway :)

I managed to solve this already.

Thanks for your reply.

cheers,

—Saleem

[quote]these are 2 cycles (cycles should not have different entries )[/quote]
Why not? Each cycle is a copy/snapshot in time of the meta-data of the TTree/object, and in most cases the cycles should have different numbers of entries ...

Philippe.
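
To make the cycle idea concrete: each key in a ROOT file carries a name and a cycle number (written as `physics;1`, `physics;2`), and a plain lookup by name returns the highest cycle. The sketch below only mimics that selection on toy data; `highest_cycles` and `toy_keys` are hypothetical names, not part of ROOT.

```python
def highest_cycles(keys):
    """Map each object name to its highest cycle number.

    `keys` is an iterable of (name, cycle) pairs.
    """
    latest = {}
    for name, cycle in keys:
        if name not in latest or cycle > latest[name]:
            latest[name] = cycle
    return latest


# Toy stand-in for a file holding two cycles of the same tree.
toy_keys = [("physics", 1), ("physics", 2)]
print(highest_cycles(toy_keys))
```

In PyROOT, the pairs could be built from an open file `f` as `(key.GetName(), key.GetCycle())` for each `key` in `f.GetListOfKeys()`, and an older snapshot can be requested explicitly with `f.Get("physics;1")`.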

Hi Philippe,

I am not saying that it cannot have 2 cycles.

The annoying part is you get 2 cycles with different entries in the output (final) root file.

I think this is because clonetree is associated with the chain of input root file as well as with the output root file. (my guess…)

In principle, clonetree has to be associated with the output root file.

cheers,

—Saleem

[quote]The annoying part is you get 2 cycles with different entries in the output (final) root file.[/quote]
I am a bit confused as to why this would be annoying or even surprising ... the lower-numbered cycle is a copy of a previous state of the TTree object, and as such it should have a different number of entries. Moreover, most retrieval tools automatically pick the highest cycle unless explicitly requested otherwise. In addition (see the User’s Guide for details), you can disable the extra safety of having those cycles ...

[quote]I think this is because clonetree is associated with the chain of input root file as well as with the output root file. (my guess…)[/quote]
A TTree can only be associated with one file ... (however, switching the association in mid-stream is very likely to cause severe problems).

[quote]In principle, clonetree has to be associated with the output root file.[/quote]
Yes, and this is usually achieved by making sure that the ‘output’ file is the current ROOT file (i.e. gFile == output_file) when you call CloneTree.

Philippe.
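
The ordering described above (make the output file current, then call `CloneTree`) can be wrapped in a small helper. This is a sketch under assumptions: `clone_structure_into` is a hypothetical name, and the two `Recording...` classes are stand-ins that only log the call order so the example runs without ROOT installed.

```python
def clone_structure_into(chain, out_file):
    """Return an empty clone of `chain` created while `out_file` is current."""
    out_file.cd()               # make the output file the current directory
    return chain.CloneTree(0)   # copy the branch structure, no entries


# Stand-ins that only record the call order, so the sketch runs without ROOT.
class RecordingFile(object):
    def __init__(self, calls):
        self.calls = calls

    def cd(self):
        self.calls.append("cd")


class RecordingChain(object):
    def __init__(self, calls):
        self.calls = calls

    def CloneTree(self, n_entries):
        self.calls.append("CloneTree(%d)" % n_entries)
        return "empty clone"


calls = []
clone = clone_structure_into(RecordingChain(calls), RecordingFile(calls))
print(calls)
```

With real PyROOT objects, the same helper would presumably be used as `ch_new = clone_structure_into(ch, TFile("qcd.slim.root", "RECREATE"))`, followed by the fill loop and the final `Write()` and `Close()` on `ch_new.GetCurrentFile()`.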