Problem with .CopyTree("cuts") PyRoot

Hello there!

I have been trying to use CopyTree with cuts to get rid of a lot of events that I do not need, but for some reason the same code works in some circumstances and not in others. It is very weird.

Here is my code:

from ROOT import TChain, TFile

def strip_n_save(min, max, cuts, directory):
    filename = "blabla.root"

    # Chain the input file of every job in [min, max)
    alldata = TChain("blabla/DecayTree")
    for job in range(min, max):
        alldata.Add("{0}/{1}/output/{2}".format(directory, job, filename))

    # Copy the entries passing the cuts, then write them to a new file
    subtree = alldata.CopyTree(cuts)
    wfile = TFile.Open(directory + "_cluster_{0}-{1}.root".format(min, max), "RECREATE")
    wfile.cd()
    subtree.Write()
    print("created cluster_{0}-{1}.root".format(min, max))
    wfile.Close()

This function is called from another script, which loops over all the input files in small steps so that the memory is not flooded.

Do you see anything wrong with this code?

Thank you very much in advance,

Simon

Hi Simon,

It’s a bit hard to say because we don’t see the other part of the script.
One thing I noticed is that you might get into trouble because, with your code as written, the output tree can only be kept in memory. You could try the following order:

  1. Load inputs
  2. Open output file
  3. Create copy of tree, writing it directly into the output file.

If you change the order of steps 2 and 3 as you did, the tree has to stay in memory because it’s not clear where to write it to. This probably causes the memory flooding, and makes the other script more complicated than it has to be.
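A minimal sketch of that order, applied to the function from the first post and assuming PyROOT is available. The names `strip_n_save_to_file` and `cluster_name` are made up for illustration; the tree name and paths are the ones from the original post.

```python
def cluster_name(directory, first_job, last_job):
    """Output file name, built the same way as in the original post."""
    return directory + "_cluster_{0}-{1}.root".format(first_job, last_job)


def strip_n_save_to_file(first_job, last_job, cuts, directory, filename="blabla.root"):
    # Imported here so that cluster_name() stays usable without ROOT installed.
    from ROOT import TChain, TFile

    # 1. Load inputs
    alldata = TChain("blabla/DecayTree")
    for job in range(first_job, last_job):
        alldata.Add("{0}/{1}/output/{2}".format(directory, job, filename))

    # 2. Open the output file; it becomes the current directory
    wfile = TFile.Open(cluster_name(directory, first_job, last_job), "RECREATE")

    # 3. Create the copy; the new tree is attached to the open output file
    subtree = alldata.CopyTree(cuts)
    subtree.Write()
    wfile.Close()
```

The only change from the original function is that `TFile.Open` now comes before `CopyTree`, so the copied tree has a file to go to while it is being filled.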

Further, to split the trees (maybe you don’t even need to because they should use less memory), you can use the full signature of CopyTree:
https://root.cern.ch/doc/master/classTTree.html#a2b57854ba133da5fc523931aa0427e21
As you can see, it allows you to select how many entries you want as well as the starting point of the selection.

Hi Stephan,

Thank you very much for your interest in my problem and for dedicating your time to helping me. I greatly appreciate it.

Unfortunately, I am not sure I fully understand your suggestions yet since I do not have too much experience with CopyTree or with ROOT in general.

If I understand correctly, your first suggestion is simply to create (or open) the file that the subtree should be written to before the subtree is created at all, right?
That would make sense and I am going to try that right away.

As far as the other options of the CopyTree function are concerned, I am not entirely sure I understand how to exploit them fully.

Would the nentries option just correspond to how many of the jobs contained in the TChain I want to include in my subtree, and would the firstentry argument correspond to the first subjob? Is this correct?

Thank you very much again for your help.

Simon

It’s not really the number of jobs, it’s the number of events in the tree. Using the arguments I pointed out to you, you can say:
“I want one million events, starting from event 0”
“I want one million events, starting from event one million”
“…”
to split into multiple output files.
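A rough sketch of this event-count split, assuming PyROOT and the CopyTree signature linked above (selection, option, nentries, firstentry). The helper names `entry_ranges` and `copy_in_chunks`, the `out_prefix` argument, and the output file naming are made up for illustration. As far as I understand, nentries counts the input entries that are scanned, not the entries that pass the cuts.

```python
def entry_ranges(total_entries, chunk_size):
    """Yield (firstentry, nentries) pairs that together cover total_entries."""
    first = 0
    while first < total_entries:
        yield first, min(chunk_size, total_entries - first)
        first += chunk_size


def copy_in_chunks(chain, cuts, out_prefix, chunk_size=1000000):
    # Imported here so that entry_ranges() stays usable without ROOT installed.
    from ROOT import TFile

    ranges = entry_ranges(chain.GetEntries(), chunk_size)
    for index, (first, count) in enumerate(ranges):
        # Hypothetical naming scheme: one output file per chunk
        wfile = TFile.Open("{0}_{1}.root".format(out_prefix, index), "RECREATE")
        # Scan `count` entries starting at entry `first`, keeping those passing the cuts
        subtree = chain.CopyTree(cuts, "", count, first)
        subtree.Write()
        wfile.Close()
```

For example, `list(entry_ranges(2500000, 1000000))` gives three ranges: two of one million entries and a final one of 500000.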

If you don’t want to split based on event counts, you can alternatively split based on the input files. In this case don’t create a big TChain with all files, but just create one TChain for every folder (job number).
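A sketch of the per-folder alternative, again assuming PyROOT and the directory layout from the first post. The names `job_input` and `strip_per_job` and the output file name are hypothetical.

```python
def job_input(directory, job, filename="blabla.root"):
    """Path of one job's input file, following the layout in the original post."""
    return "{0}/{1}/output/{2}".format(directory, job, filename)


def strip_per_job(first_job, last_job, cuts, directory):
    # Imported here so that job_input() stays usable without ROOT installed.
    from ROOT import TChain, TFile

    for job in range(first_job, last_job):
        # A fresh, small chain for each job folder
        chain = TChain("blabla/DecayTree")
        chain.Add(job_input(directory, job))

        # Hypothetical output name: one file per job
        wfile = TFile.Open("{0}_job_{1}.root".format(directory, job), "RECREATE")
        subtree = chain.CopyTree(cuts)
        subtree.Write()
        wfile.Close()
```

This sketch does not handle a job folder whose input file is missing or empty; a real script would probably want to check for that before writing.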

So your suggestion would be to create one TChain per folder, apply the cuts to that one TChain, and save the new subtree, correct?

Or alternatively, chain everything into one TChain and process only a certain number of events at a time using the input parameters of CopyTree().

That seems reasonable. Thank you very much.

I have tried your suggestion of creating the output file before creating the copy, but I get this error message, which I also got in previous instances:

line 17, in strip_n_save
    subtree = alldata.CopyTree( cuts )
SystemError: TTree* TTree::CopyTree(const char* selection, const char* option = "", Long64_t nentries = kMaxEntries, Long64_t firstentry = 0) =>
    problem in C++; program state has been reset

I have no idea why this happens.

Do you happen to know what could be a source of this particular error?

Usually, the error that occurred in C++ is printed just before or after the message you provided. Do you have a bit more of the output?

Hi Stephan,

After working on the issue for some time, I have managed to get around it.

The problem had to do with the TChain still being too big; now everything works fine.

Thank you very much for all your help!

Simon