I have been trying to use CopyTree with cuts to get rid of a lot of events that I do not need, but for some reason the same code works in some circumstances and not in others. It is very weird.
This function works in combination with another script whose loop goes over all of the TFiles in small steps, so that the memory is not flooded.
It’s a bit hard to say because we don’t see the other part of the script.
One thing I noticed is that you might run into trouble because, as written, the output tree can only be kept in memory. You could try the following order:
1. Load inputs
2. Open output file
3. Create copy of tree, writing it directly into the output file.
If you change the order of steps 2 and 3 as you did, the tree has to stay in memory because it’s not clear where to write it to. This probably causes the memory flooding, and makes the other script more complicated than it has to be.
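The order described above could look roughly like this in PyROOT. This is only a sketch: the function name and its parameters (`input_paths`, `tree_name`, `cuts`, `output_path`) are placeholders, not taken from your script, and ROOT is imported inside the function so the definition itself loads without it.

```python
def strip_and_save(input_paths, tree_name, cuts, output_path):
    """Copy the entries passing `cuts` into a new file (hypothetical helper)."""
    import ROOT  # assumes a working PyROOT installation

    # Step 1: load the inputs.
    chain = ROOT.TChain(tree_name)
    for path in input_paths:
        chain.Add(path)

    # Step 2: open the output file BEFORE copying, so it becomes the
    # current directory and the new tree is attached to it.
    out_file = ROOT.TFile.Open(output_path, "RECREATE")
    if not out_file or out_file.IsZombie():
        raise OSError("could not open " + output_path)

    # Step 3: create the copy; it is created in the open output file.
    subtree = chain.CopyTree(cuts)
    n_kept = subtree.GetEntries()

    out_file.Write()
    out_file.Close()
    return n_kept
```

With this order the copied tree belongs to the output file from the start, which is the point of opening the file first.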
Further, to split the trees (maybe you don’t even need to because they should use less memory), you can use the full signature of CopyTree: https://root.cern.ch/doc/master/classTTree.html#a2b57854ba133da5fc523931aa0427e21
As you can see, it allows you to select how many entries you want as well as the starting point of the selection.
Thank you very much for your interest in my problem and for dedicating your time to helping me. I greatly appreciate it.
Unfortunately, I am not sure I fully understand your suggestions yet since I do not have too much experience with CopyTree or with ROOT in general.
If I understand correctly, your first suggestion is simply to create (or open) the file that the subtree should be written to before the subtree is created at all, right?
That would make sense and I am going to try that right away.
As far as the other options of the CopyTree function are concerned, I am not entirely sure I understand how to exploit them fully.
Would the nentries option correspond to how many of the jobs contained in the TChain I want to include in my subtree, and would the firstentry argument correspond to the first subjob? Is this correct?
It’s not really the number of jobs, it’s the number of events in the tree. Using the arguments I pointed out to you, you can say:
“I want one million events, starting from event 0”
“I want one million events, starting from event one million”
“…”
to split into multiple output files.
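As a rough sketch of that splitting, the helper below computes the (firstentry, nentries) pairs, and a second function passes them to CopyTree in the argument order shown in the documentation (selection, option, nentries, firstentry). The names `entry_ranges`, `copy_in_chunks`, `chunk_size` and `output_pattern` are made up for illustration.

```python
def entry_ranges(total_entries, chunk_size):
    """Yield (firstentry, nentries) pairs that cover total_entries."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for first in range(0, total_entries, chunk_size):
        # The last chunk may be shorter than chunk_size.
        yield first, min(chunk_size, total_entries - first)


def copy_in_chunks(chain, cuts, chunk_size, output_pattern):
    """Write one output file per chunk of events (hypothetical helper)."""
    import ROOT  # assumes a working PyROOT installation

    ranges = entry_ranges(chain.GetEntries(), chunk_size)
    for index, (first, count) in enumerate(ranges):
        # Open the output file first so the copy is attached to it.
        out_file = ROOT.TFile.Open(output_pattern.format(index), "RECREATE")
        chain.CopyTree(cuts, "", count, first)
        out_file.Write()
        out_file.Close()
```

For example, `copy_in_chunks(chain, cuts, 1000000, "skim_{}.root")` would write one file per million input events, with the cuts applied within each block.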
If you don’t want to split based on event counts, you can alternatively split based on the input files. In this case, don’t create one big TChain with all files; create one TChain for every folder (job number) instead.
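A possible way to set this up, assuming each job's files sit in their own folder (the layout and the names `group_by_folder` and `chain_for_folder` are assumptions, not something from your script):

```python
import os
from collections import defaultdict


def group_by_folder(paths):
    """Group input file paths by the folder that contains them."""
    groups = defaultdict(list)
    for path in paths:
        groups[os.path.dirname(path)].append(path)
    return dict(groups)


def chain_for_folder(tree_name, paths):
    """Build one TChain from the files of a single folder."""
    import ROOT  # assumes a working PyROOT installation

    chain = ROOT.TChain(tree_name)
    for path in paths:
        chain.Add(path)
    return chain
```

Each group can then be turned into its own chain and copied into its own output file, so no single chain ever spans all jobs.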
I have tried your suggestion of creating the output file before creating the copy, but I get this error message, which I also got in previous instances:
line 17, in strip_n_save
subtree = alldata.CopyTree( cuts )
SystemError: TTree* TTree::CopyTree(const char* selection, const char* option = "", Long64_t nentries = kMaxEntries, Long64_t firstentry = 0) =>
problem in C++; program state has been reset
I have no idea why this happens.
Do you happen to know what could be a source of this particular error?