I am reading a large number of .root files using TChain::Add, and I want to speed up the reading. Is there a way to achieve this (for example with multi-processing, which I don't know how to set up)?
By the way, I also want to speed up the CopyTree step, which I use to extract a subset of the original TTree while keeping the tree structure. Any suggestions on this are also welcome!
@eguiraud Thanks for your suggestions! I will try to work it out.
Actually, what I want to do is nothing special: extract the subset of entries of the original TTree that pass some cuts, and write them to a new .root file with the same tree structure. That's it, with no calculation steps. So I think the only bottleneck is reading and writing the .root files, or the CopyTree step, and that is what I want to speed up.
To be more specific, my core code is as follows:
// Read original tree.
TChain *ch = new TChain(tree_name);
ch->Add(root_chain_path);
// Create output file & Save new tree after cuts.
TFile *out_file = new TFile(ofname, "recreate");
TTree *copy_tree = ch->CopyTree(apply_cuts);
copy_tree->Write();
out_file->Close();
// Free space.
delete out_file;
delete ch;
where apply_cuts is a combination of several TCut objects.
Since I am not an expert on managing threads or processes, could you show me a small demo that I can follow as a start?
We have tutorials for all the features I mentioned; you can grep for the class names in the directory pointed to by root-config --tutdir.
You can also open the tutorials as notebooks from here (RDataFrame) and here (TTreeProcessor{MP,MT} etc.) (unfortunately these latter tutorials are not super well labeled in terms of which of the classes are used inside, hence the suggestion to grep for what you need).