Difficulties with cloning only some branches of a TTree

Dear all,

I am trying to reduce the size of my TTree by cloning only some branches.
I am using a variation of copytree2.C to try to make it happen.

The trees have the following structure (showing only the “MissingET” branch, but it is a Delphes tree so has a lot more objects):

*............................................................................*
*Br  268 :MissingET : Int_t MissingET_                                       *
*Entries :    50000 : Total  Size=     427568 bytes  File Size  =      80349 *
*Baskets :      110 : Basket Size=      64000 bytes  Compression=   5.10     *
*............................................................................*
*Br  269 :MissingET.fUniqueID : UInt_t fUniqueID[MissingET_]                 *
*Entries :    50000 : Total  Size=     413308 bytes  File Size  =      81119 *
*Baskets :      110 : Basket Size=       6656 bytes  Compression=   5.06     *
*............................................................................*
*Br  270 :MissingET.fBits : UInt_t fBits[MissingET_]                         *
*Entries :    50000 : Total  Size=     412852 bytes  File Size  =      81230 *
*Baskets :      110 : Basket Size=       6656 bytes  Compression=   5.05     *
*............................................................................*
*Br  271 :MissingET.MET : Float_t MET[MissingET_]                            *
*Entries :    50000 : Total  Size=     412624 bytes  File Size  =     276628 *
*Baskets :      110 : Basket Size=       6656 bytes  Compression=   1.48     *
*............................................................................*
*Br  272 :MissingET.Eta : Float_t Eta[MissingET_]                            *
*Entries :    50000 : Total  Size=     412624 bytes  File Size  =     284151 *
*Baskets :      110 : Basket Size=       6656 bytes  Compression=   1.44     *
*............................................................................*
*Br  273 :MissingET.Phi : Float_t Phi[MissingET_]                            *
*Entries :    50000 : Total  Size=     412624 bytes  File Size  =     284844 *
*Baskets :      110 : Basket Size=       6656 bytes  Compression=   1.44     *
*............................................................................*
*Br  274 :MissingET_size : MissingET_size/I                                  *
*Entries :    50000 : Total  Size=     211770 bytes  File Size  =      13748 *
*Baskets :      110 : Basket Size=       4137 bytes  Compression=  15.22     *
*............................................................................*

Running this causes the following crash:

 *** Break *** segmentation violation
 Generating stack trace...
 0x000000011d5a3163 in TBranchElement::GetEntry(long long, int) (in libTree.so) + 163
 0x000000011d5e1faa in TTree::GetEntry(long long, int) (in libTree.so) + 186
 0x000000011d5df0ed in TTree::CopyEntries(TTree*, long long, char const*) (in libTree.so) + 1805
 0x0000000119b996af in <unknown function>
 0x0000000119b9904d in <unknown function>
 0x0000000107f9aadf in cling::IncrementalExecutor::executeWrapper(llvm::StringRef, cling::Value*) (in libCling.so) + 383
 0x0000000107f98f3a in cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) (in libCling.so) + 154
 0x0000000107f983e9 in cling::Interpreter::EvaluateInternal(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, cling::CompilationOptions, cling::Value*, cling::Transaction**) (in l
 0x0000000107f97da2 in cling::Interpreter::process(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, cling::Value*, cling::Transaction**) (in libCling.so) + 98
 0x0000000107fd32ab in cling::MetaProcessor::process(char const*, cling::Interpreter::CompilationResult&, cling::Value*) (in libCling.so) + 427
 0x0000000107ea8275 in TCling::ProcessLine(char const*, TInterpreter::EErrorCode*) (in libCling.so) + 981
 0x0000000107c171ed in TRint::ProcessLineNr(char const*, char const*, int*) (in libRint.so) + 205
 0x0000000107c17d29 in TRint::HandleTermInput() (in libRint.so) + 649
 0x00000001079cd387 in TUnixSystem::CheckDescriptors() (in libCore.so) + 327
 0x00000001079d620b in TMacOSXSystem::DispatchOneEvent(bool) (in libCore.so) + 395
 0x000000010794b75a in TSystem::InnerLoop() (in libCore.so) + 26
 0x000000010794b5ae in TSystem::Run() (in libCore.so) + 206
 0x00000001078e7d34 in TApplication::Run(bool) (in libCore.so) + 36
 0x0000000107c170af in TRint::Run(bool) (in libRint.so) + 1375
 0x000000010786eebf in main (in root.exe) + 79
 0x00007fff9e0025ad in start (in libdyld.dylib) + 1
Root > .q

So, at this point I am at a loss. I understand that this has to do with the subtleties of the split level of the branches, TClonesArray, and so on; however, up until now I have not managed to navigate them. Can somebody give me some hints?

Cheers,
Thiago
copytree2.C (4.03 KB)

Your original TChain / TTree contains objects stored in “split mode”.
I think the best approach for you would be to try an “analysis skeleton”. See, for example, the links in: How are multiple TTree->Draw()s done?

Dear Pepe,

I have used MakeClass() to generate the skeleton code that I have attached, namely to declare the variables and TBranches and to call TChain::SetBranchAddress.

Of course I had forgotten to do the all-magic fChain->SetMakeClass(1);
Still, adding it didn’t solve my problem.

Now, of course I am able to read the tree. What I want is really to copy the tree.
I have tried many combinations of:

fChain->CloneTree();

fChain->CloneTree(0);
newtree->CopyEntries(fChain);

fChain->CloneTree(0);
fChain->CopyAddresses(newtree);
newtree->CopyEntries(fChain);

and all of these give me a tree that is filled entirely with zeros.

This “split mode” is really frustrating. Everybody that I know has problems dealing with this.
Anyway, any tips?

Cheers,
Thiago

See a note about “cloning” in: How are multiple TTree->Draw()s done?
I think Philippe would need to comment on it (maybe something changed in the meantime).

Dear Pepe,

Thanks for your kind help!

I quote the post from Philippe here:

Unfortunately, I am not familiar with generating code via “MakeProxy” - I have tried the obvious thing:
Delphes->MakeProxy("metDelphes2")

but it complains that:

Error in TTreePlayer::MakeProxy: A file name for the user script is required
(Int_t) 0

and I don’t follow. I have tried to look at the tutorials, but the only one that has it is h1analysisProxy.C, which seems to use MakeProxy to generate the script itself.

Any further tips?

Cheers,
Thiago

Hi,

So, after some work together with a colleague, I found that the following script solves the problem.

void mySkim(TString fileName, TString suffix="_skimmed") {

  TChain* chain=new TChain("Delphes");
  chain->Add(fileName.Data());
  // Deactivate all branches
  chain->SetBranchStatus("*",0);
  // Activate 4 branches only: our skim
  chain->SetBranchStatus("Jet*",1);
  chain->SetBranchStatus("Jet_size*",1);
  chain->SetBranchStatus("MissingET*",1);
  chain->SetBranchStatus("MissingET_size*",1);
  //Create a new file + a clone of old tree header. Do not copy events

  TString newFileName = fileName.ReplaceAll(".root","")+suffix+TString(".root");
  TFile * newfile = TFile::Open(newFileName.Data(),"recreate");
  TTree * newtree = chain->CloneTree(0);

  // Here we copy the branches
  newtree->CopyEntries(chain);

  // Flush to disk
  newfile->Write();
  newfile->Close();
  // Release the file and chain objects
  delete newfile;
  delete chain;
}
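
For readers wondering which branches the wildcard patterns in the script above would switch on, here is a minimal, ROOT-free sketch in plain C++. It assumes that `*` matches any run of characters and that every other character matches literally; the matching done inside SetBranchStatus may differ in details. The names `globMatch` and `selectBranches` are hypothetical helpers for illustration, not ROOT functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified glob (an assumption, not ROOT's implementation):
// '*' matches any run of characters, including none;
// every other character must match literally.
bool globMatch(const std::string& pattern, const std::string& name) {
    std::size_t p = 0, n = 0;
    std::size_t star = std::string::npos;  // position of the last '*' seen
    std::size_t mark = 0;                  // name position when it was seen
    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;
            mark = n;
        } else if (p < pattern.size() && pattern[p] == name[n]) {
            ++p;
            ++n;
        } else if (star != std::string::npos) {
            // Backtrack: let the last '*' absorb one more character.
            p = star + 1;
            n = ++mark;
        } else {
            return false;
        }
    }
    // Only trailing '*' may remain in the pattern.
    while (p < pattern.size() && pattern[p] == '*') ++p;
    return p == pattern.size();
}

// Return the branch names that at least one pattern would activate.
std::vector<std::string> selectBranches(const std::vector<std::string>& patterns,
                                        const std::vector<std::string>& names) {
    std::vector<std::string> active;
    for (const std::string& name : names) {
        for (const std::string& pattern : patterns) {
            if (globMatch(pattern, name)) {
                active.push_back(name);
                break;  // one matching pattern is enough
            }
        }
    }
    return active;
}
```

Under this model, "MissingET*" already selects MissingET_size and the MissingET.* sub-branches, so the separate "MissingET_size*" line would be redundant but harmless.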

Cheers,
Thiago

Hello,

I ran the example below:

R__LOAD_LIBRARY(/opt/root6/test/libEvent.so)
      
void copytree3() {
   gSystem->Load("/opt/root6/test/libEvent.so");
   //Get old file, old tree and set top branch address
   TFile *oldfile;
   TString dir = "/home/andre/FPMC/fpmc-master/file1.root";
   gSystem->ExpandPathName(dir);
   if (!gSystem->AccessPathName(dir))
       {oldfile = new TFile("/home/andre/FPMC/fpmc-master/file1.root");}
   else {oldfile = new TFile("./file1.root");}
   TTree *oldtree = (TTree*)oldfile->Get("h777");

   
   oldtree->SetBranchStatus("*",0);
   oldtree->SetBranchStatus("px",1);
   oldtree->SetBranchStatus("py",1);
   oldtree->SetBranchStatus("pz",1);
   oldtree->SetBranchStatus("e",1);
   
   
   //Create a new file + a clone of old tree in new file
   TFile *newfile = new TFile("small1.root","recreate");
   TTree *newtree = oldtree->CloneTree();
   newtree->Print();
   newfile->Write();
   delete oldfile;
   delete newfile;
}

I would like the output file to contain only px, py, pz and e, but I can’t remove
the other columns.

root [0] 
Processing copytree3.C...
******************************************************************************
*Tree    :h777      : ntuple                                                 *
*Entries :      999 : Total =         1857800 bytes  File  Size =    1661739 *
*        :          : Tree compression factor =   1.06                       *
******************************************************************************
*Br    0 :ngen      : ngen/I                                                 *
*Entries :      999 : Total  Size=       4624 bytes  One basket in memory    *
*Baskets :        0 : Basket Size=      64000 bytes  Compression=   1.00     *
*............................................................................*
*Br    1 :px        : px[ngen]/F                                             *
*Entries :      999 : Total  Size=     463302 bytes  File Size  =     415147 *
*Baskets :        7 : Basket Size=      64000 bytes  Compression=   1.06     *
*............................................................................*
*Br    2 :py        : py[ngen]/F                                             *
*Entries :      999 : Total  Size=     463302 bytes  File Size  =     415309 *
*Baskets :        7 : Basket Size=      64000 bytes  Compression=   1.06     *
*............................................................................*
*Br    3 :pz        : pz[ngen]/F                                             *
*Entries :      999 : Total  Size=     463302 bytes  File Size  =     421030 *
*Baskets :        7 : Basket Size=      64000 bytes  Compression=   1.05     *
*............................................................................*
*Br    4 :e         : e[ngen]/F                                              *
*Entries :      999 : Total  Size=     463289 bytes  File Size  =     410253 *
*Baskets :        7 : Basket Size=      64000 bytes  Compression=   1.08     *
*............................................................................*

The output file small1.root, dumped as text:

***********************************************************************************
*    Row   * Instance *      ngen *        px *        py *        pz *         e *
***********************************************************************************
*        0 *        0 *       118 * -0.399503 * 0.0436147 * 6446.8862 * 6446.8862 *
*        0 *        1 *       118 * -0.032813 * 0.3375689 * 0.1784433 * 0.4063141 *
*        0 *        2 *       118 * -0.218344 * 0.4673282 * 1.0106648 * 1.1426867 *

How can I remove the columns Row, Instance and ngen?
Can someone help me?

Thanks,

The “Row” (i.e. the tree entry number) and “Instance” (running from 0 to ngen - 1 within each tree entry) columns are generated automatically by the “Scan” method; they are not stored in the tree.
Note that you MUST have oldtree->SetBranchStatus("ngen", 1); because “px”, “py”, “pz” and “e” DEPEND on it.
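
To illustrate both points, here is a ROOT-free sketch in plain C++. It models a variable-length branch such as px[ngen]/F as all elements stored back to back, plus a per-entry counter playing the role of ngen. This layout is a simplification assumed for illustration, not ROOT's actual storage format, and `ScanRow` and `scanModel` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One flattened output line, in the spirit of a Scan printout.
struct ScanRow {
    std::size_t row;       // entry number: produced by the loop, not stored
    std::size_t instance;  // index inside the entry's array: also produced by the loop
    int ngen;              // the stored counter for that entry
    float px;              // one array element
};

// `counts` stands in for the ngen branch (one length per entry);
// `values` holds the px elements of all entries, back to back.
std::vector<ScanRow> scanModel(const std::vector<int>& counts,
                               const std::vector<float>& values) {
    std::vector<ScanRow> out;
    std::size_t offset = 0;  // where the current entry starts in `values`
    for (std::size_t row = 0; row < counts.size(); ++row) {
        assert(counts[row] >= 0);
        const std::size_t n = static_cast<std::size_t>(counts[row]);
        // Without the counter there is no way to tell where an entry ends.
        assert(offset + n <= values.size());
        for (std::size_t inst = 0; inst < n; ++inst) {
            out.push_back({row, inst, counts[row], values[offset + inst]});
        }
        offset += n;
    }
    return out;
}
```

For example, scanModel({2, 1}, {1.0f, 2.0f, 3.0f}) gives three lines: (Row 0, Instance 0), (Row 0, Instance 1) and (Row 1, Instance 0). The Row and Instance values come purely from the loop indices, while the split of the three floats into entries is only possible because the counter is available.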

Hi, Wile

Thanks for the answer,

Now I understand what is happening: “ngen” is the size of the
vector in the TTree.

Cheers,
Andre

Hi,
if you have access to ROOT v6.10, you can easily clone a TTree with TDataFrame:

#include "ROOT/TDataFrame.hxx"
ROOT::Experimental::TDataFrame d("h777", "/home/andre/FPMC/fpmc-master/file1.root");
d.Snapshot("newtree", "small1.root", {"px", "py", "pz", "e"}); // saves the listed branches to file

You can read more about TDataFrame here.

Cheers,
Enrico
