ROOT file filter

Hello, the output of a Geant4 simulation is very big (41 GB for 10^9 primary events), so I’m writing a ROOT macro to copy the tree branches into several ROOT files.

The macro should take the tree in the Sim.root file and store:

  1. Edep branch in SimEdep.root file
  2. Zint branch in SimZint.root file
  3. Egamma branch in SimEgamma.root file
  4. Egammacascade branch in SimEgammacascade.root file
  5. Pgx+Pgy+Pgz branches in SimPgamma.root file

I wrote the macro, but it crashes

root [0] .x treefilter.cpp
TFile**         Sim.root
 TFile*         Sim.root
  KEY: TH1D     ParticleCodes;1
  KEY: TH1D     hGeTotE0;1
  KEY: TH1D     hGeTotEGauss0;1
  KEY: TH1D     hGeTotE1;1
  KEY: TH1D     hGeTotEGauss1;1
  KEY: TH1D     SecGammaKinE;1
  KEY: TH2D     SecGammaKinEVSParKinE;1
  KEY: TH2D     ParPosZVSSecKinE;1
  KEY: TH2D     ParPosZVSParKinE;1
  KEY: TH1D     SrcGGCosTheta;1
  KEY: TH1D     ParKinEne;1
  KEY: TH1D     ParPosZ;1
  KEY: TH2D     KinEFragIon1VSKinEFragIon2;1
  KEY: TH2D     thetaFragIon1VSKinEFragIon1;1
  KEY: TH2D     thetaFragIon2VSKinEFragIon2;1
  KEY: TH2D     KinECapGammaVSKinECapFragIon;1
  KEY: TH2D     thetaCapGammaVSKinECapGamma;1
  KEY: TH2D     thetaCapFragIonVSKinECapFragIon;1
  KEY: TH2D     PrimaryXvsY;1
  KEY: TH1D     PrimaryZ;1
  KEY: TH1D     PrimarykinE;1
  KEY: TH1D     PrimaryDiv;1
  KEY: TTree    Tree1;1 RawData
Start Edep copy

==========================================
=============== STACKTRACE ===============
==========================================


================ Thread 0 ================
  libCling!TClingCallbacks::PrintStackTrace()
  libCling!cling::runtime::internal::EvaluateDynamicExpression()
  0x693154e ??
  libCling!cling::Value::print()
  libCling!cling::runtime::internal::EvaluateDynamicExpression()
  libCling!cling::runtime::internal::EvaluateDynamicExpression()
  libCling!cling::runtime::internal::EvaluateDynamicExpression()
  libCling!cling::runtime::internal::setValueWithAlloc()
  libCling!cling::runtime::internal::setValueWithAlloc()
  libCling!cling::runtime::internal::setValueWithAlloc()
  libCling!cling::runtime::internal::setValueWithAlloc()
  libCling!cling::runtime::internal::setValueWithAlloc()
  libCling!cling::runtime::internal::setValueWithAlloc()
  libCling!TClingLookupHelper__ExistingTypeCheck()
  libCling!TCling::ProcessLine()
  libCling!TCling::ProcessLineSynch()
  libCore!TApplication::ExecuteFile()
  libCore!TApplication::ProcessFile()
  libCore!TApplication::ProcessLine()
  libRint!TRint::ProcessLineNr()
  libRint!TRint::HandleTermInput()
  libCore!TWinNTSystem::DispatchOneEvent()
  libCore!TSystem::InnerLoop()
  libCore!TSystem::Run()
  libCore!TApplication::Run()
  libRint!TRint::Run()
  root!Init_thread_footer()
  root!Init_thread_footer()
  KERNEL32!BaseThreadInitThunk()
  ntdll!RtlGetFullPathName_UEx()
  ntdll!RtlGetFullPathName_UEx()

================ Thread 1 ================
  ntdll!ZwWaitForWorkViaWorkerFactory()
  KERNEL32!BaseThreadInitThunk()
  ntdll!RtlGetFullPathName_UEx()
  ntdll!RtlGetFullPathName_UEx()

================ Thread 2 ================
  ntdll!ZwWaitForWorkViaWorkerFactory()
  KERNEL32!BaseThreadInitThunk()
  ntdll!RtlGetFullPathName_UEx()
  ntdll!RtlGetFullPathName_UEx()

================ Thread 3 ================
  ntdll!ZwWaitForWorkViaWorkerFactory()
  KERNEL32!BaseThreadInitThunk()
  ntdll!RtlGetFullPathName_UEx()
  ntdll!RtlGetFullPathName_UEx()

================ Thread 4 ================
  ntdll!ZwDelayExecution()
  KERNELBASE!SleepEx()
  KERNELBASE!Sleep()
  libCore!TWinNTSystem::TimerThread()
  libCore!TWinNTSystem::ThreadStub()
  KERNEL32!BaseThreadInitThunk()
  ntdll!RtlGetFullPathName_UEx()
  ntdll!RtlGetFullPathName_UEx()

================ Thread 5 ================
  win32u!NtUserGetMessage()
  libCore!TWinNTSystem::GetProcInfo()
  KERNEL32!BaseThreadInitThunk()
  ntdll!RtlGetFullPathName_UEx()
  ntdll!RtlGetFullPathName_UEx()

================ Thread 6 ================
  ntdll!ZwWaitForWorkViaWorkerFactory()
  KERNEL32!BaseThreadInitThunk()
  ntdll!RtlGetFullPathName_UEx()
  ntdll!RtlGetFullPathName_UEx()

================ Thread 7 ================
  ntdll!ZwWaitForWorkViaWorkerFactory()
  KERNEL32!BaseThreadInitThunk()
  ntdll!RtlGetFullPathName_UEx()
  ntdll!RtlGetFullPathName_UEx()

==========================================
============= END STACKTRACE =============
==========================================

treefilter.cpp (1.9 KB)

      TTree *ntuple = (TTree*) fin->Get("ntuple");
      if (!ntuple) {
         printf("Error: ntuple not found!\n");
         return;
      }
      cout << "Start Edep copy" << endl;

Hi @bellenot… indeed, if I add the check, it says that it can’t find the ntuple,
but as you can see, the tree does contain the branches:

root [1] Tree1->Print();
******************************************************************************
*Tree    :Tree1     : RawData                                                *
*Entries :   100000 : Total =        15111127 bytes  File  Size =    4534509 *
*        :          : Tree compression factor =   3.33                       *
******************************************************************************
*Br    0 :Edep      : vector<double>                                         *
*Entries :   100000 : Total  Size=    3011005 bytes  File Size  =     285545 *
*Baskets :      107 : Basket Size=      32000 bytes  Compression=  10.54     *
*............................................................................*
*Br    1 :Zint      : vector<double>                                         *
*Entries :   100000 : Total  Size=    1781477 bytes  File Size  =     531407 *
*Baskets :       69 : Basket Size=      32000 bytes  Compression=   3.35     *
*............................................................................*
*Br    2 :Ep        : vector<double>                                         *
*Entries :   100000 : Total  Size=    1595631 bytes  File Size  =     463711 *
*Baskets :       63 : Basket Size=      32000 bytes  Compression=   3.44     *
*............................................................................*
*Br    3 :Egammacascade : vector<double>                                     *
*Entries :   100000 : Total  Size=    1782134 bytes  File Size  =     658927 *
*Baskets :       69 : Basket Size=      32000 bytes  Compression=   2.70     *
*............................................................................*
*Br    4 :Egamma    : vector<double>                                         *
*Entries :   100000 : Total  Size=    1595899 bytes  File Size  =     475237 *
*Baskets :       63 : Basket Size=      32000 bytes  Compression=   3.35     *
*............................................................................*
*Br    5 :Pgx       : vector<double>                                         *
*Entries :   100000 : Total  Size=    1781404 bytes  File Size  =     704555 *
*Baskets :       69 : Basket Size=      32000 bytes  Compression=   2.53     *
*............................................................................*
*Br    6 :Pgy       : vector<double>                                         *
*Entries :   100000 : Total  Size=    1781404 bytes  File Size  =     704427 *
*Baskets :       69 : Basket Size=      32000 bytes  Compression=   2.53     *
*............................................................................*
*Br    7 :Pgz       : vector<double>                                         *
*Entries :   100000 : Total  Size=    1781404 bytes  File Size  =     704452 *
*Baskets :       69 : Basket Size=      32000 bytes  Compression=   2.53     *
*............................................................................*
root [2]
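A note on why the lookup probably fails: Get("ntuple") searches the file for a key with exactly that name, while the ls() output in the first post lists the only TTree key as Tree1. The sketch below is plain C++ with no ROOT dependency and only illustrates how to read that listing. The helper names keyNames and keyNameDemo are hypothetical, and the listing string is a shortened copy of the output above. In a real macro one would more likely loop over fin->GetListOfKeys() instead of parsing text.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect the names of all keys of class `className`
// from text laid out like the ls() output in this thread,
// e.g. "  KEY: TTree    Tree1;1 RawData".
std::vector<std::string> keyNames(const std::string &listing, const std::string &className)
{
   std::vector<std::string> names;
   std::istringstream lines(listing);
   std::string line;
   while (std::getline(lines, line)) {
      std::istringstream fields(line);
      std::string tag, cls, nameWithCycle;
      // Skip lines with fewer than three whitespace-separated fields.
      if (!(fields >> tag >> cls >> nameWithCycle)) continue;
      if (tag != "KEY:" || cls != className) continue;
      // Drop the cycle suffix: "Tree1;1" becomes "Tree1".
      names.push_back(nameWithCycle.substr(0, nameWithCycle.find(';')));
   }
   return names;
}

// Usage example on a shortened copy of the listing from the first post.
void keyNameDemo()
{
   const std::string listing =
      "TFile**         Sim.root\n"
      " TFile*         Sim.root\n"
      "  KEY: TH1D     PrimaryDiv;1\n"
      "  KEY: TTree    Tree1;1 RawData\n";
   const std::vector<std::string> trees = keyNames(listing, "TTree");
   // The only tree in the listing is called "Tree1", not "ntuple".
   assert(trees.size() == 1);
   assert(trees[0] == "Tree1");
}
```

So the name passed to Get() or GetObject() has to be Tree1 for this file.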

Right, your code is obviously not doing what you expect it to do. I (again) strongly suggest that you look at (and try) the tree tutorials; you will most probably find exactly what you’re looking for…
Anyway, here is an example of code doing what you want:

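#include <cstdio>
#include <iostream>
#include <string>

#include "TFile.h"
#include "TTree.h"

void treefilter()
{
   // Open the input file and list its contents.
   TFile *fin = TFile::Open("Sim.root");
   if (!fin || fin->IsZombie()) {
      printf("Error: cannot open the file!\n");
      return;
   }
   fin->ls();
   // The tree is stored under the key "Tree1", as the listing shows.
   TTree *tree = nullptr;
   fin->GetObject("Tree1", tree);
   if (!tree) {
      printf("Error: failed to get the tree from the file!\n");
      return;
   }
   // Write one output file per branch.
   for (auto branchName : {"Edep", "Zint", "Egamma", "Egammacascade", "Pgx", "Pgy", "Pgz"}) {
      std::cout << "Start " << branchName << " copy" << std::endl;
      // Disable every branch, then enable only the one to copy.
      tree->SetBranchStatus("*", 0);
      tree->SetBranchStatus(branchName, 1);
      // Build the output name, e.g. "SimEdep.root".
      std::string filename("Sim");
      filename += branchName;
      filename += ".root";
      auto fout = TFile::Open(filename.c_str(), "recreate");
      if (!fout) {
         printf("Error: cannot create the output file!\n");
         return;
      }
      // CloneTree() copies only the active branches into the current file (fout).
      tree->CloneTree();
      fout->Write();
      delete fout; // closes the file and deletes the cloned tree it owns
      std::cout << branchName << " tree copied to " << filename << std::endl;
   }
   delete fin;
}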
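
Note that the loop above writes one file per branch, so Pgx, Pgy and Pgz end up in three separate files rather than together in SimPgamma.root as the first post asked. One way to handle that, sketched here in plain C++ without any ROOT dependency, is to loop over a table of output files, each with its own list of branches. OutputSpec, outputPlan and printOutputPlan are hypothetical names. The comments mark where the SetBranchStatus and CloneTree calls from the macro would presumably go.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <vector>

// One output file and the branches that should end up in it.
struct OutputSpec {
   std::string fileName;
   std::vector<std::string> branches;
};

// Hypothetical helper: the five outputs listed in the first post.
std::vector<OutputSpec> outputPlan()
{
   return {
      {"SimEdep.root", {"Edep"}},
      {"SimZint.root", {"Zint"}},
      {"SimEgamma.root", {"Egamma"}},
      {"SimEgammacascade.root", {"Egammacascade"}},
      {"SimPgamma.root", {"Pgx", "Pgy", "Pgz"}},
   };
}

// Print the plan, one line per output file.
void printOutputPlan()
{
   for (const auto &spec : outputPlan()) {
      assert(!spec.branches.empty()); // every output file needs at least one branch
      // In the macro: tree->SetBranchStatus("*", 0);
      std::cout << spec.fileName << ":";
      for (const auto &branch : spec.branches) {
         // In the macro: tree->SetBranchStatus(branch.c_str(), 1);
         std::cout << ' ' << branch;
      }
      std::cout << '\n';
      // In the macro: open spec.fileName with "recreate", call tree->CloneTree(),
      // then write and delete the output file.
   }
}
```

Enabling all three momentum branches before a single CloneTree() call should put them in the same output tree, since CloneTree() copies whichever branches are active at that point.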

And I’m sure you can achieve a similar goal with RDataFrame, but for that I’ll let @eguiraud comment ;)

Thank you @bellenot,
To be honest, I based the macro on this earlier post: RootTalk: Re: [ROOT] Filtering events from a Tree

OK, fine, but that post is from 2001, and what the user was asking there is not related to what you are trying to achieve… So I still think that reading the documentation and looking at the tutorials is the easiest and fastest way to solve your problems.
