Automated way of reading and processing data from ROOT files with C++

Hello!

I want to read out the raw data from a root file (actually a lot of them) and manipulate the data before I do any plots, histograms, etc…
Probably some basic stuff, but somehow for the last few days ROOT always came up with a way of not giving me what I want^^.

Some background (maybe that’s useful):
For my master’s thesis, I wrote a Geant4 application that stores its output as NTuples in ROOT files. The code in the application is fairly simple and looks like this:

//#########Snippet#########
//ntuple for GlassPlate data (ID = 1)
//#########################
   analysisManager ->
      CreateNtuple("GlassPlate_Run" + to_string (runID),"GlassPlateData");
   analysisManager -> CreateNtupleSColumn("Particle");
   analysisManager -> CreateNtupleIColumn("TrackID");
   analysisManager -> CreateNtupleDColumn("TimeStamp");
   analysisManager -> CreateNtupleIColumn("GlassPlateID");
   analysisManager -> CreateNtupleDColumn("PositionX");
   analysisManager -> CreateNtupleDColumn("PositionY");
   analysisManager -> CreateNtupleDColumn("PositionZ");
   analysisManager -> CreateNtupleDColumn("MomentumDirX");
   analysisManager -> CreateNtupleDColumn("MomentumDirY");
   analysisManager -> CreateNtupleDColumn("MomentumDirZ");
   analysisManager -> CreateNtupleDColumn("Energy");
   analysisManager -> FinishNtuple();
   
//output of time sorted glass plate hits
   
   n_hit = GPHC -> entries();   
   vector<G4double> USGPHC;
   for(G4int i = 0; i < n_hit; i++)
   {
      USGPHC.push_back( ((* GPHC)[i])->GetTime() );
   }
   
   vector<G4double> TSGPHC = USGPHC;
   sort(TSGPHC.begin(),TSGPHC.end());
   
   G4SDManager * sdManager = G4SDManager :: GetSDMpointer();      
   MyGlassPlateSD * GlassPlateSD = (MyGlassPlateSD *)
      sdManager -> FindSensitiveDetector("mySim/Detectors/GlassPlate");      
   vector<G4int> gotLoA = GlassPlateSD -> GetGotLostOrAbsorbed();
   
   for(G4int i = 0; i < n_hit; i++)
   {      
      G4double timestamp = TSGPHC[i];      
      vector<G4double>::iterator it = find(USGPHC.begin(),
                                           USGPHC.end(),
                                           timestamp);      
      G4int index = distance(USGPHC.begin(), it);
      
      MyGlassPlateHit * hit = (* GPHC)[index];
      
      if(find(gotLoA.begin(),
              gotLoA.end(),
              (hit -> GetTrackID()))
         == gotLoA.end()
      )
      { 
         analysisManager ->
            FillNtupleSColumn(1,
                              0,
                              hit -> GetPartName());
         analysisManager ->
            FillNtupleIColumn(1,
                              1,
                              hit -> GetTrackID());
         analysisManager ->
            FillNtupleDColumn(1,
                              2,
                              hit -> GetTime() / ps);
         analysisManager ->
            FillNtupleIColumn(1,
                              3,
                              hit -> GetID());     
         analysisManager ->
            FillNtupleDColumn(1,
                              4, 
                              (hit -> GetEntryPoint()).x() / cm);
         analysisManager ->
            FillNtupleDColumn(1,
                              5, 
                              (hit -> GetEntryPoint()).y() / cm);
         analysisManager ->
            FillNtupleDColumn(1,
                              6, 
                              (hit -> GetEntryPoint()).z() / cm);         
         analysisManager ->
            FillNtupleDColumn(1,
                              7, 
                              (hit -> GetMomentumDir()).x() / cm);
         analysisManager ->
            FillNtupleDColumn(1,
                              8, 
                              (hit -> GetMomentumDir()).y() / cm);
         analysisManager ->
            FillNtupleDColumn(1,
                              9, 
                              (hit -> GetMomentumDir()).z() / cm);                               
         analysisManager ->
            FillNtupleDColumn(1,
                              10,
                              hit -> GetEtot() / eV);
         
         analysisManager -> AddNtupleRow(1);
      }
   }

Don’t bother with the details of the above snippet if you don’t know Geant4; I just wanted to show how I fill my NTuples. The snippet fills one of the three NTuples that are stored in each ROOT file.
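One side note on the time-sorting step in the snippet above: looking up each sorted time stamp with find() always returns the first match, so two hits with exactly the same time stamp would map to the same index, and one of them would be written twice while the other is skipped. A minimal sketch of an alternative is to sort the indices instead of the values. The function name timeSortedIndices is made up for this illustration and is not part of Geant4 or ROOT:

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the hit indices ordered by ascending time stamp.
// stable_sort keeps hits with equal time stamps in their original order,
// so every index appears exactly once in the result.
std::vector<std::size_t> timeSortedIndices(const std::vector<double>& times)
{
    std::vector<std::size_t> order(times.size());
    std::iota(order.begin(), order.end(), std::size_t{0});   // 0, 1, 2, ...
    std::stable_sort(order.begin(), order.end(),
                     [&times](std::size_t a, std::size_t b) {
                         return times[a] < times[b];
                     });
    return order;
}
```

In the fill loop, one would then iterate over the returned indices and fetch each hit with (* GPHC)[index], without the find()/distance() lookup. Since G4double is a typedef for double, the USGPHC vector can be passed in directly.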

Basically, if I tell my app to do 1000 runs, I get 1000 ROOT files, each containing 3 NTuples with the data.
Optionally, I could also produce a single ROOT file as output, which would then contain 3000 NTuples. But the problem remains: I need a fast way of reading the values stored in all these NTuples and storing them in a way that makes them easy to manipulate.
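Since the NTuple names are built as detector name plus "_Run" plus the run ID, any automated reader first has to reconstruct those names for each run. A minimal sketch, assuming the three detector prefixes that appear in the file listing further down (Scintillator, GlassPlate, SPADs); the helper names treeNameForRun and treeNamesForRun are made up for this illustration:

```cpp
#include <string>
#include <vector>

// Builds the tree name for one detector and one run, e.g. "GlassPlate_Run3".
std::string treeNameForRun(const std::string& detector, int runID)
{
    return detector + "_Run" + std::to_string(runID);
}

// Lists the three tree names expected for a given run.
// The detector prefixes are taken from the ls() output in this post.
std::vector<std::string> treeNamesForRun(int runID)
{
    const std::vector<std::string> detectors{"Scintillator", "GlassPlate", "SPADs"};
    std::vector<std::string> names;
    names.reserve(detectors.size());
    for (const std::string& detector : detectors) {
        names.push_back(treeNameForRun(detector, runID));
    }
    return names;
}
```

A loop over the run IDs can then ask each file for the trees with these names, whether the output is 1000 files or a single one.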

Now to ROOT:
ROOT recognizes my NTuples as TTree objects, which have a TBranch object for every column of the NTuple.
I wrote a little code that gives me access to the files and trees, and the Print() method shows that the data is there:

//############Code Snippet############
//(myFile is the TFile object to a corresponding .root-File, myTree is a TTree object to one of the trees in the file)
//####################################

myFile -> ls();

for(int i = 0; i < myTree -> GetNbranches(); i++)
   {
      TBranch * theBranch =
         (TBranch *) (myTree -> GetListOfBranches()) -> At(i);
      
      theBranch -> Print();
   }
   
//############OUTPUT:############

/home/fdachs/Dropbox/Geant4/Projects/mySimulation/build/Output/root/rootData_31-3-2016_Run3.root
TFile**		/home/fdachs/Dropbox/Geant4/Projects/mySimulation/build/Output/root/rootData_31-3-2016_Run3.root	
 TFile*		/home/fdachs/Dropbox/Geant4/Projects/mySimulation/build/Output/root/rootData_31-3-2016_Run3.root	
  KEY: TTree	Scintillator_Run3;1	PrimaryData
  KEY: TTree	GlassPlate_Run3;1	GlassPlateData
  KEY: TTree	SPADs_Run3;1	SPADData
*Br    0 :Particle  : Char_t GlassPlate_Run3                                 *
*Entries :    11896 : Total  Size=     215245 bytes  File Size  =      18515 *
*Baskets :        5 : Basket Size=      32000 bytes  Compression=  11.13     *
*............................................................................*
*Br    1 :TrackID   : Int_t GlassPlate_Run3                                  *
*Entries :    11896 : Total  Size=      95919 bytes  File Size  =      28936 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=   2.21     *
*............................................................................*
*Br    2 :TimeStamp : Double_t GlassPlate_Run3                               *
*Entries :    11896 : Total  Size=     143614 bytes  File Size  =      68796 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.40     *
*............................................................................*
*Br    3 :GlassPlateID : Int_t GlassPlate_Run3                               *
*Entries :    11896 : Total  Size=      95949 bytes  File Size  =      17330 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=   3.70     *
*............................................................................*
*Br    4 :PositionX : Double_t GlassPlate_Run3                               *
*Entries :    11896 : Total  Size=     143614 bytes  File Size  =      53642 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.79     *
*............................................................................*
*Br    5 :PositionY : Double_t GlassPlate_Run3                               *
*Entries :    11896 : Total  Size=     143614 bytes  File Size  =      55196 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.74     *
*............................................................................*
*Br    6 :PositionZ : Double_t GlassPlate_Run3                               *
*Entries :    11896 : Total  Size=     143614 bytes  File Size  =      61880 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.55     *
*............................................................................*
*Br    7 :MomentumDirX : Double_t GlassPlate_Run3                            *
*Entries :    11896 : Total  Size=     143635 bytes  File Size  =      71916 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.34     *
*............................................................................*
*Br    8 :MomentumDirY : Double_t GlassPlate_Run3                            *
*Entries :    11896 : Total  Size=     143635 bytes  File Size  =      71879 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.34     *
*............................................................................*
*Br    9 :MomentumDirZ : Double_t GlassPlate_Run3                            *
*Entries :    11896 : Total  Size=     143635 bytes  File Size  =      71876 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.34     *
*............................................................................*
*Br   10 :Energy    : Double_t GlassPlate_Run3                               *
*Entries :    11896 : Total  Size=     143593 bytes  File Size  =      70479 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.36     *
*............................................................................*

So there are 11 branches with the names that I gave them in my Geant4 app. For this particular run they contain 11896 entries each, and the branches seem to know what type they’re storing.
A look at the files with TBrowser even shows me histograms of the data, so the ROOT files should be OK.

Finally my Question:
Is there a way of getting the actual values of the entries of the Branches and putting them, for example, into an array or vector?

Basically like:

vector<double> energies;
for(int i = 0; i < theBranch -> GetEntries(); i++)
{
   energies.push_back(theBranch -> GetEntry(i));
}

Sadly, the GetEntry() method only returns the number of bytes read, not the value of the entry (why is this not called GetBytesRead() or something??), but I think you know what I mean.
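For reference, the usual pattern is that GetEntry() does not hand back the value but copies it into a variable that was registered beforehand with SetBranchAddress(). Below is a minimal sketch of that pattern, written as a template so that it compiles without ROOT. The names readDoubleBranch and InMemoryTree are made up for this illustration; InMemoryTree is only a stand-in for trying the code, not a ROOT class. The four member functions used are assumed to behave like their TTree counterparts:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Reads every entry of a double-valued branch into a vector.
// Tree is any type offering SetBranchAddress, GetEntries, GetEntry and
// ResetBranchAddresses with TTree-like meaning; with ROOT, pass a TTree.
template <typename Tree>
std::vector<double> readDoubleBranch(Tree& tree, const char* branchName)
{
    double value = 0.0;                         // buffer the tree writes into
    tree.SetBranchAddress(branchName, &value);  // connect the branch to the buffer

    std::vector<double> values;
    const long long nEntries = tree.GetEntries();
    values.reserve(static_cast<std::size_t>(nEntries));
    for (long long i = 0; i < nEntries; ++i) {
        tree.GetEntry(i);          // return value is the byte count, ignored here
        values.push_back(value);   // the actual number is now in the buffer
    }
    tree.ResetBranchAddresses();   // the local buffer goes out of scope after return
    return values;
}

// Hypothetical stand-in for a tree with a single double branch, only so the
// sketch can be compiled and tried without ROOT.
struct InMemoryTree {
    std::string branch;
    std::vector<double> data;
    double* address = nullptr;

    void SetBranchAddress(const char* name, double* buffer)
    {
        if (branch == name) address = buffer;
    }
    long long GetEntries() const { return static_cast<long long>(data.size()); }
    int GetEntry(long long i)
    {
        if (!address) return 0;
        *address = data[static_cast<std::size_t>(i)];
        return static_cast<int>(sizeof(double));
    }
    void ResetBranchAddresses() { address = nullptr; }
};
```

With ROOT, the call would look like vector<double> energies = readDoubleBranch(*myTree, "Energy"); assuming myTree is the TTree pointer from the snippet above. This only covers the Double_t branches; the Int_t and Char_t branches would need a buffer of the matching type.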

I feel like I missed something obvious and the answer is easy, but I can’t see it for some reason. Maybe ROOT is just meant to be used in a different way, but all the tutorials I could find either only cover the ROOT shell or show super easy examples that aren’t much use :frowning:

If you could help me out or point me towards a similar post in the forum, I would be most grateful!!
I looked through many posts myself, but didn’t find anything that could help me :confused:

best regards,

Badger

I think the best option for you would be to try an “analysis skeleton”. See, for example, the links in: How are multiple TTree->Draw()s done?

Wow, that was fast, thanks for the link!
I’ll look right into it :smiley: