Getting value from TBranch is extremely slow

ROOT Version: 6.24/00
Platform: 5.12.13-300.fc34.x86_64
Compiler: gcc version 11.1.1 20210531 (Red Hat 11.1.1-3) (GCC)


Hello,

I have a ROOT Tree which contains 105 Branches. I am trying to access the values of each branch and fill the corresponding histogram using the attached code. This code works, but it is dead slow. At this rate it will take ages to complete the data analysis.

Any help/guidance is highly appreciated.

I can provide a sample root file which I am using for the analysis, if required.

Regards,

Ajay

GetValue.C (1.4 KB)

Hi @ajaydeo ,
that code does a lot of extra work:

  • calling fChain->LoadTree at every event is expensive
  • calling fChain->GetListOfBranches()->At(i) is expensive
  • calling fChain->GetLeaf at every event might be expensive

On Linux you can use e.g. perf to profile an execution of your code and see where time is spent.

Also, if you need speed you should compile the code with optimizations (i.e. with g++ -O2 ...) or execute it via ACLiC (root GetValue.C+, note the + at the end), which compiles with optimizations under the hood.

Also check the ROOT::RDataFrame class reference for a higher-level data processing interface that takes care of these details for you.
If you want a histogram for each branch in the tree, with RDataFrame the code looks like this (have not tested this but it should give you an idea):

ROOT::RDataFrame df("RoseNIAS", "eu152_11july_afterGlitch.001");
std::vector<ROOT::RDF::RResultPtr<TH1D>> histos;
for (const std::string &col : df.GetColumnNames()) {
  auto h = df.Histo1D<double>(col);
  histos.push_back(h);
}
// Computation has been booked but it has not started yet.
// The first time you access one of the histograms,
// all of them will be filled in a single event loop

If you add ROOT::EnableImplicitMT() at the beginning of that snippet, reading data and filling histograms is done in parallel on multiple threads for better performance.

Cheers,
Enrico

P.S.
note that the 6 lines of code above are really all the code needed to substitute what you have in GetValue.C: RDataFrame is more concise.

Dear @eguiraud,

Thank you very much for your quick response.

I am not familiar with the RDataFrame class; it looks a bit abstract to me.
Filling histograms is NOT the only objective! What I want is:

  1. Generating & Filling Histograms with the names corresponding to each branch.
  2. Value of each branch. This will be calibrated to find energies of gamma rays.

Can you please guide me further on the above aspects?

Regards,

Ajay

Hi,
about 1., that’s what my code above does.

About 2., what do you mean when you say you want the “value of each branch”? Each branch has one value per entry. Do you want to put all those values e.g. in a vector?

The documentation I linked plus the RDataFrame tutorials should explain everything.

Cheers,
Enrico

I ran the code. It executes, but I can’t find the histograms!
For example, one of the histograms in the Tree has name: "CL_04_E01"

I have a question : How do I get the number of entries in the Tree?

Yes, I would like to put all the values in an array, say data (or a vector); which I can then access in another loop.

Maybe the code from my earlier post will help explain what I am trying to achieve.

All I want to do is the following:

Regards,

Ajay

The way you were doing it in GetValue.C is totally fine. RDataFrame can count entries but it does not offer a method to retrieve the number directly from TTree.

With RDF, you can use Take:

auto vectorPtr = df.Take<double>("columnName");

vectorPtr is a RResultPtr<std::vector<double>>, i.e. a pointer to your vector.

Cheers,
Enrico

After I execute the code that you have shared, I should get the histograms, right? If I do
root [1] .ls
I don’t see any histograms! How to access those? Where are they?

As an aside (@eguiraud’s answers are more relevant):
In,

   Long64_t ientry = fChain->LoadTree(k);
   if (ientry < 0) break;
   fChain->GetEntry(k);

The first two lines are redundant with the third one per se. However, calling fChain->GetEntry(k); is often overkill, as it reads the whole entry whereas you only need a “few” branches. The more typical use is:

   Long64_t ientry = fChain->LoadTree(k);
   if (ientry < 0) break;
   // Do the following for every branch you need.
   branch->GetEntry(ientry);

(This pattern is used (in essence) by RDF.)

The histograms are in the histos vector.

The statement that calling fChain->LoadTree at every event is expensive is not quite accurate: it “just” moves the cursor (if needed) to the requested tree in the TChain and returns the corresponding entry number in the local TTree. It is cheap (except when it is time to open a new file).


According to you - if I access one of the histograms all of them will be filled in a single event loop.

I am still struggling to retrieve/access the histograms from the histos vector and draw them.

Hi,
does something like histos[0]->GetMean() return something meaningful?

When I execute: [in both ROOT 6.24/00 and ROOT 6.25/01]

root [0] ROOT::RDataFrame df("RoseNIAS", "eu152_11july_afterGlitch.002");
root [1] auto h1 = df.Histo1D("CL_02_E01");
root [2] h1->Draw();

This is what I am getting:

Fatal: fConcreteAction != nullptr violated at line 43 of `/opt/root-6.24.00/tree/dataframe/src/RJittedAction.cxx'
aborting
#0  0x00007f2329b20aca in ?? () from /lib64/libc.so.6
#1  0x00007f2329a9e09b in ?? () from /lib64/libc.so.6
#2  0x00007f232a2de7dc in TUnixSystem::StackTrace() () from /opt/root/pro/lib/libCore.so
#3  0x00007f232a1b0e12 in DefaultErrorHandler(int, bool, char const*, char const*) () from /opt/root/pro/lib/libCore.so
#4  0x00007f232a2697f1 in ErrorHandler () from /opt/root/pro/lib/libCore.so
#5  0x00007f232a26a208 in Fatal(char const*, char const*, ...) () from /opt/root/pro/lib/libCore.so
#6  0x00007f23118648e6 in ROOT::Internal::RDF::RJittedAction::TriggerChildrenCount() () from /opt/root/pro/lib/libROOTDataFrame.so
#7  0x00007f231186c265 in ROOT::Detail::RDF::RLoopManager::EvalChildrenCounts() () from /opt/root/pro/lib/libROOTDataFrame.so
#8  0x00007f231186c2ac in ROOT::Detail::RDF::RLoopManager::InitNodes() () from /opt/root/pro/lib/libROOTDataFrame.so
#9  0x00007f2311873eb3 in ROOT::Detail::RDF::RLoopManager::Run() () from /opt/root/pro/lib/libROOTDataFrame.so
#10 0x00007f2322cdc0cf in ?? ()
#11 0x0000000000000000 in ?? ()

I am sharing one data file with you for further diagnosis, if required.

Any help is highly appreciated.

Uhm yeah that should not happen, sorry about that. And unless there are other errors above, I am not sure how it could happen :confused:

…so I have to debug. I requested access to the file you shared.

Permission granted :grinning_face_with_smiling_eyes:

I tried with ROOT’s master branch compiled from source today, with ROOT v6.24.02 installed via conda and ROOT v6.24.00 on LXPLUS (using LCG view LCG_100). They all work.
The histogram is not particularly pretty (long tails!) but the axis limits can be adjusted :slight_smile:

root [0] ROOT::RDataFrame df("RoseNIAS", "eu152_11july_afterGlitch.002")
(ROOT::RDataFrame &) A data frame built on top of the RoseNIAS dataset.
root [1] auto h1 = df.Histo1D("CL_02_E01")
(ROOT::RDF::RResultPtr<TH1D> &) @0x7fe186264050
root [2] h1->Draw()
Info in <TCanvas::MakeDefCanvas>:  created default TCanvas with name c1

[image: the CL_02_E01 histogram drawn by h1->Draw()]

So I can’t reproduce the crash. Things should also be fine for you if you install ROOT via conda or you use an LCG view on lxplus, as we’d be using the exact same ROOT (instructions are at Installing ROOT - ROOT ).

If you can reproduce the crash on LXPLUS, with a conda package or in a Docker container I should be able to reproduce it and check what’s going on.


This is very strange! I always build ROOT from source. It would be difficult to install ROOT via conda only for this purpose.

Is there a way to improve my original code without using RDataFrame? Would certainly love to use RDF, but right now my objective is to get the histograms and access branch value efficiently, with the present versions of ROOT [6.24/00 & 6.25/01] installed on my laptop.

Hi @eguiraud

As per your suggestion, I have installed ROOT via conda and could draw the above histogram.
However, when I try to run the attached macro with the different methods shown below, it throws errors. Am I doing something wrong in the macro?

root [0] .x GetValue.C
Error in <TFile::TFile>: file /home/ajay/Research/IUAC_Experiments_July2021/DATA/Source/Misc/eu152_11july_afterGlitch.002/RoseNIAS does not exist
Warning in <TTreeReader::SetEntryBase()>: There was an issue opening the last file associated to the TChain being processed.
Error in <TFile::TFile>: file /home/ajay/Research/IUAC_Experiments_July2021/DATA/Source/Misc/eu152_11july_afterGlitch.002/RoseNIAS does not exist
Warning in <TTreeReader::SetEntryBase()>: There was an issue opening the last file associated to the TChain being processed.
Error in <TTreeReader::SetEntriesRange()>: Error setting first entry 1185030: problem reading dictionary info from tree
Info in <TCanvas::MakeDefCanvas>:  created default TCanvas with name c1
root [1] .q

(myROOT) ajay@ZBook Misc$ root
   ------------------------------------------------------------------
  | Welcome to ROOT 6.24/02                        https://root.cern |
  | (c) 1995-2021, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Jul 03 2021, 08:02:00                 |
  | From tag , 28 June 2021                                          |
  | With                                                             |
  | Try '.help', '.demo', '.license', '.credits', '.quit'/'.q'       |
   ------------------------------------------------------------------
root [0] .x GetValue.C+
Info in <TUnixSystem::ACLiC>: creating shared library /home/ajay/Research/IUAC_Experiments_July2021/DATA/Source/Misc/./GetValue_C.so
Error in <TFile::TFile>: file /home/ajay/Research/IUAC_Experiments_July2021/DATA/Source/Misc/eu152_11july_afterGlitch.002/RoseNIAS does not exist
Warning in <TTreeReader::SetEntryBase()>: There was an issue opening the last file associated to the TChain being processed.
Error in <TFile::TFile>: file /home/ajay/Research/IUAC_Experiments_July2021/DATA/Source/Misc/eu152_11july_afterGlitch.002/RoseNIAS does not exist
Warning in <TTreeReader::SetEntryBase()>: There was an issue opening the last file associated to the TChain being processed.
Error in <TTreeReader::SetEntriesRange()>: Error setting first entry 1185030: problem reading dictionary info from tree
Info in <TCanvas::MakeDefCanvas>:  created default TCanvas with name c1
root [1] .q

(myROOT) ajay@ZBook Misc$ root
   ------------------------------------------------------------------
  | Welcome to ROOT 6.24/02                        https://root.cern |
  | (c) 1995-2021, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Jul 03 2021, 08:02:00                 |
  | From tag , 28 June 2021                                          |
  | With                                                             |
  | Try '.help', '.demo', '.license', '.credits', '.quit'/'.q'       |
   ------------------------------------------------------------------
root [0] .x GetValue.C++
Info in <TUnixSystem::ACLiC>: creating shared library /home/ajay/Research/IUAC_Experiments_July2021/DATA/Source/Misc/./GetValue_C.so
Error in <TFile::TFile>: file /home/ajay/Research/IUAC_Experiments_July2021/DATA/Source/Misc/eu152_11july_afterGlitch.002/RoseNIAS does not exist
Warning in <TTreeReader::SetEntryBase()>: There was an issue opening the last file associated to the TChain being processed.
Error in <TFile::TFile>: file /home/ajay/Research/IUAC_Experiments_July2021/DATA/Source/Misc/eu152_11july_afterGlitch.002/RoseNIAS does not exist
Warning in <TTreeReader::SetEntryBase()>: There was an issue opening the last file associated to the TChain being processed.
Error in <TTreeReader::SetEntriesRange()>: Error setting first entry 1185030: problem reading dictionary info from tree
Info in <TCanvas::MakeDefCanvas>:  created default TCanvas with name c1
root [1] 

Can you please help?

GetValue.C (1.3 KB)

Hi,
ok so it looks like your ROOT builds have something weird/broken :confused:

To do things without RDF: see some suggestions in my first reply (+ I suggest checking with perf where time is spent, see e.g. Linux perf Examples ).

About the errors, the first one is the root cause:

Error in <TFile::TFile>: file /home/ajay/Research/IUAC_Experiments_July2021/DATA/Source/Misc/eu152_11july_afterGlitch.002/RoseNIAS does not exist

TFile can’t find your file, hard to say why – are you executing the script from the same directory where the file is?

@eguiraud
I built ROOT fresh yesterday with

cmake -DCMAKE_INSTALL_PREFIX="/opt/root/pro" /opt/root-6.24.00,

which proceeded without any errors. It is difficult to believe that there is a problem with the installation.

And YES, I am executing the script in the same directory where the data file resides!

Can you confirm that the macro attached earlier works at your end?