The fastest way to read all tree branch data?

Hello,

I would like to learn the fastest way to read all the data from a simple tree branch.
Imagine a single branch MyData with a single leaf of type Float_t, one entry per event.
The tree is a single TTree stored in one TFile.
I would like to unpack all its data, i.e. the values for all events, into memory (as a Float_t* array) at once.
Which approach takes the least run time?

Thank you.


ROOT Version: 6.18


Hi Gianluca,
welcome to the ROOT forum! Without any other context, probably pre-allocating the vector and using TBranch::GetEntry is the fastest (@pcanal can comment on whether I’m missing a tree->SetBranchStatus("*", 0); tree->SetBranchStatus("x", 1);):

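// Needs <TFile.h>, <TTree.h>, <TBranch.h> and <vector> when compiled.
TFile f("file.root");
TTree* t = f.Get<TTree>("tree");
TBranch* b = t->GetBranch("x");
float x;
b->SetAddress(&x);  // each GetEntry call below writes the value into x
const auto n_entries = t->GetEntries();
std::vector<float> v;
v.reserve(n_entries);  // allocate once, so push_back never reallocates
for (auto i = 0ll; i < n_entries; ++i) {
    b->GetEntry(i);  // reads only this branch, not the whole tree
    v.push_back(x);
}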

Note that, for performance, you should always use compiled code (not interpreted macros) and compile with at least -O2. Also, the choice of ROOT API might matter no more than the TTree clustering/basket size, or the network bandwidth if you are reading over the network. Finally, depending on the dataset size, you could see significant speed-ups from reading different TTree clusters in different threads (speed-ups that would dwarf any performance difference between the APIs).
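To illustrate the multi-threaded idea, here is a minimal sketch in plain C++ that does not use ROOT at all. It assumes the entry range can simply be split into equal contiguous chunks (a stand-in for real TTree cluster boundaries) and that a thread-safe callable can return the value of entry i. The names parallel_fill and read_entry are hypothetical. In real ROOT code each thread would typically need its own TFile/TTree objects.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Fill a pre-allocated vector in parallel. Each thread handles one contiguous
// range of entries and writes only to its own slice, so no locking is needed.
// `read_entry` is a hypothetical, thread-safe stand-in for "read entry i".
std::vector<float> parallel_fill(std::size_t n_entries, unsigned n_threads,
                                 const std::function<float(std::size_t)>& read_entry)
{
    std::vector<float> out(n_entries);  // single allocation up front
    if (n_threads == 0) n_threads = 1;
    // Chunk size rounded up, so n_threads chunks cover all entries.
    const std::size_t chunk = (n_entries + n_threads - 1) / n_threads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        const std::size_t begin = std::min(n_entries, t * chunk);
        const std::size_t end = std::min(n_entries, begin + chunk);
        if (begin == end) break;  // nothing left to hand out
        workers.emplace_back([&out, &read_entry, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                out[i] = read_entry(i);
        });
    }
    for (auto& w : workers) w.join();
    return out;
}
```

Calling parallel_fill(n, 4, reader) returns the same vector a sequential loop would, as long as reader(i) depends only on i. Depending on the toolchain, linking may require -pthread.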

I attach benchmarking code that you can play with.

Cheers,
Enrico

read_bench.cpp (2.7 KB)

{
  TFile *f = TFile::Open("some_file.root");
  TTree *t = nullptr; f->GetObject("some_tree", t);
  t->SetEstimate(-1); // keep all results (assumes one result per entry)
  // "goff" disables graphics; the selected values stay in an internal buffer
  Long64_t n = t->Draw("some_leaf", "", "goff");
#if 1 /* 0 or 1 */
  if (n > t->GetEstimate()) // just a precaution
    { t->SetEstimate(n); t->Draw("some_leaf", "", "goff"); }
#endif /* 0 or 1 */
  // GetV1() returns a Double_t* to the buffered values
  for (Long64_t i = 0; i < n; i++) std::cout << t->GetV1()[i] << std::endl;
}

Wow, thanks, I had no idea TTree::Draw could do that!
I added your solution to the benchmark (hopefully without misinterpreting/mistranslating anything).
The new version is below. On my machine, the accumulated runtimes for 10 iterations over 1e6 entries are as follows (they might not be very representative, because the data is only one TTree cluster):

branch           0.326195
tree             0.505985
treereader       0.457618
take             0.62232
foreach          0.588362
tree_draw        0.665106

read_bench_v2.cpp (3.4 KB)


By the way, may I ask what your actual use case is? Depending on the answer there could be other options (e.g. if you have to fill a histogram with the data, it's better to do it on the fly, and if you have more than one float per entry my numbers above might not really apply).

Cheers,
Enrico

Well, I cheated. My actual use case is to fill a linear Python data structure (say a numpy.ndarray or an array.array) with data from the tree. The data is stored sparsely in two branches, as index:value pairs, and any index not present in the tree has the value 0. I have no control over the format of the input data.
Iterating with for event in tree: was slower than I had hoped.
But in the end my question was genuine, beyond this specific use case.

Thank you for the answers!
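For concreteness, the sparse-to-dense step described above could look like the following minimal C++ sketch. It assumes the two branches have already been read into two equally long arrays, and that the length of the dense output is known in advance. The names densify, indices, values and size are hypothetical and do not come from the original data format.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Expand sparse (index, value) pairs into a dense array of length `size`.
// Indices that never appear in `indices` keep the default value 0.
// If an index appears more than once, the last value wins.
std::vector<float> densify(const std::vector<long long>& indices,
                           const std::vector<float>& values,
                           std::size_t size)
{
    if (indices.size() != values.size())
        throw std::invalid_argument("indices and values differ in length");

    std::vector<float> dense(size, 0.0f);  // one allocation, zero-filled
    for (std::size_t i = 0; i < indices.size(); ++i) {
        const long long idx = indices[i];
        if (idx < 0 || static_cast<std::size_t>(idx) >= size)
            throw std::out_of_range("index outside the dense array");
        dense[static_cast<std::size_t>(idx)] = values[i];
    }
    return dense;
}
```

For example, densify({1, 4}, {2.5f, -1.0f}, 6) yields a six-element array that is zero everywhere except at positions 1 and 4.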

Well, if the language is Python, that changes things: explicit for loops over the entries are out of the question.
You can use TTree::AsMatrix or RDataFrame::AsNumpy; both are covered in the ROOT tutorials.

Other helpful posts: https://root-forum.cern.ch/search?context=topic&context_id=38263&q=event%20loop%20Python%20slow&skip_context=true

Out of curiosity: what's the physics use case for a TTree like that?
