TChain with friends and GetListOfBranches [2]

I’m observing the same behaviour as reported in this post, which has no answer. Could you please clarify?

Hi,

Thanks for the post.
I am sorry you are experiencing problems, and I apologise that the earlier post went unanswered on the forum. Given that a long time has passed since the post you are referencing, would you mind:

  • Posting a minimal reproducer?
  • Specifying what ROOT version you are using?

Best,
D

Sure. I’m using ROOT 6.24/06. You can find the needed input root files here:

small_ggF_Radion_m800_output_0.root (31.7 KB)
small_ggF_Radion_m800_eval_0.root (29.6 KB)

Minimal reproducer:

#include <cassert>
#include <iostream>
#include <string>
#include "TFile.h"
#include "TObjArray.h"
#include "TTree.h"

using namespace std;

void test()
{
  string skim_name = "small_ggF_Radion_m800_output_0.root";
  string eval_name = "small_ggF_Radion_m800_eval_0.root";

  TFile *skim_file = new TFile(skim_name.c_str());
  TFile *eval_file = new TFile(eval_name.c_str());

  TTree *skim_tree = (TTree*)skim_file->Get("HTauTauTree");
  TTree *eval_tree = (TTree*)eval_file->Get("evaluation");
  assert(skim_tree->GetEntries() == eval_tree->GetEntries());
  skim_tree->AddFriend(eval_tree);
  
  string skim_branch = "hbtresdnn_mass3000_spin2_hh";
  string eval_branch = "dau1_pt";

  float skim_var, eval_var;
  skim_tree->SetBranchAddress(skim_branch.c_str(), &skim_var);
  skim_tree->SetBranchAddress(eval_branch.c_str(), &eval_var);

  // for (int iEntry=0; iEntry<skim_tree->GetEntries(); ++iEntry) {
  //     skim_tree->GetEntry(iEntry);
  //     std::cout << skim_var << ", " << eval_var << std::endl;
  //  }

  TObjArray* skim_branches = skim_tree->GetListOfBranches();
  TObjArray* eval_branches = eval_tree->GetListOfBranches();
  int nbr_skim = skim_tree->GetNbranches();
  int nbr_eval = eval_tree->GetNbranches();

  std::cout << "Skim branches"  << std::endl;
  for (int iB=0; iB < nbr_skim; ++iB) {
    std::cout << " - " << skim_branches->At(iB)->GetName() << std::endl;
  }

  std::cout << std::endl;
  std::cout << "Eval branches"  << std::endl;
  for (int iB=0; iB < nbr_eval; ++iB) {
    std::cout << " - " << eval_branches->At(iB)->GetName() << std::endl;
  }
}

// Docs: https://root.cern.ch/root/htmldoc/guides/users-guide/Trees.html#example-3-adding-friends-to-trees
//compile with g++ test.cc -o exec `root-config --cflags --glibs`
int main() {
  test();
  return 0;
}

The code was compiled with:

g++ test.cc -o exec `root-config --cflags --glibs`

(g++ version 11.4.0) and run with ./exec.

I would expect the first loop to print the branches of both trees.

For what it’s worth, I never solved this. I just wrote another function that looped over the friend trees to get their branches too. Unfortunately, that was written back when I was in ATLAS, and I no longer have access to that repo.

Dear @bfontana ,

Thanks for reaching out to the forum. I am not sure I understand what the expected behaviour is here. There are two TTree objects in this case (or two TChain objects in the linked post). TTree::GetListOfBranches returns the list of branches of the TTree/TChain you call it on, so it shouldn’t be surprising that the list refers to only one tree at a time.

That being said, the linked post contained comments such as:

I’m hunting for a complete list of branches available from the chain, including those supplied by friend trees
Currently, I’m thinking that the only option is to loop over the friend trees, and for each friend tree loop over their branches. Is that accurate, or is there a way to get all the branches available to a TChain without directly looping over the friends?

That is exactly what needs to be done. In fact, this is also what RDataFrame does internally when you call df.GetColumnNames() on a data frame that wraps a TTree with some attached friends.

Cheers,
Vincenzo

Dear @vpadulan,

Given that we can access all branches of the trees when looping over the tree (both the ones from the original tree and the ones from the “friend” tree), IMHO the expected behaviour of TTree::GetListOfBranches should be to return the branches of both trees. Same for TTree::GetNbranches() and potentially other TTree methods.
The whole idea of having a friend tree is to expand it at column level, abstracting away the need to access both trees at all times.

Best,
Bruno

Dear @bfontana ,

I see your point. I might not agree 100%, but that is out of scope for this post. In any case, we cannot change the behaviour of the TTree/TChain interface at this point, but the ROOT team is learning from that experience and targeting the new RNTuple data format for this type of evolution. It is very likely that a similar case in RNTuple will be treatable with a better interface.

Meanwhile, as an example of how you can implement a function yourself that abstracts away GetListOfBranches over one or more TChain objects, see how it is done in RDataFrame: tree/dataframe/src/RLoopManager.cxx in root-project/root on GitHub, at commit 1f86c2410d8e1dcd914f4adeb6c8cd523f7c56ff.

Cheers,
Vincenzo