
Make a new TTree from a **deep** vertical union of existing TTrees

I’d like to solve the following problem of making a vertical union of TTrees:

Given two TTrees (t1, t2) with an identical number of entries, each tree has a list of (distinct) branches:
t1 -> GetListOfBranches() = { a, b, c }
t2 -> GetListOfBranches() = { d, e }

The goal is to combine these TTrees into a single TTree containing the union
of all branches. This procedure should be as fast as possible, possibly never unfolding
any data and using parallelism if possible.

Further constraints, that rule out a simple TFriend mechanism, are:

  • There should be a single file and a single TTree at the end of the procedure.
  • The return value of GetListOfBranches() on the final TTree should be the union of all branches.

I’d like to document a working solution to this problem, using an intermediate RDataFrame:

 TFile file(f1, "READ");  // open the file holding the first TTree read-only
 auto t1 = file.Get<TTree>(treename);
 t1->AddFriend(treename, f2); // add the second TTree (stored in file f2) as a friend
 ROOT::RDataFrame df(*t1);
 df.Snapshot(treename, finalfilename, ".*"); // write out combined columns

Thanks to @Axel for this proposition.

While this solution works, I am wondering whether there is a faster mechanism
that never unfolds/decompresses any data (and possibly uses a parallel copy).
Any suggestion on how this could be achieved, for instance using TTreeCloner, would be appreciated.

@pcanal can you please suggest something here?

We have not implemented the horizontal merge yet.

One solution (not exactly matching Sandro’s requirement, but close):

auto mainfile = TFile::Open(firsttreefilename, "UPDATE");
auto friendfile = TFile::Open(secondtreefilename, "READ");
auto friendtree = friendfile->Get<TTree>(secondtreename);
mainfile->cd(); // make the main file the current directory so the clone is created there
auto friendcopy = friendtree->CloneTree(-1, "fast"); // "fast" copies the baskets without decompressing them
auto maintree = mainfile->Get<TTree>(firsttreename);
maintree->AddFriend(friendcopy);
mainfile->Write();

Now, in the updated firsttreefile, the main (first) tree will “appear” to the user as a single tree horizontally merging the two trees (because the TTree friendship is persistent).

Furthermore, replacing the AddFriend line above with:

maintree->GetListOfBranches()->AddAll( friendcopy->GetListOfBranches() );

might do the trick for most cases. Things not handled properly include the BranchRef mechanism, the content of TTree::GetUserInfo, the cluster boundaries (only the ones from the main tree are kept; the new branches may or may not be aligned), tree size information (bytes saved, etc.), and branch name conflicts.
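
Of the caveats listed, the branch name conflict is the easiest one to check before calling AddAll. Here is a minimal sketch of such a check, written against plain strings so that it does not depend on ROOT. The helper name conflictingBranchNames is hypothetical (it is not a ROOT function); in a macro, the two name lists would presumably be filled by looping over each tree's GetListOfBranches() and calling GetName() on every entry.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of ROOT): return the branch names that occur
// in both lists, sorted alphabetically. An empty result means no name clash.
std::vector<std::string> conflictingBranchNames(const std::vector<std::string>& mainNames,
                                                const std::vector<std::string>& friendNames)
{
    // std::set sorts and de-duplicates, which std::set_intersection requires.
    const std::set<std::string> mainSet(mainNames.begin(), mainNames.end());
    const std::set<std::string> friendSet(friendNames.begin(), friendNames.end());

    std::vector<std::string> conflicts;
    std::set_intersection(mainSet.begin(), mainSet.end(),
                          friendSet.begin(), friendSet.end(),
                          std::back_inserter(conflicts));
    return conflicts;
}
```

With the example from the first post, { a, b, c } and { d, e } give an empty result, so the lists could be merged without a name clash. If the result is not empty, it is probably safer to stay with the AddFriend variant.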