Merge friend trees into a single tree

Hi ROOTers,
I am trying to merge two trees with different branches into a single tree containing all of them. I tried to do it using TChain and the AddFriend function.
So far I have only tried it directly from the terminal.
I add the main tree to a first chain, add the other tree to a second chain, and then add the second chain as a friend of the main one.

from ROOT import TChain
originalChain = TChain("merged")
originalChain.Add("mainFile.root")
friendChain = TChain("variables")
friendChain.Add("additionalFile.root")
originalChain.AddFriend("variables")

I can access all the variables contained in the "variables" tree using the Draw() or Scan() functions on originalChain.
What I cannot manage to do is save everything as a single TTree that contains all the branches.
I tried

originalChain.Merge("newFile.root")

but the output contained only the branches from the "merged" tree.
Is there a way to do this?

ROOT Version: 6.18/00
Platform: Linux 3.10.0-1127.13.1.el7.x86_64

Maybe @pcanal can help

Merging friend trees (horizontal merge) is not yet implemented in CloneTree.

To do such a merge you would need:

  • CloneTree(0) [just the structure] the main tree.
  • Add/create the branches corresponding to the friend tree in the cloned structure.
  • Connect the branch addresses from the friend to the new branches in the cloned structure.
  • [Probably] connect a notify object (SetNotify on the friend chain) so that you can update the branch addresses of the friend tree and the cloned structure when the chain switches files.
  • Loop over the entries and call GetEntry/Fill as appropriate.

I.e., developers usually prefer to just rely on the friend connection instead of duplicating the files/data.

Can any RDataFrame operation be used in this case?
In the past, to merge a tree and a friend tree into a single tuple, I called AddFriend on the given TTree with an alias for each friend, then applied Define repeatedly so that each alias.XX column gets defined as XX, and finally called Snapshot with only the columns I wanted to keep.

I don’t know whether building a dataframe from a TChain with an added friend would work as well. It might be worth a try.

This sounds interesting, thanks.
Could you provide an example/tutorial on how to do it?

This is a bit of pseudocode, not tested, but hopefully it gives you the idea:

    #include <ROOT/RDataFrame.hxx>
    #include <TFile.h>
    #include <TString.h>
    #include <TTree.h>
    #include <string>
    #include <utility>
    #include <vector>

    // fileBase is assumed to be an already opened TFile that holds the main tree.
    TTree *BaseTree = fileBase.Get<TTree>("CovTuple"); // or your original TChain
    // Each entry: { alias, { tree name in the friend file, friend file name } }
    std::vector<std::pair<TString, std::pair<TString, TString>>> _listOfFriends{
        {"friend1", {"treeName", "fileName1.root"}},
        // .........
    };
    for (auto &friends : _listOfFriends) {
        // Same as BaseTree->AddFriend("alias = treeNameInFile", "fileFriend.root");
        BaseTree->AddFriend(TString::Format("%s = %s", friends.first.Data(), friends.second.first.Data()).Data(),
                            friends.second.second.Data());
    }

    ROOT::RDataFrame df(*BaseTree);
    ROOT::RDF::RNode node(df);
    // All columns; the friend ones should show up as friendAlias.XXX
    std::vector<std::string> _allColumns = df.GetColumnNames();
    std::vector<std::string> _colKeep;
    for (auto &c : _allColumns) {
        TString colCheck(c);
        bool _isFriendColumn = false;
        for (auto &friends : _listOfFriends) {
            // Match only a leading "alias." so that unrelated names containing the alias are not caught
            TString _preFixBranch = friends.first + ".";
            if (colCheck.BeginsWith(_preFixBranch)) {
                // Work on a copy: ReplaceAll modifies the string in place
                TString _definedBranch = colCheck;
                _definedBranch.ReplaceAll(".", "_");
                node = node.Define(_definedBranch.Data(), c);
                _colKeep.push_back(_definedBranch.Data());
                _isFriendColumn = true;
                break;
            }
        }
        if (!_isFriendColumn) _colKeep.push_back(c);
    }
    node.Snapshot("merged", "merged.root", _colKeep);

Your final tuple will in principle contain all columns from the original tree, plus all the added friend branches, named like

friendAliasX_COLUMNY

The aliasing is useful if you have 4-5 different friends that all use the same branch names: to avoid confusing the RDataFrame, the aliases make the branches show up as friend1.XX, friend2.XX in the GetColumnNames() list. The Define with "." replaced by "_" is arbitrary; it is just there to flatten everything.
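
To make the renaming rule above concrete (alias.XX becomes alias_XX, everything else is kept as is), here is a minimal standalone sketch that uses only the C++ standard library, so it can be checked without ROOT. The function name FlattenFriendColumns and the sample column names in the note below are hypothetical, chosen only for illustration; they are not part of ROOT.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of ROOT): given the column names reported for a
// tree with aliased friends, compute the flattened output name of each column.
// Returns (outputName, sourceColumn) pairs; non-friend columns map to themselves.
std::vector<std::pair<std::string, std::string>>
FlattenFriendColumns(const std::vector<std::string> &columns,
                     const std::vector<std::string> &aliases)
{
    std::vector<std::pair<std::string, std::string>> result;
    for (const auto &col : columns) {
        std::string out = col;
        for (const auto &alias : aliases) {
            assert(!alias.empty()); // an empty alias would match any ".xxx" name
            const std::string prefix = alias + ".";
            // Match only a leading "alias." so that a main-tree branch whose name
            // merely contains the alias text is left untouched.
            if (col.compare(0, prefix.size(), prefix) == 0) {
                for (auto &ch : out)
                    if (ch == '.') ch = '_';
                break;
            }
        }
        result.emplace_back(out, col);
    }
    return result;
}
```

For example, with aliases friend1 and friend2, the columns pt, friend1.x and friend2.x would map to pt, friend1_x and friend2_x. Each pair whose two names differ could then be passed to Define(outputName, sourceColumn), and all output names collected as the column list for Snapshot.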
