Easily aggregate 2 RDataFrames with the same number of entries in a sorted way

Dear experts,
I would like to solve an issue that has led me to do some nasty gymnastics with RDataFrame and NumPy arrays, while there might already be an easier solution.

In practice I have 2 TTrees, both with the same branch names and the same number of entries.
What I would like to do is a global renaming of the branches in tree 1 and tree 2, followed by a parallel merge of the 2 TTrees, so that entry 1 of tree 1 can be compared to entry 1 of tree 2.
Is there an appropriate tool to rename branches in bulk and merge 2 TTrees entry by entry, keeping the total number of entries equal to that of the original tree 1 and tree 2?
Thanks
Renato




Did you try: https://root.cern.ch/root/htmldoc/guides/users-guide/Trees.html#example-3-adding-friends-to-trees ?


Hi @pcanal, indeed, I did not think of making the 2 trees friends. It might do the trick; however, my base tree and friend tree have exactly the same branch names. Is that a problem?

It should be fine, the branches in the friend tree should be accessible as "friendname.branchname". Let us know if there is any problem.

Cheers,
Enrico

Yes, I will. If that’s the case it will work. However, in my use case I have, say, 8 TTrees to merge as friends, all with the same tree name and the same branches, so that might be tricky.

Actually, I think that AddFriend("CustomName = SAMENAMES", "FILEX") would do the trick,
where CustomName must be different for each of the 8 friends to chain.

Yes, you can specify an alias when adding a friend.
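
To make the alias syntax concrete, here is a minimal sketch in plain C++ (it compiles without ROOT) of how the "alias = treename" strings for several friends could be built from a map of alias to file. The helpers makeFriendSpec and makeFriendSpecs are hypothetical names chosen for illustration, not part of ROOT; the string they produce is what would go into the first argument of AddFriend, as in the call quoted above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: builds the "alias = treename" string that is
// passed as the first argument of TTree::AddFriend.
std::string makeFriendSpec(const std::string &alias, const std::string &treeName)
{
    assert(!alias.empty() && !treeName.empty());
    return alias + " = " + treeName;
}

// Hypothetical helper: one (spec, file) pair per friend, given alias -> file.
// std::map keys are unique, so every friend ends up with a distinct alias.
std::vector<std::pair<std::string, std::string>>
makeFriendSpecs(const std::map<std::string, std::string> &aliasToFile,
                const std::string &treeName)
{
    std::vector<std::pair<std::string, std::string>> specs;
    for (const auto &el : aliasToFile)
        specs.emplace_back(makeFriendSpec(el.first, treeName), el.second);
    return specs;
}
```

Each resulting pair would then be handed to the base tree, for example as tree->AddFriend(spec.first.c_str(), spec.second.c_str()).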


Here is a snippet that works very nicely:


    // Needs <iostream>, <map>, ROOT/RDataFrame.hxx, TFile.h, TTree.h and TString.h.
    // Base tree: 100 entries with a bsIDX branch that mirrors the entry number.
    ROOT::RDataFrame df(100);
    df.Define("bsIDX", "rdfentry_").Snapshot("PID_TRK_BDT_BKIN_MULT_RECO_L0_HLT_nTracks", "Base.root");

    TFile f("Base.root");
    TTree *EfficiencyTree = f.Get<TTree>("PID_TRK_BDT_BKIN_MULT_RECO_L0_HLT_nTracks");

    // Alias of each friend -> file holding the friend tree.
    std::map<TString, TString> Files = {
        {"RK_EE_11_MD", "/eos/lhcb/wg/RD/RKstar/efficiencies/v9/RK/Efficiency_EE_rJPsi_RECO_HLT_L0_PIDCalib_TRK_BDT-DTF_L0_HLT_nTracks_priorBDT_PIDCalib_3D_PIDCALIB_PIDCALIB_BS-q2jps/11MD/EffTuple.root"},
        // ....
    };
    TString _treeName = "PID_TRK_BDT_BKIN_MULT_RECO_L0_HLT_nTracks";
    for (auto &&el : Files) {
        // "alias = treename" gives every friend its own unique name.
        TString friend_aliasing = el.first + " = " + _treeName;
        std::cout << "adding as friend " << friend_aliasing << " from file \n \t" << el.second << std::endl;
        EfficiencyTree->AddFriend(friend_aliasing, el.second);
    }

Creating a single dataframe from the base tree afterwards makes all branches accessible as:

RKst_MM_12_MD.effnorm_L0L_exclusive2_wB0_kde
RKst_MM_12_MD.eps_L0L_exclusive2_wBp_kde
RKst_MM_12_MD.norm_L0L_exclusive2_wBp_kde
RKst_MM_12_MD.effnorm_L0L_exclusive2_wBp_kde
RKst_MM_12_MU.bsIDX
RKst_MM_12_MU.eps_L0I_exclusive_wB0_kde
RKst_MM_12_MU.norm_L0I_exclusive_wB0_kde
RKst_MM_12_MU.effnorm_L0I_exclusive_wB0_kde
RKst_MM_12_MU.eps_L0I_exclusive_wBp_kde
RKst_MM_12_MU.norm_L0I_exclusive_wBp_kde
RKst_MM_12_MU.effnorm_L0I_exclusive_wBp_kde
RKst_MM_12_MU.eps_L0L_exclusive_wB0_kde
RKst_MM_12_MU.norm_L0L_exclusive_wB0_kde
RKst_MM_12_MU.effnorm_L0L_exclusive_wB0_kde
RKst_MM_12_MU.eps_L0L_exclusive_wBp_kde
RKst_MM_12_MU.norm_L0L_exclusive_wBp_kde
RKst_MM_12_MU.effnorm_L0L_exclusive_wBp_kde
RKst_MM_12_MU.eps_L0L_exclusive2_wB0_kde
RKst_MM_12_MU.norm_L0L_exclusive2_wB0_kde
RKst_MM_12_MU.effnorm_L0L_exclusive2_wB0_kde
RKst_MM_12_MU.eps_L0L_exclusive2_wBp_kde
RKst_MM_12_MU.norm_L0L_exclusive2_wBp_kde
RKst_MM_12_MU.effnorm_L0L_exclusive2_wBp_kde
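
The names in the list above follow the alias.branchname pattern. As a minimal sketch in plain C++ (no ROOT required), such names could be composed as below; friendColumn and friendColumns are hypothetical helpers introduced only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: name under which a friend's branch is visible
// from the base tree, following the "alias.branchname" pattern.
std::string friendColumn(const std::string &alias, const std::string &branch)
{
    assert(!alias.empty() && !branch.empty());
    return alias + "." + branch;
}

// Hypothetical helper: the same branch as seen through each friend alias.
std::vector<std::string> friendColumns(const std::vector<std::string> &aliases,
                                       const std::string &branch)
{
    std::vector<std::string> cols;
    cols.reserve(aliases.size());
    for (const auto &alias : aliases)
        cols.push_back(friendColumn(alias, branch));
    return cols;
}
```

For instance, friendColumn("RKst_MM_12_MU", "bsIDX") yields the RKst_MM_12_MU.bsIDX entry from the list.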

The only caveat is that I need one branch in common to all ntuples, in my case bsIDX, which I have to emulate in a base ROOT file using the rdfentry_ trick.

Thanks a lot for the suggestions. This solution is very elegant, and bookkeeping-wise it’s perfect.
