Combining ROOT files with weighting

Chris_G · March 20, 2025, 3:07pm

I have two ROOT files, file1.root and file2.root. I need to combine them into file3.root, which I will then use to continue my analysis. The problem is that the two files need to be weighted appropriately when they are merged.

For context:

file1.root contains Z(longitudinal)Z(transverse) events
file2.root contains Z(longitudinal)Z(longitudinal) events
I need file3.root to give me Z(longitudinal)Z(unpolarised) events.

The tree structures are the same for both ROOT files. I do not know where to start, as I believe I cannot use hadd.

ROOT Version: 6.30/02
Platform: Mac

bellenot · March 20, 2025, 3:15pm

I’m not sure, but maybe @pcanal can give some hints…

vpadulan · March 21, 2025, 8:47am

Dear @Chris_G ,

Thanks for reaching out to the forum! I believe a mix of TTree friends and RDataFrame can get you where you want. You can befriend the two files like so

TFile f1{"f1.root"};
auto *t1 = f1.Get<TTree>("tree");
TFile f2{"f2.root"};
auto *t2 = f2.Get<TTree>("tree");
t1->AddFriend(t2);

Then you can give the main tree to an RDataFrame object like so

ROOT::RDataFrame df{*t1};

You need to apply the weighting with a function that either replaces the values of an existing column (via Redefine) or creates a new column with the reweighted values (via Define).

Once you have all the columns you need, you can call the Snapshot operation to store the result in an output tree:

df.Snapshot("output_tree", "output_file.root");

Note that, by default, the syntax above saves every branch to the output tree: all the branches of t1, all the branches of t2, and all the new columns you created with your RDataFrame via Define. If you want to select which columns get stored in the output tree, pass the list of column names as the third argument:

df.Snapshot("output_tree", "output_file.root", list_of_column_names);

These are some indications, let me know if you get stuck along the way.

Cheers,
Vincenzo

Chris_G · March 21, 2025, 11:20am

Hello,

Many thanks for your response. The problem is that I want to combine the files such that file3 contains not only the weights but also all the other information contained in them. This is because my analysis needs the Px, Py, Pz, E and Event.Weight information.

Following your advice, I have this so far:

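#include <TFile.h>
#include <TTree.h>
#include <TTreeReader.h>       // needed for TTreeReader
#include <TTreeReaderArray.h>  // needed for TTreeReaderArray
#include <ROOT/RDataFrame.hxx>
#include <iostream>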

void combine_root_files() {
    TFile f1("delphes_ZLZT.root");
    TFile f2("delphes_ZLZL.root");

    // Create output file
    TFile outFile("file3.root", "RECREATE");
    TTree outTree("Delphes", "Merged Delphes Tree");

    // Create TTreeReaders
    TTreeReader reader1("Delphes", &f1);
    TTreeReader reader2("Delphes", &f2);

    // Read weight arrays
    TTreeReaderArray<float> weight1(reader1, "Event.Weight");
    TTreeReaderArray<float> weight2(reader2, "Event.Weight");

    // Define the output weight variable
    float mergedWeight;
    outTree.Branch("Merged.Weight", &mergedWeight, "Merged.Weight/F");

    // Loop over events
    int nEntries = reader1.GetEntries();
    for (int i = 0; i < nEntries; i++) {
        reader1.Next();
        reader2.Next();

        // Take the first weight entry in each event
        mergedWeight = 0.75 * weight1[0] + 0.25 * weight2[0];

        outTree.Fill();  // Fill new tree
    }

    // Write the merged tree to file
    outFile.Write();
    std::cout << "Merged ROOT file created: file3.root\n";
}

However, this does not give me the correct averaged cross-section.

leonhard · March 21, 2025, 1:04pm

Hi, I think you have a conceptual issue here: you are attempting to merge the weights of two different events, and I cannot really imagine a scenario where that makes much sense.
Normally you would modify the weight of each event separately and then use the modified events in your analysis. In principle, if the factor you are multiplying by is a constant for each file, you could also produce the histograms for each file separately and then add them together (each scaled by its factor).

That is, in your example, just add a new branch to file1 containing the weights multiplied by 0.75 and one to file2 containing the weights multiplied by 0.25 (if those are the correct numbers), then use those branches as the weights when performing your analysis.
