I have two ROOT files, file1.root and file2.root. I need to combine them into file3.root, which I will then use to continue my analysis. The problem is that I need to merge the ROOT files by weighting them accordingly.
For context:
- file1.root contains Z(longitudinal)Z(transverse) events
- file2.root contains Z(longitudinal)Z(longitudinal) events
- I need file3.root to give me Z(longitudinal)Z(unpolarised) events.
The tree structures are the same for both ROOT files. I do not know where to start, as I believe I cannot use hadd.
ROOT Version: 6.30/02
Platform: Mac
I’m not sure, but maybe @pcanal can give some hints…
Dear @Chris_G ,
Thanks for reaching out to the forum! I believe a mix of TTree friends and RDataFrame can get you where you want. You can befriend the two files like so
TFile f1{"f1.root"};
auto *t1 = f1.Get<TTree>("tree");
TFile f2{"f2.root"};
auto *t2 = f2.Get<TTree>("tree");
t1->AddFriend(t2);
Then you can give the main tree to an RDataFrame object like so
ROOT::RDataFrame df{*t1};
You need to apply the weighting with a function that either replaces the values of an existing column (this is called Redefine) or creates a new column with the reweighted values (via Define).
Once you have all the columns you need, you can call the Snapshot operation to store the result in an output tree:
df.Snapshot("output_tree", "output_file.root");
Note that by default the syntax above saves all the branches to the output tree, i.e. all the branches of t1, all the branches of t2, and all the new columns you created with your RDataFrame via Define. If you want to select which columns get stored in the output tree, you can specify the list by adding it as the third argument:
df.Snapshot("output_tree", "output_file.root", list_of_column_names);
These are some pointers; let me know if you get stuck along the way.
Cheers,
Vincenzo
Hello,
Many thanks for your response. The problem is that I want to combine the files such that file3 contains not only the weights, but also all the other information contained in them. This is because in my analysis I need to use the Px, Py, Pz, E and Event.Weight information.
Following your advice, I have this so far:
#include <TFile.h>
#include <TTree.h>
#include <TTreeReader.h>
#include <TTreeReaderArray.h>
#include <ROOT/RDataFrame.hxx>
#include <iostream>
void combine_root_files() {
    TFile f1("delphes_ZLZT.root");
    TFile f2("delphes_ZLZL.root");
    // Create output file
    TFile outFile("file3.root", "RECREATE");
    TTree outTree("Delphes", "Merged Delphes Tree");
    // Create TTreeReaders
    TTreeReader reader1("Delphes", &f1);
    TTreeReader reader2("Delphes", &f2);
    // Read weight arrays
    TTreeReaderArray<float> weight1(reader1, "Event.Weight");
    TTreeReaderArray<float> weight2(reader2, "Event.Weight");
    // Define the output weight variable
    float mergedWeight;
    outTree.Branch("Merged.Weight", &mergedWeight, "Merged.Weight/F");
    // Loop over events (uses the entry count of the first file only)
    Long64_t nEntries = reader1.GetEntries();
    for (Long64_t i = 0; i < nEntries; i++) {
        reader1.Next();
        reader2.Next();
        // Take the first weight entry in each event
        mergedWeight = 0.75 * weight1[0] + 0.25 * weight2[0];
        outTree.Fill(); // Fill new tree
    }
    // Write the merged tree to file
    outFile.Write();
    std::cout << "Merged ROOT file created: file3.root\n";
}
However, this does not give me the correct averaged cross-section.
Hi, I think you have a conceptual issue here. You are attempting to merge the weights of two different events, but I cannot really imagine a scenario where that makes much sense.
Normally you would just modify the weight of each event separately and then take the modified events into account in your analysis. In principle, if the factor you are multiplying with is a constant for each file you could also produce the histograms for each file separately and then add them together (scaled by the factor).
That is, in your example, just add a new branch to file1 containing the weights multiplied by 0.75 and one to file2 containing the weights multiplied by 0.25 (if those are the correct numbers), and then use those as the weights when performing your analysis.
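To make the two options above concrete, here is a minimal sketch in plain C++ (no ROOT, so it can be compiled anywhere). The Event struct, the fixed four-bin histogram and the function names are all hypothetical stand-ins for the real trees and histograms; the factors 0.75 and 0.25 are simply the ones from the thread and may not be the right ones for your samples. It only illustrates that rescaling each event's weight and keeping all events from both samples gives the same histogram as filling one histogram per sample and adding them with the scale factors.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy event (hypothetical layout): one observable plus a per-event weight.
struct Event {
    double pz;      // observable to histogram
    double weight;  // stands in for Event.Weight
};

constexpr std::size_t kBins = 4;
using Histogram = std::array<double, kBins>;

// Fill a fixed-range histogram over [lo, hi), adding scale * weight per event.
Histogram fill(const std::vector<Event>& events, double scale,
               double lo = -100.0, double hi = 100.0) {
    assert(hi > lo);
    Histogram h{};
    const double width = (hi - lo) / kBins;
    for (const Event& e : events) {
        if (e.pz < lo || e.pz >= hi) continue;  // ignore out-of-range events
        auto bin = static_cast<std::size_t>((e.pz - lo) / width);
        if (bin >= kBins) bin = kBins - 1;      // guard against rounding at the edge
        h[bin] += scale * e.weight;
    }
    return h;
}

// Option 1: rescale every event's weight, keep ALL events of both samples,
// and histogram them together. No event is paired with another event.
Histogram combine_events(const std::vector<Event>& a, double fa,
                         const std::vector<Event>& b, double fb) {
    std::vector<Event> all;
    all.reserve(a.size() + b.size());
    for (const Event& e : a) all.push_back({e.pz, fa * e.weight});
    for (const Event& e : b) all.push_back({e.pz, fb * e.weight});
    return fill(all, 1.0);
}

// Option 2: histogram each sample separately, then add the scaled histograms.
Histogram combine_histograms(const std::vector<Event>& a, double fa,
                             const std::vector<Event>& b, double fb) {
    const Histogram ha = fill(a, 1.0);
    const Histogram hb = fill(b, 1.0);
    Histogram sum{};
    for (std::size_t i = 0; i < kBins; ++i) sum[i] = fa * ha[i] + fb * hb[i];
    return sum;
}
```

Note that the two samples do not need the same number of events here, and each event keeps its own kinematics, which is what gets lost when entry i of one file is merged with entry i of the other.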