I am currently having problems saving an RDataFrame after calling Range() on it. Most variables are empty (at least everything that is a std::vector).
What I do is:
// create dataframe (held as an RNode so that the reassignments below compile in C++)
ROOT::RDF::RNode df = ROOT::RDataFrame(tree_name, "some_rootfile.root");
// define variables needed for cuts and/or in the output
df=df.Define("leptmp", "return Construct<ROOT::Math::PtEtaPhiEVector>(lep_pt/1000., lep_eta, lep_phi, lep_E/1000.);"); // new RDataFrame to save the variable
df=df.Define("lep_theta", "return ROOT::VecOps::Map(leptmp,[](ROOT::Math::PtEtaPhiEVector x){return x.Theta();})");
// Filter events
df = df.Filter("All(lep_pt > 27000. && lep_z0*sin(lep_theta) < 0.5)");
// get a second dataframe with just 50000 entries
auto df_example = df.Range(0, 50000);
// save both dataframes
df_example.Snapshot(tree_name, "some_example_rootfile.root");
df.Snapshot(tree_name, "some_output_rootfile.root");
For the dataframe with all entries everything works fine: all variables are stored in the output file. For the example file, the variables are there, but all the std::vector ones are empty. What am I doing wrong, or what am I missing here?
As this is rather old code, I am using ROOT version 6.26/04.
Thanks for the interesting post.
This is not expected. Could you try again with a recent 6.30 release? If that also fails, could you share the input file with us so that we can reproduce the problem?
I have now tried it with ROOT version 6.30/02, but it still does not work. I have sent you a direct message, since I was not able to upload the input file here (it is probably too large). Here is the link to my CERNBox: CERNBox
I have tried your example, but I rewrote it a bit so that it can be compiled and reproduced easily as a standalone program:
#include <ROOT/RDataFrame.hxx>
#include <ROOT/RVec.hxx>
#include <Math/Vector4D.h>
#include <TInterpreter.h>
int main() {
    gInterpreter->GenerateDictionary("ROOT::RVec<ROOT::Math::PtEtaPhiEVector>", "Math/Vector4D.h;ROOT/RVec.hxx");
    // create dataframe
    ROOT::RDataFrame df("mini", "mc_410000.ttbar_lep.root");
    // define variables needed for cuts and/or in the output
    auto df1 = df.Define("leptmp", "ROOT::VecOps::Construct<ROOT::Math::PtEtaPhiEVector>(0.001*lep_pt, lep_eta, lep_phi, 0.001*lep_E);"); // new RDataFrame to save the variable
    auto df2 = df1.Define("lep_theta", "return ROOT::VecOps::Map(leptmp,[](ROOT::Math::PtEtaPhiEVector x){return x.Theta();})");
    auto df_norange = df2.Filter("All(lep_pt > 27000. && lep_z0*sin(lep_theta) < 0.5)");
    auto df_range = df_norange.Range(0, 50);
    // save both dataframes
    auto snapshot_df_ranges = df_range.Snapshot("outputTree", "output_ranges.root");
    auto snapshot_df_no_ranges = df_norange.Snapshot("outputTree", "output_no_ranges.root");
    return 0;
}
The problem has nothing to do with the call to Range(). The main issue in your example is the missing dictionary. With my debug version of ROOT, the error I see is:
Error in <TTree::Branch>: The class requested (ROOT::VecOps::RVec<ROOT::Math::LorentzVector<ROOT::Math::PtEtaPhiE4D<double> > >) for the branch "leptmp" is an instance of an stl collection and does not have a compiled CollectionProxy. Please generate the dictionary for this collection (ROOT::VecOps::RVec<ROOT::Math::LorentzVector<ROOT::Math::PtEtaPhiE4D<double> > >) to avoid to write corrupted data.
This did not work either. I still have empty variables in the output where I use Range(). (In the other output file everything is still complete, as before.)
Does it matter where the files generated by GenerateDictionary are stored? I run my script from outside the directory where the executable is located, so the files generated by GenerateDictionary end up in the directory from which I run the script.
Is there a chance for me to run a debugging version of ROOT to check if it gives some error messages I otherwise do not get?
I tested your solution with both ROOT 6.26/04 and ROOT 6.30/02.