Hi,
I am using TMVA::Experimental::RReader to apply an MVA prediction with RDataFrame. The model was trained with k-fold cross validation, with the dataset split by a deterministic SplitExpr. However, the MVA responses I get look wrong: they are all very close to -1.
The code snippet I am using is:
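#include <ROOT/RDataFrame.hxx>
#include <TMVA/RReader.hxx>
#include <TMVA/RInferenceUtils.hxx> // TMVA::Experimental::Compute
#include <algorithm>
#include <string>
#include <vector>
using namespace TMVA::Experimental;
ROOT::EnableImplicitMT(48);
// Load the trained BDTG model from its weight file
RReader model("/path/to/TMVAdataset/weights/TMVAClassification_BDTG.weights.xml");
// Input columns: training variables plus spectators
auto training_variables = model.GetVariableNames();
auto spectator_variables = model.GetSpectatorNames();
std::vector<std::string> variables(training_variables.size() + spectator_variables.size());
std::merge(training_variables.begin(), training_variables.end(), spectator_variables.begin(), spectator_variables.end(), variables.begin());
ROOT::RDataFrame rdf("DecayTree", "/path/to/input.root");
// Evaluate the model for each event and write the result to a new file
auto rdf2 = rdf.Define("BDTG_response", Compute<21, float>(model), variables);
rdf2.Snapshot("DecayTree", "/path/to/output.root");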
The responses in the output file look like this:
+-----+-----------------+
| Row | BDTG_B_response |
+-----+-----------------+
| 0 | -0.999969 |
+-----+-----------------+
| 1 | -0.999946 |
+-----+-----------------+
| 2 | -0.999880 |
+-----+-----------------+
| 3 | -0.999982 |
+-----+-----------------+
| 4 | -0.999988 |
+-----+-----------------+
| 5 | -0.999983 |
+-----+-----------------+
| 6 | -0.999998 |
+-----+-----------------+
| 7 | -0.999995 |
+-----+-----------------+
| 8 | -0.999997 |
+-----+-----------------+
| 9 | -0.999997 |
+-----+-----------------+
So does TMVA::Experimental::RReader currently support k-fold CV with SplitExpr, and if so, what is the right way to use it?
I am using ROOT version 6.32.00.
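For reference, here is a minimal sketch of what I mean by a deterministic split. It assumes a split expression of the form "int(fabs([eventNumber]))%int([NumFolds])"; the spectator name eventNumber and the helper FoldIndex are placeholders for illustration, not my actual configuration and not part of the TMVA API.

```cpp
#include <cmath>

// Hypothetical helper (not a TMVA function): mirrors a split expression
// of the assumed form "int(fabs([eventNumber]))%int([NumFolds])".
// eventNumber is the value of the spectator used for splitting,
// numFolds is the k of the k-fold cross validation.
inline int FoldIndex(double eventNumber, int numFolds)
{
   // Truncate the absolute value to an integer, then take it modulo k.
   return static_cast<int>(std::fabs(eventNumber)) % numFolds;
}
```

As far as I understand, with such a split the fold of an event depends only on its spectator value, so at application time each event should be evaluated by the model of the fold in which it was held out. This is why I pass the spectators to the reader together with the training variables.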