Hello ROOT team, I am trying to do the following in RDataFrame:
auto df_higgs = df_leps.Define("HCandidate", "MyVar > 0")
                       .Define("Score1", "Branch1[HCandidate]")
                       .Define("Score2", "Branch2[HCandidate]")
                       .Define("HScore", "Score1 / Score2")
                       .Define("HighestHScoreIdx", "ArgMax(HScore)")
                       .Define("HighestHScore", "HScore[HighestHScoreIdx]")
                       .Filter("HighestHScore > 0", "Higgs candidate exists");
It compiles and runs fine, but when I Display() the dataframe after these selections, I see:
+------+-------------+------------------+---------------+
| Row | HScore | HighestHScoreIdx | HighestHScore |
+------+-------------+------------------+---------------+
| 54 | 0.947388f | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 60 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 62 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 63 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 67 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 69 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 75 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 91 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 99 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 105 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 106 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 112 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 118 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 120 | | 0 | 0.947388f |
+------+-------------+------------------+---------------+
| 226 | 0.00148309f | 0 | 0.00148309f |
+------+-------------+------------------+---------------+
| 229 | | 0 | 0.00148309f |
+------+-------------+------------------+---------------+
| 237 | | 0 | 0.00148309f |
+------+-------------+------------------+---------------+
| 242 | | 0 | 0.00148309f |
+------+-------------+------------------+---------------+
| 244 | | 0 | 0.00148309f |
+------+-------------+------------------+---------------+
| 246 | | 0 | 0.00148309f |
+------+-------------+------------------+---------------+
| 251 | | 0 | 0.00148309f |
+------+-------------+------------------+---------------+
| 270 | 0.00133200f | 0 | 0.00133200f |
+------+-------------+------------------+---------------+
| 277 | | 0 | 0.00133200f |
+------+-------------+------------------+---------------+
It seems ArgMax returns 0 by default if the RVec has size zero. Why it does this makes sense after looking at RVec.cxx, but I don't know if this is ideal, because there is no way to tell whether the maximum is the element at index 0 or the vector has size 0.
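For what it's worth, here is a minimal standalone reproducer of that ambiguity (my own sketch, calling ROOT::VecOps::ArgMax directly instead of going through the dataframe):

#include <ROOT/RVec.hxx>
#include <iostream>

int main() {
   ROOT::RVec<float> empty{};
   ROOT::RVec<float> maxAtFront{0.9f, 0.1f};
   // Both lines print 0: an empty vector is indistinguishable from a
   // vector whose maximum happens to sit at index 0.
   std::cout << ROOT::VecOps::ArgMax(empty) << "\n";
   std::cout << ROOT::VecOps::ArgMax(maxAtFront) << "\n";
   return 0;
}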
Also, if HScore has size zero, HighestHScore is silently set to the maximum of the last non-empty vector (presumably because HScore[0] on an empty RVec is an out-of-bounds read that picks up stale memory from a previous event).
Using .Define("HighestHScore", "Max(HScore)")
leads to the same behavior.
Can you suggest a way of circumventing this issue? Perhaps a way to check that the vector size is non-zero, along the lines of the sketch below? Thank you.
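To make that concrete, this is the kind of thing I have in mind (an untested sketch; it only inserts a size check into the chain from above, before the ArgMax step):

auto df_higgs = df_leps.Define("HCandidate", "MyVar > 0")
                       .Define("Score1", "Branch1[HCandidate]")
                       .Define("Score2", "Branch2[HCandidate]")
                       .Define("HScore", "Score1 / Score2")
                       // Drop events with an empty HScore so that ArgMax
                       // only ever sees non-empty vectors.
                       .Filter("HScore.size() > 0", "Non-empty HScore")
                       .Define("HighestHScoreIdx", "ArgMax(HScore)")
                       .Define("HighestHScore", "HScore[HighestHScoreIdx]")
                       .Filter("HighestHScore > 0", "Higgs candidate exists");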
Update: I was able to apply a quick fix by defining
#include <ROOT/RVec.hxx>
#include <algorithm> // std::max_element
#include <iterator>  // std::distance

// Returns an empty vector when v is empty, otherwise a one-element
// vector holding the index of the maximum of v.
ROOT::RVec<int> MyArgMax(const ROOT::RVec<float> &v) {
   ROOT::RVec<int> idx{};
   if (!v.empty()) {
      idx.push_back(std::distance(v.begin(), std::max_element(v.begin(), v.end())));
   }
   return idx;
}
and picking element 0 of the returned vector, with something similar for MyMax. It works for now; a sketch of how I plug it into the chain follows.
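For reference, this is roughly how I use it (a sketch; the column names match the chain above, and MyArgMax is passed to Define as a C++ callable):

auto df_fixed = df_leps.Define("HCandidate", "MyVar > 0")
                       .Define("Score1", "Branch1[HCandidate]")
                       .Define("Score2", "Branch2[HCandidate]")
                       .Define("HScore", "Score1 / Score2")
                       // MyArgMax returns a vector of size 0 or 1, so an empty
                       // HScore is distinguishable from a maximum at index 0.
                       .Define("HighestHScoreIdxVec", MyArgMax, {"HScore"})
                       .Filter("HighestHScoreIdxVec.size() > 0", "Higgs candidate exists")
                       .Define("HighestHScore", "HScore[HighestHScoreIdxVec[0]]");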
ROOT Version: 6.30/04
Platform: linuxx8664gcc / installed through conda-forge