Index out of range in RVec and RDF

ROOT Version: 6.26/10
Platform: MacOS 13.3
Compiler: Not Provided


Dear ROOT experts,

When I select an element of an RVec column with vec[i] and save the result as a new column in a TTree, I find that the value can be wrong if the index is out of range of the RVec.

For example, I use code like this:

import ROOT

d = ROOT.RDataFrame("tree_fake", "foo.root")
d = d.Define("elec_signal1", "elec_signal[1]")
d.Snapshot("tree_fake", "validate.root", ("elec_signal", "elec_signal1"))

to select the second element of the RVec. I then run tree_fake->Scan() on the output file, and the scan output is:

root [2] tree_fake->Scan()
***********************************************
*    Row   * Instance * elec_sign * elec_sign *
***********************************************
......
*       55 *        0 *         1 *         1 *
*       55 *        1 *         1 *         1 *
*       55 *        2 *         1 *         1 *
*       56 *        0 *         1 *         1 *
*       56 *        1 *         1 *         1 *
*       57 *        0 *         0 *         1 *
*       58 *        0 *         1 *         1 *
*       58 *        1 *         1 *         1 *
*       58 *        2 *         1 *         1 *
*       59 *        0 *            *         0 *
*       60 *        0 *         0 *         1 *
......

“elec_signal” is on the left and “elec_signal1” is on the right. Most of the values in elec_signal1 are correct. However, in rows 57 and 60 the RVec has only one element, so the second element doesn’t exist, i.e. index 1 is out of range. Nevertheless, “elec_signal1” is 1 in both rows. The ROOT file used for testing is attached.

I would expect an out-of-range index to return NaN or 0, or to raise an exception. Is there a bug in RVec or RDF? Do you have any suggestions for avoiding the problem?

Best regards,
Zhuolin

validate.root (6.7 KB)

I’m sure @eguiraud or @vpadulan can help

Hi @zzl0024 ,

that is not how RVec’s [] operator works. It behaves like std::vector’s operator[]: accessing a non-existent element is undefined behavior (you will effectively be reading the bytes at a memory address beyond the end of the allocated buffer).

To throw an exception if the element does not exist, you can use elec_signal.at(1) (at a small performance cost); to default to 0 or NaN, you can use elec_signal.at(1, /*fallback*/0).

Alternatively, you can process only the events that have at least two elements in elec_signal with

d.Filter("elec_signal.size() > 1").Define("elsig1", "elec_signal[1]")
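To make the difference between the access styles concrete, here is a minimal sketch in plain C++ that uses std::vector as a stand-in for RVec, since the [] semantics are the same. The helpers at_or and at_throws are hypothetical names made up for this illustration and are not part of ROOT or the standard library; at_or only mimics what a fallback-style lookup does.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not a ROOT or std API): bounds-checked lookup that
// returns `fallback` instead of reading past the end of the buffer.
template <typename T>
T at_or(const std::vector<T>& v, std::size_t i, T fallback) {
    return i < v.size() ? v[i] : fallback;
}

// Hypothetical helper: reports whether v.at(i) throws std::out_of_range.
// Unlike v[i], at() checks the index, so an out-of-range access is an
// exception rather than undefined behavior.
template <typename T>
bool at_throws(const std::vector<T>& v, std::size_t i) {
    try {
        (void)v.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

For a one-element vector, at_or(v, 1, 0) yields the fallback 0 and at_throws(v, 1) is true, whereas v[1] would be undefined behavior. In the RDF expression strings, the corresponding spellings are the ones given above: elec_signal.at(1) and elec_signal.at(1, 0).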

Cheers,
Enrico
