Is RDataFrame::Range slow for large entry IDs?


ROOT Version: 6.32.06
Platform: Debian GNU/Linux trixie
Compiler: GCC 14.2.0


Hello,

I found a performance issue when doing something like

// eventEntry is assumed to hold nEvent + 1 boundaries, so i must stay below nEvent
for (int i{}; i < nEvent; ++i) {
    df.Range(eventEntry[i], eventEntry[i + 1]).Foreach(...);
}
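To make the intended iteration explicit, here is a minimal standard-library sketch with the RDataFrame call replaced by a callback. The names `forEachEvent` and `process` are hypothetical and not part of ROOT, and the sketch assumes that `eventEntry` holds nEvent + 1 ascending boundaries, so that event i covers the half-open entry range [eventEntry[i], eventEntry[i + 1]).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: calls process(begin, end) once per event.
// Assumes eventEntry holds nEvent + 1 ascending boundaries, so event i
// covers the half-open entry range [eventEntry[i], eventEntry[i + 1]).
inline std::size_t forEachEvent(
    const std::vector<unsigned long long>& eventEntry,
    const std::function<void(unsigned long long, unsigned long long)>& process)
{
    // An empty boundary list means there are no events at all.
    const std::size_t nEvent = eventEntry.empty() ? 0 : eventEntry.size() - 1;
    for (std::size_t i = 0; i < nEvent; ++i) {
        // Sanity check: boundaries must not decrease.
        assert(eventEntry[i] <= eventEntry[i + 1]);
        process(eventEntry[i], eventEntry[i + 1]);
    }
    return nEvent; // number of events visited
}
```

In my case the callback would wrap the df.Range(begin, end).Foreach(...) call, so every event starts a separate event loop.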

The data of an event consists of a contiguous set of entries, so I process them as shown above. However, the further into the dataset the range starts, the longer it takes to process. For example,

df.Range(0, 10000).Foreach(...)

is much faster than

df.Range(9990000, 10000000).Foreach(...)

even though both ranges contain the same number of entries.
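One way to quantify the difference is to wrap each call in a small timing helper. The sketch below uses only the standard library; `secondsTaken` is a hypothetical name chosen for illustration, not a ROOT function.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs work() once and returns the elapsed
// wall-clock time in seconds, measured with a monotonic clock.
template <typename F>
double secondsTaken(F&& work)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Each of the two calls above can then be passed in as a lambda, for instance secondsTaken([&] { df.Range(0, 10000).Foreach(...); }), and the two returned values compared.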

Is this performance degradation expected behavior, or am I doing something wrong here?

Let's see if @vpadulan can help here.