To look at only 10% of my data, I used the following filter:
Rdf = Rdf.Filter("rdfentry_ % 10 == 0")
When I check the number of events before and after this filter is applied, I see that 10% of the events are kept. However, when I create plots using this Rdf (with no further filters applied), different plots have different numbers of events, and none of them match the event count after the rdfentry_ filter was applied. I was under the impression that rdfentry_ is a column in the Rdf which contains the entry number. Am I misinterpreting this?
If rdfentry_ cannot be used in this manner, what is the best way to only keep 10% of the events in the Chain?
I think you want to use df.Range(nEntriesConsider), which keeps only the first nEntriesConsider entries. I’m not sure why rdfentry_ doesn’t work in this case, but Range should. Maybe the counts differ when you fill the histograms because of over/underflow?
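To make the over/underflow idea concrete, here is a minimal pure-Python sketch (deliberately not ROOT code, so it runs anywhere). The function fill_histogram and the toy values are made up for illustration. It shows how a fixed-range histogram can have fewer entries in its visible bins than the number of events that passed the filter. In ROOT the analogous check would be comparing GetEntries(), which counts every fill, with Integral(), which by default excludes the under/overflow bins.

```python
# Pure-Python sketch (not ROOT code): a fixed-range histogram keeps
# out-of-range values in underflow/overflow counters, not in the visible bins.

def fill_histogram(values, n_bins, low, high):
    """Return (bin_counts, underflow, overflow) for uniform bins over [low, high)."""
    counts = [0] * n_bins
    underflow = 0
    overflow = 0
    width = (high - low) / n_bins
    for v in values:
        if v < low:
            underflow += 1
        elif v >= high:
            overflow += 1
        else:
            # Clamp the index to guard against floating-point edge cases.
            idx = min(int((v - low) / width), n_bins - 1)
            counts[idx] += 1
    return counts, underflow, overflow

# Hypothetical toy data: 10 "events" that survived the filter.
events = [-5.0, 0.5, 1.5, 2.5, 3.5, 4.5, 7.0, 9.9, 12.0, 30.0]

counts, under, over = fill_histogram(events, n_bins=5, low=0.0, high=10.0)
in_range = sum(counts)            # what the visible bins add up to
total = in_range + under + over   # what was actually filled
print("visible bins:", in_range, "underflow:", under, "overflow:", over, "total:", total)
```

If two plots use different axis ranges, their visible-bin sums can differ from each other and from the filtered event count, even though every plot was filled with the same events.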
Also you might want to consider upgrading your ROOT version, since RDataFrame has gone through big improvements since 6.18. This works for me with current master for example:
No, you are not misinterpreting it: rdfentry_ does contain the entry number.
Something else is happening here, but we’d need a reproducer to figure out what exactly (unless a more recent ROOT version just works; you can try one in a Docker container, on lxplus, or in a conda environment, for example; see the “Installing ROOT” page on the ROOT website).