I am currently discovering the RDataFrame tool, and I was wondering whether event mixing is possible. I couldn’t find anything close to it in the tutorials.
Let’s say I have a tree with a J/psi and its muons in each entry. How can I get something like the difference between the pseudorapidity of the J/psi in entry 1 and the pseudorapidity of the J/psi in entry 2 (or any other uncorrelated entry)?
Maybe it is better to use plain ROOT for this kind of thing, but if anybody knows something, let me know!
Cheers,
Samuel
Hi @nmangane, I think I understand why it might be of interest, but it is not clear to me how to use it. Did you ever try it? Is it an extension of RDataFrame or a totally new, independent class?
Sam
Hi @Samuel1 ,
sorry for the high latency, I was off last week.
The trickiest part about processing entries with a sliding window is multi-threading: each thread will process a bunch of entries at a time, and the first and last entries in each bunch will not have a previous/next entry within that bunch, so you’ll miss some statistics.
If you are happy with single-thread processing, or you don’t mind losing some of the pairs of consecutive entries, you can use a stateful functor + RDataFrame, something like (haven’t tested, it’s just to give you an idea):
// not thread-safe, but can easily be made thread-safe by using a vector of
// previousPseudoRapidity values, one per thread (df.GetNSlots() returns the
// required number of threads/processing slots).
struct PseudoRapidityDiff {
   double previousPseudoRapidity = 1e20; // sentinel: no previous entry seen yet
   double operator()(double pseudoRapidity) {
      const double previous = previousPseudoRapidity;
      previousPseudoRapidity = pseudoRapidity; // always store the current value
      if (previous > 1e19) // no previous value
         return -999;
      return pseudoRapidity - previous;
   }
};
// only safe in single-thread processing
df.Define("pseudoRapDiff", PseudoRapidityDiff{}, {"pseudoRapidity"});
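For the thread-safe variant mentioned in the comment above, here is a minimal sketch (also untested with ROOT itself). The name PseudoRapidityDiffPerSlot is hypothetical, and the argument order, slot number first and column value second, is what I assume DefineSlot passes to the callable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sentinel meaning "this slot has not seen an entry yet".
constexpr double kNoPrevious = 1e20;

// Hypothetical per-slot variant of PseudoRapidityDiff: one "previous" value
// per processing slot, so concurrent threads never write the same element.
struct PseudoRapidityDiffPerSlot {
   std::vector<double> previous;

   explicit PseudoRapidityDiffPerSlot(std::size_t nSlots) : previous(nSlots, kNoPrevious) {}

   // Assumed call order: slot index first, then the column value.
   double operator()(unsigned int slot, double pseudoRapidity) {
      assert(slot < previous.size());
      const double last = previous[slot];
      previous[slot] = pseudoRapidity; // always remember the current value
      return last > 1e19 ? -999. : pseudoRapidity - last;
   }
};
```

It would be booked with something like df.DefineSlot("pseudoRapDiff", PseudoRapidityDiffPerSlot(df.GetNSlots()), {"pseudoRapidity"}). Note that the "previous" entry seen by a slot is then not necessarily entry N-1 of the dataset, which may be fine for mixing uncorrelated events but not for a true sliding window.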
I hope this helps!
Enrico
P.S.
I am not familiar with TimeFrame personally, but my understanding is that it is an independent class that reuses some RDF concepts but natively works with sliding windows. @Axel will know more.
Thanks a lot for taking the time to reply. OK, I understand. My point was to use multi-threading, as I am processing a lot of data. I am actually pairing the pseudorapidity of my J/psi with all the hadrons of the next event (ideally a selected independent event), so it takes quite long! I’ll try anyway, thanks!
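For reference, the pairing described above could be written with the same stateful-functor idea. This is only a sketch under assumptions: it mixes with the previously processed entry rather than the next one (equivalent for uncorrelated mixing), the name MixWithPreviousEvent is hypothetical, and plain std::vector stands in for whatever collection type the hadron column actually has. Like the functor above, it is only safe in single-thread processing.

```cpp
#include <vector>

// Hypothetical mixing functor: pairs the J/psi of the current entry with the
// hadrons stored from the previously processed entry.
struct MixWithPreviousEvent {
   std::vector<double> previousHadronEtas; // empty until one entry has been seen

   std::vector<double> operator()(double jpsiEta, const std::vector<double> &hadronEtas) {
      std::vector<double> deltaEtas;
      deltaEtas.reserve(previousHadronEtas.size());
      for (double eta : previousHadronEtas)
         deltaEtas.push_back(jpsiEta - eta); // J/psi (this entry) vs hadron (previous entry)
      previousHadronEtas = hadronEtas; // keep this entry's hadrons for the next call
      return deltaEtas;
   }
};
```

The first entry yields an empty result because there is nothing stored to mix with yet.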
Samuel
I completely understand. You then need to decide what to do at the boundaries where RDataFrame splits the dataset for multi-thread processing; I don’t think there is a clear answer in general.
Is it possible to get the entry number of the last entry of one bunch of data? Then I could add something like “if last, take first”. That would fix it, I think.
Yes, but only in ROOT master and the upcoming release v6.26: you need DefinePerSample. The expression you pass to DefinePerSample takes an RSampleInfo object as input, which can tell you the entry range that is going to be processed.
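Once the entry range of a bunch is known, the “if last, take first” rule itself is a one-liner. The sketch below is plain C++ and does not call ROOT; MixingPartner is a hypothetical helper, and it assumes the range is half-open, [begin, end), which should be checked against what RSampleInfo actually reports.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: index of the mixing partner of `entry` inside the
// half-open range [begin, end), wrapping from the last entry to the first.
std::uint64_t MixingPartner(std::uint64_t entry, std::uint64_t begin, std::uint64_t end) {
   assert(begin <= entry && entry < end);
   return (entry + 1 == end) ? begin : entry + 1;
}
```

This only computes which entry to pair with; reading that entry's values is a separate problem. A bunch with a single entry would be paired with itself, so that case needs its own handling.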