Hi @beojan,
basically RDF is warning you of a possible performance degradation. You can ignore it if you are not worried about the last drop of performance.
Otherwise you can change the way the files are written (probably not straightforward) or, as you ask, convert the branch to something split. One way to do it is to simply Snapshot that branch to a separate TTree and add that tree as a friend of the main tree. If this is not satisfactory, it would be great if you could share a (very small) TTree with this kind of events so we can come back with an ad-hoc solution.
The performance hit seems to be quite large, so it seems to be a good idea to fix it.
I’m trying to make a small sample tree with rooteventselector but it fails because it has no dictionary for the Delphes classes. Is there a way to fix this?
With TFile::MakeProject you can always reconstruct the data model and its dictionaries starting from the information that was written in the TFile.
I would also be interested in the behaviour you reported for RDataFrame (and ultimately TTreeReaderArray). Would it be possible to:
Elaborate on the performance penalty you mention: slow with respect to what? Do you have a profile that points to a particular time-consuming symbol?
Share the file and the RDataFrame code that lead to the warning?
I based my performance estimate on comparing this code to a similar (but more complicated) piece of code I have that runs on a different ntuple, which seems a lot faster. However, I think I may have been misled, because I have
    auto start_events_proxy = frame.Count();
    start_events_proxy.OnPartialResult(
        10000, [](const unsigned long long &num_events) {
            fmt::print("Processed {} events\n", num_events);
        });
    all_cuts.Snapshot(...);
    start_events_proxy.GetValue(); // For printing progress
to print progress, but that doesn’t seem to work correctly here: it only prints “Processed 10000 events”, whereas the input ntuple should have many times that number of events.
I can comment on OnPartialResult printing a lower count than expected: in multi-thread runs, OnPartialResult is not called by every thread concurrently, but only by one thread “at a time”, so you might be missing the counts of other threads.
You can use OnPartialResultSlot to call your lambda in every thread.
If you have suggestions on how to specify this more clearly in the docs, I’m all ears.
For the resolution of the original issue, I think you are in better hands with @Danilo.
I think even that wouldn’t work, if each thread keeps its own count, and the separate counts are added at the end. I would just get Processed 10000 events printed 16 times, then Processed 20000 events 16 times, and so on.
@Danilo I used MakeProject, then used rooteventselector and it worked (surprisingly, since I wasn’t sure how to load the shared library into rooteventselector's ROOT instance). The sample file is attached: sample.root (82.8 KB)
I think even that wouldn’t work, if each thread keeps its own count, and the separate counts are added at the end. I would just get Processed 10000 events printed 16 times, then Processed 20000 events 16 times, and so on.
Yes, of course: your current lambda only makes sense in a single-threaded context.
With multiple threads you can have them add to an atomic counter variable (at the cost of some synchronization) – or whatever you want, but by construction each thread will have its own partial result.
Anything else would be less performant, I think.
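To make the atomic-counter idea concrete, here is a minimal sketch that does not use ROOT at all: plain std::thread workers stand in for the RDataFrame slots, and the function name countWithSharedProgress and its parameters are made up for illustration. Each worker adds to one shared std::atomic counter, and whichever thread happens to cross a multiple of the reporting interval prints the global count, so every milestone is printed once rather than once per thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <thread>
#include <vector>

// Hypothetical stand-in for an event loop split across worker threads.
// Every worker processes `eventsPerThread` events and increments one shared
// atomic counter. The thread that crosses a multiple of `reportEvery`
// prints the global count. Returns how many progress lines were printed.
std::uint64_t countWithSharedProgress(unsigned nThreads,
                                      std::uint64_t eventsPerThread,
                                      std::uint64_t reportEvery)
{
    assert(nThreads > 0 && reportEvery > 0);
    std::atomic<std::uint64_t> processed{0}; // global event count
    std::atomic<std::uint64_t> reports{0};   // number of lines printed

    auto worker = [&]() {
        for (std::uint64_t i = 0; i < eventsPerThread; ++i) {
            // fetch_add returns the previous value, so each event sees a
            // unique `done` and each milestone is reported exactly once.
            const std::uint64_t done =
                processed.fetch_add(1, std::memory_order_relaxed) + 1;
            if (done % reportEvery == 0) {
                std::printf("Processed %llu events\n",
                            static_cast<unsigned long long>(done));
                reports.fetch_add(1, std::memory_order_relaxed);
            }
        }
    };

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nThreads; ++t)
        pool.emplace_back(worker);
    for (auto &th : pool)
        th.join();
    return reports.load();
}
```

Calling countWithSharedProgress(16, 25000, 10000) would print 40 progress lines in total, not necessarily in increasing order since threads race to print. In an actual RDataFrame job the increment would live in whatever per-slot callback you register, and the synchronization cost mentioned above is the contended fetch_add on every event; incrementing in larger batches would reduce it.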
@beojan that warning is only present in builds with debug symbols (in practice, it is inside an #ifndef NDEBUG clause), so it does not appear in an optimized build such as the one you are getting from LCG.