Converting non-split branch to split branch

Using RDataFrame to process Delphes output, I get many warnings of the form:

Warning in <TColumnValue::Get>: Branch Jet hangs from a non-split branch. A copy is being performed in order to properly read the content.

Is there an easy way to convert these non-split branches to split branches?

Hi @beojan,
basically RDF is warning you of a possible performance degradation. You can ignore it if you are not worried about the last drop of performance.
Otherwise you can change the way the files are written (probably not straightforward) or, as you ask, convert the branch to something split. One way to do it is to simply Snapshot that branch to a separate TTree and add that tree as a friend of the main tree. If this is not satisfactory, it would be great if you could share a (very small) TTree with this kind of events so we can come back with an ad-hoc solution.

Cheers,
Enrico

The performance hit seems to be quite large, so it seems to be a good idea to fix it.
I’m trying to make a small sample tree with rooteventselector but it fails because it has no dictionary for the Delphes classes. Is there a way to fix this?

Hi,

you can always use TFile::MakeProject to reconstruct the data model and its dictionaries, starting from the information which was written in the TFile.

I would also be interested in the behaviour you reported for RDataFrame (and ultimately TTreeReaderArray): would it be possible to

  1. Elaborate on the performance penalty you mention: slow with respect to what? Do you have a profile which points to a particular time-consuming symbol?
  2. Share the file and the RDataFrame code which lead to the warning?

Cheers,
D

I based my performance estimate on comparing this code to a similar (but more complicated) piece of code I have that runs on a different ntuple, which seems a lot faster. However, I think I may have been misled, because I have

auto start_events_proxy = frame.Count();
start_events_proxy.OnPartialResult(
      10000, [](const unsigned long long &num_events) {
        fmt::print("Processed {} events\n", num_events);
      });

all_cuts.Snapshot(...);
start_events_proxy.GetValue(); // For printing progress

To print progress, but that doesn’t seem to be working correctly here, since it only prints “Processed 10000 events” but the input ntuple should have many times that many events.

I’ll try using MakeProject to split the tree.

I can comment on OnPartialResult printing a lower count than expected: in multi-thread runs, OnPartialResult is not called by every thread concurrently, but only by one thread “at a time”, so you might be missing the counts of other threads.
You can use OnPartialResultSlot to call your lambda in every thread.

If you have suggestions on how to specify this more clearly in the docs I’m all ears.

For the resolution of the original issue I think you are in better hands with @Danilo.

Cheers,
Enrico

I think even that wouldn’t work, if each thread keeps its own count, and the separate counts are added at the end. I would just get Processed 10000 events printed 16 times, then Processed 20000 events 16 times, and so on.

@Danilo I used MakeProject, then used rooteventselector and it worked (surprisingly, since I wasn’t sure how to load the shared library into rooteventselector's ROOT instance). The sample file is attached: sample.root (82.8 KB)

I think even that wouldn’t work, if each thread keeps its own count, and the separate counts are added at the end. I would just get Processed 10000 events printed 16 times, then Processed 20000 events 16 times, and so on.

Yes of course your current lambda only makes sense in a single-threaded context.
With multiple threads you can have them add to an atomic counter variable (at the cost of some synchronization) – or whatever you want, but by construction each thread will have its own partial result.
Anything else would be less performant, I think.
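
A minimal sketch of the atomic-counter idea, written with plain std::thread workers instead of RDataFrame so that it runs without ROOT. ProgressCounter, Add and RunWorkers are hypothetical names used only for illustration; in an RDataFrame job the call to Add would presumably go inside the callback registered with OnPartialResultSlot.

```cpp
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

// Hypothetical helper: a counter shared by all worker threads.
struct ProgressCounter {
   std::atomic<unsigned long long> processed{0}; // global event count
   std::atomic<unsigned int> reports{0};         // number of progress lines printed
   unsigned long long every;                     // print once per `every` events

   explicit ProgressCounter(unsigned long long every_) : every(every_) {}

   // Called by a worker after it has handled `chunk` more events.
   void Add(unsigned long long chunk)
   {
      // fetch_add returns the value before the addition, atomically.
      const unsigned long long before = processed.fetch_add(chunk);
      const unsigned long long after = before + chunk;
      // Only the call that crosses a multiple of `every` prints.
      if (after / every != before / every) {
         ++reports;
         std::printf("Processed %llu events\n", after / every * every);
      }
   }
};

// Stand-in for the multi-threaded event loop: each thread "processes"
// eventsPerThread events (assumed to be a multiple of chunk) and reports
// to the shared counter after every chunk.
unsigned long long RunWorkers(ProgressCounter &counter, unsigned int nThreads,
                              unsigned long long eventsPerThread,
                              unsigned long long chunk)
{
   std::vector<std::thread> workers;
   for (unsigned int t = 0; t < nThreads; ++t) {
      workers.emplace_back([&counter, eventsPerThread, chunk] {
         for (unsigned long long done = 0; done < eventsPerThread; done += chunk)
            counter.Add(chunk);
      });
   }
   for (auto &w : workers)
      w.join();
   return counter.processed.load();
}
```

Calling RunWorkers(counter, 4, 25000, 1000) on a ProgressCounter built with every = 10000 prints one line per 10000 events in total, rather than one line per thread. The lines can come out slightly out of order, because the printing happens after the atomic update and is not itself synchronized.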

Cheers,
Enrico

Hi,

thanks a lot!
Still I am not sure about:

  1. what was slower: I understand you did not really compare apples to apples. Is this an accurate statement?
  2. What branch are you actually reading which causes the error to be printed?

Cheers,
D

Indeed, it wasn’t an apples-to-apples comparison.

The branch that produces the error is Jet.

So this is weird: if I use LCG 94 built with GCC 8, the warning disappears altogether.

@beojan that warning is only present in builds with debug symbols (in practice, it is inside a #ifndef NDEBUG clause), so it does not appear in an optimized build such as the one you are getting from LCG.
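
A minimal illustration of that kind of guard, as a sketch only: this is not ROOT's actual code, and WarnIfNonSplit and kWarningsCompiledIn are hypothetical names. The warning is compiled in only when NDEBUG is not defined, which is why it disappears in optimized builds compiled with -DNDEBUG.

```cpp
#include <cstdio>

// True when this translation unit was compiled without NDEBUG.
#ifndef NDEBUG
constexpr bool kWarningsCompiledIn = true;
#else
constexpr bool kWarningsCompiledIn = false;
#endif

// Hypothetical function: prints the warning in debug builds only.
// Returns whether the warning code was compiled in.
bool WarnIfNonSplit(const char *branchName)
{
#ifndef NDEBUG
   std::fprintf(stderr,
                "Warning: Branch %s hangs from a non-split branch.\n",
                branchName);
   return true;
#else
   (void)branchName; // unused in optimized builds
   return false;
#endif
}
```

Compiling the same file once without and once with -DNDEBUG shows the message appearing and disappearing with no change to the source.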
