I want to open a TTree created by Delphes with RDataFrame, remove events based on some cuts, and then write the remaining events to a new TTree that has exactly the same format as the original one. The first part works now thanks to this thread, but I now run into trouble when saving the result.
When I process delphestestsample.root with the program below, the output file doesn’t have the same structure. In particular, I’d expect Jet to have sub-branches, see the attached picture. The input and output files are also attached.
I’ve also tried to simply write all columns using df.GetColumnNames(), but then I run into the type issues I had in the linked thread.
How does one write a TTree with a complex structure, such as the ones created by Delphes, to disk with RDataFrame?
Hi @jndrf,
thanks for the thorough report with a file that we can use for debugging. If I understand correctly, the problem is that in the input file Jet has a series of sub-branches, but in the output file it somehow became just a leaf (similarly for other branches).
I will take a look as soon as possible, probably over the course of next week.
Cheers,
Enrico
Hello,
the problem is that TClonesArrays are not well supported. I will try to improve the situation (certainly we must not silently write wrong data) but, in the meantime, it seems that telling Snapshot that these branches are in fact TClonesArrays (rather than the catch-all RVec) solves the problem:
Hi again,
if I understood the problem correctly, you have 3 options (1 and 2 are currently broken in different ways, but fixed by the PR I link below):
1. call Snapshot<RVec<Jet>>(..., {"Jet"}): this tells Snapshot to read Jet as an RVec rather than a TClonesArray. At this point, we cannot easily write it out as a TClonesArray, but instead we would write a std::vector<Jet>. You will require dictionaries for std::vector<Jet>.
2. call Snapshot(..., {"Jet"}): this should just work and write out a TClonesArray with the fix linked below.
3. call Snapshot<TClonesArray>(..., {"Jet"}): this should already work, and it will keep working.
Does this sound reasonable? If yes, is there any chance you can try whether this PR fixes your issue?
Thank you for the reply. I am using the last method from your list for now, and it works.
At a quick glance, the contents of the file look sensible (at least the cut is applied correctly), but I haven’t made a thorough check.
I don’t have a self-compiled version of ROOT at hand right now, but I think I can get around to trying the PR this week.
Hi Jonas,
just so you know, the next release v6.22 and the next v6.20/06 patch release will contain the PR that I linked above, which makes Snapshot(..., {"Jet"}) work out of the box and Snapshot<RVec<Jet>>(..., {"Jet"}) either work (if dictionaries are present) or error out noisily with a (hopefully) helpful error message. Note that Snapshot<RVec<Jet>>(..., {"Jet"}) will write out a vector<Jet>, while the other two methods will write out TClonesArrays.