Incremental branch fill

Nicola_Mori · May 1, 2024, 8:04am

I’m working on an MC simulation code that produces a ROOT tree with branches in a legacy format that I cannot break. Some of these branches hold TClonesArrays of particle data generated by the simulation, and these arrays are becoming too large to fit in memory and be filled once per event. So I thought about writing particle data to file incrementally, as the MC generates it during the event simulation, while still producing a single TClonesArray branch entry per event containing all the particles of that event.
Is this possible?

Danilo · May 1, 2024, 11:16am

Hi Nicola,

This is an interesting problem.
This is of course a naive observation, but let me write it nevertheless: the first step to try would be to either move to a machine with more memory or to optimise the memory usage of the whole process that leads to filling the tree.

If that’s not possible, one might think of “spreading” one entry over multiple entries and then “recompacting” the dataset, e.g. with RDataFrame (accumulate through Defines, filter out partial events with Filters, and Snapshot the full new dataset).
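As a rough illustration of the recompaction idea, here is a minimal sketch in plain C++ with no ROOT dependency. The names `Hit`, `PartialEntry`, `FullEvent` and `recompact` are hypothetical and not part of ROOT or of the legacy code. The sketch assumes that each partial entry carries the number of the event it belongs to and that the partial entries of one event are stored contiguously.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for one streamable hit (the real class is not shown in this thread).
struct Hit {
    int trackId;
    double energy;
};

// One partial tree entry: a slice of an event's hits, tagged with the event number.
struct PartialEntry {
    long eventId;
    std::vector<Hit> hits;
};

// One recompacted entry: all the hits of a single event.
struct FullEvent {
    long eventId;
    std::vector<Hit> hits;
};

// Merge consecutive partial entries that share an eventId into one full event.
// Assumption: the partial entries of an event are contiguous in the input.
std::vector<FullEvent> recompact(const std::vector<PartialEntry>& partials) {
    std::vector<FullEvent> events;
    std::set<long> seen;  // event numbers already started, used to check contiguity
    for (const PartialEntry& p : partials) {
        if (events.empty() || events.back().eventId != p.eventId) {
            // A new event starts here; its number must not have appeared before.
            const bool firstTime = seen.insert(p.eventId).second;
            assert(firstTime && "partial entries of an event must be contiguous");
            (void)firstTime;  // avoid an unused-variable warning when NDEBUG is set
            events.push_back(FullEvent{p.eventId, {}});
        }
        std::vector<Hit>& out = events.back().hits;
        out.insert(out.end(), p.hits.begin(), p.hits.end());
    }
    return events;
}
```

With real data, the input would be read from the tree of partial entries and each `FullEvent` would be written as one entry in the legacy format. The recompaction step still holds one complete event in memory, but only in a single representation.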

That solution would be a bit sophisticated, but if no platform with more memory is available and no optimisation is possible in the legacy code, the difficulty has to be moved somewhere else.

How big is the problem, i.e. how much memory do you have and how much would you need?

Cheers,
Danilo

Nicola_Mori · May 1, 2024, 3:01pm

Hi Danilo,

I’ll try to be more detailed. The application is a Geant4 simulation that records global energy releases and single-particle releases in a transient scoring structure based on G4VHit. Hits are then converted to a ROOT-streamable format and written to disk event by event. For events with many tracks (such as those from primary cosmic-ray particles of several PeV/n), at the end of the event there is a time interval in which each hit is present twice in memory (once as a G4VHit and once as a streamable hit). Memory occupation then reaches peaks of 18 GB and probably more (this is the amount of memory that makes the job crash in our current test setup). I think it’s not feasible to just scale up the hardware.

There are many other options, the most obvious being to eliminate the double representation, but this implies a deep rewrite of all the code, so for the moment it has been put aside. My idea was then to operate at tracking level: at the end of the simulation of each track (or bunch of tracks), the hits can be transferred to disk and the in-memory structures cleaned. In this way the overall memory occupancy for scoring can in principle be reduced as needed by tuning the size of the bunch of tracks, at the expense of more frequent disk I/O. But then the original problem arises, since in the end the output branch must contain one TClonesArray entry per event with all the track hits of that event. Loosely speaking, I’d need something like a TBranch::FillWithAppend method that works for branches holding a container and that, instead of creating a new entry, simply appends the content of the container buffer to the current entry.
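The per-bunch flushing described above could be sketched as follows, again in plain C++ without ROOT or Geant4. All names here (`Hit`, `Chunk`, `ChunkedHitWriter`, `tracksPerChunk`) are hypothetical. The sink callback stands in for whatever actually writes a chunk to disk, for example a fill of a tree of partial entries. Since no append-to-current-entry call is assumed to exist, each chunk is written as its own record, tagged with the event number and a flag marking the last chunk of the event.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for one streamable hit.
struct Hit {
    int trackId;
    double energy;
};

// One record written to disk: part of an event's hits plus bookkeeping.
struct Chunk {
    long eventId;
    bool lastOfEvent;  // true only for the final chunk of an event
    std::vector<Hit> hits;
};

// Buffers hits track by track and hands them to a sink every tracksPerChunk tracks,
// so that at most one bunch of tracks is held in the buffer at a time.
class ChunkedHitWriter {
public:
    using Sink = std::function<void(const Chunk&)>;

    ChunkedHitWriter(std::size_t tracksPerChunk, Sink sink)
        : tracksPerChunk_(tracksPerChunk), sink_(std::move(sink)) {
        assert(tracksPerChunk_ > 0);
    }

    void beginEvent(long eventId) {
        eventId_ = eventId;
        buffer_.clear();
        tracksBuffered_ = 0;
    }

    // To be called once a track has been fully simulated.
    void endTrack(const std::vector<Hit>& trackHits) {
        buffer_.insert(buffer_.end(), trackHits.begin(), trackHits.end());
        if (++tracksBuffered_ == tracksPerChunk_) flush(false);
    }

    // Always emits a final (possibly empty) chunk so that readers can detect the event end.
    void endEvent() { flush(true); }

private:
    void flush(bool last) {
        sink_(Chunk{eventId_, last, buffer_});
        buffer_.clear();  // release the in-memory hits of this bunch
        tracksBuffered_ = 0;
    }

    std::size_t tracksPerChunk_;
    Sink sink_;
    long eventId_ = 0;
    std::size_t tracksBuffered_ = 0;
    std::vector<Hit> buffer_;
};
```

A larger `tracksPerChunk` means fewer writes but a larger buffer, which is the memory versus I/O trade-off mentioned above. The resulting output has several records per event, so a separate step is still needed to obtain one entry per event.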

From your answer I guess that nothing similar currently exists, so the only way forward seems to be to modify the output format and write a readout compatibility layer for it. Sorry for the long post and thanks for your advice!
