I am trying to use distributed RDataFrame on SWAN. I create a dataframe, filter it and save the columns into a new ROOT file. I understand that when Snapshot is used in the distributed case, the number of output files equals the number of partitions.
Is there a way I can obtain one single snapshot (combining all partitions)?
I think the only way is to post-process the outputs yourself, e.g. merging them with the hadd command-line tool that comes with ROOT. @vpadulan can correct me if I'm wrong (likely when the working week restarts).
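For illustration, here is a minimal sketch of that post-processing step, driving hadd from Python. The file names (`snapshot_*.root`, `merged.root`) and the helper names are hypothetical; the actual names of the per-partition files depend on how Snapshot was called, so adjust the pattern to match what you see on disk. It also assumes hadd is on your PATH, i.e. a ROOT environment is set up.

```python
import glob
import os
import shutil
import subprocess


def build_hadd_command(target, pattern):
    """Build the hadd argument list that merges every file matching
    `pattern` into `target`. Both names are placeholders chosen for
    this example, not names produced by RDataFrame itself."""
    target_abs = os.path.abspath(target)
    # Skip the target in case the pattern also matches a previous merge result.
    sources = sorted(
        f for f in glob.glob(pattern) if os.path.abspath(f) != target_abs
    )
    if not sources:
        raise FileNotFoundError(f"no partial snapshots match {pattern!r}")
    # hadd usage: hadd [-f] target source1 source2 ...
    # -f overwrites the target file if it already exists.
    return ["hadd", "-f", target] + sources


def merge_snapshots(target, pattern):
    """Run hadd on the partial snapshots; raises if hadd fails."""
    if shutil.which("hadd") is None:
        raise RuntimeError("hadd not found on PATH; set up ROOT first")
    subprocess.run(build_hadd_command(target, pattern), check=True)


# Example call (hypothetical file names):
# merge_snapshots("merged.root", "snapshot_*.root")
```

The same thing can be done directly in a terminal with `hadd -f merged.root snapshot_*.root`; the Python wrapper is only convenient if you want the merge to happen in the same notebook as the analysis.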