Memory issues for Dask Distributed RDataFrame with a lot of entries per file?

Dear @gpetruc ,

We have just introduced a few improvements to the distributed RDataFrame scheduling that also improve the memory usage of the distributed tasks. Could I ask you to retry your analysis on SWAN, selecting the “Bleeding Edge” software stack so that you get the latest ROOT master build? Let me know whether you see any improvement; otherwise we will continue debugging your specific case.

Thanks!
Vincenzo