Dear Experts,
I have a question about how RDataFrame accesses a tree in a list of files.
In practice, I have a set of ROOT files totalling 120 TB. Each file contains 200 TTrees, and all files are accessed via xrootd.
I was wondering how RDataFrame's performance is affected, when it is asked to process just one TTree from the chain, by the large size of the files that contain it.
That is, is the event loop of RDataFrame("treename", listoffiles) penalised (network traffic, overhead of opening big files even though only 1 tree out of 200 is actually read) by the fact that the files are large while the tree itself takes up little disk space? Should one instead store each tree in its own TFile, or is RDataFrame smart enough to do the equivalent under the hood?
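For concreteness, here is a minimal sketch of the kind of setup I mean. The redirector host, directory, and file naming scheme (`file_0.root`, `file_1.root`, ...) are hypothetical placeholders, and the two helper functions are names I made up for illustration; only the `RDataFrame(treename, listoffiles)` call reflects what I actually do.

```python
def build_file_list(redirector, directory, n_files):
    """Build xrootd URLs for hypothetical files file_0.root, file_1.root, ...

    `directory` is assumed to be an absolute path starting with '/',
    which gives the usual root://host//path form.
    """
    return [f"root://{redirector}/{directory}/file_{i}.root" for i in range(n_files)]


def count_entries(tree_name, files):
    """Run a trivial event loop over one tree of the chain and return its entry count."""
    import ROOT  # imported here so build_file_list can be used without ROOT installed

    # Only `tree_name` is requested, although each file holds many other trees.
    df = ROOT.RDataFrame(tree_name, files)
    return df.Count().GetValue()


# Example usage (hypothetical host and path):
# files = build_file_list("myredirector.example.org", "/store/user/renato", 3)
# print(count_entries("treename", files))
```

The question is whether the cost of a loop like `count_entries` depends on the total size of each file or only on the size of the one tree being read.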
Thanks in advance
Renato
I hope I made the question clear.
ROOT Version: Not Provided
Platform: Not Provided
Compiler: Not Provided