Dear ROOT experts,
I have a question about caching Filter results and the Defines that follow them.
I have written code that stores in a map<Myslice, vector> the outcome of a Take operation on some branches, where each slice has a prior cut applied.
Sometimes a slice uses exactly the same cut as other slices, so I was tempted to make a map<UInt_t, RNode> in which I store CutString.Hash() as the key and the outcome of dataframe.Filter(CutString) as the value, so that I don’t have to book the same filter and the same Define expression several times.
My question boils down to:
Can the result of a Filter be treated as an RNode?
I looked at the documentation but I am not sure.
Thanks for the feedback.
Yes, any dataframe object is convertible to a ROOT::RDF::RNode (it’s easy to verify from the ROOT prompt).
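For illustration, the caching idea from the question could be sketched as below. This is only a minimal sketch using the standard library: NodeCache, GetOrCreate and the Node template parameter are hypothetical names, not ROOT API. In a real job Node would presumably be ROOT::RDF::RNode and the factory would return the result of dataframe.Filter(CutString) converted to an RNode. The sketch keys the map on the full cut string rather than on its hash, so two different cuts cannot end up sharing one entry through a hash collision.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper (not part of ROOT): keeps one node per distinct cut string.
template <typename Node>
class NodeCache {
public:
    // Return the cached node for `cut`; make() is called only on the first request.
    template <typename Factory>
    Node& GetOrCreate(const std::string& cut, Factory&& make) {
        auto it = nodes_.find(cut);
        if (it == nodes_.end()) {
            // emplace (instead of operator[]) does not require Node
            // to be default-constructible.
            it = nodes_.emplace(cut, std::forward<Factory>(make)()).first;
        }
        return it->second;
    }

    // Number of distinct cuts booked so far.
    std::size_t Size() const { return nodes_.size(); }

private:
    std::map<std::string, Node> nodes_;
};
```

Every slice then asks the cache for its cut and books its Defines and Take on the returned node, so an identical cut is booked only once.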
Thanks a lot,
I think it will help a lot in my current workflow.
I am using a huge ntuple and booking about 5000 operations (nodes) on it. The virtual memory usage is O(180 GB). With caching I can reduce the number of node definitions by an order of magnitude, which I suppose will help: my jobs do get killed by the system because of the large memory consumption. I will give it a try and report back. Thanks.
Making that change and recycling the previously filtered dataframe helped a lot (roughly a factor 10 speedup and much less memory pressure), thanks a lot.
I was wondering whether automatic filter caching is foreseen for the future.
I mean hashing the cut string expression and, if it already exists in a given processing graph at the same stage, just reusing it and booking the other expressions on top of it. Thanks a lot for the help anyway; indeed the ROOT shell was enough to find it out.
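To make the "same stage" part concrete, here is a rough sketch of what I have in mind. It is not ROOT code: StageCache, StageKey and the integer ids are hypothetical, and Node stands in for whatever node type the graph uses. A filter is identified by its parent stage together with its cut string, so the same cut applied after a different parent is treated as a different node.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical key: the parent stage id plus the cut string.
using StageKey = std::pair<std::size_t, std::string>;

template <typename Node>
class StageCache {
public:
    // Look up (parentId, cut); make() builds the node only on a miss.
    // Returns the id of the entry, usable as parentId for chained filters.
    // Id 0 is reserved for the unfiltered root of the graph.
    template <typename Factory>
    std::size_t GetOrCreate(std::size_t parentId, const std::string& cut,
                            Factory&& make) {
        const StageKey key{parentId, cut};
        auto it = ids_.find(key);
        if (it == ids_.end()) {
            nodes_.emplace_back(std::forward<Factory>(make)());
            it = ids_.emplace(key, nodes_.size()).first;  // ids start at 1
        }
        return it->second;
    }

    // Access a stored node by id (valid ids are 1..number of stored nodes).
    Node& At(std::size_t id) { return nodes_.at(id - 1); }

private:
    std::map<StageKey, std::size_t> ids_;
    std::vector<Node> nodes_;
};
```

With this layout, booking "pt > 10" twice on the root gives the same id back, while booking it again on top of an already filtered node creates a new entry.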