Hi ROOT-ers,
I am writing to you because I am currently working on a way to change
how we use ROOT histograms for data-quality monitoring in ATLAS
reconstruction jobs, so that we no longer pay the memory penalty we currently
pay (as you are probably aware, we are fighting a battle against our present
~2GB memory usage). I would like to hear whether anyone else has experience with
this sort of thing (or would perhaps be interested in our solution).
The problem we face is that we are booking and filling a rather large number
of histograms in each reco-job (~5000), totalling more than 100MB of
mem-usage (meaning that people often turn off the monitoring just so the
grid machinery won't kill their jobs). However, once written out and
compressed in a .root file, the usage is much less, typically just a few
megabytes (~5MB). This reflects what one can also see by looking at the
histograms themselves: often most bins in a given histogram are unused, or
they are used in a way which could be described by just a few bits rather
than 4 or 8 bytes (i.e. after calling ::Fill(x) on a given bin fewer than 256
times, one could describe that bin with just an unsigned char rather than e.g. the
full float/double/int in TH1F/TH1D/TH1I). We have thus written a simple
histogram replacement taking some of these ideas into account, spending a bit
more CPU in exchange for a much tighter memory footprint in our jobs
(hopefully freeing up 80-90% of the memory used by histograms).
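
A minimal C++ sketch of the narrow-bin idea, for illustration only. The class name NarrowHist1D and all of its methods are hypothetical and are not the actual replacement classes. It assumes unit-weight fills only, stores one byte per bin, and widens all bins to float the first time a bin would exceed 255:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical light-weight 1D histogram (illustrative names only).
// Bins start as 8-bit counters and are widened to float on demand.
class NarrowHist1D {
public:
    NarrowHist1D(std::size_t nbins, double xmin, double xmax)
        : m_nbins(nbins), m_xmin(xmin), m_xmax(xmax),
          m_small(nbins + 2, 0) {}  // +2 for underflow and overflow bins

    // Unit-weight fill; weighted fills are not covered by this sketch.
    void Fill(double x) {
        const std::size_t i = FindBin(x);
        if (m_wide.empty()) {
            if (m_small[i] < 255) {  // still fits in one byte
                ++m_small[i];
                return;
            }
            Promote();  // the next increment would overflow 8 bits
        }
        m_wide[i] += 1.0f;
    }

    // Bin 0 is underflow, bins 1..nbins are regular, nbins+1 is overflow.
    double GetBinContent(std::size_t i) const {
        return m_wide.empty() ? static_cast<double>(m_small[i])
                              : static_cast<double>(m_wide[i]);
    }

    // Bytes currently used for bin contents (excluding vector overhead).
    std::size_t BinBytes() const {
        return m_wide.empty() ? m_small.size() * sizeof(std::uint8_t)
                              : m_wide.size() * sizeof(float);
    }

private:
    std::size_t FindBin(double x) const {
        if (!(x >= m_xmin)) return 0;         // underflow (NaN also lands here)
        if (x >= m_xmax) return m_nbins + 1;  // overflow
        const double frac = (x - m_xmin) / (m_xmax - m_xmin);
        const std::size_t i = 1 + static_cast<std::size_t>(frac * m_nbins);
        return std::min(i, m_nbins);          // guard against rounding at the edge
    }

    void Promote() {
        m_wide.assign(m_small.begin(), m_small.end());  // copy counts as floats
        std::vector<std::uint8_t>().swap(m_small);      // release the byte storage
    }

    std::size_t m_nbins;
    double m_xmin, m_xmax;
    std::vector<std::uint8_t> m_small;  // used until any bin reaches 256
    std::vector<float> m_wide;          // empty until promotion
};
```

With 100 bins this sketch uses 102 bytes for bin contents until some bin is filled for the 256th time, and 102 * sizeof(float) bytes afterwards. A real implementation would presumably also skip storage for empty bins and promote per bin or per block rather than the whole histogram at once.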
However, since we have a very large number of histogram-using monitoring
packages and a large number of scripts/programs set up to process these ROOT
histograms (often using more advanced features of ROOT histograms such as
fitting or adding names to bins, etc.), we cannot just migrate to something
else. So the solution we are pursuing now is to provide these light-weight
(memory-wise) histogram classes with class names and method names similar to those of
ROOT (e.g. TH1F_LW vs. TH1F), and use those for histogram filling during our
job. Only at the post-processing stage are they converted into their ROOT
equivalents and written out.
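
The conversion step could look roughly like the C++ sketch below. LightHist1D, CopyInto and FakeTH1F are hypothetical names made up for this example. CopyInto only relies on SetBinContent(bin, content) and SetEntries(n), which TH1F provides, so with ROOT available one would pass a TH1F booked with the same binning. FakeTH1F is merely a stand-in so that the sketch compiles without ROOT:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical light-weight histogram as filled during the job.
struct LightHist1D {
    std::string name;
    std::string title;
    int nbins;
    double xmin;
    double xmax;
    std::vector<std::uint8_t> bins;  // nbins + 2 values, incl. under/overflow
    double entries;                  // number of Fill calls
};

// Copy the contents into a ROOT-style histogram booked with the same binning.
// RootHist is assumed to offer SetBinContent(int, double) and
// SetEntries(double), as TH1F does.
template <class RootHist>
void CopyInto(const LightHist1D& lw, RootHist& h) {
    for (int i = 0; i <= lw.nbins + 1; ++i) {
        h.SetBinContent(i, static_cast<double>(lw.bins[i]));
    }
    // The entry count is not derived from the bin contents, so set it explicitly.
    h.SetEntries(lw.entries);
}

// Stand-in for TH1F, used only so this sketch is self-contained.
struct FakeTH1F {
    FakeTH1F(const char*, const char*, int nbins, double, double)
        : content(static_cast<std::size_t>(nbins) + 2, 0.0), entries(0.0) {}
    void SetBinContent(int bin, double c) { content[bin] = c; }
    void SetEntries(double n) { entries = n; }
    std::vector<double> content;
    double entries;
};
```

In a ROOT build the caller would book TH1F h(lw.name.c_str(), lw.title.c_str(), lw.nbins, lw.xmin, lw.xmax), call CopyInto(lw, h), and write h out as usual. Bin labels, weights and per-bin errors are left out of this sketch.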
Sorry for the somewhat lengthy and slightly convoluted explanations.
Looking forward to hearing your comments.
Cheers,
Thomas
PS. I looked into THnSparse, but it does not seem appropriate for our
situation: first, its interface is not similar to the usual TH1F, etc. (the
migration would be too difficult), and second, the optimisation used
internally doesn’t scale well to the situation where most bins are actually
starting to be filled (e.g. in a 12-hour monitoring job). A similar comment
goes for TH1C/TH2C, since users can’t know the maximal bin content in
advance (also, TH1C/TH2C allocate storage even for empty bins).