Hi ROOT team,
I am working on the implementation of thread parallelism in the Marlin event processing framework, which is used by most linear collider studies. In this context, one classic issue which we’ve recently been looking into is the multi-threaded histogramming problem.
Because we have very enthusiastic DQM users who enjoy having plenty of histograms in flight, including large 2D ones, and given that RAM footprint is the main reason why we’re implementing thread-based parallelization, we would prefer to share histogram data across threads, without paying too high of a thread synchronization price for it.
We really like the buffering approach that ROOT 7’s experimental RHistConcurrentFill*
uses in this respect, and more generally the fact that RHist actually looks like something that was designed for thread safety when compared to the THx ROOT 6 classes. But obviously, we cannot ask physicists to use experimental ROOT 7 histogramming interfaces which may change at any moment, and still lack very important features such as I/O or fitting.
Which is how we ended up going for the following plan instead:
- Use RHist internally for Marlin’s multi-threaded histogramming needs.
- Only expose a minimal Fill interface to users, for which we can easily maintain compatibility wrappers even if the underlying experimental ROOT 7 interface ends up changing.
- Export ROOT-6 compatible THx histograms at the end, which physicists can process using all the tools that they are used to.
And thus, I ended up working on a ROOT 7 -> ROOT 6 histogram converter…
You can find the current state of this work, together with a bunch of semi-organized notes, on the following GitHub repository: https://github.com/HadrienG2/hists-mt .
This part of histogram conversion works:
- Convert RHist<1, T, STATS…> with an axis type of RAxisEquidistant or RAxisIrregular to TH1, for any ROOT6-compatible combination of T and STATS…
- Configure the output TH1 as close as possible to RHist’s behavior for a smooth transition.
- Detect incompatible T and STATS… and report them using a clean compiler error (no unchecked C++ template error madness).
- Detect and cleanly report various run-time usage errors (e.g. null RHist).
- …and there is an extensive test suite asserting that most of the above (except for compiler errors) works and keeps working in the future. It will soon be extended with compiler error testing in order to achieve complete coverage.
Unfortunately, other things do not work yet:
- Multi-dimensional histograms (due to a global binning convention incompatibility between RHist and THn which I’m not sure how to address yet)
- RAxisGrow (the implementation in ROOT 7 does not seem to be in a good enough shape yet for me to support it, for example the grow method is not implemented yet and there are other logic errors in the implementation)
- RAxisLabels (again, the ROOT 7 implementation does not seem ready yet, given that merely passing an RAxisLabels configuration to the RHist constructor will lead to a crash)
As you can see, I am now at a point where I need help from the ROOT team to go further. There are three things which we need to discuss:
- I found a bunch of clear-cut bugs in ROOT (and in particular ROOT 7 histogramming). These are probably best discussed on JIRA, so I think I’ll open tickets directly instead of bringing them up on this thread
- EDIT: Done as of November 5, see JIRA issues ROOT-10400 to ROOT-10409.
- I found a smaller amount of things that look suspicious to me, but which I’m not sure if you would consider to be bugs or not. I’m not sure if these should be discussed here or on JIRA.
- Given how limited the “official” RHist API is, a histogram converter must dig deep into RHist implementation details, so I expect plenty of breakage across future ROOT updates. I see two complementary ways we could resolve this:
- The public RHist API could be extened to support the histogram conversion use case.
- Given that backwards compatibility will be vital to ROOT 7’s success, another possible route would be to get such histogram conversions integrated into ROOT proper.
How shall we proceed on these last two fronts?