About to take my first plunge into multi-threaded coding, and wondering
if my plan is even feasible. I already have a C++ program (which writes but
does not read a root file) whose dominant CPU-time is in TH1F::Fit of a large
number of distinct but similar histograms; the histograms are completely
independent but are all being fit to the same function. Naively seems like
an ideal case for parallelizing on a multi-core machine.
However, it’s not worth doing if TH1::Fit is not thread-safe, so that
is why I’m asking. Also, from what I’ve skimmed quickly about PROOF,
it doesn’t seem like PROOF is necessary; some code using pthreads
directly, or TThread, should be enough to set up parallel fits.
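Since the histograms are independent, the work can be handed out by a simple worker pool. The following is a minimal sketch in plain C++11, using std::thread in place of TThread or pthreads and assuming a thread-safe fit routine is available. `Dataset`, `LineFit`, `fitLine` and `fitAll` are hypothetical names, and the closed-form straight-line fit is only a stand-in for the real fit; none of this is ROOT code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for one histogram: bin centres and bin contents.
struct Dataset {
    std::vector<double> x;
    std::vector<double> y;
};

// Hypothetical stand-in for a fit result: y = intercept + slope * x.
struct LineFit {
    double intercept = 0.0;
    double slope = 0.0;
};

// Closed-form unweighted least squares. It touches only its argument and
// local variables, so concurrent calls on different datasets cannot interfere.
LineFit fitLine(const Dataset& d) {
    assert(d.x.size() == d.y.size() && d.x.size() >= 2);
    const double n = static_cast<double>(d.x.size());
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < d.x.size(); ++i) {
        sx += d.x[i];
        sy += d.y[i];
        sxx += d.x[i] * d.x[i];
        sxy += d.x[i] * d.y[i];
    }
    LineFit r;
    r.slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    r.intercept = (sy - r.slope * sx) / n;
    return r;
}

// Fit every dataset with nThreads workers. Each worker claims the next
// unprocessed index from a shared atomic counter and writes only to its own
// slot of the result vector.
std::vector<LineFit> fitAll(const std::vector<Dataset>& data, unsigned nThreads) {
    std::vector<LineFit> results(data.size());
    std::atomic<std::size_t> next{0};
    auto worker = [&]() {
        for (std::size_t i = next.fetch_add(1); i < data.size(); i = next.fetch_add(1)) {
            results[i] = fitLine(data[i]);
        }
    };
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nThreads; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
    return results;
}
```

Calling `fitAll(data, 2)` keeps at most two fits active at a time. The pattern only helps if the fit routine itself shares no hidden state between calls.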
TH1::Fit is not currently thread-safe because, after fitting, it uses a static TVirtualFitter object to store the fit result for backward compatibility. I will probably remove the use of this static instance in one of the next releases, since it is not really needed anymore.
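To illustrate why a single static result object is a problem for threads, here is a small sketch in plain C++11. It is not ROOT code: `LegacyMean`, `computeMean` and `meansInParallel` are hypothetical names, and the mean calculation merely stands in for a fit. The first interface publishes its result through a static member shared by all callers; the second returns the result by value and can therefore be called from several threads at once.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical legacy-style interface: the latest result is published through
// one static member shared by every caller. Two threads calling compute() at
// the same time would race on lastMean, so it is only safe from one thread.
struct LegacyMean {
    static double lastMean;
    static void compute(const std::vector<double>& v) {
        assert(!v.empty());
        double sum = 0.0;
        for (double x : v) sum += x;
        lastMean = sum / static_cast<double>(v.size());
    }
};
double LegacyMean::lastMean = 0.0;

// Re-entrant variant: same calculation, but the result is returned by value,
// so nothing is shared between concurrent calls.
double computeMean(const std::vector<double>& v) {
    assert(!v.empty());
    double sum = 0.0;
    for (double x : v) sum += x;
    return sum / static_cast<double>(v.size());
}

// Run the re-entrant variant with one thread per input. Each thread writes
// only to its own element of the output vector.
std::vector<double> meansInParallel(const std::vector<std::vector<double>>& inputs) {
    std::vector<double> out(inputs.size());
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        threads.emplace_back([&inputs, &out, i]() { out[i] = computeMean(inputs[i]); });
    }
    for (auto& t : threads) t.join();
    return out;
}
```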
However, you can fit the histograms directly using the Fitter class instead of calling TH1::Fit. An example of doing this is the function testHisto1DFit() in root.cern.ch/viewvc/trunk/math/m … iew=markup
This interface should be thread-safe, and you should be able to fit different histograms in different threads. If you have any problems or further questions, please let us know.
So before I got your message I used TThread and TH1::Fit and it kind-of-worked
for a little while for 2 threads. Eventually I got “corrupted unsorted chunks” and
all sorts of complaints about it trying to find Minuit2 and failing.
I will try using the Fitter class. Looks like it’s maybe not quite the same
arguments so I will have to stare at it for a while. I assume it still uses
Minuit under the hood? Similar speed?
One question about TThread. I had two problems. First of all,
when I check the status of the thread, it is either 2 (Running) or
6 (Canceled). Never “Finished”. And frequently I had trouble doing
the TThread::Delete on it (crash), but not every time, just once in
a while. The frequency of it crashing seemed to be correlated with
how long I slept between checks. That is, I sleep for 10msec, check
if the status is 6 yet and if so do the Delete. Perhaps this is related
to the non-threadsafeness of TH1::Fit in the first place, I don’t know.
Any guidance on this would be appreciated.
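For comparison, here is a minimal sketch of waiting for worker completion without polling a status code and sleeping. It assumes a C++11 compiler and swaps std::async and std::future in for TThread, so it says nothing about how TThread itself behaves. `slowSquare`, `runAll` and `isDone` are hypothetical names, and the sleep inside `slowSquare` only simulates a long-running fit.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-in for a long-running fit job.
int slowSquare(int n) {
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    return n * n;
}

// Launch one asynchronous task per input and collect the results in order.
// future::get() blocks until the task has finished, so there is no status
// polling, no sleep interval to tune, and no manual thread deletion.
std::vector<int> runAll(const std::vector<int>& inputs) {
    std::vector<std::future<int>> pending;
    for (int n : inputs) {
        pending.push_back(std::async(std::launch::async, slowSquare, n));
    }
    std::vector<int> results;
    for (auto& f : pending) results.push_back(f.get());
    return results;
}

// If polling is still wanted (for example to limit the number of active
// jobs), wait_for() reports whether the task finished within the timeout.
bool isDone(std::future<int>& f, std::chrono::milliseconds timeout) {
    assert(f.valid());
    return f.wait_for(timeout) == std::future_status::ready;
}
```

With this pattern the result is only read after the task has completed, which removes any dependence on how long the caller sleeps between checks.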
The Fitter class is used by TH1::Fit, so you can use the same minimizers (Minuit, Minuit2, etc…)
For your second question, I don’t know; maybe it is related to the non-thread-safeness of TH1::Fit.
I’m a bit confused about what FillData does with the argument TF1*,
since the TF1 is also sent in via the WrappedMultiTF1, and the
parameter initialization is set again (to different values),
even though it is set earlier in the TF1.
Also, since my code is already set up to query the TF1 for the
fit results, will they still be present there after the Fitter::Fit(d,f) ?
Since you say that TH1::Fit uses Fitter behind the scenes anyway,
is there a simple way to make this (get fit results into the TF1
from the Fitter) happen?
Thanks for your guidance!
OK, it appears it is all straightforward. This is what I think needs to be done:
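ROOT::Fit::Fitter fitter ;
ROOT::Fit::BinData bd ;
ROOT::Fit::FillData( bd, myTH1, myTF1 ) ;             // fill bd from the histogram bins
ROOT::Math::WrappedMultiTF1 wf ( *myTF1 ) ;           // adapt the TF1 to the Fitter's function interface
ROOT::Math::IParamMultiFunction& f ( wf ) ;
bool ret ( fitter.Fit( bd, f ) ) ;
if( ret ) myTF1->SetFitResult( fitter.Result() ) ;    // copy the fit result back into the TF1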
With at most 2 active threads at a time, this runs a few times without crashing, until it fails with:
Thread 2 (Thread 0x7f32d65ce700 (LWP 12909)):
#0 0x00000032a3cab59d in waitpid () from /lib64/libc.so.6
#1 0x00000032a3c3e349 in do_system () from /lib64/libc.so.6
#2 0x00000032a3c3e680 in system () from /lib64/libc.so.6
#3 0x00007f32dcf41c42 in TUnixSystem::StackTrace() () from /nfs/acc/libs/Linux_
#4 0x00007f32dcf3eb8a in TUnixSystem::DispatchSignals(ESignals) () from /nfs/ac
#6 0x00007f32d88844da in ROOT::Fit::FitUtil::EvaluateChi2(ROOT::Math::IParametr
icFunctionMultiDim const&, ROOT::Fit::BinData const&, double const*, unsigned in
t&) () from /nfs/acc/libs/Linux_x86_64_intel/current/packages/root/lib/libMathCo
#7 0x00007f32dac2f4d7 in TMinuitMinimizer::Fcn(int&, double*, double&, double*,
int) () from /nfs/acc/libs/Linux_x86_64_intel/current/packages/root/lib/libMinu
#8 0x00007f32dac1324b in TMinuit::Eval(int, double*, double&, double*, int) ()
#9 0x00007f32dac1fdcc in TMinuit::mnhes1() () from /nfs/acc/libs/Linux_x86_64_i
#10 0x00007f32dac20d02 in TMinuit::mnhess() () from /nfs/acc/libs/Linux_x86_64_i
#11 0x00007f32dac1c27c in TMinuit::mnmigr() () from /nfs/acc/libs/Linux_x86_64_i
#12 0x00007f32dac2a906 in TMinuit::mnexcm(char const*, double*, int, int&) () fr
#13 0x00007f32dac30702 in TMinuitMinimizer::Minimize() () from /nfs/acc/libs/Lin
#14 0x00007f32d887a642 in ROOT::Fit::Fitter::DoMinimization(ROOT::Math::IBaseFun
ctionMultiDim const*) () from /nfs/acc/libs/Linux_x86_64_intel/current/packages/
#15 0x00007f32d887a7ec in ROOT::Fit::Fitter::DoMinimization(ROOT::Math::IBaseFun
ctionMultiDim const&, ROOT::Math::IBaseFunctionMultiDim const*) () from /nfs/acc
#16 0x00007f32d887d7cb in ROOT::Fit::Fitter::DoLeastSquareFit(ROOT::Fit::BinData
const&) () from /nfs/acc/libs/Linux_x86_64_intel/current/packages/root/lib/libM
#17 0x000000000045f137 in ROOT::Fit::Fitter::Fit(ROOT::Fit::BinData const&) ()
#18 0x000000000045f8e5 in bool ROOT::Fit::Fitter::Fit<ROOT::Fit::BinData, ROOT::
Math::IParametricFunctionMultiDim>(ROOT::Fit::BinData const&, ROOT::Math::IParam
etricFunctionMultiDim const&) ()
#19 0x000000000045a041 in threadFit(void*) ()
#20 0x00007f32d85ae2c6 in TThread::Function(void*) () from /nfs/acc/libs/Linux_x
#21 0x00000032a4807851 in start_thread () from /lib64/libpthread.so.0
#22 0x00000032a3ce76dd in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f32d7c51820 (LWP 12907)):
#0 0x00000032a3ce02c3 in select () from /lib64/libc.so.6
#1 0x00007f32dcf3a54b in TUnixSystem::Sleep(unsigned int) () from /nfs/acc/libs
#2 0x00007f32d85ad5ce in TThread::Sleep(unsigned long, unsigned long) () from /
#3 0x000000000045a26a in XbsmHFitter::wait(unsigned int) ()
#4 0x000000000045c797 in XbsmHFitter::XbsmHFitter(XbsmHstMgr&, XbsmFuncBase&) (
#5 0x000000000043c56b in main ()
I notice that the EStates of the threads are still always either 2 (Running) or 6 (Canceled).
One more thing: If I restrict the job to a single thread, it runs
without crashing. When I examine the results, I find that the
limits on floating parameters that are set in TF1::SetParLimits()
are not respected. The way I fix parameters is via setting
the upper and lower limits to be the same. This feature
is indeed respected by the 1-thread code as given in
the previous post. So part of the parameter limit
information is being passed through from TF1, but not all.
From the state of some of the documentation pages, it seems
as if some of this code might be at the bleeding edge, or is
this an incorrect conclusion?
One more thing I finally noticed: in my previous code using TH1::Fit,
I put some options into the 2nd argument (“QEMR”). Offhand I don’t
see how to set those in this alternate code. I don’t see an easy
place to do it.
Probably spinning my wheels here, but I got to thinking about possible
collisions when the function gets evaluated. I use the TF1 constructor
that gives an address of a function in a class as input. The comments
in that constructor state that such a TF1 cannot be cloned.
For each thread I do create a new instance of the TF1
via the copy constructor ( TF1* newTF1 ( new TF1( originalTF1 ) ) ).
If I understand things (and I very well may not), this will still
use the address of the original function given to the first TF1.
Now, this function is a non-static const member function of a
class, which in turn calls non-static virtual const member functions
specified from a superclass. I wonder how the heck the address
of the original function can possibly be enough information in
this situation. None of the calls, however, change the state
of the function (since they are const and I don’t use mutable).
So, on the surface, they should be fine, I think. But I worry.
So I replaced my TF1 with a simple TF1("Poly", "pol5(0)", -800, 800)
to bypass the potential problems in my complicated function.
It runs restricted to 1 thread.
It crashes in exactly the same place for two threads.
So it appears it is not a problem in my function.
The code I use on our local computers gives an interactive root version of
Version 5.32/00 2 December 2011
Perhaps this is not expected to work for multi-threading ?
Would a new version help?
Any updates here?
I’m currently trying out the solution proposed by Lorenzo, but I still get a segfault.
I probably need to set up independent instances of ROOT::Math::WrappedMultiTF1 and ROOT::Math::IParamMultiFunction for every thread. Does anyone have a hint on how to do that?
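The general pattern of giving every thread its own instances can be sketched in plain C++11 as follows. This is not ROOT code and is not claimed to cure the segfault: `PolyModel`, `integrate` and `integrateRanges` are hypothetical names, with `PolyModel` standing in for a function object that changes internal state when evaluated. In ROOT terms the analogous step would presumably be to construct the wrapped function, the data object and the Fitter inside each thread, but that is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical function object with internal state that changes on every
// evaluation, so one instance must not be shared between threads.
class PolyModel {
public:
    explicit PolyModel(std::vector<double> coeffs)
        : coeffs_(std::move(coeffs)), evalCount_(0) {}

    // Evaluate the polynomial with Horner's scheme and count the call.
    double eval(double x) {
        ++evalCount_;
        double r = 0.0;
        for (std::size_t i = coeffs_.size(); i-- > 0;) r = r * x + coeffs_[i];
        return r;
    }

    unsigned long evalCount() const { return evalCount_; }

private:
    std::vector<double> coeffs_;
    unsigned long evalCount_;
};

// Midpoint-rule integral; modifies the model it is given through eval().
double integrate(PolyModel& model, double lo, double hi, int steps) {
    assert(steps > 0);
    const double h = (hi - lo) / steps;
    double sum = 0.0;
    for (int i = 0; i < steps; ++i) sum += model.eval(lo + (i + 0.5) * h);
    return sum * h;
}

// One thread per range. Each thread copies the prototype before using it,
// so the only shared objects are read-only inputs and disjoint output slots.
std::vector<double> integrateRanges(const PolyModel& prototype,
                                    const std::vector<std::pair<double, double>>& ranges,
                                    int steps) {
    std::vector<double> out(ranges.size());
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < ranges.size(); ++i) {
        threads.emplace_back([&prototype, &ranges, &out, i, steps]() {
            PolyModel local(prototype);  // private copy for this thread
            out[i] = integrate(local, ranges[i].first, ranges[i].second, steps);
        });
    }
    for (auto& t : threads) t.join();
    return out;
}
```

The prototype is only ever read (copied) by the threads, so its own evaluation counter stays at zero.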
Proposed solution can be found here:
and in the bug tracker: