Threads in ROOT

Dear ROOT experts,

I have a function that loops over a TTree and draws variables, with some extra processing, so it gets slow when the tree has a large number of events. I would like to speed it up by using threads.

For example, with 2 variables I have something like this:

std::thread t[2];
t[0] = std::thread(draw,string(argv[1]), string(argv[2]), variables[0]);
t[1] = std::thread(draw,string(argv[1]), string(argv[2]), variables[1]);

t[0].join();
t[1].join();

If I compile it like this:
g++ -o draw draw.cpp --std=c++11 `root-config --cflags --libs` -O2 -I./ -lASImage

I get a segmentation violation:
*** Break *** segmentation violation
Fatal in <TClass::Init>: gInterpreter not initialized
aborting

When I add -pthread, compilation actually fails… I have looked through the ROOT forums and it seems that one has to add

ROOT::EnableThreadSafety();

but I don’t know where this function is defined:
error: ‘EnableThreadSafety’ is not a member of ‘ROOT’

I would appreciate any help and advice since I don’t have any experience with threads.

Regards,
Ivan.

Hi,

This function is available only in v6.08 and later (an earlier version of this post incorrectly said v6.06). For older releases you need to call TThread::Initialize() and create one TThread object in each of your threads.

Cheers,
Philippe.

It’s still not working although I’m in 6.06:

echo $ROOTSYS
/afs/cern.ch/sw/lcg/app/releases/ROOT/6.06.00/x86_64-slc6-gcc49-opt/root

I’m getting:

‘EnableThreadSafety’ is not a member of 'ROOT’
ROOT::EnableThreadSafety();

Regards,
Ivan.

Hi Ivan,

Following what Philippe said, you can do this provided that each thread opens its own TFile and processes the TTree from that file.
The syntax for 6.06 is still TThread::Initialize() (and this will not be dropped in the immediate future). For releases later than 6.06, e.g. 6.07, you’ll be able to call ROOT::EnableThreadSafety(). The binaries for these development releases are not yet available on afs.
You can also speed up your job with a multiprocessing approach, using the TProcPool class. Have a look at this tutorial (TProcPool is available in 6.06):
root.cern.ch/doc/master/mp102__r … it_8C.html
In this case the protection of shared states is way easier since you are using processes. The price to pay is some (actually little) overhead for the forking and other operations but since you mention that you are dealing with a large number of events, this could be a good solution for you.

Cheers,
Danilo

Hi Danilo,

I think my case is different. I have a function that is reading several files, fills several histograms, calculates uncertainties etc. and creates a plot for a given variable:

github.com/ishvetso/aTGCsAnalys … er.cpp#L36

The basic reason for the slowdown is that I’m using tree -> Project() many times, and this becomes really slow when one wants to calculate systematic uncertainties, so I end up with a lot of loops over trees from several files. The reason I use tree -> Project() is that I want the selection to be configurable via std::string. That’s why I wanted to parallelize it.

Is there any way to speed up the code in this case?

Regards,
Ivan.

Hi Ivan,

I see.
In ROOT6, you can always “configure a cut with a string” thanks to its jitting capabilities: anything can be just in time compiled at runtime, even the function passed to TProcPool::ProcTree.
Before resorting to that though, did you try to:

  1. Call ROOT::EnableThreadSafety() (TThread::Initialize() for releases <= 6.06)
  2. Open a file per thread
  3. Perform the operations you need, e.g. line 82 here github.com/ishvetso/aTGCsAnalys … er.cpp#L36
  4. Merge into the final histogram?

About the error you reported

Fatal in <TClass::Init>: gInterpreter not initialized

Is it fixed by the use of ROOT::EnableThreadSafety() or its equivalent? If not, a TApplication might be needed, but I would then need to see the full code and be able to run it.

Cheers,
Danilo
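
The four steps above can be sketched without any ROOT dependency. This is only an illustration of the pattern (one private result per thread, merged after joining), not the actual analysis code: Hist, fillFromOneFile and fillThreaded are hypothetical names, and a vector of bin counters stands in for a TH1D. In the real job, each worker would open its own TFile at the point marked in the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Stand-in for a histogram: one counter per bin (hypothetical, not a ROOT type).
using Hist = std::vector<long>;

// Stand-in for "open one file and project its tree" (assumption for illustration).
// Each call fills a histogram that no other thread touches.
void fillFromOneFile(const std::vector<int>& binIndices, Hist& partial) {
    for (int bin : binIndices) {
        assert(bin >= 0 && static_cast<std::size_t>(bin) < partial.size());
        ++partial[static_cast<std::size_t>(bin)];
    }
}

// One thread per input, one private histogram per thread, merge after joining.
Hist fillThreaded(const std::vector<std::vector<int>>& files, std::size_t nBins) {
    // Sized up front so no reallocation happens while the threads are running.
    std::vector<Hist> partials(files.size(), Hist(nBins, 0));
    std::vector<std::thread> workers;
    workers.reserve(files.size());
    for (std::size_t i = 0; i < files.size(); ++i) {
        workers.emplace_back(fillFromOneFile, std::cref(files[i]), std::ref(partials[i]));
    }
    for (std::thread& w : workers) {
        w.join();
    }

    // Merging happens in the calling thread only, so no lock is needed.
    Hist total(nBins, 0);
    for (const Hist& p : partials) {
        for (std::size_t b = 0; b < nBins; ++b) {
            total[b] += p[b];
        }
    }
    return total;
}
```

Because every worker writes only to its own partial result, the only synchronisation point is join().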

Hi Danilo,

No, I have not done that. And from what you say, I think it’s not possible with the current code. The operation I originally wanted to parallelize was basically: loop over many files and trees -> draw a variable -> parallelize over variables, so that many files and trees are opened in one thread. This is not possible, right?

As far as I understand you now, one can open files and trees per thread and merge the results into the final histogram.

Let’s say:
without threads you have:

TH1D * getHist(....){
   // sum over all files
   TH1D *hist = new TH1D("hist","hist", Nbins,low,high);
   for (filename in fileNames)
   {
      TFile file(filename, "READ");
      TTree * tree = (TTree*)file.Get(treeName);
      // per-file histogram, filled by Project() and then added to the sum
      TH1D *temp = new TH1D("temp","temp", Nbins,low,high);
      tree ->Project("temp",varName, selection);
      hist -> Add(temp);
   }
   return hist;
}

Now you want to do this threaded:

#include <TThread.h>
TH1D * getHistFromSingleFile(....){
    TH1D *hist = new TH1D("temp","temp", Nbins,low,high);
    TFile file(filename, "READ");
    TTree * tree = (TTree*)file.Get(treeName);
    tree ->Project("temp",varName, selection);	   
    return hist;
}

TH1D * getHistThreaded(....){
   TThread workers[NumberOfFiles];
   TThread::Initialize();
   TH1D *hist = new TH1D("hist","hist", Nbins,low,high);
   for (filename in fileNames)
   {
      workers[iFile] = TThread("thread " + iNumber, getHistFromSingleFile, ... your filename etc );
   }
   for (filename in fileNames)
   {
      workers[iFile].Join();
      //how to get result of the function from the thread?
      hist -> Add(YourHistFromThread);
   }
   return hist;
}

Does this make sense to you?

To the best of my understanding,
Ivan.

PS.
for my education: is it possible to do these things in std::thread?

Can anybody give me feedback on the previous post?

Regards,
Ivan.

Hi Ivan,

Yes, it makes sense. One has to be careful to open one file per thread and to process only the tree that belongs to that file.

Cheers,
Danilo

Yes, it is possible. With the version of ROOT you have, you still need to create a TThread object in each thread, but this can be done as the first statement of the function that std::thread starts.

Cheers,
Philippe.

Thanks for the help. What is not clear to me is how to get the TH1F object out of the thread. Can you show me an example? I can see examples here:
root.cern.ch/root/htmldoc/tutor … index.html

but all functions are void …

Regards,
Ivan.

Hi Ivan,

the threads are sharing the same memory. An idea could be to pass a pointer by reference to the function and assign it inside the function you run in the thread.
Other examples can be found here:

Cheers,
Danilo
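
A minimal sketch of that idea with std::thread, independent of ROOT. The function names are hypothetical and a plain sum stands in for filling a histogram. The first variant writes the result through a reference passed with std::ref. The second uses std::async, which is one standard alternative when the worker should simply return a value.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical worker: hands its result back through an output parameter,
// because the return value of a function run by std::thread is discarded.
void sumInto(const std::vector<double>& values, double& result) {
    result = std::accumulate(values.begin(), values.end(), 0.0);
}

// Variant 1: output parameter. std::ref is required, otherwise std::thread
// would try to copy the argument instead of binding to the caller's variable.
double sumWithThread(const std::vector<double>& values) {
    double result = 0.0;
    std::thread worker(sumInto, std::cref(values), std::ref(result));
    worker.join();               // the result is only safe to read after join()
    assert(!worker.joinable());  // joined, so destroying the thread object is safe
    return result;
}

// Variant 2: std::async returns a std::future that carries the return value.
double sumWithFuture(const std::vector<double>& values) {
    std::future<double> f = std::async(std::launch::async, [&values] {
        return std::accumulate(values.begin(), values.end(), 0.0);
    });
    return f.get();  // blocks until the worker has finished
}
```

With several workers, each one should get its own output variable. Letting all of them write into the same object would need a lock.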

Hi Danilo,

Thanks a lot for the help. Trying to run it with the histogram pointer passed by reference, I have something like this:

void Plotter::GetHistFromSingleFile(std::string filename_, Var var_, std::string selection_, std::string TreeName, TH1D *& hist_){
   	TFile file(filename_.c_str(), "READ");
	TTree * tree = (TTree*)file.Get(TreeName.c_str());
	TH1D *temp = new TH1D("temp","temp", Nbins,var_.Range.low, var_.Range.high);
	tree ->Project("temp",var_.VarName.c_str(), selection_.c_str());
	hist_ -> Add(temp);
}

void Plotter::GetHistThreaded(Sample sample_, Var var_, std::string TreeName,const TH1D *& hist_){

	int Nthread = sample_.filenames.size();
	TThread *t[Nthread];
	TThread::Initialize();
	for (uint file_i = 0; file_i < sample_.filenames.size(); ++file_i)
	{
    	t[file_i] = new TThread(("t" + to_string(file_i)).c_str(), GetHist, sample_.filenames.at(file_i), var_, sample_.selection, TreeName, hist_ );
    	t[file_i] -> Run();
	}



	for (uint file_i = 0; file_i < sample_.filenames.size(); ++file_i)
	{
    	t[file_i] -> Join();
	}
}

Compiling this code gives the following error:

Plotter.cpp: In member function 'void Plotter::GetHistThreaded(Sample, Var, std::string, const TH1D*&)':
Plotter.cpp:63:145: error: invalid use of non-static member function
      t[file_i] = new TThread(("t" + to_string(file_i)).c_str(), GetHist, sample_.filenames.at(file_i), var_, sample_.selection, TreeName, hist_ );

What’s the problem ? I’m not sure if the arguments are provided correctly …

Thanks,
Ivan.

Hi,

what TThread constructor would you like to invoke*? I think you are treating TThread as std::thread.
Not strictly related but strings should not be passed by value (see Plotter::GetHistFromSingleFile implementation).

D

root.cern.ch/doc/master/classTT … aee2ca6c90 and below.

Hi,

Is the function ‘GetHist’ marked as static?

Cheers,
Philippe.
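
For reference, the "invalid use of non-static member function" error usually means that a member function was named without an object. Below is a small sketch of the two usual ways around it with std::thread (not TThread, whose constructors take a different form). The class Worker and its methods are hypothetical stand-ins for Plotter.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <thread>

// Hypothetical class standing in for Plotter.
class Worker {
public:
    // Non-static: needs an object, so it is passed as &Worker::process plus a pointer.
    void process(const std::string& name, int& out) {
        out = static_cast<int>(name.size()) + offset_;
    }

    // Static: no object needed, behaves like a free function.
    static void processStatic(const std::string& name, int& out) {
        out = static_cast<int>(name.size());
    }

    int runBoth(const std::string& name) {
        int a = 0;
        int b = 0;
        // Member function pointer first, then the object, then the arguments.
        std::thread t1(&Worker::process, this, std::cref(name), std::ref(a));
        // A static member function is passed like an ordinary function pointer.
        std::thread t2(&Worker::processStatic, std::cref(name), std::ref(b));
        t1.join();
        t2.join();
        assert(!t1.joinable() && !t2.joinable());
        return a + b;
    }

private:
    int offset_ = 1;
};
```

Wrapping the call in std::bind or in a lambda that captures this should work equally well. The important part is that the object is supplied explicitly.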

I have managed to compile the code. The key point was that I had to give arguments of the function with std::bind.

However, I had to create a TThread object, as you mentioned before. I did not manage to do this with my own function, so I just created a dummy function void DoNothing(){}.

However, I still see that the code crashes at some point, though not every time.

The code is here:

github.com/ishvetso/aTGCsAnalys … pp#L36-L59

running it right now gives:

*** Break *** illegal instruction

*** Break *** segmentation violation

I suspect that the threads are not being cleaned up properly. In particular, I’m not sure how to do this with std::thread.

Can you please have a look and give me some feedback?

Thanks & regards,
Ivan.
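
On the cleanup question: a std::thread held by value does not need delete, but it must be joined (or detached) before it is destroyed, otherwise the destructor calls std::terminate. A minimal sketch of that rule, with a hypothetical function name and a trivial counter in place of real work:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Starts nThreads workers and makes sure every one is joined before returning.
// Returns how many workers actually ran.
int runAndJoinAll(int nThreads) {
    assert(nThreads >= 0);
    std::atomic<int> finished{0};
    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(nThreads));
    for (int i = 0; i < nThreads; ++i) {
        workers.emplace_back([&finished] { ++finished; });
    }
    // A std::thread that is still joinable when destroyed calls std::terminate,
    // so join each one exactly once before the vector goes out of scope.
    for (std::thread& w : workers) {
        if (w.joinable()) {
            w.join();
        }
    }
    return finished.load();
}
```

Keeping the threads in a std::vector<std::thread> instead of an array of raw pointers removes the need for any manual delete.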

Hi Ivan,

what is the outcome of your analysis of the issue with gdb?

Cheers,
Danilo

Hi Danilo,

here is the output:

(gdb) run
Starting program: /afs/cern.ch/work/i/ishvetso/aTGCRun2/CMSSW_7_4_14/src/aTGCsAnalysis/Common/test/Plotting/draw draw
warning: File “/cvmfs/cms.cern.ch/slc6_amd64_gcc491/external/gcc/4.9.1-cms/lib64/libstdc++.so.6.0.20-gdb.py” auto-loading has been declined by your `auto-load safe-path’ set to “$debugdir:$datadir/auto-load”.
To enable execution of this file add
add-auto-load-safe-path /cvmfs/cms.cern.ch/slc6_amd64_gcc491/external/gcc/4.9.1-cms/lib64/libstdc++.so.6.0.20-gdb.py
line to your configuration file “/afs/cern.ch/user/i/ishvetso/.gdbinit”.
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file “/afs/cern.ch/user/i/ishvetso/.gdbinit”.
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info “(gdb)Auto-loading safe path”
[Thread debugging using libthread_db enabled]
Using host libthread_db library “/lib64/libthread_db.so.1”.
You should use only 2 arguments …
terminate called after throwing an instance of 'std::logic_error’
what(): basic_string::_S_construct null not valid

Program received signal SIGABRT, Aborted.
0x00007ffff60f7625 in raise () from /lib64/libc.so.6

Thanks,
Ivan.
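
A note on this trace: "basic_string::_S_construct null not valid" is the message libstdc++ gives when a std::string is constructed from a null pointer. Together with the "You should use only 2 arguments" line printed just before it, one plausible reading is that the program was started under gdb without its expected arguments, so that string(argv[...]) received a null pointer. If so, this particular abort may be unrelated to the threads. The sketch below shows one way to guard against it. readArgs is a hypothetical helper, not part of the original code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Copies the command-line arguments into strings only if exactly nExpected of them
// were given. Returns false instead of building a std::string from a missing entry.
bool readArgs(int argc, const char* const* argv, int nExpected,
              std::vector<std::string>& out) {
    out.clear();
    if (argc != nExpected + 1) {  // argv[0] is the program name
        return false;
    }
    for (int i = 1; i < argc; ++i) {
        assert(argv[i] != nullptr);  // guaranteed by the argc check for a real main()
        out.emplace_back(argv[i]);
    }
    return true;
}
```

In main(int argc, char** argv), one would call readArgs(argc, argv, 2, args) and return with a usage message when it yields false, rather than continuing to the thread setup.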

Hi Ivan,

this does not tell much.
Could you investigate the status of the threads at the moment of the crash? You will probably need to compile with debugging symbols for that.

Danilo

Hi,

t[file_i] = new TThread(("t" + to_string(file_i)).c_str(), DoNothing, (void*)0 );
t[file_i] -> Run();
workers[file_i] = std::thread(std::bind(&Plotter::GetHistFromSingleFile,this, sample_.filenames.at(file_i), var_, sample_, TreeName, file_i, hist_ ));

is not what I meant. This actually starts 2 independent threads (and the one started by std::thread is missing its TThread).

What I meant is that you use ‘just’:

workers[file_i] = std::thread(std::bind(&Plotter::GetHistFromSingleFile,this, sample_.filenames.at(file_i), var_, sample_, TreeName, file_i, hist_ ));

and you update GetHistFromSingleFile to start with:

void Plotter::GetHistFromSingleFile(std::string filename_, Var var_, Sample sample_, std::string TreeName, int Number, TH1D *& hist_){
   TThread th;
   std::cout << "from the thread : " << filename_ << std::endl;

Cheers,
Philippe.