Parallel canvas generation and saving

_ROOT Version:_ 6.13/08
_Platform:_ Ubuntu 16.04


I have a set of histograms that I want to draw, one per canvas, and save as image files on disk. At the moment I do this sequentially, which works fine. However, I was hoping to speed it up by writing the files in parallel. Is there a way to do this “natively” with ROOT? I couldn’t find anything relevant, other than “parallel write” being listed as a future plan in ROOT development.

I tried this myself with OpenMP (code attached, 1.1 KB).

Compilation command:

g++ -o parallel_write `root-config --cflags --libs` -fopenmp

This crashes with the following error:

*** Error in `./parallel_write': double free or corruption (fasttop): 0x00007f77a4000cb0 ***

Any help is appreciated.


Would TBufferMerger fit the bill?

AFAICT, the OP isn’t interested in writing to a root file in parallel, but creating PNG files in parallel.

Yes, exactly, @sbinet. In my example code, each OpenMP thread creates a canvas, calls hist->Draw() on one of the N histograms in a vector of histograms, and calls Print() to write a .png file with a unique name (resulting in N .png files).

I see.
Then it comes down to whether this code can be executed in parallel or not:

TH2S* h = vec[i];
TCanvas *c = new TCanvas(std::to_string(i).c_str());
h->Draw(); // draw the histogram on the new canvas, as described above
std::string name = std::string(h->GetName());
c->Print((name + ".png").c_str());
delete c;

It seems not: gdb shows at least two places that remain thread-unsafe even with ROOT::EnableThreadSafety, in TApplication and TPad. (The problem is not in I/O: parallel writing of different TFiles is supported in general.)

A solution that sidesteps the thread-safety question altogether is multi-processing:

   auto draw = [](TH2S *h) {
      TCanvas *c = new TCanvas((std::string("c_") + h->GetName()).c_str());
      h->Draw(); // draw the histogram on the canvas before printing
      std::string name = std::string(h->GetName());
      c->Print((name + ".png").c_str());
      delete c;
      return 0;
   };
   ROOT::TProcessExecutor e(4); // 4 worker processes
   e.Map(draw, vec);

This should work, and might give you a good speed-up depending on your actual application: TProcessExecutor forks ROOT and executes that lambda function in a different process for each argument.
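To illustrate why separate processes avoid the thread-safety problem, here is a minimal sketch of the general fork-and-wait pattern using plain POSIX calls. It is not how TProcessExecutor is implemented and it does not use ROOT at all: `render_to_file` and `map_in_processes` are hypothetical names, and writing a small text file stands in for drawing and printing a canvas. Each child gets its own copy of the parent's memory, so no global state is shared between the tasks.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for "draw a histogram and print it to an image":
// writes a one-line text file named <name>.txt. Returns 0 on success.
int render_to_file(const std::string &name) {
   std::ofstream out(name + ".txt");
   out << "rendered " << name << "\n";
   out.flush();
   return out.good() ? 0 : 1;
}

// Runs render_to_file once per item, each call in its own forked child,
// with at most max_parallel children alive at a time.
// Returns the number of children that exited with status 0.
int map_in_processes(const std::vector<std::string> &items, unsigned max_parallel) {
   assert(max_parallel > 0);
   unsigned running = 0;
   int ok = 0;

   // Wait for any one child to finish and record whether it succeeded.
   auto reap_one = [&]() {
      int status = 0;
      if (wait(&status) > 0) {
         --running;
         if (WIFEXITED(status) && WEXITSTATUS(status) == 0) ++ok;
      } else {
         running = 0; // no children left to wait for
      }
   };

   for (const auto &item : items) {
      if (running >= max_parallel) reap_one();
      pid_t pid = fork();
      if (pid == 0) {
         // Child: do the work on a private copy of the parent's memory,
         // then exit immediately without running the parent's exit handlers.
         _exit(render_to_file(item));
      }
      if (pid > 0) ++running; // parent: one more child in flight
   }
   while (running > 0) reap_one();
   return ok;
}
```

Calling `map_in_processes({"h0", "h1", "h2"}, 2)` from `main` would write `h0.txt`, `h1.txt` and `h2.txt` with at most two children running at once. The price of this isolation is that results have to travel back through exit codes, files, or some other inter-process channel rather than shared memory.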

Hope this helps,


Thanks Enrico, that’s exactly what I was looking for. It works perfectly.

For 18 histograms (each with 4.8M bins), this gave a speedup of about 13X using 18 worker processes (on a 16-core machine), compared to single-threaded execution.

Good stuff :smiley:


Edit: the speedup mentioned above is for the parallel part only. The overall speedup of the application, including the serial part, is about 9X.
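As a rough cross-check of these two figures, one can assume Amdahl's law applies (a modeling assumption, not something measured in this thread). Here $p$ is the fraction of the original single-process runtime spent in the parallelized part, $s \approx 13$ is the speedup of that part, and $S \approx 9$ is the overall speedup:

```latex
S = \frac{1}{(1 - p) + \dfrac{p}{s}}
\quad\Longrightarrow\quad
p = \frac{1 - 1/S}{1 - 1/s}
  = \frac{8/9}{12/13}
  = \frac{26}{27}
  \approx 0.96
```

Under that assumption, roughly 96% of the original runtime was spent drawing and printing, and the remaining serial part of about 4% is what keeps the overall speedup at 9X rather than 13X. Both inputs are quoted as approximate, so the result is only indicative.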

