I have a bunch of histograms that I want to draw on a canvas (one histogram per canvas) and save to an image file on disk. At the moment I do this sequentially and that works fine. However, I was hoping to speed this up a little, by writing to file in parallel. Is there any way this can be done “natively” with ROOT? I couldn’t find anything relevant, other than “parallel write” being a future plan in ROOT development.
Yes exactly @sbinet. In my example code I let each omp thread create a canvas, and call hist->Draw() on one of the N histograms in a vector of histograms, and Print() to a .png file with a unique name (resulting in N .png files).
I see.
Then it comes down to whether this code can be executed in parallel or not:
TH2S* h = vec[i];
TCanvas *c = new TCanvas(std::to_string(i).c_str());
h->Draw("colz");
std::string name = std::string(h->GetName());
c->Print((name + ".png").c_str());
delete c;
It seems not: gdb shows at least two places that remain thread-unsafe even with ROOT::EnableThreadSafety, in TApplication and TPad (the problem is not in I/O: parallel writing of different TFiles is supported in general).
A solution that removes the thread-safety discussion altogether – multi-processing:
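// Needs #include <ROOT/TProcessExecutor.hxx>
// Draw one histogram on its own canvas and save it as <histogram name>.png
auto draw = [](TH2S *h) {
   TCanvas *c = new TCanvas((std::string("c_") + h->GetName()).c_str());
   h->Draw("colz");
   std::string name = std::string(h->GetName());
   c->Print((name + ".png").c_str());
   delete c;
   return 0;
};
ROOT::TProcessExecutor e(4); // 4 worker processes
e.Map(draw, vec);            // call draw on every histogram in vec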
This should work, and might give you a good speed-up depending on your actual application: TProcessExecutor forks the ROOT process into worker processes (4 here) and runs the lambda on each argument inside one of those workers, so no state is shared between the drawing calls.
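To make the "fork instead of threads" idea concrete, here is a minimal sketch in plain POSIX C++ without ROOT. It is not how TProcessExecutor is implemented. The helper name map_in_processes is hypothetical, it assumes a POSIX system, and unlike TProcessExecutor::Map it only reports how many workers succeeded instead of returning results. Each child gets its own copy of the parent's memory, which is why thread-unsafe code can run side by side.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper, not part of ROOT: calls task(item) for every item,
// spreading the items over n_workers forked child processes.
// Returns the number of workers that exited with status 0.
template <typename T, typename F>
int map_in_processes(const std::vector<T>& items, std::size_t n_workers, F task) {
    assert(n_workers > 0);
    std::vector<pid_t> children;
    for (std::size_t w = 0; w < n_workers; ++w) {
        pid_t pid = fork();
        if (pid < 0) break;  // fork failed: spawn no more workers (their items are skipped)
        if (pid == 0) {
            // Child: handles items w, w + n_workers, ... in its own address space.
            int failures = 0;
            for (std::size_t i = w; i < items.size(); i += n_workers)
                if (task(items[i]) != 0) ++failures;
            _exit(failures == 0 ? 0 : 1);  // leave without running the parent's cleanup
        }
        children.push_back(pid);  // Parent: remember the worker and keep spawning.
    }
    int ok = 0;
    for (pid_t pid : children) {
        int status = 0;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ++ok;
    }
    return ok;
}
```

A caller would compare the return value with n_workers to detect failed workers. Anything the task changes in memory stays in the child, so results have to leave through files (as the .png output does) or some other inter-process channel.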
Thanks Enrico, that’s exactly what I was looking for. It works perfectly.
For 18 histograms (each having 4.8M bins), this resulted in a speedup of about 13X using 18 worker processes (on a 16-core machine), compared to a single-process execution.
Good stuff
Ahmad
Edit: the speedup mentioned above is for the parallel part only. The overall speedup of the application, including the serial part, is about 9X.
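For anyone wondering how the 13X and 9X figures fit together: under Amdahl's law they imply that roughly 96% of the original run time was spent in the drawing and printing step. A minimal sketch of that arithmetic, assuming both figures are exact and the rest of the run is purely serial (the function name parallel_fraction is made up for illustration):

```cpp
#include <cassert>

// Amdahl's law: overall = 1 / ((1 - p) + p / part), where p is the fraction
// of the original run time spent in the part that was parallelised.
// Solving for p gives the expression below.
double parallel_fraction(double overall_speedup, double part_speedup) {
    assert(overall_speedup >= 1.0 && part_speedup > 1.0);
    return (1.0 - 1.0 / overall_speedup) / (1.0 - 1.0 / part_speedup);
}
```

With overall_speedup = 9 and part_speedup = 13 this gives p = 26/27, so about 3.7% of the original run time would be the serial part that the extra processes cannot help with.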