Hi,
I am reading a branch from a root file. Then using the branch to create a histogram in a loop and doing some calculations. So every time I need to make a histogram the first loop is for a variable and another loop over entries to fill the histogram. So it is taking too much time over my system and using only one CPU. My PC has 24 CPUs and I want to utilize all of them. I have tried implicit MT using ROOT::EnableImplicitMT(nThreads); but no benefit as it is then taking more time to perform. So is there a way so that I can distribute my loop over multiple threads. The loop looks like this…
TFile *input = new TFile(Form("%s",fname), "READ");
TTree *data = (TTree*) input->Get("data");
data->SetBranchAddress("q", &q);
int entries = data->GetEntries();
for(CHID=1;CHID<384;CHID++)
{
TH1F* h1 = new TH1F("h1", "Spectrum of Gamma Rays", nbins, 0, 100);
if(CHID >64 && CHID < 320){continue;}
cout<<"Performing fit for Channel ID ="<<CHID<<"\n";
for(j=1;j<=entries1;j++)
{
data->GetEntry(j);
if(q[CHID]>0)
{
h1->Fill(q[CHID]);
}
else{continue;}
}
h1->Draw("L")
}
yes there is a typo in my/your code, it should be for(int CHID = 1; CHID < 384; CHID++). But make sure to read RDataFrame’s docs that I linked, check out the tutorials, and understand what’s going on.
In line
histos.emplace_back(df.Define(“qchid”, “q[CHID]”).Filter(“qchid > 0”).Histo1D(“qchid”));
I have changes “q[CHID]” to only “q” so now it not showing any error message but I don’t know if it running correctly. I am attaching my old code below.
Ah yes, df.Define("qchid", "q[CHID]") should be df.Define("qchid", [CHID] (const ROOT::RVec<float> &q) { return q[CHID]; }, {"q"}) (we need to capture CHID from the outer scope).
As an aside, note that a way to speed up your original code by a lot, even if it still runs on a single thread, is to move the loop over entries outside of all others, so you loop over data once and fill all histograms in that single loop.
Your current code loops over the dataset 128 times or so.
Yes, I tried that also but my input root file is around 1 GB so anyways it is taking time for execution. I have also tried to write a separate script for multithreading which I am attaching. If you have time then please give your suggestions.
After using GetEntry() in the event loop only as suggested, my old script is considerably faster and I think I don’t need to use multi threading for this task.
Thank you so much for pointing out this simple yet very important mistake.