I have no idea why my program is so slow

Hi there.

I have a ROOT file with two branches, ADC and hit, which have the types std::vector<int> and std::vector<bool> respectively.
With some kind help, I was able to write code that draws a histogram of each element in ADC.
Then I tried to do something like:

TH1D histo[30];
for (int i=0;i<30;++i){
  if (hit[i] == true){
    histo[i].Fill(ADC[i]);
  }
}

And the actual code I wrote follows:

using ints = ROOT::RVec<int>;
using bools = ROOT::RVec<bool>;

std::vector<ROOT::RDF::RResultPtr<TH1D>> histos;
for (int i = 0; i < 30; ++i)
{
    // Book the filling of the histogram of the i-th element of branch `ADC'
    // this is my code before the modification
    /*
    auto ith = [i](const ints &v) { return v.at(i); };
    ROOT::RDF::RResultPtr<TH1D> histo = df.Define("ADCi", ith, {"ADC"})
                                            .Histo1D({Form("histo%d", i + 1), "title;ADC(ch);Counts/2ch", 2048, 0, 4096}, "ADCi");
    */
    auto ith = [i](const ints &v, const bools &b)
    {
        if (b.at(i))
        {
            return v.at(i);
        }
    };
    ROOT::RDF::RResultPtr<TH1D> histo = df.Define("ADCi", ith, {"ADC", "isHit"})
                                            .Histo1D({Form("histo%d", i + 1), "title;ADC(ch);Counts/2ch", 2048, 0, 4096}, "ADCi");
    histos.push_back(histo);
}

The problem with the modified code above is its speed. The program took 23 minutes to run before the modification, but now it takes 3 hours! Both versions were run with ACLiC (adding + after the filename).
Please help me speed up my program.

Cheers,
haltack.

Hi,
two simple changes that might help a bit:

  • change Histo1D(...) to Histo1D<int>(...)
  • change at(i) to [i]
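
To illustrate the second suggestion, here is a minimal sketch that uses std::vector as a stand-in (the assumption being that RVec's at(i) and [i] differ in the same way). The helper names checkedElement, uncheckedElement and atThrows are made up for this example. at(i) compares the index against the size on every call and throws when it is out of range, while [i] skips that check, so it is only safe when the index is known to be valid:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Checked access: at() compares i against size() on every call
// and throws std::out_of_range if the index is invalid.
int checkedElement(const std::vector<int> &v, std::size_t i) {
    return v.at(i);
}

// Unchecked access: operator[] skips the comparison, so the caller
// must guarantee that i < v.size().
int uncheckedElement(const std::vector<int> &v, std::size_t i) {
    return v[i];
}

// Returns true if at() rejected the index by throwing std::out_of_range.
bool atThrows(const std::vector<int> &v, std::size_t i) {
    try {
        (void)v.at(i);
    } catch (const std::out_of_range &) {
        return true;
    }
    return false;
}
```

The saving per call is small, but here the access runs once per event for each of the 30 histograms, so it may add up.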

After those changes, if runtimes are still unsatisfactory, we should check where time is spent.
To that end you can, for example, compile your program into an executable, e.g. g++ -g -o program program.C $(root-config --libs --cflags) (we need -g to have debug symbols), and then sample a few seconds of its execution with perf (if on Linux), VTune, or similar: perf record --call-graph dwarf -F99 ./program (stop it with Ctrl-C after a few seconds). Then you can check the result with perf report, or share the output of perf script with me so I can take a look.
A nice overview of perf usage can be found here.

Alternatively you can also share program and (part of the) data with me and I will take a look when possible.

Without measuring where time is spent, my suggestion would be to rework the loop that books 30 separate histograms so that you instead fill a single 2D histogram, e.g. with (I might be getting something wrong, but it should convey the idea):

ROOT::RVec<int> yrange(30);
std::iota(yrange.begin(), yrange.end(), 0); // 0, 1, 2, ..., 29 (needs #include <numeric>)
df.Define("yrange", [&] { return yrange; })
  .Histo2D({"histo2d", "histo2d", 2048, 0, 4096, 30, 0, 30}, "ADC", "yrange", "isHit");

Here we are using isHit as a list of weights and ADC/yrange as lists of values. Internally, the filling is performed as:

for (int i = 0; i < nelements; ++i) {
  histo2D.Fill(/*valuex=*/ADC[i], /*valuey=*/yrange[i], /*weight=*/isHit[i]);
}

An approach such as the one above should be much faster: there is only one histogram to fill, there are far fewer temporaries/copies, and whenever we load data into the CPU cache we make good use of it rather than using only one element of the vectors at a time.
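
To see why a boolean weight can replace the if in the original loop, here is a small ROOT-free sketch. Everything in it is hypothetical and only for illustration: Grid is a plain array standing in for the 2D histogram, with 3 channels and 4 coarse ADC bins, and adcBin, fillGuarded and fillWeighted are made-up helpers. Filling with weight false adds 0 to the bin and weight true adds 1, so the bin contents match those of the guarded fill:

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2D histogram: counts[channel][adcBin].
constexpr std::size_t kChannels = 3;
constexpr std::size_t kBins = 4; // coarse ADC bins of width 1024 over [0, 4096)
using Grid = std::array<std::array<double, kBins>, kChannels>;

// Maps an ADC value to its bin; assumes 0 <= adc < 4096.
std::size_t adcBin(int adc) { return static_cast<std::size_t>(adc) / 1024; }

// Style of the original loop: fill channel i only if its hit flag is set.
Grid fillGuarded(const std::vector<int> &adc, const std::vector<bool> &isHit) {
    Grid counts{};
    for (std::size_t i = 0; i < kChannels; ++i) {
        if (isHit[i]) counts[i][adcBin(adc[i])] += 1.0;
    }
    return counts;
}

// Weighted style: fill every channel, using the flag as the weight
// (false adds 0, true adds 1).
Grid fillWeighted(const std::vector<int> &adc, const std::vector<bool> &isHit) {
    Grid counts{};
    for (std::size_t i = 0; i < kChannels; ++i) {
        counts[i][adcBin(adc[i])] += static_cast<double>(isHit[i]);
    }
    return counts;
}
```

Both functions expect vectors with at least kChannels elements. Note that this only compares bin contents; bookkeeping in a real histogram class, such as the number of entries, may still differ between the two styles.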

I hope this helps!
Enrico
