Garfield++ program not running at 100% CPU load


As the title suggests, I recently ran into an issue where my Garfield++ code no longer runs at 100% CPU load. As a result, the program takes longer to execute and wastes allocated resources when run on a supercomputer. Any suggestions on how to alleviate this problem would be greatly appreciated.

Previously, this was the typical load while my code was running (display from htop):

As one can see, all threads are utilized at 100%, and this is sustained throughout the execution. However, this is no longer the case. As I increase the number of electrodes for which signals are calculated, the load looks like this:

So the load on all threads drops by ~30%. I was able to narrow the issue down to this part of my code:

  double xe1, ye1, ze1, te1, e1;
  double xe2, ye2, ze2, te2, e2;
  double xi1, yi1, zi1, ti1;
  double xi2, yi2, zi2, ti2;
  int status;
  int Aright = 0, Aleft = 0;
  int count = index;
  bool calculate_signal = true;
  int max = omp_get_max_threads();
  std::cout << "Maximum number of threads: " << max << std::endl;
  #pragma omp parallel for
  for (int k = 0; k < index; k++) {
    AvalancheMicroscopic aval;
    aval.AvalancheElectron(*4 + 0),*4 + 1),*4 + 2),*4 + 3), 0.1, 0,0,0);
    const int np = aval.GetNumberOfElectronEndpoints();
    DriftLineRKF drift;
    for (int j = np; j--;) {
      aval.GetElectronEndpoint(j, xe1, ye1, ze1, te1, e1, 
                                  xe2, ye2, ze2, te2, e2, status);
      drift.DriftIon(xe1, ye1, ze1, te1);

Specifically, it is the inner for loop that causes this issue: when I comment it out, all threads run at ~100%. If anyone has come across this issue or knows how to fix it, please let me know.
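One thing that may be worth checking, separately from the utilization drop: in the snippet above the endpoint variables (xe1, ye1, ...) are declared before the `#pragma omp parallel for`, so under OpenMP's default rules they are shared among all threads, and every thread writes to them inside the inner loop. Whether this is related to the lower load is not established. Below is a minimal sketch of declaring such variables inside the loop body so that each thread has its own copy. `get_endpoint` and `sum_z` are hypothetical stand-ins, not Garfield++ functions.

```cpp
#include <vector>

// Hypothetical stand-in for a call like aval.GetElectronEndpoint(...):
// it fills its output arguments by reference.
void get_endpoint(int k, int j, double& x, double& y, double& z, double& t) {
  x = k;
  y = j;
  z = k + j;
  t = 0.0;
}

// Sums the z coordinate over all endpoints of all events.
double sum_z(int n_events, int np) {
  double total = 0.0;
  // reduction(+ : total) gives each thread a private partial sum.
  #pragma omp parallel for reduction(+ : total)
  for (int k = 0; k < n_events; k++) {
    for (int j = np; j--;) {
      // Declared inside the loop body, so each thread has its own copy
      // and no other thread can overwrite these values.
      double x, y, z, t;
      get_endpoint(k, j, x, y, z, t);
      total += z;
    }
  }
  return total;
}
```

The same effect could be had by listing the variables in a `private(...)` clause on the pragma; declaring them in the narrowest scope simply makes the intent visible in the code.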

Perhaps a Garfield++ developer could answer? Not sure who to tag, @Axel?

I’ll look into it, but I’m afraid I won’t have time in the next two days…


Thank you for the reply. It would be great if you could look at it when you get the chance. In the meantime, I am doing some profiling and trying different solutions. I will post my progress here once I have an update. Thank you again.


I have not found a solution so far, but here are some behaviors I have noticed:

  1. The reduction in utilization can be alleviated (only slightly) by decreasing the number of threads used.

  2. The inner loop has a rather wide range (np can be anywhere from 1 to ~2000), so this may be causing load imbalance between threads. (This is only a guess.)
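The load-imbalance guess in point 2 could be tested by changing the loop schedule. With a static schedule (the usual default in common compilers), iterations are divided into fixed blocks up front, so a thread whose events happen to have small np can finish early and sit idle. The following is a minimal sketch, not my actual code: `process_event` and `run_events` are hypothetical stand-ins for the Garfield++ work, and the only point is the `schedule(dynamic)` clause.

```cpp
#include <vector>

// Hypothetical stand-in for the per-event work: cost grows with np,
// like the loop over electron endpoints.
double process_event(int np) {
  double acc = 0.0;
  for (int j = np; j--;) acc += 1.0 / (1.0 + j);
  return acc;
}

// Processes all events in parallel. schedule(dynamic) hands out one
// iteration at a time, so a thread that finishes a cheap event
// immediately picks up the next unprocessed one.
std::vector<double> run_events(const std::vector<int>& np_per_event) {
  const int n = static_cast<int>(np_per_event.size());
  std::vector<double> result(n, 0.0);
  #pragma omp parallel for schedule(dynamic)
  for (int k = 0; k < n; k++) {
    result[k] = process_event(np_per_event[k]);
  }
  return result;
}
```

If utilization does not improve with `schedule(dynamic)`, load imbalance is probably not the main cause, and the threads may instead be waiting on something shared.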
