Is it possible to parallelize Garfield?

I have recently noticed that when I run my program, only one processor is used, and the execution takes a really long time. I narrowed things down to this while loop, which accounts for the majority of the runtime:

  double xc = 0., yc = 0., zc = 0., tc = 0., ec = 0., extra = 0.;
  int nc = 0;
  while (track.GetCluster(xc, yc, zc, tc, nc, ec, extra)) {
    for (int k = 0; k < nc; ++k) {
      double xe = 0., ye = 0., ze = 0., te = 0., ee = 0.;
      double dx = 0., dy = 0., dz = 0.;
      track.GetElectron(k, xe, ye, ze, te, ee, dx, dy, dz);
      aval.AvalancheElectron(xe, ye, ze, te, ee, dx, dy, dz);
      const int np = aval.GetNumberOfElectronEndpoints();
      double xe1, ye1, ze1, te1, e1;
      double xe2, ye2, ze2, te2, e2;
      double xi1, yi1, zi1, ti1;
      double xi2, yi2, zi2, ti2;
      int status;
      for (int j = np; j--;) {
        aval.GetElectronEndpoint(j, xe1, ye1, ze1, te1, e1, 
                                    xe2, ye2, ze2, te2, e2, status);
        drift.DriftIon(xe1, ye1, ze1, te1);
      }
    }
  }

Given the nature of the first for loop, this seems like a good opportunity to run things in parallel. My first thought was to make copies of track and then break up the for loop. However, after looking at the source and header files of TrackHeed, it seems like making copies of track is a bad idea since the operator= was explicitly made private to prevent track from being copied.

I am looking for any suggestions on how to parallelize this part. Thank you!

Hi,
yes, Heed used to rely on a number of global variables, which is why I disabled the copy constructor at some point. I should revisit that, though; maybe it’s not necessary any more.

The bottleneck in your simulation is not the track, though, but the avalanche simulation. So what you could perhaps do is:

  • simulate a track, retrieve the clusters and store the coordinates of the electrons in a container,
  • loop over the electrons in your container and simulate an avalanche for each of them (AvalancheElectron).

For the second part you could (I think) indeed use multiple instances of the AvalancheMicroscopic class; a rough sketch is shown below. Let me know if that makes sense and if you manage to get it to run…
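
For illustration, here is a minimal sketch of that structure. It is only one possible layout, not code taken from Garfield; it assumes a Sensor object called sensor and a TrackHeed object called track (as in your program), uses OpenMP for the second loop, and starts each avalanche electron with a small energy of 0.1 eV:

  // Phase 1: run the track once and buffer the conversion electrons.
  struct Electron { double x, y, z, t; };
  std::vector<Electron> electrons;
  double xc = 0., yc = 0., zc = 0., tc = 0., ec = 0., extra = 0.;
  int nc = 0;
  while (track.GetCluster(xc, yc, zc, tc, nc, ec, extra)) {
    for (int k = 0; k < nc; ++k) {
      double xe = 0., ye = 0., ze = 0., te = 0., ee = 0.;
      double dx = 0., dy = 0., dz = 0.;
      track.GetElectron(k, xe, ye, ze, te, ee, dx, dy, dz);
      electrons.push_back({xe, ye, ze, te});
    }
  }

  // Phase 2: simulate an avalanche for each buffered electron, with a
  // separate AvalancheMicroscopic instance per loop iteration.
  #pragma omp parallel for
  for (int k = 0; k < (int)electrons.size(); ++k) {
    AvalancheMicroscopic aval;
    aval.SetSensor(&sensor);
    aval.AvalancheElectron(electrons[k].x, electrons[k].y, electrons[k].z,
                           electrons[k].t, 0.1, 0., 0., 0.);
  }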

Hi,

I have updated the original while loop to the following:

  double xc = 0., yc = 0., zc = 0., tc = 0., ec = 0., extra = 0.;
  int nc = 0;
  int NofE = 0;
  while (track.GetCluster(xc, yc, zc, tc, nc, ec, extra)) {
    for (int k = 0; k < nc; ++k) {
      ++NofE;
    }
  }
  
  double electrons[NofE][8];
  double xe = 0., ye = 0., ze = 0., te = 0., ee = 0.;
  double dx = 0., dy = 0., dz = 0.;
  int index = 0;
  while (track.GetCluster(xc, yc, zc, tc, nc, ec, extra)) {
    for (int k = 0; k < nc; ++k) {
      track.GetElectron(k, xe, ye, ze, te, ee, dx, dy, dz);
      electrons[index][0] = xe;
      electrons[index][1] = ye;
      electrons[index][2] = ze;
      electrons[index][3] = te;
      electrons[index][4] = ee;
      electrons[index][5] = dx;
      electrons[index][6] = dy;
      electrons[index][7] = dz;
      ++index;
    }
  }
  

  double xe1, ye1, ze1, te1, e1;
  double xe2, ye2, ze2, te2, e2;
  double xi1, yi1, zi1, ti1;
  double xi2, yi2, zi2, ti2;
  int status;
  for (int k = 0; k < NofE; ++k) {
      aval.AvalancheElectron(electrons[k][0],electrons[k][1],electrons[k][2],electrons[k][3],
                             electrons[k][4],electrons[k][5],electrons[k][6],electrons[k][7]);
      const int np = aval.GetNumberOfElectronEndpoints();
      for (int j = np; j--;) {
        aval.GetElectronEndpoint(j, xe1, ye1, ze1, te1, e1, 
                                    xe2, ye2, ze2, te2, e2, status);
        drift.DriftIon(xe1, ye1, ze1, te1);

      }
  }

This seems to create some problems. The code runs normally until it gets to about the halfway point, and then it crashes with a segmentation fault.

Hi,
sorry for the confusion, I should probably document this better.
You can’t “replay” a track. When you call GetCluster, the previous cluster and its electrons are discarded. After the first loop, you’ve reached the end of the track and the body of the second loop is never executed because GetCluster always returns false (as there are no more clusters on the track). I’m a bit surprised that the program crashes with a segmentation fault though (not sure why).

The issue should be easy to fix, you can just add the coordinates of the electrons to a vector already in the first loop.

PS: you don’t need to add the ee, dx, dy, dz parameters to the list. They are there in the interface because other implementations might provide them, but Heed doesn’t calculate them.

Hi,

Thank you for your response!

I was able to fix the previous problems by rewriting the above section as follows:

  double xc = 0., yc = 0., zc = 0., tc = 0., ec = 0., extra = 0.;
  int nc = 0;
  std::vector<double> electrons;
  double xe = 0., ye = 0., ze = 0., te = 0., ee = 0.;
  double dx = 0., dy = 0., dz = 0.;
  int index = 0;
  while (track.GetCluster(xc, yc, zc, tc, nc, ec, extra)) {
    for (int k = 0; k < nc; ++k) {
      track.GetElectron(k, xe, ye, ze, te, ee, dx, dy, dz);
      electrons.push_back(xe);
      electrons.push_back(ye);
      electrons.push_back(ze);
      electrons.push_back(te);
      electrons.push_back(ee);
      electrons.push_back(dx);
      electrons.push_back(dy);
      electrons.push_back(dz);
      ++index;
    }
  }
  

  double xe1, ye1, ze1, te1, e1;
  double xe2, ye2, ze2, te2, e2;
  double xi1, yi1, zi1, ti1;
  double xi2, yi2, zi2, ti2;
  int status;  
  #pragma omp parallel for
  for (int k = 0; k < index; ++k) {
    aval.AvalancheElectron(electrons.at(k*8 + 0), electrons.at(k*8 + 1),
                           electrons.at(k*8 + 2), electrons.at(k*8 + 3),
                           electrons.at(k*8 + 4), electrons.at(k*8 + 5),
                           electrons.at(k*8 + 6), electrons.at(k*8 + 7));
  }

The code works successfully (I have tested it over several runs). However, when I try to run the same code in parallel (activated by #pragma omp parallel for), I get different errors every time I run it. Here are the errors I have encountered so far:
[screenshots of two different error messages]
These errors seem to stem from aval.AvalancheElectron().

Can you move the creation of the AvalancheMicroscopic object inside your parallelized loop? Something like this:

for (int k = 0; k < index; ++k) {
  AvalancheMicroscopic aval;
  aval.SetSensor(&sensor);
  aval.AvalancheElectron(electrons.at(k*8 + 0), electrons.at(k*8 + 1),
                         electrons.at(k*8 + 2), electrons.at(k*8 + 3), 
                         0.1, 0, 0, 0);
}

By the way, the parameters ee, dx, dy, dz in track.GetElectron are not meaningful. I would instead use a small value (of the order of 0.1 – 1 eV) for the initial electron energy in AvalancheElectron.

Hi,

I tried your suggestion and still get the second error, now with some additional output. Occasionally, I also get a segmentation violation and the code crashes.

Then I’m afraid I don’t have an immediate solution. We’ll need to make MediumMagboltz thread-safe. I’ll put it on my to-do list…

Thank you for your help so far. Just out of curiosity, do you think this is something that can be resolved relatively quickly? I’m sure you have other things to take care of, so I just want to know an estimate to decide if I should wait or start to try some other things.

Hi,
sorry for the delay in getting back to you. I’ve added a guard lock in the Mixer function of the MediumMagboltz class.

Not sure if this is enough to solve the issue, but can you give it a try and let me know if it works?
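
For reference, a guard lock of this kind typically follows the pattern sketched below. This is just an illustration of the idea, not the actual Garfield++ change; the mutex name and its placement are assumptions:

#include <mutex>

namespace {
  // Hypothetical mutex protecting the shared gas tables; the real variable
  // name and scope in MediumMagboltz may differ.
  std::mutex mixerMutex;
}

void Mixer() {
  // Only one thread at a time executes the table-filling code below; the
  // lock is released automatically when the function returns.
  const std::lock_guard<std::mutex> lock(mixerMutex);
  // ... compute the collision rates / cross-section tables ...
}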

Hi,

Your solution seems to have fixed the errors for MediumMagboltz. However, I am still getting warnings(?) from AvalancheMicroscopic:

AvalancheMicroscopic::TransportElectron: Got collision rate <= 0 at 7.0351e-08 eV (band 0).
AvalancheMicroscopic::TransportElectron: Got collision rate <= 0 at 2.26634e-07 eV (band 0).
AvalancheMicroscopic::TransportElectron: Got collision rate <= 0 at 1.36955e-06 eV (band 0).
AvalancheMicroscopic::TransportElectron: Got collision rate <= 0 at 1.93251e-06 eV (band 0).

I get the same warnings whether AvalancheMicroscopic aval is inside or outside the for loop.

Hi,
ah, that’s probably because you give it an initial energy of (close to) zero. Can you try with something like 0.1 eV instead?
Great that the parallelized loop works now by the way!

Hi,

Yes! The parallelized loop seems to work now. Thank you so much.

I changed the initial energy to 0.1–0.5 eV, but the warnings are still there. I did notice that the warnings only show up when I try to run things in parallel; if I comment out the “#pragma omp parallel for shared(electrons)”, the warnings go away. One other thing: while the code is running, I noticed that at the beginning, when the warnings are being printed out, multiple cores are used simultaneously. However, at some point (maybe halfway through), the warnings stop, and then only one core is used for the rest of the execution.

Can you upload a (minimal version of) your program?

Hi,

Here’s the program. I have taken out the unimportant parts.
parallel.C (6.0 KB)

Hi,
sorry for the delay… I made a couple of small modifications to the program:

  • initialising MediumMagboltz (i.e. calculating the cross-section table) upfront,
  • also moving DriftLineRKF inside the loop.

I did a quick test and it seems to run without crashing now. Can you give it a try? The structure is roughly as sketched below.
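
This is a sketch only; the attached file is the actual program, and the variable and object names here are assumptions based on your earlier snippets:

// Build the electron cross-section table once, before the parallel loop,
// so that the shared MediumMagboltz object is not modified concurrently.
MediumMagboltz gas;
// ... set the gas composition, temperature, pressure, etc. ...
gas.Initialise(true);

// ...

#pragma omp parallel for
for (int k = 0; k < index; ++k) {
  // One AvalancheMicroscopic and one DriftLineRKF per iteration.
  AvalancheMicroscopic aval;
  aval.SetSensor(&sensor);
  DriftLineRKF drift;
  drift.SetSensor(&sensor);
  aval.AvalancheElectron(electrons.at(k*8 + 0), electrons.at(k*8 + 1),
                         electrons.at(k*8 + 2), electrons.at(k*8 + 3),
                         0.1, 0., 0., 0.);
  const int np = aval.GetNumberOfElectronEndpoints();
  for (int j = 0; j < np; ++j) {
    double xe1, ye1, ze1, te1, e1, xe2, ye2, ze2, te2, e2;
    int status;
    aval.GetElectronEndpoint(j, xe1, ye1, ze1, te1, e1,
                                xe2, ye2, ze2, te2, e2, status);
    drift.DriftIon(xe1, ye1, ze1, te1);
  }
}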

parallel.C (6.0 KB)

Hi,

Thank you so much! You seem to have fixed the issues. I have run the program several times with different kinetic energies and so far have not encountered any of the previous errors/warnings. I also noticed that all threads were used throughout the execution, unlike previously!

I have just one more follow-up question. With the way driftline and aval are now defined inside the for loop, how should I call ViewSignal, ViewCell, and ViewDrift?

Hi,
for ViewSignal and ViewCell you shouldn’t need to do anything special.
To plot the electron and ion drift lines, you would need to create a ViewDrift object outside the loop, and then switch on plotting for each of the AvalancheMicroscopic and DriftLineRKF objects inside the loop, something like this

ViewDrift driftView;
// ...
#pragma omp parallel for
for (int k = 0; k < index; k++) {
  AvalancheMicroscopic aval;
  aval.SetSensor(&sensor);
  aval.EnableSignalCalculation();
  aval.EnablePlotting(&driftView);
  // ...
  DriftLineRKF drift;
  drift.SetSensor(&sensor);
  drift.EnableSignalCalculation();
  drift.EnablePlotting(&driftView);
}

driftView.Plot();

Plotting the drift lines can be quite memory-hungry though and might slow down your program a lot, so I would only do it for debugging or illustration purposes.

PS: I noticed that you are simulating a 100 TeV proton. Out of curiosity: what’s the application?

Hi,

I am not planning to make plots every time I run the program. As you mentioned, I usually plot only when I need a “picture” for a presentation or to make sure that I placed the track where I actually want it. I tried what you suggested and everything seems fine for plotting the drift lines.

Regarding the signals, I am still having some issues. Currently, this is how I am plotting the signals:

// Plot the induced current.
  ViewSignal signalView_left;
  signalView_left.SetSensor(&sensor);
  TCanvas c1("c1", "", 800, 600);
  signalView_left.SetCanvas(&c1);
  signalView_left.SetLabelY("left anode raw [fC]");
  signalView_left.PlotSignal("al");

  ViewSignal signalView_right;
  signalView_right.SetSensor(&sensor);
  TCanvas c2("c2", "", 800, 600);
  signalView_right.SetCanvas(&c2);
  signalView_right.SetLabelY("right anode raw [fC]");
  signalView_right.PlotSignal("ar");

  // Convolute with the transfer function and plot again.
  sensor.SetTransferFunction(transfer);
  constexpr bool fftl = true;
  sensor.ConvoluteSignals(fftl);
  TCanvas c7("c7", "", 800, 600);
  signalView_left.SetCanvas(&c7);
  signalView_left.SetLabelY("convolute signal left [mV]");
  signalView_left.PlotSignal("al");

  sensor.SetTransferFunction(transfer);
  constexpr bool fftr = true;
  sensor.ConvoluteSignals(fftr);
  TCanvas c8("c8", "", 800, 600);
  signalView_right.SetCanvas(&c8);
  signalView_right.SetLabelY("convolute signal right [mV]");
  signalView_right.PlotSignal("ar");

For some reason, the plots are just showing up empty (i.e. no signals shown inside the graphs).

Also, I am not using 100 TeV for my application. I set the kinetic energy that high so that only a few electrons would get created, which speeds up the execution. For what I am doing (charged particle detection), the particles usually have energies in the MeV range.

These are the warnings that I get when I try to plot:

Warning in <TCanvas::ResizePad>: Inf/NaN propagated to the pad. Check drawn objects.
Warning in <TCanvas::ResizePad>: c1 height changed from 0 to 10

Warning in <TCanvas::ResizePad>: Inf/NaN propagated to the pad. Check drawn objects.
Warning in <TCanvas::ResizePad>: c1 height changed from 0 to 10

Warning in <TCanvas::ResizePad>: Inf/NaN propagated to the pad. Check drawn objects.
Warning in <TCanvas::ResizePad>: c2 height changed from 0 to 10

Warning in <TCanvas::ResizePad>: Inf/NaN propagated to the pad. Check drawn objects.
Warning in <TCanvas::ResizePad>: c2 height changed from 0 to 10

Warning in <TCanvas::ResizePad>: Inf/NaN propagated to the pad. Check drawn objects.
Warning in <TCanvas::ResizePad>: c7 height changed from 0 to 10

Warning in <TCanvas::ResizePad>: Inf/NaN propagated to the pad. Check drawn objects.
Warning in <TCanvas::ResizePad>: c7 height changed from 0 to 10

Warning in <TCanvas::ResizePad>: Inf/NaN propagated to the pad. Check drawn objects.
Warning in <TCanvas::ResizePad>: c8 height changed from 0 to 10

Warning in <TCanvas::ResizePad>: Inf/NaN propagated to the pad. Check drawn objects.
Warning in <TCanvas::ResizePad>: c1 height changed from 0 to 10

Warning in <TCanvas::ResizePad>: Inf/NaN propagated to the pad. Check drawn objects.
Warning in <TCanvas::ResizePad>: c2 height changed from 0 to 10

Warning in <TCanvas::ResizePad>: Inf/NaN propagated to the pad. Check drawn objects.
Warning in <TCanvas::ResizePad>: c7 height changed from 0 to 10

Warning in <TCanvas::ResizePad>: Inf/NaN propagated to the pad. Check drawn objects.
Warning in <TCanvas::ResizePad>: c8 height changed from 0 to 10