Issue merging large number of gas files

Hello everyone (and most importantly, hello @hschindl!),

I am currently working on a tool to generate/read/merge gas files (any feedback on this is very welcome!) and I am facing some issues when merging a large number of gas files.

In particular, in order to simulate a mixture for a large number of electric field points (n), I have split the simulation into n jobs, each with a single electric field point (please let me know if there is any potential issue with this approach!). Afterwards, my idea is to merge the files into a single one, in a similar fashion to this example.

The following code attempts to do this:

// main.cpp
#include <algorithm>
#include <filesystem>
#include <iostream>
#include <string>
#include <vector>

#include "Garfield/MediumMagboltz.hh"

using namespace std;
using namespace Garfield;

std::vector<double> GetElectricFieldValues(MediumMagboltz& gas) {
    vector<double> electricField, magneticField, angle;
    gas.GetFieldGrid(electricField, magneticField, angle);
    // sort electric field in case it's not ordered
    sort(electricField.begin(), electricField.end());
    return electricField;
}

int main() {
    string path = "/root/C4H10-10.0-Ar/";

    string mergeOutput = "/tmp/merge.gas";

    vector<string> files;
    for (const auto& entry: std::filesystem::directory_iterator(path)) {
        if (entry.path().extension() != ".gas") {
            continue;
        }
        // convert explicitly so this also compiles where path is not a narrow string
        files.emplace_back(entry.path().string());
    }
    cout << "number of gas files: " << files.size() << endl;

    // i + 1 < size() avoids unsigned underflow when no gas files are found
    for (size_t i = 0; i + 1 < files.size(); i++) {
        MediumMagboltz gas;

        if (i == 0) {
            // in the first iteration the merge file does not exist yet
            gas.LoadGasFile(files[0]);
        } else {
            gas.LoadGasFile(mergeOutput);
        }

        cout << "Electric field values for base gas file: " << files[i];
        for (const auto& value: GetElectricFieldValues(gas)) {
            cout << " " << value;
        }
        cout << endl;

        {
            MediumMagboltz gasToMerge;
            gasToMerge.LoadGasFile(files[i + 1]);
            cout << "Electric field values for gas file to merge: " << files[i + 1];
            for (const auto& value: GetElectricFieldValues(gasToMerge)) {
                cout << " " << value;
            }
            cout << endl;
        }

        constexpr bool replaceOld = false;
        gas.MergeGasFile(files[i + 1], replaceOld);
        gas.WriteGasFile(mergeOutput);
    }

    MediumMagboltz gas;
    gas.LoadGasFile(mergeOutput);
    cout << "Electric field values for merged gas file: " << mergeOutput;
    for (const auto& value: GetElectricFieldValues(gas)) {
        cout << " " << value;
    }
    cout << endl;

    cout << "Number of electric field values: " << GetElectricFieldValues(gas).size() << endl;

    return 0;
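}

# CMakeLists.txt
cmake_minimum_required(VERSION 3.12)
project(demo)
find_package(Garfield REQUIRED)
add_executable(demo main.cpp)
# std::filesystem requires C++17
target_compile_features(demo PUBLIC cxx_std_17)
target_link_libraries(demo PUBLIC Garfield::Garfield)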

The code prints the electric field values present in each file, which match the substring in the filename.

The issue is that some electric field values are missing from the merged file. For some iterations there are warnings (e.g. "Keeping existing data for E = 4445 V/cm, not using data from the file.") saying that the electric field value is already present in the file (but this is not the case!). Whenever the warning appears, no new electric field value is added.

I attach the same gas files I used for testing; perhaps there is a problem with the files?

gas.tar.gz (451.5 KB)

Possibly I am overlooking something very fundamental (maybe it makes no sense to simulate a single electric field point?), but I am a bit lost. Any help is welcome.

Thanks!
Luis

I managed to get around this by spacing the electric field points further apart (i.e. by 10 V/cm instead of 5 V/cm), but I am still curious as to why this happens.

Hi,
sorry for my late reply! The function MergeGasFile skips electric field values that are “similar” to existing ones in the table (this is the similarity check, with eps = 1.e-3). If you want finer spacing, you could, in principle, reduce the tolerance parameter. On the other hand, too fine a spacing is often overkill. You want to make sure that the grid is fine enough to accurately reproduce the shape of the drift velocity, diffusion coefficients, Townsend coefficient, etc. as a function of the electric field. But I don’t think you need 5 - 10 V/cm spacing.
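
To illustrate how a relative tolerance of 1.e-3 can swallow points spaced 5 V/cm apart around 4.4 kV/cm but not points spaced 10 V/cm apart, here is a minimal standalone sketch. It assumes the check has the form |a - b| < eps * (|a| + |b|); the exact expression inside MergeGasFile may differ, and Similar and CountSkipped are hypothetical helper names, not Garfield++ functions. It also assumes the points are merged in ascending order.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the similarity check in MergeGasFile.
// Assumed form: |a - b| < eps * (|a| + |b|); the real expression may differ.
bool Similar(const double a, const double b, const double eps = 1.e-3) {
  assert(eps > 0.);
  return std::abs(a - b) < eps * (std::abs(a) + std::abs(b));
}

// Count how many points of a sorted grid would be skipped because they are
// "similar" to the last point that was kept (ascending merge order assumed).
std::size_t CountSkipped(const std::vector<double>& sortedFields,
                         const double eps = 1.e-3) {
  std::size_t skipped = 0;
  if (sortedFields.empty()) return skipped;
  double lastKept = sortedFields.front();
  for (std::size_t i = 1; i < sortedFields.size(); ++i) {
    if (Similar(lastKept, sortedFields[i], eps)) {
      ++skipped;  // treated as already present, so not added
    } else {
      lastKept = sortedFields[i];
    }
  }
  return skipped;
}
```

Under this assumption, 4445 and 4450 V/cm count as the same point (5 < 8.895), whereas 4440 and 4450 V/cm do not (10 > 8.89). At 100 V/cm even a 5 V/cm step would be kept, so a fixed absolute spacing only runs into the tolerance at the higher fields.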

Just a minor comment unrelated to your question: you could skip the intermediate WriteGasFile steps:

MediumMagboltz gas;
gas.LoadGasFile(files[0]);
for (size_t i = 1; i < files.size(); i++) {
  constexpr bool replaceOld = false;
  gas.MergeGasFile(files[i], replaceOld);
}
gas.WriteGasFile("merged.gas"); 

Thanks! I figured as much but I just wanted to make sure this was not the issue for the question at hand. Not writing the file does indeed make a huge difference in running time.

Thanks a lot for your reply!

I figured the spacing was overkill, especially for the higher voltages. It’s great to know the specific piece of code that controls this, thanks!

One of the reasons I was trying different (small) spacings is to figure out why the diffusion coefficients I calculate appear as noisy as they do (see image). You don’t notice this for large spacings, and I had assumed the coefficients were being calculated very accurately. I guess the simulation does not have enough statistics and some parameter needs to be increased, but I haven’t figured out which one yet (the number of collisions I use is 10; increasing it did not seem to solve it, but I will check again). The values currently offered are probably good enough for my needs, but I am planning on building a database of gas files, so I would like to have as much accuracy as possible, and I don’t mind the extra computing time.

Thanks again,
Luis
