Hello,
I am having difficulties when multiple processes try to write to the same ROOT file. I have an MPI application that spawns N processes, each of which at some point wants to write some events to a TTree in that file.
I made a small reproducer of what I’m trying to achieve (see the end of this post).
First attempt
My first attempt was to (naively) use the default TFile::Write() method. This produced the following errors:
Error in <TFile::ReadBuffer>: error reading all requested bytes from file test.root, got 63 of 300
Error in <TFile::Init>: test.root failed to read the file type data.
Warning in <TFile::Write>: file test.root not opened in write mode
I assume this is the result of two processes trying to access the same file at the same time.
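For comparison, one way to sidestep the concurrent access entirely would be to give each rank its own output file and merge the files afterwards. Below is a minimal sketch of the file naming only, assuming a hypothetical helper perRankFileName (my own name for illustration, not part of ROOT or MPI):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of ROOT or MPI): derive a per-rank file name
// such as "test_rank3.root" from "test.root", so that no two ranks share a file.
std::string perRankFileName(const std::string &base, int rank) {
    assert(rank >= 0 && "MPI ranks are non-negative");
    const std::string suffix = "_rank" + std::to_string(rank);
    // Only a dot that comes after the last '/' counts as the start of the extension.
    const std::string::size_type slash = base.find_last_of('/');
    const std::string::size_type dot = base.find_last_of('.');
    if (dot == std::string::npos || (slash != std::string::npos && dot < slash)) {
        return base + suffix;  // no extension: just append the suffix
    }
    // Insert the suffix between the stem and the extension.
    return base.substr(0, dot) + suffix + base.substr(dot);
}
```

Each rank would then open something like TFile(perRankFileName("test.root", rank).c_str(), "RECREATE"), and the per-rank files could be combined afterwards with ROOT's hadd tool, e.g. hadd test.root test_rank*.root. This avoids the shared file rather than solving the parallel write, so it is only a fallback.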
Second attempt
In my second attempt, I tried using TParallelMergingFile, which according to the TBufferMerger reference should be able to “write data in parallel to a single output file … using processes that connect to a network socket”. As you can see in the reproducer below, I constructed it directly with a pmerge URL and an open option. With the “UPDATE” option I get the errors:
Error in <TMemFile::Init>: test.root not a ROOT file
Error in <TMemFile::Init>: test.root not a ROOT file
Warning in <TParallelMergingFile::Write>: file test.root not opened in write mode
Warning in <TParallelMergingFile::Write>: file test.root not opened in write mode
Using the “NEW” or “RECREATE” option, I get the errors:
SysError in <TUnixSystem::UnixTcpConnect>: connect (localhost:11111) (Connection refused)
Error in <TParallelMergingFile::UploadAndReset>: Could not contact the server localhost:11111
SysError in <TUnixSystem::UnixTcpConnect>: connect (localhost:11111) (Connection refused)
Error in <TParallelMergingFile::UploadAndReset>: Could not contact the server localhost:11111
I am guessing that I’m missing the step of setting up some sort of server that takes care of the merging. If that is the case, is there an example of how to do so?
Reproducer
#include "TFile.h"
#include "TObject.h"
#include "TParallelMergingFile.h"
#include "TTree.h"
#include <iostream>
#include "mpi.h"
int main(int argc, char **argv) {
    if (argc != 2) {
        std::cout << "./main <an_integer>" << std::endl;
        return 1;
    }
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    TObject obj;
    obj.SetUniqueID(rank);
    // TFile tfile("test.root", "UPDATE");
    TParallelMergingFile tfile("test.root?pmerge=localhost:11111", "UPDATE");
    TTree *tree = static_cast<TTree *>(tfile.Get("someTree"));
    // If tree doesn't exist yet; make a new one
    if (!tree) {
        tree = new TTree("someTree", "");
        tree->Branch("someBranch", &obj);
    } else { // else we modify the existing one
        TObject *obj_ptr = &obj;
        tree->SetBranchAddress("someBranch", &obj_ptr);
    }
    tree->Fill();
    tfile.Write();
    MPI_Finalize();
    return 0;
}
Any help is welcome!
ROOT Version: 6.13/08
Platform: Ubuntu
Compiler: mpic++ (relies on g++ 5.5.0)