Send TFile content with MPI

Dear Rooters,

I am currently developing a code parallelized with MPI. Each MPI process (there are thousands of them) has to read ROOT files (TFile) during its initialization step, so all MPI processes read the ROOT files at the same time. This method does not scale at all and results in very different initialization times for each process. Furthermore, it introduces desynchronization between processes.

One way to tackle this problem would be for each process to read a subset of the ROOT files and then broadcast them to the others.

Hence, my question is: is there a way to send a TFile object via MPI communication, using Streamers for example?

Thanks in advance,

Emeric

Hi Emeric,

How big is the TTree and how big is the fraction of data read; how many events are we talking about? Does that data change or is it fixed “at build time”?

Axel.

Hi Axel,

It is not a TTree object that is serialized but my own classes. All the data in the file is read. The size of each ROOT file is about 20 MB. The data is fixed at build time and is only read, not modified.

Emeric

Hi,

Why don’t you have the same problem with the root binary itself and its libraries? (I am still making sure I understand enough context to be able to give you a reasonable suggestion…)

Cheers, Axel.

Hi,

More precisely, my objective is to read a TFile from a master process, get a serialized buffer of its whole content together with its size, and then send it to the other processes via an MPI_Send call.

Something like this, but I am really not sure about the buffer part:

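TFile f("file.root");

// Serializing requires a buffer in write mode
TBufferFile buf(TBuffer::kWrite);
f.GetList()->Streamer(buf);

// Send the size (number of bytes actually written to the buffer)
int buffer_size = buf.Length();
MPI_Send(&buffer_size,1,MPI_INT,1,0,MPI_COMM_WORLD);

// Send the buffer (tag 0, to match the MPI_Recv calls on the receiving side)
MPI_Send(buf.Buffer(),buffer_size,MPI_BYTE,1,0,MPI_COMM_WORLD);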

On the slave side:


int buffer_size;
MPI_Recv(&buffer_size,1,MPI_INT,0,0,MPI_COMM_WORLD,MPI_STATUS_IGNORE);

char* buffer = new char[buffer_size];
MPI_Recv(buffer,buffer_size,MPI_BYTE,0,0,MPI_COMM_WORLD,MPI_STATUS_IGNORE);

// Then reconstruct the TFile content from the buffer
// Is it possible?

I am not sure this is the right way to do it.

Thanks a lot

Cheers, Emeric

[quote]f.GetList()->Streamer(buf);[/quote]By default this list is empty. It only contains the objects that have been explicitly read from the TFile.

It sounds like you are trying to pass through the entire file (almost) directly from the disk to the MPI node, is that correct?

In any case, to read the data that you did store and pass through, you would use:

TBufferFile buf(TBuffer::kRead, buffer_size, buffer);
buf.SetReadMode();
TList *lst = (TList *)buf.ReadObjectAny(TList::Class());

If instead you wanted to pass the whole file (and use something like fread to load it all in memory), on the receiving end you would use TMemFile:

TFile *file = new TMemFile("file.root", buffer, buffer_size);
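As a rough sketch of the "load it all in memory" step for the whole-file approach, the reading process could pull the file into a byte vector with the standard library alone. The helper name read_file_bytes is hypothetical (it is not part of ROOT or MPI), and the MPI and ROOT steps appear only as comments because they depend on the local setup.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not a ROOT or MPI function):
// load a whole file into memory as raw bytes.
std::vector<char> read_file_bytes(const std::string& path) {
    // Open in binary mode, positioned at the end so tellg() gives the size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) throw std::runtime_error("cannot open " + path);

    const std::streamsize size = in.tellg();
    if (size < 0) throw std::runtime_error("cannot get size of " + path);

    std::vector<char> bytes(static_cast<std::size_t>(size));
    in.seekg(0, std::ios::beg);
    if (size > 0 && !in.read(bytes.data(), size))
        throw std::runtime_error("short read on " + path);

    // Invariant: the vector holds exactly the file's bytes.
    assert(bytes.size() == static_cast<std::size_t>(size));
    return bytes;
}

// Assumed usage, not compiled here since it needs MPI and ROOT:
//   reading rank: std::vector<char> bytes = read_file_bytes("file.root");
//   all ranks:    transfer the size first, resize the vector on the receivers,
//                 then transfer bytes.data() as MPI_BYTE (MPI_Send/MPI_Recv
//                 as in the posts above, or a broadcast)
//   all ranks:    TFile *file = new TMemFile("file.root", bytes.data(), bytes.size());
```

MPI counts are of type int, so a file of about 20 MB fits in a single message; much larger files would need to be sent in chunks.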

Also note that for passing buffers around, TMessage has some extra features compared to TBuffer.

Cheers,
Philippe.
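
The scheme from the first post, where each process reads a subset of the files and broadcasts them to the others, needs a rule for deciding which rank reads which file. Below is a minimal sketch of one possible rule (round-robin), assuming the files are identified by an index known to all ranks. The names reader_rank and files_for_rank are hypothetical, and nothing in the thread prescribes this particular assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: rank responsible for reading file `file_index`
// when files are dealt round-robin over the first `n_readers` ranks.
int reader_rank(std::size_t file_index, int n_readers) {
    assert(n_readers > 0);
    return static_cast<int>(file_index % static_cast<std::size_t>(n_readers));
}

// Indices of the files that `rank` has to read from disk itself.
std::vector<std::size_t> files_for_rank(int rank, std::size_t n_files, int n_readers) {
    std::vector<std::size_t> mine;
    for (std::size_t i = 0; i < n_files; ++i) {
        if (reader_rank(i, n_readers) == rank) mine.push_back(i);
    }
    return mine;
}

// Assumed usage, not compiled here since it needs MPI:
//   for each file i, every rank takes part in a broadcast whose root is
//   reader_rank(i, n_readers); only that root touches the disk for file i.
```

With 6 files and 4 reader ranks, for example, rank 1 reads files 1 and 5, and each file is read from disk exactly once instead of once per process.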