Hadd too slow

Server Info:
ROOT 6.08
gcc 4.9.4

I run my Geant4 simulation with MPI in multithreading mode, and it is pretty quick. Since I could not figure out how to merge the ROOT files from the different ranks inside Geant4, I save all the ROOT files and merge them with hadd in a script.
However, the merge speed is very disappointing.

Geant4:

  1. Less than 1 minute to generate all the files. Each file is about 10M; 16 files in total.

hadd:

  2. 15 minutes to merge all these files into one (199M).

The structure of the ROOT file:

    T/
    Generator/*
    Geometry/*
    Detector/*

I tried to run hadd with mpirun, but that caused an error.

Is there any way to speed up the hadd process? Thanks a lot.

Hi, it would be interesting to debug why hadd is so slow. Perhaps you can make some of your files available in a publicly reachable place.

Meanwhile, I suggest you fill the histograms in parallel (sort of), so that adding them afterwards is not a problem. Have a look at the tutorial root.cern.ch/doc/v608/mt201__pa … ll_8C.html
It uses ROOT::TThreadedObject to wrap a histogram per thread, so that they can be filled in parallel.
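
To illustrate the idea only, here is a minimal sketch of the "one histogram per thread, merge at the end" pattern in plain C++11. It does not use ROOT at all: the `Histogram` struct and the `fill_in_parallel` function are hypothetical stand-ins I made up for this post, not the ROOT::TThreadedObject API, so please take the tutorial as the reference for the real calls.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Plain fixed-width histogram; a hypothetical stand-in for a ROOT TH1.
struct Histogram {
    double lo, hi;
    std::vector<long> bins;

    Histogram(std::size_t n_bins, double lo_, double hi_)
        : lo(lo_), hi(hi_), bins(n_bins, 0) {
        assert(n_bins > 0 && hi_ > lo_);
    }

    void fill(double x) {
        if (x < lo || x >= hi) return;  // ignore out-of-range values
        std::size_t i =
            static_cast<std::size_t>((x - lo) / (hi - lo) * bins.size());
        if (i >= bins.size()) i = bins.size() - 1;  // guard against rounding
        ++bins[i];
    }

    // Bin-by-bin sum of another histogram with the same binning.
    void add(const Histogram& other) {
        assert(bins.size() == other.bins.size());
        for (std::size_t i = 0; i < bins.size(); ++i) bins[i] += other.bins[i];
    }
};

// Each worker fills its own private histogram, so no locking is needed.
// The per-thread copies are summed once, after all threads have joined.
Histogram fill_in_parallel(
    const std::vector<std::vector<double>>& per_thread_data,
    std::size_t n_bins, double lo, double hi) {
    std::vector<Histogram> locals(per_thread_data.size(),
                                  Histogram(n_bins, lo, hi));
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < per_thread_data.size(); ++t) {
        workers.emplace_back([&locals, &per_thread_data, t] {
            for (double x : per_thread_data[t]) locals[t].fill(x);
        });
    }
    for (auto& w : workers) w.join();

    Histogram merged(n_bins, lo, hi);
    for (const auto& h : locals) merged.add(h);
    return merged;
}
```

Calling `fill_in_parallel(data, 4, 0.0, 4.0)` with one vector of values per thread returns a single merged histogram, so only one object has to be written to the output file. Depending on the platform you may need to compile with `-pthread`.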

Cheers,

Pere

Thank you for your reply.

I will definitely look at the code and write my own to merge the files.

Thanks again.

Hi,

How many histograms do you have in each file?

Cheers,
Philippe.

Problem fixed. I had added the -O option (to optimize the output) because, for some reason, the ROOT files written by Geant4 caused problems when I merged them with hadd without -O. I have now rewritten the ROOT output part of my Geant4 code and the files are fine, so I can drop the -O option. hadd is now reasonably fast.

Hi,

The -O option asks hadd to take the ‘slow’ path to copy the TTree’s data, which would indeed explain the lower performance. There was a recent bug fix in Geant4 that resolved a problem where Geant4’s own re-implementation of ROOT I/O produced files that could not be fast-copied if the resulting output file was larger than 2 GB.

Cheers,
Philippe.