Hello!
I’m building a Singularity image with ROOT and my app for an HPC environment for the first time. However, the ready-made Docker images seem pretty huge to me (~1 GB), so I build ROOT from source with a command like this:
The resulting image is around 400 MB. Can I shrink it further, for example by deleting the tutorials? Or is it ok in general to send such large images to an HPC cluster?
First of all, be very careful with this option: -Dminuit2_omp=ON.
This one (and also minuit2_mpi) results in Minuit 2 only being able to minimize thread-safe functions. That requirement is not met by RooFit, for example, or even by the regular TH1::Fit(), I think. So we don’t encourage using this option, because it breaks the rest of ROOT. It only makes sense when building Minuit 2 standalone.
To reduce the size further, did you already try -DCMAKE_BUILD_TYPE=MinSizeRel? This build type is using the -Os optimization flag, which is like the standard -O2 except that it omits optimizations that often increase the code size.
With this, I get a build of around 330 MB. Is that acceptable? Deleting small things like the tutorials does not help much. Other ideas: maybe also disable tmva. Do you need it? It takes up quite a few MB. And can you work with compression? There are some huge files, in particular etc/allDict.cxx.pch, which compresses from 90 to 50 MB with gzip.
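To see where the megabytes actually go before deciding what to disable or compress, a small script can list the largest files in the install tree and estimate how well each one compresses with gzip. This is only a sketch: the helper names `largest_files`, `gzip_size` and `report` are made up for illustration, and the install path you pass in depends on your own build.

```python
import gzip
import os
import shutil
import tempfile


def largest_files(root, n=10):
    """Return the n largest regular files under root as (size_bytes, path), biggest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so that shared libraries are not counted twice.
            if os.path.isfile(path) and not os.path.islink(path):
                try:
                    sizes.append((os.path.getsize(path), path))
                except OSError:
                    continue  # unreadable file: ignore it
    sizes.sort(reverse=True)
    return sizes[:n]


def gzip_size(path, level=6):
    """Return the size in bytes that the file at path would have after gzip compression."""
    with tempfile.TemporaryFile() as tmp:
        with open(path, "rb") as src, gzip.GzipFile(
            fileobj=tmp, mode="wb", compresslevel=level
        ) as dst:
            shutil.copyfileobj(src, dst)
        # The gzip stream is closed here, so tell() is the full compressed size.
        return tmp.tell()


def report(root, n=10):
    """Return one formatted line per large file: raw size, gzipped size, path."""
    lines = []
    for size, path in largest_files(root, n):
        lines.append(
            f"{size / 1e6:8.1f} MB -> {gzip_size(path) / 1e6:8.1f} MB gzipped  {path}"
        )
    return lines
```

Calling `print("\n".join(report(install_dir)))` with `install_dir` set to your ROOT install prefix should show whether files such as etc/allDict.cxx.pch dominate the image, and whether compressing them is worth the effort.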
Or is it ok in general to send such large images to an HPC cluster?
I don’t know, the ROOT forum is not the right place to ask this. Better ask the administrators of your cluster.
Thank you for the warning! I’ll take this into account.
Yes, it is included intentionally.
As for the tutorials, as far as I remember they are actually about 15 MB, so I think their removal is low-hanging fruit with little risk.
I’ve been thinking that the likely size gain from the optimization flag and compression may not be worth the performance loss. Maybe it would be simpler to send a compressed image.
I decided not to research this further and just tested the working image. I realized that downloading a data sample (500-1500 MB) to a cluster node will have a greater impact on the processing speed than the deployment time (edit: and the working image is cached on the node). Of course, I should have noticed this earlier instead of optimizing prematurely. Anyway, I think a ROOT image with the minimum required dependencies won’t hurt.
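A back-of-the-envelope check makes the comparison concrete. The sizes below are the ones quoted in this thread; the 100 MB/s bandwidth is purely an assumed figure for illustration, and `transfer_seconds` is a made-up helper, so substitute the numbers for your own cluster.

```python
def transfer_seconds(size_mb, bandwidth_mb_per_s):
    """Time in seconds to move size_mb megabytes at a constant bandwidth."""
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_mb / bandwidth_mb_per_s


# Assumption: a constant 100 MB/s link to the node (hypothetical value).
BANDWIDTH_MB_PER_S = 100.0

# Image sizes from the thread: default build vs. MinSizeRel build.
saved_by_smaller_image = (
    transfer_seconds(400, BANDWIDTH_MB_PER_S)
    - transfer_seconds(330, BANDWIDTH_MB_PER_S)
)

# Data sample sizes from the thread, downloaded for every job.
data_min = transfer_seconds(500, BANDWIDTH_MB_PER_S)
data_max = transfer_seconds(1500, BANDWIDTH_MB_PER_S)

print(f"saved once by the smaller image: {saved_by_smaller_image:.1f} s")
print(f"data download per job: {data_min:.1f} s to {data_max:.1f} s")
```

Under these assumptions the smaller image saves well under a second, and only on the first deployment if the image is cached, while the data download costs several seconds per job.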
Apparently, other ROOT build options require some packages at build time and at run time; which ones exactly is a subject for further study.
If the size of the image doesn’t matter much, it’s probably better to stick to the ROOT Docker repo, or to do a one-stage build, remove only the build tools and artifacts afterwards, and not worry about transferring runtime dependencies to the minimal image.