Distributing ROOT binaries with my program

Good catch, the referenced link literally points to a README saying:

libCore does a dlopen of libCling

…so that resolves that mystery.

And I would suppose that libCore cannot possibly run without libCling, right? I found this relevant thread on that topic, which seems to suggest so. This is a shame, given that I really only need to be able to read/write TTrees from/to ROOT files in my C++ program. I was hoping there would be a lightweight approach that I could adopt. Anyway, I will assume for now that libCling is required for my program even though I don’t really need to use it. With that in mind, I still do not understand why cling looks for headers, and why it does not honor ROOTSYS.

What follows is just a quick recap of what my program does. The main executable does not link against ROOT. Instead it just modifies the environment for ROOT, and passes control to another function in another library, which is linked against ROOT. Modelling the behavior of thisroot.sh, my main() function currently sets the ROOTSYS env. variable to the path where ROOT is distributed (contains lib, etc, icons and fonts; on Windows it contains bin instead of lib). And then on POSIX systems it appends $ROOTSYS/lib to the following env. variables:

  • LD_LIBRARY_PATH
  • DYLD_LIBRARY_PATH
  • SHLIB_LIBRARY_PATH
  • LIBPATH

Conversely, on Windows systems it adds $ROOTSYS/bin to PATH (with semicolon as delimiter) and sets CLING_STANDARD_PCH=none.
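For reference, the POSIX half of the setup described above could look roughly like the sketch below. The helper names (append_to_path_variable, prepare_root_environment) are made up for illustration and are not part of ROOT; the list of variables is the one given above, and the Windows branch (PATH with a semicolon delimiter, CLING_STANDARD_PCH) is left out.

```cpp
#include <cassert>
#include <cstdlib>
#include <initializer_list>
#include <stdlib.h>  // setenv (POSIX)
#include <string>

// Hypothetical helper: append `dir` to the colon-separated list stored in
// the environment variable `name`. Returns true on success.
bool append_to_path_variable(const char* name, const std::string& dir) {
    assert(name != nullptr);
    const char* current = std::getenv(name);
    const std::string value =
        (current && *current) ? std::string(current) + ":" + dir : dir;
    return setenv(name, value.c_str(), /*overwrite=*/1) == 0;
}

// Hypothetical helper: set ROOTSYS and extend the library search variables
// for a ROOT tree located at `rootsys` (POSIX only).
bool prepare_root_environment(const std::string& rootsys) {
    if (setenv("ROOTSYS", rootsys.c_str(), 1) != 0) return false;
    const std::string libdir = rootsys + "/lib";
    for (const char* name : {"LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH",
                             "SHLIB_LIBRARY_PATH", "LIBPATH"}) {
        if (!append_to_path_variable(name, libdir)) return false;
    }
    return true;
}
```

Note that appending places the bundled directory last in the search order; if the bundled ROOT should win over a system-wide copy, the directory would have to be prepended instead.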

After the environment is appropriately modified, ROOT should have no difficulties running. So my main() simply loads the rest of my software from a dynamic library that directly depends on ROOT, which lets the dynamic linker resolve the rest of dependencies. All loaded modules inherit this augmented environment – with strace I can verify that this is the case.

I am failing to understand:

  • Why does cling look for headers when there are pre-compiled headers, dictionaries and module maps in $ROOTSYS/etc?
  • Why does cling look for headers in /boot/root/src/build/include? Is that some kind of fallback path from CI? Or is there an empty env. variable, which I was supposed to configure?

My digging led me to this function in ROOT source. Is this what is responsible for all those headers being probed on load?

Exactly. Probably not the best design choice.

Maybe a bug. @bellenot or @jonas might know a better answer.
Maybe it’s related to pull request #8709 in root-project/root on GitHub (“[CMake] Move towards target-based CMake and partly fix picking up headers from an installed ROOT” by hageboeck) and https://its.cern.ch/jira/browse/ROOT-7580

I have another piece of the puzzle. It seems that even though $ROOTSYS/etc/allDict.cxx.pch exists at the time of loading libCore.so, it is ignored and my system’s /etc/root/allDict.cxx.pch is used instead. I suspect that my issue is in part caused by ROOT mistakenly mixing files from my system’s installation and my program’s distributed files.

I am aware that the ROOT_PCH variable exists, but I would like to avoid using it. By default TROOT loads its pre-compiled header from the etc directory, which appears to be detected wrongly. I am currently trying to identify and fix the underlying cause. Any ideas/suggestions would be appreciated.

Not quite. This is a necessary choice: it avoids leaking the LLVM symbols through linking with libCore, and it keeps ROOT’s version of LLVM from clashing with any other version, whether used directly by the user or by a third-party library (like OpenGL).

I think that my problem may be caused by the ROOTSYS variable being ignored because the ROOTPREFIX pre-processor macro was defined when my ROOT package was built. This would explain everything: in the implementation of GetRootsys() this suppresses any externally supplied value, making it impossible for my software to override the default search path and point ROOT to the correct directory.
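To make the suspected precedence concrete, here is a minimal sketch of the pattern I have in mind. It is not ROOT’s actual GetRootsys() code: EXAMPLE_ROOTPREFIX, kCompiledPrefix and resolve_rootsys are hypothetical names standing in for the ROOTPREFIX mechanism, and the real implementation may contain additional logic.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for ROOTPREFIX: compile with
// -DEXAMPLE_ROOTPREFIX='"/usr"' to simulate a "fixed" (gnuinstall-like) build.
#ifdef EXAMPLE_ROOTPREFIX
constexpr const char* kCompiledPrefix = EXAMPLE_ROOTPREFIX;
#else
constexpr const char* kCompiledPrefix = nullptr;
#endif

// If a prefix was baked in at build time it wins and the environment is
// never consulted; otherwise the ROOTSYS variable is used.
std::string resolve_rootsys(const char* compiled_prefix = kCompiledPrefix) {
    if (compiled_prefix != nullptr) return compiled_prefix;
    const char* env = std::getenv("ROOTSYS");
    return env ? env : "";
}
```

With a baked-in prefix, resolve_rootsys("/usr") returns "/usr" regardless of what ROOTSYS is set to, which is the behavior I am suspecting here.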

According to my (at this point forensic) search, ROOTPREFIX is defined only when the R__HAVE_CONFIG cmake variable is ON. And exploring further, R__HAVE_CONFIG=ON is only possible if the gnuinstall option is enabled. This is really curious because according to documentation this variable controls whether a ROOT package is “fixed” or “portable” and is disabled by default. When I built my ROOT package, I used the following cmake configuration:

-G Ninja 
-D minimal=ON 
-D runtime_cxxmodules=OFF 
-D CMAKE_INSTALL_PREFIX=/opt/root

Having just reviewed the logs from the build, I see no indication that gnuinstall=ON, and root-config --config (executed within the built environment) does not list it. I will now try rebuilding my ROOT with an explicit -Dgnuinstall=OFF and report back what I see.

I think that my -Dgnuinstall=OFF attempt will not help. I just ran a small-scale test, and these are the contents of the ginclude/RConfigure.h that my ROOT package was using:

#ifndef ROOT_RConfigure
#define ROOT_RConfigure

/* Configurations file for linuxx8664gcc */

/* #undef R__HAVE_CONFIG */

#ifdef R__HAVE_CONFIG
#define ROOTPREFIX    "$(ROOTSYS)"
#define ROOTBINDIR    "$(ROOTSYS)/bin"
#define ROOTLIBDIR    "$(ROOTSYS)/lib"
#define ROOTINCDIR    "$(ROOTSYS)/include"
#define ROOTETCDIR    "$(ROOTSYS)/etc"
#define ROOTDATADIR   "$(ROOTSYS)/."
#define ROOTDOCDIR    "$(ROOTSYS)/."
#define ROOTMACRODIR  "$(ROOTSYS)/macros"
#define ROOTTUTDIR    "$(ROOTSYS)/tutorials"
#define ROOTSRCDIR    "$(ROOTSYS)/src"
#define ROOTICONPATH  "$(ROOTSYS)/icons"
#define TTFFONTDIR    "$(ROOTSYS)/fonts"
#endif

#define EXTRAICONPATH ""

#define ROOT__cplusplus 202002L
#if defined(__cplusplus) && (__cplusplus != ROOT__cplusplus)
# if defined(_MSC_VER)
#  pragma message("The C++ standard in this build does not match ROOT configuration (202002L); this might cause unexpected issues. And please make sure you are using the -Zc:__cplusplus compilation flag")
# else
#  warning "The C++ standard in this build does not match ROOT configuration (202002L); this might cause unexpected issues"
# endif
#endif

#define R__HAS_SETRESUID   /**/
#undef R__HAS_MATHMORE   /**/
#undef R__HAS_PTHREAD    /**/
#undef R__HAS_XFT    /**/
#undef R__HAS_COCOA    /**/
#undef R__HAS_VC    /**/
#undef R__HAS_VDT    /**/
#undef R__HAS_VECCORE    /**/
#undef R__USE_CXXMODULES   /**/
#undef R__USE_LIBCXX    /**/
#define R__HAS_ATTRIBUTE_ALWAYS_INLINE /**/
#define R__HAS_ATTRIBUTE_NOINLINE /**/
#undef R__HAS_HARDWARE_INTERFERENCE_SIZE /**/
#undef R__USE_IMT   /**/
#undef R__COMPLETE_MEM_TERMINATION /**/
#undef R__HAS_CEFWEB  /**/
#undef R__HAS_QT5WEB  /**/
#undef R__HAS_DAVIX  /**/
#undef R__HAS_DATAFRAME /**/
#undef R__HAS_ROOT7 /**/
#undef R__LESS_INCLUDES /**/
#undef R__HAS_TBB /**/

#if defined(R__HAS_VECCORE) && defined(R__HAS_VC)
#ifndef VECCORE_ENABLE_VC
#define VECCORE_ENABLE_VC
#endif
#endif

#undef R__HAS_DEFAULT_LZ4  /**/
#define R__HAS_DEFAULT_ZLIB  /**/
#undef R__HAS_DEFAULT_LZMA  /**/
#undef R__HAS_DEFAULT_ZSTD  /**/
#undef R__HAS_CLOUDFLARE_ZLIB /**/

#undef R__HAS_TMVACPU /**/
#undef R__HAS_TMVAGPU /**/
#undef R__HAS_CUDNN /**/
#undef R__HAS_PYMVA /**/
#undef R__HAS_RMVA /**/

#undef R__HAS_URING /**/

#endif

As you can see, this seems to look just fine. Specifically, the R__HAVE_CONFIG macro (or cmake variable) is not defined (or true).

I am still attempting to figure out why the ROOTSYS env. variable would be ignored in TROOT.

I have a few more bits of information.

In my little toy example, there are 2 ROOT packages in the system:

  1. My system-wide installation that was installed by the package manager.
  2. The ROOT package that I intend to distribute with my program.

While #1 is located in the standard search paths, #2 is located close to my tested program, simulating a distributed package. As you may guess, #1 has gnuinstall=ON, whereas #2 has gnuinstall=OFF.

Since my main() modified LD_LIBRARY_PATH and friends to prefer #2 at runtime (we cannot be sure that #1 exists and matches my program’s ABI on a random user’s system), I was assuming that #2 did not honor ROOTSYS and therefore must have been built with incorrect gnuinstall setting. As I discovered, this was not the case.

What was happening instead was that my initial LD_LIBRARY_PATH change was ineffective, and the dynamic linker never successfully loaded #2. Instead, since it was discoverable in the default search paths, #1 was loaded all along. And when I configured ROOTSYS to a non-standard value, #1 correctly ignored it because that package had gnuinstall=ON for a system-wide installation.

I am still pulling my hair over why it is not possible to perform the following sequence of actions:

  1. Launch minimalist dynamically-linked executable that does not depend on ROOT.
  2. In the main function of that executable, call setenv("LD_LIBRARY_PATH") to extend the search paths to include the directory where ROOT lives.
  3. Later in the main function, call dlopen() to dynamically load another library, which depends on ROOT.
  4. Expect that the dynamic linker will use the augmented value of LD_LIBRARY_PATH during dependency resolution.

In any case, I found a viable workaround: between steps 2 and 3 I call execv() to restart the process, which forces the dynamic linker to reload the environment. Unlike fork(), execv() replaces the current process image rather than spawning a child, so no parent process stays blocked in the background and the overall memory footprint is smaller.

Once execv() runs, the original process image is replaced by the new one, which can detect that the environment has already been augmented (the first run sets a magic variable to indicate that). This induces a bit of overhead, as the dynamic linker has to run twice now, but has the benefit of actually accomplishing what I originally set out to do. I am now going to try scaling this up, and will report back once I get to run some tests.
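A minimal sketch of this re-exec pattern follows, under a few assumptions: the guard variable name FOO_ENV_READY and the helper names are hypothetical, /proc/self/exe is Linux-specific, and the directory is prepended so that the bundled libraries take precedence over a system-wide copy.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <stdlib.h>  // setenv (POSIX)
#include <string>
#include <unistd.h>  // execv (POSIX)

// Hypothetical marker variable telling the restarted process that the
// environment has already been augmented.
constexpr const char* kGuardVariable = "FOO_ENV_READY";

// Put `dir` in front of an existing colon-separated list (which may be null).
std::string prepend_directory(const char* current, const std::string& dir) {
    return (current && *current) ? dir + ":" + current : dir;
}

// Returns 0 if the environment was already prepared (second run), or -1 if
// setenv/execv failed. On success execv() does not return: the process
// image is replaced and main() starts again with the new environment.
int reexec_with_library_path(char* const argv[], const std::string& libdir) {
    assert(argv != nullptr);
    if (std::getenv(kGuardVariable) != nullptr) return 0;

    const std::string value =
        prepend_directory(std::getenv("LD_LIBRARY_PATH"), libdir);
    if (setenv("LD_LIBRARY_PATH", value.c_str(), 1) != 0) return -1;
    if (setenv(kGuardVariable, "1", 1) != 0) return -1;

    execv("/proc/self/exe", argv);  // Linux-specific path to the running binary
    std::perror("execv");           // only reached if execv() failed
    return -1;
}
```

In main() this would be called with the original argv before the dlopen() of the ROOT-dependent library; on the second run the guard variable is present and the function returns 0 immediately.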

Good point @pcanal.
Though for this use case, it could have been more convenient if ROOT had a split design: a separate libCore that contains only the basic core things, and a libCoreCling that dlopens libCling and depends on libCore. That way, anyone who does not need cling features could link against just the libCore part.

@ferhue Yes, precisely. Having a minimalist libCore without cling would be ideal for small self-contained applications like mine, which just want to be compatible with ROOT file format.

In other news, I integrated and tested my execv() workaround from yesterday. Using strace I can now confirm that libCling honors ROOTSYS and loads allDict.cxx.pch from my packaged ROOT directory. What I do not understand is why headers are still scanned even when the pre-compiled header is found and loaded.

See the following strace snippet, where /app is the working directory and ROOTSYS=/app/usr/share/foo/root:

stat("/app/usr/share/foo/root/etc//allDict.cxx.pch", {st_mode=S_IFREG|0644, st_size=79863920, ...}) = 0
openat(AT_FDCWD, "/app/usr/share/foo/root/etc//allDict.cxx.pch", O_RDONLY|O_CLOEXEC) = 15
readlink("/proc/self/fd/15", "/app/usr/"..., 4096) = 71
fstat(15, {st_mode=S_IFREG|0644, st_size=79863920, ...}) = 0
mmap(NULL, 79863920, PROT_READ, MAP_PRIVATE|MAP_NORESERVE, 15, 0) = 0x7623e5c9b000
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
close(15)                               = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
munmap(0x7623e5c9b000, 79863920)        = 0
getcwd("/", 4097) = 28
stat("/", {st_mode=S_IFDIR|0755, st_size=200, ...}) = 0
brk(0x3dbef000)                         = 0x3dbef000
stat("<<< cling interactive line includer >>>", 0x7ffc8618ca28) = -1 ENOENT (No such file or directory)
mmap(NULL, 200704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7623fda4a000
openat(AT_FDCWD, "/app/usr/share/foo/root/etc//allDict.cxx.pch", O_RDONLY|O_CLOEXEC) = 15
readlink("/proc/self/fd/15", "/app/usr/"..., 4096) = 71
fstat(15, {st_mode=S_IFREG|0644, st_size=79863920, ...}) = 0
mmap(NULL, 79863920, PROT_READ, MAP_PRIVATE|MAP_NORESERVE, 15, 0) = 0x7623e5c9b000
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
close(15)                               = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
munmap(0x7623e5c9b000, 79863920)        = 0
stat("/app/usr/share/foo/root/etc", {st_mode=S_IFDIR|0755, st_size=520, ...}) = 0
stat("/app/usr/share/foo/root/etc//cling", {st_mode=S_IFDIR|0755, st_size=100, ...}) = 0
stat("/app/usr/share/foo/root/etc//cling/plugins/include", 0x7ffc8618c0a8) = -1 ENOENT (No such file or directory)
stat("/app/usr/share/foo/root/etc//cling/plugins", 0x7ffc8618be98) = -1 ENOENT (No such file or directory)
stat("/app/usr/share/foo/root/include", {st_mode=S_IFDIR|0755, st_size=29700, ...}) = 0
stat("/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/14.1.1/../../../../include/c++/14.1.1", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/14.1.1/../../../../include/c++/14.1.1/x86_64-pc-linux-gnu", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/14.1.1/../../../../include/c++/14.1.1/backward", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/app/usr/share/foo/root/etc//cling/lib/clang/16/include", {st_mode=S_IFDIR|0755, st_size=3340, ...}) = 0
stat("/usr/local/include", {st_mode=S_IFDIR|0755, st_size=53248, ...}) = 0
stat("/../lib64/gcc/x86_64-pc-linux-gnu/14.1.1/../../../../x86_64-pc-linux-gnu/include", 0x7ffc8618c0a8) = -1 ENOENT (No such file or directory)
stat("/../lib64/gcc/x86_64-pc-linux-gnu/14.1.1/../../../../x86_64-pc-linux-gnu", 0x7ffc8618be98) = -1 ENOENT (No such file or directory)

...

lstat("/", {st_mode=S_IFDIR|0755, st_size=200, ...}) = 0
lstat("/app/usr", {st_mode=S_IFDIR|0755, st_size=160, ...}) = 0
lstat("/app/usr/bin", {st_mode=S_IFDIR|0755, st_size=80, ...}) = 0
lstat("/app/usr/share", {st_mode=S_IFDIR|0755, st_size=180, ...}) = 0
lstat("/app/usr/share/foo", {st_mode=S_IFDIR|0755, st_size=80, ...}) = 0
lstat("/app/usr/share/foo/root", {st_mode=S_IFDIR|0755, st_size=180, ...}) = 0
lstat("/app/usr/share/foo/root/lib", {st_mode=S_IFDIR|0755, st_size=3320, ...}) = 0
lstat("/app/usr/share/foo/root/lib/libCling.so", {st_mode=S_IFREG|0644, st_size=105949208, ...}) = 0
openat(AT_FDCWD, "/app/usr/share/foo/root/lib/libCling.so", O_RDONLY|O_CLOEXEC) = 15
fstat(15, {st_mode=S_IFREG|0644, st_size=105949208, ...}) = 0
mmap(NULL, 105949208, PROT_READ, MAP_PRIVATE|MAP_NORESERVE, 15, 0) = 0x7623e43bb000
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
close(15)                               = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
munmap(0x7623e43bb000, 105949208)       = 0
brk(0x3dc74000)                         = 0x3dc74000
openat(AT_FDCWD, "/app/usr/share/foo/root/etc//allDict.cxx.pch", O_RDONLY|O_CLOEXEC) = 15
readlink("/proc/self/fd/15", "/app/usr/"..., 4096) = 71
fstat(15, {st_mode=S_IFREG|0644, st_size=79863920, ...}) = 0
fstat(15, {st_mode=S_IFREG|0644, st_size=79863920, ...}) = 0
mmap(NULL, 79863920, PROT_READ, MAP_PRIVATE|MAP_NORESERVE, 15, 0) = 0x7623e5c9b000
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
close(15)                               = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
brk(0x3dc95000)                         = 0x3dc95000
stat("/tmp/root/root-6.32.00/build", 0x7ffc8618d7b8) = -1 ENOENT (No such file or directory)
stat("/tmp/root/root-6.32.00/build/<<< cling interactive line includer >>>", 0x7ffc8618d958) = -1 ENOENT (No such file or directory)
stat("/tmp/root/root-6.32.00/build/input_line_1", 0x7ffc8618d908) = -1 ENOENT (No such file or directory)
stat("/tmp/root/root-6.32.00/build/input_line_1", 0x7ffc8618d958) = -1 ENOENT (No such file or directory)
stat("/tmp/root/root-6.32.00/build/input_line_2", 0x7ffc8618d908) = -1 ENOENT (No such file or directory)
stat("/tmp/root/root-6.32.00/build/input_line_2", 0x7ffc8618d958) = -1 ENOENT (No such file or directory)
stat("/tmp/root/root-6.32.00/build/etc/cling/Interpreter", 0x7ffc8618d7b8) = -1 ENOENT (No such file or directory)
stat("/tmp/root/root-6.32.00/build/etc/cling/Interpreter/RuntimeUniverse.h", 0x7ffc8618d958) = -1 ENOENT (No such file or directory)
stat("/tmp/root/root-6.32.00/build/etc/cling/Interpreter/RuntimeOptions.h", 0x7ffc8618d908) = -1 ENOENT (No such file or directory)
stat("/tmp/root/root-6.32.00/build/etc/cling/Interpreter/RuntimeOptions.h", 0x7ffc8618d958) = -1 ENOENT (No such file or directory)

Then there are several thousand more stat() probes for headers in /tmp/root/root-6.32.00/build, which is where my ROOT package was built originally. Is it possible that even with -Dgnuinstall=OFF the build path was somehow persisted in the binaries?
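One way to check this would be to search the shipped files for the build path. The sketch below is a simple diagnostic with a hypothetical helper name (file_contains); it reads the whole file into memory, which is acceptable for a one-off check on files of this size.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: return true if the file at `path` contains the byte
// sequence `needle` anywhere. Returns false if the file cannot be opened.
bool file_contains(const std::string& path, const std::string& needle) {
    assert(!needle.empty());
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    // Read the entire file, including embedded NUL bytes, into a string.
    const std::string data((std::istreambuf_iterator<char>(in)),
                           std::istreambuf_iterator<char>());
    return data.find(needle) != std::string::npos;
}
```

Calling it on lib/libCling.so and on etc/allDict.cxx.pch with "/tmp/root/root-6.32.00/build" as the needle would show which of the two, if any, carries the build directory.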

I don’t know, as I never tried it. @amadio might also have some insights in this regard. Probably there are some remaining issues in the build system logic and CMake structure. It has been improved greatly lately, but there are probably some bugs in it, which you can report on GitHub or, even better, fix yourself via a PR.

I am happy to report / correct everything I encounter. My current difficulty is understanding where this behavior is originating from.

For one, where does ROOT/cling actually instruct the interpreter to load those headers? In the repository, I was unable to find the place where allDict.cxx.pch is loaded.

Maybe:

and
core/metacling/src/TCling.cxx:1392

Thanks for those pointers @ferhue. They were really useful in my little packaging adventure!

I have been digging around some more, and testing intermittently. I can now confirm that my program runs with ROOT on a plain Linux VM. Initially, CLING complained about the absence of clang++ because it was trying to extract standard include paths from the system compiler. That did not seem like a good idea to me, given that we cannot make any assumptions about the user’s STL or toolchain.

Since I do not really need CLING to work (I just need it not to crash or complain), I have set the following environment variable to suppress its attempts to load the system runtime:

EXTRA_CLING_ARGS="-nobuiltininc -nostdinc++ -noruntime"

This seems to work nicely, but it produces a compilation error because somewhere ROOT attempts to include Rtypes.h, TError.h, cassert and string, and these headers obviously cannot be located with the flags listed above. For that reason, I devised a sneaky solution. I have added a -I flag to EXTRA_CLING_ARGS, which I pointed to a dummy include directory that contains those 4 files. However, to keep things small and portable, I have completely truncated their contents. This means that CLING will find them when ROOT requests that they be included, but they have absolutely no effect (and no dependencies).
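As a rough sketch of that trick, the stub directory and the variable could be set up at startup as shown below. The helper names (make_stub_include_args, export_cling_args) are hypothetical, std::filesystem requires C++17, and the flags and the four file names are the ones quoted above. A stub path containing spaces would need extra quoting, which this sketch does not attempt.

```cpp
#include <cassert>
#include <cstdlib>
#include <filesystem>
#include <fstream>
#include <initializer_list>
#include <stdlib.h>  // setenv (POSIX)
#include <string>

// Hypothetical helper: create zero-length stand-ins for the four headers in
// `stub_dir` and return the flag string that points the interpreter at them.
std::string make_stub_include_args(const std::filesystem::path& stub_dir) {
    assert(!stub_dir.empty());
    std::filesystem::create_directories(stub_dir);
    for (const char* name : {"Rtypes.h", "TError.h", "cassert", "string"}) {
        // Opening with trunc creates the file or empties an existing one.
        std::ofstream stub(stub_dir / name, std::ios::trunc);
    }
    return "-nobuiltininc -nostdinc++ -noruntime -I" + stub_dir.string();
}

// Hypothetical helper: export the flags; must run before ROOT is loaded.
bool export_cling_args(const std::filesystem::path& stub_dir) {
    const std::string args = make_stub_include_args(stub_dir);
    return setenv("EXTRA_CLING_ARGS", args.c_str(), /*overwrite=*/1) == 0;
}
```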

I have now reached a point where ROOT loads and initializes a somewhat lobotomized version of CLING, which loads everything needed and does not print any warnings. With that said, I am still uncertain how to prevent CLING from stat()-ing several thousand files in my /tmp directory.

@vvassilev Is there any additional context you can provide?

I am not sure what else I can provide. libCling is essential for ROOT and I do not see how anything could be used without it. We use it to build dictionaries and read data. I am surprised that one could get away without it.

The design of libCore and libCling has not been optimal, and we have discussed that at length. To disentangle this weird cyclic dependency we would need to refactor rootcling quite a lot. I am not sure if removing libCling from ROOT should be supported generally.

@vvassilev And is the reading/stat-ing of the header files expected in this setup?

If the build is around I expect the files at the build locations to be stat-ed.

@vvassilev In my case it is not around. My ROOT package is built from source in CI and installed to /opt/root – this is persisted in a Docker image. At a later point, when my software is built and packaged for distribution, the Docker image is pulled and my software links against the libraries present in /opt/root. The sources are long gone by then, yet there are still many stat() calls to non-existent files. Can this behavior be suppressed?

We use it to build dictionaries and read data. I am surprised that one could get away without it.

Perhaps this would be a good moment to ask: if I proceed with my hollowed out version of CLING, can you please clarify when I can expect things breaking? My understanding was that ROOT uses CLING for parsing interactive commands, C++ reflection, dictionaries, parsing TF1 expressions etc. For that reason, I expected that as long as I do not need persistence of custom complex types, but just read/write Int_t or Double_t in branches of a TTree, ROOT should be able to pull everything it needs from its pre-compiled header and modules. Can you see any flaws in my reasoning? (I would rather find out now than later when things start crashing randomly for users).

You can stop with a breakpoint there, but these stat calls are probably coming from clang checking whether the header files recorded in the pch are present on disk. If they are not, it will start streaming them from the pch. It is not easy to avoid that logic, as it lives in clang and there is a good reason for it: this is how it detects things like header invalidation.

I really want to give something less handwavy but I can’t. In any case you will be using a setup that’s not tested by practically anybody so it might work or it might not work. The problem is that if it works it’d be unclear to me why and to what extent. If it does not work you can’t really ask for support by the community as that workflow is new and not “native” to ROOT.

@vvassilev Understood. Just for the record, please note that I do not intend to use ROOT in ways in which it was never designed to be used. I simply aim to add support for the ROOT file format with the smallest possible tree of dependencies.

If I wanted to properly support a full CLING runtime without any tricks, should I also distribute the compiler toolchain together with my software (STL, compiler, linker, archiver etc.)? Or does CLING already integrate everything needed for compiling, and only look for standard headers? I would be open to distributing the STL in my package, but I cannot possibly see how I could pull off distributing an entire compiler (this would be especially tricky on Windows).

Thanks for your patience, and helpful explanations. I am certain that somebody in the future will want to do a similar thing, and this thread will hopefully save them a ton of time.