Compiling from source: First interactive command hangs

Short version: After compiling ROOT 6.22.06 from source, the first command I enter interactively hangs. The second and subsequent commands execute normally. Example (I hit Ctrl-C after waiting a while; once I even waited overnight):

$ root
   ------------------------------------------------------------------
  | Welcome to ROOT 6.22/06                        https://root.cern |
  | (c) 1995-2020, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Nov 27 2020, 15:14:08                 |
  | From tags/v6-22-06@v6-22-06                                      |
  | Try '.help', '.demo', '.license', '.credits', '.quit'/'.q'       |
   ------------------------------------------------------------------


root [0] TH1D test("test","test",100,-3,3);
^C
 *** Break *** keyboard interrupt
Root > 
root [1] TH1D test("test","test",100,-3,3);
root [2] TH1D test2("test2","test",100,-3,3);
root [3] 

Details:

I’ve been compiling ROOT from source ever since ROOT 6 was released. I use a custom installation of GCC 6.4 and Python 3.6 (to take advantage of the latest language features), and never had a problem through ROOT 6.18.04.

But when I compiled 6.20 and later 6.22, the above problem emerged. I’ve tried recompiling with different GCC and Python versions (e.g., GCC 9.2 and Python 3.8), but the issue remains. I also tried including enough CMake options to make my build match the pre-built binaries; no change.

If I download the pre-built binaries for CentOS 7, they work fine, but of course I can’t take advantage of the latest C++ language features.

At this point, this is my standard CMake command:

vers=6.22.06
CC=${GCC_DIR}/bin/gcc CXX=${GCC_DIR}/bin/g++ cmake \
   -DCMAKE_INSTALL_PREFIX=/usr/nevis/el7/root-${vers} \
   -Dminuit2:BOOL=ON \
   ../root-${vers}

The OS is CentOS 7, cmake version 3.16.4, GCC 6.4 (or higher), Python 3.6 (or higher).

What am I missing? Is there any way to trace why interactive ROOT is hanging only on the first command?



ROOT Version: 6.22.06
Platform: CentOS 7
Compiler: GCC 6.4.0


This is really weird. Do you have something in your rootlogon.C file? You can try starting ROOT with the -n option (i.e., root -n) and check whether you still have this issue. @axel may have an idea about it.

I’ve never seen anything like this. Couple of questions:

  • what kind of machine is this?
  • depending on the previous answer - can you try without a connection to the internet?
  • can you run strace -o for_axel.txt root.exe -e 'TH1D test("test","test",100,-3,3);' and let it sit for 10 minutes, then press Ctrl-C - and please upload for_axel.txt.
  • Does this happen for any first line? Even for just 1;?
  • Do you have your home on a network filesystem? Does $LD_LIBRARY_PATH contain directories that are on a network path?
  • If you reduce PATH and LD_LIBRARY_PATH to a bare minimum, enough to start ROOT: does that work around the issue?

Let’s see whether any of this gives us a hint!

Cheers, Axel.

Thanks for the help! Here are the answers to the suggestions:

  • I don’t have a rootlogon.C file.

  • If I start root with root -n it still hangs at that first command to define TH1D.

  • I’m running one of our lab’s production servers, remotely connecting from home during the pandemic. I cannot disconnect its internet. (Or rather I could, but then I couldn’t reach it and would be forced to remotely cycle its power via IPMI. Not a good deal.)

  • The systems on our cluster all run CentOS 7. They have x86_64 processors, mostly AMD but there might be some Intel systems in there; each working group chooses its own hardware.

  • for_axel.txt.gz (120.8 KB) . Note that the uncompressed file is 2.2MB.

  • 1; works! So does simple arithmetic of the form:

root [0] 1;
root [1] 2;
root [2] int a(1);
root [3] a
(int) 1
root [4] a+3;
root [5] int b = a+3;
root [6] b
(int) 4

It’s only when I start using ROOT features (defining a TH1D object; creating a new TBrowser) that I experience problems. I thought perhaps this had something to do with accessing shared libraries; recompiling with the soversion feature turned on changed nothing.

  • The lab’s analysis cluster relies heavily on NFS to share files and resources. So yes, there are many directories in $PATH and $LD_LIBRARY_PATH that are network mounted.

  • I tried running root after removing every reference I could to network-mounted directories from my environment; that did not change the issue.

I’ll restate that my set-up worked for ROOT 6.18, but apparently broke with ROOT 6.20. Were there any new features or structural changes to ROOT at that time that might be relevant to this issue?

I can reproduce it; it’s a bug on our side. Thanks for your report, @seligman! I’ll let you know once I know what to do about it and have created a fix…

Wow! Thank you for the quick diagnosis and response. And now I don’t have to feel guilty about making a foolish mistake with compilers and options and such.

I look forward to the solution. Thanks again!

Oh, and (sorry, forgot) a quick workaround, if possible (i.e., if you don’t need that file): delete the named pipe with rm /nevis/tanya/home/seligman/.lyxpipe.out
Also, FYI, I have unlinked the file you posted (Europeans and their appetite for privacy :wink:).

The .lyxpipe.in and .lyxpipe.out named pipes are automatically created by the LyXServer, whenever one uses it, so ROOT should not touch them at all.

Yeah, absolutely, and with pull request #7566 in root-project/root ("[cling] Skip non-regular files to find dylibs") it stops looking at any non-regular files.
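
As a rough illustration of the idea behind that change (not the actual patch, which is C++ code inside cling), a directory scan can check each entry's file type before opening anything. The function name regular_files_in is made up for this sketch, and it assumes a POSIX-like shell such as bash:

```shell
# Print only the entries of a directory that are regular files
# (after following symlinks). FIFOs, sockets, devices, and
# subdirectories are skipped without ever being opened, so a named
# pipe with no writer cannot block the scan.
# NOTE: hypothetical helper for illustration, not ROOT's code.
regular_files_in() {
    local dir=$1 entry
    # The second pattern picks up dotfiles but not "." and "..".
    for entry in "$dir"/* "$dir"/.[!.]*; do
        # An unmatched glob stays literal; -f is false for it too.
        if [ -f "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
}
```

Only the paths printed by regular_files_in would then be candidates for opening and inspecting as shared libraries.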

I just did:

rm -rf ~/.lyx*

since I haven’t used LyX in years. Unfortunately, it did not work around the issue. Could there be some other old .config file causing the problem?

There’s probably another FIFO in ~/ - or in whatever other directory that your $LD_LIBRARY_PATH is pointing to. Alternatively you can grab my changes https://github.com/root-project/root/pull/7566.patch and apply that to your sources and re-build libCling.so.
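
A minimal sketch for hunting down such FIFOs, assuming GNU find (for -maxdepth) and bash. The helper names list_fifos and split_path_list are hypothetical, and looking only at the top level of each directory is an assumption about where ROOT searches:

```shell
# List named pipes (FIFOs) sitting directly in the given directories.
# find's "-type p" test only inspects file metadata; it never opens
# the pipe, so the search itself cannot hang.
# NOTE: hypothetical helpers for illustration only.
list_fifos() {
    local dir
    for dir in "$@"; do
        # Skip empty entries and anything that is not a directory.
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 1 -type p 2>/dev/null
    done
}

# Split a colon-separated list such as $LD_LIBRARY_PATH into one
# directory per line, dropping empty entries.
split_path_list() {
    printf '%s\n' "$1" | tr ':' '\n' | sed '/^$/d'
}
```

For example, list_fifos "$HOME" checks the home directory, and split_path_list "$LD_LIBRARY_PATH" | while IFS= read -r d; do list_fifos "$d"; done checks every library directory in turn.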

It turns out that rm -rf ~/.lyx* didn’t delete .lyxpipe.out or .lyxpipe.in! When I explicitly deleted those files, ROOT 6.22.06 worked!

I look forward to ROOT 6.24 when I don’t have to worry about this at all. In the meantime, I’m fine with the work-around (I don’t think any of my users work with named pipes). Please treat this problem as solved!

Thanks!