Compiling from source: First interactive command hangs

Short version: After compiling ROOT 6.22.06 from source, the first command I enter interactively hangs. The second and subsequent commands execute normally. Example (I hit Ctrl-C after waiting a while; once I even waited overnight):

$ root
   ------------------------------------------------------------------
  | Welcome to ROOT 6.22/06                        https://root.cern |
  | (c) 1995-2020, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Nov 27 2020, 15:14:08                 |
  | From tags/v6-22-06@v6-22-06                                      |
  | Try '.help', '.demo', '.license', '.credits', '.quit'/'.q'       |
   ------------------------------------------------------------------


root [0] TH1D test("test","test",100,-3,3);
^C
 *** Break *** keyboard interrupt
Root > 
root [1] TH1D test("test","test",100,-3,3);
root [2] TH1D test2("test2","test",100,-3,3);
root [3] 

Details:

I’ve been compiling ROOT from source since ROOT 6 was released. I use a custom installation of GCC 6.4 and Python 3.6 (to take advantage of the latest language features), and never had a problem up through ROOT 6.18.04.

But when I compiled 6.20 and later 6.22, the above problem emerged. I’ve tried recompiling with different GCC and Python versions (e.g., GCC 9.2 and Python 3.8), but the issue remains. I also tried including enough CMake options to make my build match the pre-built binaries; no change.

If I download the pre-built binaries for CentOS 7, they work fine, but of course I can’t take advantage of the latest C++ language features.

At this point, this is my standard CMake command:

vers=6.22.06
CC=${GCC_DIR}/bin/gcc CXX=${GCC_DIR}/bin/g++ cmake \
   -DCMAKE_INSTALL_PREFIX=/usr/nevis/el7/root-${vers} \
   -Dminuit2:BOOL=ON \
   ../root-${vers}

The OS is CentOS 7, cmake version 3.16.4, GCC 6.4 (or higher), Python 3.6 (or higher).

What am I missing? Is there any way to trace why interactive ROOT is hanging only on the first command?



ROOT Version: 6.22.06
Platform: CentOS 7
Compiler: GCC 6.4.0


This is really weird. Do you have anything in a rootlogon.C file? You can try starting ROOT with the -n option (i.e. root -n) and check whether you still have this issue. @axel may have an idea about it.

I’ve never seen anything like this. Couple of questions:

  • what kind of machine is this?
  • depending on the previous answer - can you try without connection to the internet?
  • can you run strace -o for_axel.txt root.exe -e 'TH1D test("test","test",100,-3,3);' and let it sit for 10 minutes, then hit Ctrl-C, and please upload for_axel.txt.
  • Does this happen for any first line? Even for just 1;?
  • Do you have your home on a network filesystem? Does $LD_LIBRARY_PATH contain directories that are on a network path?
  • If you reduce PATH and LD_LIBRARY_PATH to a bare minimum, enough to start ROOT: does that work around the issue?

Let’s see whether any of this gives us a hint!

Cheers, Axel.

Thanks for the help! Here are the answers to the suggestions:

  • I don’t have a rootlogon.C file.

  • If I start root with root -n it still hangs at that first command to define TH1D.

  • I’m running one of our lab’s production servers, remotely connecting from home during the pandemic. I cannot disconnect its internet. (Or rather I could, but then I couldn’t reach it and would be forced to remotely cycle its power via IPMI. Not a good deal.)

  • The systems on our cluster all run CentOS 7. They have x86_64 processors, mostly AMD but there might be some Intel systems in there; each working group chooses its own hardware.

  • for_axel.txt.gz (120.8 KB). Note that the uncompressed file is 2.2 MB.

  • 1; works! So does simple arithmetic of the form:

root [0] 1;
root [1] 2;
root [2] int a(1);
root [3] a
(int) 1
root [4] a+3;
root [5] int b = a+3;
root [6] b
(int) 4

It’s only when I start using ROOT features (defining a TH1D object; creating a new TBrowser) that I experience problems. I thought perhaps this had something to do with accessing shared libraries; recompiling with the soversion feature turned on changed nothing.

  • The lab’s analysis cluster relies heavily on NFS to share files and resources. So yes, there are many directories in $PATH and $LD_LIBRARY_PATH that are network mounted.

  • I tried running root after removing what references I could to network-mounted directories, and that did not change the issue.

I’ll restate that my set-up worked for ROOT 6.18, but apparently broke with ROOT 6.20. Were there any new features or structural changes to ROOT at that time that might be relevant to this issue?

I can reproduce it; it’s a bug on our side. Thanks for your report, @seligman! I’ll let you know once I know what to do about it and have created a fix.

Wow! Thank you for the quick diagnosis and response. And now I don’t have to feel guilty about making a foolish mistake with compilers and options and such.

I look forward to the solution. Thanks again!

Oh, and (sorry, forgot) a quick workaround, if possible (i.e. if you don’t need that file):

rm /nevis/tanya/home/seligman/.lyxpipe.out

And FYI, I have unlinked the file you posted (Europeans and their appetite for privacy :wink:).

The .lyxpipe.in and .lyxpipe.out named pipes are automatically created by the LyXServer, whenever one uses it, so ROOT should not touch them at all.
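
For readers wondering why merely touching such a pipe can hang a process, here is a minimal sketch. It assumes (this is not confirmed in the thread) that the library scan opened the pipe for reading as it would a regular file. Opening a FIFO for reading blocks until some writer opens the other end, and for a stale pipe no writer ever comes. The name demo.pipe is made up for illustration, and timeout is the GNU coreutils tool.

```shell
#!/usr/bin/env bash
# Create a throwaway FIFO in a temporary directory (hypothetical name).
demo_dir=$(mktemp -d)
mkfifo "$demo_dir/demo.pipe"

# cat opens the FIFO for reading. With no writer on the other end the
# open() call blocks, so timeout has to kill cat after one second.
timeout 1 cat "$demo_dir/demo.pipe"
status=$?

# GNU timeout exits with 124 when it had to terminate the command.
echo "exit status: $status"

rm -rf "$demo_dir"
```

An exit status of 124 here means the read never started, which is the same kind of indefinite wait as the first interactive command above.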

Yeah, absolutely, and with pull request #7566, "[cling] Skip non-regular files to find dylibs" (https://github.com/root-project/root/pull/7566), it stops looking at any non-regular files.

I just did:

rm -rf ~/.lyx*

since I haven’t used LyX in years. Unfortunately, it did not work around the issue. Could there be some other old .config file causing the problem?

There’s probably another FIFO in ~/, or in one of the other directories that your $LD_LIBRARY_PATH points to. Alternatively, you can grab my changes from https://github.com/root-project/root/pull/7566.patch, apply them to your sources, and rebuild libCling.so.
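
One way to hunt for such FIFOs, as a rough sketch: the helper name list_fifos is hypothetical, and the search only looks at the top level of each directory, on the assumption that this is where the scan trips. It relies on find's -type p test, which matches named pipes.

```shell
#!/usr/bin/env bash
# list_fifos (hypothetical helper): print every named pipe found directly
# inside each directory of a colon-separated list.
list_fifos() {
    local dirs dir
    # Split the argument on ':' without triggering glob expansion.
    IFS=: read -r -a dirs <<< "$1"
    for dir in "${dirs[@]}"; do
        # Skip empty entries and paths that are not directories.
        [ -d "$dir" ] || continue
        # -type p matches FIFOs; -maxdepth 1 stays at the top level.
        find "$dir" -maxdepth 1 -type p 2>/dev/null
    done
}

# Check the home directory plus everything on the library search path.
list_fifos "${HOME:-}:${LD_LIBRARY_PATH:-}"
```

Any path this prints is a candidate for removal, provided nothing still uses it.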

It turns out that rm -rf ~/.lyx* didn’t delete .lyxpipe.out or .lyxpipe.in! When I explicitly deleted those files, ROOT 6.22.06 worked!
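
For anyone repeating the explicit deletion, a cautious sketch: the helper remove_if_fifo is hypothetical, and it removes a path only when the shell's -p test confirms it is a named pipe, so a regular file with the same name is left alone. The two paths passed at the end assume the pipes live directly in $HOME, as in this thread.

```shell
#!/usr/bin/env bash
# remove_if_fifo (hypothetical helper): delete each argument only if it
# is a named pipe; anything else is left untouched.
remove_if_fifo() {
    local f
    for f in "$@"; do
        if [ -p "$f" ]; then
            rm -f -- "$f" && echo "removed FIFO: $f"
        fi
    done
}

# The two pipes named in this thread, assumed to sit in the home directory.
remove_if_fifo "${HOME:-}/.lyxpipe.in" "${HOME:-}/.lyxpipe.out"
```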

I look forward to ROOT 6.24 when I don’t have to worry about this at all. In the meantime, I’m fine with the work-around (I don’t think any of my users work with named pipes). Please treat this problem as solved!

Thanks!
