LLVM collision with OpenGL in CentOS7

Hello, rooters - our ROOTv6 fails to work because its LLVM collides with the LLVM in the stock OpenGL libraries in CentOS/RHEL 7.2. Using “strace” it is easy to see that the stream of error messages starts while the OpenGL/MESA LLVM shared libraries are being loaded. For now we run on a headless machine, and the workaround of setting LIBGL_ALWAYS_INDIRECT works (the OpenGL/MESA LLVM shared libraries are not loaded). But at some point we will want to display ROOT graphics on a local display. Surely every CentOS7 user is affected by this problem, no? Should I provide more details, a simple crasher, etc.?
K.O.
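
For anyone who wants to script the LIBGL_ALWAYS_INDIRECT workaround, here is a minimal Python sketch. It assumes that the `root` launcher is on the PATH and that a value of 1 is enough to request indirect rendering, as it was on the headless machine described above. The helper names are made up for illustration.

```python
import os
import subprocess


def indirect_gl_env(base_env=None):
    """Return a copy of the environment with indirect GL rendering requested."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: "1" is the value used here; the thread only names the variable.
    env["LIBGL_ALWAYS_INDIRECT"] = "1"
    return env


def run_root_batch(args=("-l", "-b", "-q")):
    """Start ROOT in batch mode with the workaround applied (needs `root` on PATH)."""
    return subprocess.run(["root", *args], env=indirect_gl_env(), check=False)
```

Calling `run_root_batch()` starts ROOT with the variable set for that one process only, so the rest of the session keeps its normal GL settings.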

Hi,

Did you link your libraries and/or executable against the Cling library? If you did, this is the likely cause of the problem (i.e. remove the -lCling from your link line). If not, then we will need more details about your particular case.

Cheers,
Philippe.

Assuming that this problem really exists … could it also be what breaks the ROOT 6 build in: Fatal error modified `<new>`

Very similar to a problem I’m having also:

I am not linking in Cling – it is being loaded by TROOT.

Yes, this is the same error we see, also I think with GEANT4:

Error in UnknownClass::InitInterpreter(): LLVM SYMBOLS ARE EXPOSED TO CLING! This will cause problems; please hide them or dlopen() them after the call to TROOT::InitInterpreter()!

For reference, Axel asks to run the following commands to check for pollution by an external library. I shall try to do this (the person with the problem sits next door to me).

BTW, I believe the polluting library is /lib64/libLLVM-3.6-mesa.so:

alphagdaq:~$ objdump -T /lib64/libLLVM-3.6-mesa.so | grep EnablePrettyStackTrace
00000000003c90d0 g DF .text 0000000000000005 libLLVM-3.6-mesa.so LLVMEnablePrettyStackTrace
00000000003c9060 g DF .text 0000000000000061 libLLVM-3.6-mesa.so _ZN4llvm22EnablePrettyStackTraceEv
alphagdaq:~$

K.O.
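
The objdump check above can be automated for a whole library. The following is a rough Python sketch, not an official ROOT tool: it scans the text output of `objdump -T` for defined, exported symbols whose names look like LLVM ones. The prefix list is an assumption and will miss some mangled names; the sample lines are the two from the post above.

```python
# Assumed name prefixes: the LLVM C API and mangled names in namespace llvm.
LLVM_PREFIXES = ("LLVM", "_ZN4llvm", "_ZNK4llvm")


def exported_llvm_symbols(objdump_text):
    """Return defined, exported LLVM-looking symbol names from `objdump -T` text."""
    found = []
    for line in objdump_text.splitlines():
        fields = line.split()
        # Skip headers, short rows, and undefined (imported) symbols.
        if len(fields) < 5 or "*UND*" in fields:
            continue
        try:
            int(fields[0], 16)  # symbol rows start with a hex address
        except ValueError:
            continue
        binding, name = fields[1], fields[-1]
        # g = global, w = weak, u = unique global
        if binding in ("g", "w", "u") and name.startswith(LLVM_PREFIXES):
            found.append(name)
    return found


SAMPLE = """\
alphagdaq:~$ objdump -T /lib64/libLLVM-3.6-mesa.so | grep EnablePrettyStackTrace
00000000003c90d0 g DF .text 0000000000000005 libLLVM-3.6-mesa.so LLVMEnablePrettyStackTrace
00000000003c9060 g DF .text 0000000000000061 libLLVM-3.6-mesa.so _ZN4llvm22EnablePrettyStackTraceEv
"""

print(exported_llvm_symbols(SAMPLE))
# prints ['LLVMEnablePrettyStackTrace', '_ZN4llvm22EnablePrettyStackTraceEv']
```

A non-empty result for a library that ends up in the same process as ROOT suggests it is a candidate for the clash.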

The commands to run, for reference:

gdb --args your-binary…
r
(your program will now crash, but now gdb knows all the symbols. So now you can do:)
b TROOT::InitInterpreter()
r
(gdb will now hold at TROOT::InitInterpreter())
info sharedlibrary
(I need the output of that)
p LLVMEnablePrettyStackTrace

Hi,

Right, and if it’s libmesa then that was reported as a bug to the freedesktop people: bugs.freedesktop.org/show_bug.cgi?id=93103

This was apparently fixed in some recent-ish version of their software, see the bug report.

As they exposed the LLVM symbols to the whole process, it’s basically impossible to hide them from libCling and cling’s symbol resolution; there will be a clash unless they keep them local to their libs (as they do now).

Cheers, Axel.

So, the origin of this problem is known.

What is the advice for CentOS 7 users from the ROOT team?

Hi,

I don’t know. Don’t use CentOS and these GL drivers (but instead e.g. those from NVIDIA)?

I don’t see how we could possibly work around the issue that they are injecting the symbols into the binary :( It’s not even cling’s symbol resolution: it’s really about exposing the same symbol name to the same process, which will violate assumptions of the dynamic loader.

If anyone has any ideas please let us know…

Cheers, Axel.
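
To make "exposing the same symbol name to the same process" concrete, here is a small Python sketch that asks the dynamic loader whether a symbol resolves in the process-wide scope. This is an illustration only and not how ROOT performs its check. It assumes Linux, where `ctypes.CDLL(None)` wraps `dlopen(NULL)`; the helper name is made up.

```python
import ctypes


def is_globally_visible(symbol):
    """True if `symbol` resolves in the process-wide (global) symbol scope."""
    # CDLL(None) gives a handle to the main program plus every library
    # that was loaded with global visibility.
    process = ctypes.CDLL(None)
    try:
        getattr(process, symbol)
    except AttributeError:
        return False
    return True


print(is_globally_visible("LLVMEnablePrettyStackTrace"))
```

Calling it before and after loading a suspect library with `ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)` should show whether that library adds the symbol to the global scope, which is the situation the cling error message complains about.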

Hi, Axel.

I’m seeing a similar problem to this one that I posted about here in a separate thread.

Maybe we can consolidate into a single thread here on the forum like this one?

Or better yet, even if this is not a ROOT issue, it might be helpful to insert into your bug tracker so that the information and details can be put there.

Here’s some info about my GL install in CentOS along with the LLVM it is using:

[1062 $] yum whatprovides /lib64/libLLVM-3.6-mesa.so
mesa-private-llvm-3.6.2-2.el7.x86_64 : llvm engine for Mesa
Repo : @base
Filename : /lib64/libLLVM-3.6-mesa.so

[1063 $] yum whatprovides /lib64/libGL.so.1
mesa-libGL-10.6.5-3.20150824.el7.x86_64 : Mesa libGL runtime libraries and DRI drivers
Repo : @base
Filename : /lib64/libGL.so.1

[1064 $] yum whatprovides /lib64/libGLU.so.1
mesa-libGLU-9.0.0-4.el7.x86_64 : Mesa libGLU library
Repo : @base
Filename : /lib64/libGLU.so.1

Do you have more information about how to use the Nvidia GL drivers?

Thanks.

–Jeremy

Hi, again.

Does anyone know if this bug affects RHEL7 as well?

Thanks.

Maybe you could try to build your ROOT 6 from scratch again adding two flags to the cmake configuration line (and I’m afraid something similar should be done for Geant 4):
-Dsoversion=ON -Drpath=ON

BTW. There is another user who fights with CentOS 7: Fatal error modified `<new>`

Hi,

One more thing - maybe disabling ROOT’s use of GL might help: cmake -Dopengl=Off should do.

Alternatively, installing different GL drivers might do; I cannot really help there, but Google can, e.g. linuxconfig.org/nvidia-geforce- … nux-64-bit - warning: this looks messy…

Cheers, Axel.

And btw, the relevant ROOT issue for this is sft.its.cern.ch/jira/browse/ROOT-7744

Cheers, Axel.

So, maybe we should ask people, who clearly see this problem on CentOS 7, to try to build ROOT 6.06/08 from scratch using:

  1. cmake -Dall=ON -Dsoversion=ON -Drpath=ON …
  2. cmake -Dall=ON -Dsoversion=ON -Drpath=ON -Dopengl=OFF …

If “1.” produces a working executable (one needs to check that GL examples work) then there is no need to try “2.”.

BTW. The old ROOT 5 “configure” provided another flag, “--enable-explicitlink”, but it doesn’t seem to exist among the new “cmake” options.

Hi Pepe,

explicit linking is the default (and was for all relevant platforms for configure/make).

I don’t see what soversion is going to do; this is not a question of renaming the libraries (and thus finding a different one) - this is about two libraries exposing the same set of symbol names, confusing the dynamic loader. Even if the libraries have different names they will still expose the same (duplicate) set of symbols.

Cheers, Axel.

Hi, Axel and Pepe.

Thanks for the attention to this and the link to the bug report.

I did confirm that my Geant4-ROOT application which I built in CentOS7 runs fine in an RHEL7 environment without any changes. So for now this is going to be my solution.

RE: disabling OpenGL

This seems like something I might try.

Will turning off OpenGL in ROOT have any undesirable side effects? For instance, can I still display geometries using some alternate rendering scheme?

More generally, does anyone know why exactly an OpenGL library needs to link in LLVM anyways? It seems completely unnecessary, but I guess I don’t know enough about Mesa to say why they’ve done this. RHEL7 doesn’t have this problem so I’m just wondering.

–Jeremy

Hi,

They are JITing code, adapted to the hardware found in your box. llvm is perfect for that :) And this feature is only available for some versions of mesa, and broken (symbol visibility) for even fewer versions of mesa.

And no, viewing geometries is exactly what OpenGL is used for :( Nonetheless, as a cross check, could you see whether ROOT crashes with -Dopengl=Off?

Axel.

Hi,

This is a bug in Mesa: they expose the LLVM symbols, and these clash with our (hidden) version of LLVM. Disabling OpenGL in ROOT might work around the problem.

Cheers,
Philippe.

BTW, here is this bug in the Mesa OpenGL issue tracker: (filed in November 2015, no activity since then)
bugs.freedesktop.org/show_bug.cgi?id=93103

So what’s the opinion of the ROOT team: is ROOT OpenGL unusable on CentOS 7, and can it not be fixed?

K.O.

Hi,

I just don’t know what to do to work around their problem - so indeed, until the drivers are updated (maybe that’s the case for CentOS 7.3) I don’t see how this can work. And while in the end it’s libmesa who should fix this, CentOS has its share (libllvm.so). I have opened cern.service-now.com/service-po … INC1242141 hoping that the CERN IT people have some visibility within the CentOS community. We’ll see…

Cheers, Axel.