Errors building system PCH with cppyy's ROOT on Android


ROOT Version: 6.26.04
Platform: Android x86_64 (Termux)
Compiler: Clang 15.0.1


Hi,

I’m trying to get cppyy working on my embedded device. I’m able to compile and run cling (installed with cpt.py), and can e.g. run #include <stdio.h> and printf("Hello World");

However, cppyy requires ROOT, and so I have been trying to get ROOT to build. For reference, I am using the version of ROOT from the cppyy-backend pip package.

I have gotten to the point where rootcling builds successfully, with some minor patches including one to TPosixThread.cxx (disabling specific pthread_setcancelstate functionality, since that’s not available on Android). I don’t think this should be causing my issue.

My issue comes at the Generating PCH for core core/thread io/io stage where I get a segfault:

[ 97%] Built target obj.clingInterpreter
[ 97%] Linking CXX static library ../../../../lib/libclingInterpreter.a
[ 97%] Built target clingInterpreter
[ 97%] Built target CLING
[ 97%] Built target LLVMRES
[ 98%] Built target Dictgen
[ 98%] Built target ClingUtils
[100%] Built target MetaCling
[100%] Linking CXX executable src/rootcling_stage1
[100%] Linking CXX shared library ../../../lib/libCling.so
[100%] Built target rootcling_stage1
[100%] Built target Cling
[100%] Built target G__CoreLegacy
[100%] Built target CoreLegacy
[100%] Built target G__ThreadLegacy
[100%] Built target ThreadLegacy
[100%] Built target G__RIOLegacy
[100%] Built target RIOLegacy
[100%] Linking CXX executable ../bin/rootcling
[100%] Built target rootcling
[100%] Generating etc/allDict.cxx.pch

Generating PCH for core core/thread io/io

IncrementalParser Setting up m_Consumer
<...>

DeclareInternal input is #include "cling/Interpreter/RuntimeUniverse.h"
<...>
DeclareInternal Finished
endTransaction empty tx, T is 0x72f71ce0f000

DeclareInternal input is <...>
<...>
DeclareInternal Finished
endTransaction empty tx, T is 0x72f71c9f9200
endTransaction empty tx, T is 0x72f71c9f9200

DeclareInternal input is #include "cling/Interpreter/DynamicLookupRuntimeUniverse.h"
<...>
DeclareInternal Finished

DeclareInternal input is namespace __CppyyLegacy_SpecialObjects{}
ParseInternal handling Decl
ParseInternal handing top level w/o trap
HandleTopLevelDecl: Handling top level decl
HandleTopLevelDecl: no m_Consumer, its pointer is 0x0
IncrementalParser::Compile Transaction CurT is 0x72f71c9fa600
IncrementalParser::Compile ParseRes 0
endTransaction empty tx, T is 0x72f71c9fa600
IncrementalParser::Compile PRT pointer is 0x0
IncrementalParser::Compile PRT int is 0
DeclareInternal setting *T to 0x0
DeclareInternal Finished
Stack dump:
0.	Program arguments: ./bin/rootcling -rootbuild -generate-pch -f /data/data/com.termux/files/usr/tmp/allDict.cxx -noDictSelection -D__CLING__ -DROOT_PCH -I./include -I./etc -I./etc/dictpch -I./etc/cling -I/data/data/com.termux/files/home/cppyy-backend/cling/builddir/include -cxxflags -std=c++17 -pthread etc/dictpch/allHeaders.h etc/dictpch/allLinkDefs.h
make[2]: *** [CMakeFiles/onepcm.dir/build.make:79: etc/allDict.cxx.pch] Error 139
make[1]: *** [CMakeFiles/Makefile2:6458: CMakeFiles/onepcm.dir/all] Error 2
make: *** [Makefile:156: all] Error 2
~/.../cling/builddir $

The extra logging was added by me. Using gdb, I can see that the error happens in TClingCallbacks.cxx at

Transaction* T = 0;
m_Interpreter->declare("namespace __ROOT_SpecialObjects{}", &T);   
fROOTSpecialNamespace = dyn_cast<NamespaceDecl>(T->getFirstDecl().getSingleDecl());

T is a nullptr, causing the segfault. I looked in the interpreter code, and it looks like the “namespace …” expression is being parsed successfully, but it’s not being added to the transaction, which eventually causes the nullptr.

It’s not being added because for some reason, m_Consumer in DeclCollector::HandleTopLevelDecl is 0x0 when the transaction is parsed. In the IncrementalParser::IncrementalParser constructor, the value of WrappedConsumer is 0x0. This seems to be intentional, since m_CI->getFrontendOpts().ProgramAction == frontend::ParseSyntaxOnly and so the WrappedConsumer is never being created.

So, to summarize my current understanding: the WrappedConsumer (of type clang::ASTConsumer) is never created, because the ProgramAction is ParseSyntaxOnly. At the same time, the code that sets fROOTSpecialNamespace expects the transaction to be populated with declarations, which doesn’t happen because WrappedConsumer (m_Consumer in DeclCollector) is never created. :scream:

Any guidance would be greatly appreciated!

Wow, impressive bug hunting, thanks! I’m trying to see how this behaves on Linux (because I don’t remember). I’ll be back!

And back. So the reason is that we fail to see that this is rootcling: TClingCallbacks::TClingCallbacks gets an argument hasCodeGen (which would prevent the crashing code from being run). That’s false if the current process contains a symbol we use to identify rootcling being the current process - see IsFromRootCling() in TCling.cxx. That fails on Android, and I really don’t know why! Can you check that the symbol is available in rootcling, e.g. calling dlsym from gdb and nm $(which rootcling) | grep usedToIdentifyRootClingByDlSym?

“However, cppyy requires Root, and so I have been trying to get Root to build.”

It doesn’t as such. Yes, it contains portions of ROOT packaged into cppyy-backend.

What I don’t understand here is that although the above traceback shows cppyy-backend specific parts (such as the Legacy moniker), __ROOT_SpecialObjects does not appear in cppyy-backend code (only __CppyyLegacy_SpecialObjects does). It does in ROOT proper. Are you mixing versions somehow? E.g. do you have a rootcling executable available from ROOT in $PATH during the build?

@wlav Ah ok, that explains a lot. For some reason I thought it was pulling the full source in. I’ve been looking at the ROOT source on my computer for reference, but on the device it is correctly the cppyy-backend code from running create_src_directory.py.

@Axel Thank you for that, I think I was able to solve that issue! nm rootcling showed the usedToIdentifyRootClingByDlSym symbol. However, dlsym could not find it, apparently because it was not in the dynamic symbol table (nm --dynamic rootcling).

Adding set_property(TARGET rootcling PROPERTY ENABLE_EXPORTS 1) to src/main/CMakeLists.txt caused the symbol to be added to the dynamic table & now cppyy-backend is generating the PCH.
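
The difference between "nm shows the symbol" and "dlsym finds the symbol" can be reproduced outside of ROOT. Below is a minimal sketch, not the actual IsFromRootCling() code: the marker name hypotheticalMarkerSymbol and the helper isVisibleToDlsym are made up for illustration, and RTLD_DEFAULT is assumed to be available (it is a GNU/BSD extension that g++ and clang++ normally enable by default).

```cpp
#include <dlfcn.h>  // dlsym, dlerror, RTLD_DEFAULT

// Hypothetical marker defined in the executable itself. It stands in for the
// symbol that rootcling defines; the name is an assumption for this sketch.
extern "C" {
int hypotheticalMarkerSymbol = 1;
}

// Returns true if `name` can be resolved through the dynamic symbol tables of
// the running process (the executable plus its loaded shared libraries).
bool isVisibleToDlsym(const char* name) {
    dlerror();  // clear any stale error state before the lookup
    return dlsym(RTLD_DEFAULT, name) != nullptr;
}
```

Calling isVisibleToDlsym("hypotheticalMarkerSymbol") from main() typically returns false when the executable is linked without exporting its symbols, and true when it is linked with -rdynamic. As far as I understand, that export is what ENABLE_EXPORTS asks CMake to set up for an executable target on ELF platforms.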

Still working on getting cppyy running (there’s stuff with root-config being called and it not working X_X) but I think I’m almost there.

Got it running :slight_smile: Approximate instructions for future reference:

  1. git clone https://github.com/wlav/cppyy-backend
  2. Run the source downloader: python3 create_src_directory.py
  3. Apply the following patch:
diff -r cppyy-backend/cling/python/cppyy_backend/loader.py cppyy-backend-modified-tmp/cling/python/cppyy_backend/loader.py
36c36
<         return ctypes.CDLL(bkname, ctypes.RTLD_GLOBAL)
---
>         return ctypes.CDLL(bkname, ctypes.RTLD_GLOBAL), None

diff -r cppyy-backend/cling/src/cmake/modules/CheckCompiler.cmake cppyy-backend-modified-tmp/cling/src/cmake/modules/CheckCompiler.cmake
205a206,207
> elseif(CMAKE_SYSTEM_NAME MATCHES Android)
>   include(SetUpLinux)
diff -r cppyy-backend/cling/src/core/clib/src/mmapsup.c cppyy-backend-modified-tmp/cling/src/core/clib/src/mmapsup.c
45c45
< #if defined(R__LINUX) && !defined(R__GLIBC) && !defined(__CYGWIN__) \
---
> #if !defined(__ANDROID__) && defined(R__LINUX) && !defined(R__GLIBC) && !defined(__CYGWIN__) \
diff -r cppyy-backend/cling/src/core/clib/src/mvalloc.c cppyy-backend-modified-tmp/cling/src/core/clib/src/mvalloc.c
32c32
< #if defined(R__LINUX) && !defined(R__GLIBC) && !defined(__CYGWIN__) \
---
> #if !defined(__ANDROID__) && defined(R__LINUX) && !defined(R__GLIBC) && !defined(__CYGWIN__) \

diff -r cppyy-backend/cling/src/core/metacling/src/TCling.cxx cppyy-backend-modified-tmp/cling/src/core/metacling/src/TCling.cxx
19a20,21
> #include <stdio.h>
> 
159a162
> #include <link.h>
1108a1112
>   dlerror();
1109a1114
>   //printf("Is from rootcling? %d, dlerror is %s", foundSymbol, dlerror());
3345,3374c3350,3388
<    struct PointerNo4 {
<       void* fSkip[3];
<       void* fPtr;
<    };
<    struct LinkMap {
<       void* fAddr;
<       const char* fName;
<       void* fLd;
<       LinkMap* fNext;
<       LinkMap* fPrev;
<    };
<    if (!fPrevLoadedDynLibInfo || fPrevLoadedDynLibInfo == (void*)(size_t)-1) {
<       PointerNo4* procLinkMap = (PointerNo4*)dlopen(0,  RTLD_LAZY | RTLD_GLOBAL);
<       // 4th pointer of 4th pointer is the linkmap.
<       // See http://syprog.blogspot.fr/2011/12/listing-loaded-shared-objects-in-linux.html
<       LinkMap* linkMap = (LinkMap*) ((PointerNo4*)procLinkMap->fPtr)->fPtr;
<       if (!fSharedLibs.Contains(linkMap->fName))
<          RegisterLoadedSharedLibrary(linkMap->fName);
<       fPrevLoadedDynLibInfo = linkMap;
<       // reduce use count of link map structure:
<       dlclose(procLinkMap);
<    }
< 
<    LinkMap* iDyLib = (LinkMap*)fPrevLoadedDynLibInfo;
<    while (iDyLib->fNext) {
<       iDyLib = iDyLib->fNext;
<       if (!fSharedLibs.Contains(iDyLib->fName))
<          RegisterLoadedSharedLibrary(iDyLib->fName);
<    }
<    fPrevLoadedDynLibInfo = iDyLib;
---
>   auto callback = [](struct dl_phdr_info *info, size_t size, void *data) {
>     auto self = (TCling*)data;
>     if (info->dlpi_name && info->dlpi_name[0]) {
>       if (!self->fSharedLibs.Contains(info->dlpi_name))
>         self->RegisterLoadedSharedLibrary(info->dlpi_name);
>     }
>     return 0;
>   };
>   dl_iterate_phdr(callback, this);
diff -r cppyy-backend/cling/src/core/thread/src/TPosixThread.cxx cppyy-backend-modified-tmp/cling/src/core/thread/src/TPosixThread.cxx
26a27,45
> // Android compat. Will prob break things since kills threads w/o any of this cancel signalling stuff.
> #ifdef __ANDROID__
> int pthread_cancel(pthread_t h) {
>         return pthread_kill(h, 0);
> }
> int pthread_setcanceltype(int type, int *oldtype) {
> 	return 0;
> }
> int pthread_setcancelstate(int state, int *oldstate) {
> 	return 0;
> }
> #define PTHREAD_CANCEL_DISABLE 0
> #define PTHREAD_CANCEL_ENABLE 0
> #define PTHREAD_CANCEL_ASYNCHRONOUS 0
> #define PTHREAD_CANCEL_DEFERRED 0
> #define PTHREAD_CANCEL_ENABLE 0
> void pthread_testcancel(void) {}
> #endif /* __ANDROID__ */
> 
diff -r cppyy-backend/cling/src/core/unix/src/TUnixSystem.cxx cppyy-backend-modified-tmp/cling/src/core/unix/src/TUnixSystem.cxx
212c212
< #if (defined(R__LINUX) && !defined(R__WINGCC))
---
> #if (defined(R__LINUX) && !defined(R__WINGCC) && !defined(__ANDROID__))
diff -r cppyy-backend/cling/src/interpreter/cling/lib/Interpreter/CIFactory.cpp cppyy-backend-modified-tmp/cling/src/interpreter/cling/lib/Interpreter/CIFactory.cpp
400a401,407
> 
>   #ifdef __TERMUX__
>       sArguments.addArgument("-isystem", __TERMUX_PREFIX__"/include");
>       sArguments.addArgument("-isystem", __TERMUX_PREFIX__"/include/x86_64-linux-android");
>   #endif
> 
> 

diff -r cppyy-backend/cling/src/main/CMakeLists.txt cppyy-backend-modified-tmp/cling/src/main/CMakeLists.txt
25a26
> set_property(TARGET rootcling PROPERTY ENABLE_EXPORTS 1)
43,44c44,45
<                      COMMAND ln -f rootcling rootcint
<                      COMMAND ln -f rootcling genreflex
---
>                      COMMAND ln -fs rootcling rootcint
>                      COMMAND ln -fs rootcling genreflex


diff get_device_api_level_inlines.txt /data/data/com.termux/files/usr/include/bits/get_device_api_level_inlines.h
34c39,40
< int atoi(const char* __s) __attribute_pure__;
---
> //int atoi(const char* __s) __attribute_pure__;
>
38c44
<   int api_level = atoi(value);
---
>   int api_level = 29;//atoi(value);

Note the change required to /data/data/com.termux/files/usr/include/bits/get_device_api_level_inlines.h: this file causes problems whenever cling includes any system headers. This might be Termux-specific.
  4. cd cppyy-backend/cling, then pip install . --no-build-isolation --verbose (--no-build-isolation so that it doesn’t try to install pip’s cmake)
  5. cd clingwrapper, then pip install . --no-build-isolation --verbose
  6. Change the library loading on Android, because of

  • https://issuetracker.google.com/issues/109986352 (RTLD_GLOBAL just broken) and
  • https://groups.google.com/g/android-ndk/c/0WVNu6JSit4/m/e5VQIdF9CQAJ
cd /data/data/com.termux/files/usr/lib/python3.10/site-packages/cppyy_backend/lib 
patchelf --add-needed libCoreLegacy.so libCling.so
patchelf --add-needed libRIOLegacy.so libCling.so
patchelf --add-needed libThreadLegacy.so libCling.so
patchelf --add-needed libCling.so libcppyy_backend.so
cd /data/data/com.termux/files/usr/lib/python3.10/site-packages/
patchelf --add-needed libcppyy_backend.so libcppyy.cpython-310.so
  7. LD_LIBRARY_PATH=/data/data/com.termux/files/usr/lib/python3.10/site-packages/cppyy_backend/lib python3 -c "import cppyy"
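
For reference, the TCling.cxx hunk in the patch above replaces the hand-rolled link-map walk with dl_iterate_phdr. Here is a standalone sketch of that technique. It is not the ROOT code: listLoadedLibraries is a made-up helper that collects names into a vector instead of calling RegisterLoadedSharedLibrary.

```cpp
#include <link.h>    // dl_iterate_phdr, struct dl_phdr_info
#include <cstddef>   // size_t
#include <string>
#include <vector>

// Collects the name of every loaded object that reports a non-empty name.
// The main executable usually reports an empty name and is skipped, as in
// the patch.
std::vector<std::string> listLoadedLibraries() {
    std::vector<std::string> names;
    // A captureless lambda converts to the plain function pointer that
    // dl_iterate_phdr expects; the vector is passed through `data`.
    auto callback = [](struct dl_phdr_info* info, size_t /*size*/, void* data) -> int {
        auto* out = static_cast<std::vector<std::string>*>(data);
        if (info->dlpi_name && info->dlpi_name[0])
            out->emplace_back(info->dlpi_name);
        return 0;  // a non-zero return value would stop the iteration
    };
    dl_iterate_phdr(callback, &names);
    return names;
}
```

Unlike the original code, this does not depend on the private layout of the handle returned by dlopen(0, ...), which is presumably why the old approach did not carry over to Android's linker.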

@bellenot should we add

set_property(TARGET rootcling PROPERTY ENABLE_EXPORTS 1) to src/main/CMakeLists.txt

?

Wow impressive! Congrats and thanks for sharing!

Maybe, we can try. I just created a pull request for this.

For anyone who comes across this in the future, if you load cppyy in a shared lib on Android, you also have to apply the following diff, because the Android linker is weird (aka difficult)

diff -r cppyy-backend/cling/src/interpreter/llvm/src/lib/Support/Unix/DynamicLibrary.inc cppyy-backend-modified-tmp/cling/src/interpreter/llvm/src/lib/Support/Unix/DynamicLibrary.inc
49c51,62
<   return ::dlsym(Handle, Symbol);
---
>   dlerror();
>   void *out = ::dlsym(RTLD_DEFAULT, Symbol);
>   if (out) {
>     return out;
>   }
>   out = ::dlsym(Handle, Symbol);
>   return out;
>  // Not sure why, but if you load python/cppyy in a shared lib, dlsym RTLD_DEFAULT finds different symbols from dlsym (dlopen(null, RTLD_LAZY), "..."). So have to check both. 
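
Written out as a standalone function, the lookup order from the diff above looks roughly like this. It is a sketch only: lookupSymbol is a made-up name, not the LLVM function being patched, and RTLD_DEFAULT is assumed to be available as on glibc and bionic.

```cpp
#include <dlfcn.h>  // dlsym, dlerror, RTLD_DEFAULT

// Looks `symbol` up in the global scope first and only then in the specific
// library handle, mirroring the order used in the patch.
void* lookupSymbol(void* handle, const char* symbol) {
    dlerror();  // reset the error state before the lookups
    if (void* addr = dlsym(RTLD_DEFAULT, symbol))
        return addr;
    return dlsym(handle, symbol);
}
```

A caller would pass the handle it already has, for example one obtained from dlopen(nullptr, RTLD_LAZY). Note that this changes which definition wins when both scopes contain the symbol, so it is a workaround rather than a general-purpose replacement.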

Thanks! Maybe @etejedor could be interested…
